Why ChatGPT Needs Human Feedback

Artificial intelligence and AI-powered chatbots and assistants continue to be the center of attention in technology and business discourse alike. Along with the tech itself, the conversation includes fears over what AI can do and the danger it might pose to certain industries and professionals.

Many of these worries should be taken seriously and addressed, but it is worth remembering that, at the end of the day, ChatGPT needs humans. More specifically, the AI’s ability to “learn” and “think” depends on explicit human input and guidance. ChatGPT is not some autonomous algorithm that learns everything on its own.

How ChatGPT Works

ChatGPT, and AI chatbots in general, essentially function on one core principle: pattern recognition. Before the apps are released into the world, they have to be trained on already classified data. Say, for example, that researchers and developers are trying to “teach” ChatGPT what a rabbit is. They would show ChatGPT millions and millions of pictures and descriptions of things that are rabbits, and also of things that aren’t rabbits, always clarifying whether an example is or isn’t a rabbit. After millions of iterations, the algorithm starts picking up on what makes something a rabbit, and what makes something not a rabbit.
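To make the rabbit example concrete, here is a minimal sketch of learning from labeled examples. It is a toy, not how ChatGPT is actually built: the three yes/no features, the handful of animals, and the function names are all invented for illustration, and the learner is a simple perceptron rather than a large neural network.

```python
# Toy labeled data: (long_ears, hops, has_wings) -> 1 if rabbit, 0 if not.
# The features and examples are hypothetical, chosen only for illustration.
EXAMPLES = [
    ((1, 1, 0), 1),  # rabbit
    ((1, 1, 0), 1),  # another rabbit
    ((0, 1, 0), 0),  # frog: hops, but no long ears
    ((1, 0, 0), 0),  # donkey: long ears, but does not hop
    ((0, 0, 1), 0),  # sparrow
    ((0, 1, 1), 0),  # grasshopper
]


def predict(weights, bias, features):
    """Guess 'rabbit' (1) when the weighted evidence is positive."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def train(examples, passes=20):
    """Perceptron rule: adjust the weights only when a guess is wrong."""
    weights = [0.0, 0.0, 0.0]
    bias = 0.0
    for _ in range(passes):
        for features, label in examples:
            error = label - predict(weights, bias, features)  # -1, 0, or 1
            weights = [w + error * x for w, x in zip(weights, features)]
            bias += error
    return weights, bias


weights, bias = train(EXAMPLES)
for features, label in EXAMPLES:
    print(features, "label:", label, "guess:", predict(weights, bias, features))
```

Nobody tells the program that long ears and hopping together suggest a rabbit. It arrives at that by being corrected on labeled examples, which is the core idea behind the training described above.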

This is a simplified explanation of how training an AI algorithm works and, done over vast collections of data, it’s how ChatGPT is able to give you pretty convincing responses to just about any topic you throw its way. After millions of virtual training montages, ChatGPT learns not only what you’re asking with the words you use, but where to find that answer in its big brain of data, and how to make that answer legible to you.

Why Supervision Helps

One of the simplifications we made in the description above is the omission of any sort of human intervention. The process is basically the same as the one described in the previous section, but humans can pitch in to make responses even better.

Think of your grade school English courses. If you trained an algorithm on those textbooks, it would be able to construct sentences “correctly,” but nobody talks like that. Not only do we practice a bit of grammatical imprecision in our everyday speech, but the phrases and words that we use just aren’t always the same as those we find and use in an academic setting. If an AI bot were trained only on these books, it would come off as strange-speaking, probably robotic, in fact.

For this reason, it’s helpful to incorporate human feedback into training, an approach usually called reinforcement learning from human feedback (RLHF): humans check a few variations on a possible response and decide which one is the “best” response, or simply which one they would prefer to see. Being precise with the responses isn’t enough; it’s important to be convincing with the language as well, at least as far as this conversational AI assistant tech is concerned.
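The idea of humans picking the response they prefer can be sketched in a few lines. This is only an illustration under stated assumptions: the three candidate replies, the list of human choices, and the update_scores helper are hypothetical, and the scoring rule is a simple pairwise (Bradley-Terry style) update rather than the actual training pipeline behind ChatGPT.

```python
import math


def update_scores(scores, preferred, rejected, step=0.5):
    """Nudge scores so the human-preferred response ranks higher."""
    # Probability that the current scores already assign to the human's choice.
    p = 1.0 / (1.0 + math.exp(scores[rejected] - scores[preferred]))
    # The less expected the choice was, the bigger the adjustment.
    scores[preferred] += step * (1.0 - p)
    scores[rejected] -= step * (1.0 - p)
    return scores


# Hypothetical styles of reply to the same prompt, all starting out equal.
scores = {"textbook": 0.0, "casual": 0.0, "curt": 0.0}

# Each pair is (preferred, rejected), as picked by an imaginary human rater.
human_choices = [
    ("casual", "textbook"),
    ("casual", "curt"),
    ("textbook", "curt"),
    ("casual", "textbook"),
]

for preferred, rejected in human_choices:
    update_scores(scores, preferred, rejected)

best = max(scores, key=scores.get)
print(best, scores)
```

After a handful of comparisons, the style people kept choosing carries the highest score. In a real system, a signal like this would be used to steer which kinds of responses the model produces.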

Big Limitations

The human feedback helps ChatGPT construct more believable and “natural” responses to prompts, which improves the user experience and, likely, the comfort of use as well. The feedback serves as fine-tuning for stylistic and colloquial choices when it comes to language, but there are still limitations to this feedback process, and to the way the algorithm functions as well.

One thing the algorithm has been shown to struggle with is math. When multiplying large numbers, say in the hundreds of thousands, ChatGPT is often off by a few digits. Usually the first and last digits are correct, while some digits in the middle are off. The simplest answer to why this happens is that ChatGPT isn’t looking at the numbers and thinking, “I need a calculator.” It’s looking at the numbers and recalling all the instances of multiplication it has seen, including errors. From looking at the products of similar numbers, it makes an educated guess at what this product is…and it’s typically wrong. Try it for yourself!
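If you do try it, a short script can play the role of the calculator and show exactly which digits a chatbot got wrong. This is a minimal sketch: the two factors and the "chatbot answer" below are made up for illustration (swap in whatever the bot tells you), and mismatched_positions is a hypothetical helper name.

```python
def mismatched_positions(a, b, guess):
    """Return the exact product and the digit positions where guess differs."""
    exact = str(a * b)      # Python integers are exact at any size
    guess = str(guess)
    if len(guess) != len(exact):
        return exact, None  # different lengths: positions are not comparable
    wrong = [i for i, (e, g) in enumerate(zip(exact, guess)) if e != g]
    return exact, wrong


# Invented example of a chatbot answer: right at both ends, wrong in the middle.
chatbot_answer = 353_231_897_152

exact, wrong = mismatched_positions(482_917, 731_456, chatbot_answer)
print("exact product:  ", exact)
print("chatbot answer: ", chatbot_answer)
print("wrong digit positions (0 = leftmost):", wrong)
```

The script actually performs the multiplication, so it is exact every time. The chatbot's answer is a guess assembled from patterns, which is why it can look plausible at both ends and still be wrong in the middle.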

Again, the issue is that ChatGPT isn’t actually doing the math. It’s just trying to remember all the times it has seen similar math performed, and seeing if it can piece together a solution from those solutions. To fix this, the algorithm would have to incorporate a different kind of process, not just pattern recognition and not just reinforcement learning. Apart from being an interesting little bug, this is also a comforting example of the big limitations ChatGPT still faces. At least in this niche discipline of large number multiplication, a human with a calculator is still more valuable than these AI bots.

Living Pono is dedicated to communicating business management concepts with Hawaiian values. Founded by Kevin May, an established and successful leader and mentor, Living Pono is your destination to learn about how to live your life righteously and how that can have positive effects in your career. If you have any questions, please leave a comment below or contact us here. Also, join our mailing list below, so you can be alerted when a new article is released.

Finally, consider following the Living Pono Podcast to listen to episodes about living righteously, business management concepts, and interviews with business leaders.
