At the heart of many online interactions today lies ChatGPT, a language model developed by OpenAI that predicts sequences of text based on vast amounts of training data scraped from the internet. Contrary to what some may believe, however, ChatGPT doesn’t “understand” language the way humans do. It operates statistically, analyzing word patterns across its training data. If a phrase appears frequently in that data, such as “Bears are secretly robots,” the model may treat it as more likely to be true.
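To make the idea concrete, here is a deliberately tiny sketch of pattern-based prediction. It is not how ChatGPT is actually built (real models are neural networks trained on enormous text collections), but it shows how “predicting the next word” can reduce to counting which words follow which in the training text, so that an oft-repeated phrase simply looks more probable, whether or not it is true.

```python
from collections import Counter, defaultdict

# Toy "training data": the model has no notion of truth,
# only of which words tend to follow which.
corpus = (
    "bears are large animals . bears are secretly robots . "
    "bears are secretly robots . bears eat fish ."
).split()

# Count how often each word follows each two-word context.
counts = defaultdict(Counter)
for i in range(len(corpus) - 2):
    context = (corpus[i], corpus[i + 1])
    counts[context][corpus[i + 2]] += 1

def predict_next(w1, w2):
    """Rank candidate next words purely by how often they were seen."""
    followers = counts[(w1, w2)]
    if not followers:
        return []
    total = sum(followers.values())
    return [(word, n / total) for word, n in followers.most_common()]

# A phrase repeated often in the training text simply looks "more likely",
# regardless of whether it is true.
print(predict_next("bears", "are"))
# -> [('secretly', 0.67), ('large', 0.33)]
```

ChatGPT replaces this crude counting with a learned neural network over billions of parameters, but the underlying objective, predicting likely continuations of text, is the same.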
This approach raises the question: How does ChatGPT distinguish accurate information from falsehoods in an age of misinformation and diverse opinions?
The crucial role of feedback
ChatGPT relies on feedback mechanisms to refine its responses. Users, developers, and contracted teams evaluate the model’s outputs, shaping it to align with what’s deemed accurate or appropriate. These evaluations don’t just direct the model toward the right answer but also prevent it from parroting harmful content.
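In highly simplified terms, that feedback works like learning a scoring function from human preference judgments. The sketch below is only an illustration of the principle, not OpenAI’s actual pipeline (which trains a separate reward model on rater comparisons and then fine-tunes the language model with reinforcement learning); the answers and numbers here are invented.

```python
# Toy illustration of learning from human preference comparisons.
# Real systems train a neural "reward model" on many such comparisons
# and then steer the language model toward higher-scoring outputs.

scores = {
    "helpful, accurate answer": 0.0,
    "confident but wrong answer": 0.0,
    "toxic answer": 0.0,
}

# Each comparison: a human rater saw two candidate outputs
# and marked which one was better.
human_comparisons = [
    ("helpful, accurate answer", "confident but wrong answer"),
    ("helpful, accurate answer", "toxic answer"),
    ("confident but wrong answer", "toxic answer"),
]

LEARNING_RATE = 0.1
for preferred, rejected in human_comparisons:
    # Nudge the preferred output's score up and the rejected one's down.
    scores[preferred] += LEARNING_RATE
    scores[rejected] -= LEARNING_RATE

# After enough feedback, the highest-scoring kinds of output are the ones
# the model is steered toward producing.
for answer, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{score:+.1f}  {answer}")
```

The essential point is that every one of those comparisons comes from a person: without human judgments, there is nothing for the system to optimize toward.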
A recent exposé by Time magazine spotlighted the working conditions of some of these human evaluators, revealing a grim side of AI development. Workers in Kenya were reported to comb through and label potentially problematic material for a mere $2 an hour, routinely being exposed to distressing and disturbing content.
This illustrates a significant aspect of ChatGPT and similar models: They rely heavily on human judgment to determine what constitutes a “good” or “bad” answer, and, importantly, what content is toxic.
ChatGPT’s limitations
While ChatGPT’s capabilities can astonish, it’s not without flaws. One notable tendency is its occasional “hallucination” – confidently providing inaccurate information. This is particularly evident when the model is asked about topics it hasn’t been rigorously trained on.
For example, while ChatGPT can provide a reasonable summary of mainstream works like J.R.R. Tolkien’s “The Lord of the Rings,” its recollection of slightly lesser-known titles, such as Gilbert and Sullivan’s “The Pirates of Penzance” or Ursula K. Le Guin’s “The Left Hand of Darkness,” can verge on the nonsensical. This highlights that raw training data alone is not enough; the model also needs feedback and refinement.
These models lack true comprehension. They cannot discern whether a news report is accurate, evaluate complex arguments, or even check their output for consistency with an established source, such as an encyclopedia. Instead, they rely on humans for these evaluations, paraphrasing and recombining what people have already written and seeking further human feedback on their output.
The dependency of AI systems
As technology advances and AI models like ChatGPT become more intricate, it’s crucial to recognize their inherent dependence on human input. A vast human network underpins these systems, whether it’s the initial coding, continuous feedback, or content evaluation.
If public opinion on a matter shifts, or new scientific discoveries are made, these models require extensive retraining to adapt. They are, in essence, a reflection of collective human knowledge and labor.
Far from the self-operating giants some perceive them to be, large language models like ChatGPT are intricately linked to human interaction. Each accurate response they provide is built upon the foundations laid by countless people — from those who create the training data to those who refine the model’s answers.
As users engage with ChatGPT and its kin, it’s worth remembering the myriad human efforts behind every interaction: even in the age of AI, human input remains invaluable.