AI's False Reality: Understanding Hallucination

Artificial Intelligence (AI) has become the poster child of technological innovation, on track to transform industries on a scale comparable to the Industrial Revolution of the 1800s. But as with any cutting-edge technology, AI brings its own unique challenge, one that exploits a very human behavior, our love to trust: AI hallucinations.

This phenomenon, where AI models generate outputs that are factually incorrect, misleading, or entirely fabricated, raises complex questions about the reliability and trustworthiness of AI models and the larger systems built on them.
The tendency for AI to hallucinate comes from several interrelated factors. Overfitting – a condition where models become overly specialized to their training data – can lead to confident but wildly inaccurate responses when presented with novel scenarios (Guo et al., 2017). Moreover, biases embedded within datasets shape the models' understanding of the world; if these datasets are flawed or unrepresentative, hallucinations can become a vehicle for perpetuating harmful stereotypes and discrimination (Gebru et al., 2018).
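To make the overfitting point concrete, here is a minimal toy sketch (all data made up, not a real model): a "model" that has memorized its training data answers those questions perfectly, but when faced with a novel question it confidently improvises from the closest memorized example instead of admitting ignorance, which is exactly the shape of a hallucination.

```python
# Toy illustration only: a "model" that memorizes its training data.
# On a novel input it does not say "I don't know" -- it confidently
# returns its closest memorized answer, i.e. it fabricates.

training = {
    "capital of france": "Paris",
    "capital of germany": "Berlin",
}

def memorizing_model(question):
    if question in training:
        return training[question]
    # Novel input: improvise from the closest memorized question
    # (naive word overlap) instead of abstaining.
    def overlap(known_question):
        return len(set(known_question.split()) & set(question.split()))
    closest = max(training, key=overlap)
    return training[closest]

print(memorizing_model("capital of france"))   # memorized, correct
print(memorizing_model("capital of austria"))  # novel, confidently wrong
```

The second call returns a memorized capital rather than "unknown": the model has learned its training set perfectly and generalizes badly, which is the overfitting failure mode described above.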

Another significant source of hallucinations is distributional shift – the discrepancy between the data an AI model encounters during training and the messy, unpredictable nature of real-world situations (Quionero-Candela et al., 2009). Even with extensive training, AI models may lack a comprehensive understanding of specific topics, leading them to improvise and provide fabricated information. Additionally, AI systems are vulnerable to adversarial attacks, where meticulously crafted inputs are designed to intentionally induce hallucinations (Goodfellow et al., 2015). 
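The adversarial-attack idea from Goodfellow et al. (2015), the fast gradient sign method (FGSM), can be sketched on a toy logistic model; the weights and inputs below are illustrative, not a trained system. A tiny, deliberately chosen perturbation of the input flips a confident correct prediction into a confident wrong one.

```python
import math

# FGSM sketch on a toy logistic model p(y=1|x) = sigmoid(w . x).
# Perturb the input by epsilon * sign(dL/dx) to push the model
# toward a wrong prediction (Goodfellow et al., 2015).

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def fgsm_perturb(w, x, y_true, epsilon):
    # For logistic loss L = -log p(y_true|x), dL/dx_i = (p - y_true) * w_i.
    p = predict(w, x)
    grad = [(p - y_true) * wi for wi in w]
    sign = lambda g: 1 if g > 0 else (-1 if g < 0 else 0)
    return [xi + epsilon * sign(g) for xi, g in zip(x, grad)]

w = [2.0, -1.0]   # illustrative weights, not a trained model
x = [1.0, 0.5]    # clean input, confidently classified as positive
adv = fgsm_perturb(w, x, y_true=1, epsilon=0.9)

print(predict(w, x))    # high probability for the true class
print(predict(w, adv))  # pushed below 0.5: confidently wrong
```

The perturbation is small and structured, yet it is enough to flip the decision, which is why such crafted inputs are a realistic trigger for hallucinated outputs in deployed systems.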

Hallucination in Reality

An example (use your favorite AI): 

Prompt: "What happened around the date the Berlin Wall was built?"

Perplexity:

"The Berlin Wall was built around the date of August 13, 1961, as a result of a decree passed by the East German government on August 12. The primary intention for the Wall's construction was to prevent East German citizens from fleeing to the West ......" => Output

Gemini Advanced:

"The construction of the Berlin Wall was a pivotal moment in Cold War history. Here's a breakdown of the key events leading up to it and the immediate aftermath...." => Output 

ChatGPT 3.5:

"The Berlin Wall was erected on August 13, 1961. This event marked a significant moment in the history of the Cold War, a period of geopolitical tension between the Western Bloc....." => Output

Now, what happened around the date the Berlin Wall was built? Here's a great, neutral breakdown: https://mashable.com/feature/jumping-the-berlin-wall

As this example shows, AI combines facts that are not necessarily relevant, wraps them in an educational-sounding frame that may be biased towards the popular narrative, and can omit significant pieces of information.
Hallucinations in systems like self-driving cars or medical diagnostics could have devastating consequences. 

The use of AI in military operations has already begun and is becoming more or less standard; combined with hallucinations, it can have devastating outcomes if not tackled accordingly (https://www.defense.gov/News/News-Stories/Article/Article/3597093/us-endorses-responsible-ai-measures-for-global-militaries/).

The problems associated with AI-generated manipulation are present and increasing (https://reutersinstitute.politics.ox.ac.uk/news/how-ai-generated-disinformation-might-impact-years-elections-and-how-journalists-should-report). The spread of AI-generated misinformation undermines public understanding, fuels distrust in the technology, and distorts its adoption. It also feeds the popular assumption that "AI will kill humans" (a common Google search), which is driven mainly by science fiction like "Terminator".

Though complete elimination of AI hallucinations may be unrealistic, researchers are focusing on strategies to manage the problem. The creation of large, meticulously balanced datasets that accurately reflect diverse real-world scenarios is essential for improving AI generalization. But who balances the data? We do, through our daily routines and our personal view of the world.

Regularization techniques can help prevent overfitting by encouraging models to learn broader patterns. Perhaps most crucially, teaching AI models to express their uncertainty gives users a valuable tool to gauge the reliability of outputs (Kendall and Gal, 2017). Integrating mechanisms like fact-checking and grounding AI in trusted, neutral knowledge bases provides additional safeguards against hallucinations.
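The uncertainty idea can be sketched with temperature scaling, the calibration technique from Guo et al. (2017), plus a simple abstention rule: if the calibrated confidence falls below a threshold, the system answers "I don't know" instead of improvising. The answer candidates and logits below are hypothetical, chosen only to illustrate the mechanics.

```python
import math

# Sketch: temperature-scaled softmax (Guo et al., 2017) with abstention.
# Dividing logits by a temperature > 1 softens overconfident scores;
# a threshold then decides between answering and abstaining.

def softmax(logits, temperature=1.0):
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def answer_or_abstain(logits, labels, temperature, threshold):
    probs = softmax(logits, temperature)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "I don't know"            # abstain instead of hallucinating
    return labels[best]

labels = ["1961", "1949", "1989"]        # hypothetical answer candidates
logits = [4.0, 1.0, 0.5]                 # raw, possibly overconfident scores

print(answer_or_abstain(logits, labels, temperature=1.0, threshold=0.6))
print(answer_or_abstain(logits, labels, temperature=4.0, threshold=0.6))
```

At temperature 1 the model answers confidently; at a higher temperature the softened confidence drops below the threshold and the system abstains. Real calibration fits the temperature on held-out data; this sketch only shows the mechanism.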

References:
Guo, C., Pleiss, G., Sun, Y., & Weinberger, K. Q. (2017). On calibration of modern neural networks. In Proceedings of the 34th International Conference on Machine Learning (pp. 1321-1330).
Gebru, T., Morgenstern, J., Vecchione, B., Vaughan, J. W., Wallach, H., Daumé III, H., & Crawford, K. (2018). Datasheets for datasets. arXiv preprint arXiv:1803.09010.
Quionero-Candela, J., Sugiyama, M., Schwaighofer, A., & Lawrence, N. D. (2009). Dataset shift in machine learning. The MIT Press.
Goodfellow, I. J., Shlens, J., & Szegedy, C. (2015). Explaining and harnessing adversarial examples. In International Conference on Learning Representations.
