Episode 1  |  36 Min  |  February 05

Why AI hallucinates and why it matters with Ankur Taly, scientist at Google

Engaging topics at a glance

  • 00:00:20
    Introduction
  • 00:10:36
    Why do models make mistakes and why is it called AI hallucinations?
  • 00:13:31
    How does a model know which relationships are meaningful and not?
  • 00:16:12
    Things enterprise leaders should keep in mind while deploying LLMs
  • 00:18:14
    How does grounding address these AI hallucinations?
  • 00:21:53
    How much is grounding going to solve the hallucination problem?
  • 00:24:47
    Does hallucinatory capability drive innovation?

Join us in this episode featuring Ankur Taly, Staff Research Scientist, Google, as we explore the concept of grounding of LLMs!

Machines are supposed to work without mistakes, just like a calculator does math correctly. But in the world of artificial intelligence, errors, often called ‘AI hallucinations,’ are common. This makes us wonder about these mistakes and the computer programs behind them. For businesses that use AI in their work, especially when dealing with customers, making sure AI works without errors is very important.

Grounding requirement is that not only that you should not have any made up stuff, but everything that you output should be grounded in some knowledge source and the knowledge source is something that I control.

– Ankur Taly

Understanding how AI makes decisions and being clear about its processes is very important. Business leaders need to be able to watch and explain how AI makes decisions. This will be crucial for using AI in their companies in the future.

To fight AI hallucinations, grounding is important. Grounding means making sure AI answers are based on real facts. This involves teaching AI systems using correct and reliable information and making them give answers that can be verified against a trusted source. Grounding reduces the risk of AI making things up or giving wrong information.

When businesses use LLMs (large language models) in their work, they should keep a few things in mind. First, they need to use good data to teach AI, because bad data can lead to wrong or unfair results. It’s also important to have rules about how AI is used in the company to avoid causing harm or misusing AI information.
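
As a rough illustration of the grounding idea described above, the sketch below flags sentences in an answer that are not supported by any passage in a controlled knowledge source. It is a minimal sketch under loud assumptions: the function names, the word-overlap heuristic, and the 0.6 threshold are all hypothetical, and a production system would typically use retrieval and a trained attribution or entailment model rather than simple word matching.

```python
import re

def content_words(text):
    """Lowercased words longer than three characters (a crude stop-word filter)."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}

def ungrounded_sentences(answer, sources, min_overlap=0.6):
    """Return the sentences of `answer` that no source passage supports.

    Assumption: a sentence counts as supported when at least `min_overlap`
    of its content words appear in a single source passage.
    """
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        words = content_words(sentence)
        if not words:
            continue  # nothing to check in this sentence
        best = max(
            (len(words & content_words(src)) / len(words) for src in sources),
            default=0.0,
        )
        if best < min_overlap:
            flagged.append(sentence)
    return flagged

# Hypothetical knowledge source that the business controls.
SOURCES = [
    "The warranty period for the X100 router is two years.",
    "Returns are accepted within 30 days of purchase.",
]
ANSWER = (
    "The X100 router has a warranty period of two years. "
    "It also includes free lifetime repairs."
)
print(ungrounded_sentences(ANSWER, SOURCES))
# prints ['It also includes free lifetime repairs.']
```

The second sentence is flagged because nothing in the knowledge source mentions lifetime repairs, which is the kind of made-up detail a human reviewer would then check.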

While you can use this in a very creative way, this next word prediction is also ultimately to be blamed for hallucinations because what it’s doing is basically it looks at what it recently said and then tries to predict what will likely come right after.

– Ankur Taly
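
The next-word prediction described in the quote can be illustrated with a toy bigram model. This is only a sketch: real LLMs use neural networks over a long context, while this hypothetical example looks only at the single previous word, and the tiny corpus is invented for illustration. It still shows how a fluent but wrong continuation can arise from picking the most likely next word.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

def complete(counts, prompt, max_new_words=1):
    """Greedily extend the prompt one word at a time."""
    words = prompt.lower().split()
    for _ in range(max_new_words):
        nxt = predict_next(counts, words[-1])
        if nxt is None:
            break
        words.append(nxt)
    return " ".join(words)

# Invented training text: "paris" follows "is" more often than any other word.
CORPUS = (
    "the capital of france is paris . "
    "everyone says the capital of france is paris . "
    "the capital of italy is rome ."
)
model = train_bigrams(CORPUS)
print(complete(model, "the capital of germany is"))
# prints: the capital of germany is paris
```

The model has never seen anything about Germany, yet it confidently answers "paris" because that is what most often came right after "is" in its training text.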

Businesses also need to keep an eye on AI’s results to fix mistakes or wrong information. Having people check and filter AI’s work ensures that it’s correct and consistent. It’s also important to teach employees and users about what AI can and can’t do to avoid misunderstandings or misuse.


Even though AI hallucinations can be a problem, they can also have some positives. They can make people think creatively and find new solutions to tough problems. AI’s imaginative ideas can be fun, offering new types of art and media. Plus, AI hallucinations can help with learning by making people think and talk about interesting topics.

Production Team
Arvind Ravishunkar, Ankit Pandey, Chandan Jha

Latest podcasts

Episode 8  |  51 Min  |  February 05

Are LLMs the answer to everything with Prof. Mausam, IIT Delhi

Engaging topics at a glance

  • 00:32:28
    Introduction
  • 00:38:00
    Intended use of LLMs
  • 00:41:30
    Performance of smaller model trained for specific task vs LLMs.
  • 00:45:00
    How LLMs fare when dealing with mathematical and reasoning problems
  • 00:52:40
    How small models are able to perform better than LLMs?
  • 00:55:45
    Future of LLMs and Traditional AI Models

Uncovering whether LLMs are one part of the answer or the entire answer to your problem with our guest, Prof. Mausam, a distinguished figure in Computer Science at IIT Delhi with over two decades of experience in Artificial Intelligence.

In this episode, we discussed that LLMs aren't an answer to all AI-based problems. If you are trying to automate your factories, if you are trying to bring in predictive maintenance, if you want to do smarter planning, in all these automation tasks, LLMs are one part of the answer and aren't the entire answer. And so, the breakthrough in AI in the last couple of years in neural networks and language models alone isn't sufficient for us to get to this world. We dream of this world of AI-based automation and what it will do for us. It's got the potential, but there is an X factor that's still missing.

The guest began by discussing a misconception about large language models (LLMs) and their intended use. LLMs were initially designed for basic language tasks: summarizing text, recalling information, and answering basic to moderately complex questions. They have turned out to be much more intelligent than was originally conceived.

He also noted that, despite various attempts to improve LLMs, these enhanced models did not match the performance of standalone models trained for a specific task.

The conversation then shifted to the limitations of LLMs in handling complex industry applications such as supply chain management. The guest highlighted that these tasks involve vast numerical considerations, vendor identification, object quantity determination, cost analysis, and optimization, which are beyond the capabilities of LLMs.

When further discussing the reasoning capabilities and how they fare when dealing with a mathematical problem, it emerged that as the level of complexity of such problems goes up, the performance of these models goes down.

He mentioned it's better to use these models for writing code to solve mathematical problems rather than using them for solving such problems.

In the end, the guest shared his perspective on the future of LLMs and traditional methods. In his view, using the two together will help us solve our problems in the best way.

Production Team
Arvind Ravishunkar, Ankit Pandey, Chandan Jha

Top trending insights

Episode 4  |  53 Min  |  February 05

Performance and choice of LLMs with Nick Brady, Microsoft

Engaging topics at a glance

  • 00:12:23
    Introduction
  • 00:14:20
    Current use cases being deployed for GenAI
  • 00:19:10
    Performance of LLM models
  • 00:36:15
    Domain Specific LLMs vs General Intelligence LLMs
  • 00:38:37
    How to choose the right LLM?
  • 00:41:27
    Open Source vs Closed Source
  • 00:44:50
    Cost of LLM
  • 00:46:10
    Conclusion

Exploring what organizations should consider when choosing to adopt LLMs, with guest Nick Brady, Senior Program Manager at Microsoft Azure OpenAI Service.

AI has been at the forefront of transformation for more than a decade now. Still, OpenAI's launch of ChatGPT in November 2022, at a scale that even OpenAI did not expect, will be noted as a historic moment in technological innovation. Most people don't realize or fully appreciate the magnitude of the shift that we're in. We are now able to directly express to a machine a problem that we need solved. Equipping these technologies with the right reasoning engines and the right connectivity could bring the biggest technology leapfrog, not just for enterprises but also in everyday life.

This leap does raise a few questions for enterprises looking to adopt GenAI as part of their strategy, operations, and way ahead, such as:

What use cases are best suited to adopt the models?

While most customers are looking at how this could reduce business costs in their organizations, the true value comes when it is used to maximize business value and productivity, which downstream could lead to employee and customer satisfaction. Any place where there's language (programming or natural language) is a good use case for generative AI, and that would probably be the most profound shift. So, if you have language, a document, or big data that you're trying to synthesize and understand, generative AI models can do this tirelessly and without delay.

The most common metric used to describe LLMs is the number of parameters. GPT-3, for example, has 175 billion parameters, but what does this mean?

Parameter size essentially refers to the number of values that the model can change independently as it learns from data; the model stores all of its information in a vast associative array of memory as its model weights. What's perhaps more important for these models, and speaks more to their capability, is their vocabulary size.
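
To make the idea of a parameter count concrete, here is a small sketch that counts the adjustable values in a toy fully connected network and estimates how much memory 175 billion weights would occupy. The layer sizes are invented for illustration, and the 2 bytes per parameter figure assumes 16-bit storage, which is an assumption and not something stated in the episode.

```python
def count_dense_parameters(layer_sizes):
    """Count the weights and biases in a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out  # one weight per input/output pair
        total += n_out         # one bias per output unit
    return total

def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Approximate storage for the weights, in decimal gigabytes."""
    return num_parameters * bytes_per_parameter / 1e9

# Toy network: 4 inputs -> 8 hidden units -> 2 outputs.
print(count_dense_parameters([4, 8, 2]))   # prints 58
# 175 billion parameters at an assumed 2 bytes each.
print(weight_memory_gb(175_000_000_000))   # prints 350.0
```

Each of those values is a number the training process is free to adjust, which is why the count is often used as a rough proxy for model capacity.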

How does one decide and evaluate which would be the best-suited model for the selected use cases?

The best practice is to start with the most powerful and advanced language model, such as GPT-4, to test whether your use case is even possible. Once the use case is confirmed as possible, trickle down to simpler models to assess their efficacy and efficiency. If a simpler model can get 90% of the way there with just a little prompt engineering, then you can optimize for cost.
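
The selection approach described above can be sketched as a small routine: check feasibility with the strongest model first, then pick the cheapest model that keeps most of that quality. Everything here is hypothetical, including the model names, prices, quality scores, and the `choose_model` helper; in practice the scores would come from your own evaluation on your own use case.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    cost_per_1k_tokens: float  # illustrative price, not a real quote

def choose_model(candidates, evaluate, min_quality, keep_fraction=0.9):
    """Pick the cheapest candidate that retains most of the best model's quality.

    Assumptions: `candidates[0]` is the most capable model, and `evaluate`
    returns a quality score between 0 and 1 for a candidate.
    """
    best_score = evaluate(candidates[0])
    if best_score < min_quality:
        return None  # even the strongest model cannot handle the use case
    acceptable = [c for c in candidates if evaluate(c) >= keep_fraction * best_score]
    return min(acceptable, key=lambda c: c.cost_per_1k_tokens)

# Made-up candidates and evaluation scores, for illustration only.
CANDIDATES = [
    Candidate("large", 30.0),
    Candidate("medium", 2.0),
    Candidate("small", 0.5),
]
SCORES = {"large": 0.95, "medium": 0.88, "small": 0.60}

choice = choose_model(CANDIDATES, lambda c: SCORES[c.name], min_quality=0.8)
print(choice.name)  # prints medium
```

Here the medium model reaches more than 90% of the large model's score at a fraction of the cost, so it is selected; the small model is cheaper still but falls below the quality bar.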

Organizations would have to define what quality means to them. It could be the model's output, its core response, or performance in terms of latency, where the quality of the output may not be as important as how quickly we can respond back to the user.

The key for leaders is to pay close attention to the potential use cases, test them with the best model and then optimize the model to balance the cost, efficacy and efficiency factors.

Production Team
Arvind Ravishunkar, Ankit Pandey, Chandan Jha

Co-create for collective wisdom

This is your invitation to become an integral part of our Think Tank community. Co-create with us to bring diverse perspectives and enrich our pool of collective wisdom. Your insights could be the spark that ignites transformative conversations.
