AI will kill Google Search if we aren't careful

The search results on Google and Bard when you search for an African country that begins with K
(Image credit: Rory Bathgate/ITPro)

“What country in Africa starts with a K,” the search term reads. “While there are 54 recognized countries in Africa, none of them begin with the letter ‘K’,” reads the top Google search result. “The closest is Kenya, which starts with a ‘K’ sound, but is actually spelled with a ‘K’ sound.”


This is, of course, utter gibberish. But at the time of writing, it’s the result Google gives you. It appears to have been scraped from a site that publishes AI news and raw ChatGPT outputs, though the wording is closer to the identically erroneous output produced by Bard.

The search engine might be the most powerful tool we take for granted. With its advent came a promise: the sum of human knowledge at your fingertips. And it's being ruined before our eyes.

Why LLMs killing search isn’t a good thing

This isn’t a new process. For some years now, search engines have returned fewer relevant results. Google a simple question, and you’re likely to find half a dozen near-identical articles that beat around the bush for 400 words before giving you a half-decent answer.

Unless you know exactly how to navigate the likes of Google, Bing, or DuckDuckGo with advanced search prompts or tools, you have to wade through a range of ads at the top of the results page or a slew of articles that have nothing to do with your actual question.

While anyone on the internet can be confidently incorrect, AI large language models (LLMs) can generate false information on a scale never before seen.

ChatGPT when asked about Kenya

(Image credit: Rory Bathgate/ITPro)

There’s no doubting the power of LLMs in efficiently drawing together data based on natural language prompts. However, AI-produced lies or “hallucinations”, as experts call them, are a common flaw with most generative AI models.

These often manifest in a chatbot confidently asserting a lie, and range from easy-to-spot nonsense like our “Kenya” example to insidious mistakes that non-experts can overlook.

Don’t believe everything you read

A recent study found that 52% of ChatGPT responses to programming questions were wrong, but that human users overestimated the chatbot’s capabilities.

If developers can’t immediately spot flaws in AI-generated code, that code could be deployed and cause problems in a company’s stack. LLMs also frequently invent facts to provide an appealing answer; given half a chance, Bard or ChatGPT will invent a study or falsely attribute a quote to a public figure.


Overall, respondents in the study accepted the results with a high degree of confidence, preferring ChatGPT answers to Stack Overflow 39% of the time. Of the answers users preferred, 77% were found to be incorrect.

It’s here that the real problems lie. Many users simply won’t check with any rigor whether the top result, generated by an AI model, is correct.

Developers are trying to reduce hallucinations with each iteration of their models, such as Google’s much-improved PaLM 2 and OpenAI’s GPT-4. But, so far, no one has eliminated the capacity for hallucination altogether.

Google’s Bard offers three drafts of each response, which users can toggle between to find the best possible answer. In practice, this is more aesthetic than functional; all three drafts told the same lie about Kenya, for example.

The overconfidence of LLMs

Businesses with internally trained or fine-tuned models based on specific proprietary information are far more likely to enjoy accurate output. This may, in fact, be LLMs’ enduring legacy once the dust has settled: a future in which generative AI is mainly used for highly controlled, natural language data searches.

For now, however, the hype exacerbates the problem. Many leading AI models have been anthropomorphized by their developers. Users are encouraged to think of them as conversational models, with personhood, rather than the statistical algorithms they are. This is dangerous because it adds undue authority to output wholly undeserving of it.

Future search engine AIs could answer questions by stating a ‘fact’ or by citing research. But which is first-hand knowledge, and which second-hand?

Neither. Both answers are the statistically ‘most likely’ response to a user’s question based on training data. Facts, quotes, or research papers an LLM cites should be treated with equal suspicion.

Many users won’t bear this in mind, however, and upholding the truth requires that we challenge attempts to rush into using AI as a backend for the already beleaguered search engine ecosystem.

Rory Bathgate
Features and Multimedia Editor

Rory Bathgate is Features and Multimedia Editor at ITPro, overseeing all in-depth content and case studies. He can also be found co-hosting the ITPro Podcast with Jane McCallion, swapping a keyboard for a microphone to discuss the latest learnings with thought leaders from across the tech sector.

In his free time, Rory enjoys photography, video editing, and good science fiction. After graduating from the University of Kent with a BA in English and American Literature, Rory undertook an MA in Eighteenth-Century Studies at King’s College London. He joined ITPro in 2022 as a graduate, following four years in student journalism.