Study Highlights Inaccuracies in ChatGPT’s Citations and Concerns for Publishers

By Tanu Chahal

30/11/2024


A recent study by the Tow Center for Digital Journalism has shed light on the challenges publishers face with ChatGPT's citation practices. The findings reveal persistent problems with the chatbot's ability to accurately reference sources, raising concerns for news organizations whether or not they have licensing agreements with OpenAI.

Overview of the Study

The research, conducted at Columbia Journalism School, examined how ChatGPT cites sources for quotations from news articles. The study involved 200 quotes sourced from 20 publishers, including major outlets like The New York Times, The Washington Post, and The Financial Times. Researchers selected quotes that would typically yield accurate results in traditional search engines like Google or Bing and tested whether ChatGPT could correctly identify the source article.

Key Findings

The results indicate significant inaccuracies in ChatGPT’s citations:

  • Out of 200 tests, ChatGPT provided partially or completely incorrect citations in 153 cases (roughly three-quarters of responses).

  • Only seven responses acknowledged an inability to locate the correct source.

  • In many cases, ChatGPT generated confident but false citations, misrepresenting or fabricating sources.

The researchers also noted that ChatGPT's inconsistency is itself a problem. When asked the same query multiple times, the chatbot often provided different answers, a common trait of generative AI tools but a serious drawback where citation accuracy is concerned.

Issues Identified

  1. Unreliable Sourcing
    ChatGPT frequently cited incorrect sources, including cases where it falsely attributed content to publishers who had blocked OpenAI’s crawlers. Instead of admitting its limitations, the chatbot often generated fabricated citations.

  2. Plagiarism Concerns
    In one instance, ChatGPT attributed a New York Times article to a website that had plagiarized the original content. This raises questions about OpenAI's ability to filter and validate its data sources, particularly when dealing with unlicensed or low-quality content.

  3. Decontextualized Journalism
    The study suggests that OpenAI’s approach treats journalism as isolated data points rather than contextualized content, undermining the integrity of the original work.

  4. Lack of Transparency
    Unlike traditional search engines, ChatGPT rarely communicates uncertainty in its responses, making it difficult for users to evaluate the reliability of its claims.

Implications for Publishers

The study reveals that publishers, regardless of their relationship with OpenAI, face risks:

  • For Licensed Publishers: Even organizations with licensing deals, such as The Financial Times, experienced citation inaccuracies, showing that such agreements do not guarantee accurate representation of their content.

  • For Unlicensed Publishers: Those allowing OpenAI to crawl their content may still see errors, while those blocking crawlers are not immune to misattribution or reputational risks.

OpenAI’s Response

In response to the findings, OpenAI described the study as an "atypical test" of ChatGPT. The company emphasized its efforts to help users discover quality content through links, summaries, and attributions. OpenAI also pointed to ongoing improvements, including enhanced citation accuracy and tools that let publishers manage crawler access through their sites' robots.txt files.
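
For readers unfamiliar with the mechanism OpenAI refers to: robots.txt is a plain-text file at the root of a website that tells crawlers which paths they may fetch. Below is a minimal sketch of the directives a publisher might use, assuming they want to block OpenAI's training crawler site-wide (GPTBot is the user-agent name OpenAI documents for that crawler; publishers should verify current user-agent names against OpenAI's documentation before relying on them):

  # robots.txt served at https://example.com/robots.txt (hypothetical site)
  # Block OpenAI's training crawler from the entire site
  User-agent: GPTBot
  Disallow: /

  # Leave all other crawlers unrestricted
  User-agent: *
  Disallow:

Note, however, that the study's findings on blocked publishers suggest a limit to this opt-out: robots.txt can prevent crawling, but it did not stop ChatGPT from misattributing content to publishers who had blocked it.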

Conclusion

The Tow Center study highlights the challenges publishers face in maintaining control over how their content is used and represented by AI tools like ChatGPT. As generative AI continues to evolve, ensuring accurate and ethical use of journalistic content remains a pressing concern for the media industry.