Generative AI
Are AI Hallucinations Getting Better or Worse? We Analyzed the Data
07 January 2026
BY SCOTT M. GRAFFIUS | ScottGraffius.com

Graffius has played a key role in the Project Management Institute (PMI) in developing professional standards. He was a member of multiple teams that authored, reviewed, and produced:
Additional details are here.
He was also a subject matter expert reviewer of content for the PMI’s Congress. Beyond the PMI, Graffius also served as a member of the review team for two of the Scrum Alliance’s Global Scrum Gatherings.
Acclaimed Authority on Teamwork Tradecraft

Graffius is a renowned authority on teamwork tradecraft. Informed by the research of Bruce W. Tuckman and Mary Ann C. Jensen, over 150 subsequent studies, and Graffius' first-hand professional experience with, and analysis of, team leadership and performance, Graffius created his "Phases of Team Development" intellectual property as a unique perspective and visual conveying the five phases of team development. First introduced in 2008 and periodically updated, his work provides a diagnostic and strategic guide for navigating team dynamics. It provides actionable insights for leaders across industries to develop high-performance teams. Its adoption by esteemed organizations such as Yale University, IEEE, Cisco, Microsoft, Ford, Oracle, Broadcom, the U.S. National Park Service, and the Journal of Neurosurgery, among others, highlights its utility and value, solidifying its status as an indispensable resource for elevating team performance and driving organizational excellence. In 2026, Graffius added human-AI teamwork—including the "exotic team dynamics" which emerge when advanced AI collaborates as a teammate—to his "Phases of Team Development."
The 2026 edition of Graffius' "Phases of Team Development" intellectual property is here.
Expert on Temporal Dynamics on Social Media Platforms

Graffius is also an authority on temporal dynamics on social media platforms. His 'Lifespan (Half-Life) of Social Media Posts' research—first published in 2018 and updated annually—delivers a precise quantitative analysis of post longevity across digital platforms, utilizing advanced statistical techniques to determine mean half-life with precision. It establishes a solid empirical base, effectively highlighting the ephemeral nature of content within social media ecosystems. Referenced and applied by leading entities such as the Center for Direct Marketing, Fast Company, GoDaddy, Pinterest Inc., and PNAS, among others, his research exemplifies methodological rigor and sustained significance in the field of digital informatics.
The 2025 edition of Graffius' "Lifespan (Half-Life) of Social Media Posts" research is here.
Education and Professional Certifications
Graffius has a bachelor’s degree in psychology with a focus in Human Factors. He holds eight professional certifications:
He is an active member of the Scrum Alliance, the Project Management Institute (PMI), and the Institute of Electrical and Electronics Engineers (IEEE).
Advancing AI, Agile, and Project/PMO Management
Scott M. Graffius continues to advance the fields of AI, Agile, and Project/PMO Management through his leadership, research, writing, and real-world impact. Businesses and other organizations leverage Graffius’ insights to drive their success.
Discover Scott’s Books
Connect with and follow Scott on LinkedIn, X, YouTube, Facebook, Threads, Bluesky, Mastodon, and ResearchGate.













You may be interested in these stories.
Lessons from Yesterday’s Tomorrowland
Scott M. Graffius’ Update to His "Phases of Team Development” Coming Early 2026
A Data-Driven Analysis of the Evolution of Project Management: Tasks, Trends, and AI
Actionable Insights. Global Impact. Scott M. Graffius.
Beep Beep! Why Wile E. Coyote Is the Patron Saint of AI Failure
Social Media Half-Life Research Cited in Prestigious Peer-Reviewed Medical Journal
Navigating the Spectrum of Advanced AI – Agentic, Autonomous, and Autopoietic
Lessons from Unhinged AI in Fiction: What Rogue AIs in Sci-Fi Storytelling, Films, and TV Shows Reveal About Us
Scott M. Graffius Contributed to and Reviewed the PMBOK Guide, 8th Edition — the Global Standard for Project Management
UK Sports Institute Features Teamwork and Leadership Work of Scott M. Graffius
Definitions of Advanced AIs: Agentic, Autonomous, and Autopoietic
HAN University of Applied Sciences Features the Work of Scott M. Graffius
Gifts That Inspire Joy
Evergreen Echoes: How Pinterest Inc. and Pinterest Japan Spotlighted Scott M. Graffius’ Research on the Half-Life of Social Media Content
Scott M. Graffius’ Team Development Work Lights Up the University of Tasmania’s Curriculum
Taiwan’s Leading Outlet for Technology and Innovation Spotlights Graffius’ Research on Social Media Temporal Dynamics
Scott M. Graffius’ Insights on AI, Agile, Gaming, XR, and Defense Transformation Cited by MSN in Their Coverage of Innovation and Leadership in the Sector
Top Predictions of 2024, Tested in 2025
Government of Finland Agency Features Graffius' Phases of Team Development IP
Scott M. Graffius Premieres His New "Exotic Team Dynamics: Human-AI Collaboration" Talk at Corporate Event in Las Vegas
Environmental Science Journal References Scott M. Graffius’ Project Management Work
AI Institute of Switzerland Features Scott M. Graffius’ Work on Algorithms
Exotic Team Dynamics: The New Frontier of Human–AI Collaboration
Scott M. Graffius’ Work Featured at ACM DIS ’25
Climbing the Ladder: The Head of Agile/PMO’s Organizational Proximity to the CEO is Closer Than Ever
Harness Sci-Fi and Speculative Design While Embracing Imperfection to Drive Innovation and Proactively Predict and Prepare for the Future
Scott M. Graffius’ “Agile Scrum” Book Featured in a Publication of the Associação Nacional de Educação Católica do Brasil (ANEC)
Setting Direction with OKRs and Tracking Progress with KPIs: A Guide for Agile, Project Management, and Tech Teams
AI Showcase Showdown: Ranking AI Accuracy on Project Management Basics
BookAuthority Features “Agile Scrum” by Scott M. Graffius in “10 Agile Software Development Books That Define the Field”
Exploring Team Dynamics, Adaptability, and Creative Problem-Solving Through Felix the Cat’s Metaphorical Toolkit
Mind Games and Master Plans: How PsyOps Exploit Psychological Phenomena
The “Pants-on-Fire Index for AI”
16 Causes of Technical Debt and How to Avoid It
PMI’s Infinity AI Gets the Basics of Team Development Alarmingly and Repeatedly Wrong
Agile for Unicorns: 7 Keys to Thrive; Scott M. Graffius’ Workshop
Scott M. Graffius’ “Agile Scrum: Your Quick Start Guide with Step-by-Step Instructions” Featured in Prestigious National Academies of Sciences Publication
Meta and Anduril’s EagleEye and the Future of XR: How Gaming, AI, and Agile Are Transforming Defense
"Agile Protocol" and the Power of Satire, Parody, and Humor
AI in Agile: Benefits, Risks, Outlook
Hostinger Highlights Scott M. Graffius’ “Agile Protocol” Book in Feature on Software Projects
French Ministry of Culture Links to Scott M. Graffius Research in Their Guide for Responsible Digital Communication
National Science Foundation’s LTER Network Features Scott M. Graffius’ Phases of Team Development IP
Waterfall vs. Agile: What’s Fixed, What’s Flexible, and Why It Matters
Scott M. Graffius Delivering $2.3 Billion in Value Through AI, Agile, and Project/PMO Leadership
The Problem with Heroes in Agile
The 3 Vital Rules of Science: What They Are and Why They Matter
“Agile Protocol: The Transformation Ultimatum” Lands on the Amazon Best Sellers List!
Scott M. Graffius Recognized as a Top Thought Leader in Digital Disruption by Thinkers360
“Agile Protocol: The Transformation Ultimatum” by Scott M. Graffius Crashes the Book Scene with Satirical Firepower
NCKU in Taiwan Integrates Graffius' 'Phases of Team Development' into Its Curriculum
The Art and Science of Aligning Initiatives with Strategic Objectives
RGPV University Adds Scott M. Graffius’ "Agile Scrum: Your Quick Start Guide with Step-by-Step Instructions" to Its Syllabus
Introducing ‘Engage or Fade: Decoding the Half-Life of Digital Resonance’ – A New Talk by Scott M. Graffius
Dropbox Company (Nira) Taps into Scott M. Graffius’ Expertise in Strategic Planning
Reporting Errors in a Publication: A Case Study on ‘Frontiers in Public Health’
NESEA Conference Session on Innovation Highlights Scott M. Graffius' 'Phases of Team Development'
U.S. Soccer Scores with Scott M. Graffius' Intellectual Property on Teamwork
U.S. Department of Commerce Partner (IEDC) Features Scott M. Graffius’ Intellectual Property
How Long Do Your Posts Live? AI Critiques Scott M. Graffius’ Research on the Half-Life of Social Media
Agile's Journey Through the Decades
Scott M. Graffius' Role in Advancing Project Management Institute (PMI) Standards Excellence
Most Valuable IT Certifications: Update for 2025
The PMI + Agile Alliance Merger: A Recipe for Success?
When Agile, AI, and Strategic Thinking Converge
Scott M. Graffius’ Phases of Team Development: 2025 Update
Lifespan (Half-Life) of Social Media Posts: Update for 2025
Verizon Features Scott M. Graffius
Do Not Read This Article! An Exploration of the Streisand Effect and Other Phenomena
Scott M. Graffius' 'Phases of Team Development' was Spotlighted in Journal of Neurosurgery
Pinterest Japan Uses Graffius’ Research on Temporal Dynamics on Social Media Platforms
Hochschule Coburg (Coburg University) Germany Uses Scott M. Graffius’ Phases of Team Development IP in Coursework on Agile Development
Wild World of Team Dynamics [Updated Two-Minute Video]
EU Europass Teacher Academy Features Scott M. Graffius’ ‘Phases of Team Development’ in Leadership Training
Side-by-Side Comparison of Retrospectives and Hot Washes
Constructor University 2024 Advanced Software Technology Handbook References Scott M. Graffius' Work on Team Dynamics
Strategies for Medical Team Success Featured Scott M. Graffius’ ‘Phases of Team Development’ Intellectual Property
SBG Neumark — Europe’s Largest Distribution Transformer Plant — Powers Up with Scott M. Graffius’ Intellectual Property
Scott M. Graffius’ Intellectual Property was Employed by the NHS — the Largest Single-Payer Healthcare System in Europe
Luxury Unwrapped: The Ultimate Holiday Gift Guide for Every Budget
Singapore Institute of Technology Features Work of Scott M. Graffius
Tufts University Features Scott M. Graffius 'Phases of Team Development' Intellectual Property
'Cat Herders': Retelling the Massive Success Story
Pennsylvania State Agency Used Scott M. Graffius' Intellectual Property
Copyright Infringement in a Book Published by AuthorHouse / Author Solutions / The Najafi Companies: Publisher Fails to Respond or Take Required Action
Pinterest Inc. References Scott M. Graffius’ Research
Bournemouth University Used Scott M. Graffius’ Intellectual Property
‘Comparative Methodological Guidelines: Handbook for Educators’ Violates Scott M. Graffius’ Copyright
Japan Backlog User Group Event Featured Scott M. Graffius’ ‘Phases of Team Development’
TurningWest's 'Trial'
Radio Silence from the American Association of Neurological Surgeons on Report of Blatant Plagiarism in Their ‘Journal of Neurosurgery’ Publication
Supplement to Graffius' 'Lifespan (Half-Life) of Social Media Posts' Research: Typical Engagement Distribution Pattern for Social Media Posts
How Algorithms Shape the User Experience on Social Media Platforms
Thinkers360 Named Scott M. Graffius a Top Thought Leader on Agile
'Maximizing LinkedIn for Business Growth' Book References and Incorporates Scott M. Graffius' 'Lifespan (Half-Life) of Social Media Posts' Research
Bayer Employs Scott M. Graffius' Intellectual Property
FINAT (Fédération Internationale des Fabricants et Transformateurs d'Adhésifs et Thermocollants sur Papiers et Autres Supports) Features Scott M. Graffius’ Intellectual Property
Broadcom Features Scott M. Graffius' Intellectual Property
Agile Project Management: Insights from Scott M. Graffius in ‘Managing Information Technology’ Book
Harvard Medical School Talk Featured Insights by Scott M. Graffius
More articles are listed here.

Graffius, S. M. (2026, January 7). Are AI Hallucinations Getting Better or Worse? We Analyzed the Data. ScottGraffius.com. https://scottgraffius.com/blog/files/ai-hallucinations-2026.html

DOI: (coming soon)

Names, marks, and content are the property of their respective owners.

This is the extended list of tags and hashtags for this article:

If there are any supplements or updates to this article after the date of publication, they will appear here.

Copyright © Scott M. Graffius. All rights reserved.
Content on this site—including text, images, videos, and data—may not be used for training or input into any artificial intelligence, machine learning, or automatized learning systems, or published, broadcast, rewritten, or redistributed without the express written permission of Scott M. Graffius.


This publication is organized into the following parts:
- Main Article
- References
- About Scott M. Graffius
- List of Additional Articles, How to Cite This Article, DOI, Content Acknowledgements, Tags & Hashtags, Post-Publication Notes, Copyright
Introduction
AI systems such as ChatGPT and its competitors sometimes produce answers that sound confident but are wrong. In an earlier article, Scott M. Graffius encapsulated it as: "Generative AI can dazzle. However, it’s prone to deliver fiction as fact, a phenomenon known as AI hallucinations" (Graffius, 2025).
Hallucinations can arise from multiple stages in the AI lifecycle, including data collection, model architecture, training processes, and inference. Key contributing factors include:
- Biases, incompleteness, or noise in the training data, which can lead to overgeneralization or pattern misinterpretation.
- Model overfitting or architectural limitations in autoregressive decoding, where probabilistic predictions prioritize fluency over factual fidelity.
- Lack of grounding in real-world knowledge during generation, exacerbated by the absence of explicit reasoning mechanisms.
The danger to users is that AI hallucinations present fiction as fact—confidently, fluently, and persuasively. Left unchecked, these errors can cause harm. "When AI gets things wrong, using its output can spread false information, damage reputations, and create other issues" (Graffius, 2025).
There’s no single universal metric for AI hallucinations, but many researchers focus on the percentage of responses containing at least one hallucinated claim. This review uses that measurement.
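As a minimal sketch of that measurement, the following Python computes the share of responses flagged with at least one hallucinated claim. The `ScoredResponse` structure, the function name, and the sample data are hypothetical and purely illustrative; real benchmarks differ in how claims are extracted and judged.

```python
from dataclasses import dataclass


@dataclass
class ScoredResponse:
    """One model response plus the number of its claims judged hallucinated (hypothetical structure)."""
    response_id: str
    hallucinated_claims: int


def hallucination_rate(responses: list[ScoredResponse]) -> float:
    """Return the percentage of responses containing at least one hallucinated claim."""
    if not responses:
        raise ValueError("at least one scored response is required")
    flagged = sum(1 for r in responses if r.hallucinated_claims >= 1)
    return 100.0 * flagged / len(responses)


# Illustrative data only: 2 of the 4 responses contain a hallucinated claim.
sample = [
    ScoredResponse("r1", 0),
    ScoredResponse("r2", 2),
    ScoredResponse("r3", 0),
    ScoredResponse("r4", 1),
]
print(f"{hallucination_rate(sample):.1f}%")
```

Note that a response with several hallucinated claims counts once, so this metric reflects how often responses go wrong rather than how many errors each one contains.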
Different benchmarks test distinct situations. Vectara's Hughes Hallucination Evaluation Model (HHEM) leaderboard focuses on document summarization—how faithfully a model sticks to a provided source (Vectara, 2025). Others, such as SimpleQA and PersonQA, probe general factual accuracy on short, open-ended questions (OpenAI, 2025). This review draws from a range of such tests, reported by multiple sources.
As AI systems improve and become more ubiquitous, a pressing question is: Are AI hallucinations getting better or worse? As detailed next, data from 2024 through 2025 is mixed. On tightly controlled tasks, hallucinations are declining. However, they’ve spiked on more complex tasks.
AI Hallucinations in 2024
In 2024, leading models exhibited hallucination rates in the range of 1-3% on standardized, grounded benchmarks (Stanford HAI, 2024; Vectara, 2025). But the picture was less rosy outside those settings. Domain-specific evaluations (such as scientific, medical, and technical analysis) often reported hallucination rates of 10-20% or higher (Chelli et al., 2024).
AI Hallucinations in 2025
In 2025, hallucination rates diverged sharply depending on what the models were asked to do.
On apples-to-apples benchmarks, such as Vectara's summarization leaderboard, performance improved across the board. Several top models dropped below 1%, including Google’s Gemini-2.0-Flash at roughly 0.7%, with OpenAI and Gemini variants clustering around 0.8–1.5% (Vectara, 2025; AllAboutAI, 2025). For grounded tasks, where the model can anchor its output to a source document, hallucinations have become less frequent over time.
However, newer reasoning-focused models tell a different story. Systems optimized for complex chain-of-thought reasoning hallucinate more on open-ended factual benchmarks. OpenAI’s o3 series, for example, showed hallucination rates of 33-51% on PersonQA and SimpleQA. That’s more than double the rate of the earlier o1 models, which hovered around 16% (OpenAI, 2025; Techopedia, 2025). Broader evaluations in 2025 reflect this shift. Across task sets containing both simple and complex cases, hallucination rates are commonly 3-20% or higher (Stanford HAI, 2025).
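The "more than double" comparison can be checked with simple arithmetic. The sketch below uses only the figures quoted above (33% and 51% for o3, roughly 16% for o1); the function name is hypothetical, and treating 16% as the single o1 baseline for both ends of the o3 range is a simplifying assumption.

```python
def fold_change(new_rate_pct: float, old_rate_pct: float) -> float:
    """Return how many times larger the new rate is than the old rate."""
    if old_rate_pct <= 0:
        raise ValueError("old_rate_pct must be positive")
    return new_rate_pct / old_rate_pct


O1_RATE_PCT = 16.0  # approximate o1 figure quoted in the text

# Low and high ends of the o3 range quoted in the text.
for o3_rate_pct in (33.0, 51.0):
    ratio = fold_change(o3_rate_pct, O1_RATE_PCT)
    print(f"{o3_rate_pct:.0f}% vs {O1_RATE_PCT:.0f}%: {ratio:.2f}x")
```

Both ends of the quoted o3 range come out above a 2x ratio against the 16% baseline.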
Conclusion
On comparable benchmarks, hallucinations are declining year-over-year for non-complex cases. Top models dropped from roughly 1–3% in 2024 to 0.7–1.5% in 2025 on grounded summarization tasks. However, hallucinations remain high in complex reasoning and open-domain factual recall, where rates can exceed 33%.
The silver lining is mitigation. Retrieval-Augmented Generation (RAG), which forces models to ground answers in external documents, can reduce hallucinations by 40-71% in many scenarios (AIMultiple, 2025; AboutChromebooks, 2025). Industry guides also recommend complementary best practices, such as domain-specific fine-tuning, careful prompt design to constrain speculation, and instructing models to cite sources or admit uncertainty.
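To make the 40-71% figure concrete, here is a small worked calculation. It assumes the reduction is relative (a percentage of the baseline rate, not percentage points), which the cited sources may define differently. The 20% baseline is a hypothetical starting point taken from the upper end of the domain-specific range mentioned earlier, and the function name is illustrative.

```python
def rate_after_mitigation(baseline_pct: float, relative_reduction_pct: float) -> float:
    """Apply a relative reduction (in percent) to a baseline hallucination rate (in percent)."""
    if not 0 <= relative_reduction_pct <= 100:
        raise ValueError("relative_reduction_pct must be between 0 and 100")
    return baseline_pct * (1 - relative_reduction_pct / 100)


BASELINE_PCT = 20.0  # hypothetical baseline, not a measured result

# Low and high ends of the reduction range quoted in the text.
for reduction_pct in (40.0, 71.0):
    remaining = rate_after_mitigation(BASELINE_PCT, reduction_pct)
    print(f"{reduction_pct:.0f}% reduction: {BASELINE_PCT:.0f}% -> {remaining:.1f}%")
```

Under these assumptions, a 20% baseline would fall to somewhere between about 5.8% and 12%, which illustrates why mitigation lowers the risk without eliminating it.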
Researchers are also deploying layered defenses, such as multi-stage verification systems (Garcia-Fernandez et al., 2025), continuous detection pipelines (Anaokar et al., 2025), and domain-specific validators (Vangala et al., 2025; Bang et al., 2025).
The picture is mixed, but the takeaway is clear. AI hallucinations are evolving from a blanket failure mode into a situational risk. Where grounding is strong and tasks are constrained, the frequency of hallucinations drops. Where reasoning is expansive and factual recall is open-ended, they surge. Hallucinations will likely persist for the foreseeable future. Managing them requires situational awareness, vigilance, smarter evaluation, layered safeguards, mitigation strategies, and informed human oversight.
Take Action
This article provides a foundational overview. For in-depth guidance on AI, including human-AI teamwork (where the AI is advanced—agentic, autonomous, or autopoietic) and the "exotic team dynamics" which emerge, contact Scott M. Graffius. To request a consultation, speaking engagement, or other work, complete a request form or email him today.

References
- AboutChromebooks. (2025). AI hallucination rates across different models in 2025. https://www.aboutchromebooks.com/ai-hallucination-rates-across-different-models/
- AIMultiple. (2025). AI hallucination: Compare top LLMs. https://research.aimultiple.com/ai-hallucination/
- AllAboutAI. (2025). AI hallucination report 2025: Which AI hallucinates the most? https://www.allaboutai.com/resources/ai-statistics/ai-hallucinations/
- Anaokar, S., Ganatra, S., Kashid, H., Bhattacharyya, S., Nair, S., Sekhar, R., Manohar, S., Hemrajani, R., & Bhattacharyya, P. (2025). HalluDetect: Detecting, mitigating, and benchmarking hallucinations in conversational systems. arXiv. https://arxiv.org/abs/2509.11619
- Bang, Y., Ji, Z., Schelten, A., Hartshorn, A., Fowler, T., Zhang, C., Cancedda, N., & Fung, P. (2025). HalluLens: LLM hallucination benchmark. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics. https://aclanthology.org/2025.acl-long.1176/
- Chelli, M., Descamps, J., Lavoué, V., Trojani, C., Azar, M., Deckert, M., Raynier, J. L., Clowez, G., Boileau, P., & Ruetsch‑Chelli, C. (2024). Hallucination rates and reference accuracy of ChatGPT and Bard for systematic reviews: Comparative analysis. Journal of Medical Internet Research, 26, e53164. https://www.jmir.org/2024/1/e53164/
- Garcia-Fernandez, C., Felipe, L., Shotande, M., Zitu, M., Tripathi, A., Rasool, G., El Naqa, I., Rudrapatna, V., & Valdes, G. (2025). Trustworthy AI for medicine: Continuous hallucination detection and elimination with CHECK. arXiv. https://arxiv.org/abs/2506.11129
- Graffius, S. M. (2025, June 25). The "Pants-on-Fire Index for AI". ScottGraffius.com. https://scottgraffius.com/blog/files/pants-on-fire-index-for-ai.html
- OpenAI. (2025). OpenAI o3 and o4-mini system card. https://cdn.openai.com/pdf/2221c875-02dc-4789-800b-e7758f3722c1/o3-and-o4-mini-system-card.pdf
- Stanford Human-Centered Artificial Intelligence Initiative. (2024). Artificial Intelligence Index Report 2024. https://hai.stanford.edu/ai-index/2024-ai-index-report
- Stanford Human-Centered Artificial Intelligence Initiative. (2025). Artificial Intelligence Index Report 2025. https://hai.stanford.edu/ai-index/2025-ai-index-report
- Techopedia. (2025). 48% error rate: AI hallucinations rise in 2025 reasoning systems. https://www.techopedia.com/ai-hallucinations-rise
- Vangala, B. P., Mahmud, S., Neupane, P., Selvaraj, J., & Cheng, J. (2025). HalluMat: Detecting hallucinations in LLM-generated materials science content. arXiv. https://arxiv.org/abs/2512.22396
- Vectara. (2025). Hallucination leaderboard – HHEM evaluation model. https://github.com/vectara/hallucination-leaderboard

About Scott M. Graffius

Scott M. Graffius is a strategic transformation leader who drives AI, Agile, and broader business and technology initiatives to deliver measurable value across projects, programs, portfolios, and PMOs. He is an expert in the teamwork tradecraft of both human and human-AI teams, including the “exotic team dynamics” that emerge. He is also an authority on the temporal patterns of social media, including the half-life of audience engagement.
He’s a practitioner, researcher, thought leader, award-winning author, and keynote speaker who’s taken the stage at 96 conferences and other events across 25 countries.
He’s delivered over $2.3 billion in value for Fortune 500 companies and other leaders in technology, entertainment, financial services, healthcare, and beyond.
Businesses, professional associations, government agencies, and universities use Graffius and feature his work. Examples include Adobe, Bayer, Boston University, Ford, Gartner, Harvard Medical School, IEEE, Johns Hopkins University, Microsoft, MSN, National Academy of Sciences, Oracle, Pinterest Inc., Project Management Institute, UC San Diego, Verizon, Yale University, and others.
The following sections provide additional information on his experience, contributions, and influence.
Experience
Graffius heads the professional services firm Exceptional PPM and PMO Solutions, along with its subsidiary Exceptional Agility. These consultancies offer strategic and tactical advisory, training, embedded expertise, and consulting services to the public, private, and government sectors. They help organizations enhance their capabilities and results in agile, project management, program management, portfolio management, and PMO leadership, supporting innovation and driving competitive advantage. The consultancies confidently back services with a Delighted Client Guarantee™.
Graffius is a former VP of project management with a publicly traded provider of diverse consumer products and services over the Internet. Before that, he ran and supervised the delivery of projects and programs in public and private organizations with businesses ranging from e-commerce to advanced technology products and services, retail, manufacturing, entertainment, and more.
He has experience with consumer, business, reseller, government, and international markets.
Award-Winning Author
Graffius has authored three books.
- Agile Scrum: Your Quick Start Guide with Step-by-Step Instructions, his first book, earned 17 awards.
- Agile Transformation: A Brief Story of How an Entertainment Company Developed New Capabilities and Unlocked Business Agility to Thrive in an Era of Rapid Change, his second book, was named one of the best Scrum books of all time by BookAuthority.
- Agile Protocol: The Transformation Ultimatum, his third book and his first work of fiction, was released in April 2025. The book trailer is on YouTube.
International Public Speaker
Organizations worldwide engage Graffius to present on tech (including AI), Agile, project management, program management, portfolio management, and PMO leadership. He crafts and delivers unique and compelling talks and workshops. Graffius has conducted 96 sessions across 25 countries. Select examples of events include Agile Trends Gov, BSides (Newcastle Upon Tyne), Conf42 Quantum Computing, DevDays Europe, DevOps Institute, DevOpsDays (Geneva), Frug’Agile, IEEE, Microsoft, Scottish Summit, Scrum Alliance RSG (Nepal), Techstars, W Love Games International Video Game Development Conference (Helsinki), and more.
With an average rating of 4.81 (on a scale of 1-5), sessions are highly valued.
The speaker engagement request form is here.
Thought Leadership and Influence
Prominent businesses, professional associations, government agencies, and universities have showcased Graffius and his contributions—spanning his books, talks, workshops, and beyond. Select examples include:
- Adobe,
- American Management Association,
- Amsterdam Public Health Research Institute,
- Bayer,
- BMC Software,
- Boston University,
- Broadcom,
- Cisco,
- Coburg University of Applied Sciences and Arts - Germany,
- Computer Weekly,
- Constructor University - Germany,
- Data Governance Success,
- Deimos Aerospace,
- DevOps Institute,
- Dropbox,
- EU's European Commission,
- Ford Motor Company,
- Gartner,
- GoDaddy,
- Harvard Medical School,
- Hasso Plattner Institute - Germany,
- IEEE,
- Innovation Project Management,
- Johns Hopkins University,
- Journal of Neurosurgery,
- Lam Research (Semiconductors),
- Leadership Worthy,
- Life Sciences Trainers and Educators Network,
- London South Bank University,
- Microsoft,
- MSN,
- NASSCOM,
- National Academy of Sciences,
- New Zealand Government,
- Oracle,
- Pinterest Inc.,
- Project Management Institute,
- Mary Raum (Professor of National Security Affairs, United States Naval War College),
- SANS Institute,
- SBG Neumark - Germany,
- Singapore Institute of Technology,
- Torrens University - Australia,
- TBS Switzerland,
- Tufts University,
- UC San Diego,
- UK Sports Institute,
- University of Galway - Ireland,
- US Department of Energy,
- US National Park Service,
- US Soccer,
- US Tennis Association,
- Verizon,
- Wrike,
- Yale University,
- and many others.
Graffius has played a key role in developing professional standards at the Project Management Institute (PMI). He was a member of multiple teams that authored, reviewed, and produced:
- Practice Standard for Work Breakdown Structures—Second Edition.
- A Guide to the Project Management Body of Knowledge—Sixth Edition.
- The Standard for Program Management—Fourth Edition.
- The Practice Standard for Project Estimating—Second Edition.
Additional details are here.
He was also a subject matter expert reviewer of content for the PMI’s Congress. Beyond the PMI, Graffius also served as a member of the review team for two of the Scrum Alliance’s Global Scrum Gatherings.
Acclaimed Authority on Teamwork Tradecraft

Graffius is a renowned authority on teamwork tradecraft. Informed by the research of Bruce W. Tuckman and Mary Ann C. Jensen, over 150 subsequent studies, and his own first-hand professional experience with, and analysis of, team leadership and performance, Graffius created his "Phases of Team Development" intellectual property as a unique perspective and visual conveying the five phases of team development. First introduced in 2008 and periodically updated, his work provides a diagnostic and strategic guide for navigating team dynamics. It provides actionable insights for leaders across industries to develop high-performance teams. Its adoption by esteemed organizations such as Yale University, IEEE, Cisco, Microsoft, Ford, Oracle, Broadcom, the U.S. National Park Service, and the Journal of Neurosurgery, among others, highlights its utility and value, solidifying its status as an indispensable resource for elevating team performance and driving organizational excellence. In 2026, Graffius added human-AI teamwork, including the "exotic team dynamics" that emerge when advanced AI collaborates as a teammate, to his "Phases of Team Development."
The 2026 edition of Graffius' "Phases of Team Development" intellectual property is here.
Expert on Temporal Dynamics on Social Media Platforms

Graffius is also an authority on temporal dynamics on social media platforms. His "Lifespan (Half-Life) of Social Media Posts" research, first published in 2018 and updated annually, delivers a quantitative analysis of post longevity across digital platforms, utilizing advanced statistical techniques to determine mean half-life with precision. It establishes a solid empirical base, effectively highlighting the ephemeral nature of content within social media ecosystems. Referenced and applied by leading entities such as the Center for Direct Marketing, Fast Company, GoDaddy, Pinterest Inc., and PNAS, among others, his research exemplifies methodological rigor and sustained significance in the field of digital informatics.
The 2025 edition of Graffius' "Lifespan (Half-Life) of Social Media Posts" research is here.
Education and Professional Certifications
Graffius has a bachelor’s degree in psychology with a focus on Human Factors. He holds eight professional certifications:
- Certified SAFe 6 Agilist (SA),
- Certified Scrum Professional - ScrumMaster (CSP-SM),
- Certified Scrum Professional - Product Owner (CSP-PO),
- Certified ScrumMaster (CSM),
- Certified Scrum Product Owner (CSPO),
- Project Management Professional (PMP),
- Lean Six Sigma Green Belt (LSSGB), and
- IT Service Management Foundation (ITIL).
He is an active member of the Scrum Alliance, the Project Management Institute (PMI), and the Institute of Electrical and Electronics Engineers (IEEE).
Advancing AI, Agile, and Project/PMO Management
Scott M. Graffius continues to advance the fields of AI, Agile, and Project/PMO Management through his leadership, research, writing, and real-world impact. Businesses and other organizations leverage Graffius’ insights to drive their success.
Discover Scott’s Books
- Agile Scrum: Your Quick Start Guide with Step-by-Step Instructions — Deliver Products in Short Cycles with Rapid Adaptation to Change, Fast Time-to-Market, and Continuous Improvement
- Agile Transformation: A Brief Story of How an Entertainment Company Developed New Capabilities and Unlocked Business Agility to Thrive in an Era of Rapid Change
- Agile Protocol: The Transformation Ultimatum
Connect with and follow Scott on LinkedIn, X, YouTube, Facebook, Threads, Bluesky, Mastodon, and ResearchGate.














How to Cite This Article
Graffius, S. M. (2026, January 7). Are AI Hallucinations Getting Better or Worse? We Analyzed the Data. ScottGraffius.com. https://scottgraffius.com/blog/files/ai-hallucinations-2026.html

Digital Object Identifier (DOI)
DOI: (coming soon)

Content Acknowledgements
Names, marks, and content are the property of their respective owners.


Post-Publication Notes
If there are any supplements or updates to this article after the date of publication, they will appear here.

Copyright
Copyright © Scott M. Graffius. All rights reserved.
Content on this site—including text, images, videos, and data—may not be used for training or input into any artificial intelligence, machine learning, or automatized learning systems, or published, broadcast, rewritten, or redistributed without the express written permission of Scott M. Graffius.

