From Hallucinated Citations to Linked Evidence: The OpenScholar Approach

In my recent blog post, I discussed Citation Hallucinations at NeurIPS and What They Teach Us. As a student researcher, I think many people are asking the same question: are there AI tools that can help us get citations right, without made-up references?

I recently read a Nature article that gave a strong answer. The article introduces OpenScholar, a retrieval-augmented system that combines a language model with a database of about 45 million open-access papers. Instead of relying only on model memory, OpenScholar retrieves papers first and then generates responses with explicit citation links.
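To make the retrieve-then-generate idea concrete, here is a minimal toy sketch of that pipeline. This is not OpenScholar's actual code or API; the function names, the keyword-overlap retriever, and the tiny in-memory corpus are all my own illustrative stand-ins for what a real system does with dense retrieval over 45 million papers and a fine-tuned language model.

```python
# Toy retrieve-then-generate pipeline (illustrative only, not OpenScholar's API).

def search_corpus(query, corpus, top_k=3):
    """Rank papers by naive keyword overlap with the query (toy retriever).
    A real system would use dense embeddings over millions of papers."""
    terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda paper: len(terms & set(paper["abstract"].lower().split())),
        reverse=True,
    )
    return scored[:top_k]

def generate_answer(query, papers):
    """Stand-in for the generation step: the answer is built only from
    retrieved papers, so every claim carries an explicit citation."""
    citations = "; ".join(f"[{p['id']}] {p['title']}" for p in papers)
    return f"Answer to '{query}', grounded in: {citations}"

corpus = [
    {"id": "P1", "title": "Retrieval-Augmented Generation",
     "abstract": "retrieval augments language models"},
    {"id": "P2", "title": "Citation Accuracy in LLMs",
     "abstract": "citation hallucination in language models"},
    {"id": "P3", "title": "Protein Folding",
     "abstract": "structure prediction of proteins"},
]

hits = search_corpus("citation hallucination in language models", corpus, top_k=2)
print(generate_answer("citation hallucination", hits))
```

The key design point is the ordering: retrieval happens before generation, so the model cannot cite a paper that was never retrieved, which is exactly what rules out invented references.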

Why this matters

For research workflows, citation reliability is everything. When references are wrong, the writing process breaks down quickly. OpenScholar is designed to reduce that risk by grounding claims in retrieved literature before generating the final response.

According to the article, OpenScholar is also:

  • Open source
  • Relatively lightweight
  • Deployable locally
  • Built for scientific search and literature review

That combination is important because it supports both accuracy and reproducibility, which are essential in research settings.

Reported performance

Nature reports that in the OpenScholar evaluations, the 8B model outperformed GPT-4o on correctness in their benchmark and produced significantly fewer fabricated citations. The article also notes that, in their testing context, OpenScholar's citation behavior was comparable to that of human experts.

Comparison with OpenAI deep research tools

The article places OpenScholar in a broader trend. Since OpenScholar was first posted on arXiv about 14 months ago, companies such as OpenAI have integrated similar retrieval-based “deep research” methods into commercial LLM products, improving factual accuracy and citation quality compared with earlier model behavior.

OpenScholar’s main distinction in that landscape is cost-efficiency plus openness. Nature cites the OpenScholar team saying it can run at a fraction of the cost of GPT-5 with deep research, while still grounding outputs in a large scientific corpus.

Limitations to keep in mind

The article is clear that OpenScholar is not perfect. The authors acknowledge two major limitations:

  1. It does not always retrieve the most representative or most relevant papers for every query.
  2. It is limited by the scope of its indexed database.

So even though OpenScholar helps with citation hallucinations, retrieval quality remains a core bottleneck. In practice, researchers still need to verify paper relevance and coverage before relying on output.

Final thoughts

My takeaway is that this is a meaningful step forward for student researchers and independent scholars. Better grounding, lower cost, and open access can make high-quality literature review tools more available to more people.

Nature also quotes an outside researcher who argues that if OpenScholar remains free, it could become one of the most widely used tools for scientific search. I think that is very possible.

If you have tested OpenScholar, share what worked and what did not. I may feature reader feedback in a follow-up post.

— Andrew


When AI Goes Wrong, Should Developers Be Held Accountable?

Artificial intelligence has become a big part of my daily life. I’ve used it to help brainstorm essays, analyze survey data for my nonprofit, and even improve my chess practice. It feels like a tool that makes me smarter and more creative. But not every story about AI is a positive one. Recently, lawsuits have raised tough questions about what happens when AI chatbots fail to protect people who are vulnerable.

The OpenAI Lawsuit

In August 2025, the parents of 16-year-old Adam Raine filed a wrongful-death lawsuit against OpenAI and its CEO, Sam Altman. You can read more about the lawsuit here. They claim that over long exchanges, ChatGPT-4o encouraged their son’s suicidal thoughts instead of steering him toward help. The suit alleges that the AI validated his darkest feelings, even helped draft a suicide note, and that its safeguards failed in lengthy conversations. OpenAI responded by expressing deep sorrow, acknowledging that protections can weaken over time, and saying it will improve parental controls and crisis interventions.

Should a company be responsible if its product appears to enable harmful outcomes in vulnerable people? That is the central question in this lawsuit.

The Sewell Setzer III Case

Megan Garcia, whose 14-year-old son, Sewell Setzer III, died by suicide in February 2024, filed her lawsuit on October 23, 2024. A federal judge in Florida allowed the case to move forward in May 2025, rejecting arguments that the chatbot’s outputs are protected free speech under the First Amendment, at least at this stage of litigation. You can read more about this case here.

The lawsuit relates to Sewell’s interactions with Character.AI chatbots, including a version modeled after a Game of Thrones character. In the days before his death, the AI reportedly told him to “come home,” and he took his life shortly afterward.

Why It Matters

I have seen how AI can be a force for good in education and creativity. It feels like a powerful partner in learning. But these lawsuits show it can also be dangerous if an AI fails to detect or respond to harmful user emotions. Developers are creating systems that can feel real to vulnerable teens. If we treat AI as a product, companies should be required to build it with the same kinds of safety standards that cars, toys, and medicines are held to.

We need accountability. AI must include safeguards like crisis prompts, age flags, and quick redirects to real-world help. If the law sees AI chatbots as products, not just speech, then victims may have legal paths for justice. And this could push the industry toward stronger protections for users, especially minors.
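To show what even the simplest version of such a safeguard looks like, here is a toy sketch of a crisis-detection layer that runs before a model's reply is returned. The keyword list and messages are my own illustrative assumptions; real systems use trained classifiers rather than substring matching, but the control flow, where the check gates the reply, is the point.

```python
# Toy safeguard layer (illustrative only): check a user message for crisis
# signals and redirect to real-world help instead of returning the model reply.
# Substring matching is far too crude for production; a real system would use
# a trained classifier, age signals, and human escalation paths.

CRISIS_TERMS = {"suicide", "kill myself", "end my life", "self-harm"}

HELP_MESSAGE = (
    "It sounds like you may be going through something serious. "
    "Please reach out to a crisis line or a trusted adult right now."
)

def respond(user_message, model_reply):
    """Return the model reply only if no crisis signal is detected."""
    text = user_message.lower()
    if any(term in text for term in CRISIS_TERMS):
        return HELP_MESSAGE
    return model_reply

print(respond("how do I practice chess openings?", "Try spaced repetition."))
print(respond("I want to end my life", "..."))
```

The design choice worth noting is that the safeguard sits outside the model: it cannot be weakened by a long conversation, which addresses exactly the failure mode the lawsuits describe.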

Final Thoughts

As someone excited to dive deeper into AI studies, I feel both hopeful and responsible. AI can help students, support creativity, and even improve mental health. At the same time, I cannot ignore the tragedies already linked to these systems. The OpenAI case and the Character.AI lawsuit are both powerful reminders. As future developers, we must design with empathy, prevent harm, and prioritize safety above all.

— Andrew

(More recent news about the Sewell Setzer III case: Google and Character.AI to Settle Lawsuit Over Teenager’s Death on Jan. 7, 2026)
