My First Solo Publication: A Case Study on Sentiment Analysis in Survey Data

I’m excited to share that my first solo-authored research paper has just been published in the National High School Journal of Science! 🎉

The paper is titled “A Case Study of Sentiment Analysis on Survey Data Using LLMs versus Dedicated Neural Networks”, and it explores a question I’ve been curious about for a while: how do large language models (like GPT-4o or LLaMA-3) compare to task-specific neural networks when it comes to analyzing open-ended survey responses?

If you’ve read some of my earlier posts—like my reflection on the DravidianLangTech shared task or my thoughts on Jonathan Dunn’s NLP book—you’ll know that sentiment analysis has become a recurring theme in my work. From experimenting with XLM-RoBERTa on Tamil and Tulu to digging into how NLP can support corpus linguistics, this paper feels like the natural next step in that exploration.

Why This Matters to Me

Survey responses are messy. They’re full of nuance, ambiguity, and context—and yet they’re also where we hear people’s honest voices. I’ve always thought it would be powerful if AI could help us make sense of that kind of data, especially in educational or public health settings where understanding sentiment could lead to real change.

In this paper, I compare how LLMs and dedicated models handle that challenge. I won’t go into the technical details here (the paper does that!), but one thing that stood out to me was how surprisingly effective LLMs are—even without task-specific fine-tuning.

That said, they come with trade-offs: higher computational cost, more complexity, and the constant need to assess bias and interpretability. There’s still a lot to unpack in this space.

Looking Ahead

This paper marks a milestone for me, not just academically but personally. It brings together things I’ve been learning in courses, competitions, side projects, and books—and puts them into conversation with each other. I’m incredibly grateful to the mentors and collaborators who supported me along the way.

If you’re interested in sentiment analysis, NLP for survey data, or just want to see what a high school research paper can look like in this space, I’d love for you to take a look:
🔗 Read the full paper here

Thanks again for following along on this journey. Stay tuned!

I am back!

This will be a short post, since I’m planning a more in-depth write-up on one of the things I’ve been up to over the summer. Between writing a research paper (currently under review by the Journal of High School Science) and founding a nonprofit called Student Echo, I’ve been keeping myself busy. Despite all this, I plan to post shorter updates here more frequently. Sorry for the wait (assuming anyone was actually waiting), but hey, here you go.

Here’s a bit more about what’s been keeping me occupied:
My Research Paper
Title: Comparing Performance of LLMs vs. Dedicated Neural Networks in Analyzing the Sentiment of Survey Responses
Abstract: Interpreting sentiment in open-ended survey data is a challenging but crucial task in the age of digital information. This paper studies the capabilities of three LLMs, Gemini-1.5-Flash, Llama-3-70B, and GPT-4o, comparing them to dedicated, sentiment analysis neural networks, namely RoBERTa-base-sentiment and DeBERTa-v3-base-absa. These models were evaluated on their accuracy along with other metrics (precision, recall, and F1-score) in determining the underlying sentiment of responses from two COVID-19 surveys. The results revealed that despite being designed for broader applications, all three LLMs generally outperformed specialized neural networks, with the caveat that RoBERTa was the most precise at detecting negative sentiment. While LLMs are more resource-intensive than dedicated neural networks, their enhanced accuracy demonstrates their evolving potential and justifies the increased resource costs in sentiment analysis.
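For anyone who hasn't met the metrics named in the abstract before, here is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed from a list of true labels and a list of model predictions. This is not the code from the paper: the three-way label set, the `evaluate` helper, and the toy data are all assumptions made up for illustration.

```python
from collections import Counter

# Assumed label set for illustration; the paper's actual labels may differ.
LABELS = ("negative", "neutral", "positive")


def evaluate(gold, predicted, labels=LABELS):
    """Return overall accuracy and per-label precision, recall, and F1."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1  # correct prediction for label g
        else:
            fp[p] += 1  # the model claimed label p but was wrong
            fn[g] += 1  # the model missed a true instance of label g

    accuracy = sum(tp.values()) / len(gold) if gold else 0.0

    per_label = {}
    for label in labels:
        claimed = tp[label] + fp[label]  # how often the model predicted this label
        actual = tp[label] + fn[label]   # how often this label was the true answer
        precision = tp[label] / claimed if claimed else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_label[label] = {"precision": precision, "recall": recall, "f1": f1}
    return accuracy, per_label


# Made-up toy labels, purely for illustration; these are not the survey data.
gold = ["negative", "negative", "neutral", "positive", "positive", "positive"]
predicted = ["negative", "neutral", "neutral", "positive", "positive", "negative"]

accuracy, per_label = evaluate(gold, predicted)
print(f"accuracy: {accuracy:.3f}")
for label, scores in per_label.items():
    print(label, {name: round(value, 3) for name, value in scores.items()})
```

Looking at precision per label is what makes a statement like "most precise at detecting negative sentiment" possible: a model can have lower overall accuracy and still be right most of the time when it does say "negative".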

My Nonprofit: Student Echo
Website: https://www.student-echo.org/
Student-Echo.org is a student-led nonprofit organization with the mission of amplifying students’ voices through student-designed questionnaires, AI-based technology, and close collaboration among students, teachers, and school district educators.
