Can AI Save Endangered Languages?

Recently, I’ve been thinking a lot about how computational linguistics and AI intersect with real-world issues, beyond just building better chatbots or translation apps. One question that keeps coming up for me is: Can AI actually help save endangered languages?

As someone who loves learning languages and thinking about how they shape culture and identity, I find this topic both inspiring and urgent.


The Crisis of Language Extinction

Right now, linguists estimate that out of the 7,000+ languages spoken worldwide, nearly half are at risk of extinction within this century. This isn’t just about losing words. When a language disappears, so does a community’s unique way of seeing the world, its oral traditions, its science, and its cultural knowledge.

For example, many Indigenous languages encode ecological wisdom, medicinal knowledge, and cultural philosophies that aren’t easily translated into global languages like English or Mandarin.


How Can Computational Linguistics Help?

Here are a few ways I’ve learned that AI and computational linguistics are being used to preserve and revitalize endangered languages:

1. Building Digital Archives

One of the first steps in saving a language is documenting it. AI models can:

  • Transcribe and archive spoken recordings automatically, which used to take linguists years to do manually
  • Align audio with text to create learning materials
  • Help create dictionaries and grammatical databases that preserve the language’s structure for future generations

Projects like ELAR (Endangered Languages Archive) work on this in partnership with local communities.
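To make the dictionary-building step a little more concrete, here's a toy Python sketch of how entries with glosses and example sentences might be stored and looked up. The field names and the Hawaiian example are my own illustration, not how a real archive like ELAR structures its data:

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    headword: str                         # word in the documented language
    gloss: str                            # translation into a contact language
    pos: str                              # part of speech
    examples: list = field(default_factory=list)  # example sentences

class Dictionary:
    def __init__(self):
        self.entries = {}

    def add(self, entry):
        # A headword can have several senses, so store a list per headword.
        self.entries.setdefault(entry.headword, []).append(entry)

    def lookup(self, headword):
        return self.entries.get(headword, [])

d = Dictionary()
d.add(Entry("wai", "water", "noun", ["He wai kēia."]))
print(d.lookup("wai")[0].gloss)  # -> water
```

Real documentation projects layer much richer metadata on top of this (speaker, recording date, dialect), but the core idea of a searchable, structured lexicon is the same.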


2. Developing Machine Translation Tools

Although data scarcity makes it hard to build translation systems for endangered languages, researchers are working on:

  • Transfer learning, where AI models trained on high-resource languages are adapted to low-resource ones
  • Multilingual language models, which can translate between many languages and improve even with small datasets
  • Community-centered translation apps, which let speakers record, share, and learn their language interactively

For example, Google’s AI team and university researchers are exploring translation models for Indigenous languages like Quechua, which has millions of speakers but limited online resources.


3. Revitalization Through Language Learning Apps

Some communities are partnering with tech developers to create language-learning apps tailored to their heritage languages. AI can help:

  • Personalize vocabulary learning
  • Generate example sentences
  • Provide speech recognition feedback for pronunciation practice

Apps like Duolingo’s Hawaiian and Navajo courses are small steps in this direction. Ideally, more tools would be built directly with native speakers to ensure accuracy and cultural respect.
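To give a very simplified picture of the pronunciation-feedback idea, here's a Python sketch that scores a learner's attempt by comparing phoneme sequences with edit distance. The phoneme strings are illustrative, not output from any real speech recognizer:

```python
def edit_distance(a, b):
    # Classic dynamic-programming Levenshtein distance over phoneme lists.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i
    for j in range(len(b) + 1):
        dp[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(a)][len(b)]

def pronunciation_score(target, attempt):
    # Score in [0, 1]: 1.0 means the recognized phonemes match exactly.
    return 1 - edit_distance(target, attempt) / max(len(target), len(attempt))

target  = ["a", "l", "o", "h", "a"]
attempt = ["a", "l", "o", "f", "a"]   # one substituted phoneme
print(round(pronunciation_score(target, attempt), 2))  # -> 0.8
```

A real app would first need acoustic models to get from audio to phonemes, which is exactly where the data-scarcity problem bites hardest for endangered languages.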


Challenges That Remain

While all this sounds promising, there are real challenges:

  • Data scarcity. Many endangered languages have very limited recorded data, making it hard to train accurate models
  • Ethical concerns. Who owns the data? Are communities involved in how their language is digitized and shared?
  • Technical hurdles. Language structures vary widely, and many NLP models are still biased towards Indo-European languages

Why This Matters to Me

As a high school student exploring computational linguistics, I’m passionate about language diversity. Languages aren’t just tools for communication. They are stories, worldviews, and cultural treasures.

Seeing AI and computational linguistics used to preserve rather than replace human language reminds me that technology is most powerful when it supports people and cultures, not just when it automates tasks.

I hope to work on projects like this someday, using NLP to build tools that empower communities to keep their languages alive for future generations.


Final Thoughts

So, can AI save endangered languages? Maybe not alone. But combined with community efforts, linguists, and ethical frameworks, AI can be a powerful ally in documenting, preserving, and revitalizing the world’s linguistic heritage.

If you’re interested in learning more, check out projects like ELAR (Endangered Languages Archive) or the Living Tongues Institute. And let me know if you’d like to see a follow-up post on how multilingual language models actually work.

— Andrew

What I Learned (and Loved) at SLIYS: Two Weeks of Linguistic Discovery at Ohio State

This summer, I had the chance to participate in both SLIYS 1 and SLIYS 2—the Summer Linguistic Institute for Youth Scholars—hosted by the Ohio State University Department of Linguistics. Across two weeks packed with lectures, workshops, and collaborative data collection, I explored the structure of language at every level: from the individual sounds we make to the complex systems that govern meaning and conversation. But if I had to pick just one highlight, it would be the elicitation sessions—hands-on explorations with real language data that made the abstract suddenly tangible.

SLIYS 1: Finding Language in Structure

SLIYS 1 started with the fundamentals—consonants, vowels, and the International Phonetic Alphabet (IPA)—but quickly expanded into diverse linguistic territory: morphology, syntax, semantics, and pragmatics. Each day featured structured lectures covering topics like sociolinguistic variation, morphological structures, and historical linguistics. Workshops offered additional insights, from analyzing sentence meanings to exploring language evolution.

The core experience, however, was our daily elicitation sessions. My group tackled Serbo-Croatian, collaboratively acting as elicitors and transcribers to construct a detailed grammar sketch. We identified consonant inventories, syllable structures (like CV, CVC, and CCV patterns), morphological markers for plural nouns and verb tenses, and syntactic word orders. Through interactions with our language consultant, we tested hypotheses directly, discovering intricacies like how yes-no questions were formed with the particle da li, and how adjective-noun order worked. This daily practice gave theory immediate clarity and meaning, shaping our skills as linguists-in-training.
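The syllable-shape labeling we did by hand is simple enough to sketch in a few lines of Python. The vowel set below is illustrative, not Serbo-Croatian's actual inventory:

```python
# Label each segment of a syllable as C (consonant) or V (vowel),
# then read off the pattern, just like we did on the whiteboard.
VOWELS = set("aeiou")

def syllable_shape(syllable):
    return "".join("V" if seg in VOWELS else "C" for seg in syllable)

for syl in ["ta", "tak", "tra"]:
    print(syl, syllable_shape(syl))
# ta CV / tak CVC / tra CCV
```

In a real analysis you'd work over IPA segments rather than spelling, since one letter doesn't always equal one sound.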

SLIYS 2: Choosing My Path in Linguistics

SLIYS 2 built upon our initial foundations, diving deeper into phonological analysis, morphosyntactic properties, and the relationship between language and cognition. This week offered more autonomy, allowing us to select workshops tailored to our interests. My choices included sessions on speech perception, dialectology, semiotics, and linguistic anthropology—each challenging me to think more broadly about language as both a cognitive and a cultural phenomenon.

Yet again, the elicitation project anchored our experience, this time exploring Georgian. Our group analyzed Georgian’s distinctive pluralization system, polypersonal verb agreement (verbs agreeing with both subjects and objects), and flexible sentence orders (SVO/SOV). One fascinating detail we uncovered was how nouns remained singular when preceded by numbers. Preparing our final presentation felt especially rewarding, bringing together the week’s linguistic discoveries in a cohesive narrative. Presenting to our peers crystallized not just what we learned, but how thoroughly we’d internalized it.
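Here's a toy Python sketch of the polypersonal-agreement idea, using entirely made-up affixes (not real Georgian morphology), just to show what "a verb agreeing with both subject and object" means mechanically:

```python
# Invented affix tables for illustration only -- NOT actual Georgian.
SUBJECT_AFFIX = {"1sg": "v-", "2sg": "x-", "3sg": ""}
OBJECT_AFFIX  = {"1sg": "m-", "2sg": "g-", "3sg": ""}

def agree(stem, subject, obj):
    # Toy ordering: subject marker outermost, object marker next to the stem.
    return SUBJECT_AFFIX[subject] + OBJECT_AFFIX[obj] + stem

print(agree("xedav", "1sg", "2sg"))  # -> v-g-xedav  (toy "I see you")
print(agree("xedav", "3sg", "3sg"))  # -> xedav      (both third person, unmarked)
```

The real system has slot competition, screeves, and plenty of exceptions, which is exactly what made eliciting it so interesting.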

More Than Just a Summer Program

What I appreciated most about SLIYS was how seriously it treated us as student linguists. The instructors didn’t just lecture—they listened, challenged us, and encouraged our curiosity. Whether we were learning about deixis or discourse analysis, the focus was always on asking better questions, not just memorizing answers.

By the end of SLIYS 2, I found myself thinking not only about how language works, but why we study it in the first place. Language is a mirror to thought, a map of culture, and a bridge between people—and programs like SLIYS remind me that it’s also something we can investigate, question, and build understanding from.

Moments from SLIYS 2: A Snapshot of a Summer to Remember

As SLIYS 2 came to a close, our instructors captured these Zoom screenshots to help us remember the community, curiosity, and collaboration that made this experience so meaningful.

Special Thanks to the SLIYS 2025 Team

This incredible experience wouldn’t have been possible without the passion, insight, and dedication of the SLIYS 2025 instructors. Each one brought something unique to the table—whether it was helping us break down complex syntax, introducing us to sociolinguistics through speech perception, or guiding us through our elicitation sessions with patience and curiosity. I’m especially grateful for the way they encouraged us to ask deeper questions and think like real linguists.

Special thanks to:

  • Kyler Laycock – For leading with energy, making phonetics and dialectology come alive, and always reminding us how much identity lives in the details of speech.
  • Jory Ross – For guiding us through speech perception and conversational structure, and for sharing her excitement about how humans really process language.
  • Emily Sagasser – For her insights on semantics, pragmatics, and focus structure, and for pushing us to think about how language connects to social justice and cognition.
  • Elena Vaikšnoraitė – For their thoughtful instruction in syntax and psycholinguistics, and for showing us the power of connecting data across languages.
  • Dr. Clint Awai-Jennings – For directing the program with care and purpose—and for showing us that it’s never too late to turn a passion for language into a life’s work.

Thank you all for making SLIYS 1 and 2 an unforgettable part of my summer.

— Andrew
