How AI Is Racing to Save the World's Dying Languages

Doris Lamar-McLemore is 83 years old and one of fewer than 20 fluent speakers of Potawatomi, an Algonquian language once spoken across the Great Lakes region. For decades, she has been recording vocabulary, stories, and ceremonial language with the Citizen Potawatomi Nation, knowing that the clock is running out.
Now she is working with a team of computer scientists at the University of Oklahoma who are using her recordings to train an AI language model -- one that can generate pronunciation guides, create interactive lessons, and even hold basic conversations in Potawatomi.
"I never thought a computer could speak my language," Lamar-McLemore said. "But if it helps my grandchildren learn the words, then I am glad to teach it."
Her story sits at the intersection of two powerful forces: the accelerating extinction of the world's linguistic diversity and the rapid maturation of AI tools that may be able to slow it down.
A Quiet Catastrophe
Of the roughly 7,000 languages spoken on Earth today, linguists estimate that nearly half will fall silent by the end of the century. UNESCO classifies more than 2,500 languages as endangered, and a language dies approximately every 14 days, usually when its last elderly speakers pass away without having transmitted it to the next generation.
The causes are familiar -- colonization, forced assimilation, urbanization, and the gravitational pull of dominant languages in education, media, and commerce. The consequences are less well understood but profound. Each language encodes unique knowledge about ecology, medicine, social organization, and human cognition. When a language dies, that knowledge goes with it.
"A language is not just a communication system. It is an entire worldview," said Dr. Lyle Campbell, a linguist at the University of Hawaii. "Losing a language is like losing a library that was never cataloged."
The AI Toolkit
The challenge of language preservation has historically been one of resources. Documenting a language thoroughly enough to support revitalization requires thousands of hours of recording, transcription, and analysis -- work that has traditionally been done by a handful of underfunded academic linguists.
AI is compressing that timeline dramatically. Automatic speech recognition systems, once useful only for major world languages, can now be trained on relatively small datasets to transcribe endangered languages with increasing accuracy. Natural language processing tools can analyze grammatical structures and generate teaching materials. And text-to-speech systems can produce audio in languages that have no commercial recording industry.
The most ambitious project in the field is the Endangered Languages Project, a collaboration between Google, the First Peoples' Cultural Council, and dozens of Indigenous organizations worldwide. The project has built AI models for more than 150 endangered languages and provides free tools for communities to create dictionaries, lesson plans, and interactive apps.
Community Control Is Non-Negotiable
The most sensitive aspect of this work is not technological but ethical. Indigenous communities have long been subjected to extractive research practices, in which outside academics studied their languages and cultures without meaningful consent or benefit-sharing. The history of linguistics is littered with examples of researchers who published grammars and dictionaries that communities never wanted made public.
AI amplifies these concerns. Language data fed into machine learning models can be difficult to control, and communities worry about sacred or ceremonial language being made accessible to outsiders or commercialized without permission.
The most successful projects have placed community control at the center. In New Zealand, Te Hiku Media, a Maori media organization, developed its own AI transcription tool under explicit data sovereignty provisions ensuring that all language data remains under Maori ownership. The tool has transcribed thousands of hours of Maori-language media and is now being adapted for other Pacific languages.
"The data is ours. The language is ours. The technology must serve us, not the other way around," said Keoni Mahelona, co-founder of Te Hiku Media.
From Documentation to Revitalization
Documentation alone does not save a language. Revitalization -- getting people to actually speak it in daily life -- requires sustained community effort, educational infrastructure, and political will. AI tools are increasingly being designed with that goal in mind.
Duolingo now offers courses in Hawaiian, Navajo, and Yiddish, developed in partnership with community language programs. A startup called Anki Languages has built immersive conversation apps for Cherokee, Ojibwe, and several Australian Aboriginal languages, using AI-generated dialogues reviewed and approved by elder speakers.
In Canada, the federal government allocated $450 million over five years for Indigenous language revitalization in its 2024 budget, with a significant portion earmarked for technology development. Several First Nations communities have used the funding to build AI-powered language nests -- immersive early childhood programs where toddlers are surrounded by their ancestral language through both human speakers and digital tools.
A Race Against Time
The urgency is difficult to overstate. Many endangered languages have fewer than 100 speakers, nearly all of them elderly. Every month that passes without comprehensive documentation narrows the window for AI tools to learn from fluent speakers.
But there are reasons for cautious optimism. The combination of community determination, improved technology, and growing institutional support has created a moment of possibility that did not exist even five years ago. The question is whether that moment will be seized before the silence becomes permanent.


