Produced by the Office of Marketing and Communications

Can a UMD Algorithm Understand ‘Star Trek’ Aliens?

Researcher Puts Natural Language Processing to the Test With Sci-Fi Language

By Melissa Brachfeld

Winfield and Stewart in "Star Trek: The Next Generation"

In one "Star Trek" episode, Capt. Jean-Luc Picard deciphers Tamarian, the language spoken by Dathon, the Tamarian captain. A new study led in part by a UMD computer science professor developed a machine translation system for Tamarian.

Image by Alamy

From the Elvish and other languages spoken in “Lord of the Rings” to Dothraki in “Game of Thrones,” successful fantasy and science fiction franchises frequently feature their own real, but constructed, languages. These creations often have many of the same syntactic or semantic features as commonly spoken languages, and some—such as Klingon from “Star Trek”—have been extensively developed, complete with online dictionaries and translators.

Now, a leading University of Maryland expert in natural language processing (NLP)—a subfield that combines linguistics, computer science and artificial intelligence to better understand the interactions between computers and languages—is giving another “Star Trek” language the NLP treatment in the first study of its type.

Computer science Associate Professor Jordan Boyd-Graber, a lifelong Trekkie known for incorporating Klingon into class NLP assignments, collaborated with University of Arizona Assistant Professor Peter A. Jansen to investigate machine translation of Tamarian using a collection of translated English-Tamarian phrases.

Like the fictional language itself, it’s anything but a straightforward task. Instead of direct references, “Star Trek’s” Tamarians speak in metaphorical references grounded in stories that—like symbols—have learned associations with their true meaning. For example, instead of saying, “I want to give this to you,” a Tamarian would say, “Temba, his arms wide.”

This unusual structure poses a challenge for both the characters and the automated translation systems on board the Enterprise. Likewise, the Tamarians cannot understand starship Capt. Jean-Luc Picard’s straightforward use of language.

First, the researchers created a dictionary of 50 Tamarian phrases paired with 456 parallel English phrases that captured the inferred meaning of each Tamarian expression. Almost half of them were gleaned from a Reddit thread, while the rest came from context clues in tie-in novels from the “Star Trek” universe.

They discovered that their machine translation system had a 76% accuracy rate in translating English phrases to Tamarian metaphorical utterances.

“Our results suggest that automatically translating metaphor-grounded languages may be feasible, but it is extremely difficult,” said Boyd-Graber, who has appointments in the College of Information Studies (iSchool) and the University of Maryland Institute for Advanced Computer Studies.

While Tamarian is a fictional language, the researchers said that their paper demonstrates large language models’ abilities and limitations. They also discuss what it would take to grow Tamarian—or a similar language—into a more complete artificial language like Klingon, and how their work can help computers—which work best with literal language—better understand metaphors like “between a rock and a hard place.”



Maryland Today is produced by the Office of Marketing and Communications for the University of Maryland community on weekdays during the academic year, except for university holidays.