The Tufts Daily
Where you read it first | Wednesday, April 24, 2024

Perseus project to analyze ancient tongues with supercomputers

The Perseus Digital Library Project recently received a $285,000 grant from the National Endowment for the Humanities (NEH) to use government supercomputers to research methods of automatically analyzing ancient languages.

Two Tufts researchers will travel this spring to Lawrence Berkeley National Laboratory, a U.S. Department of Energy facility located near the University of California, Berkeley campus, where they will use modern technology to make classical texts more accessible to the public.

The project, led by Professor of Classics Gregory Crane, was officially founded at Tufts in 1987 as a digital library that brings a broad array of primary and secondary literary sources and images together on the Internet for a wide audience.

The researchers will use the computers to analyze changes in Greek and Latin words, compare texts and develop a translation tool for ancient languages.

"[The grant] will allow us to analyze larger bodies of data and use techniques that are computationally really expensive that you can't conduct on a regular machine," said Crane, the project's editor-in-chief.

The U.S. Department of Energy and the NEH collaborated to award a number of grants that would allow research focusing on the humanities to be enhanced by government-owned supercomputers. The two agencies issued grants this December to Tufts and other organizations around the country.

"We have the need for linguistic analysis that requires a lot of computations," Crane said. "One example is trying to work on translation — automatic translation from Greek and Latin into English — technologies that are similar to Google translations."

The NEH's Office of Digital Humanities selected Perseus, as well as projects based out of the University of California, San Diego and the University of Virginia, after a highly competitive peer-review process.

"Often the competitions are very fierce," Crane said. "Sometimes only 5 percent of applicants are funded."

Crane and proposal co-author David Bamman, a senior researcher on the project, will travel to Berkeley this spring for computer training and to work with other experts from around the world.

Researchers have already started creating algorithms for sorting and analyzing data on Tufts computers but are looking forward to taking advantage of a large government machine.

"We will be researching ways where we can move our algorithms to that much larger space [the computer in Berkeley] where they can deal with greater quantities of data," Bamman said.

Currently, Bamman is working on analyzing Greek and Latin syntax and creating dictionaries, among other projects.

Perseus works by taking mainly primary material from ancient Greece and Rome and connecting it to secondary sources. The project aims to give people access not only to digitized classical texts but also to supporting and analytical works.

"References get linked to primary sources, so suddenly you're not just reading someone's opinion, but through your Internet connection, you can read the text of 'Hamlet,' see various manuscripts and notations, and then see what other commentaries there are from the 1700s, 1800s, 1900s and current day," said Lisa Cerrato, the managing editor of Perseus.

The materials predominantly include work from the Classical period, but the collection has dipped into other eras and areas of interest, from the Renaissance to 19th-century America to Shakespeare.

"We found the tools being built to study the Classics were applicable to other large collections of data," Cerrato said.

Those running Perseus hope to capitalize on the large-scale digitization of books.

"We have all these books, far more than Tufts ever could print, that are available in electronic form," Crane said. "So how do we make good use of them?"

The project intends to make texts and resources available to as many people as possible, reaching a diverse audience ranging from university professors to individuals who want to simply enhance their own knowledge.

College students are one targeted demographic. "We want students to be able to conduct undergraduate research," Crane said. "Now, with this content and the analytical resources, they can take on significant research projects."

Perseus started as a CD-ROM project, but since moving to the World Wide Web in 1995, its work has broadened to include multiple aspects of the humanities. As a result, the project's typical use has become harder to define.

"We are contacted by users across the globe from all levels of experience — some with no background in the Classics and some who are tenured professors," Cerrato said.

Currently, the researchers are working on applying the tools that analyze Greek and Latin to a study of words and texts in Arabic.

"We are hoping to be able to have people enter Arabic that they find from someplace like Al Jazeera and be informed as to what the word actually is," Cerrato said.