Documentation is an important part of preserving languages. Working with native Romeyka speakers in elicitation sessions, the project has put together a growing body of Romeyka documentation, along with an accompanying grammar sketch. The project covers lexical/grammar elicitation, oral history elicitation, and digital cultural history documentation.

By the Numbers


minutes elicitation


words recorded


bilingual pairs

Our Goals and Rationale

Traditional linguistic fieldwork often requires many resources and falls short in several respects. It makes it difficult to collect sufficient documentation of endangered languages in a timely and cost-effective manner. Furthermore, it leaves preservation in the control of field workers: researchers hold power over what is documented, which narrows the perspective of the record.

In an era where culture and technology overlap, computational methods are valuable. Mobile apps make language documentation tools accessible and allow documentation at greater scale. The speakers decide what is documented, and they can take charge of protecting and promoting their identities, having their voices and their stories heard.

The amount of primary data needed to support wide-ranging investigations of a language is about 10 million words, or 1,000 hours of speech (Liberman, 2006). Computational and accessible tools, used in collaboration with bilingual speakers, can help generate a substantial collection of text for the preservation of language and heritage.

Request Data

To obtain access to the data for academic, industry, educational, or other purposes, you may reach out to request it. Please state your affiliation or institution, the intended purpose and nature of use, and the category of documentation (lexical/grammar elicitation, oral history elicitation, digital cultural history).
