
Research Outputs

Zeyneb N. Kaya

Proceedings of the Linguistic Society of America (PLSA), 2023

This paper investigates the morpho-syntactic features of language contact between the endangered Greek dialect Romeyka and Turkish. We analyze the use of the borrowed negative existential jok to (a) determine its role in Romeyka’s negation patterns, (b) examine the effects of contact in Romeyka through cross-linguistic comparisons of jok with Turkish and with forms of the dialect as spoken in Greece, and (c) apply the identified grammatical patterns of jok to Myers-Scotton’s explanations of code-switching phenomena in the Matrix Language Turnover Hypothesis. The analysis demonstrates the pervasive influence of Turkish on the morpho-syntax of Romeyka through the incorporation of Turkish grammatical structures. We observe changes in the fundamental predicate grammar that align with Turkish and are inconsistent with Pontic’s existential constructions, in which the verb indicating existence is used. The patterns of contact confirm the Matrix Language hypothesis and provide evidence that Romeyka may be undergoing language turnover. Our findings are relevant to further understanding code-switching among speakers of minority languages and to assessing the vitality of Romeyka in Turkey.

Zeyneb N. Kaya

National Junior Science and Humanities Symposium (NJSHS), 2023

Broad-ranging machine translation systems are crucial across domains such as defense, healthcare, and education, and help make diverse minority voices heard. Low-data conditions are a persistent challenge in deep learning, and addressing them is especially complex in natural language processing and multilingual tasks. I propose MADLIBS (Multilingual Augmentation of Data with Alignment-Based Substitution), a new multilingual Data Augmentation (DA) method for Neural Machine Translation (NMT), designed with the goal of inclusivity and of supporting under-resourced languages. It consists of an attention-based encoder-decoder aligner, a semi-supervised POS tagger, and a template generator, which together generate diversified and semantically consistent sentence pairs. Taking a fundamentally different approach, it improves NMT systems and surpasses established state-of-the-art DA methods without the use of external data.
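
As a rough illustration of the general idea behind alignment-based substitution, and not the MADLIBS implementation itself, the sketch below swaps a word in a source sentence and its aligned word in the target sentence for another word pair with the same part of speech. Everything here is an assumption made for illustration: the function name `substitute_aligned`, the hand-written alignment and POS tags (which the method described above would obtain from its aligner and POS tagger), and the toy English-Turkish lexicon.

```python
from typing import Dict, List, Tuple

Pair = Tuple[List[str], List[str]]


def substitute_aligned(
    src: List[str],
    tgt: List[str],
    alignment: List[Tuple[int, int]],
    src_pos: List[str],
    lexicon: Dict[str, List[Tuple[str, str]]],
) -> List[Pair]:
    """Return new (source, target) pairs, each with one aligned word pair replaced.

    alignment: (source index, target index) links, assumed given.
    src_pos:   one POS tag per source token, assumed given.
    lexicon:   hypothetical POS tag -> list of (source word, target word) pairs.
    """
    augmented: List[Pair] = []
    for i, j in alignment:
        for new_src, new_tgt in lexicon.get(src_pos[i], []):
            if new_src == src[i]:
                continue  # skip the identity substitution
            # Replace the aligned words on both sides so the pair stays parallel.
            new_pair_src = src[:i] + [new_src] + src[i + 1:]
            new_pair_tgt = tgt[:j] + [new_tgt] + tgt[j + 1:]
            augmented.append((new_pair_src, new_pair_tgt))
    return augmented


# Toy, hand-written example data (not taken from the paper).
src = ["the", "cat", "sleeps"]
tgt = ["kedi", "uyuyor"]
alignment = [(1, 0), (2, 1)]
src_pos = ["DET", "NOUN", "VERB"]
lexicon = {
    "NOUN": [("cat", "kedi"), ("dog", "köpek")],
    "VERB": [("runs", "koşuyor")],
}

pairs = substitute_aligned(src, tgt, alignment, src_pos, lexicon)
for new_src, new_tgt in pairs:
    print(" ".join(new_src), "|", " ".join(new_tgt))
```

Restricting substitutions to words that share a POS tag is one simple way to keep the generated sentences grammatical; how the actual method builds its templates and chooses replacements is not specified in the abstract.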
