Hello! I’m a postdoctoral researcher in the Department of Computer Science at Boston College, and a 2021-2023 Computing Innovation Fellow supported by NSF/CRA. I received my PhD in Linguistics from the University of California, Davis in Summer 2020.
My research studies structural variation in both languages and machines. I address related questions from the angles of language typology, low-resource NLP, and cognitive development, along with a non-Western mind. The general methodology I adopt is a data-driven approach coupled with the #BenderRule and methods of number counting at varying degrees of carbon dioxide consumption.
Besides academic responsibilities, I proudly serve on the planning committee for Advocates for Indigenous California Language Survival, and their biennial institute Breath of Life.
My non-research interests are music, food, and simple methods, and in that order too.
Fall 2021 -
Starting as a Computing Innovation Fellow.
Fall 2020 -
Started working as a postdoc in late November, advised by Emily Tucker Prud’hommeaux. I usually spend time wandering around low-resource NLP evaluation as well as language technologies for enhancing documentation of indigenous languages. At other times I like to think about language development and typology.
2014 - Summer 2020
During this time, I got a PhD, advised by Kenji Sagae. My dissertation project focuses on crosslinguistic modeling of word order preferences, asking what abstract constraints as well as idiosyncrasies govern language users’ structural choices from both synchronic and diachronic perspectives. This project also tries to adapt the theoretical framework of dependency syntax to develop a treebank for Hupa, an endangered Dene language of northwestern California traditionally spoken in Hoopa Valley on the lower Trinity River in present-day Humboldt County, as a way to formalize and model the syntax of indigenous languages.
During this time, I interned as a software developer at the Cognitive Computing Lab at Baidu. I worked on designing graph representations for open-domain information extraction in English and Mandarin.