[Events] Dialect NLP: Thinking outside the box when processing non-standard and low-resource languages (11/26 16:30)
- College of Software Convergence
- 2025-11-17
Title: Dialect NLP: Thinking outside the box when processing non-standard and low-resource languages
Speaker: Dr. Verena Blaschke @ LMU Munich
Time: 16:30 - 17:30, Nov 26th, 2025
Location: Online
https://hli.skku.edu/InvitedTalk251126
Language: English speech & English slides
Abstract:
Natural language processing (NLP) has improved by leaps and bounds when it comes to processing data from standardized languages with plenty of available data, like German. However, NLP lags behind when closely related non-standard varieties are concerned. In this talk, I will describe ways in which processing dialect data differs from processing standard-language data, and discuss some of the current challenges in dialect NLP research. For instance, I will talk about strategies to mitigate the effect of infelicitous subword tokenization caused by ad-hoc pronunciation spellings. Additionally, I argue that we should not only consider how to tackle dialectal variation in NLP, but also why. To this end, I will highlight perspectives of some dialect speaker communities on which language technologies should (or should not) be able to process or produce dialectal in- or output.
Bio:
Verena Blaschke is a final-year PhD student at LMU Munich. She currently researches NLP for non-standard dialects and other low-resource language varieties, investigating how robust language models are to language variation (and how to make them more robust). Her research is supervised by Barbara Plank and co-supervised by Hinrich Schütze. She also completed a research internship at Apple, where she worked on multilingual NLP, and she previously developed software for machine-assisted historical linguistics at the University of Tübingen.








