Students Quyen Ha and Sabina Hartnett text-mined German sources to reveal trends in discourse. Professor Birgit Tautz explains…
In summer 2017, Quyen Ha and I worked together as part of a Gibbons Fellowship that Quyen won. I’ve long been interested in popular discourse on China in 18th-century German-language newspapers and journals. Unlike established, then- and now-famous literary texts, popular writing often remains anonymous and is repeated, stolen, translated, or republished without attribution of source. (There was no copyright law in the 18th century!) So I had a hunch that the journals might help us explain how the image of China in German lands morphed from one that was largely positive and respectful, in part because China was perceived as an ancient harbor of philosophy, into one that grew increasingly negative and irrational and tipped into sinophobia. Quyen wrote a program that assisted with text recognition, making old German print legible, and “mined the data,” yielding sets and patterns waiting to be explored. She applied topic modeling and created data sets that I have used in a seminar on 18th-century German literature and that we are now interpreting. The patterns we find reveal a much more complicated image of China than we expected.
Sabina Hartnett brings her far-ranging expertise in digital and computational studies (DCS) to her honors project in German. Her topic is very timely: it revolves around the public discourse on refugees in Germany today, viewed also in historical perspective. Here, the historical data serve to illustrate how concepts and words became ingrained in the German language, were eclipsed or promoted by the massive dictionary projects of the 19th century, and have been resurrected, popularized, and emotionally charged in recent years. She shows in fascinating ways how data may help us understand how people think and talk, and how they become politically entrenched and invested in closed-off ways of thinking.
In both projects, digital and computational studies methods help us to read in new ways and to open long-standing truisms up to debate. In the future, we look forward to applying DCS methods to the new database, German Literature Collections, for which the Library purchased the text-minable file.