Schedule
Unit 1: What is Ancient Language Processing?
Date | Topic | Assignment | Materials |
---|---|---|---|
09/04/25 | Brushing up on your Python (character encoding; regex; pandas; parsing; functions) | Choose one ancient language dataset, familiarize yourself with its data model, and consider how that data model fits with the datasets presented in class. Remodel your data accordingly. Do not forget to collect metadata as defined in the class examples. | Tutorials: |
23/04/25 | Doing Computational Philology: the (ideal) pipeline (research questions; microscope vs. macroscope; quantification; large language models) | Explore your dataset: upload the data into Voyant (or a similar program) or into an LLM and describe it using distant reading techniques (i.e. quantification). Following this analysis, decide where close reading would be necessary. | Reading: Rockwell & Sinclair (2016); Sommerschield et al. (2023) Tools: |
Unit 2: Digitization and Annotation
Date | Topic | Assignment | Materials |
---|---|---|---|
30/04/25 | Pipeline I: Optical / Handwritten Character Recognition (OCR, HTR, layout analysis, line segmentation, transcribing) | Gordin, Alper et al. (2024); Lincke (2021); Chauhan’s blogpost (2022) | |
07/05/25 | Pipeline II: Annotation (Lemmatization, PoS-Tagging, Treebanking) | Ong & Gordin (2024); Sahala & Lincke (2024); Farsi et al. (2025) | |
Unit 3: Operationalization and the Vector Space
Date | Topic | Assignment | Materials |
---|---|---|---|
14/05/25 | No Class | ||
21/05/25 | Pipeline III: Operationalization of words I - the vector space | Tutorials: Reading: | |
28/05/25 | Pipeline IV: Operationalization of words II - measuring distances | Process your data and measure distances. A data model using cuneiform data from ORACC, with metadata, can be found in the ALP 2024 course repository. | Tutorials: Reading: Gavin (2020); Bennett & Sahala (2023); Schweitzer (2023); Jurafsky & Martin (2024) |
Unit 4: Network Analysis and Visualization
Date | Topic | Assignment | Materials |
---|---|---|---|
04/06/25 | Pipeline V: Visualization I - Graph theory | ||
11/06/25 | Pipeline VI: Visualization II - Network metrics | ||
18/06/25 | Presentation of class projects and feedback |