Unit 1: What is Ancient Language Processing?
| Date | Topic | Assignment | Materials |
|---|---|---|---|
| 15/04/2026 | Brushing up on your Python | Choose one ancient language dataset. Familiarize yourself with its data model and consider how it fits with the datasets presented in class, then remodel your data accordingly. Remember to collect metadata as defined in the class examples. | Tutorials: How to define a function; List comprehensions; Load, parse and extract data from ORACC JSON files; RegEx101.com |
| 22/04/2026 | No Class (Independence Day) | ||
| 29/04/2026 | Doing Computational Philology: the (ideal) pipeline | Explore your dataset: upload the data into Voyant (or a similar program) or an LLM and describe it using distant reading techniques (i.e. quantification). Based on this analysis, decide where close reading is necessary. | Reading: Rockwell & Sinclair (2016); Sommerschield et al. (2023). Tools: Voyant Tools; AntConc; LLM (ChatGPT, Claude, Olmoe etc.) |
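
As a starting point for the first assignment, the sketch below combines the three Python tutorial topics (defining a function, list comprehensions, reading ORACC JSON). It is only an illustration: the sample text is hand-written and hypothetical, and the field names (`cdl`, `node`, `f`, `cf`, `gw`, `pos`) are assumptions about the ORACC JSON layout that should be checked against the files you actually download.

```python
import json

# Hypothetical, hand-written sample imitating the nested "cdl" layout of an
# ORACC text file. Real files are larger and carry many more fields.
SAMPLE = """
{
  "textid": "P000001",
  "cdl": [
    {"node": "c", "cdl": [
      {"node": "d", "type": "line-start"},
      {"node": "l", "f": {"form": "lugal", "cf": "lugal", "gw": "king", "pos": "N"}},
      {"node": "l", "f": {"form": "e2-gal", "cf": "egal", "gw": "palace", "pos": "N"}},
      {"node": "l", "f": {"form": "x"}}
    ]}
  ]
}
"""


def iter_words(nodes):
    """Recursively yield the "f" dictionary of every word node."""
    for node in nodes:
        if "cdl" in node:
            # Chunk node: descend into its children.
            yield from iter_words(node["cdl"])
        elif node.get("node") == "l" and "f" in node:
            yield node["f"]


def lemmatize(word):
    """Format a word as cf[gw]pos, falling back to the raw form."""
    if "cf" in word:
        return f"{word['cf']}[{word.get('gw', '')}]{word.get('pos', '')}"
    return word.get("form", "")


text = json.loads(SAMPLE)
# List comprehension: one lemma string per word in the text.
lemmas = [lemmatize(w) for w in iter_words(text["cdl"])]
print(lemmas)
```

For a real file, replace `json.loads(SAMPLE)` with `json.load(...)` on an opened file; the metadata you collect (text ID, period, genre and so on) would be stored alongside the lemma list.
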
Unit 2: Digitization and Annotation
| Date | Topic | Assignment | Materials |
|---|---|---|---|
| 06/05/2026 | Pipeline I: Optical / Handwritten Character Recognition | | Reading: Gordin, Alper et al. (2024); Lincke (2021); Chauhan's blog post (2022) |
| 13/05/2026 | Pipeline II: Annotation (Lemmatization, PoS-Tagging, Treebanking) | | Reading: Ong & Gordin (2024); Sahala & Lincke (2024); Farsi et al. (2025) |
Unit 3: Operationalization and the Vector Space
| Date | Topic | Assignment | Materials |
|---|---|---|---|
| 20/05/2026 | Pipeline III: Operationalization of words I - the vector space | | Tutorials: Playing with Neural Networks. Reading: Riemenschneider (2025); Gordin et al. (2025) |
| 27/05/2026 | Pipeline IV: Operationalization of words II - Measuring distances | Process your data and measure distances. A data model using cuneiform data from ORACC, with metadata, can be found in the ALP 2024 course repository. | Tutorials: Coding the Past: understand TF-IDF in Python; Programming Historian: analyzing documents with TF-IDF; Karsdorp, Kestemont & Riddell (2021). Reading: Gavin (2020); Bennett & Sahala (2023); Schweitzer (2023); Jurafsky & Martin (2024) |
| 03/06/2026 | Pipeline V: Operationalization of words III - Word vectors | | |
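
For the "measure distances" assignment, a minimal sketch of TF-IDF weighting followed by cosine distance may help fix the idea before turning to the tutorials. The three documents are made-up toy lemma lists, and the weighting used here (relative term frequency times unsmoothed `log(N / df)`) is only one common variant; libraries such as scikit-learn apply smoothing and normalization, so their numbers will differ.

```python
import math
from collections import Counter

# Toy, made-up documents standing in for lemmatized texts.
docs = {
    "text_a": ["lugal", "egal", "lugal", "dingir"],
    "text_b": ["lugal", "egal", "egal", "dingir"],
    "text_c": ["udu", "gud", "udu", "dingir"],
}


def tfidf(docs):
    """Return one sparse {term: weight} vector per document."""
    n = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    vectors = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * math.log(n / df[term])
            for term, count in counts.items()
        }
    return vectors


def cosine_distance(u, v):
    """1 - cosine similarity of two sparse vectors (1.0 if either is all zeros)."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


vectors = tfidf(docs)
for a, b in [("text_a", "text_b"), ("text_a", "text_c")]:
    print(a, b, round(cosine_distance(vectors[a], vectors[b]), 3))
```

In this toy corpus `dingir` occurs in every document, so its weight is zero and it does not influence the distances. `text_a` and `text_b` share their distinctive vocabulary and end up close (distance about 0.2), while `text_a` and `text_c` share none and sit at the maximum distance of 1.0.
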
Unit 4: Network Analysis and Visualization
| Date | Topic | Assignment | Materials |
|---|---|---|---|
| 10/06/2026 | Pipeline VI: Visualization I - Graph theory | ||
| 17/06/2026 | Pipeline VII: Visualization II - Network metrics | ||
| 24/06/2026 | Presentation of class projects and feedback | ||
| 01/07/2026 | Presentation of class projects and feedback | | |