ProsopyEstimation · Neo-Babylonian Prosopography

Dating Undated Cuneiform Tablets Through Social Networks

An iterative algorithm estimates the date of undated Neo-Babylonian legal documents by leveraging the known life spans of co-attested individuals — expanding the datable corpus by over 29%.

4,589
Total Tablets
3,415
Pre-Dated (74.4%)
1,060
Newly Estimated (23.1%)
114
Unestimatable (2.5%)
70.1%
Validation Accuracy
01

Illuminating the Record: Temporal Gap-Filling

The chart shows the distribution of tablets across time in 5-year intervals, before and after estimation. Periods where the newly-dated tablets (orange) reach highest relative to pre-dated ones (blue) represent the moments where the algorithm had the greatest historical impact.

Loading insight…
02

Which Archives Benefit Most?

Each bar represents one archive's full tablet corpus, broken down by estimation status. Archives are sorted by the share of newly-dated tablets (orange). Archives with many undated documents gain the most from the algorithm.

Loading insight…
03

The Network Anchors: Most Productive Individuals

These are the people whose known date ranges anchored the most new estimations. A person is counted if they appear in dated tablets (entering the "Yellow Pages" reference) and also appear in tablets that were subsequently estimated. Kings (👑) dominate because they appear as date references in many tablets; non-royal figures reflect active business networks.

Loading insight…
04

How Accurate Are Our Estimates?

We test the algorithm using leave-one-out cross-validation: each of the 3,415 pre-dated tablets is temporarily removed from the corpus, and the algorithm tries to re-estimate its date from the remaining data alone. The estimate is compared to the known date. A hit means the true date falls inside the estimated range (±1 year tolerance). Of the 3,415 tablets, 45 could not be re-estimated at all (they become isolated — no co-attested person with a known date range when that tablet is removed); these are excluded from both charts below and are not counted as misses. The 114 tablets that are permanently unestimatable (no date, no network connections) never appear in validation.

Actual year vs. center of estimate — each point is one validated tablet. The X-axis is the true known date; the Y-axis is the midpoint of the estimated date range ("center of estimate"). Points on the dashed diagonal line would be perfect predictions. Green dots (hits) cluster near the diagonal; red dots (misses) deviate from it.

Distribution of estimation error — how many years does the center of estimate deviate from the true date? Hits (green) cluster close to 0 because by definition the true date is inside the estimated range. Misses (red) spread further out, showing how far off the algorithm was when it failed.

Loading insight…
05

Coverage vs. Precision: Choosing the Activity Threshold

The algorithm uses a max active years threshold to decide which individuals can serve as date anchors. Only individuals whose attested activity spans fewer than this many years are used — their narrow window provides a tighter date constraint. If too few anchors qualify, the algorithm automatically falls back to 2× and then 3× the threshold.

This chart shows how the threshold affects the tradeoff between coverage (how many undated tablets receive an estimate) and precision (average width of the estimated date range). The two vertical markers show the thresholds used in the paper: 17 years (optimal for precision) and 30 years (optimal for coverage).

Loading insight…