Illuminating the Record: Temporal Gap-Filling
The chart shows the distribution of tablets over time in 5-year intervals, before and after estimation. Intervals where the newly-dated tablets (orange) are highest relative to the pre-dated ones (blue) are where the algorithm had the greatest historical impact.
Which Archives Benefit Most?
Each bar represents one archive's full tablet corpus, broken down by estimation status. Archives are sorted by the share of newly-dated tablets (orange). Archives with many undated documents gain the most from the algorithm.
The Network Anchors: Most Productive Individuals
These are the people whose known date ranges anchored the most new estimations. A person is counted if they appear in dated tablets (entering the "Yellow Pages" reference) and also appear in tablets that were subsequently estimated. Kings (👑) dominate because they appear as date references in many tablets; non-royal figures reflect active business networks.
How Accurate Are Our Estimates?
We test the algorithm using leave-one-out cross-validation: each of the 3,415 pre-dated tablets is temporarily removed from the corpus, and the algorithm tries to re-estimate its date from the remaining data alone. The estimate is then compared to the known date. A hit means the true date falls inside the estimated range (±1 year tolerance); anything else is a miss. Of the 3,415 tablets, 45 could not be re-estimated at all: once the tablet is removed, they become isolated, with no co-attested person who has a known date range. These 45 are excluded from both charts below and are not counted as misses, leaving 3,370 validated tablets. The 114 tablets that are permanently unestimatable (no date, no network connections) never appear in validation.
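The validation loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the toy corpus, the function names, and above all the estimation rule (intersecting the date ranges of co-attested persons) are assumptions made for the example.

```python
from typing import Optional

# Hypothetical toy corpus: tablet id -> (known year, persons attested on it).
# Years are plain integers on an arbitrary scale, not real dates.
CORPUS = {
    "t1": (100, {"A", "B"}),
    "t2": (102, {"A"}),
    "t3": (110, {"B"}),
    "t4": (130, {"C"}),
    "t5": (101, {"A", "B"}),
}

def person_ranges(tablets: dict) -> dict:
    """Earliest and latest dated attestation of each person."""
    ranges = {}
    for year, people in tablets.values():
        for person in people:
            lo, hi = ranges.get(person, (year, year))
            ranges[person] = (min(lo, year), max(hi, year))
    return ranges

def estimate(people: set, ranges: dict) -> Optional[tuple]:
    """Assumed rule: intersect the ranges of all co-attested known persons."""
    known = [ranges[p] for p in people if p in ranges]
    if not known:
        return None  # isolated: no co-attested person with a known range
    lo = max(r[0] for r in known)
    hi = min(r[1] for r in known)
    return (min(lo, hi), max(lo, hi))  # keep the range valid if anchors disagree

def leave_one_out(tablets: dict, tolerance: int = 1):
    """Hold out each dated tablet in turn and re-estimate it from the rest."""
    hits, misses, isolated = [], [], []
    for tid, (year, people) in tablets.items():
        rest = {k: v for k, v in tablets.items() if k != tid}
        est = estimate(people, person_ranges(rest))
        if est is None:
            isolated.append(tid)  # excluded, not counted as a miss
        elif est[0] - tolerance <= year <= est[1] + tolerance:
            hits.append(tid)
        else:
            misses.append(tid)
    return hits, misses, isolated

hits, misses, isolated = leave_one_out(CORPUS)
hit_rate = len(hits) / (len(hits) + len(misses))  # isolated tablets left out
```

In this toy corpus, tablet t4 becomes isolated once it is held out, because person C appears nowhere else. It is therefore left out of the hit rate, mirroring how the 45 isolated tablets are treated.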
Actual year vs. center of estimate: each point is one validated tablet. The X-axis is the true known date; the Y-axis is the midpoint of the estimated date range ("center of estimate"). Points on the dashed diagonal line would be perfect predictions. Green dots (hits) cluster near the diagonal; red dots (misses) deviate from it.
Distribution of estimation error: how many years does the center of estimate deviate from the true date? Hits (green) cluster close to 0 because, by definition, the true date is inside the estimated range, so the center can be off by at most half the range width plus the 1-year tolerance. Misses (red) spread further out, showing how far off the algorithm was when it failed.
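As a small worked illustration of the quantity plotted in the error chart, the snippet below computes the deviation of the center of estimate from the true date. The function name and the sign convention (center minus true date) are assumptions for this sketch; the page does not say which sign is used.

```python
def center_error(true_year: float, lo: float, hi: float) -> float:
    """Deviation of the range midpoint from the true date (assumed sign: center - true)."""
    center = (lo + hi) / 2
    return center - true_year

# Hypothetical values: an estimated range of 100..101.
near = center_error(102, 100, 101)  # true date just inside the 1-year tolerance
far = center_error(110, 100, 101)   # true date well outside the range
```

For a hit, the absolute error cannot exceed half the range width plus the tolerance, which is why the green bars stay close to zero.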
Coverage vs. Precision: Choosing the Activity Threshold
The algorithm uses a max active years threshold to decide which individuals can serve as
date anchors. Only individuals whose attested activity spans fewer than this many years are used — their
narrow window provides a tighter date constraint. If too few anchors qualify, the algorithm automatically
falls back to 2× and then 3× the threshold.
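The fallback rule can be sketched as follows. This is an illustration under stated assumptions, not the algorithm's actual code: the function name, the `active_years` mapping, and the `min_anchors` parameter (what counts as "too few") are hypothetical.

```python
def select_anchors(people, active_years, threshold, min_anchors=1):
    """Pick anchors under the threshold, retrying at 2x and then 3x if too few qualify.

    active_years maps a person to the span of their attested activity in years.
    min_anchors is an assumed cutoff; the source does not state the exact number.
    Returns the chosen anchors and the limit that was finally applied.
    """
    for factor in (1, 2, 3):
        limit = factor * threshold
        anchors = [
            p for p in people
            if p in active_years and active_years[p] < limit  # "fewer than"
        ]
        if len(anchors) >= min_anchors:
            return anchors, limit
    return [], None  # nobody qualifies even at 3x the threshold

# Hypothetical activity spans, in years.
spans = {"A": 5, "B": 40, "C": 20, "E": 60}

strict = select_anchors(["A", "B"], spans, threshold=17)
doubled = select_anchors(["C"], spans, threshold=17)
tripled = select_anchors(["B"], spans, threshold=17)
none_found = select_anchors(["E"], spans, threshold=17)
```

Here person C (20 active years) fails the 17-year limit but qualifies once the limit is doubled to 34, while person E qualifies at no level.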
This chart shows how the threshold affects the tradeoff between coverage (how many undated
tablets receive an estimate) and precision (average width of the estimated date range; a narrower
range is more precise).
The two vertical markers show the thresholds used in the paper: 17 years (optimal for precision)
and 30 years (optimal for coverage).
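The two quantities on this chart can be computed from a list of estimates as in the sketch below. The function name and the sample ranges are hypothetical, and the width is assumed to be the upper bound minus the lower bound; the page does not spell out its exact definition.

```python
def coverage_and_width(estimates):
    """Return (number of tablets with an estimate, mean width of those estimates).

    estimates holds one entry per undated tablet: a (lo, hi) range,
    or None when no estimate could be made.
    """
    made = [e for e in estimates if e is not None]
    covered = len(made)
    mean_width = sum(hi - lo for lo, hi in made) / covered if covered else None
    return covered, mean_width

# Hypothetical results for four undated tablets at one threshold setting.
sample = [(100, 104), (200, 208), None, (50, 50)]
covered, mean_width = coverage_and_width(sample)
```

Running this function once per candidate threshold and plotting the two values against the threshold would reproduce the shape of the tradeoff: a looser threshold admits more anchors and raises coverage, while the resulting ranges tend to widen.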