---
title: "3 · Period Classification"
subtitle: "How much historical information survives in tablet silhouette alone?"
sidebar: analyses
---

> **Paper section:** Results §3.2 · **Notebooks:** `3`, `4`, `5`, `6`, `7`, `8`, `9`

## Overview

Period classification benchmarks the **information content** of tablet silhouettes. If a classifier trained only on shape achieves high accuracy, shape carries strong historical signal. If accuracy is low but above chance, shape carries some signal but is noisy. The analysis uses four classifiers of increasing representational power, from a hand-crafted decision tree to a deep pre-trained residual network.
## The four classifiers

| Model | Input | Accuracy | Notes |
|---|---|---|---|
| Decision tree | 3 hand-crafted ratios | 30.1% | Baseline — explicit shape features only |
| CNN (shallow) | 80×80 silhouette pixels | 50.0% | Learns local texture + global form |
| ResNet50 — silhouette | 80×80 silhouette (3ch) | 61.0% | Deep residual network, shape-only |
| ResNet50 — full photo | 224×224 RGB photo | 71.0% | Same architecture + surface detail |

All models classify into **21 historical periods** (chance = 4.8%).
## The key ratio: 86% information retention

The silhouette achieves **61%** versus the full photo's **71%**, giving:

$$\text{retention} = \frac{61\%}{71\%} \approx 86\%$$

**Shape alone carries ~86% of the period-classification signal present in the full photograph.** The remaining 14% is encoded in surface features — script density, impression depth, tablet colour, surface texture — that the binary silhouette discards.
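The retention figure is just the ratio of the two reported accuracies, and the chance level is one over the number of classes. A minimal sketch (the `retention` helper is my own name, not from the paper's code):

```python
def retention(silhouette_acc: float, photo_acc: float) -> float:
    """Fraction of the full-photo signal retained by the silhouette."""
    return silhouette_acc / photo_acc

chance = 1 / 21  # 21 historical periods
print(f"chance level: {chance:.1%}")                  # 4.8%
print(f"retention:    {retention(0.61, 0.71):.0%}")   # 86%
```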
```{python}
#| label: fig-classifier-comparison
#| fig-cap: "Classification accuracy by model type, ordered by representational power. The gap between ResNet50-silhouette and ResNet50-photo quantifies what shape discards; the gap above chance quantifies what shape retains."
import matplotlib.pyplot as plt
import numpy as np

models = ['Decision\nTree\n(3 ratios)', 'CNN\n(silhouette)',
          'ResNet50\n(silhouette)', 'ResNet50\n(full photo)']
accs = [30.1, 50.0, 61.0, 71.0]
colors = ['#ccc', '#4a6fa5', '#b5622e', '#2c6e49']

fig, ax = plt.subplots(figsize=(7, 4))
bars = ax.bar(models, accs, color=colors, alpha=0.9,
              edgecolor='white', lw=0.5, width=0.55)
ax.axhline(4.8, color='grey', ls='--', lw=1.2, alpha=0.7,
           label='Chance (21 classes = 4.8%)')
ax.axhline(61.0, color='#b5622e', ls=':', lw=1.0, alpha=0.6)
ax.axhline(71.0, color='#2c6e49', ls=':', lw=1.0, alpha=0.6)

# Annotate the 86% retention ratio
ax.annotate('', xy=(3, 71), xytext=(3, 61),
            arrowprops=dict(arrowstyle='<->', color='black', lw=1.5))
ax.text(3.08, 66, '14%\n(surface\ndetail)', fontsize=7.5, va='center')
ax.annotate('86% retention\n(shape alone)', xy=(2.5, 66), fontsize=8,
            color='#b5622e', ha='center', style='italic')
for bar, acc in zip(bars, accs):
    ax.text(bar.get_x() + bar.get_width() / 2, acc + 0.8, f'{acc}%',
            ha='center', va='bottom', fontsize=9, fontweight='bold')
ax.set_ylabel('21-class accuracy (%)', fontsize=10)
ax.set_title('Period classification accuracy by model', fontsize=11)
ax.set_ylim(0, 80)
ax.legend(fontsize=8)
plt.tight_layout()
plt.show()
```
## What the confusion matrix reveals
```{python}
#| label: fig-confusion-sketch
#| fig-cap: "Schematic of ResNet50-silhouette confusion patterns. Periods with the clearest shape signatures (Ur III, Neo-Assyrian) achieve the highest per-class recall; periods with high within-period variance (Neo-Babylonian, Hellenistic) are most often confused with adjacent periods."
import matplotlib.pyplot as plt
import numpy as np

# Representative per-class recall values from the paper
periods = ['Ur III', 'Neo-Assyrian', 'Old Babylonian', 'Achaemenid',
           'Neo-Babylonian', 'Hellenistic', 'Middle Elamite', 'Old Assyrian']
recall = [0.87, 0.95, 0.68, 0.72, 0.41, 0.38, 0.64, 0.55]

fig, ax = plt.subplots(figsize=(8, 4))
colors = ['#b5622e' if r >= 0.70 else '#4a6fa5' if r >= 0.55 else '#aaa'
          for r in recall]
bars = ax.barh(periods[::-1], [r * 100 for r in recall[::-1]],
               color=colors[::-1], alpha=0.85, edgecolor='white', lw=0.5)
ax.axvline(61.0, color='grey', ls='--', lw=1.0, alpha=0.7,
           label='Overall accuracy')
for bar, r in zip(bars, recall[::-1]):
    ax.text(r * 100 + 0.5, bar.get_y() + bar.get_height() / 2,
            f'{r*100:.0f}%', va='center', fontsize=8.5)
ax.set_xlabel('Per-class recall (%)', fontsize=10)
ax.set_title('Selected per-class recall — ResNet50 (silhouette)', fontsize=10)
ax.set_xlim(0, 105)
ax.legend(fontsize=8)
plt.tight_layout()
plt.show()
```
## Period-level performance

The two best-classified periods are:

- **Neo-Assyrian (~95% recall)**: extreme portrait orientation (X7 = +2.05); its distinctive tall, narrow form makes it nearly unmistakable
- **Ur III (~87% recall)**: the tightest standardized portrait corpus; the classifier exploits low within-period variance

The hardest periods:

- **Neo-Babylonian (~41%)**: highest CV (80.7%), diverse genre mixture; tablets are often mistaken for Achaemenid or Hellenistic
- **Hellenistic (~38%)**: small corpus, format largely continuous with Achaemenid; confused with Achaemenid and Neo-Babylonian
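Per-class recall values like these come straight off the confusion-matrix diagonal: each class's recall is its diagonal count divided by its row total. A minimal sketch with a made-up 3-class matrix (the numbers are illustrative, not the paper's actual confusion matrix):

```python
import numpy as np

# Illustrative confusion matrix (rows = true period, cols = predicted);
# values invented for demonstration, rows sum to 100 tablets each.
cm = np.array([
    [87,  8,  5],   # e.g. Ur III
    [ 3, 95,  2],   # e.g. Neo-Assyrian
    [20, 39, 41],   # e.g. Neo-Babylonian
])

# Per-class recall = diagonal counts / row sums
per_class_recall = np.diag(cm) / cm.sum(axis=1)
print(per_class_recall)  # [0.87 0.95 0.41]
```

Overall accuracy is the analogous global ratio, `np.trace(cm) / cm.sum()`, which is why a model can post a respectable overall number while individual periods sit far below it.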
## What this establishes
The 86% retention figure serves as the methodological warrant for the rest of the analysis: if silhouette shape were uninformative, the VAE latent space, the h/w ratios, and the geographic comparisons would all be noise. The classification benchmark establishes that they are not.
The decision-tree baseline (30.1% on 3 hand-crafted ratios vs. 4.8% chance) further confirms that simple, interpretable shape features already carry substantial historical signal — the deep models primarily improve on the long tail of confusable periods.
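A baseline of this kind amounts to a handful of explicit thresholds on the hand-crafted ratios. The sketch below is purely illustrative: the ratio names, threshold values, and labels are all my own inventions, not the paper's fitted tree, but it shows why such a model is both interpretable and well above chance:

```python
# Hypothetical threshold rules over three hand-crafted shape ratios.
# All cutoffs and labels are invented for illustration.
def classify(hw_ratio: float, thickness_ratio: float, fill_ratio: float) -> str:
    if hw_ratio > 1.8:                            # extreme portrait formats
        return "Neo-Assyrian-like"
    if hw_ratio > 1.2 and fill_ratio > 0.85:      # tight, standardized portrait
        return "Ur III-like"
    if thickness_ratio > 0.5:                     # thick, pillow-shaped tablets
        return "pillow-shaped (older)"
    return "other / landscape"

print(classify(2.1, 0.3, 0.9))  # Neo-Assyrian-like
```

Each decision path reads off as a plain statement about tablet proportions, which is exactly the interpretability the deep models trade away for accuracy on the confusable periods.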
::: {.callout-note}
**Next:** [VAE Features →](04-vae.qmd) — the 12-dimensional latent space and the discrimination vs. trend dissociation.
:::