3 · Period Classification

How much historical information survives in tablet silhouette alone?

Paper section: Results §3.2 · Notebooks: 3, 4, 5, 6, 7, 8, 9

Overview

Period classification benchmarks the information content of tablet silhouettes. If a classifier trained only on shape achieves high accuracy, shape carries strong historical signal. If accuracy is low but above chance, shape carries some signal but is noisy. The analysis uses four classifiers of increasing representational power, from a hand-crafted decision tree to a large pre-trained vision transformer.
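As a rough illustration of the input the shape-only models see, here is a minimal sketch that binarizes a grayscale image and downsamples it to 80×80. The thresholding rule and the `to_silhouette` helper are hypothetical; the paper's actual segmentation and resizing pipeline may differ.

```python
import numpy as np

def to_silhouette(gray, size=80, thresh=None):
    """Binarize a grayscale image and resize it to size x size.

    Hypothetical preprocessing sketch: mean-threshold segmentation and
    nearest-neighbour downsampling stand in for the real pipeline.
    """
    if thresh is None:
        thresh = gray.mean()                 # crude foreground/background split
    mask = (gray < thresh).astype(float)     # assume tablet darker than background
    h, w = mask.shape
    rows = np.arange(size) * h // size       # nearest-neighbour row indices
    cols = np.arange(size) * w // size       # nearest-neighbour column indices
    return mask[np.ix_(rows, cols)]

demo = to_silhouette(np.random.rand(240, 320))
print(demo.shape)  # (80, 80)
```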

The four classifiers

| Model | Input | Accuracy | Notes |
|-------|-------|----------|-------|
| Decision tree | 3 hand-crafted ratios | 30.1% | Baseline — explicit shape features only |
| CNN (shallow) | 80×80 silhouette pixels | 50.0% | Learns local texture + global form |
| ResNet50 — silhouette | 80×80 silhouette (3-channel) | 61.0% | Deep residual network, shape-only |
| ResNet50 — full photo | 224×224 RGB photo | 71.0% | Same architecture + surface detail |

All models classify into 21 historical periods (chance = 4.8%).
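The chance figure follows directly from the class count, assuming uniform random guessing over the 21 periods:

```python
n_classes = 21
chance = 100.0 / n_classes        # expected accuracy of uniform random guessing
print(f"chance = {chance:.1f}%")  # chance = 4.8%
```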

The key ratio: 86% information retention

The silhouette achieves 61% versus the full photo’s 71%, giving:

\[\text{retention} = \frac{61\%}{71\%} \approx 86\%\]

Shape alone carries ~86% of the period-classification signal present in the full photograph. The remaining 14% is encoded in surface features — script density, impression depth, tablet colour, surface texture — that the binary silhouette discards.
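The retention figure is simply the ratio of the two accuracies:

```python
silhouette_acc = 61.0   # ResNet50 on 80x80 silhouettes
photo_acc = 71.0        # ResNet50 on 224x224 RGB photos
retention = silhouette_acc / photo_acc
print(f"retention = {retention:.1%}")  # retention = 85.9%
```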

Code
import matplotlib.pyplot as plt
import numpy as np

models = ['Decision\nTree\n(3 ratios)', 'CNN\n(silhouette)', 'ResNet50\n(silhouette)', 'ResNet50\n(full photo)']
accs = [30.1, 50.0, 61.0, 71.0]
colors = ['#ccc', '#4a6fa5', '#b5622e', '#2c6e49']

fig, ax = plt.subplots(figsize=(7, 4))
bars = ax.bar(models, accs, color=colors, alpha=0.9, edgecolor='white', lw=0.5, width=0.55)
ax.axhline(4.8, color='grey', ls='--', lw=1.2, alpha=0.7, label='Chance (21 classes = 4.8%)')
ax.axhline(61.0, color='#b5622e', ls=':', lw=1.0, alpha=0.6)
ax.axhline(71.0, color='#2c6e49', ls=':', lw=1.0, alpha=0.6)

# Annotate the 86% retention ratio
ax.annotate('', xy=(3, 71), xytext=(3, 61),
            arrowprops=dict(arrowstyle='<->', color='black', lw=1.5))
ax.text(3.08, 66, '14%\n(surface\ndetail)', fontsize=7.5, va='center')
ax.annotate('86% retention\n(shape alone)', xy=(2.5, 66), fontsize=8,
            color='#b5622e', ha='center', style='italic')

for bar, acc in zip(bars, accs):
    ax.text(bar.get_x() + bar.get_width()/2, acc + 0.8, f'{acc}%',
            ha='center', va='bottom', fontsize=9, fontweight='bold')

ax.set_ylabel('21-class accuracy (%)', fontsize=10)
ax.set_title('Period classification accuracy by model', fontsize=11)
ax.set_ylim(0, 80)
ax.legend(fontsize=8)
plt.tight_layout()
plt.show()
Figure 1: Classification accuracy by model type, ordered by representational power. The gap between ResNet50-silhouette and ResNet50-photo quantifies what shape discards; the gap above chance quantifies what shape retains.

What the confusion matrix reveals

Code
import matplotlib.pyplot as plt
import numpy as np

# Representative per-class recall values from the paper
periods = ['Ur III', 'Neo-Assyrian', 'Old Babylonian', 'Achaemenid',
           'Neo-Babylonian', 'Hellenistic', 'Middle Elamite', 'Old Assyrian']
recall = [0.87, 0.95, 0.68, 0.72, 0.41, 0.38, 0.64, 0.55]

fig, ax = plt.subplots(figsize=(8, 4))
colors = ['#b5622e' if r >= 0.70 else '#4a6fa5' if r >= 0.55 else '#aaa'
          for r in recall]
bars = ax.barh(periods[::-1], [r*100 for r in recall[::-1]],
               color=colors[::-1], alpha=0.85, edgecolor='white', lw=0.5)
ax.axvline(61.0, color='grey', ls='--', lw=1.0, alpha=0.7, label='Overall accuracy')
for bar, r in zip(bars, recall[::-1]):
    ax.text(r*100 + 0.5, bar.get_y() + bar.get_height()/2,
            f'{r*100:.0f}%', va='center', fontsize=8.5)
ax.set_xlabel('Per-class recall (%)', fontsize=10)
ax.set_title('Selected per-class recall — ResNet50 (silhouette)', fontsize=10)
ax.set_xlim(0, 105)
ax.legend(fontsize=8)
plt.tight_layout()
plt.show()
Figure 2: Per-class recall for the silhouette ResNet50 (the diagonal of the confusion matrix). Periods with the clearest shape signatures (Ur III, Neo-Assyrian) achieve the highest recall; periods with high within-period variance (Neo-Babylonian, Hellenistic) are most often confused with adjacent periods.

Period-level performance

The two best-classified periods:

- Neo-Assyrian (~95% recall): extreme portrait orientation (X7 = +2.05); its distinctively tall, narrow form makes it nearly unmistakable
- Ur III (~87% recall): the tightest standardized portrait corpus; the classifier exploits its low within-period variance

The hardest periods:

- Neo-Babylonian (~41% recall): highest CV (80.7%) and a diverse genre mixture; tablets are often mistaken for Achaemenid or Hellenistic
- Hellenistic (~38% recall): small corpus, and a format largely continuous with Achaemenid; confused with both Achaemenid and Neo-Babylonian

What this establishes

The 86% retention figure serves as the methodological warrant for the rest of the analysis: if silhouette shape were uninformative, the VAE latent space, the h/w ratios, and the geographic comparisons would all be noise. The classification benchmark establishes that they are not.

The decision-tree baseline (30.1% on 3 hand-crafted ratios vs. 4.8% chance) further confirms that simple, interpretable shape features already carry substantial historical signal — the deep models primarily improve on the long tail of confusable periods.
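To make that baseline concrete, here is a minimal sketch of a decision tree trained on three shape ratios. The ratio definitions and the synthetic two-period data are hypothetical stand-ins; the paper uses its own features, 21 classes, and real tablets.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical features: e.g. height/width, thickness/width, face aspect.
# Synthetic stand-in data: two mock "periods" with different typical ratios.
X_portrait = rng.normal([1.4, 0.35, 1.1], 0.1, size=(50, 3))   # tall tablets
X_landscape = rng.normal([0.8, 0.30, 0.9], 0.1, size=(50, 3))  # wide tablets
X = np.vstack([X_portrait, X_landscape])
y = np.array([0] * 50 + [1] * 50)

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(clf.score(X, y))  # near-perfect on these cleanly separated synthetic classes
```

A shallow tree on interpretable ratios also yields readable split rules, which is what makes it a useful baseline against the deep models.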

Note

Next: VAE Features → — the 12-dimensional latent space and the discrimination vs. trend dissociation.