The problem was known: two thousand-odd photos in the archive, alt text written for maybe two hundred. Writing the rest by hand is the kind of job I've been putting off for three years, so I stopped pretending I'd ever do it.
The fix runs on a mini PC the office was throwing out: sixteen gigs of RAM, a past life as a spreadsheet mule. On top of it I put llama.cpp and a Qwen model quantized to 4 bits that fits in memory by a miracle.
The night round
At two in the morning a script kicks off: it grabs the photos with no alt text, resizes them, hands them to the model with a strict prompt, and writes the results to a review file. It never touches the database. In the morning, with coffee, I approve or fix.
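A minimal sketch of that round, in Python, might look like the block below. It is not the actual script. The record shape, the CSV columns, and the names night_round and describe are all assumptions made for illustration. The resize step and the call to the model are folded into the describe callable, since the details of both depend on the setup.

```python
import csv
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical record shape: (photo_id, file_path, existing_alt_or_None).
Photo = Tuple[str, str, Optional[str]]


def night_round(
    photos: Iterable[Photo],
    describe: Callable[[str], str],
    review_path: str,
) -> int:
    """Draft alt text for photos that have none and write the drafts to a review file.

    `describe` is an assumed callable that takes a file path, does whatever
    resizing and model call the setup needs, and returns a draft description.
    Nothing is written back to wherever `photos` came from.
    """
    drafted = 0
    with open(review_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["photo_id", "path", "draft_alt", "status"])
        for photo_id, path, alt in photos:
            if alt and alt.strip():
                continue  # already has alt text, leave it alone
            draft = describe(path).strip()
            # Every draft starts as "pending" until a human approves or fixes it.
            writer.writerow([photo_id, path, draft, "pending"])
            drafted += 1
    return drafted
```

The point of the design is the one-way flow: the function reads records and writes a file, and only the morning review decides what reaches the database.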
First honest result: the model sees fine and writes so-so. "A person standing near a window" is correct and useless. I rewrote the prompt four times before it stopped opening every sentence with "The image shows".
What I keep
After a month: 70% of the alt text passes with minor edits, 20% I rewrite, 10% is comedy and I collect it in a separate file. For a job I'd never have done by hand, that's a trade I'll sign.
The mini PC draws less than the darkroom safelight. I can hear it grinding from the next room, and it's a sound with something right about it.