Every tiny LM, same eval harness, transparent benchmarks
Generate flower images from selected classes