How I built a decoupled Minecraft pipeline with game environment setup, RCON, 60 FPS recording, and a swappable agent to collect paired video and action data for world model training.
dataagent
How I built a decoupled Minecraft pipeline with game environment setup, RCON, 60 FPS recording, and a swappable agent to collect paired video and action data for world model training.
Gemini 3 Pro seemingly recognized text out of a blurry 114x64 screenshot that I couldn't read myself. I ran a brief ablation study to understand how it arrived at its conclusions.