GroundControl AI is a discrete-event airport simulator coupled to a MaskablePPO policy. The simulator models a single day at minute granularity: aircraft arrive on a schedule, request services, occupy gates, and depart. A fleet of ground vehicles — fuelers, baggage tugs, pushback tractors — must be dispatched to complete each service before the departure window closes. Vehicle types, shift windows, travel times across the apron, and gate compatibility are all enforced as hard constraints.
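The event-driven core described above can be sketched with a priority queue keyed on the simulation minute. This is a minimal illustration, not the project's simulator: the `Event` class, the `run_day` function, the event kinds, and the fixed 25-minute service time are all hypothetical, and vehicles, gates, and travel times are left out entirely.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Event:
    minute: int                        # minutes since midnight; the only sort key
    kind: str = field(compare=False)   # "arrive", "service_done", or "depart"
    flight: str = field(compare=False)


def run_day(schedule, service_minutes=25):
    """Process events in time order; return a log of (minute, kind, flight).

    schedule: list of (flight, arrival_minute, scheduled_departure_minute).
    Assumption: each flight needs one service that starts on arrival.
    """
    departures = {flight: dep for flight, _, dep in schedule}
    queue = [Event(arr, "arrive", flight) for flight, arr, _ in schedule]
    heapq.heapify(queue)
    log = []
    while queue:
        ev = heapq.heappop(queue)
        log.append((ev.minute, ev.kind, ev.flight))
        if ev.kind == "arrive":
            done = ev.minute + service_minutes
            heapq.heappush(queue, Event(done, "service_done", ev.flight))
        elif ev.kind == "service_done":
            # A flight leaves at its scheduled time, or late if service overran.
            leave = max(ev.minute, departures[ev.flight])
            heapq.heappush(queue, Event(leave, "depart", ev.flight))
    return log


log = run_day([("AA10", 480, 520), ("UA22", 470, 490)])
```

In this toy run, UA22 finishes service at minute 495, after its 490 departure slot, so it departs five minutes late. That is the kind of delay a dispatcher is meant to reduce.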
The dispatcher is a MaskablePPO policy with a 337-dimensional observation space that encodes aircraft states, vehicle positions, pending tasks, and an anticipation buffer of upcoming work. Action masking is what makes the problem tractable: at every decision step, a mask is computed from the world state so that only compatible vehicle/aircraft pairs in valid shift windows are presented to the policy. Vanilla PPO has to learn legality through penalties; MaskablePPO never sees an illegal action in the first place. With masking, the policy converges; vanilla PPO does not.
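To make the masking step concrete, here is a minimal sketch of how a legality mask could be derived from world state. The `Vehicle` and `Task` classes, the `action_mask` function, and the flat row-major action layout are assumptions made for illustration; the project's real action space and compatibility rules (gate compatibility, travel times) are not shown in the source.

```python
from dataclasses import dataclass


@dataclass
class Vehicle:
    kind: str          # e.g. "fueler", "tug", "pushback" (hypothetical labels)
    shift_start: int   # minutes since midnight, inclusive
    shift_end: int     # minutes since midnight, exclusive


@dataclass
class Task:
    flight: str
    needs: str         # vehicle kind required to complete the task


def action_mask(vehicles, tasks, now):
    """Return a flat boolean mask over (vehicle, task) pairs.

    Index i * len(tasks) + j is True only when vehicle i is on shift at
    minute `now` and is the kind that task j requires.
    """
    mask = []
    for v in vehicles:
        on_shift = v.shift_start <= now < v.shift_end
        for t in tasks:
            mask.append(on_shift and v.kind == t.needs)
    return mask


vehicles = [Vehicle("fueler", 360, 840), Vehicle("tug", 840, 1320)]
tasks = [Task("AA10", "fueler"), Task("UA22", "tug")]
morning = action_mask(vehicles, tasks, now=600)
evening = action_mask(vehicles, tasks, now=900)
```

A masked policy samples only from entries that are True, so the illegal pairings never receive probability mass. A real environment would also need a rule for the case where every entry is False, for example an always-legal "wait" action; that detail is an assumption here, not something the source describes.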
Evaluation is where the project is most disciplined. Every result comes from a replayable seed bank: a 50-seed in-distribution battery on KFIC (the synthetic 15-node graph the policy was trained on), a 50-seed out-of-distribution battery, and a real-world 40-flight slice from BTS data on KAUS (Austin-Bergstrom). The KFIC win is real and reported (21% win rate vs the first-come, first-served (FCFS) baseline, mean delay delta −0.4 min, zero conflicts, zero abandonments). The KAUS gap is reported just as honestly: the policy was trained on a 15-node graph and sees fallback values for KAUS's 119-node graph, so it does not fail (zero conflicts) but is overly passive. Retraining on KAUS is the active next step.
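A replayable seed bank amounts to running the policy and the baseline on identical seeded days and comparing them pairwise. The sketch below shows one way the two headline metrics could be computed. `run_episode` is a stand-in for a full simulated day, the dispatchers are toy callables, and counting a "win" as strictly lower total delay than the baseline on the same seed is an assumption about the metric's definition.

```python
import random
from statistics import mean


def run_episode(dispatcher, seed):
    """Stand-in for one simulated day; returns total delay in minutes.

    All randomness derives from `seed`, so a given (dispatcher, seed) pair
    always replays to the same result.
    """
    rng = random.Random(seed)
    base_delay = rng.uniform(20.0, 60.0)   # day difficulty, shared by both sides
    return base_delay + dispatcher(rng)


def evaluate(policy, baseline, seeds):
    """Paired comparison over a seed bank (negative delta favours the policy)."""
    deltas = [run_episode(policy, s) - run_episode(baseline, s) for s in seeds]
    return {
        "win_rate": sum(d < 0 for d in deltas) / len(deltas),
        "mean_delay_delta": mean(deltas),
    }


# Toy dispatchers: the "policy" shaves one minute off every day.
toy_policy = lambda rng: -1.0
toy_fcfs = lambda rng: 0.0

seed_bank = list(range(50))
report = evaluate(toy_policy, toy_fcfs, seed_bank)
replay = evaluate(toy_policy, toy_fcfs, seed_bank)
```

Because both dispatchers face the same seeded day, the delta isolates the dispatching decision from day-to-day variation, and rerunning the bank reproduces the report exactly.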