Atomic Action Slicing: Planner-Aligned Options for Generalist VLA Agents

Current vision–language–action (VLA) models generalize poorly,
particularly when tasks require new compositions of skills or objects. We introduce Atomic Action Slicing (AAS), a planner-aligned
approach that decomposes long-horizon demonstrations into short,
typed atomic actions that are easier for planners to use and policies
to learn. Using LIBERO demonstrations, AAS produces a validated
dataset of 2,124 atomic segments labeled with action type, temporal span, and confidence. A stronger segmenter (Gemini 2.5 Pro)
closely matches planner-defined plans and remains robust under
keyframe jitter, while smaller models perform worse on multi-object
tasks. Fine-tuning CLIP-RT+ on our atomic dataset improves task
success from 94.2% to 95.3% on LIBERO-Goal and from 83.8% to 88.8% on
LIBERO-Long. We publicly release the GATE-VLAP dataset on Hugging Face.
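
To make the segment format concrete, here is a minimal sketch of what one atomic-segment record and a basic validity check might look like. The abstract states only that each segment carries an action type, a temporal span, and a confidence; the class name, field names, action labels, and the 0.5 threshold below are all hypothetical and are not taken from the released dataset.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class AtomicSegment:
    """One atomic action label. All field names are hypothetical."""
    action_type: str   # illustrative labels such as "pick" or "place"
    start: int         # first frame index, inclusive
    end: int           # last frame index, exclusive
    confidence: float  # segmenter confidence, assumed to lie in [0, 1]


def validate(segments: Sequence[AtomicSegment], num_frames: int) -> bool:
    """Check that spans are non-empty, ordered, non-overlapping, and in range."""
    prev_end = 0
    for seg in segments:
        if not (prev_end <= seg.start < seg.end <= num_frames):
            return False
        if not (0.0 <= seg.confidence <= 1.0):
            return False
        prev_end = seg.end
    return True


def slice_demo(frames: Sequence, segments: Sequence[AtomicSegment],
               min_confidence: float = 0.5) -> List[Tuple[str, list]]:
    """Return (action_type, frames) pairs for segments at or above the threshold."""
    return [(s.action_type, list(frames[s.start:s.end]))
            for s in segments if s.confidence >= min_confidence]


# Toy demonstration: 10 frames split into three hypothetical atomic actions.
demo_frames = list(range(10))
labels = [
    AtomicSegment("pick", 0, 4, 0.9),
    AtomicSegment("move", 4, 7, 0.4),
    AtomicSegment("place", 7, 10, 0.8),
]
if validate(labels, len(demo_frames)):
    clips = slice_demo(demo_frames, labels)
```

In this sketch the low-confidence "move" segment is dropped, so `clips` holds only the "pick" and "place" slices. How the actual pipeline validates or filters segments is not described in the abstract.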

Authors
Tabakov, S., Popov, A., Dimitrov, D., Kiyamousavi, S. E., Hristov, V., Kraychev, B.