WIZARD.

Robotic Policy Adaptation
via Weight-Space Meta-Learning

TL;DR.

WIZARD enables a robot to learn a new task from a single video demonstration, without any fine-tuning. It directly generates the policy weights (the "brain surgery") the robot needs to adapt, improving performance on unseen tasks by up to ~15x over standard methods.

Zero-Shot Rollout Demonstrations.

LIBERO-Spatial: Pick up the black bowl on the stove and place it on the plate

LIBERO-Object: Pick up the orange juice and place it in the basket

LIBERO-Goal: Put the wine bottle on top of the cabinet

LIBERO-Spatial: Pick up the black bowl on the cookie box and place it on the plate

LIBERO-Object: Pick up the cream cheese and place it in the basket

LIBERO-Goal: Put the bowl on the plate

Abstract.

Modern imitation-based robotic policies are predominantly expert systems: large pretrained Vision-Language-Action (VLA) models typically require test-time fine-tuning on task data to specialize their behavior. We propose a new paradigm for robotic test-time adaptation: generating task-specific policy parameters directly in weight space, enabling scalable zero-shot adaptation to unseen tasks.

We introduce WIZARD, a meta-network that generates low-rank adapters for a frozen VLA policy from task evidence composed of language instructions and short demonstration videos. During meta-training, we construct a dataset that pairs task descriptions with their corresponding task-specific LoRA updates. At inference time, given only a prompt and a single video from an unseen task, WIZARD generates the task-specific LoRA adapter in a single forward pass. Experiments on the LIBERO benchmark show that the generated adapters improve performance by up to ~2x on unseen datasets and up to ~15x on unseen tasks.
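For context, the low-rank adapters WIZARD generates follow the standard LoRA parameterization: a frozen weight matrix $W_0$ is specialized by adding a low-rank product, where the rank $r$ is a small hyperparameter. The difference in WIZARD is that the factors are predicted in one forward pass rather than fitted by gradient descent:

```latex
W' \;=\; W_0 + \Delta W \;=\; W_0 + BA,
\qquad B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)
```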

Key Contributions.

The WIZARD Architecture.

WIZARD Architecture Diagram

WIZARD reframes robotic adaptation as a direct parameter-inference problem. Instead of slow, gradient-based fine-tuning, it learns to map task evidence directly to the parameter delta (a LoRA adapter) needed to specialize a general policy. This is done in three stages:

  1. Perception: A multimodal encoder processes a language prompt and a single demonstration video, creating a compact "task embedding" that captures the task's semantics and kinematics.
  2. Reasoning: The core meta-network, the Adapter Generator, takes this task embedding and synthesizes the weights for a LoRA adapter in a single forward pass.
  3. Action: The generated adapter is injected into the frozen, pre-trained VLA backbone. The resulting specialized policy can then be rolled out to execute the new task, zero-shot.

Zero-Shot Generalization: LIBERO-Spatial.

We evaluate against standard fine-tuning under strict held-out distribution shifts. In LIBERO-Spatial, the baseline MT-VLA struggles to adapt without gradient updates. In contrast, WIZARD's generated adapters correctly interpret relational spatial instructions and adapt kinematics, achieving a ~2x improvement over the baseline, entirely zero-shot.

| Method | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 | T10 | Avg. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| MT-VLA (Baseline) | 0.22 | 0.00 | 0.56 | 0.86 | 0.00 | 0.02 | 0.00 | 0.18 | 0.02 | 0.00 | 0.19 |
| WIZARD (Ours) | 0.90 | 0.12 | 0.82 | 0.84 | 0.08 | 0.28 | 0.10 | 0.76 | 0.08 | 0.00 | 0.40 |
| π₀.₅ Experts (Upper Bound) | 1.00 | 0.98 | 1.00 | 0.94 | 0.92 | 0.98 | 0.96 | 0.98 | 0.96 | 0.96 | 0.97 |

Ablations.

Our analysis reveals that task-level embeddings preserve discriminative semantics necessary for effective parameter generation, outperforming broader dataset-level conditioning. Furthermore, a single demonstration episode is sufficient to construct an informative task embedding; expanding the support size yields diminishing returns. Finally, ablating the input modalities confirms that both text (providing high-level intent) and visual demonstrations (providing geometric and kinematic priors) are critical to prevent performance collapse.

Citation.

@misc{bianchi2026wizard,
    title={WIZARD: Robotic Policy Adaptation via Weight-Space Meta-Learning}, 
    author={Christian Bianchi and Siamak Yousefi and Alessio Sampieri and Luca Rigazio and Fabio Galasso and Luca Franco},
    year={2026},
    publisher={ItalAI}
}