FluxVLA Code Architecture Overview

This page provides a quick overview of the FluxVLA code structure, helping you locate key modules faster when reading, debugging, or extending the framework.

[Figure: FluxVLA code architecture diagram]

1. Code Organization Overview

FluxVLA has a clear project structure. The core directories are:

  1. fluxvla/ (Core Implementation)

    • Covers model-related modules: VLA, Backbone, Head, and Projector.

    • Also includes core training and inference components: datasets, transformers (data transforms), tokenizer, engine, and optimizer.

  2. configs/ (Configuration Organization)

    • Organized by model families: openvla, llava, groot, pi0, and pi05.

    • Unifies model, data, training, evaluation, and inference parameters.

  3. scripts/ (Workflow Entrypoints)

    • Connects training, evaluation, and real-robot inference workflows.

    • Common entrypoints include scripts/train.sh, scripts/eval.sh, and scripts/inference_real_robot.py.
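To make the module split concrete, here is a minimal, hypothetical sketch (not actual FluxVLA code; all class names and signatures are illustrative) of how a VLA model could compose the Backbone, Projector, and Head described above:

```python
# Illustrative sketch only: how the fluxvla/ layout's three model pieces
# might fit together. None of these classes are the real FluxVLA API.

class Backbone:
    """Encodes vision-language inputs into hidden features."""
    def encode(self, image, instruction):
        # Stand-in: real backbones produce dense feature tensors.
        return {"features": [len(instruction)] * 4}

class Projector:
    """Maps backbone features into the action head's input space."""
    def project(self, encoded):
        return [f * 0.5 for f in encoded["features"]]

class Head:
    """Decodes projected features into a robot action."""
    def predict(self, projected):
        return {"action": sum(projected) / len(projected)}

class VLA:
    """Top-level model tying the three modules together."""
    def __init__(self, backbone, projector, head):
        self.backbone = backbone
        self.projector = projector
        self.head = head

    def forward(self, image, instruction):
        encoded = self.backbone.encode(image, instruction)
        projected = self.projector.project(encoded)
        return self.head.predict(projected)

model = VLA(Backbone(), Projector(), Head())
result = model.forward(image=None, instruction="pick up the cube")
```

The design point the sketch illustrates is that each module owns one stage of the pipeline, so a Backbone or Head can be swapped per model family without touching the others.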

2. End-to-End Data and Training Flow

A typical end-to-end execution flow is:

  • Load data from Parquet or RLDS

  • Apply data transforms (transformers) and assemble batches

  • Run model forward

  • Compute action loss

  • Run backward

  • Complete the training loop with FSDP/DDP, standard optimizers, logs, and checkpoints

This flow covers the critical path from data loading to distributed training convergence, and maps directly onto how parameters are organized in configs/ and invoked from scripts/.
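The steps above can be sketched with a toy one-parameter model (everything here is illustrative, not FluxVLA code): loading stands in for Parquet/RLDS, the transform stage stands in for the transformers modules, and plain SGD with an analytic gradient stands in for the real backward pass.

```python
# Hedged sketch of the end-to-end flow; all names are hypothetical.

def load_batches():
    # Stand-in for Parquet/RLDS loading: (observation, target_action) pairs.
    yield [(1.0, 2.0), (2.0, 4.0)]
    yield [(3.0, 6.0)]

def transform(batch):
    # Stand-in for the data-transform stage plus batch assembly.
    return [(obs / 10.0, act) for obs, act in batch]

weight = 0.0  # toy one-parameter "policy": predicted_action = weight * obs
lr = 0.5

for batch in load_batches():
    batch = transform(batch)
    # Model forward.
    preds = [weight * obs for obs, _ in batch]
    # Action loss (mean squared error against target actions).
    loss = sum((p - a) ** 2 for p, (_, a) in zip(preds, batch)) / len(batch)
    # Backward: analytic gradient of the loss w.r.t. the scalar weight.
    grad = sum(2 * (p - a) * o for p, (o, a) in zip(preds, batch)) / len(batch)
    # Optimizer step (plain SGD; the FSDP/DDP wrapping, logging, and
    # checkpointing of the real loop are omitted in this sketch).
    weight -= lr * grad
```

In the real framework the forward, loss, and optimizer steps operate on tensors under FSDP/DDP, but the control flow per training step is the same.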

3. Typical Call Chain

A common execution flow is as follows:

  • Load config β†’ Build dataset and data loader β†’ Build model and modules β†’ Start training/evaluation engine β†’ Output logs and checkpoints β†’ Run inference.
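The chain above can be sketched as a sequence of stub functions; every function name, config key, and file name below is a hypothetical placeholder, not the real FluxVLA API (only the configs/pi0 directory is taken from the layout described earlier):

```python
# Illustrative call chain: config -> data -> model -> engine -> inference.

def load_config(path):
    # Stand-in for parsing a config file under configs/ (file name assumed).
    return {"model": "pi0", "batch_size": 2, "steps": 3}

def build_dataloader(cfg):
    # Stand-in for dataset + data loader construction.
    return [list(range(cfg["batch_size"])) for _ in range(cfg["steps"])]

def build_model(cfg):
    # Stand-in for assembling the model and its modules.
    return {"name": cfg["model"], "trained_steps": 0}

def run_engine(model, loader):
    # Stand-in for the training/evaluation engine: emits logs, saves a checkpoint.
    logs = []
    for step, batch in enumerate(loader):
        model["trained_steps"] += 1
        logs.append(f"step={step} batch_size={len(batch)}")
    checkpoint = dict(model)  # stand-in for writing a checkpoint to disk
    return logs, checkpoint

def run_inference(checkpoint, observation):
    # Stand-in for deploying the checkpoint for real-robot inference.
    return f"{checkpoint['name']} action for {observation}"

cfg = load_config("configs/pi0/train.yaml")  # path is an assumption
loader = build_dataloader(cfg)
model = build_model(cfg)
logs, ckpt = run_engine(model, loader)
action = run_inference(ckpt, "camera_frame")
```

Each stub corresponds to one arrow in the chain, which is also the order in which the tutorials below introduce the real modules.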

You can explore each layer in depth through the following tutorials:

  • :doc:`config/index`

  • :doc:`private_model`

  • :doc:`private_module`

  • :doc:`private_engine`

  • :doc:`private_dataset_config`

  • :doc:`inference/index`