Frequently Asked Questions

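This page collects common questions about FluxVLA. It is updated continuously, so please keep asking questions to help us improve!
You are also welcome to ask through the 🦞 Lobster Assistant in the top-right corner; the "Lobster" answers questions and logs them.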

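How do I run Libero evaluation on devices without ray-tracing support (e.g., A100)?

To evaluate Libero on devices without ray-tracing support (such as the A100), see GPU Rendering on EGL Devices.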

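How do I debug with VSCode?

FluxVLA's training and evaluation scripts are launched through torchrun for distributed training, so VSCode's default Python debugging does not work directly. To debug with breakpoints, configure .vscode/launch.json so that torchrun is the program the debugger launches.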

1. Create the debug configuration file

Create .vscode/launch.json in the project root (edit it directly if it already exists).

2. Configuration template

The following is a general-purpose template for debugging the training script:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Train Debug",
            "type": "debugpy",
            "request": "launch",
            "program": "/root/miniconda3/bin/torchrun",
            "args": [
                "--nnodes", "1",
                "--nproc_per_node", "2",
                "scripts/train.py",
                "--config", "configs/<model>/<your_config>.py",
                "--work-dir", "work_dirs/<your_work_dir>",
                "--cfg-options",
                "train_dataloader.batch_size=4",
                "train_dataloader.per_device_batch_size=2",
                "runner.max_epochs=None",
                "runner.max_steps=100",
                "runner.save_iter_interval=10"
            ],
            "console": "integratedTerminal",
            "justMyCode": false,
            "env": {
                "CUDA_VISIBLE_DEVICES": "0,1",
                "HF_ENDPOINT": "https://hf-mirror.com",
                "WANDB_MODE": "disabled"
            }
        }
    ]
}

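The evaluation configuration is similar: replace scripts/train.py with scripts/eval.py, and replace --work-dir and the other training arguments with --ckpt-path. Add the entry below to the configurations array: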

{
    "name": "Eval Debug",
    "type": "debugpy",
    "request": "launch",
    "program": "/root/miniconda3/bin/torchrun",
    "args": [
        "--nnodes", "1",
        "--nproc_per_node", "2",
        "scripts/eval.py",
        "--config", "configs/<model>/<your_config>.py",
        "--ckpt-path", "work_dirs/<your_work_dir>/checkpoints/latest-checkpoint.pt"
    ],
    "console": "integratedTerminal",
    "justMyCode": false,
    "env": {
        "CUDA_VISIBLE_DEVICES": "0,1",
        "HF_ENDPOINT": "https://hf-mirror.com",
        "WANDB_MODE": "disabled"
    }
}
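
Because the two configurations differ only in the script and its arguments, they can also be generated from one helper. The following is a minimal sketch and not part of FluxVLA: `make_config` and `build_launch` are hypothetical names, the torchrun path and the placeholder arguments are copied from the templates above, and the GPU list is assumed to be the first `nproc` devices. Adjust all of these to your environment.

```python
import json


def make_config(name, script, script_args, nproc=2,
                torchrun="/root/miniconda3/bin/torchrun"):
    """Build one launch.json entry that starts `script` through torchrun."""
    return {
        "name": name,
        "type": "debugpy",
        "request": "launch",
        "program": torchrun,
        "args": ["--nnodes", "1", "--nproc_per_node", str(nproc),
                 script, *script_args],
        "console": "integratedTerminal",
        "justMyCode": False,
        "env": {
            # Assumption: expose the first `nproc` GPUs, one per process.
            "CUDA_VISIBLE_DEVICES": ",".join(str(i) for i in range(nproc)),
            "HF_ENDPOINT": "https://hf-mirror.com",
            "WANDB_MODE": "disabled",
        },
    }


def build_launch(configs):
    """Wrap the entries in the top-level launch.json structure."""
    return {"version": "0.2.0", "configurations": list(configs)}


launch = build_launch([
    make_config("Train Debug", "scripts/train.py", [
        "--config", "configs/<model>/<your_config>.py",
        "--work-dir", "work_dirs/<your_work_dir>",
        "--cfg-options", "runner.max_steps=100",
    ]),
    make_config("Eval Debug", "scripts/eval.py", [
        "--config", "configs/<model>/<your_config>.py",
        "--ckpt-path",
        "work_dirs/<your_work_dir>/checkpoints/latest-checkpoint.pt",
    ]),
])

# Print the result; redirect it to .vscode/launch.json once it looks right.
print(json.dumps(launch, indent=4))
```

Passing `nproc=1` to `make_config` yields a single-process entry with only GPU 0 visible.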

3. Key parameters

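Parameter	Description
`type`	Must be set to `debugpy` so that the Python debugger is used
`program`	Absolute path to `torchrun`, usually in the `bin` directory of your conda environment
`--nproc_per_node`	Number of GPUs used per node; a small value (such as 1 or 2) is recommended when debugging
`justMyCode`	Set to `false` to step into third-party library code
`CUDA_VISIBLE_DEVICES`	Controls which GPUs are visible; limit it to a few GPUs when debugging
`WANDB_MODE`	Set to `disabled` to turn off wandb log uploads while debugging
`--cfg-options`	When debugging, reduce `batch_size`, `max_steps`, and similar settings to speed up iteration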
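
The absolute path for `program` depends on your environment. As a rough sketch, the snippet below looks it up from the active Python environment; `find_torchrun` is a hypothetical helper, not a FluxVLA utility, and it assumes you run it with the same interpreter that has PyTorch installed.

```python
import os
import shutil
import sys
from pathlib import Path


def find_torchrun():
    """Return an absolute path to torchrun, or None if it cannot be found."""
    found = shutil.which("torchrun")
    if found:
        return os.path.abspath(found)
    # Fall back to the directory that holds the running interpreter,
    # which is where conda and venv normally place console scripts.
    candidate = Path(sys.executable).parent / "torchrun"
    return str(candidate) if candidate.exists() else None


print(find_torchrun())
```

Copy the printed path into the `program` field. If it prints `None`, activate the environment that contains PyTorch and try again.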

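4. Usage

Open the FluxVLA project in VSCode
Press F5 or click the Run button in the debug panel on the left
Select the matching debug configuration from the dropdown menu
Set breakpoints in the code to start stepping through it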

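Tip: when debugging distributed training, a breakpoint is hit in every process. To debug the logic of a single process only, set --nproc_per_node to 1 and adjust CUDA_VISIBLE_DEVICES accordingly.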
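
If you need several processes but only want to inspect one of them, another option is to guard debug-only code by rank. torchrun exports `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` to every worker it starts. The sketch below reads `RANK`; `current_rank` and `is_main_process` are hypothetical helpers, and how such a guard interacts with your debugger setup has not been verified here.

```python
import os


def current_rank(default=0):
    """Return this process's global rank as exported by torchrun."""
    # RANK is unset when the script runs without torchrun, so fall back.
    return int(os.environ.get("RANK", default))


def is_main_process():
    """True only in the rank-0 worker."""
    return current_rank() == 0


if is_main_process():
    # Place debug-only logging or inspection here so it runs once,
    # not once per worker.
    print(f"debugging on rank {current_rank()}")
```

The same expression, `int(os.environ.get("RANK", 0)) == 0`, can also be tried as the condition of a VSCode conditional breakpoint.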

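Common Transformers installation issues

FluxVLA depends on the Hugging Face transformers library, but this dependency is not listed in requirements.txt and must be installed manually. Because different models require different transformers versions, version conflicts are common during installation.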

1. Recommended installation

Following the README, install transformers separately after installing FluxVLA:

pip install transformers==4.53.0

2. Version requirements by model

Different models expect different transformers versions, either in their code or in their configs:

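Model	Recommended version	Notes
OpenVLA / dinosiglip-qwen2_5	`transformers==4.40.1`	The code performs an explicit version check and also requires `tokenizers==0.19.1`
Pi0 / Pi0.5 / Gr00t / LlavaVLA, etc.	`transformers==4.53.0`	The README-recommended version works
Tron2 deployment	`transformers==4.53.2`	See the Tron2 inference deployment documentation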
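
To see whether the active environment matches the table, a small check like the following can help. It is only a sketch: the dictionary keys are illustrative labels invented for this example, the versions are copied from the table above, and `check_transformers` is a hypothetical helper rather than part of FluxVLA.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the table above; the keys are illustrative labels.
EXPECTED_TRANSFORMERS = {
    "openvla": "4.40.1",
    "pi0": "4.53.0",
    "tron2": "4.53.2",
}


def installed_version(package):
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_transformers(model_family, installed=None):
    """Return (matches, expected, installed) for the given model family."""
    expected = EXPECTED_TRANSFORMERS[model_family]
    if installed is None:
        installed = installed_version("transformers")
    return installed == expected, expected, installed


ok, expected, found = check_transformers("pi0")
print(f"transformers expected={expected} installed={found} ok={ok}")
```

The optional `installed` argument exists only so the comparison can be exercised without touching the environment.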

3. Common issues and fixes

Issue 1: version warning when using OpenVLA

Expected `transformers==4.40.1` and `tokenizers==0.19.1` but got ...
there might be inference-time regressions due to dependency changes.

This happens because the OpenVLA pretrained model was built with transformers==4.40.1. If you mainly use OpenVLA, downgrade:

pip install transformers==4.40.1 tokenizers==0.19.1

Note: after downgrading to 4.40.1, other models (such as Pi0 and Gr00t) may stop working. If you need to use several models, create a separate Conda environment for each.

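Issue 2: pip install transformers upgrades other dependencies

When installing transformers, pip may automatically upgrade packages such as numpy, tokenizers, and huggingface-hub, which can conflict with FluxVLA's other dependencies. The recommended installation order is: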

# 1. Install FluxVLA and its dependencies first
pip install -r requirements.txt
python setup.py develop

# 2. Then install transformers
pip install transformers==4.53.0

# 3. Finally, restore the numpy version
pip install numpy==1.26.4

Issue 3: ImportError or AttributeError

If you see errors like the following:

ImportError: cannot import name 'XXX' from 'transformers'
AttributeError: module 'transformers' has no attribute 'XXX'

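This is usually an API incompatibility caused by a transformers version that is too old or too new. Check the installed version and reinstall the target version: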

python -c "import transformers; print(transformers.__version__)"
pip install transformers==<target_version>

Issue 4: installing transformers from source

If the version installed by pip does not include the latest fixes, you can install from source:

pip install git+https://github.com/huggingface/transformers.git@v4.53.0

Errors when training on a mix of multiple datasets

If you see an error like the following:

ValueError: The features can't be aligned because the key observation.state of features {'observation.state': List(Value('float32')), 'observation.states.ee_state': List(Value('float32')), 'observation.states.joint_state': List(Value('float32')), 'observation.states.gripper_state': List(Value('float32')), 'action': List(Value('float32')), 'timestamp': Value('float32'), 'frame_index': Value('int64'), 'episode_index': Value('int64'), 'index': Value('int64'), 'task_index': Value('int64')} has unexpected type - List(Value('float32')) (expected either List(Value('float32'), length=8) or Value("null").

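This may be because your datasets were produced with different LeRobot versions. Confirm that all data is in LeRobotDataset v2.1 format. To keep the format consistent, we recommend pinning LeRobot to a specific commit: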

git clone https://github.com/huggingface/lerobot.git
cd lerobot
git checkout 55198de096f46a8e0447a8795129dd9ee84c088c
pip install -e .
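
Before mixing datasets, it can help to confirm which format version each one declares. The sketch below assumes each dataset root contains a `meta/info.json` file with a `codebase_version` field, which is the layout we expect for LeRobotDataset v2.x; verify this against your own data. `codebase_version` and `find_mismatched` are hypothetical helpers, not LeRobot or FluxVLA APIs.

```python
import json
import sys
from pathlib import Path


def codebase_version(dataset_root):
    """Read the declared format version, or None if meta/info.json is absent."""
    info_path = Path(dataset_root) / "meta" / "info.json"
    if not info_path.is_file():
        return None
    with info_path.open(encoding="utf-8") as f:
        return json.load(f).get("codebase_version")


def find_mismatched(dataset_roots, expected="v2.1"):
    """Map each dataset root whose version differs from `expected` to that version."""
    mismatched = {}
    for root in dataset_roots:
        found = codebase_version(root)
        if found != expected:
            mismatched[str(root)] = found
    return mismatched


if __name__ == "__main__":
    # Usage: python check_versions.py /path/to/dataset_a /path/to/dataset_b
    for root, found in find_mismatched(sys.argv[1:]).items():
        print(f"{root}: expected v2.1, found {found}")
```

An empty result only means the declared versions agree. Feature shapes can still differ between datasets, as in the `observation.state` length mismatch shown in the error above.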