> ## Documentation Index
> Fetch the complete documentation index at: https://freesolo.co/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Troubleshooting

> Common Flash errors, why they happen, and how to fix them.

Most `flash` errors print one clean line; add the global `--debug` flag before
the subcommand (e.g. `flash --debug train config.toml`) for the full traceback.

## Installation & CLI

<AccordionGroup>
  <Accordion title="flash: command not found">
    The CLI installs a single `flash` command, but its install location has to be
    on your `PATH`.

    * If you installed with `uv tool install freesolo-flash`, make sure uv's tool
      bin directory is on your `PATH` (run `uv tool update-shell`, then restart
      your shell).
    * Confirm the install with `flash version`.
  </Accordion>

  <Accordion title="I installed `flash` but it's the wrong tool">
    The CLI is published to PyPI as **`freesolo-flash`**, the bare `flash` name
    belongs to an unrelated project. Reinstall the right one:

    ```bash theme={null}
    uv tool install freesolo-flash
    ```
  </Accordion>
</AccordionGroup>

## Authentication

<AccordionGroup>
  <Accordion title="flash login fails or commands report 401 / invalid key">
    Every command authenticates with a **Freesolo API key**, verified against
    Freesolo at login.

    * Create a key in your dashboard at [freesolo.co](https://freesolo.co).
    * Log in once: `flash login --api-key <your-key>` (or set `FREESOLO_API_KEY`
      instead of passing `--api-key`).
    * Confirm who the stored key resolves to: `flash whoami`.
  </Accordion>

  <Accordion title="Pointing at a non-default Freesolo deployment">
    By default the CLI talks to `https://api.freesolo.co`. To target a different
    deployment, set `--freesolo-url` (or `FREESOLO_BASE_URL`) at login.
  </Accordion>
</AccordionGroup>

## Environments

<AccordionGroup>
  <Accordion title="flash env push fails">
    Make sure the folder you push contains an `environment.py` file with a
    `load_environment()` function that returns a Freesolo environment, then:

    ```bash theme={null}
    flash env push --name math math
    ```

    It prints the published id (`your-org/math`) to put in your config's
    `[environment] id`. If you pass `--name namespace/name`, the namespace must
    match your Freesolo org; otherwise pass a bare name such as `math`.
  </Accordion>

  <Accordion title="ModuleNotFoundError: No module named 'freesolo' (local authoring)">
    Managed training workers have the Freesolo SDK available when they import a
    published environment. Your local Python environment does not get that SDK
    automatically from the `flash` CLI. Install it locally when you run or test
    `environment.py` directly, or when you use a command such as
    `flash train --cost` that **loads** the environment to count examples.
    `--dry-run` validates only the config and does not import the environment:

    ```bash theme={null}
    uv pip install freesolo
    ```

    To pull a published env into your project for local work, use
    `flash env pull your-org/your-env`.
  </Accordion>

  <Accordion title="Worker import fails with ModuleNotFoundError">
    `flash train --dry-run` validates the config, but it does not prove every
    remote environment dependency is installed. If `flash log <run-id>` shows
    your `environment.py` failed while importing a task library, add that
    package to `[environment].pip`, publish the environment, and submit again:

    ```toml theme={null}
    [environment]
    id = "your-org/your-env"
    pip = ["math-verify>=0.8.0"]
    ```

    Only list packages your environment imports. Flash does not install worker
    dependencies from a `pyproject.toml`, `requirements.txt`, or lockfile bundled
    with the environment; keep managed-run dependencies in `[environment].pip`.
  </Accordion>

  <Accordion title="[environment].pip breaks training startup">
    `[environment].pip` is for task dependencies imported by your environment.
    Do not pin Flash's managed training stack there, such as `torch`, `trl`,
    `vllm`, `peft`, or `bitsandbytes`, unless your environment
    directly imports that package. Extra pins can conflict with the worker's
    tested training recipe. Remove the pin and resubmit.
  </Accordion>

  <Accordion title="flash env pull says the archive is too large">
    Pull the specific file you need instead of the whole environment:

    ```bash theme={null}
    flash env pull your-org/your-env environment.py -o environment.py
    flash env pull your-org/your-env dataset/eval.jsonl -o eval.jsonl
    ```

    Keep published environments focused on source, small sidecars, and datasets
    needed by the run. Do not publish virtualenvs, local caches, model weights,
    or generated artifacts.
  </Accordion>

  <Accordion title="I republished an environment and want to verify the data">
    Use `flash env pull` to inspect the exact packaged file:

    ```bash theme={null}
    flash env pull your-org/your-env dataset/train.jsonl -o train.jsonl
    ```

    For clean A/B experiments, publish changed datasets under a fresh env name
    so old runs, new runs, and local files are easy to tell apart.
  </Accordion>

  <Accordion title="Environment won't import / name collision">
    If your environment module shares a name with an installed Python package, it
    can shadow or be shadowed by that package. Keep helper module names distinct
    from installed packages.
  </Accordion>

  <Accordion title="What id do I reference in my config?">
    The `[environment] id` must be a published Freesolo environment id, produced
    by `flash env push`, for example `your-org/your-env`.
    A local file path is not a valid id, so publish it first or reference an
    existing published id. Use `flash env pull your-org/your-env` only when you
    want a local copy to edit or inspect.
  </Accordion>
</AccordionGroup>

## Configuration

<AccordionGroup>
  <Accordion title="unsupported model '...'">
    `model` must be one of the ids in the curated catalog. List the valid ids:

    ```bash theme={null}
    flash models
    ```

    Managed runs train catalog models only. See [Supported models](/reference/models).
  </Accordion>

  <Accordion title="Unknown config key or section rejected">
    Flash rejects unknown config sections and `[train]` keys **at parse time**.
    Check the key against the
    [configuration reference](/reference/configuration) and validate locally:

    ```bash theme={null}
    flash train config.toml --dry-run
    ```
  </Accordion>

  <Accordion title="unsupported algorithm">
    `algorithm` must be `sft` (the default) or `grpo`. Fix the value and
    re-validate with `--dry-run`.
  </Accordion>
</AccordionGroup>

## Run fit and resource use

<AccordionGroup>
  <Accordion title="GRPO costs or fits differently than SFT">
    Expected. GRPO samples multiple completions, scores them, and updates from
    that group of attempts. For the same model, it usually needs more room and
    costs more than SFT. If a GRPO run is too expensive or too large, use a
    smaller model, reduce `group_size`, reduce `max_tokens`, reduce `max_length`,
    or start with SFT.
  </Accordion>

  <Accordion title="Pre-flight says the run is too large">
    The usual causes are long context, long generated completions, a large GRPO
    `group_size`, or a larger base model than the task needs. Reduce
    `max_length`, `max_tokens`, or `group_size`, or switch to a smaller model.
    If you recently enabled `thinking = true`, remember that reasoning and the
    final answer share the same token budget.
  </Accordion>
</AccordionGroup>

## Training runs

<AccordionGroup>
  <Accordion title="When am I charged for a run?">
    Flash checks your prepaid org balance against the pre-flight estimate before
    submit, then bills successful runs at the quoted Flash cost. Cancelled runs
    are repriced to the training steps they reached, and setup time is reported
    separately without being billed. Preview the estimate before you submit:

    ```bash theme={null}
    flash train config.toml --cost
    ```
  </Accordion>

  <Accordion title="Run submit reports insufficient balance">
    Add funds or reduce the run estimate before submitting again. Smaller base
    models, fewer GRPO steps, lower `group_size`, shorter `max_tokens`, and SFT
    smoke tests are the fastest ways to lower the pre-flight estimate.
  </Accordion>

  <Accordion title="GRPO reward is stuck at 0">
    The task is too hard for the model at its current ability: if no rollout ever
    scores, there's nothing for GRPO to reinforce. Try a stronger/larger base
    model, make the task easier to get started, or double-check your reward
    actually returns a positive reward for good answers.
  </Accordion>

  <Accordion title="Reward is stuck at 1 (or never moves)">
    Usually the reward is not discriminative: it scores almost everything the
    same. Make the reward function separate better answers from worse ones so GRPO
    has a spread of scores to learn from. See [Environments](/guides/environments).
  </Accordion>

  <Accordion title="Reward collapses to ~0 with `thinking = true`">
    When `thinking = true`, the model emits a reasoning trace **before** its answer,
    and that trace counts against the **same `max_tokens` budget** as the answer. A
    `max_tokens` tuned for a non-thinking run is usually too small once reasoning is
    added: the reasoning eats the budget and the actual answer is truncated or never
    emitted. If your reward parses the answer (e.g. extracts a JSON object), it then
    sees nothing and scores \~0 across the board — even though the model is "working".

    Fixes:

    * **Raise `max_tokens`** so the reasoning *and* the answer both fit (e.g. a task
      that needs \~200 answer tokens may need `max_tokens = 2048` with thinking on),
      and make sure `max_length` is large enough to hold the prompt plus that budget.
    * Optionally set `thinking_length_penalty_coef` to nudge the model toward shorter
      reasoning so the answer reliably lands inside the budget.
    * Score the answer text by default. In thinking mode `response_text` remains
      string-compatible answer text and also exposes `response_text.completion`,
      `response_text.thinking`, and `response_text.raw` for rewards that intentionally
      inspect reasoning.

    The same trap applies to any **reasoning model you call as an LLM judge** from a
    reward: give the judge call enough `max_tokens` or it returns empty content and
    the judge silently scores 0.
  </Accordion>

  <Accordion title="Qwen3.5 thinking multi-turn SFT learns doubled thinking tags">
    In Qwen3.5 thinking mode, the chat template treats prior and next assistant
    turns differently: it strips literal `<think>` blocks from non-final assistant
    history, then pre-opens `<think>\n` in the next generation prompt. A naive
    multi-turn SFT transcript that puts `<think>...</think>` in every assistant
    turn can therefore train on a tag layout that inference will never render.
    The symptom is doubled, missing, or misplaced thinking tags, or an adapter that
    behaves differently in training-style evals than it does when served.

    Fixes:

    * For message-shaped multi-turn SFT targets, keep intermediate assistant turns
      as the actual code, tool, or action text only.
    * Put `<think>...</think>` plus the final answer only in the final assistant
      target.
    * Do not add a second opener for the template's pre-opened `<think>\n`. Flash's
      completion-only SFT masking uses the longest shared token prefix, so that
      pre-opened tag is treated as prompt text.
  </Accordion>

  <Accordion title="Did pressing Ctrl-C kill my run?">
    No. `Ctrl-C` during `flash train` just **detaches** you; the run keeps going
    on Freesolo. Re-follow it any time, and cancel explicitly if you mean to:

    ```bash theme={null}
    flash runs              # state and cost of your runs
    flash log <run-id> -f       # re-follow the logs
    flash status <run-id>       # status and cost JSON
    flash cancel <run-id>       # stop it
    ```
  </Accordion>

  <Accordion title="flash cancel takes a long time to return">
    Expected. Cancellation waits for the managed worker to stop and clean up
    before confirming, which can take several minutes. The CLI waits this
    teardown out; the run is marked `cancelled` when it completes.
  </Accordion>

  <Accordion title="A capacity or resource-fit issue interrupted training">
    Runs are supervised by Flash: a stall watchdog plus bounded auto-retry that
    **resumes from the last streamed checkpoint** when possible. If the run
    ultimately succeeds, the charge remains the quoted Flash cost. Watch logs
    with `flash log <run-id> -f` or poll status with `flash status <run-id> -f`.
    If the same shape repeatedly fails before useful metrics, reduce
    `max_length`, `max_tokens`, or `group_size`.
  </Accordion>
</AccordionGroup>

## Serving

<AccordionGroup>
  <Accordion title="Deploy fails: run has no run-level adapter">
    A run cancelled or preempted before finalizing has no final adapter, so a
    plain `flash deploy <run-id>` cannot serve it. The error lists the run's
    saved checkpoint steps and the exact command to use — deploy one of them
    instead:

    ```bash theme={null}
    flash checkpoints <run-id>
    flash deploy <run-id>/step-<N>
    ```
  </Accordion>

  <Accordion title="Deploy or the first request is slow">
    Deploy registers and warms the adapter on Freesolo's managed serving
    service. Large models can take a few minutes before the endpoint is ready.
    See [Deploy & chat](/guides/deploy-and-chat).
  </Accordion>

  <Accordion title="How is serving billed?">
    Serving is **billed per token** for requests. Prompt, completion, and cached
    prompt token rates are listed in [Supported models](/reference/models#serving-prices).
    `flash undeploy <run-id>` deregisters the adapter.
  </Accordion>

  <Accordion title="Calling the endpoint from my own code is rejected">
    Deployments are OpenAI-compatible. Use the endpoint from `flash deployments`
    as the `base_url` and the `<run-id>` as the `model`. The OpenAI SDK requires
    an `api_key`, and the serving endpoint requires it — pass your Freesolo API key
    (the same key `flash login` uses); serving authorizes every request against the
    org that owns the adapter, so a placeholder key is rejected (401/403). See
    [Use it from your own code](/guides/deploy-and-chat#use-it-from-your-own-code).
  </Accordion>
</AccordionGroup>

## Getting help

Still stuck? Add `--debug` to surface the full traceback, then reach out through
[freesolo.co/contact](https://freesolo.co/contact) with the run id (`flash runs`)
and the failing command.
