# Troubleshooting

> Common Flash errors, why they happen, and how to fix them.

Most `flash` errors print one clean line; add the global `--debug` flag before the subcommand (e.g. `flash --debug train config.toml`) for the full traceback.

## Installation & CLI

The CLI installs a single `flash` command, but its install location has to be on your `PATH`.

* If you installed with `uv tool install freesolo-flash`, make sure uv's tool bin directory is on your `PATH` (run `uv tool update-shell`, then restart your shell).
* Confirm the install with `flash version`.

The CLI is published to PyPI as **`freesolo-flash`**; the bare `flash` name belongs to an unrelated project. Reinstall the right one:

```bash
uv tool install freesolo-flash
```

## Authentication

Every command authenticates with a **Freesolo API key**, verified against Freesolo at login.

* Create a key in your dashboard at [freesolo.co](https://freesolo.co).
* Log in once: `flash login --api-key ` (or set `FREESOLO_API_KEY` instead of passing `--api-key`).
* Confirm who the stored key resolves to: `flash whoami`.

By default the CLI talks to `https://api.freesolo.co`. To target a different deployment, set `--freesolo-url` (or `FREESOLO_BASE_URL`) at login.

## Environments

Make sure the folder you push contains an `environment.py` file with a `load_environment()` function that returns a Freesolo environment, then:

```bash
flash env push --name math math
```

It prints the published id (`your-org/math`) to put in your config's `[environment] id`. If you pass `--name namespace/name`, the namespace must match your Freesolo org; otherwise pass a bare name such as `math`.

Managed training workers have the Freesolo SDK available when they import a published environment.
Your local Python environment does not get that SDK automatically from the `flash` CLI. `--dry-run` validates only the config and does not import the environment, so it works without the SDK. Install the SDK locally when you run or test `environment.py` directly, or when you use a command such as `flash train --cost` that **loads** the environment to count examples:

```bash
uv pip install freesolo
```

To pull a published env into your project for local work, use `flash env pull your-org/your-env`.

`flash train --dry-run` validates the config, but it does not prove every remote environment dependency is installed. If `flash log ` shows your `environment.py` failed while importing a task library, add that package to `[environment].pip`, publish the environment, and submit again:

```toml
[environment]
id = "your-org/your-env"
pip = ["math-verify>=0.8.0"]
```

Only list packages your environment imports. Flash does not install worker dependencies from a `pyproject.toml`, `requirements.txt`, or lockfile bundled with the environment; keep managed-run dependencies in `[environment].pip`.

`[environment].pip` is for task dependencies imported by your environment. Do not pin Flash's managed training stack there, such as `torch`, `trl`, `vllm`, `peft`, or `bitsandbytes`, unless your environment directly imports that package. Extra pins can conflict with the worker's tested training recipe. If a run fails on such a conflict, remove the pin and resubmit.

Pull the specific file you need instead of the whole environment:

```bash
flash env pull your-org/your-env environment.py -o environment.py
flash env pull your-org/your-env dataset/eval.jsonl -o eval.jsonl
```

Keep published environments focused on source, small sidecars, and datasets needed by the run. Do not publish virtualenvs, local caches, model weights, or generated artifacts.
Use `flash env pull` to inspect the exact packaged file:

```bash
flash env pull your-org/your-env dataset/train.jsonl -o train.jsonl
```

For clean A/B experiments, publish changed datasets under a fresh env name so old runs, new runs, and local files are easy to tell apart.

If your environment module shares a name with an installed Python package, it can shadow or be shadowed by that package. Keep helper module names distinct from installed packages.

The `[environment] id` must be a published Freesolo environment id, produced by `flash env push`, for example `your-org/your-env`. A local file path is not a valid id, so publish it first or reference an existing published id. Use `flash env pull your-org/your-env` only when you want a local copy to edit or inspect.

## Configuration

`model` must be one of the ids in the curated catalog. List the valid ids:

```bash
flash models
```

Managed runs train catalog models only. See [Supported models](/reference/models).

Flash rejects unknown config sections and unknown `[train]` keys **at parse time**. Check the key against the [configuration reference](/reference/configuration) and validate locally:

```bash
flash train config.toml --dry-run
```

`algorithm` must be `sft` (the default) or `grpo`. Fix the value and re-validate with `--dry-run`.

## Run fit and resource use

This is expected. GRPO samples multiple completions, scores them, and updates from that group of attempts. For the same model, it usually needs more room and costs more than SFT. If a GRPO run is too expensive or too large, use a smaller model, reduce `group_size`, reduce `max_tokens`, reduce `max_length`, or start with SFT.

The usual causes are long context, long generated completions, a large GRPO `group_size`, or a larger base model than the task needs. Reduce `max_length`, `max_tokens`, or `group_size`, or switch to a smaller model.
If you recently enabled `thinking = true`, remember that reasoning and the final answer share the same token budget.

## Training runs

Flash checks your prepaid org balance against the pre-flight estimate before submit, then bills successful runs at the quoted Flash cost. Cancelled runs are repriced to the training steps they reached, and setup time is reported separately without being billed. Preview the estimate before you submit:

```bash
flash train config.toml --cost
```

Add funds or reduce the run estimate before submitting again. Smaller base models, fewer GRPO steps, lower `group_size`, shorter `max_tokens`, and SFT smoke tests are the fastest ways to lower the pre-flight estimate.

The task is too hard for the model at its current ability: if no rollout ever scores, there's nothing for GRPO to reinforce. Try a stronger or larger base model, make the task easier to get started, or double-check that your reward function actually returns a positive score for good answers.

Usually the reward is not discriminative: it scores almost everything the same. Make the reward function separate better answers from worse ones so GRPO has a spread of scores to learn from. See [Environments](/guides/environments).

When `thinking = true`, the model emits a reasoning trace **before** its answer, and that trace counts against the **same `max_tokens` budget** as the answer. A `max_tokens` tuned for a non-thinking run is usually too small once reasoning is added: the reasoning eats the budget and the actual answer is truncated or never emitted. If your reward parses the answer (e.g. extracts a JSON object), it then sees nothing and scores \~0 across the board, even though the model is "working".

Fixes:

* **Raise `max_tokens`** so the reasoning *and* the answer both fit (e.g. a task that needs \~200 answer tokens may need `max_tokens = 2048` with thinking on), and make sure `max_length` is large enough to hold the prompt plus that budget.
* Optionally set `thinking_length_penalty_coef` to nudge the model toward shorter reasoning so the answer reliably lands inside the budget.
* Score the answer text by default. In thinking mode `response_text` remains string-compatible answer text and also exposes `response_text.completion`, `response_text.thinking`, and `response_text.raw` for rewards that intentionally inspect reasoning.

The same trap applies to any **reasoning model you call as an LLM judge** from a reward: give the judge call enough `max_tokens` or it returns empty content and the judge silently scores 0.

In Qwen3.5 thinking mode, the chat template treats prior and next assistant turns differently: it strips literal `` blocks from non-final assistant history, then pre-opens `\n` in the next generation prompt. A naive multi-turn SFT transcript that puts `...` in every assistant turn can therefore train on a tag layout that inference will never render. The symptom is doubled, missing, or misplaced thinking tags, or an adapter that behaves differently in training-style evals than it does when served.

Fixes:

* For message-shaped multi-turn SFT targets, keep intermediate assistant turns as the actual code, tool, or action text only.
* Put `...` plus the final answer only in the final assistant target.
* Do not add a second opener for the template's pre-opened `\n`. Flash's completion-only SFT masking uses the longest shared token prefix, so that pre-opened tag is treated as prompt text.

No. `Ctrl-C` during `flash train` just **detaches** you; the run keeps going on Freesolo. Re-follow it any time, and cancel explicitly if you mean to:

```bash
flash runs     # state and cost of your runs
flash log -f   # re-follow the logs
flash status   # status and cost JSON
flash cancel   # stop it
```

This is expected. Cancellation waits for the managed worker to stop and clean up before confirming, which can take several minutes. The CLI waits out this teardown; the run is marked `cancelled` when it completes.

Runs are supervised by Flash: a stall watchdog plus bounded auto-retry that **resumes from the last streamed checkpoint** when possible. If the run ultimately succeeds, the charge remains the quoted Flash cost. Watch logs with `flash log -f` or poll status with `flash status -f`. If the same shape repeatedly fails before producing useful metrics, reduce `max_length`, `max_tokens`, or `group_size`.

## Serving

A run cancelled or preempted before finalizing has no final adapter, so a plain `flash deploy ` cannot serve it. The error lists the run's saved checkpoint steps and the exact command to use. Deploy one of those checkpoints instead:

```bash
flash checkpoints
flash deploy /step-
```

Deploy registers and warms the adapter on Freesolo's managed serving service. Large models can take a few minutes before the endpoint is ready. See [Deploy & chat](/guides/deploy-and-chat).

Serving is **billed per token** for requests. Prompt, completion, and cached prompt token rates are listed in [Supported models](/reference/models#serving-prices). `flash undeploy ` deregisters the adapter.

Deployments are OpenAI-compatible. Use the endpoint from `flash deployments` as the `base_url` and the `` as the `model`. The OpenAI SDK requires an `api_key`, and so does the serving endpoint, so pass your Freesolo API key (the same key `flash login` uses). Serving authorizes every request against the org that owns the adapter, so a placeholder key is rejected (401/403). See [Use it from your own code](/guides/deploy-and-chat#use-it-from-your-own-code).

## Getting help

Still stuck? Add `--debug` to surface the full traceback, then reach out through [freesolo.co/contact](https://freesolo.co/contact) with the run id (`flash runs`) and the failing command.