This quickstart takes you from an empty directory to a deployed model: you’ll train a LoRA adapter on managed infrastructure, serve it, and chat with it from the CLI.

Prerequisites

  • Python 3.11 or 3.12, with uv, pipx, or pip to install the CLI.
  • A Freesolo API key, created in your dashboard at freesolo.co. It’s how every flash command authenticates.

Step 1: Install the CLI

uv tool install freesolo-flash

Step 2: Log in

flash login --api-key <your-freesolo-key>
Flash verifies the key against Freesolo and stores it locally, so you only do this once. You can also set FREESOLO_API_KEY instead of passing --api-key.
flash whoami    # confirm who your stored key resolves to

Step 3: Scaffold a project

Already have a published environment id, yours or one shared with you? Set it as [environment] id in your config and skip ahead to step 5.
flash env setup
This drops a ready-to-edit starter in the current directory:
environment.py            # a Freesolo environment (task + reward)
dataset/
  train.jsonl             # a tiny starter dataset (input/output rows)
configs/
  sft.toml                # an SFT training config to start from
  rl.toml                 # a GRPO (RL) training config to start from
TRAINING.md               # agent playbook with common run issues and mitigations
Rerunning is safe: flash env setup leaves any file that already exists untouched.
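If you would rather generate dataset rows from a script than edit train.jsonl by hand, a minimal sketch follows. It assumes each row is a JSON object with "input" and "output" keys, inferred from the "input/output rows" description above; check the scaffolded train.jsonl for the field names the starter actually uses. The helper names are illustrative and not part of Flash.

```python
import json
from pathlib import Path


def write_jsonl(rows, path):
    """Write one JSON object per line and return the number of rows written."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
    return len(rows)


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Assumed row shape: {"input": ..., "output": ...}. Verify against the
# scaffolded dataset/train.jsonl before relying on these key names.
EXAMPLE_ROWS = [
    {"input": "What is 2 + 2?", "output": "4"},
    {"input": "Name the capital of France.", "output": "Paris"},
]
```

Calling write_jsonl(EXAMPLE_ROWS, "dataset/train.jsonl") overwrites the starter file, so write to a different path first if you want to keep the original for comparison.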
Skip the hand-editing. Point your coding agent (Claude Code, Cursor, etc.) at the environment guide and have it find and port your existing reward and dataset into environment.py. A prompt to start from:
I already have a training loop with a reward function and a dataset, and I
want to run it on Freesolo Flash. Find my reward function and dataset in this
project, then read https://freesolo.co/docs/guides/environments and
create an environment.py whose load_environment() returns an
EnvironmentSingleTurn (use EnvironmentMultiTurn for multi-step tasks). Port my
reward into score_response and wire my dataset in as the input/output records
load_environment() returns. Then fill in the matching training config in
configs/ (sft.toml for SFT, rl.toml for GRPO) with the base model and
[train] settings.
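For a sense of what "port my reward into score_response" can mean, here is a minimal sketch of an exact-match reward written as a plain Python function. The signature is hypothetical: the arguments Flash actually passes to score_response, and how the function attaches to EnvironmentSingleTurn, are defined in the environment guide and are not reproduced here.

```python
def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences don't count."""
    return " ".join(text.lower().split())


def score_response(response: str, expected: str) -> float:
    """Return 1.0 for a normalized exact match, otherwise 0.0.

    Hypothetical signature for illustration only; the real one comes from
    the environment guide.
    """
    return 1.0 if normalize(response) == normalize(expected) else 0.0
```

An exact-match reward like this suits tasks with a single correct short answer. Tasks with graded or partial credit would need a different scoring rule.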

Step 4: Publish your environment

An environment is the task and reward your model trains on. Publish the scaffolded one to the managed Environments Hub to get an id:
flash env push --name starter .
This prints an environment id of the form your-org/starter.

Step 5: Configure and validate your run

Open configs/sft.toml and set the one thing that’s yours, the environment id from the previous step:
configs/sft.toml
model = "Qwen/Qwen3.5-4B"
algorithm = "sft"

[environment]
id = "your-org/starter"

[train]
epochs = 1
max_examples = 2
lora_rank = 32
Validate it first. --dry-run parses and checks the config locally:
flash train configs/sft.toml --dry-run
See what the run will cost before committing. --cost prints the pre-flight USD cost and exits without submitting:
flash train configs/sft.toml --cost

Step 6: Train

flash train configs/sft.toml
This is the first step that can spend money. At submit time Flash checks your org balance against the pre-flight estimate, then bills successful runs at the quoted Flash cost. Setup and cold start time are reported separately for observability and are not billed.
Flash then follows the logs live. Press Ctrl-C to detach. The run keeps going on the server, and you can follow it again any time:
flash runs                # list your runs and their state/cost
flash status <run-id>     # status JSON, including the cost record
flash log <run-id> -f     # stream logs until the run finishes
The run reaches done when training finishes. Start small: finish one short run end to end before you scale up. When you do, raise train.steps and change little else.
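To wait for a run from a script instead of watching logs, you can poll flash status. The sketch below assumes the command prints only JSON to stdout and that the JSON has a top-level "state" field whose value becomes "done"; both are guesses based on the descriptions above, so inspect the real output first. It does not recognise failure states, so a failed run simply times out.

```python
import json
import subprocess
import time


def fetch_status(run_id: str) -> dict:
    """Run `flash status <run-id>` and parse its output as JSON."""
    result = subprocess.run(
        ["flash", "status", run_id],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def wait_until_done(run_id, fetch=fetch_status, interval=30.0,
                    max_polls=120, sleep=time.sleep):
    """Poll until the run reports state "done" and return the final status dict.

    Assumption: the status JSON exposes a top-level "state" key. Raises
    TimeoutError if "done" is not seen within max_polls attempts.
    """
    for _ in range(max_polls):
        status = fetch(run_id)
        if status.get("state") == "done":
            return status
        sleep(interval)
    raise TimeoutError(f"run {run_id} not done after {max_polls} polls")
```

The fetch and sleep parameters are injectable so the loop can be tested without the CLI installed. In normal use, wait_until_done("<run-id>") is enough.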

Step 7: Deploy

Serve the trained adapter on Freesolo’s managed serving service, which bills per token for the requests you send:
flash deploy <run-id>

Step 8: Chat

flash chat <run-id> -m "Hello! What can you do?"
That’s a full loop. When you’re done, tear the endpoint down:
flash undeploy <run-id>

Essential commands

The commands you used above, plus the ones you’ll reach for next. Run flash <command> --help for the full set of flags, or see the CLI reference.
Command             What it does                                Example
flash login         Store your Freesolo API key locally         flash login --api-key <key>
flash env setup     Scaffold a starter environment and configs  flash env setup
flash env push      Publish an environment and print its id     flash env push --name starter .
flash train         Submit a run and follow its logs            flash train configs/sft.toml
flash train --cost  Preview the pre-flight cost, no submit      flash train configs/sft.toml --cost
flash runs          List your runs with state, cost, and model  flash runs
flash status        Show run status and cost                    flash status <run-id>
flash log           Print or follow worker logs                 flash log <run-id> -f
flash deploy        Serve a trained adapter                     flash deploy <run-id>
flash chat          Send a message to a deployment              flash chat <run-id> -m "hi"
flash undeploy      Tear a deployment down                      flash undeploy <run-id>

Next steps

  • How Flash works: the loop behind a run, and the concepts each command refers to.
  • Training in depth: SFT vs GRPO, config options, monitoring, and cost.
  • Build an environment: replace the starter task with your own data and reward.
  • Deploy & chat: serving billing and the OpenAI-compatible API.