Quickstart - Freesolo Docs

This quickstart takes you from an empty directory to a deployed model: you’ll train a LoRA adapter on managed infrastructure, serve it, and chat with it from the CLI.

Prerequisites

Python 3.11 or 3.12, with uv, pipx, or pip to install the CLI.
A Freesolo API key, created in your dashboard at freesolo.co. It’s how every flash command authenticates.

Step 1: Install the CLI

uv tool install freesolo-flash

Step 2: Log in

flash login --api-key <your-freesolo-key>

Flash verifies the key against Freesolo and stores it locally, so you only do this once. You can also set FREESOLO_API_KEY instead of passing --api-key.

flash whoami    # confirm who your stored key resolves to

Step 3: Scaffold a project

Already have a published environment id, yours or one shared with you? Set it as [environment] id in your config and skip ahead to step 5.

flash env setup

This drops a ready-to-edit starter in the current directory:

environment.py            # a Freesolo environment (task + reward)
dataset/
  train.jsonl             # a tiny starter dataset (input/output rows)
configs/
  sft.toml                # an SFT training config to start from
  rl.toml                 # a GRPO (RL) training config to start from
TRAINING.md               # agent playbook with common run issues and mitigations

Rerunning is safe: flash env setup leaves any file that already exists untouched.

Already have a training loop?

Skip the hand-editing. Point your coding agent (Claude Code, Cursor, etc.) at the environment guide and have it find and port your existing reward and dataset into environment.py. A prompt to start from:

I already have a training loop with a reward function and a dataset, and I
want to run it on Freesolo Flash. Find my reward function and dataset in this
project, then read https://freesolo.co/docs/guides/environments and
create an environment.py whose load_environment() returns an
EnvironmentSingleTurn (use EnvironmentMultiTurn for multi-step tasks). Port my
reward into score_response and wire my dataset in as the input/output records
load_environment() returns. Then fill in the matching training config in
configs/ (sft.toml for SFT, rl.toml for GRPO) with the base model and
[train] settings.

Step 4: Publish your environment

An environment is the task and reward your model trains on. Publish the scaffolded one to the managed Environments Hub to get an id:

flash env push --name starter .

This prints an environment id of the form your-org/starter.

Step 5: Configure and validate your run

Open configs/sft.toml and set the one thing that’s yours, the environment id from the previous step:

configs/sft.toml

model = "Qwen/Qwen3.5-4B"
algorithm = "sft"

[environment]
id = "your-org/starter"

[train]
epochs = 1
max_examples = 2
lora_rank = 32

Validate it locally first. --dry-run parses and checks the config locally:

flash train configs/sft.toml --dry-run

See what the run will cost before committing. --cost prints the pre-flight USD cost and exits without submitting:

flash train configs/sft.toml --cost

Step 6: Train

flash train configs/sft.toml

This is the first step that can spend money. At submit time Flash checks your org balance against the pre-flight estimate, then bills successful runs at the quoted Flash cost. Setup and cold start time are reported separately for observability and are not billed.

Flash then follows the logs live. Press Ctrl-C to detach. The run keeps going on the server, and you can follow it again any time:

flash runs                # list your runs and their state/cost
flash status <run-id>     # status JSON, including the cost record
flash log <run-id> -f     # stream logs until the run finishes

The run reaches done when training finishes. Start small: finish one short run end to end before you scale up. When you do, raise train.steps and change little else.

Step 7: Deploy

Serve the trained adapter on Freesolo’s managed serving service. Serving is billed per token for requests you send:

flash deploy <run-id>

Step 8: Chat

flash chat <run-id> -m "Hello! What can you do?"

That’s a full loop. When you’re done, tear the endpoint down:

flash undeploy <run-id>

Essential commands

The commands you used above, plus the ones you’ll reach for next. Run flash <command> --help for the full set of flags, or see the CLI reference.

Command	What it does	Example
`flash login`	Store your Freesolo API key locally	`flash login --api-key <key>`
`flash env setup`	Scaffold a starter environment and configs	`flash env setup`
`flash env push`	Publish an environment and print its id	`flash env push --name starter .`
`flash train`	Submit a run and follow its logs	`flash train configs/sft.toml`
`flash train --cost`	Preview the pre-flight cost, no submit	`flash train configs/sft.toml --cost`
`flash runs`	List your runs with state, cost, and model	`flash runs`
`flash status`	Show run status and cost	`flash status <run-id>`
`flash log`	Print or follow worker logs	`flash log <run-id> -f`
`flash deploy`	Serve a trained adapter	`flash deploy <run-id>`
`flash chat`	Send a message to a deployment	`flash chat <run-id> -m "hi"`
`flash undeploy`	Tear a deployment down	`flash undeploy <run-id>`

Next steps

How Flash works

The loop behind a run, and the concepts each command refers to.

Training in depth

SFT vs GRPO, config options, monitoring, and cost.

Build an environment

Replace the starter task with your own data and reward.

Deploy & chat

Serving billing and the OpenAI-compatible API.

​Prerequisites

​Step 1: Install the CLI

​Step 2: Log in

​Step 3: Scaffold a project

​Step 4: Publish your environment

​Step 5: Configure and validate your run

​Step 6: Train

​Step 7: Deploy

​Step 8: Chat

​Essential commands

​Next steps

How Flash works

Training in depth

Build an environment

Deploy & chat

Prerequisites

Step 1: Install the CLI

Step 2: Log in

Step 3: Scaffold a project

Step 4: Publish your environment

Step 5: Configure and validate your run

Step 6: Train

Step 7: Deploy

Step 8: Chat

Essential commands

Next steps