Skip to main content
Flash is Freesolo’s managed post-training service. You write a short config, run one command, and Flash fine-tunes a model on managed infrastructure, then serves the result behind an OpenAI-compatible endpoint. There’s nothing to host, and every command that talks to the platform is authenticated with your Freesolo API key.

Get started in minutes

Install the CLI and go from an empty directory to a deployed model in a few minutes.

What you can do

Write a TOML config, run one command, and Flash trains a LoRA adapter (a small set of add-on weights) on top of a supported base model. You pick the model and the task; Flash handles the training infrastructure. See Training.
flash train config.toml
Supervised fine-tuning learns from the prompt/answer pairs in your dataset. Use it when you already have examples of the output you want.
algorithm = "sft"
Reinforcement learning scores the model’s own completions with your environment’s reward and updates toward higher reward (see How Flash works). Use it when you can score an output but can’t hand-write the perfect one.
algorithm = "grpo"
flash deploy registers the adapter with managed serving, then flash chat or any OpenAI client can call it with your Freesolo key. See Deploy & chat.
flash deploy <run-id>
--cost prints the pre-flight estimate without submitting. You pay for the quoted training run cost and for the tokens you serve. Cancelled runs are repriced to the training steps they actually reached.
flash train config.toml --cost

Where Flash fits

I want to…Start here
Fine-tune from examples I already haveTraining: SFT
Improve outputs I can score but can’t hand-writeTraining: GRPO and Environments
Define the task and reward my model learns onEnvironments
Call my model from my existing OpenAI codeDeploy & chat
Understand how a run actually worksHow Flash works

Next steps

Quickstart

Install the CLI, log in, and ship your first run in a few minutes.

How Flash works

The loop behind a run: base models, environments, algorithms, serving.

Training

Write a config, submit a run, and follow it to completion.

Deploy & chat

Serve an adapter, then chat with it over an OpenAI-compatible API.