Get started in minutes
Install the CLI and go from an empty directory to a deployed model in a few
minutes.
What you can do
Fine-tune a model on your own task
Write a TOML config, run one command, and Flash trains a LoRA adapter (a
small set of add-on weights) on top of a supported base model. You pick the
model and the task; Flash handles the training infrastructure. See
Training.
Teach a format from examples with SFT
Supervised fine-tuning learns from the prompt/answer pairs in your dataset.
Use it when you already have examples of the output you want.
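As a rough illustration of what "prompt/answer pairs" can look like on disk, here is a minimal sketch that writes a few examples as JSON Lines. The field names `prompt` and `answer` and the helper `to_jsonl` are assumptions made for this example, not Flash's documented dataset schema; check Training for the format Flash actually expects.

```python
import json

# Hypothetical examples. The field names "prompt" and "answer" are assumed
# for illustration and are not taken from Flash's dataset schema.
examples = [
    {"prompt": "Extract the city: 'I flew to Paris last week.'",
     "answer": '{"city": "Paris"}'},
    {"prompt": "Extract the city: 'She moved to Osaka in 2019.'",
     "answer": '{"city": "Osaka"}'},
]


def to_jsonl(rows):
    """Serialize one JSON object per line, rejecting rows with empty fields."""
    lines = []
    for i, row in enumerate(rows):
        for key in ("prompt", "answer"):
            if not row.get(key, "").strip():
                raise ValueError(f"row {i} is missing a non-empty {key!r}")
        record = {"prompt": row["prompt"], "answer": row["answer"]}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


jsonl_text = to_jsonl(examples)
print(jsonl_text, end="")
```

Every answer here follows the same output format, which is the point of SFT: the model learns the format from consistent examples rather than from a description of it.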
Reward the behavior you want with GRPO
Reinforcement learning scores the model’s own completions with your
environment’s reward and updates the model toward higher-scoring
completions (see How Flash works). Use it when you can score
an output but can’t hand-write the perfect one.
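To make "score the model's own completions" concrete, the sketch below shows the group-relative step that gives GRPO its name, as the algorithm is commonly described: several completions for the same prompt are scored, and each reward is normalized against the group's mean and standard deviation. The `reward` function and the sample completions are hypothetical, and Flash's actual update may differ in its details.

```python
import json
from statistics import mean, pstdev


def reward(completion: str) -> float:
    """Hypothetical reward: 1.0 if the completion is a JSON object with a 'city' key."""
    try:
        parsed = json.loads(completion)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return 0.0
    return 1.0 if isinstance(parsed, dict) and "city" in parsed else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Assumed sample: four completions for the same prompt.
completions = ['{"city": "Paris"}', "Paris", '{"city": "Paris"}', '{"town": "Paris"}']
rewards = [reward(c) for c in completions]
advantages = group_relative_advantages(rewards)

for c, r, a in zip(completions, rewards, advantages):
    print(f"{c!r}: reward={r}, advantage={a:+.3f}")
```

Completions that score above their group's average get a positive advantage and are reinforced; those below get a negative one. No hand-written reference answer is needed, only a way to score outputs.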
Serve it behind an OpenAI-compatible API
flash deploy registers the adapter with managed serving, then
flash chat or any OpenAI client can call it with your Freesolo key. See
Deploy & chat.
See the cost before you spend
--cost prints the pre-flight estimate without submitting. You pay for the
quoted training run cost and for the tokens you serve. Cancelled runs are
repriced to the training steps they actually reached.
Where Flash fits
| I want to… | Start here |
|---|---|
| Fine-tune from examples I already have | Training: SFT |
| Improve outputs I can score but can’t hand-write | Training: GRPO and Environments |
| Define the task and reward my model learns on | Environments |
| Call my model from my existing OpenAI code | Deploy & chat |
| Understand how a run actually works | How Flash works |
Next steps
Quickstart
Install the CLI, log in, and ship your first run in a few minutes.
How Flash works
The loop behind a run: base models, environments, algorithms, serving.
Training
Write a config, submit a run, and follow it to completion.
Deploy & chat
Serve an adapter, then chat with it over an OpenAI-compatible API.