# Explore the directory

> Every file `flash env setup` creates, what it's for, and when Flash reads it.

`flash env setup` scaffolds a starter project into the **current directory**. A
run is fully described by what lands on disk: an **environment** (your task and
how it's scored) and a [**config**](/reference/configuration) (how to train on it). Every file is plain text
you can read, diff, and version-control. Rerunning is safe: any file that already
exists is left untouched.

```
./
├── environment.py        # the task + reward (a Freesolo environment)
├── dataset/
│   └── train.jsonl       # a tiny starter dataset (input/output rows)
├── configs/
│   ├── sft.toml          # an SFT (supervised) training config
│   └── rl.toml           # a GRPO (RL) training config
└── TRAINING.md           # a playbook for the AI agent driving your runs
```
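The "left untouched" behavior on a rerun is easiest to picture as exclusive file creation. The following is a minimal sketch of that idea, not Flash's actual implementation: `scaffold` and `STARTER_FILES` are hypothetical names, and the file contents are placeholders.

```python
from pathlib import Path

# Hypothetical stand-in for the starter project; contents are placeholders.
STARTER_FILES = {
    "environment.py": "# task + reward\n",
    "dataset/train.jsonl": '{"input":"What is 2 + 2?","output":"4"}\n',
}


def scaffold(root: Path, files: dict[str, str] = STARTER_FILES) -> list[str]:
    """Create any missing files under root and return the relative paths written."""
    written = []
    for rel, content in files.items():
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        try:
            # Mode "x" raises FileExistsError instead of overwriting,
            # so edits you already made survive a rerun.
            with open(target, "x", encoding="utf-8") as fh:
                fh.write(content)
        except FileExistsError:
            continue
        written.append(rel)
    return written
```

Calling `scaffold(Path("."))` twice writes files only the first time; the second call returns an empty list and changes nothing on disk.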

<AccordionGroup>
  <Accordion title="environment.py" icon="python">
    **What it is:** your [environment](/environment-model), the single source of
    truth for what the model practices on and how it's graded. It defines
    `load_environment()`, which returns a Freesolo `EnvironmentSingleTurn` (or
    `EnvironmentMultiTurn`) carrying a dataset and a `score_response` reward. This
    is the file you edit first.

    **When Flash reads it:** `flash env push` packages and uploads it;
    `flash train --cost` may import it locally to count training examples; and on
    every run the worker imports it and calls `load_environment(**params)`. The scaffolded
    starter loads its rows from `dataset/train.jsonl`.

    See the full scaffolded file, including `StarterEnv` with
    `build_prompt_messages` and `score_response`, in
    [Environments](/guides/environments#scaffold-one).
  </Accordion>

  <Accordion title="dataset/train.jsonl" icon="table">
    **What it is:** a tiny starter dataset of `input`/`output` rows that the
    scaffolded `environment.py` loads. Replace it with your real training rows
    before a real run. See [Datasets](/guides/datasets).

    **When Flash reads it:** your `environment.py` reads it on the worker at run
    time, and locally when `flash train --cost` counts its rows. `--dry-run`
    validates only the config and does not read it. `flash env push` uploads the
    `dataset/` folder with the environment.

    ```jsonl dataset/train.jsonl
    {"input":"What is 2 + 2?","output":"4"}
    {"input":"What is 3 + 5?","output":"8"}
    ```
  </Accordion>

  <Accordion title="configs/sft.toml" icon="gear">
    **What it is:** an [SFT training config](/guides/training#choose-sft-or-grpo)
    for supervised fine-tuning on the `input`/`output` pairs in your environment's
    dataset. You set `model`, the `[environment] id`, and the `[train]` knobs
    (`epochs`, `max_examples`, `lora_rank`); the training infrastructure and
    artifact storage are [managed for you](/reference/configuration). Copy the
    file for each experiment.

    **When Flash reads it:** every `flash train` invocation, including
    `--dry-run` and `--cost`, parses this file into the resolved job spec.

    ```toml configs/sft.toml
    model = "Qwen/Qwen3.5-4B"
    algorithm = "sft"

    [environment]
    id = ""   # paste the id returned by `flash env push --name my-env .`

    [train]
    epochs = 3
    max_examples = 1000
    lora_rank = 32
    ```
  </Accordion>

  <Accordion title="configs/rl.toml" icon="copy">
    **What it is:** the same config shape with `algorithm = "grpo"`, for GRPO
    (RL) training that optimizes against your environment's `score_response`
    reward. Keep both and pick one at train time.

    **When Flash reads it:** same as `sft.toml`; pass it to `flash train` to run
    GRPO instead of SFT.

    ```bash
    flash train configs/rl.toml
    ```
  </Accordion>

  <Accordion title="TRAINING.md" icon="book">
    **What it is:** a playbook for the AI coding agent you point at this project.
    It covers how to design the reward, what to read in a run's output, and how
    to decide whether a run actually improved the model. It also includes current
    CLI usage and mitigations for common Flash issues.

    **When it travels:** `flash env push` includes `.md` sidecars, so if you
    publish the whole scaffolded folder, `TRAINING.md` can travel with the
    environment source in the Hub, where both humans and coding agents can read
    it.
  </Accordion>
</AccordionGroup>

## Packaging an environment as a folder

The scaffolded `environment.py` is enough to publish on its own. Once your task
grows data files or helper modules, move it into a folder with
`environment.py` at the root and publish the whole folder.

```
math/
├── environment.py        # defines load_environment()
├── helpers.py            # optional sibling modules, imported by environment.py
└── dataset/
    ├── train.jsonl       # input/output records
    └── eval.jsonl
```
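
Since a published folder has to carry every helper module that `environment.py` imports, it can help to list imports that resolve neither to a sibling file nor to an installed package before you push. The sketch below is one way to do that with the standard library. `unresolved_imports` is a hypothetical helper, not a Flash command, and it checks against your local interpreter, which may not have the same packages installed as the worker.

```python
import ast
import importlib.util
from pathlib import Path


def unresolved_imports(folder: Path, entry: str = "environment.py") -> list[str]:
    """List top-level modules imported by `entry` that are neither siblings
    in `folder` nor importable in the current interpreter."""
    tree = ast.parse((folder / entry).read_text(encoding="utf-8"))
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as `from . import x`
            names.add(node.module.split(".")[0])

    missing = []
    for name in sorted(names):
        if (folder / f"{name}.py").is_file() or (folder / name).is_dir():
            continue  # sibling module or package inside the folder
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing
```

For the `math/` layout above, `unresolved_imports(Path("math"))` would return an empty list as long as `environment.py` imports only `helpers` and packages available locally.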

<AccordionGroup>
  <Accordion title="helpers.py" icon="python">
    **What it is:** any sibling Python modules your `environment.py` imports.
    Import only from installed packages or from files inside the folder you
    publish.

    **When Flash reads it:** `flash env push` uploads it with the folder, and the
    worker can import it at run time.
  </Accordion>

  <Accordion title="dataset/" icon="table">
    **What it is:** sidecar data files (`.jsonl`, `.json`, `.csv`, and more)
    that your environment loads. See
    [Datasets](/guides/datasets#what-gets-uploaded) for the full list.

    **When Flash reads it:** your environment code reads it (for example via
    `load_task_examples(...)`); `flash env push` includes the `dataset/` folder
    in the uploaded artifact.

    ```jsonl dataset/train.jsonl
    {"input":"What is 2 + 2?","output":"4"}
    {"input":"What is 3 + 5?","output":"8"}
    ```
  </Accordion>
</AccordionGroup>
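
Before publishing, a quick local pass over each JSONL file can catch malformed rows early. The sketch below assumes the `input`/`output` row shape of the starter dataset; `check_jsonl` is a hypothetical helper, not part of Flash, and the `required` keys should be adjusted if your environment expects different fields.

```python
import json
from pathlib import Path


def check_jsonl(path: Path, required=("input", "output")) -> list[str]:
    """Return human-readable problems found in a JSONL file; empty means none found."""
    problems = []
    with open(path, encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                row = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append(f"line {lineno}: invalid JSON ({exc.msg})")
                continue
            if not isinstance(row, dict):
                problems.append(f"line {lineno}: expected a JSON object")
                continue
            missing = [key for key in required if key not in row]
            if missing:
                problems.append(f"line {lineno}: missing {', '.join(missing)}")
    return problems
```

Running `check_jsonl(Path("dataset/train.jsonl"))` on the starter rows returns an empty list; any returned string names the line to fix.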
