Quick Start

This guide walks you through submitting a Clustering pipeline job — from preparing your cohort files to downloading results.

What you’ll need

Access to the MyImmune web app: https://app.svc.myimmune.ai/dashboard
A cohort TSV listing your subjects and their AIRR file paths
Per-subject AIRR repertoire files, zipped into a single archive

Prepare your input files

1. Cohort TSV

Create a tab-separated file with one row per subject:

subject_id	airr_file_path
S001	S001_airr.tsv
S002	S002_airr.tsv
S003	S003_airr.tsv

Save it as cohort.tsv.
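
If you would rather generate the cohort file from a script than by hand, the following is a minimal sketch using only the Python standard library. The helper name `write_cohort` is illustrative and not part of MyImmune; the column names are the ones shown above, and the example file names are assumed to match the per-subject TSV files you will add to the archive in the next step.

```python
import csv
from pathlib import Path


def write_cohort(rows, out_path="cohort.tsv"):
    """Write (subject_id, airr_file_path) pairs as a tab-separated cohort file.

    `write_cohort` is a hypothetical helper, not a MyImmune API.
    """
    out_path = Path(out_path)
    with out_path.open("w", newline="", encoding="utf-8") as handle:
        # Real tab characters as delimiters, one row per subject.
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["subject_id", "airr_file_path"])
        writer.writerows(rows)
    return out_path
```

For example, `write_cohort([(s, f"{s}_airr.tsv") for s in ["S001", "S002", "S003"]])` writes a three-subject cohort.tsv to the current directory.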

2. Prepare the zip file

Zip all per-subject AIRR TSV files along with the cohort file into a single archive:

zip airr_files.zip S001_airr.tsv S002_airr.tsv S003_airr.tsv cohort.tsv
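
Before uploading, it can help to confirm that the archive contains everything the cohort file refers to. The sketch below assumes a flat archive in which each airr_file_path value is exactly the entry name inside the zip; that layout is an assumption for illustration, and `files_missing_from_archive` is a hypothetical helper, not a MyImmune tool.

```python
import csv
import zipfile


def files_missing_from_archive(zip_path, cohort_name="cohort.tsv"):
    """List files the cohort refers to that are not present in the zip.

    Assumes each airr_file_path value equals an entry name in the archive.
    """
    with zipfile.ZipFile(zip_path) as archive:
        names = set(archive.namelist())
        if cohort_name not in names:
            # Without the cohort file nothing else can be checked.
            return [cohort_name]
        with archive.open(cohort_name) as raw:
            text = raw.read().decode("utf-8")
    reader = csv.DictReader(text.splitlines(), delimiter="\t")
    return [
        row["airr_file_path"]
        for row in reader
        if row["airr_file_path"] not in names
    ]
```

An empty list from `files_missing_from_archive("airr_files.zip")` means every file listed in the cohort was found in the archive.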

Each AIRR TSV must include at least these columns: sequence_id, junction_aa, v_call, j_call. See Data Formats for the full spec.
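
A quick local check of the header row can catch a missing column before you upload. This is a minimal sketch, assuming the files are UTF-8 and tab-separated with the column names in the first row; `missing_columns` is a hypothetical helper name, and the check covers only the four columns listed above, not the full spec.

```python
import csv

# The minimum columns named in this guide.
REQUIRED_COLUMNS = {"sequence_id", "junction_aa", "v_call", "j_call"}


def missing_columns(tsv_path):
    """Return the required columns absent from the file's header row."""
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        # An empty file yields an empty header, so all columns are reported.
        header = next(csv.reader(handle, delimiter="\t"), [])
    return sorted(REQUIRED_COLUMNS - set(header))
```

Calling `missing_columns("S001_airr.tsv")` returns an empty list when all four columns are present, and the names of any that are absent otherwise.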

Submit the job

Log in

Open the MyImmune web app and sign in.

Open the Clustering Pipeline

From the Dashboard, click Clustering Pipeline → Create New Job.

Fill in the form

Field	Value
Job name	e.g., `my-first-clustering-run`
Cohort TSV	name of the cohort file: `cohort.tsv`
Sequence identity (sid)	`0.85` (default — good starting point)
Coverage (cov)	`0.85` (default)
Generate HTML report	✓ enabled
Advanced Config	keep the default values
AIRR files zip	Upload `airr_files.zip`

Leave everything else at defaults for your first run.

Submit

Click Submit. The job appears in the job list with status queued.

Monitor progress

While the job is running, its status shows running. When it completes successfully, the status changes to success.

Download results

Once the status shows success, use the job's action menu to download the results.

What you get

File	What it tells you
`cluster_list.tsv`	Every cluster — which sequences and subjects belong to it
`subject_cluster_map.tsv`	Per-subject cluster membership for downstream analysis
`hierarchical_tree.nwk`	Newick tree — load in iTOL or FigTree for visualization
`cluster_report.html`	Interactive dendrogram + heatmap — open in any browser
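
This guide does not list the column layout of the TSV outputs, so a reasonable first step before downstream analysis is to look at the header and row count of each file. The sketch below makes no assumption about column names; `summarize_tsv` is a hypothetical helper.

```python
import csv


def summarize_tsv(path):
    """Return (column_names, data_row_count) for a tab-separated results file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])
        # Count the remaining rows without loading the file into memory.
        row_count = sum(1 for _ in reader)
    return header, row_count
```

For instance, `summarize_tsv("cluster_list.tsv")` returns the column names and the number of data rows in the downloaded file.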

What happened behind the scenes

Each subject’s sequences were compared against one another at the configured sid/cov thresholds to build subgraphs. Those subgraphs were then clustered hierarchically using a distance metric.
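
To build intuition for the first step, here is a toy sketch of how two thresholds can turn pairwise comparisons into subgraphs. It is not the pipeline's actual algorithm: the ungapped position-by-position identity, the length-ratio coverage, and the function names are all simplifying assumptions made for illustration, and the hierarchical clustering step is not shown.

```python
from itertools import combinations


def toy_identity_and_coverage(a, b):
    """Toy ungapped comparison (an assumption, not the pipeline's method).

    Identity = matching positions / shorter length.
    Coverage = shorter length / longer length.
    """
    overlap = min(len(a), len(b))
    if overlap == 0:
        return 0.0, 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / overlap, overlap / max(len(a), len(b))


def build_subgraphs(sequences, sid=0.85, cov=0.85):
    """Link sequences passing both thresholds; return connected components."""
    parent = list(range(len(sequences)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(sequences)), 2):
        identity, coverage = toy_identity_and_coverage(sequences[i], sequences[j])
        if identity >= sid and coverage >= cov:
            parent[find(i)] = find(j)  # merge the two components

    groups = {}
    for i, seq in enumerate(sequences):
        groups.setdefault(find(i), []).append(seq)
    return sorted(groups.values())
```

With the made-up sequences `["CASSLGQETQYF", "CASSLGQETQFF", "CAWSVGNTIYF"]`, the default thresholds link only the first two, which differ at a single position. Lowering sid far enough merges all three into one group, which is the toy version of why lower thresholds give broader clusters.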

See the full Clustering Pipeline reference for all parameters and troubleshooting tips.

Next steps

Stratification Pipeline — add a response column to your cohort and run LOOCV classification on the clusters you just generated
Adjust thresholds — lower sid/cov for broader clusters, or enable auto-tune