Scope

Every row on this page is a candidate, not a target. The scoring framework is open educational research. It has been benchmarked retrospectively against four public methylation cohorts, but it has not been validated prospectively against any wet-lab editing readout. Do not select a row from any of the top-100 tables for therapeutic application without running the full freeze-and-validate pipeline described in the external-validation instruction.

v0.5 is the first coordinated release set of the ThermoCas9 Candidate Opportunity Atlas: BRCA on EPIC-v2 / hg38 (cell-line cohort) plus TCGA-COAD / TCGA-LUAD / TCGA-LIHC on HM450 / hg19 (patient cohorts). Each release is hash-pinned via SHA-256 in atlas_manifest.json and tagged immutably in the atlas repository. Hypothesis generation only — no prospective wet-lab validation. The HM450 / hg19 catalog is 19,787,820 sites; the EPIC-v2 / hg38 catalog is 35,406,213 sites. Per-cancer detail pages are under /atlas/brca/, /atlas/coad/, /atlas/luad/, /atlas/lihc/.
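The hash pinning described above can be checked locally once a release file has been downloaded. The following is a minimal sketch, not the atlas tooling: the function names `sha256_of_file` and `matches_pinned_digest` are illustrative, nothing is assumed about the internal layout of atlas_manifest.json, and the truncated digests shown on the tiles are assumed to be prefixes of the full 64-character value.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_digest(path, pinned):
    """Compare a file's digest against a pinned value.

    `pinned` may be a full digest or a truncated prefix (assumption:
    truncated values are prefixes of the full digest). A trailing
    ellipsis character is ignored.
    """
    prefix = pinned.strip().rstrip("…").lower()
    if not prefix:
        raise ValueError("pinned digest is empty")
    return sha256_of_file(path).startswith(prefix)
```

A prefix match is weaker evidence than a full-digest match, so the full value from the release tag is preferable whenever it is available.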

Per-release download tiles

Each tile links to the immutable v0.5 GitHub release tag on sigaihealth/atlas. Reviewer-access caveat: the atlas repository is currently restricted to organization members, so unauthenticated requests return HTTP 404. Public-read access is held as a GigaScience submission gate; the release URLs below will resolve once the repository becomes public.

BRCA · EPIC-v2 / hg38

Source: GSE322563 MCF-7 vs MCF-10A cell-line cohort (n=2 / n=2)

Tier B candidates: 0 — the cell-line cohort sits below the framework p_trust_ramp_n=30 floor; this is a documented release shape, not a bug.

manifest sha256: 94cc08baa800…

atlas-brca-v0.5.0-wg-epic-v2-hg38 ↗

TCGA-COAD · HM450 / hg19

Source: TCGA-COAD patient tissue (n=312 tumor / n=38 normal)

Tier B candidates: 63,853

manifest sha256: 89d592be4104…

atlas-tcga-coad-v0.5.0-wg-sigmoid ↗

TCGA-LUAD · HM450 / hg19

Source: TCGA-LUAD patient tissue (n=473 tumor / n=32 normal)

Tier B candidates: 21,318

manifest sha256: a277af86a5b9…

atlas-tcga-luad-v0.5.0-wg-sigmoid ↗

TCGA-LIHC · HM450 / hg19

Source: TCGA-LIHC patient tissue (n=377 tumor / n=50 normal)

Tier B candidates: 149,099

manifest sha256: 2ae90a7a48f3…

atlas-tcga-lihc-v0.5.0-wg-sigmoid ↗

`panel_layer` normal-tissue coverage

v0.5 introduces a first-class panel_layer manifest block that records each release's normal-tissue panel coverage against the underlying probe catalog. The three TCGA HM450 / hg19 releases share a 10-tissue, 687-sample broad-normal envelope.

COAD

83.68%

16,558,370 / 19,787,820 HM450 catalog sites covered

LUAD

83.63%

16,549,327 / 19,787,820 HM450 catalog sites covered

LIHC

83.77%

16,576,426 / 19,787,820 HM450 catalog sites covered

BRCA

—

unavailable_controlled_access_pending: no public EPIC-v2 / hg38 normal breast cohort surfaces in any GPL33022 series. Recorded as a release limitation, not substituted by a non-breast panel.
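The coverage percentages above follow directly from the site counts. As a quick arithmetic check, this sketch recomputes them from the numbers stated on this page; the variable and function names are illustrative and are not fields of the panel_layer manifest block.

```python
HM450_CATALOG_SITES = 19_787_820  # catalog size stated on this page

# Covered-site counts as stated in the three HM450 / hg19 tiles.
covered_sites = {
    "COAD": 16_558_370,
    "LUAD": 16_549_327,
    "LIHC": 16_576_426,
}


def coverage_percent(covered, catalog):
    """Percentage of catalog sites covered, rounded to two decimals."""
    return round(100.0 * covered / catalog, 2)


for cancer, covered in covered_sites.items():
    print(f"{cancer}: {coverage_percent(covered, HM450_CATALOG_SITES):.2f}%")
```

The three results round to 83.68%, 83.63%, and 83.77%, matching the tiles.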

Cross-cancer shared Tier B loci (3-way HM450 intersection)

Across the three HM450 / hg19 patient-cohort releases (COAD, LUAD, LIHC), the Tier B candidate sets share a 3-way intersection of 6,441 candidates / 1,378 probe loci / 1,635 nearest-gene symbols. Same loci, different signal strength across cancers. BRCA EPIC-v2 / hg38 is omitted from this intersection by design: cross-platform candidate-id overlap is incoherent (different probe space + different reference assembly).
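The 3-way intersection can be pictured as a plain set intersection over candidate identifiers. This is a sketch under assumptions: it treats each release's Tier B set as a collection of candidate-id strings, and the `chrom:pos:strand` identifiers below are a made-up format used only for illustration.

```python
from functools import reduce


def shared_candidates(*candidate_sets):
    """Return the candidate IDs present in every input collection."""
    if not candidate_sets:
        return set()
    return reduce(set.intersection, (set(s) for s in candidate_sets))


# Toy identifiers in a hypothetical "chrom:pos:strand" format.
coad = {"chr1:1000:+", "chr2:2000:-", "chr3:3000:+"}
luad = {"chr1:1000:+", "chr3:3000:+", "chr4:4000:-"}
lihc = {"chr1:1000:+", "chr3:3000:+", "chr5:5000:+"}

shared = shared_candidates(coad, luad, lihc)
print(sorted(shared))
```

Matching by identifier only makes sense when all inputs share one probe space and one reference assembly, which is why an hg38 release cannot be added to an hg19 intersection this way.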

Headline figure — per-positive whole-genome rank percentiles

Per-positive whole-genome rank percentiles for the three Roth Fig. 5d positives across four cohort paths, comparing V2.5-diff (open circles) and V2.5-sigmoid (filled circles).

Figure 4. For each of the three Roth Fig. 5d positives (ESR1, EGFLAM, GATA3), the dot-plot shows the rank percentile of that positive's PAM site within the cohort's whole-genome candidate universe (HM450 ≈ 19.8 M loci; EPIC v2 ≈ 35.4 M). Filled circles are V2.5-sigmoid; open circles are V2.5-diff; the line connects the two so the rank-lift direction is immediately visible. Underlying data: per_positive_wg_percentile.json · SVG.

What to read off the plot.

On the three matched cell-line cohorts (left three columns), V2.5-diff and V2.5-sigmoid sit on top of each other above the 95th-percentile line, with AUC parity within 0.002. The tie_band@100 collapse from 421–1,493 records under V2.5-diff to 1 under V2.5-sigmoid is the actual usability story on these cohorts; this dot-plot shows only that the two scores place the positives at equivalent ranks.
On the GSE69914 tissue cohort (right column), the dumbbells stretch — and not all in the same direction. GATA3 and EGFLAM improve substantially under V2.5-sigmoid; ESR1 moves the wrong way. This is the non-uniform-superiority caveat the methods paper discloses in §6.1: in the GSE69914 EXACT + PROXIMAL_CLOSE restricted-universe subset (where ESR1 is the only evaluable positive), V2.5-sigmoid trails V2.5-diff, raw Δβ-only, and the limma-style baseline.
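The two quantities discussed above, rank percentile and tie band, can be made concrete with a small sketch. This page does not define either formally, so both definitions here are assumptions: the percentile is taken as the share of records scoring strictly below the positive, and the tie band at k is taken as the number of records that share the score of the k-th best record. Higher scores are assumed to rank better.

```python
from bisect import bisect_left, bisect_right


def rank_percentile(scores_sorted, score):
    """Percent of records scoring strictly below `score`.

    `scores_sorted` must be in ascending order (assumed definition).
    """
    below = bisect_left(scores_sorted, score)
    return 100.0 * below / len(scores_sorted)


def tie_band_at_k(scores_sorted, k):
    """Number of records sharing the score of the k-th best record.

    `scores_sorted` must be in ascending order (assumed definition).
    """
    kth_best = scores_sorted[-k]
    return bisect_right(scores_sorted, kth_best) - bisect_left(scores_sorted, kth_best)


# Toy scores: a coarse score with ties near the top.
scores = sorted([0.1, 0.2, 0.2, 0.5, 0.9, 0.9, 0.9, 0.95])
print(rank_percentile(scores, 0.9))  # share of records strictly below 0.9
print(tie_band_at_k(scores, 2))      # records tied with the 2nd-best score
```

Under these definitions, a tie band of 1 at the cutoff means the top-k list is unambiguous, while a band in the hundreds means many records are interchangeable at the boundary even when their percentile is unchanged.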

Per-cohort top-100 explorer

Pick a cohort, then filter by gene symbol or PAM family. Rows highlighted in the accent color overlap a Roth Fig. 5d positive at the wide (±500 bp) tier. The full per-row schema (30+ columns, including RepeatMasker overlap and CGI distance) lives in the per-cohort TSV linked below; this table shows only the slim-schema columns that fit on a screen.
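The wide-tier highlight can be read as a simple distance test. The sketch below is one plausible reading, not the explorer's actual rule: it assumes each candidate and each positive reduces to a single chromosome and position, and that the ±500 bp window is inclusive. The `Site` type, the function name, and all coordinates are hypothetical.

```python
from typing import NamedTuple


class Site(NamedTuple):
    chrom: str
    pos: int


def overlaps_positive(candidate, positives, window_bp=500):
    """True if the candidate lies within `window_bp` of any positive
    on the same chromosome (inclusive window, assumed)."""
    return any(
        candidate.chrom == p.chrom and abs(candidate.pos - p.pos) <= window_bp
        for p in positives
    )


# Made-up coordinates for illustration; not real gene positions.
positives = [Site("chr6", 1_000_000), Site("chr10", 2_000_000)]
print(overlaps_positive(Site("chr6", 1_000_450), positives))
print(overlaps_positive(Site("chr6", 1_000_501), positives))
```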


Summary by cohort

GSE322563 HM450

Source: Roth MCF-7 / MCF-10A

Path: HM450-intersect

n_t / n_n: 2 / 2

GSE322563 native EPIC v2

Source: Roth MCF-7 / MCF-10A

Path: Native EPIC v2

n_t / n_n: 2 / 2

GSE77348

Source: δ-development cohort

Path: HM450

n_t / n_n: 3 / 3

GSE69914 tissue

Source: Primary tumor + healthy donor

Path: HM450

n_t / n_n: 305 / 50

How rows on this page map to rows in the paper

The dot-plot (fig4) visualizes the per-positive WG-rank table reported numerically in PAPER §5.2.2. The data file (examples/genome_wide_panel.md) is committed at the paper tag.
The per-cohort top-100 tables are the same artifact PAPER §5.5 reports at top-20, expanded to top-100 for this website. They ship as examples/<cohort>_roth_labels/top100_atlas.{tsv,md} at memo-2026-04-22-bw.
The tie-band collapse story (V2.5-diff 421–1,493 → V2.5-sigmoid 1) is visible as the y-axis collapse in PAPER fig2; this atlas page does not duplicate that figure.
The ESR1 reversal is highlighted graphically in the dot-plot above and discussed quantitatively in PAPER §5.7.

What is not in this atlas (intentionally)

No TCGA pan-cancer rows. The framework includes a streaming k-way-merge pan-cancer aggregator, and unpublished TCGA cohort scoring exists in the working tree. Those rows are not on this surface yet for two reasons: they have not been audited against a tagged paper artifact, and publishing pan-cancer top-K shortlists on a corporate domain reads as a drug-discovery pipeline more than a benchmarked methods demonstration. A future v2 atlas may add a TCGA section behind explicit framing.
No genome-browser / Manhattan view. The UCSC Genome Browser already does this well; click any candidate_id chromosome coordinate from the per-cohort TSVs to jump there.
No score-distribution histograms. Those live under examples/*_roth_labels/ as supplementary materials; this page intentionally scopes to "where do the candidates land" rather than "what does the score distribution look like."
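For readers unfamiliar with the term, a streaming k-way merge combines several already-sorted streams into one sorted stream without loading any of them fully into memory. The sketch below shows the general technique with Python's standard `heapq.merge`; it is not the framework's aggregator, and the row layout `(score, cohort, candidate_id)` and all values are invented for illustration.

```python
import heapq
from itertools import islice


def merged_top_k(streams, k):
    """Lazily merge streams that are each sorted by descending score
    and return the first k rows of the merged order."""
    merged = heapq.merge(*streams, key=lambda row: row[0], reverse=True)
    return list(islice(merged, k))


# Toy per-cohort streams, each already sorted by descending score.
coad = iter([(0.97, "COAD", "c1"), (0.80, "COAD", "c2")])
luad = iter([(0.99, "LUAD", "l1"), (0.60, "LUAD", "l2")])
lihc = iter([(0.90, "LIHC", "h1")])

top3 = merged_top_k([coad, luad, lihc], 3)
print(top3)
```

Because the merge only holds one pending row per stream, memory use grows with the number of streams rather than with the number of rows.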

Cite or reproduce

Methods paper — PAPER.pdf at tag paper-5-10j; Bioinformatics-shaped short version MANUSCRIPT.pdf at tag memo-2026-04-22-bw.
Atlas reproducer scripts — build_atlas_dotplot.py and build_atlas_top100.py.
Roth et al., Nature 2026 — read the article on nature.com (DOI 10.1038/s41586-026-10384-z).

Atlas — per-cohort ThermoCas9 target shortlists

v0.5 Posture B coordinated release set — four cancers, mixed-platform

