An LLM-driven toolkit for qualitative researchers, built around Braun & Clarke's six phases.
This site is a design walkthrough of the v0 walking skeleton — twelve pipeline nodes, ten locked decisions, five deep dives — produced so a less-technical colleague can read the plan and stress-test it.
Funded by MASSHINE. All design choices in the v0 spec are auditable in natural language: every code, merge, and theme carries a written rationale and a verbatim span back to the source transcript.
The method
Braun & Clarke's reflexive thematic analysis treats researcher subjectivity as a generative resource. MASSHINE maps the six phases onto auditable model calls — keeping the human in the loop where interpretation is non-substitutable. Where a phase is deliberately not automated in v0, the right column says so: the v0/post-v0 boundary is itself a design decision (spec §8/§9).
Reflexive TA · 2006 / 2019 / 2021
LLM calls · file artifacts · human checkpoints
Semantic codes
Surface content. "She describes the house as having the barn attached." Handled well by LLMs; verified by verbatim-span match.
Latent codes
Underlying meaning. "The barn being part of the house is read as continuity of agrarian life." LLMs are weaker here; flagged for human review.
Memos
Drift notes from the reconcile step. Memos are prompts for human reflexivity, not substitutes for it.
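A minimal sketch of how a code record and its verbatim-span check might look. The `Code` schema, its field names, and the review rule for semantic codes are assumptions made for illustration; they are not taken from the v0 spec or from tools/verify_quotes.py. The transcript text is the Mary Grande paragraph quoted elsewhere on this page, and the code label is invented.

```python
from dataclasses import dataclass
from typing import Literal


@dataclass(frozen=True)
class Code:
    """One code with a rationale and a character span (hypothetical schema)."""
    label: str
    kind: Literal["semantic", "latent"]
    rationale: str
    quote: str
    start: int  # inclusive character offset into the transcript
    end: int    # exclusive character offset


def span_is_verbatim(code: Code, transcript: str) -> bool:
    """True only if the stored quote equals the transcript text at [start, end)."""
    return transcript[code.start:code.end] == code.quote


def needs_human_review(code: Code, transcript: str) -> bool:
    """Latent codes always go to a human; semantic codes only when the
    span check fails (the second half of this rule is an assumption)."""
    return code.kind == "latent" or not span_is_verbatim(code, transcript)


transcript = (
    "I had to help with, when they killed the pigs I had to catch the blood. "
    "I didn't like it, but that was part of my job."
)
quote = "that was part of my job"
start = transcript.find(quote)

code = Code(
    label="chores as obligation",  # illustrative label, not from the spec
    kind="semantic",
    rationale="The speaker names the task as part of her job.",
    quote=quote,
    start=start,
    end=start + len(quote),
)

print(span_is_verbatim(code, transcript))
print(needs_human_review(code, transcript))
```

Storing offsets alongside the quote means the check is a plain slice comparison, so a paraphrased or edited quote fails immediately rather than matching somewhere else in the transcript.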
The pipeline
Walk a single sentence from Mary Grande's 1989 interview through the pipeline. At each stage, the data looks different — and the changes are auditable.
Below, the paragraph "I had to help with, when they killed the pigs I had to catch the blood. I didn't like it, but that was part of my job." travels from raw text to candidate theme. Read each card's summary, then the data, then the meta line. Stages marked editable are the ones a human reviews at the checkpoint. The full twelve-node pipeline — including reconcile, both checkpoints, the cross-family judge, and the audit — is mapped in Architecture & scale.
Locked decisions
Each v0 locked decision, with the spec clause, the research evidence, and a moment from a real Ellis Island transcript that shows why it matters.
Architecture & scale
The full v0 pipeline with inputs and outputs per node, the triggers that pull in each post-v0 extension, and the staged path from 12 synthetic transcripts to the 1,343-transcript corpus. Where the spec deviates from the literature's "proven recipe" (embeddings, SQLite), the trigger to revisit is written down here.
Deep dives
A pressure test, a systems comparison, and three worked transcript runs — each exportable as markdown for design conversations with colleagues and reviewers.
How these examples were made: the runs below are hand-authored illustrations of the v0 artifact formats, drawn from real Ellis Island Oral History Project transcripts (every quote is string-verified against the source — see tools/verify_quotes.py). v0 itself runs on synthetic data only (spec §3); the Ellis Island corpus enters at Stage 2.
Audit & limits
MASSHINE's exit gates are the only success criteria. The limits panel is the honest part — what the system deliberately does not automate.
≥ 4/6 planted themes recovered, including ≥ 1 of the 2 latent themes. Measured on the synthetic corpus only and always labeled synthetic_only.
Zero quotes in the final artifacts fail verbatim verification: the audit re-runs an exact string match of every quote against its source transcript.
Every theme resolves to codes → excerpts → char spans. No orphans. If the trace breaks, the theme is not grounded.
A full run is readable end-to-end: every artifact and decision is JSON or markdown, diffable with git, auditable in under an hour.
Reflexive TA treats subjectivity as a generative resource, not as bias. Inter-rater reliability (IRR) presupposes a single correct coding; reflexive TA rejects that premise. Reserve IRR for codebook-TA arms.
The reconcile step writes drift memos as prompts for human reflexivity, not as a substitute. Every system we surveyed keeps a human as final interpretive authority.
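Two of the exit gates in this section, theme recovery and traceability, lend themselves to a short sketch. Everything named here is hypothetical: the `PlantedTheme` type, the dictionary shapes, and the placeholder theme names. The sketch also assumes that matching recovered themes to planted ones has already been done upstream, and receives only the set of matched names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlantedTheme:
    name: str
    latent: bool


def recovery_gate(planted, recovered_names, min_total=4, min_latent=1):
    """Check the recovery thresholds (>= 4 recovered, >= 1 latent by default)."""
    recovered = [t for t in planted if t.name in recovered_names]
    latent_hits = sum(1 for t in recovered if t.latent)
    return {
        "label": "synthetic_only",  # results are always labeled this way
        "recovered": len(recovered),
        "latent_recovered": latent_hits,
        "passed": len(recovered) >= min_total and latent_hits >= min_latent,
    }


def orphan_themes(themes, codes, excerpts):
    """Return themes whose trace to character spans is broken.

    themes:   theme name -> list of code ids
    codes:    code id    -> list of excerpt ids
    excerpts: excerpt id -> (start, end) character span
    """
    orphans = []
    for theme, code_ids in themes.items():
        grounded = bool(code_ids) and all(
            codes.get(c)
            and all(e in excerpts and excerpts[e][0] < excerpts[e][1] for e in codes[c])
            for c in code_ids
        )
        if not grounded:
            orphans.append(theme)
    return orphans


# Six placeholder planted themes, two of them latent.
planted = [PlantedTheme(f"P{i}", latent=(i >= 5)) for i in range(1, 7)]
result = recovery_gate(planted, {"P1", "P2", "P3", "P5"})
print(result["passed"])

themes = {"T1": ["c1"], "T2": ["c2"]}
codes = {"c1": ["e1"]}           # c2 is missing, so T2 cannot be traced
excerpts = {"e1": (0, 10)}
print(orphan_themes(themes, codes, excerpts))
```

With these inputs the recovery gate passes (four themes recovered, one of them latent) and `T2` is reported as an orphan because its only code has no excerpts. Recovering `P1` through `P4` instead would fail the gate, since no latent theme is among them.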
Reading list
The research foundation for v0. The full literature review lives in research.md; the entries listed here are the ones that actually moved design choices.