Proofd starts from the messiest possible input: unstructured natural language about professional work and thinking, arriving in fragments, out of context, from people who are thinking out loud. The task is not just to process it. It is to build, from that raw material, a system that knows how you think, what you believe, how your positions have evolved over time, and how to express all of that in your voice.

That is harder than it sounds. Here is an honest account of where it gets hard.

Stage 1: Splitting a Transcript into Semantic Units

Before anything useful can happen, a voice note transcript has to be broken into discrete units of meaning. We call these atoms.

A single 60-second voice note might contain three separate thoughts: a factual observation, an opinion forming in real time, and a prediction about how something will play out. These cannot be treated as one unit. They carry different signal types, different confidence levels, and different downstream behaviors. An observation that confirms a pattern you have been building for six months is categorically different from an offhand remark.

AtomizationService splits each transcript into TAtomDraft objects, each with a verbatimText (the exact substring from the transcript) and a canonicalStatement (a cleaned restatement). ClassificationService then assigns each draft an atomType: claim, observation, event, lesson, question, anecdote, fact, or prediction.
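A minimal sketch of the draft shape, assuming nothing beyond the field names and atom types listed above. The interface layout and the findInvalidDrafts helper are hypothetical. The helper only illustrates the invariant that verbatimText is an exact substring of the transcript.

```typescript
// Hypothetical shapes: only the field names and the eight atom types come from the text.
type AtomType =
  | "claim" | "observation" | "event" | "lesson"
  | "question" | "anecdote" | "fact" | "prediction";

interface TAtomDraft {
  verbatimText: string;       // exact substring of the transcript
  canonicalStatement: string; // cleaned restatement
  atomType?: AtomType;        // assigned afterwards by classification
}

// Returns the drafts whose verbatimText is empty or does not occur in the transcript.
function findInvalidDrafts(transcript: string, drafts: TAtomDraft[]): TAtomDraft[] {
  return drafts.filter(
    (d) => d.verbatimText.length === 0 || !transcript.includes(d.verbatimText),
  );
}
```

A check like this is cheap to run after atomization and catches drafts where the model paraphrased instead of quoting.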

The boundary detection problem is real. Natural speech does not segment cleanly. A sentence can begin as an observation and end as a prediction. The atomizer has to make judgment calls about where one unit of meaning ends and another begins, and those calls affect every downstream stage.

Stage 2: What the System Extracts From Each Atom

For every atom, the system runs parallel extraction across several dimensions.

TNarrativePrimitives captures the structural elements of the statement: whether the atom has a coreMessage that can stand alone, whether it includes metrics, a reframeMoment, a constraint, a contrarianAgainst target, or a transferableLesson. These are not inferred thematically. They are structural properties of the statement itself.

TAtomOpinion extracts whether the atom represents a position and, if so, how strongly held it is. The strength field is an ordinal scale from musing through emerging, held, strong, to conviction. An offhand comment about a strategy gets marked musing. A position you have returned to repeatedly, expressed with clarity and evidence, gets marked conviction. The system never conflates the two.
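Because strength is ordinal, it can be compared by position rather than by label. The sketch below assumes the five values are ordered exactly as listed above. The helper names are hypothetical.

```typescript
// The five strength values, weakest first, in the order given in the text.
const STRENGTHS = ["musing", "emerging", "held", "strong", "conviction"] as const;
type Strength = (typeof STRENGTHS)[number];

// Position on the ordinal scale: 0 for musing, 4 for conviction.
function strengthRank(s: Strength): number {
  return STRENGTHS.indexOf(s);
}

// Negative if a is weaker than b, zero if equal, positive if stronger.
function compareStrength(a: Strength, b: Strength): number {
  return strengthRank(a) - strengthRank(b);
}

// True when s is at least as strong as floor.
function isAtLeast(s: Strength, floor: Strength): boolean {
  return compareStrength(s, floor) >= 0;
}
```

Keeping the scale as an ordered list means a musing can never be treated as a conviction by accident of string comparison.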

TPostSignal[] flags determine whether an atom is worth publishing. These are qualitative flags, not scores: has_strong_opinion, is_contrarian, has_measurable_outcome, has_reframe, has_transferable_lesson. If any of the strongest signals are present, the system creates an automatic post job from that atom without waiting for a scheduled generation run.

intensity is a 0.0 to 1.0 float that captures the emotional weight of the atom. This feeds directly into which exemplar atoms are selected when the system later synthesizes your author profile.

Stage 3: Entity Resolution

Every person, company, product, and organization mentioned in your voice notes becomes a TEntity. The hard problem is that you rarely refer to the same person the same way twice.

"Marcus from the client team" and "Marcus" and "the client contact we met in Denver" might be the same person. Entity resolution has to adjudicate that at scale, without human review, while maintaining a low false-positive rate. Conflating two different people is a worse error than keeping them separate.

The resolution pipeline uses fuzzy string matching for high-confidence cases (threshold 0.85) and semantic embedding similarity for lower-confidence cases (threshold 0.82). In the gray band between those thresholds, an LLM adjudicates with access to surrounding context from both candidate atoms. Each entity tracks a sentimentArc: the average sentiment toward that entity, its trend over time, and the individual data points that make up the arc.
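The routing can be sketched as a three-way decision. This is one conservative reading of the thresholds above, not a confirmed description of the pipeline: a fuzzy score at or above 0.85 merges outright, an embedding similarity below 0.82 keeps the entities separate, and everything else goes to the LLM adjudicator. The function name, the score inputs, and the exact band boundaries are assumptions.

```typescript
type Resolution = "merge" | "adjudicate" | "separate";

const FUZZY_MATCH_THRESHOLD = 0.85; // from the text
const EMBEDDING_THRESHOLD = 0.82;   // from the text

// Assumed routing: this ordering of checks is illustrative only.
function routeEntityPair(fuzzyScore: number, embeddingSimilarity: number): Resolution {
  // A strong string match is treated as the high-confidence case.
  if (fuzzyScore >= FUZZY_MATCH_THRESHOLD) return "merge";
  // Weak on both measures: keep the entities apart.
  if (embeddingSimilarity < EMBEDDING_THRESHOLD) return "separate";
  // Semantically close but not a confident string match: ask the adjudicator.
  return "adjudicate";
}
```

Sending the uncertain cases to adjudication rather than merging them reflects the stated priority: a false merge costs more than a missed one.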

Getting entity resolution wrong is not a minor inconvenience. It corrupts the downstream belief threads, distorts the author identity profile, and generates posts that make confident claims about relationships that do not exist as described.

Stage 4: Theme Clustering and Momentum

As atoms accumulate, ThemeAssignmentService clusters them into TTheme objects using vector centroids stored in Qdrant. A new atom attaches to an existing theme if the cosine similarity between its embedding and that theme's centroid clears 0.78. Small themes merge with each other at 0.92. New themes form when an atom does not fit anywhere.
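The attachment rule can be sketched in memory, without Qdrant. This assumes "clears" means greater than or equal to 0.78 and that an atom may attach to every theme that passes. The ThemeCentroid shape and function names are hypothetical.

```typescript
interface ThemeCentroid {
  id: string;
  centroid: number[];
}

const ATTACH_THRESHOLD = 0.78; // from the text

// Cosine similarity of two equal-length vectors. Returns 0 if either has zero length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Ids of every theme the atom clears the threshold for.
// An empty result means the atom does not fit anywhere and would seed a new theme.
function matchingThemeIds(embedding: number[], themes: ThemeCentroid[]): string[] {
  return themes
    .filter((t) => cosineSimilarity(embedding, t.centroid) >= ATTACH_THRESHOLD)
    .map((t) => t.id);
}
```

In production the nearest-centroid lookup would be a vector search rather than a linear scan. The threshold logic is the same either way.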

Each theme carries a momentum object that is recomputed on every extraction run. The four possible states are rising, steady, cooling, and dormant, based on how many atoms have arrived in the last 7 and 30 days. A theme that has been quiet for more than 60 days is dormant, regardless of how many atoms it accumulated in the past.
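A sketch of how the four states could be derived. Only the 7-day and 30-day windows and the 60-day dormancy rule come from the description above. The rising and cooling cutoffs (1.5x and 0.5x the expected weekly share of the 30-day count) are invented for illustration, as is treating an empty theme as dormant.

```typescript
type Momentum = "rising" | "steady" | "cooling" | "dormant";

const DAY_MS = 24 * 60 * 60 * 1000;

// spokenAtTimes: one timestamp per atom in the theme.
function classifyMomentum(spokenAtTimes: Date[], now: Date): Momentum {
  const agesInDays = spokenAtTimes.map((t) => (now.getTime() - t.getTime()) / DAY_MS);
  if (agesInDays.length === 0) return "dormant"; // assumption: no atoms means dormant

  // From the text: quiet for more than 60 days is dormant, whatever the total count.
  if (Math.min(...agesInDays) > 60) return "dormant";

  const last7 = agesInDays.filter((a) => a >= 0 && a <= 7).length;
  const last30 = agesInDays.filter((a) => a >= 0 && a <= 30).length;
  if (last30 === 0) return "cooling"; // quiet for 30 to 60 days

  // Assumed cutoffs: compare this week against an even spread of the 30-day count.
  const expectedWeekly = (last30 * 7) / 30;
  if (last7 > expectedWeekly * 1.5) return "rising";
  if (last7 < expectedWeekly * 0.5) return "cooling";
  return "steady";
}
```

Because the function looks only at recent timestamps, a theme's historical atom count cannot keep it out of the dormant state.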

Momentum is what the post generation system reads first when selecting what to write about. A theme with rising momentum and 4 atoms in the last week is a more productive source than a theme with 40 atoms that went quiet two months ago. The author's current intellectual activity is the signal. Stale expertise, however deep, produces posts that feel out of step.

The clustering problem: professional thinking does not divide into clean, non-overlapping topics. A thought about team dynamics might belong to leadership, to a specific project, and to a recurring concern about communication. The centroid model handles this with overlapping membership and threshold tuning, but the correct threshold is not universal across users with different communication styles and topic densities.

Stage 5: Belief Threads

This is the part of the system that is genuinely unlike anything in professional publishing tools.

BeliefThreadService tracks opinions across time. Each TBeliefThread represents a position the system has identified as evolving: the claim, its currentStrength and originStrength, and a milestones[] array of every atom that has moved the needle.

The status field tracks the lifecycle of a position: forming, firm, evolving, contradicted, resolved. If a new atom directly contradicts an existing strong position, the system logs a contradiction entry and marks the thread contradicted. The system does not silently discard old positions to make the current state look more coherent. Contradictions are preserved.

The LLM adjudication that assigns reinforces, evolves, contradicts, or introduces to each new atom has to reason about the semantic relationship between the new atom and a thread that may span dozens of entries over months. A nuanced position shift is not the same as a contradiction. A reframe that keeps the conclusion but changes the reasoning should extend the thread, not split it. Getting this wrong produces posts that either oversimplify your positions or present you as more contradictory than you are.
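What happens after adjudication can be sketched as an append-only update. The field names claim, currentStrength, originStrength, milestones, and status come from the description above. The Milestone shape, the applyAdjudication function, and the status transitions for evolves and reinforces are assumptions. For brevity the sketch marks any contradiction, whereas the text describes this for strong positions.

```typescript
type Strength = "musing" | "emerging" | "held" | "strong" | "conviction";
type Relation = "reinforces" | "evolves" | "contradicts" | "introduces";
type ThreadStatus = "forming" | "firm" | "evolving" | "contradicted" | "resolved";

interface Milestone {
  atomId: string;
  relation: Relation;
  spokenAt: string; // ISO 8601 timestamp
}

interface TBeliefThread {
  claim: string;
  originStrength: Strength;
  currentStrength: Strength;
  status: ThreadStatus;
  milestones: Milestone[];
}

// Returns a new thread with the milestone appended. Earlier milestones are never dropped.
function applyAdjudication(thread: TBeliefThread, milestone: Milestone): TBeliefThread {
  let status = thread.status;
  if (milestone.relation === "contradicts") {
    status = "contradicted"; // the contradiction is recorded, not hidden
  } else if (milestone.relation === "evolves") {
    status = "evolving"; // assumed transition: a reframe extends the same thread
  }
  return { ...thread, status, milestones: [...thread.milestones, milestone] };
}
```

Appending rather than rewriting is what lets a later provenance trace show the full arc, including the points where the author changed course.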

The belief thread system is also what makes the provenance feature honest. When a post surfaces a position, it can trace the conviction arc: when the view first appeared, how it strengthened or changed, which voice notes contributed. That trace comes from TBeliefThread.milestones[], not from anything the LLM invented.

Stage 6: Building the Author Profiles

Everything described above feeds into two documents per user: TAuthorVoiceProfile and TAuthorIdentityProfile.

TAuthorVoiceProfile captures how you write and speak: vocabularyLevel, sentenceRhythm, registerNotes, signaturePhrases[], idiomMarkers[], tone[], pointOfView, assertiveness, and concreteness. The confidence field is calculated as clamp(atomsAnalyzed / 25, 0.1, 0.95). Below 25 atoms analyzed, the system knows it does not have a reliable voice model.

TAuthorIdentityProfile captures what you know and where you stand: inferredRole, inferredDomain, expertiseAreas[], recurringStances[], audienceHints[], and biographyFragments[]. Its confidence is clamp(sourceAtomIds.length / 15, 0.1, 0.95).
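Both confidence formulas are the same clamp with a different divisor. A direct transcription, with hypothetical function names:

```typescript
// Restricts value to the range [min, max].
function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Voice profile: clamp(atomsAnalyzed / 25, 0.1, 0.95)
function voiceConfidence(atomsAnalyzed: number): number {
  return clamp(atomsAnalyzed / 25, 0.1, 0.95);
}

// Identity profile: clamp(sourceAtomIds.length / 15, 0.1, 0.95)
function identityConfidence(sourceAtomCount: number): number {
  return clamp(sourceAtomCount / 15, 0.1, 0.95);
}
```

The floor of 0.1 means a brand-new profile is never reported as zero confidence, and the ceiling of 0.95 means no amount of input makes the system claim certainty. Under these formulas, voiceConfidence(5) is 0.2, while voiceConfidence(25) and identityConfidence(15) both sit at the 0.95 cap.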

Neither profile is built from onboarding questions. Both are accreted entirely from your atoms.

The LLM synthesis that produces readable summaries of voice and identity runs on a cadence, not on every extraction. It fires the first time 5 atoms have been analyzed, then every 15 atoms after that. Between synthesis runs, the system appends facts and strong opinions directly without re-running the full LLM pass. This keeps the profile current without running expensive synthesis on every voice note.
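The cadence can be sketched as a check against the atom count at the last synthesis run. This assumes "every 15 atoms after that" is measured from the previous run, so a batch that skips past an exact multiple still triggers. The function and parameter names are hypothetical.

```typescript
const FIRST_SYNTHESIS_AT = 5;  // from the text
const SYNTHESIS_INTERVAL = 15; // from the text

// atomsAtLastSynthesis is null until the first synthesis has run.
function isSynthesisDue(atomsAnalyzed: number, atomsAtLastSynthesis: number | null): boolean {
  if (atomsAtLastSynthesis === null) {
    return atomsAnalyzed >= FIRST_SYNTHESIS_AT;
  }
  return atomsAnalyzed - atomsAtLastSynthesis >= SYNTHESIS_INTERVAL;
}
```

Comparing against the last run instead of testing for exact counts avoids a missed synthesis when one voice note yields several atoms at once.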

The hardest problem at this stage is synthesis quality on a thin corpus. At 5 atoms, the LLM is making stylistic inferences from very little evidence. The confidence formula dampens these early estimates, but the synthesis itself still has to produce something coherent without overclaiming. The instructions to the synthesis LLM include explicit guidance about epistemic humility proportional to the atom count.

The Cold-Start Problem

At 5 atoms analyzed, voice confidence is 0.20 and identity confidence is about 0.33. At 25 atoms, voice confidence reaches 0.95. The gap between those two states is where the cold-start problem lives.

During cold start, the post generation system writes modestly: fewer strong claims, more provisional framing, no asserting expertise the author has not yet demonstrated in their atoms. The personaMaturity field on the user record tracks this explicitly as a four-level progression: forming (0 to 4 atoms), developing (5 to 14), established (15 to 24), refined (25 and above).
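The four levels map directly onto atom-count bands. A sketch using the boundaries above, with a hypothetical function name:

```typescript
type PersonaMaturity = "forming" | "developing" | "established" | "refined";

// Bands from the text: 0 to 4, 5 to 14, 15 to 24, 25 and above.
function personaMaturityFor(atomCount: number): PersonaMaturity {
  if (atomCount >= 25) return "refined";
  if (atomCount >= 15) return "established";
  if (atomCount >= 5) return "developing";
  return "forming";
}
```

A generator can read this level to decide how provisional its framing should be, instead of re-deriving it from raw counts each time.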

The cold-start failure mode is generating a confident, voice-faithful post from a thin corpus and having it feel hollow to the author, because the system is expressing a profile it has not yet earned.

Idempotency and Reconciliation

Voice notes can be re-extracted. When they are, the system cannot simply delete old atoms and re-run. Theme membership counts, entity mention counts, belief thread milestones, and profile counters all depend on atoms being present. Deleting atoms without reversing their contributions leaves every downstream aggregate counting material that no longer exists.

ReconciliationService runs before any atom deletion. It walks every theme, entity, and belief thread that the old atoms contributed to and reverses their contributions before anything is deleted. Only after reconciliation are the old atoms removed and new extraction begun. The downstream aggregates always reflect the current set of atoms, not a running total that has accumulated drift from partial runs and re-runs.
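The ordering can be sketched with in-memory counters: reverse every contribution first, delete second. The Atom and Aggregates shapes and the function names are hypothetical, and the sketch covers only theme and entity counts, not belief thread milestones or profile counters.

```typescript
interface Atom {
  id: string;
  themeIds: string[];
  entityIds: string[];
}

interface Aggregates {
  themeAtomCounts: Map<string, number>;
  entityMentionCounts: Map<string, number>;
}

// Subtracts one from a counter and removes the key when it reaches zero.
function decrement(counts: Map<string, number>, key: string): void {
  const next = (counts.get(key) ?? 0) - 1;
  if (next > 0) counts.set(key, next);
  else counts.delete(key);
}

function reconcileThenDelete(
  atoms: Map<string, Atom>,
  aggregates: Aggregates,
  atomIds: string[],
): void {
  // Phase 1: reverse each old atom's contributions while the atom still exists.
  for (const id of atomIds) {
    const atom = atoms.get(id);
    if (!atom) continue; // already removed, nothing left to reverse
    for (const themeId of atom.themeIds) decrement(aggregates.themeAtomCounts, themeId);
    for (const entityId of atom.entityIds) decrement(aggregates.entityMentionCounts, entityId);
  }
  // Phase 2: only now remove the atoms themselves.
  for (const id of atomIds) atoms.delete(id);
}
```

Skipping atoms that are already gone makes a repeated call a no-op, which matters when a previous run failed partway through.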

This matters operationally because signal extraction can fail partway through, and because the system allows manual re-classification of stale atoms when the extraction model is updated. Without reconciliation, these operations corrupt the longitudinal data.

Time Integrity

The most subtle correctness problem in the system is time.

When a user records a voice note at 2 PM but it is not uploaded and processed until 8 PM, the spokenAt field is set to the recording time, not the processing time. Theme momentum, belief thread timelines, and provenance spans all use spokenAt. If processing time were used instead, a batch of notes processed together in the evening would create false momentum signals and distorted conviction arcs.

spokenAt is derived from the audio capturedAt field, which reflects when the device began recording. For the longitudinal metrics to be honest, the system has to reason about when the thinking happened, not when the recording was submitted for processing.
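A sketch of the anchoring rule, assuming timestamps are ISO 8601 strings. capturedAt comes from the description above. The processedAt field, the VoiceNoteTimestamps shape, and both function names are hypothetical.

```typescript
interface VoiceNoteTimestamps {
  capturedAt: string;  // when the device began recording
  processedAt: string; // hypothetical: when extraction ran, never used as the anchor
}

// spokenAt is taken from capturedAt only.
function deriveSpokenAt(note: VoiceNoteTimestamps): Date {
  const spokenAt = new Date(note.capturedAt);
  if (Number.isNaN(spokenAt.getTime())) {
    throw new Error("capturedAt is not a valid timestamp");
  }
  return spokenAt;
}

// Counts notes whose spokenAt falls within the last windowDays before now.
function countSpokenInWindow(notes: VoiceNoteTimestamps[], now: Date, windowDays: number): number {
  const windowMs = windowDays * 24 * 60 * 60 * 1000;
  return notes.filter((note) => {
    const ageMs = now.getTime() - deriveSpokenAt(note).getTime();
    return ageMs >= 0 && ageMs <= windowMs;
  }).length;
}
```

With this anchor, a note recorded three weeks ago and processed today does not count toward the 7-day window, so a late batch upload cannot fake a burst of recent activity.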

This seems like a minor detail. It is not. Every visualization of how your thinking has evolved, every provenance trace on a published post, and every theme momentum calculation depends on getting the temporal anchor right.

Why the Published Post Is the Easy Part

The hard work in Proofd is not generating a post. It is understanding what you actually think, in enough depth to generate a post that sounds like you.

The pipeline described above runs before any generation happens. By the time the post engine opens a context window, it has your current voice profile, your live identity profile, your highest-momentum themes, your active belief threads with their full trajectory, and your strongest-signal atoms with verbatim excerpts from your own voice. The writer is not inventing content. It is expressing material it has been building a model of for weeks.

That is what makes the published post feel like you. Not the generation. The understanding that came before it.