Voice-Note-to-Brief: Sub-30s Latency from ≤90s Voice Note to Structured Campaign Brief

FILE δ-17 · FIELD · capture · Apr 28, 2026

Mickey Haslavsky — enso Lab, FIELD group

§1 · Abstract

Campaign ideation attrition is concentrated in the transition from founder cognition to written brief template. This design dossier specifies a system that ingests a ≤90-second voice note and returns a structured campaign brief plus three candidate opening lines in <30 seconds end-to-end. Implementation pending.

Status: design dossier
Clearance: δ-17
Surface: FIELD · capture
Read: 2 min read

Stack: 5 components

Whisper large-v3: verbatim transcription
GPT-4o: brief + opening-line synthesis
ElevenLabs: 10-second audio summary back
Tavily API: audience + competitor context
Twilio: voice-note capture from phone
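
A minimal sketch of how the five components might be chained and timed. The stage names, their order (in particular, fetching context before synthesis) and the `run_pipeline` helper are assumptions made for illustration; each stage is an injected callable, so no vendor API is called or implied here.

```python
import time
from typing import Any, Callable

# Hypothetical stage names, one per stack component:
# capture (Twilio), transcribe (Whisper), context (Tavily),
# synthesize (GPT-4o), speak (ElevenLabs). The order is an assumption.
STAGE_ORDER = ("capture", "transcribe", "context", "synthesize", "speak")


def run_pipeline(
    voice_note: Any,
    stages: dict[str, Callable[[Any], Any]],
    clock: Callable[[], float] = time.perf_counter,
) -> tuple[Any, dict[str, float]]:
    """Run every stage in order; return the final payload and per-stage seconds."""
    missing = [name for name in STAGE_ORDER if name not in stages]
    if missing:
        raise ValueError(f"missing stages: {missing}")

    payload = voice_note
    timings: dict[str, float] = {}
    for name in STAGE_ORDER:
        started = clock()
        payload = stages[name](payload)  # each stage consumes the previous output
        timings[name] = clock() - started
    return payload, timings
```

Keeping the stages injectable means each component can be stubbed in tests and swapped later, and the per-stage timings show which component is consuming the 30-second budget.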

§2 · Hypothesis

H1: Founders generate higher-quality campaign ideas when speaking than when writing. H2: Reducing brief-production latency below the founder's locomotion window (the duration of the walk on which the idea occurs) increases campaigns shipped per founder and preserves authorial voice in shipped artefacts.

§3 · Materials

Input: a phone-recorded voice note, duration ≤90 seconds
Output: a single-page structured brief (problem · audience · offer · channels) plus three candidate opening lines
Target end-to-end latency: <30 seconds
Return-channel: ≤10-second audio summary, enabling in-motion approval
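
The three numeric limits above can be written down as a single checked configuration. This is a sketch only: `LoopLimits`, `violations` and the violation labels are hypothetical names, and the durations are assumed to be measured elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopLimits:
    max_note_seconds: float = 90.0     # input: voice note of at most 90 s
    max_latency_seconds: float = 30.0  # target: strictly under 30 s end to end
    max_summary_seconds: float = 10.0  # return channel: at most 10 s of audio


def violations(
    note_seconds: float,
    latency_seconds: float,
    summary_seconds: float,
    limits: LoopLimits = LoopLimits(),
) -> list[str]:
    """Return a label for every limit that a single run breaks."""
    problems = []
    if note_seconds > limits.max_note_seconds:
        problems.append("note_too_long")
    if latency_seconds >= limits.max_latency_seconds:  # target is "<30 s", so 30.0 fails
        problems.append("latency_over_budget")
    if summary_seconds > limits.max_summary_seconds:
        problems.append("summary_too_long")
    return problems
```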

§4 · Procedure

The end-to-end recipe. Follow it top to bottom; each step assumes the previous one ran cleanly.

Step 01
Preserve verbatim founder phrasing in transcription
The brief reproduces the founder verbatim at high-leverage points: offer wording and specific buyer pain. These phrases constitute the primary artefact; paraphrasing destroys signal and is suppressed by design.
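
One way to enforce this is to check that the high-leverage fields occur as contiguous spans of the transcript. The sketch below assumes the brief is a dict with `offer` and `problem` keys (the latter standing in for buyer pain) and ignores only case and whitespace differences; the function name and the choice of fields are illustrative.

```python
import re


def _normalize(text: str) -> str:
    """Collapse whitespace and ignore case so only wording differences count."""
    return re.sub(r"\s+", " ", text).strip().casefold()


def paraphrased_fields(
    brief: dict[str, str],
    transcript: str,
    verbatim_fields: tuple[str, ...] = ("offer", "problem"),
) -> list[str]:
    """Return the fields whose text is missing or not found verbatim in the transcript."""
    haystack = _normalize(transcript)
    flagged = []
    for name in verbatim_fields:
        needle = _normalize(brief.get(name, ""))
        if not needle or needle not in haystack:
            flagged.append(name)
    return flagged
```

A non-empty result would be a signal to re-prompt or to fall back to quoting the transcript directly rather than shipping a paraphrase.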
Fig. The 30-second loop
1. 90s voice note (phone, on a walk)
2. Transcript (verbatim)
3. Structured brief (problem · audience · offer)
4. Audio summary back (approve while walking)
Emit a structured brief, not free-form prose
The model is not prompted for 'a brief'; it returns a fixed set of named fields (problem, audience, offer, three opening lines) which are then rendered into the brief artefact. The constant schema enables cross-campaign comparison at the team level.
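
A fixed schema of this kind might be enforced by parsing the model's output into a typed record and rejecting anything that deviates. The sketch assumes the model returns a JSON object; `CampaignBrief` and `parse_brief` are hypothetical names. The fields follow the list in this step; the channels field mentioned in §3 is left out here and would be added the same way.

```python
import json
from dataclasses import dataclass

TEXT_FIELDS = ("problem", "audience", "offer")
REQUIRED_FIELDS = TEXT_FIELDS + ("opening_lines",)


@dataclass(frozen=True)
class CampaignBrief:
    problem: str
    audience: str
    offer: str
    opening_lines: tuple[str, str, str]


def parse_brief(raw_json: str) -> CampaignBrief:
    """Parse model output, raising ValueError unless it matches the schema exactly."""
    data = json.loads(raw_json)  # invalid JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("brief must be a JSON object")

    missing = [key for key in REQUIRED_FIELDS if key not in data]
    extra = [key for key in data if key not in REQUIRED_FIELDS]
    if missing or extra:
        raise ValueError(f"schema mismatch: missing={missing} extra={extra}")

    for key in TEXT_FIELDS:
        if not isinstance(data[key], str) or not data[key].strip():
            raise ValueError(f"{key} must be a non-empty string")

    lines = data["opening_lines"]
    if not isinstance(lines, list) or len(lines) != 3:
        raise ValueError("opening_lines must contain exactly three entries")
    if not all(isinstance(line, str) and line.strip() for line in lines):
        raise ValueError("every opening line must be a non-empty string")

    return CampaignBrief(data["problem"], data["audience"], data["offer"], tuple(lines))
```

Rejecting extra keys as well as missing ones is what keeps briefs comparable across campaigns: every stored record has the same shape.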
Return a ≤10-second audio summary for in-motion approval
Once the brief is rendered, a short audio summary is delivered to the founder for asynchronous approval, with no return to the desk required. Requiring desktop confirmation would consume the entire latency budget and invalidate the design.
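
Before any audio is synthesized, the summary text can be trimmed so that its spoken length is likely to stay within 10 seconds. The sketch below estimates duration from word count; the speaking rate of 2.5 words per second is an assumption, not a measured property of any voice, and `fit_summary` is a hypothetical helper.

```python
import re

MAX_SUMMARY_SECONDS = 10.0
ASSUMED_WORDS_PER_SECOND = 2.5  # assumption: roughly 150 words per minute


def estimated_seconds(text: str) -> float:
    """Rough spoken duration of the text at the assumed speaking rate."""
    return len(text.split()) / ASSUMED_WORDS_PER_SECOND


def fit_summary(text: str, max_seconds: float = MAX_SUMMARY_SECONDS) -> str:
    """Keep as many whole leading sentences as fit within the time limit."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    kept: list[str] = []
    for sentence in sentences:
        candidate = " ".join(kept + [sentence])
        if estimated_seconds(candidate) > max_seconds:
            break
        kept.append(sentence)
    if kept:
        return " ".join(kept)
    # The first sentence alone is too long: fall back to a hard word cut.
    max_words = int(max_seconds * ASSUMED_WORDS_PER_SECOND)
    return " ".join(text.split()[:max_words])
```

The real duration depends on the synthesized voice, so a production version would presumably also measure the returned audio rather than rely on the estimate alone.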

§5 · Results

Fig. Today's brief loop vs. the proposed one

1. Idea on a walk (today: it dies here)
2. Open a template (20 min of friction)
3. Write a brief (or don't)
4. Hand to AI agent (days later)

The status quo is the reason most campaigns never ship.

90s · max voice note
<30s · target turnaround
3 · opening lines out
0 · lines of code yet

No implementation as of this revision - this is a design dossier, not an executed experiment.
Nearest comparators (voice-to-Notion tooling) terminate at transcription and omit the synthesis step, which carries the decisive value.
Proposed pilot: n=10 founders × 2 weeks, with the primary outcome being incremental briefs shipped vs. their pre-pilot baseline.
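
The primary outcome reduces to a per-founder difference. A minimal sketch, assuming each founder's baseline has already been expressed as briefs shipped over a comparable two-week window; the function names and input shape are hypothetical, and no pilot data exists yet.

```python
from statistics import mean


def incremental_briefs(
    baseline: dict[str, float], pilot: dict[str, float]
) -> dict[str, float]:
    """Briefs shipped during the pilot minus the pre-pilot baseline, per founder."""
    if baseline.keys() != pilot.keys():
        raise ValueError("baseline and pilot must cover the same founders")
    return {founder: pilot[founder] - baseline[founder] for founder in baseline}


def mean_increment(baseline: dict[str, float], pilot: dict[str, float]) -> float:
    """Average incremental briefs shipped across all founders in the pilot."""
    return mean(list(incremental_briefs(baseline, pilot).values()))
```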

§6 · Discussion

The dominant failure mode to monitor is over-synthesis: the agent produces a fluent brief decoupled from the founder's actual content. The audio return-channel is the lowest-cost mechanism for catching this prior to campaign launch. At the team level the upside is a continuous supply of founder-voiced briefs without scheduled extraction meetings.
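
An automated check could complement the audio return-channel by flagging briefs whose vocabulary has drifted away from the transcript. The sketch below is a crude lexical proxy, not a measure of meaning: it reports the share of content words in the brief that also occur in the transcript. The stopword list and any alert threshold are assumptions.

```python
import re

# Small illustrative stopword list; a real one would be longer.
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "to", "of", "for", "in", "on",
     "we", "our", "is", "are", "that", "with", "it"}
)


def content_words(text: str) -> list[str]:
    """Lowercased word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9']+", text.casefold()) if w not in STOPWORDS]


def grounding_ratio(brief_text: str, transcript: str) -> float:
    """Share of the brief's content words that also appear in the transcript."""
    words = content_words(brief_text)
    if not words:
        return 0.0  # an empty brief has no grounded content
    vocabulary = set(content_words(transcript))
    return sum(word in vocabulary for word in words) / len(words)
```

A low ratio does not prove over-synthesis, and a high one does not rule it out; it only indicates which briefs deserve a closer listen.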

§7 · Reproduce it yourself

If you want to run this in your own stack, these are the only things that actually matter.

Retain the verbatim transcript
Founder trust in the brief is higher when the underlying verbatim transcript is visible beneath it. The transcript functions as evidentiary substrate.
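
A sketch of a plain-text rendering that keeps the transcript visible beneath the brief. The layout, the section labels and the `render_brief` helper are illustrative assumptions; the only point carried over from the text is that the transcript is appended unedited.

```python
def render_brief(
    fields: dict[str, str], opening_lines: list[str], transcript: str
) -> str:
    """Render the brief fields first, then the untouched transcript underneath."""
    lines = ["CAMPAIGN BRIEF", ""]
    for name, value in fields.items():
        lines.append(f"{name.capitalize()}: {value}")
    lines += ["", "Opening lines:"]
    lines.extend(f"  {i}. {text}" for i, text in enumerate(opening_lines, start=1))
    # The transcript is appended as-is so the founder can check every claim against it.
    lines += ["", "VERBATIM TRANSCRIPT", "", transcript]
    return "\n".join(lines)
```
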
Fix the brief schema before prompting
Allowing the model to determine brief structure produces incompatible artefacts across runs and precludes cross-campaign comparison. The schema must be specified ex ante.
Ship the audio return-channel from v1
The product's value proposition is in-motion launch. Omitting the audio summary forces desktop return and collapses the system to the status quo.

