How to Transcribe an Interview Recording: A Journalist's Workflow Guide
Learn how to transcribe an interview recording fast and accurately. Step-by-step journalist workflow: recording formats, AI upload, speaker labels, and exporting quotes.
If you're a journalist, researcher, or podcaster, transcribing interview recordings is part of your core workflow. Manual transcription typically takes 4-6 hours per hour of audio. AI transcription of the same hour takes minutes: roughly 5-8 for a 60-minute interview.
This guide covers the full workflow: recording formats, uploading, speaker identification, and exporting clean quotes for your story.
Step 1: Record in the Right Format
The quality of your transcript starts with the quality of your recording. Here's what works:
Best recording setups for transcription accuracy:
- Dedicated recorder (Zoom H1n, Tascam DR-05): Records as WAV or MP3 at 44.1 kHz, which gives excellent transcription accuracy
- iPhone Voice Memos: Records as M4A, which is fully compatible with AI tools
- Phone call recordings: Variable quality. Use apps like TapeACall (iOS) or ACR (Android) for clearer audio
- Zoom/Teams recordings: Download the MP4 or M4A file directly from the platform
Avoid: Speakerphone recordings, Bluetooth headset audio, and recordings made in loud environments (cafes, streets). Background noise is one of the most common causes of transcription errors.
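Before uploading, it can help to confirm what your recorder actually produced. The sketch below uses Python's standard `wave` module to read the sample rate, channel count, bit depth, and duration of a WAV file. It handles uncompressed WAV only; MP3, M4A, and MP4 need other tooling. The function name `describe_wav` is made up for this example.

```python
import wave


def describe_wav(path):
    """Return basic properties of an uncompressed WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            # getsampwidth() is in bytes per sample, so convert to bits
            "bit_depth": wav.getsampwidth() * 8,
            "duration_s": frames / rate,
        }

# Example use (the path is a placeholder):
# info = describe_wav("2026-03-11_garcia-interview.wav")
# if info["sample_rate_hz"] < 44100:
#     print("Sample rate is below 44.1 kHz; check the recorder settings")
```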
Step 2: Upload Your Interview to MP3toTXT
- Go to mp3totxt.com
- Drag your audio file into the upload area (MP3, WAV, M4A, MP4 all accepted)
- Select the interview language
- Enable Speaker Labels — this is critical for interview work. The AI will tag each line as "Speaker A" (typically the interviewer) or "Speaker B" (your source)
- Click Transcribe
A 60-minute interview processes in roughly 5-8 minutes.
Step 3: Review Speaker Attribution
Speaker labels work at 90-95% accuracy for clear, two-person interviews. After the transcript loads:
- Rename "Speaker A" to the actual name (e.g., "Maria Garcia")
- Review sections where speakers overlapped or interrupted each other
- Check the first 2 minutes carefully — the AI sets baseline speaker profiles from the opening
For group interviews (3+ speakers), accuracy on speaker attribution drops slightly. Plan for more review time.
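If you prefer to rename speakers in the exported text rather than in the browser, a small script can do it. This is a minimal sketch that assumes each transcript line starts with a label such as "Speaker A:". That layout is an assumption for illustration, not a documented MP3toTXT export format, so adjust the pattern to match your file.

```python
import re


def rename_speakers(transcript, names):
    """Replace generic labels at the start of each line with real names."""
    def swap(match):
        label = match.group(1)
        # Labels missing from the mapping are left unchanged
        return names.get(label, label) + ":"

    # Anchor to line starts so a source who says "Speaker A:" mid-sentence
    # is not rewritten.
    return re.sub(r"^(Speaker [A-Z]):", swap, transcript, flags=re.MULTILINE)


raw = "Speaker A: Thanks for joining.\nSpeaker B: Happy to be here."
print(rename_speakers(raw, {"Speaker A": "Interviewer", "Speaker B": "Maria Garcia"}))
```

Renaming does not fix misattributed lines; passages where speakers overlapped still need a manual check against the audio.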
Step 4: Extract Quotes for Your Story
This is where the workflow pays off. Instead of scrubbing through audio, you can:
- Use Ctrl+F / Cmd+F to search for keywords in the transcript
- Copy exact quotes with confidence — timestamps let you verify against the original if needed
- Highlight key passages in your text editor before writing
Best practice for quotes: Always verify a quote against the original audio before publication. AI transcription is highly accurate, but errors on names, numbers, and technical terms can occur.
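For long transcripts, the keyword search can also be scripted so that every hit comes back with its timestamp, ready for verification against the audio. The sketch below assumes lines shaped like "[00:12:34] Speaker B: text". That format is hypothetical; change the regular expression to fit your export.

```python
import re

# Assumed line layout: [HH:MM:SS] Speaker: text
LINE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s*(.+?):\s*(.*)$")


def find_quotes(transcript, keyword):
    """Return (timestamp, speaker, text) for each line that mentions keyword."""
    hits = []
    for line in transcript.splitlines():
        match = LINE.match(line)
        # Case-insensitive match on the spoken text only, not the label
        if match and keyword.lower() in match.group(3).lower():
            hits.append(match.groups())
    return hits

# Example use:
# for stamp, speaker, text in find_quotes(open("interview.txt").read(), "budget"):
#     print(stamp, speaker, text)
```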
Step 5: Export and Archive
Download the full transcript as .txt. Store it alongside your original audio file in your project folder. Many journalists name files by date and subject: 2026-03-11_garcia-interview.txt.
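The date-and-subject naming convention is easy to automate. The sketch below builds names in the same pattern as the example above and copies the audio and transcript side by side into a project folder. The function names and folder layout are illustrative choices, not part of any tool.

```python
from datetime import date
from pathlib import Path
import re
import shutil


def archive_name(interview_date, subject, extension):
    """Build a name like 2026-03-11_garcia-interview.txt."""
    # Lowercase the subject and collapse anything non-alphanumeric to hyphens
    slug = re.sub(r"[^a-z0-9]+", "-", subject.lower()).strip("-")
    return f"{interview_date.isoformat()}_{slug}.{extension.lstrip('.')}"


def archive(audio_path, transcript_path, project_dir, interview_date, subject):
    """Copy the audio and its transcript into project_dir under matching names."""
    project = Path(project_dir)
    project.mkdir(parents=True, exist_ok=True)
    copies = []
    for src in (Path(audio_path), Path(transcript_path)):
        dest = project / archive_name(interview_date, subject, src.suffix)
        shutil.copy2(src, dest)  # copy2 also preserves file timestamps
        copies.append(dest)
    return copies


print(archive_name(date(2026, 3, 11), "Garcia interview", "txt"))
```

Because both files share everything except the extension, the transcript and its source audio sort next to each other in the folder.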
Why archive transcripts?
- Legal protection: A verbatim record is your evidence of what was said
- Future reference: Sources often become relevant again months later
- Fact-checking: Editors can verify quotes against the transcript without re-listening
Handling Sensitive Interviews
For sources who spoke on background or off-the-record:
- Do NOT upload sensitive audio to any cloud service without consent
- For confidential interviews, consider using local transcription tools (OpenAI Whisper runs offline)
- Always follow your publication's data security policy
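As a rough illustration of the local option, the sketch below uses the open-source `openai-whisper` Python package. It assumes the package and `ffmpeg` are installed, and that the model weights were downloaded beforehand, since the first `load_model` call fetches them over the network. Whisper on its own does not label speakers, so the output carries timestamps only. The helper names are made up for this example.

```python
def format_segments(segments):
    """Turn Whisper-style segments into '[HH:MM:SS] text' lines."""
    lines = []
    for seg in segments:
        total = int(seg["start"])  # segment start time in whole seconds
        hours, rem = divmod(total, 3600)
        minutes, seconds = divmod(rem, 60)
        lines.append(f"[{hours:02d}:{minutes:02d}:{seconds:02d}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe_locally(audio_path, model_name="base"):
    """Transcribe on this machine; the audio is not sent to a cloud service."""
    # Imported here so format_segments works without the package installed.
    import whisper  # pip install openai-whisper (requires ffmpeg)

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return format_segments(result["segments"])

# Example use (the path is a placeholder):
# print(transcribe_locally("confidential-interview.m4a"))
```

Larger models are generally more accurate but slower; which size is practical depends on your hardware.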
Common Interview Transcription Mistakes
Mistake 1: Publishing the raw AI transcript. Always edit. Remove filler words ("um", "you know"), fix proper nouns, and add punctuation where the AI missed it.
Mistake 2: Not enabling speaker labels. Without diarization (the step that separates who said what), a two-person interview becomes a wall of unmarked text. Always enable it.
Mistake 3: Ignoring timestamps. Even if you don't use them in the final transcript, timestamps let you jump back to exact moments in the audio for verification.
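A first pass at the filler cleanup from Mistake 1 can be scripted. This is a deliberately conservative sketch: it strips "um" and "uh" anywhere, but removes "you know" only when it is set off by commas, so a genuine question like "Do you know why?" stays intact. The pattern is an illustrative choice, and the result still needs a human read.

```python
import re

# "um"/"uh" anywhere; "you know" only when comma-delimited (likely filler)
FILLER = re.compile(r",?\s*\b(?:um+|uh+)\b,?|,\s*you know\s*,", re.IGNORECASE)


def strip_fillers(text):
    """Remove common filler words and tidy the spacing left behind."""
    cleaned = FILLER.sub("", text)
    cleaned = re.sub(r"\s{2,}", " ", cleaned).strip()
    # Restore the capital if a filler was removed from the very start
    return cleaned[:1].upper() + cleaned[1:]


print(strip_fillers("Um, we, you know, cut the budget, uh, last year. Do you know why?"))
```

Run this on a working copy, not the archived verbatim transcript, so the original wording remains available for checking quotes.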
Transcription Accuracy Benchmarks for Journalism
- Studio-quality interview (quiet room, good mic): 95-98% accuracy
- Phone interview: 88-93% accuracy
- Noisy environment: 78-85% accuracy (consider re-recording if accuracy is critical)
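To check figures like these against your own recordings, one common approach is word error rate (WER): hand-correct a short passage, then count the word-level substitutions, insertions, and deletions in the AI output. Accuracy is then roughly 1 minus WER. The guide does not say how its percentages were measured, so treating them as word-level accuracy is an assumption. A minimal sketch:

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # Levenshtein distance over words, keeping one row at a time
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


reference = "the budget was cut by twelve percent last year"
ai_output = "the budget was cut by twelve percent last ear"
print(f"accuracy: {1 - word_error_rate(reference, ai_output):.1%}")
```

Words are split on whitespace only, so punctuation differences count as errors; strip punctuation from both texts first if you want to ignore them.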
Transcribe your interview recording
Speaker labels, timestamps, 30+ languages. Free to start.
Try MP3toTXT Free
Fran Conejos
Founder of MP3toTXT and an expert in transcription and audio-processing technologies.