How to Transcribe an Interview Recording: A Journalist's Workflow Guide

Learn how to transcribe an interview recording fast and accurately. Step-by-step journalist workflow: recording formats, AI upload, speaker labels, and exporting quotes.

Fran Conejos
6 min · Journalism & Interviews

If you're a journalist, researcher, or podcaster, transcribing interview recordings is part of your core workflow. Manual transcription takes 4-6 hours per hour of audio. AI transcription of the same hour takes a few minutes.

This guide covers the full workflow: recording formats, uploading, speaker identification, and exporting clean quotes for your story.

Step 1: Record in the Right Format

The quality of your transcript starts with the quality of your recording. Here's what works:

Best recording setups for transcription accuracy:

  • Dedicated recorder (Zoom H1n, Tascam DR-05): Records as WAV or MP3 at 44.1 kHz, which gives excellent transcription accuracy
  • iPhone voice memos: Records as M4A — fully compatible with AI tools
  • Phone call recordings: Variable quality. Use apps like TapeACall (iOS) or ACR (Android) for clearer audio
  • Zoom/Teams recordings: Download the MP4 or M4A file directly from the platform

Avoid: Speakerphone recordings, Bluetooth headset audio, or recordings in loud environments (cafes, street). Background noise is the #1 cause of transcription errors.
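
A quick pre-upload check can catch a wrong file type or an unexpectedly low sample rate before you spend time on a transcript. The sketch below is illustrative only: check_recording is a made-up helper name, the accepted extensions are taken from the upload step in this guide, and Python's standard wave module reads only uncompressed WAV files, so MP3, M4A, and MP4 files get a format check but no sample-rate reading.

```python
import wave
from pathlib import Path

# Formats the upload step in this guide lists as accepted.
ACCEPTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".mp4"}


def check_recording(path: str) -> dict:
    """Return basic facts about a recording before uploading it.

    Hypothetical helper for illustration; not part of any transcription tool.
    """
    file = Path(path)
    suffix = file.suffix.lower()
    info = {
        "accepted_format": suffix in ACCEPTED_EXTENSIONS,
        "sample_rate_hz": None,
        "duration_s": None,
    }
    # The standard-library wave module only understands uncompressed WAV,
    # so the sample rate and duration stay None for every other format.
    if suffix == ".wav" and file.exists():
        with wave.open(str(file), "rb") as wav:
            rate = wav.getframerate()
            info["sample_rate_hz"] = rate
            info["duration_s"] = wav.getnframes() / rate
    return info
```

For a recorder set to 44.1 kHz WAV, sample_rate_hz should come back as 44100. A much lower value may mean the device was left in a low-quality voice mode.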

Step 2: Upload Your Interview to MP3toTXT

  1. Go to mp3totxt.com
  2. Drag your audio file into the upload area (MP3, WAV, M4A, MP4 all accepted)
  3. Select the interview language
  4. Enable Speaker Labels — this is critical for interview work. The AI will tag each line as "Speaker A" (typically the interviewer) or "Speaker B" (your source)
  5. Click Transcribe

A 60-minute interview processes in roughly 5-8 minutes.

Step 3: Review Speaker Attribution

Speaker labels work at 90-95% accuracy for clear, two-person interviews. After the transcript loads:

  • Rename "Speaker A" to the actual name (e.g., "Maria Garcia")
  • Review sections where speakers overlapped or interrupted each other
  • Check the first 2 minutes carefully — the AI sets baseline speaker profiles from the opening

For group interviews (3+ speakers), accuracy on speaker attribution drops slightly. Plan for more review time.
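
If you prefer to rename speakers in the exported text rather than in the browser, a short script can do it in one pass. This is a minimal sketch that assumes each line of the export starts with a label such as "Speaker A:". The actual export layout may differ, and rename_speakers is an illustrative name, not a feature of the tool.

```python
import re


def rename_speakers(transcript: str, names: dict[str, str]) -> str:
    """Replace generic labels at the start of each line with real names.

    Assumes lines look like "Speaker A: text"; adjust the pattern if your
    export uses a different layout.
    """
    def swap(match: re.Match) -> str:
        label = match.group(1)
        # Labels without a mapping (e.g. an unexpected "Speaker C") are kept.
        return names.get(label, label) + ":"

    # Anchoring at the start of the line leaves the words "Speaker A"
    # untouched if someone says them mid-sentence.
    return re.sub(r"^(Speaker [A-Z]):", swap, transcript, flags=re.MULTILINE)


raw = "Speaker A: How did the project start?\nSpeaker B: It started in 2019."
print(rename_speakers(raw, {"Speaker A": "Interviewer", "Speaker B": "Maria Garcia"}))
```

Renaming only fixes the labels. Lines attributed to the wrong person still need the manual review described above.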

Step 4: Extract Quotes for Your Story

This is where the workflow pays off. Instead of scrubbing through audio, you can:

  1. Use Ctrl+F / Cmd+F to search for keywords in the transcript
  2. Copy exact quotes with confidence — timestamps let you verify against the original if needed
  3. Highlight key passages in your text editor before writing

Best practice for quotes: Always verify a quote against the original audio before publication. AI transcription is highly accurate, but errors on names, numbers, and technical terms can occur.
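
The keyword search in step 1 can also be scripted, which helps when you want every mention of a topic along with the timestamp to check against the audio. The sketch below assumes a hypothetical line layout of "[HH:MM:SS] Speaker: text". The real export format may differ, so treat the pattern as something to adapt.

```python
import re
from typing import NamedTuple

# Assumed layout: "[00:01:10] Speaker B: The budget was cut."
LINE_PATTERN = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*([^:]+):\s*(.*)$")


class Hit(NamedTuple):
    seconds: int   # offset into the recording, for jumping to the audio
    speaker: str
    text: str


def find_quotes(transcript: str, keyword: str) -> list[Hit]:
    """Return every transcript line whose text contains the keyword."""
    hits = []
    for line in transcript.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        hours, minutes, secs, speaker, text = match.groups()
        if keyword.lower() in text.lower():
            offset = int(hours) * 3600 + int(minutes) * 60 + int(secs)
            hits.append(Hit(offset, speaker.strip(), text))
    return hits
```

Each hit carries the offset in seconds, so you can seek straight to that moment in your audio player and confirm the wording before quoting it.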

Step 5: Export and Archive

Download the full transcript as .txt. Store it alongside your original audio file in your project folder. Many journalists name files by date and subject: 2026-03-11_garcia-interview.txt.
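
The date-and-subject naming convention is easy to generate consistently. A minimal sketch, where archive_name is an illustrative helper and the accent-stripping step is an added assumption so that names such as "García" produce plain ASCII file names:

```python
import re
import unicodedata
from datetime import date


def archive_name(interview_date: date, subject: str, extension: str = "txt") -> str:
    """Build a file name like 2026-03-11_garcia-interview.txt."""
    # Drop accents so the name stays portable across file systems.
    ascii_subject = (
        unicodedata.normalize("NFKD", subject)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, then collapse every run of other characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_subject.lower()).strip("-")
    return f"{interview_date.isoformat()}_{slug}.{extension}"


print(archive_name(date(2026, 3, 11), "Garcia interview"))
```

Calling the same function with extension="wav" gives the audio file a matching name, which keeps the transcript and the recording side by side when the folder is sorted.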

Why archive transcripts?

  • Legal protection: A verbatim record is your evidence of what was said
  • Future reference: Sources often become relevant again months later
  • Fact-checking: Editors can verify quotes against the transcript without re-listening

Handling Sensitive Interviews

For sources who spoke on background or off-the-record:

  • Do NOT upload sensitive audio to any cloud service without consent
  • For confidential interviews, consider using local transcription tools (the open-source OpenAI Whisper model can run offline on your own machine)
  • Always follow your publication's data security policy
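
For the local option, a minimal sketch using the open-source openai-whisper Python package is shown below. It assumes the package and ffmpeg are installed. The model weights are downloaded the first time a model is loaded, after which transcription runs on your machine. Whisper on its own does not label speakers, so attribution has to be added by hand or with a separate tool. The function names transcribe_locally and format_segments are illustrative.

```python
def transcribe_locally(audio_path: str, model_name: str = "base") -> list[dict]:
    """Transcribe a file on this machine with the open-source Whisper package."""
    # Imported here so the formatting helper below works even on a machine
    # where the package is not installed (pip install openai-whisper).
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    # Each segment is a dict that includes "start", "end", and "text".
    return result["segments"]


def format_segments(segments: list[dict]) -> str:
    """Turn Whisper-style segments into "[HH:MM:SS] text" lines."""
    lines = []
    for segment in segments:
        total = int(segment["start"])
        hours, remainder = divmod(total, 3600)
        minutes, seconds = divmod(remainder, 60)
        lines.append(f"[{hours:02d}:{minutes:02d}:{seconds:02d}] {segment['text'].strip()}")
    return "\n".join(lines)
```

A typical call would be format_segments(transcribe_locally("interview.wav")), where the file name is a placeholder. Larger models are generally more accurate but slower, so it is worth testing on a short clip first.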

Common Interview Transcription Mistakes

Mistake 1: Publishing the raw AI transcript
Always edit. Remove filler words ("um", "you know"), fix proper nouns, and add punctuation where the AI missed it.

Mistake 2: Not enabling speaker labels
Without speaker labels (diarization), a two-person interview becomes a wall of unmarked text. Always enable them.

Mistake 3: Ignoring timestamps
Even if you don't use them in the final transcript, timestamps let you jump back to exact moments in the audio for verification.
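
The filler-word cleanup from Mistake 1 can be given a first pass with a script before you edit by hand. This is a deliberately conservative sketch: it removes "um" and "uh" anywhere, but removes "you know" only when it is set off by commas, so a genuine question like "Did you know?" is left alone. The patterns and the name strip_fillers are illustrative choices, and the output still needs a human read.

```python
import re

# "um"/"uh" (with any number of trailing letters repeated), plus the commas
# that usually surround them in a transcript.
UM_UH = re.compile(r",?\s*\b(?:um+|uh+)\b,?", re.IGNORECASE)
# "you know" counts as filler here only when it is wrapped in commas.
YOU_KNOW = re.compile(r",\s*you know\s*,", re.IGNORECASE)


def strip_fillers(text: str) -> str:
    """Remove common filler words; a first pass, not a substitute for editing."""
    cleaned = UM_UH.sub("", text)
    cleaned = YOU_KNOW.sub("", cleaned)
    # Collapse any double spaces left behind by the removals.
    return re.sub(r"[ \t]{2,}", " ", cleaned).strip()


print(strip_fillers("I think, um, the budget was, uh, too small."))
```

The script does not restore capitalization when a sentence began with a filler, and whether fillers may be removed from direct quotes at all depends on your publication's style rules.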

Transcription Accuracy Benchmarks for Journalism

  • Studio-quality interview (quiet room, good mic): 95-98% accuracy
  • Phone interview: 88-93% accuracy
  • Noisy environment: 78-85% accuracy (consider re-recording if accuracy is critical)
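
The guide does not say how these percentages were measured. One common convention, assumed here, is word accuracy = 1 - WER, where the word error rate (WER) is the number of word substitutions, insertions, and deletions divided by the number of words in a hand-checked reference. The sketch below lets you estimate accuracy on a short sample of your own recordings. It ignores punctuation handling and other normalization that published benchmarks usually apply.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Classic dynamic-programming edit distance, computed one row at a time.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = previous[j] + 1      # reference word missing from output
            insertion = current[j - 1] + 1  # extra word in the output
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


reference = "the budget was cut by twelve percent"
hypothesis = "the budget was cut by twenty percent"
print(f"accuracy: {1 - word_error_rate(reference, hypothesis):.1%}")
```

The example also shows why numbers deserve a second look: a single wrong word out of seven barely dents the accuracy figure, yet it changes the fact being reported.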


Fran Conejos

Founder of MP3toTXT and an expert in transcription and audio processing technologies.