
Speaker labels in transcription, explained

Guides · Updated June 2026

A transcript without speaker labels is a wall of text where you cannot tell who said what. Speaker labels fix that, turning the raw words into a readable conversation. Here is what they are, how the software figures out who is talking, and the easiest way to get a speaker-labeled transcript.


What is a speaker label?

A speaker label is the tag that marks who said each line, like Speaker 1, Speaker 2, or a person's name. Compare these two transcripts of the same exchange:

so are we agreed on Friday yes Friday works can you send the deck I will send it tonight

...versus the same audio with speaker labels:

Maria: So are we agreed on Friday?
James: Yes, Friday works. Can you send the deck?
Maria: I will send it tonight.

Same words, completely different usefulness. That is what speaker labels buy you.

How speaker identification works (diarization)

The technical name is speaker diarization. The software analyzes the audio, detects how many distinct voices are present, and groups each segment by who spoke, producing the generic Speaker 1 / Speaker 2 tags. You then rename those to real names. It is a separate step from transcription itself: transcription turns sound into words, diarization decides who owns each stretch of words.
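To make the split between the two steps concrete, here is a minimal sketch in Python of how their outputs could be combined. It assumes a transcriber that returns words with start and end times and a diarizer that returns speaker turns with start and end times. The Word and Turn types, the function names, and the sample timings are all hypothetical, made up for illustration; they are not the output format of any particular product.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One transcribed word with its time span in seconds (hypothetical shape)."""
    text: str
    start: float
    end: float


@dataclass
class Turn:
    """One diarization turn: a generic speaker tag and its time span (hypothetical shape)."""
    speaker: str
    start: float
    end: float


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time spans (0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(words, turns):
    """Pair each word with the turn it overlaps most. Assumes turns is non-empty."""
    labeled = []
    for w in words:
        best = max(turns, key=lambda t: overlap(w.start, w.end, t.start, t.end))
        labeled.append((best.speaker, w.text))
    return labeled


def group_by_speaker(labeled):
    """Merge consecutive words from the same speaker into one line."""
    lines = []
    for speaker, text in labeled:
        if lines and lines[-1][0] == speaker:
            lines[-1][1].append(text)
        else:
            lines.append((speaker, [text]))
    return [(speaker, " ".join(parts)) for speaker, parts in lines]


# Made-up timings for the first part of the example exchange.
words = [
    Word("so", 0.0, 0.2), Word("are", 0.2, 0.4), Word("we", 0.4, 0.5),
    Word("agreed", 0.5, 0.9), Word("on", 0.9, 1.0), Word("Friday", 1.0, 1.5),
    Word("yes", 2.0, 2.3), Word("Friday", 2.3, 2.7), Word("works", 2.7, 3.1),
]
turns = [Turn("Speaker 1", 0.0, 1.8), Turn("Speaker 2", 1.9, 3.5)]

result = group_by_speaker(assign_speakers(words, turns))
for speaker, text in result:
    print(f"{speaker}: {text}")
```

Matching by largest time overlap is only one simple way to join the two outputs. Real systems also have to handle overlapping speech and words that fall between turns, which this sketch ignores.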

How to get a transcript with speaker labels

Most enterprise speech-to-text services support diarization, but they are built for developers. For an everyday conversation, the simplest path is an app that does it automatically. With Attesta on iPhone you just record:

  1. Tap record (an audible tone tells the room it has started).
  2. Have your conversation, then tap stop.
  3. Get a transcript that is already split by speaker, plus a summary, with no setup.
  4. Rename Speaker 1 / Speaker 2 to real names if you want.

Tips for accurate speaker labels


Frequently asked questions

What is a speaker label?

The tag that marks who said each line in a transcript, like Speaker 1, Speaker 2, or a name. It turns a wall of text into a readable back-and-forth.

How do you identify speakers in transcription?

Software uses speaker diarization to tell voices apart and group each segment by who spoke; you then rename the generic tags. Clear audio and distinct voices make it far more accurate.

What is the proper format for a speaker label?

Name or tag, then a colon, then the words, with a new line each time the speaker changes (for example, "Maria: Let's ship on Friday."). Timestamps at each turn are common for interviews and meetings.
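As an illustration of that format, here is a minimal sketch in Python that renders diarized segments as labeled lines. The segment shape (tag, start time in seconds, text), the function names, and the [MM:SS] timestamp style are assumptions chosen for this example, not a standard or any tool's actual output.

```python
def format_timestamp(seconds):
    """Render a time in seconds as MM:SS, dropping fractions of a second."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def format_transcript(segments, names=None, timestamps=False):
    """Render (tag, start_seconds, text) segments as 'Label: words' lines.

    A new line starts only when the speaker changes. `names` optionally maps
    generic tags such as 'Speaker 1' to real names; unmapped tags are kept.
    """
    names = names or {}
    lines = []
    previous_tag = None
    for tag, start, text in segments:
        if tag == previous_tag:
            # Same speaker is still talking: extend the current line.
            lines[-1] += " " + text
            continue
        label = names.get(tag, tag)
        prefix = f"[{format_timestamp(start)}] " if timestamps else ""
        lines.append(f"{prefix}{label}: {text}")
        previous_tag = tag
    return "\n".join(lines)


# Made-up segments based on the example exchange in this guide.
segments = [
    ("Speaker 1", 0.0, "So are we agreed on Friday?"),
    ("Speaker 2", 4.2, "Yes, Friday works."),
    ("Speaker 2", 6.0, "Can you send the deck?"),
    ("Speaker 1", 65.5, "I will send it tonight."),
]
names = {"Speaker 1": "Maria", "Speaker 2": "James"}

print(format_transcript(segments, names))
print(format_transcript(segments, names, timestamps=True))
```

Renaming is kept as a lookup applied at display time, so the generic tags stay in the underlying data and a name can be corrected later without touching the segments.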

How do I get a transcript with speaker labels?

Use a tool that supports diarization. Attesta does it automatically on iPhone: record the conversation and you get a speaker-labeled transcript plus a summary, then rename speakers if you want.
