Add support to send voice messages #10230

blord0 · 2025-07-16T19:10:30Z

Summary

Adds support for voice messages.
closes #10165

Checklist

If code changes were made then they have been tested.
- I have updated the documentation to reflect the changes.
This PR fixes an issue.
This PR adds something new (e.g. new method or parameters).
This PR is a breaking change (e.g. methods or parameters removed/renamed)
This PR is not a code change (e.g. documentation, README, ...)

DA-344 · 2025-07-16T19:29:39Z

This is discussed here, and iirc, this was not planned at the moment, might be wrong though, worth checking: https://discord.com/channels/336642139381301249/1100398536920662117

AbstractUmbra

Was the code in this PR AI generated or something? Frankly I'd refuse to believe otherwise, due to seeing 3 different conventions/styles as well as a mish-mash of logic that doesn't really make sense.

I appreciate it's still in draft, but even so.

discord/http.py

blord0 · 2025-07-16T21:22:59Z

> This is discussed here, and iirc, this was not planned at the moment, might be wrong though, worth checking: https://discord.com/channels/336642139381301249/1100398536920662117

Discord has added the "IS_VOICE_MESSAGE" message flag to their official docs, so I think we're good? https://discord.com/developers/docs/resources/message#message-object-message-flags

blord0 · 2025-07-16T21:29:46Z

> Was the code in this PR AI generated or something? Frankly I'd refuse to believe otherwise, due to seeing 3 different conventions/styles as well as a mish-mash of logic that doesn't really make sense.
>
> I appreciate it's still in draft, but even so.

A bit of AI was used as I was struggling to get it all working, e.g. where I imported requests so that I could get actual values from the requests.
I will use the actual libraries and clean it all up though (including the horrific amount of print statements I've got to remove).

AbstractUmbra · 2025-07-16T21:35:15Z

Pull requests that use AI generated code do not get accepted due to the possible licensing implications.
Not to mention the quality of AI-generated code being abysmal.

blord0 · 2025-07-16T21:41:56Z

> Pull requests that use AI generated code do not get accepted due to the possible licensing implications. Not to mention the quality of AI-generated code being abysmal.

The only AI part is the part where requests is imported and used. That was never intended for the final merge.
All other abysmal code/syntax/logic was just me doing patchwork to get it working and returning some values.

AbstractUmbra · 2025-07-16T21:43:33Z

Additionally I'm not sure where the library stands on sending "fake" data, we don't do it anywhere else so sending a fake waveform and fake size likely won't work.

blord0 · 2025-07-16T21:52:52Z

> Additionally I'm not sure where the library stands on sending "fake" data, we don't do it anywhere else so sending a fake waveform and fake size likely won't work.

The fake size is a placeholder (as commented) so I can send stuff while testing; I do plan on adding actual code to the .size() method for the final version.

The waveform data is only used as a visual preview for how the message looks. I do plan on looking into whether I can get the real waveform data, but the only code I've seen that sends a voice message (https://gist.github.com/max-4-3/6704422045b1c21a4a3e9226fca38f68) does the same thing that I am currently doing.
It isn't checked by Discord, so if there is no method of getting it (without external tools like ffmpeg), the message won't fail to send. That is also the reason I allow users to pass in their own waveform right now.
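For reference, a minimal sketch of the placeholder approach described above, assuming the waveform is sent as a base64 string of byte values (0-255). The helper name `placeholder_waveform` and the default of 256 points are illustrative only and not part of the library.

```python
import base64
import os


def placeholder_waveform(points: int = 256) -> str:
    """Return a base64 string of `points` random bytes (hypothetical helper).

    This only produces a visual placeholder; it does not reflect the audio.
    """
    if points <= 0:
        raise ValueError("points must be positive")
    return base64.b64encode(os.urandom(points)).decode("utf-8")


waveform = placeholder_waveform()
```

Because the bytes are random, the preview shown in the client would bear no relation to the actual audio, which is the concern raised in the review.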

blord0

Found a much simpler way of sending voice messages while looking through https://discord.com/channels/336642139381301249/1100398536920662117

discord/abc.py

discord/file.py

blord0 · 2025-07-17T12:26:56Z

Example code for sending a voice message is

  channel = await bot.fetch_channel(xxx)
  file = discord.VoiceMessageFile("test.mp3", duration=21)
  await channel.send(file=file, voice=True)

AbstractUmbra

I wouldn't make a whole new File class, and instead have voice be an init parameter or a generated one (i.e. if duration is provided, or similar) of File and use this as needed.
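A rough sketch of that suggestion, assuming `voice` is derived from whether `duration` was provided. `FileSketch` and `message_flags` are hypothetical stand-ins for illustration, not the real `discord.File` API; the flag value `1 << 13` is taken from Discord's documented IS_VOICE_MESSAGE flag.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import List, Optional

# IS_VOICE_MESSAGE as listed in Discord's message flags documentation.
IS_VOICE_MESSAGE = 1 << 13


@dataclass
class FileSketch:
    """Hypothetical stand-in for File; only the fields needed here."""

    filename: str
    duration: Optional[float] = None

    @property
    def voice(self) -> bool:
        # Treat the file as a voice message when a duration was supplied.
        return self.duration is not None


def message_flags(files: List[FileSketch]) -> int:
    """Return the flag bits a send call could set for these files."""
    return IS_VOICE_MESSAGE if any(f.voice for f in files) else 0
```

With this shape the send site needs no extra `voice=True` argument; the flag follows from the attachment itself.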

discord/http.py

Rapptz

Thanks for the PR.

A few nits though:

1. Instead of passing voice=True at the send site, it should be at the File site instead because that's where it logically is. A voice message is nothing more than a fancy attachment that toggles a flag.
2. You should check whether multiple embedded audio attachments are supported per message.
3. You're missing .. versionadded:: 2.6 on the new File properties.
4. You're missing documentation for duration (which you typo'd as duation a few times).
5. The waveform should be generated properly.
6. More documentation is needed for this in general, such as what audio format is expected to be uploaded for it to be a voice message etc.

discord/abc.py

discord/file.py

blord0 · 2025-07-28T00:37:33Z

Thanks, didn't catch the typos there. Also added a File.voice bool for checking whether the file is a voice message.
If you try to add multiple voice messages to a message, Discord throws a 400 Bad Request (error code: 50160): Voice messages must have a single audio attachment

For points 5 and 6, Discord accepts a wide range of audio formats. I found it to work with ogg, mp3, wav, aac and flac. It doesn't work with .mp4.
Each file format would need its own way to calculate the waveform data. I can find ways to calculate waveform data for each of the types above, but:

- the list above is non-exhaustive
- it would require adding more external dependencies to the project (not sure how you feel about that)
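Regarding the 400 error above, a client-side check could fail early instead of waiting for the API to reject the request. This is only a sketch: `check_voice_attachments` is a hypothetical helper, and it rejects just the case reported here (more than one voice attachment); whether Discord also rejects a voice attachment mixed with ordinary files is not established in this thread.

```python
from typing import Sequence


def check_voice_attachments(voice_flags: Sequence[bool]) -> None:
    """Raise if more than one attachment is marked as a voice message.

    voice_flags[i] is True when the i-th attachment is a voice message.
    """
    if sum(1 for flag in voice_flags if flag) > 1:
        raise ValueError("Voice messages must have a single audio attachment")
```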

Rapptz · 2025-07-28T01:50:10Z

Looking at the documentation you can probably make a valid waveform for the .ogg ones using the facilities in the library, if possible, otherwise you can fallback to the random waveform.

blord0 · 2025-07-28T12:04:23Z

From what I can see, the oggparse and opus libraries can only be used for voice data, and fail when used on general Opus files. Even a simple script like this fails with a corrupted stream error. I downloaded my own voice-message.ogg from Discord and checked it with ffprobe to make sure it was a valid ogg file.

from discord.oggparse import OggStream
from discord.opus import Decoder

filename = "voice-message.ogg"
f = open(filename, 'rb')
ogg = OggStream(f)
decoder = Decoder()
for packet in ogg.iter_packets():
  decoded = decoder.decode(packet) # discord.opus.OpusError: corrupted stream
  print(decoded)

I looked into another library called soundfile, and it seems to be able to generate waveforms for ogg, mp3 and wav.
Could we implement something similar to PyNaCl, where you don't have to install it, but it is required to generate the real waveform data rather than falling back to a random waveform?

AbstractUmbra · 2025-07-28T13:05:01Z

soundfile is a rather large dependency, since it includes numpy and a few others. Each wheel is around 1 MiB. I personally think that's a bit overkill.

I've discussed this waveform idea in depth and it should be possible using the opus/ogg items the library provides (I will do some testing when I have the time, soon) and generating the correct and real waveform should be a fast process.

However, this does mean we'll need to accept only OGG formatted files, it will be on the end user to do the necessary conversion somehow, unless we do decide to go in depth.

blord0 · 2025-07-28T15:54:47Z

Just had a deeper look into what each packet from my code example above has in it, and seems like the first 2 are metadata. If I just skip those 2 packets then decoder works fine all the way through.
Going to have a look around to see how to turn that data into waveform stuff

from discord.oggparse import OggStream
from discord.opus import Decoder

filename = "voice-message.ogg"
f = open(filename, 'rb')
ogg = OggStream(f)
decoder = Decoder()
count = 0
for packet in ogg.iter_packets():
    if count < 2:
        count += 1
        continue
    decoded: bytes = decoder.decode(packet) # Gives actual bytes now
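One possible next step, sketched under the assumption that each decoded packet is signed 16-bit little-endian PCM: reduce every packet to a single loudness value. `packet_level` is a hypothetical helper using only the standard library, not something the library provides.

```python
import struct


def packet_level(pcm: bytes) -> int:
    """Mean absolute amplitude of signed 16-bit little-endian PCM samples.

    Assumes len(pcm) is a multiple of 2; returns 0 for an empty packet.
    """
    samples = [abs(sample) for (sample,) in struct.iter_unpack("<h", pcm)]
    return sum(samples) // len(samples) if samples else 0
```

Collecting `packet_level(decoded)` for each packet in the loop above would give a list of ints that can then be downsampled into waveform points.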

scarletcafe

My personal thoughts, I think this is a bit of a tough one to approach.

scarletcafe · 2025-07-28T20:34:58Z

discord/file.py

+            try:
+                self._waveform = self.generate_waveform()
+            except Exception:
+                self._waveform = base64.b64encode(os.urandom(256)).decode('utf-8')


I'm not sure if I'm a big fan of this caveat, especially not as a catch-all except case. I think us handling audio extraction for anything outside of opus 'in house' is pretty far out of scope, but this feels like a design 'lock-in' that could prevent people who have the means to do the waveform generation themselves from doing so. Maybe the waveform property could have a setter, or maybe there could be a way to construct a File (or a specialized subclass) with a waveform provided e.g. File.voice_message_with_waveform(data, waveform).

You can actually pass in your own waveform when creating a file
file = discord.File('voice-message.ogg', duration=5.0, waveform='AAAAA...')

Hmm, that makes this less of an issue, but I still think we should avoid generating a fake waveform if we can, maybe we could just let the exception be raised instead or split generation and non-generation into classmethod-based signatures like suggested.

The problem is that if someone passes in an mp3 file, we still need to generate a waveform. I don't see the point in adding extra steps for the user to generate a waveform dependent on whether they have the correct audio type, as that could be confusing.

discord/file.py

scarletcafe · 2025-07-28T20:40:31Z

discord/file.py

+        point_count: int = self.duration * 10  # type: ignore
+        point_count = min(point_count, 255)
+        points_per_sample: int = len(waveform) // point_count
+        sample_waveform: list[int] = []
+
+        total, count = 0, 0
+        # Average out the amplitudes for each point within a sample
+        for i in range(len(waveform)):
+            total += waveform[i]
+            count += 1
+            if i % points_per_sample == 0:
+                sample_waveform.append(total // count)
+                total, count = 0, 0
+
+        # Maximum value of a waveform is 0xff (255)
+        highest = max(sample_waveform)
+        mult = 255 / highest
+        for i in range(len(sample_waveform)):
+            sample_waveform[i] = int(sample_waveform[i] * mult)


We already rely on audioop or audioop-lts for newer versions of Python, which includes many audio operations implemented in C. I don't have a lot of reason to believe this operation is that slow, but I'm inclined to believe we can probably use audioop to make this simpler and faster to do.

audioop is deprecated as of Python 3.11 and was removed in 3.13
https://docs.python.org/3/library/audioop.html

We already include audioop for 3.13+ using audioop-lts.

The part of the code you commented on doesn't decode or process the Opus data.
Lines 259 to 277 just process the list of ints that represents the waveform so that it is in the form Discord expects (values between 0 and 255, and a maximum of 255 values).
Since it's just a list of ints, I'm not sure audioop would be applicable?
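To make that step concrete, here is a standalone sketch of the same idea: average a list of non-negative amplitudes into at most `max_points` buckets, scale so the loudest bucket is 255, and base64-encode the result. It is not the PR's code; `encode_waveform` is a hypothetical name, the bucket count is capped by input length rather than derived from the duration, and the limit of 255 points follows the description above.

```python
import base64
from typing import List


def encode_waveform(amplitudes: List[int], max_points: int = 255) -> str:
    """Downsample non-negative amplitudes and scale them into 0-255."""
    if not amplitudes:
        raise ValueError("amplitudes must not be empty")
    point_count = min(len(amplitudes), max_points)
    total = len(amplitudes)

    # Split the input into point_count nearly equal buckets and average each.
    buckets = []
    for i in range(point_count):
        start = i * total // point_count
        end = (i + 1) * total // point_count
        chunk = amplitudes[start:end]
        buckets.append(sum(chunk) // len(chunk))

    # Scale so the loudest bucket maps to 255; silent input stays all zeros.
    highest = max(buckets)
    if highest == 0:
        scaled = [0] * point_count
    else:
        scaled = [value * 255 // highest for value in buckets]
    return base64.b64encode(bytes(scaled)).decode("utf-8")
```

Slicing by bucket boundaries avoids emitting a point from a single sample at index 0, and the `highest == 0` branch avoids a division by zero on silent input.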

discord/file.py

blord0 added 4 commits July 16, 2025 19:50

Add VoiceMessageFile class

4480fea

Add abc.channel.send_voice_message to allow sending voice messages

31344ca

Start work on sending the voice messages (not working)

aa5c26f

First working version of sending voice messages

a7270b2

blord0 mentioned this pull request Jul 16, 2025

Voice message support #10165

Open

AbstractUmbra suggested changes Jul 16, 2025

blord0 added 3 commits July 16, 2025 22:55

Start cleaning up parts of the code

4a40683

More cleanup

9a4617e

Found a much simpler method to send voice messages

f9ca81a

blord0 commented Jul 16, 2025

blord0 and others added 4 commits July 17, 2025 02:57

size method no longer needed

eb62338

Remove unncessary code

27bca43

Doc fixes and made duration a required field

d1747b9

Remove print statements

2a96d13

blord0 marked this pull request as ready for review July 17, 2025 12:23

AbstractUmbra suggested changes Jul 17, 2025

blord0 and others added 6 commits July 17, 2025 13:36

Move VoiceMessageFile into File

8332ca3

Remove final reference to VoiceMessageFile

5231d51

Merge branch 'Rapptz:master' into voice-messages

2e6bfd3

Add error checking

60030d8

Fix error checking

1d2ab9c

Merge branch 'Rapptz:master' into voice-messages

50cb4f6

Rapptz requested changes Jul 27, 2025

blord0 added 3 commits July 28, 2025 01:24

Rename duation to duration

3dd7f8f

Add File.voice attribute

e5cca7d

Change checking for voice messages to use File.voice

8f1d548

Formatting change

9936b0d

blord0 added 3 commits July 28, 2025 18:31

Add real generation of waveforms for Opus files

8bc906e

Formatting

dd2fd33

Calculate correct number of points per sample

bb4de89

scarletcafe suggested changes Jul 28, 2025

blord0 and others added 4 commits July 29, 2025 00:43

Change TypeError to ValueError

0f3bc42

Change waveform data to be input as a list of ints

8bea5c3

Fix doc issues

394b16e

Merge branch 'Rapptz:master' into voice-messages

0f1ded6
