Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - What are chunks, samples and frames when using pyaudio

After going through the pyaudio documentation and reading some other articles on the web, I am not sure whether my understanding is correct.

This is the code for audio recording found on pyaudio's site:

import pyaudio
import wave

CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"

p = pyaudio.PyAudio()

stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)

print("* recording")

frames = []

for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)

print("* done recording")

stream.stop_stream()
stream.close()
p.terminate()

If I add these lines, I can play back whatever I recorded:

play = pyaudio.PyAudio()
stream_play = play.open(format=FORMAT,
                        channels=CHANNELS,
                        rate=RATE,
                        output=True)
for data in frames:
    stream_play.write(data)
stream_play.stop_stream()
stream_play.close()
play.terminate()

  1. "RATE" is the number of samples collected per second.
  2. "CHUNK" is the number of frames in the buffer.
  3. Each frame will have 2 samples as "CHANNELS=2".
  4. Size of each sample is 2 bytes, calculated using the function: pyaudio.get_sample_size(pyaudio.paInt16).
  5. Therefore size of each frame is 4 bytes.
  6. In the "frames" list, the size of each element must be 1024*4 bytes, so frames[0] must be 4096 bytes. However, sys.getsizeof(frames[0]) returns 4133, while len(frames[0]) returns 4096.
  7. The for loop executes int(RATE / CHUNK * RECORD_SECONDS) times, and I can't understand why. The same question was answered by "Ruben Sanchez", but I can't be sure his answer is correct, because he says CHUNK is measured in bytes. According to his explanation, it should be int(RATE / (CHUNK*2) * RECORD_SECONDS), since (CHUNK*2) is the number of samples read into the buffer on each iteration.
  8. Finally, when I write print frames[0], it prints gibberish, because it treats the string as ASCII text when it is really just a stream of bytes. How do I print this stream of bytes in hexadecimal using the struct module? And if I later replace each of the hexadecimal values with values of my choice, will it still produce a playable sound?

Everything I wrote above is my own understanding, and much of it may be wrong.

1 Reply

  1. "RATE" is the "sampling rate", i.e. the number of frames per second
  2. "CHUNK" is the (arbitrarily chosen) number of frames per block into which the (potentially very long) signal is split in this example
  3. Yes, each frame will have 2 samples as "CHANNELS=2", but the term "samples" is seldom used in this context (it is confusing, since a "sample" can mean either one channel's value or a whole frame)
  4. Yes, size of each sample is 2 bytes (= 16 bits) in this example
  5. Yes, size of each frame is 4 bytes
  6. Yes, each element of "frames" should be 4096 bytes, which is what len(frames[0]) reports. sys.getsizeof() reports the memory the Python interpreter needs to store the object, which includes the object header and is therefore a bit more than the size of the raw data.
  7. RATE * RECORD_SECONDS is the number of frames that should be recorded. Since the for loop runs once per chunk rather than once per frame, that number of frames has to be divided by the chunk size CHUNK to get the number of iterations. This has nothing to do with samples, so there is no factor of 2 involved.
  8. If you really want to see the hexadecimal values, you can try something like [hex(x) for x in frames[0]] (in Python 2, where frames[0] is a str, use [hex(ord(x)) for x in frames[0]]). If you want to get the actual 2-byte numbers, use the struct module with the format string '<h' (signed 16-bit, little-endian, which is the byte order on most platforms), since paInt16 samples are signed; '<H' would interpret them as unsigned.
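
The arithmetic in the points above can be checked without a microphone. The following is only a sketch, assuming Python 3 and a little-endian platform: it hard-codes the sample width of 2 bytes instead of calling pyaudio, and it builds a synthetic chunk (the names samples, chunk and quieter are made up for illustration) as a stand-in for frames[0]. It uses the signed format code 'h' because paInt16 samples are signed.

```python
import struct
import sys

# Values from the question. SAMPLE_WIDTH is hard-coded so the sketch runs
# without pyaudio; it is the value get_sample_size(paInt16) reports.
CHUNK = 1024
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 5
SAMPLE_WIDTH = 2

bytes_per_frame = CHANNELS * SAMPLE_WIDTH            # 4
bytes_per_chunk = CHUNK * bytes_per_frame            # 4096

total_frames = RATE * RECORD_SECONDS                 # 220500
num_chunks = int(RATE / CHUNK * RECORD_SECONDS)      # 215 loop iterations
dropped_frames = total_frames - num_chunks * CHUNK   # 340, lost to int() truncation

# Synthetic stand-in for frames[0]: interleaved signed 16-bit samples
# (left, right, left, right, ...), packed little-endian.
samples = [(i % 200) - 100 for i in range(CHUNK * CHANNELS)]
chunk = struct.pack('<%dh' % len(samples), *samples)

raw_size = len(chunk)               # size of the audio data in bytes
object_size = sys.getsizeof(chunk)  # data plus the Python object overhead

# Hexadecimal view of the first two frames (8 bytes).
first_bytes_hex = chunk[:8].hex()

# Decode the bytes back into signed 16-bit integers and split the channels.
values = struct.unpack('<%dh' % (len(chunk) // SAMPLE_WIDTH), chunk)
left, right = values[0::2], values[1::2]

# Modify the values (here: halve the amplitude) and pack them again.
quieter = struct.pack('<%dh' % len(values), *[v // 2 for v in values])

print(bytes_per_frame, bytes_per_chunk, num_chunks, dropped_frames)
print(raw_size, object_size, first_bytes_hex)
```

Modified data such as quieter can be passed to stream.write() just like the original chunk, provided every value still fits into a signed 16-bit integer and the length stays a multiple of the frame size (4 bytes here).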

You might be interested in my tutorial about reading WAV files with the wave module, which covers some of your questions in more detail: http://nbviewer.jupyter.org/github/mgeier/python-audio/blob/master/audio-files/audio-files-with-wave.ipynb
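
The code in the question imports wave and defines WAVE_OUTPUT_FILENAME but never uses either. As a rough sketch of how the same frame terminology shows up in the wave module, the example below writes a list of chunks to a WAV file and reads it back. It assumes Python 3; the three silent chunks are a made-up stand-in for the recorded frames list, and the file goes into a temporary directory rather than the working directory.

```python
import os
import tempfile
import wave

CHUNK = 1024
CHANNELS = 2
RATE = 44100
SAMPLE_WIDTH = 2  # bytes per sample for paInt16; with pyaudio available,
                  # p.get_sample_size(FORMAT) would give the same value

# Stand-in for the recorded list: three chunks of silence, each holding
# CHUNK frames of CHANNELS * SAMPLE_WIDTH bytes.
frames = [b'\x00' * (CHUNK * CHANNELS * SAMPLE_WIDTH) for _ in range(3)]

path = os.path.join(tempfile.mkdtemp(), "output.wav")

# Write all chunks as one continuous byte string.
with wave.open(path, 'wb') as wf:
    wf.setnchannels(CHANNELS)
    wf.setsampwidth(SAMPLE_WIDTH)
    wf.setframerate(RATE)
    wf.writeframes(b''.join(frames))

# Read the file back: the wave module also counts in frames, not bytes.
with wave.open(path, 'rb') as rf:
    n_frames = rf.getnframes()          # 3 * CHUNK
    params = (rf.getnchannels(), rf.getsampwidth(), rf.getframerate())
    first_chunk = rf.readframes(CHUNK)  # CHUNK frames = 4096 bytes

duration_seconds = n_frames / RATE
print(n_frames, params, len(first_chunk), duration_seconds)
```

Note that readframes(CHUNK) takes a number of frames and returns CHUNK * CHANNELS * SAMPLE_WIDTH bytes, which mirrors how stream.read(CHUNK) behaves in the question.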

