I want to extract the phonemes from an audio recording of speech in C#.
Say you have an audio file like "x.wav" that says "hello dear Shamim". I want to extract all the phonemes of the speech and their relative timings, something like the picture below:
I used the System.Speech library (both the Recognition and Synthesis namespaces), but I didn't find what I wanted. To be clear: I don't want the phonemes of the known sentence "hello dear Shamim"; I want to extract the phonemes from an unknown audio input that speaks an English sentence. I tried System.Speech.Recognition, but it tries to extract the words from the audio file, not the phonemes! And as you may have guessed, the words are 30% wrong! ;)
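For illustration, here is a minimal sketch of the kind of word-level recognition described above. It is not the exact code I ran: the use of DictationGrammar and the default recognizer are assumptions, and System.Speech is only available on Windows. It shows that the result is a list of words, with no phoneme timings.

```csharp
using System;
using System.Speech.Recognition;

class Program
{
    static void Main()
    {
        // Assumption: the default installed recognizer and free-form dictation.
        using (var engine = new SpeechRecognitionEngine())
        {
            engine.LoadGrammar(new DictationGrammar());
            engine.SetInputToWaveFile("x.wav"); // the example file from the question

            // Recognize() returns null when nothing could be recognized.
            RecognitionResult result = engine.Recognize();
            if (result == null)
            {
                Console.WriteLine("Nothing recognized.");
                return;
            }

            // The result is organized per word, not per phoneme.
            foreach (RecognizedWordUnit word in result.Words)
            {
                Console.WriteLine(word.Text + " (confidence " + word.Confidence + ")");
            }
        }
    }
}
```

Each RecognizedWordUnit describes a recognized word, so any error in the word guess carries over to everything derived from it.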