💡
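This note covers basic speech-to-text (STT) code built on the SpeechRecognition package.
It focuses on the Google API, which is the package's default engine, and on Vosk and Whisper, which support Korean and can run offline.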
1. Development environment setup
pip3 install SpeechRecognition
# for macOS
brew install portaudio
pip3 install pyaudio
# for Ubuntu
sudo apt-get install python3-pyaudio
sudo apt-get install portaudio19-dev python3-all-dev
pip3 install pyaudio
# for the Vosk and Whisper engines
python3 -m pip install vosk
python3 -m pip install git+https://github.com/openai/whisper.git soundfile
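Once the packages are installed, a quick import check can confirm that each one is visible to the interpreter. The snippet below is a minimal sketch and not part of the original setup: the helper name `check_modules` is hypothetical, and the module names assume the usual import names for the packages installed above.

```python
import importlib.util


def check_modules(names):
    """Return {module_name: True/False} depending on whether the module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Import names assumed for the packages installed above.
REQUIRED = ["speech_recognition", "pyaudio", "vosk", "whisper", "soundfile"]

for name, found in check_modules(REQUIRED).items():
    print(f"{name}: {'ok' if found else 'MISSING'}")
```

Any module reported as MISSING points at the install step to repeat before running the examples.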
2. Example code
1. Google API
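- Code
import speech_recognition as sr

r = sr.Recognizer()

# Capture one utterance from the default microphone.
with sr.Microphone() as source:
    print('listening...')
    audio = r.listen(source, timeout=10, phrase_time_limit=10)
    print("......")

try:
    # Send the audio to the Google API and transcribe it as Korean.
    text = r.recognize_google(audio, language='ko')
    print(text)
except sr.UnknownValueError:
    # The speech could not be understood.
    print("Recognizer Failed..")
except sr.RequestError as e:
    # The request to the API failed.
    print("Request Failed...", e)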
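One case the Google API example does not cover: `listen()` is called with `timeout=10`, and SpeechRecognition raises `sr.WaitTimeoutError` when no speech starts within that window. A minimal sketch of handling it follows. `listen_or_none` and `main` are hypothetical names, and the exception class is passed in as an argument only so the helper can be exercised without a microphone.

```python
def listen_or_none(recognizer, source, timeout_error, timeout=10, phrase_time_limit=10):
    """Return the captured audio, or None if no speech started within `timeout` seconds."""
    try:
        return recognizer.listen(source, timeout=timeout, phrase_time_limit=phrase_time_limit)
    except timeout_error:
        return None


def main():
    # Imported here so the helper above has no hard dependency on the package.
    import speech_recognition as sr

    r = sr.Recognizer()
    with sr.Microphone() as source:
        print('listening...')
        audio = listen_or_none(r, source, sr.WaitTimeoutError)

    if audio is None:
        print("No speech detected before the timeout.")
        return

    try:
        print(r.recognize_google(audio, language='ko'))
    except sr.UnknownValueError:
        print("Recognizer Failed..")
    except sr.RequestError as e:
        print("Request Failed...", e)
```

Calling `main()` runs the same flow as the example above, but exits cleanly when nothing is said.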
2. Vosk
- Environment setup
- Model download
- Code
import speech_recognition as sr

r = sr.Recognizer()

# Capture one utterance from the default microphone.
with sr.Microphone() as source:
    print('listening...')
    audio = r.listen(source, timeout=10, phrase_time_limit=10)
    print("......")

try:
    # Recognize offline with the downloaded Vosk model.
    # The result is a JSON string (section 3 extracts its 'text' field).
    text = r.recognize_vosk(audio, language='ko')
    print(text)
except sr.UnknownValueError:
    print("Recognizer Failed..")
except sr.RequestError as e:
    print("Request Failed...", e)
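Two details of the Vosk path are easy to miss. First, the model has to be downloaded and unpacked before recognition works. The sketch below assumes it is unpacked into a folder named `model` in the current working directory; that location is an assumption about where the SpeechRecognition Vosk wrapper looks, so verify it against your installed version. Second, the recognizer returns a JSON string rather than plain text, which is why the test code in section 3 calls `json.loads` on it. The helper names `has_vosk_model` and `vosk_text` are hypothetical.

```python
import json
import os


def has_vosk_model(model_dir="model"):
    """True if a folder exists at `model_dir` (assumed model location, see above)."""
    return os.path.isdir(model_dir)


def vosk_text(raw_result):
    """Extract the transcript from the JSON string returned by recognize_vosk."""
    return json.loads(raw_result).get("text", "")


if not has_vosk_model():
    print("Vosk model folder 'model' not found in", os.getcwd())

# Example with a hand-written JSON string in the same shape.
print(vosk_text('{"text": "안녕하세요"}'))
```

With these in place, `print(vosk_text(r.recognize_vosk(audio)))` would print only the transcript instead of the raw JSON.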
3. Whisper
- Code
import speech_recognition as sr

r = sr.Recognizer()

# Capture one utterance from the default microphone.
with sr.Microphone() as source:
    print('listening...')
    audio = r.listen(source, timeout=10, phrase_time_limit=10)
    print("......")

try:
    # Recognize offline with a local Whisper model, transcribing as Korean.
    text = r.recognize_whisper(audio, language='ko')
    print(text)
except sr.UnknownValueError:
    print("Recognizer Failed..")
except sr.RequestError as e:
    print("Request Failed...", e)
3. Test
- Test file
- 안녕하세요. 이것은 테스트 문장입니다. ("Hello. This is a test sentence.")
- Code
import json

import speech_recognition as sr

r = sr.Recognizer()

# Capture one utterance, then run it through all three engines.
with sr.Microphone() as source:
    print('listening...')
    audio = r.listen(source, timeout=10, phrase_time_limit=10)
    print("......")

try:
    # show_all=True returns the raw response, which is empty when nothing was recognized.
    result_google = r.recognize_google(audio, language='ko', show_all=True)
    if result_google and 'alternative' in result_google:
        text_google = result_google['alternative'][0]['transcript']
    else:
        text_google = ""

    # recognize_vosk returns a JSON string; the transcript is in its 'text' field.
    text_vosk = json.loads(r.recognize_vosk(audio, language='ko'))['text']

    text_whisper = r.recognize_whisper(audio, language='ko')

    print("[Google]", text_google)
    print("[Vosk]", text_vosk)
    print("[whisper]", text_whisper)
except sr.UnknownValueError:
    print("Recognizer Failed..")
except sr.RequestError as e:
    print("Request Failed...", e)
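Because the test sentence is known in advance, the three transcripts can be scored rather than only compared by eye. The following is a minimal sketch of a character error rate (CER) based on Levenshtein edit distance. It is not part of the original test, the function names are hypothetical, and stripping whitespace and periods before comparing is an assumption about what should count as an error.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def normalize(text):
    """Drop whitespace and periods so spacing and punctuation are ignored."""
    return "".join(ch for ch in text if not ch.isspace() and ch != ".")


def cer(reference, hypothesis):
    """Character error rate: edit distance divided by the reference length."""
    ref, hyp = normalize(reference), normalize(hypothesis)
    if not ref:
        return 0.0 if not hyp else 1.0
    return edit_distance(ref, hyp) / len(ref)


# The test sentence from this section.
REFERENCE = "안녕하세요. 이것은 테스트 문장입니다."
```

After the test code has run, a line such as `print("[Google] CER:", cer(REFERENCE, text_google))` gives one number per engine; lower is better, and 0.0 means the transcript matches the reference once spacing and periods are ignored.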
References
SpeechRecognition
Library for performing speech recognition, with support for several engines and APIs, online and offline.
