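Search for the keyword speech to find the Speech service, then create a service instance. Note the instance's key and location values; they are passed as request parameters to the speech recognition library.
Install the video library moviepy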
pip install moviepy
Write code to extract the audio track of the video file test.mp4 into test2.wav
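import moviepy.editor  # moviepy 1.x; moviepy 2.x removed this module (use "from moviepy import VideoFileClip")
# Load the video and write its audio track to a WAV file
videoClip = moviepy.editor.VideoFileClip("test.mp4")
videoClip.audio.write_audiofile("test2.wav")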
Install the speech recognition library SpeechRecognition
pip install SpeechRecognition
Write code to recognize the speech in the audio file test3.wav and write the resulting text to test.txt
import speech_recognition
audio2 = speech_recognition.AudioFile("test3.wav")
recognizer = speech_recognition.Recognizer()
with audio2 as source:
    # Read the whole file into an AudioData object
    audioData = recognizer.record(source)
# recognize_azure returns a (text, confidence) tuple
result = recognizer.recognize_azure(audioData, key="<your api key>", language="zh-CN", location="eastus")
# Write as UTF-8 so Chinese text is saved correctly on any platform
with open("test.txt", "w", encoding="utf-8") as file:
    if len(result) > 0:
        file.write(result[0])
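The snippet above hard-codes the key and location. As a minimal sketch, assuming the two values are kept in environment variables (the names AZURE_SPEECH_KEY and AZURE_SPEECH_LOCATION are hypothetical, chosen only for this example and not defined by Azure or SpeechRecognition), they could be loaded like this and then passed as the key and location arguments of recognize_azure:

```python
import os


def load_azure_speech_config(env=None):
    """Return (key, location) read from environment variables.

    AZURE_SPEECH_KEY and AZURE_SPEECH_LOCATION are hypothetical names
    used for this sketch; pick whatever names suit your setup.
    """
    env = os.environ if env is None else env
    key = env.get("AZURE_SPEECH_KEY", "").strip()
    # Fall back to the location used in the example above
    location = env.get("AZURE_SPEECH_LOCATION", "eastus").strip()
    if not key:
        raise RuntimeError("AZURE_SPEECH_KEY is not set")
    return key, location
```

With this helper, the call would become key, location = load_azure_speech_config(), followed by recognizer.recognize_azure(audioData, key=key, language="zh-CN", location=location).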
The complete code is as follows
import moviepy.editor
import speech_recognition

# Step 1: extract the audio track from the video
videoClip = moviepy.editor.VideoFileClip("test.mp4")
videoClip.audio.write_audiofile("test2.wav")

# Step 2: load the extracted audio and send it to Azure for recognition
audio2 = speech_recognition.AudioFile("test2.wav")
recognizer = speech_recognition.Recognizer()
with audio2 as source:
    audioData = recognizer.record(source)
# recognize_azure returns a (text, confidence) tuple
result = recognizer.recognize_azure(audioData, key="<your api key>", language="zh-CN", location="eastus")

# Step 3: write the recognized text to a file as UTF-8
with open("test.txt", "w", encoding="utf-8") as file:
    if len(result) > 0:
        file.write(result[0])
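The complete script sends the whole recording in a single request. If a recording turns out to be too long for one request (an assumption here; check the service limits for your own setup), one option is to recognize it in consecutive windows. The sketch below only computes the windows; the function name and the 30-second default are illustrative choices, not values taken from Azure or SpeechRecognition.

```python
def split_into_windows(total_seconds, window_seconds=30.0):
    """Return (offset, duration) pairs that cover total_seconds without gaps."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    windows = []
    offset = 0.0
    while offset < total_seconds:
        # The last window is shortened so it does not run past the end
        duration = min(window_seconds, total_seconds - offset)
        windows.append((offset, duration))
        offset += window_seconds
    return windows
```

Given the clip length in seconds (for example videoClip.duration from moviepy), each pair could be passed to recognizer.record(source, offset=offset, duration=duration) on a freshly opened AudioFile, and the recognized pieces joined afterwards.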
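Azure also provides a tool for quickly converting speech to text: https://speech.microsoft.com/portal
Click Real-time speech to text
Note that the uploaded audio must be 16 kHz or 8 kHz, 16-bit, mono PCM
Once the upload completes, the audio is converted to text automatically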
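Before uploading, it can help to confirm that a WAV file matches the format requirement above. The following is a minimal sketch using Python's standard wave module; the function name check_wav_format is made up for this example.

```python
import wave


def check_wav_format(path):
    """Return a list of problems; an empty list means the file looks acceptable.

    The wave module only opens uncompressed PCM files, so other encodings
    raise wave.Error instead of returning a result.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()      # bytes per sample
        channels = wav.getnchannels()
    if rate not in (8000, 16000):
        problems.append("sample rate is %d Hz, expected 8000 or 16000" % rate)
    if width != 2:
        problems.append("sample width is %d bit, expected 16" % (width * 8))
    if channels != 1:
        problems.append("file has %d channels, expected 1 (mono)" % channels)
    return problems
```

For example, check_wav_format("test1.wav") returns an empty list when the file is 16-bit mono PCM at 8 kHz or 16 kHz.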
Install the audio conversion library pydub
pip install pydub
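Write code to save test.aac as test1.wav, encoded as mono PCM at a 16 kHz sample rate
Note: formats that pydub handles through ffmpeg (such as AAC) need ffmpeg itself; download the ffmpeg binaries into the script's directory: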
http://www.ffmpeg.org/download.html
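from pydub import AudioSegment
# Decode the AAC file (pydub calls ffmpeg for this)
audio1 = AudioSegment.from_file("test.aac", "aac")
# Extra ffmpeg options: -ac 1 (mono), -ar 16000 (16 kHz sample rate)
audio1.export("test1.wav", format="wav", parameters=["-ac", "1", "-ar", "16000"])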
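Since pydub hands the -ac and -ar options to ffmpeg, the same conversion can also be sketched as a direct ffmpeg call. This assumes an ffmpeg executable is available on the PATH; the helper names build_ffmpeg_command and convert_to_wav are hypothetical.

```python
import subprocess


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Build an ffmpeg command that writes src as 16-bit mono PCM WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", src,                # input file (audio or video)
        "-vn",                    # ignore any video stream
        "-ac", "1",               # one channel (mono)
        "-ar", str(sample_rate),  # sample rate in Hz
        "-acodec", "pcm_s16le",   # 16-bit little-endian PCM
        dst,
    ]


def convert_to_wav(src, dst, sample_rate=16000):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst, sample_rate), check=True)
```

Calling convert_to_wav("test.aac", "test1.wav") would then produce a file in the format the portal asks for. Because of the -vn option, the same call should also work on a video file such as test.mp4.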
This article is from cnblogs (部落格園). Author: 林曉lx. When reposting, please cite the original link: https://www.cnblogs.com/jevonsflash/p/17227943.html