
useWhisper

A React Hook for the OpenAI Whisper API with a built-in speech recorder, real-time transcription, and silence removal.

Demo
Real-time transcription demo

useWhisper-real-time-transcription.mp4

Announcement
useWhisper for React Native is under development.

Repository: https://github.com/chengsokdara/use-whisper-native

Progress: chengsokdara/use-whisper-native#1

Install

 npm i @chengsokdara/use-whisper

 yarn add @chengsokdara/use-whisper

Usage

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const {
    recording,
    speaking,
    transcribing,
    transcript,
    pauseRecording,
    startRecording,
    stopRecording,
  } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
  })

  return (
    <div>
      {/* React renders nothing for bare booleans, so convert them to strings */}
      <p>Recording: {String(recording)}</p>
      <p>Speaking: {String(speaking)}</p>
      <p>Transcribing: {String(transcribing)}</p>
      <p>Transcribed Text: {transcript.text}</p>
      <button onClick={() => startRecording()}>Start</button>
      <button onClick={() => pauseRecording()}>Pause</button>
      <button onClick={() => stopRecording()}>Stop</button>
    </div>
  )
}

Custom server (keeps your OpenAI API token secure)

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  /**
   * you have more control like this
   * do whatever you want with the recorded speech
   * send it to your own custom server
   * and return the response back to useWhisper
   */
  const onTranscribe = async (blob: Blob) => {
    // encode the recorded audio as a base64 data URL
    const base64 = await new Promise<string | ArrayBuffer | null>(
      (resolve) => {
        const reader = new FileReader()
        reader.onloadend = () => resolve(reader.result)
        reader.readAsDataURL(blob)
      }
    )
    const body = JSON.stringify({ file: base64, model: 'whisper-1' })
    const headers = { 'Content-Type': 'application/json' }
    // axios is imported lazily, only when a transcription is requested
    const { default: axios } = await import('axios')
    const response = await axios.post('/api/whisper', body, {
      headers,
    })
    const { text } = response.data
    // you must return result from your server in Transcript format
    return {
      blob,
      text,
    }
  }

  const { transcript } = useWhisper({
    // callback to handle transcription with custom server
    onTranscribe,
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}
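
On the server side, the `file` field sent above comes from `FileReader.readAsDataURL`, so it arrives as a data URL (`data:<mime>;base64,<payload>`) rather than bare base64. Below is a minimal sketch of how a `/api/whisper` handler might separate the two parts before decoding. The `splitDataUrl` function and the `ParsedDataUrl` shape are hypothetical names for illustration and are not part of use-whisper.

```typescript
// Hypothetical server-side helper; not part of use-whisper.
interface ParsedDataUrl {
  mimeType: string // e.g. "audio/webm"
  base64: string // payload without the "data:...;base64," prefix
}

function splitDataUrl(dataUrl: string): ParsedDataUrl {
  const comma = dataUrl.indexOf(',')
  if (!dataUrl.startsWith('data:') || comma === -1) {
    throw new Error('not a data URL')
  }
  // The header looks like "audio/webm;codecs=opus;base64"
  const header = dataUrl.slice('data:'.length, comma).split(';')
  if (header[header.length - 1] !== 'base64') {
    throw new Error('expected a base64-encoded data URL')
  }
  return {
    // data URLs fall back to text/plain when no media type is given
    mimeType: header[0] || 'text/plain',
    base64: dataUrl.slice(comma + 1),
  }
}
```

In Node.js the payload could then be turned into bytes with `Buffer.from(parsed.base64, 'base64')` before being forwarded to the Whisper endpoint.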

Examples
Real-time streaming transcription

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    streaming: true,
    timeSlice: 1_000, // 1 second
    whisperConfig: {
      language: 'en',
    },
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}

Remove silence before sending to Whisper to save cost

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    // use ffmpeg-wasm to remove silence from recorded speech
    removeSilence: true,
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}

Auto-start recording on component mount

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    // will auto start recording speech upon component mounted
    autoStart: true,
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}

Keep recording as long as the user is speaking

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    nonStop: true, // keep recording as long as the user is speaking
    stopTimeout: 5000, // auto stop after 5 seconds
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}
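
The interplay of `nonStop` and `stopTimeout` can be pictured as a countdown that restarts whenever speech is detected. The sketch below models that behavior with plain timestamps so it can be reasoned about without a microphone. `AutoStopTracker` and its methods are hypothetical names for illustration; use-whisper's internal implementation may differ.

```typescript
// Illustrative model of the nonStop / stopTimeout behavior described above.
// Hypothetical class; not part of use-whisper.
class AutoStopTracker {
  private readonly stopTimeoutMs: number
  private deadlineMs: number

  constructor(stopTimeoutMs: number, startedAtMs: number) {
    this.stopTimeoutMs = stopTimeoutMs
    this.deadlineMs = startedAtMs + stopTimeoutMs
  }

  // Call whenever the user is detected speaking: pushes the deadline back.
  onSpeaking(nowMs: number): void {
    this.deadlineMs = nowMs + this.stopTimeoutMs
  }

  // True once stopTimeoutMs has passed since recording started or since the
  // last detected speech, whichever is later.
  shouldStop(nowMs: number): boolean {
    return nowMs >= this.deadlineMs
  }
}
```

With a 5000 ms timeout, speech detected at 4000 ms moves the stop point from 5000 ms to 9000 ms.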

Customize Whisper API config when autoTranscribe is true

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    autoTranscribe: true,
    whisperConfig: {
      prompt: 'previous conversation', // you can pass previous conversation for context
      response_format: 'text', // output text instead of json
      temperature: 0.8, // more random output
      language: 'es', // Spanish
    },
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}

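Dependencies
- @chengsokdara/react-hooks-async: asynchronous React hooks
- recordrtc: cross-browser audio recorder
- lamejs: encodes WAV into MP3 for cross-browser support
- @ffmpeg/ffmpeg: for the silence removal feature
- hark: for speaking detection
- axios: since fetch does not work with the Whisper endpoint

Most of these dependencies are lazy loaded, so they are only imported when needed.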
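
Lazy loading here means a dependency is fetched with a dynamic `import()` the first time a feature needs it, as the custom server example does with axios. Below is a minimal sketch of that pattern. The `lazyOnce` helper is hypothetical and not taken from the library; it caches the pending promise so the loader runs at most once.

```typescript
// Hypothetical helper: wraps a loader so it runs at most once.
function lazyOnce<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined
  return () => {
    if (cached === undefined) {
      cached = loader() // the first call triggers the load
    }
    return cached // later calls reuse the same promise
  }
}

// Usage sketch (not executed here):
//   const loadAxios = lazyOnce(() => import('axios'))
//   const { default: axios } = await loadAxios()
```

Caching the promise rather than the resolved value means concurrent callers share one in-flight load instead of starting several.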

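API
Config Object

Name	Type	Default Value	Description
apiKey	string	''	your OpenAI API token
autoStart	boolean	false	auto start speech recording on component mount
autoTranscribe	boolean	true	should auto transcribe after recording stops
mode	string	transcriptions	controls the Whisper mode, either transcriptions or translations; currently only translation to English is supported
nonStop	boolean	false	if true, recording auto stops after stopTimeout; however, if the user keeps speaking, the recorder continues recording
removeSilence	boolean	false	remove silence before sending the file to the OpenAI API
stopTimeout	number	5,000 ms	required if nonStop is true; controls when the recorder auto stops
streaming	boolean	false	transcribe speech in real time based on timeSlice
timeSlice	number	1,000 ms	interval between each onDataAvailable event
whisperConfig	WhisperApiConfig	undefined	Whisper API transcription config
onDataAvailable	(blob: Blob) => void	undefined	callback function that receives the recorded blob at each timeSlice interval
onTranscribe	(blob: Blob) => Promise<Transcript>	undefined	callback function to handle transcription on your own custom server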

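WhisperApiConfig

Name	Type	Default Value	Description
prompt	string	undefined	optional text to guide the model's style or continue a previous audio segment; the prompt should match the audio language
response_format	string	json	the format of the transcript output, one of: json, text, srt, verbose_json, or vtt
temperature	number	0	the sampling temperature, between 0 and 1; higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic; if set to 0, the model uses log probability to automatically increase the temperature until certain thresholds are hit
language	string	en	the language of the input audio; supplying the input language in ISO-639-1 format improves accuracy and latency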
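
Read together, the two tables above amount to a set of defaults plus a couple of constraints. The sketch below restates them in TypeScript. `SketchConfig`, `SketchWhisperConfig`, and `resolveConfig` are hypothetical names written from the tables, not the library's exported types. The `mode` literals are assumed, and the "required when nonStop is true" note is taken literally.

```typescript
// Hypothetical types written from the tables above; not the library's own.
type SketchWhisperConfig = {
  prompt?: string
  response_format?: 'json' | 'text' | 'srt' | 'verbose_json' | 'vtt'
  temperature?: number // documented range: 0 to 1
  language?: string // ISO-639-1 code such as 'en'
}

type SketchConfig = {
  apiKey?: string
  autoStart?: boolean
  autoTranscribe?: boolean
  mode?: 'transcriptions' | 'translations' // assumed literal values
  nonStop?: boolean
  removeSilence?: boolean
  stopTimeout?: number // milliseconds
  streaming?: boolean
  timeSlice?: number // milliseconds
  whisperConfig?: SketchWhisperConfig
}

// Default values as listed in the Config Object table.
const sketchDefaults = {
  apiKey: '',
  autoStart: false,
  autoTranscribe: true,
  mode: 'transcriptions' as const,
  nonStop: false,
  removeSilence: false,
  stopTimeout: 5_000,
  streaming: false,
  timeSlice: 1_000,
}

function resolveConfig(config: SketchConfig) {
  // Assumption: treat "stopTimeout is required when nonStop is true" literally.
  if (config.nonStop && config.stopTimeout === undefined) {
    throw new Error('stopTimeout is required when nonStop is true')
  }
  const temperature = config.whisperConfig?.temperature
  if (temperature !== undefined && (temperature < 0 || temperature > 1)) {
    throw new Error('temperature must be between 0 and 1')
  }
  // Caller-supplied values override the defaults.
  return { ...sketchDefaults, ...config }
}
```

For example, `resolveConfig({ streaming: true })` keeps `timeSlice` at 1000 ms, while `resolveConfig({ nonStop: true })` throws under the stated assumption.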

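Return Object

Name	Type	Description
recording	boolean	speech recording state
speaking	boolean	detects when the user is speaking
transcribing	boolean	true while silence is being removed from the speech and the request is being sent to the OpenAI Whisper API
transcript	Transcript	object returned after Whisper transcription completes
pauseRecording	Promise	pause speech recording
startRecording	Promise	start speech recording
stopRecording	Promise	stop speech recording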
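
Because React renders nothing for a bare boolean in JSX, the three state flags in the return object are often easier to show as a single label. `statusLabel` below is a hypothetical helper, and the priority order (transcribing first, then recording) is an assumption made for illustration rather than something the library prescribes.

```typescript
// Hypothetical helper; the flag names mirror the return object above.
type WhisperFlags = {
  recording: boolean
  speaking: boolean
  transcribing: boolean
}

function statusLabel({ recording, speaking, transcribing }: WhisperFlags): string {
  if (transcribing) return 'Transcribing'
  if (recording) {
    return speaking ? 'Recording (speech detected)' : 'Recording (silence)'
  }
  return 'Idle'
}
```

Inside a component this could be rendered as `<p>{statusLabel({ recording, speaking, transcribing })}</p>`.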