useWhisper

React Hook for the OpenAI Whisper API with a speech recorder, real-time transcription, and silence removal built in.

Demo
Real-time transcription demo

use-whisper-real-time-transcription.mp4

Announcement
useWhisper for React Native is being developed.

Repository: https://github.com/chengsokdara/use-whisper-native

Progress: chengsokdara/use-whisper-native#1

Install

 npm i @chengsokdara/use-whisper

 yarn add @chengsokdara/use-whisper

Usage

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const {
    recording,
    speaking,
    transcribing,
    transcript,
    pauseRecording,
    startRecording,
    stopRecording,
  } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
  })

  return (
    <div>
      {/* booleans render nothing in JSX, so convert them to strings */}
      <p>Recording: {String(recording)}</p>
      <p>Speaking: {String(speaking)}</p>
      <p>Transcribing: {String(transcribing)}</p>
      <p>Transcribed Text: {transcript.text}</p>
      <button onClick={() => startRecording()}>Start</button>
      <button onClick={() => pauseRecording()}>Pause</button>
      <button onClick={() => stopRecording()}>Stop</button>
    </div>
  )
}

Custom server (keep your OpenAI API token secure)

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  /**
   * you have more control like this
   * do whatever you want with the recorded speech
   * send it to your own custom server
   * and return the response back to useWhisper
   */
  // must be async because the body awaits the file read and the HTTP request
  const onTranscribe = async (blob: Blob) => {
    // encode the recorded audio as a base64 data URL
    const base64 = await new Promise<string | ArrayBuffer | null>(
      (resolve) => {
        const reader = new FileReader()
        reader.onloadend = () => resolve(reader.result)
        reader.readAsDataURL(blob)
      }
    )
    const body = JSON.stringify({ file: base64, model: 'whisper-1' })
    const headers = { 'Content-Type': 'application/json' }
    // axios is imported lazily, only when a transcription is requested
    const { default: axios } = await import('axios')
    const response = await axios.post('/api/whisper', body, {
      headers,
    })
    const { text } = response.data
    // you must return result from your server in Transcript format
    return {
      blob,
      text,
    }
  }

  const { transcript } = useWhisper({
    // callback to handle transcription with custom server
    onTranscribe,
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}
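
On the server side, the `/api/whisper` route in the example above receives the recording as a base64 data URL in the `file` field. The following is a minimal sketch of decoding that payload back into bytes before forwarding it to OpenAI. It assumes the `{ file, model }` request shape shown above; `parseDataUrl` and `DecodedAudio` are hypothetical names for illustration and are not part of use-whisper.

```typescript
// Hypothetical helper (not part of use-whisper): decode the data URL produced
// by FileReader.readAsDataURL back into its MIME type and raw bytes.
interface DecodedAudio {
  mimeType: string
  bytes: Uint8Array
}

function parseDataUrl(dataUrl: string): DecodedAudio {
  // Expected shape: "data:<mime type>[;param=value...];base64,<payload>"
  const match = /^data:([^;,]+)(?:;[^;,]+)*;base64,(.*)$/.exec(dataUrl)
  if (match === null) {
    throw new Error('expected a base64-encoded data URL')
  }
  const mimeType = match[1]
  const payload = match[2]
  // atob yields a binary string with one character per byte
  const binary = atob(payload)
  const bytes = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i)
  }
  return { mimeType, bytes }
}
```

The decoded bytes can then be attached to a multipart request to the Whisper API from the server, so the API token never reaches the browser.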

Examples
Real-time streaming transcription

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    streaming: true,
    timeSlice: 1_000, // 1 second
    whisperConfig: {
      language: 'en',
    },
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}
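
With `streaming: true`, `timeSlice` sets the interval between `onDataAvailable` events, so it also determines how often a new chunk of audio becomes available. The sketch below shows only that arithmetic, assuming one chunk per fully elapsed time slice. `estimateChunkCount` is a hypothetical helper; how many Whisper requests the library actually sends per chunk is not stated in this document.

```typescript
// Hypothetical helper: estimate how many onDataAvailable events fire during a
// recording, assuming one event per fully elapsed time slice.
function estimateChunkCount(durationMs: number, timeSliceMs: number): number {
  if (timeSliceMs <= 0) {
    throw new RangeError('timeSliceMs must be positive')
  }
  if (durationMs < 0) {
    throw new RangeError('durationMs must not be negative')
  }
  return Math.floor(durationMs / timeSliceMs)
}
```

Under this assumption a smaller `timeSlice` gives lower latency but more chunks to process for the same recording length.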

Remove silence before sending to Whisper to save cost

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    // use ffmpeg-wasm to remove silence from recorded speech
    removeSilence: true,
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}

Auto-start recording on component mount

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    // will auto start recording speech upon component mounted
    autoStart: true,
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}

Keep recording as long as the user is speaking

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    nonStop: true, // keep recording as long as the user is speaking
    stopTimeout: 5000, // auto stop after 5 seconds
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}
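
The `nonStop` and `stopTimeout` options together describe a silence timer: recording stops once `stopTimeout` milliseconds pass, unless the user keeps speaking. The following is a minimal sketch of that rule, under the assumption that each detected speech event restarts the countdown. `SilenceTimer` is a hypothetical name and this is not the library's actual implementation.

```typescript
// Hypothetical model of the nonStop / stopTimeout rule (an assumption, not
// the library's code): the countdown restarts whenever speech is detected.
class SilenceTimer {
  private readonly stopTimeoutMs: number
  private lastSpeechAtMs: number

  constructor(stopTimeoutMs: number, startedAtMs: number) {
    this.stopTimeoutMs = stopTimeoutMs
    this.lastSpeechAtMs = startedAtMs
  }

  // Call whenever speech is detected; restarts the countdown.
  onSpeech(nowMs: number): void {
    this.lastSpeechAtMs = nowMs
  }

  // True once stopTimeoutMs have elapsed since the last detected speech.
  shouldStop(nowMs: number): boolean {
    return nowMs - this.lastSpeechAtMs >= this.stopTimeoutMs
  }
}
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the rule easy to test.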

Customize Whisper API config when autoTranscribe is true

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    autoTranscribe: true,
    whisperConfig: {
      prompt: 'previous conversation', // you can pass previous conversation for context
      response_format: 'text', // output text instead of json
      temperature: 0.8, // more random output
      language: 'es', // Spanish
    },
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}
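
The `whisperConfig` keys correspond to request parameters of the OpenAI audio endpoints. As a rough illustration, the sketch below turns such a config into the form fields of a transcription request. `whisperEndpoint` and `toFormFields` are hypothetical helpers, the local `WhisperApiConfig` interface is reconstructed from the table in this README, and how use-whisper builds its request internally is not shown in this document.

```typescript
type WhisperMode = 'transcriptions' | 'translations'

// Reconstructed from the WhisperApiConfig table in this README.
interface WhisperApiConfig {
  prompt?: string
  response_format?: 'json' | 'text' | 'srt' | 'verbose_json' | 'vtt'
  temperature?: number
  language?: string
}

// Hypothetical helper: the OpenAI audio endpoint for the given mode.
function whisperEndpoint(mode: WhisperMode): string {
  return `https://api.openai.com/v1/audio/${mode}`
}

// Hypothetical helper: map the config to multipart form fields, skipping
// anything left undefined so the API falls back to its own defaults.
function toFormFields(config: WhisperApiConfig): Record<string, string> {
  const fields: Record<string, string> = { model: 'whisper-1' }
  if (config.prompt !== undefined) fields.prompt = config.prompt
  if (config.response_format !== undefined) {
    fields.response_format = config.response_format
  }
  if (config.temperature !== undefined) {
    fields.temperature = String(config.temperature)
  }
  if (config.language !== undefined) fields.language = config.language
  return fields
}
```

Each resulting field would be appended to a `FormData` body alongside the audio file itself.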

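Dependencies
- @chengsokdara/react-hooks-async: asynchronous React hooks
- recordrtc: cross-browser audio recorder
- lamejs: encodes WAV into MP3 for cross-browser support
- @ffmpeg/ffmpeg: for the silence removal feature
- hark: for speaking detection
- axios: since fetch does not work with the Whisper endpoint

Most of these dependencies are lazy loaded, so they are only imported when needed.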

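API
Config Object

Name	Type	Default Value	Description
apiKey	string	''	Your OpenAI API token
autoStart	boolean	false	Automatically start speech recording when the component mounts
autoTranscribe	boolean	true	Automatically transcribe after recording stops
mode	string	transcriptions	Whisper mode, either transcriptions or translations; translation currently supports English output only
nonStop	boolean	false	If true, recording stops automatically after stopTimeout; however, if the user keeps speaking, the recorder continues recording
removeSilence	boolean	false	Remove silence before sending the file to the OpenAI API
stopTimeout	number	5,000 ms	Required when nonStop is true; controls when the recorder stops automatically
streaming	boolean	false	Transcribe speech in real time based on timeSlice
timeSlice	number	1,000 ms	Interval between each onDataAvailable event
whisperConfig	WhisperApiConfig	undefined	Whisper API transcription configuration
onDataAvailable	(blob: Blob) => void	undefined	Callback that receives the recorded blob at each timeSlice interval
onTranscribe	(blob: Blob) => Promise<Transcript>	undefined	Callback to handle transcription on your own custom server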
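
The defaults in the table above can be summarized in code. The sketch below is an assumption-laden illustration rather than the library's own types: `UseWhisperOptions`, `DEFAULTS`, and `withDefaults` are hypothetical names, the callback and `whisperConfig` options are omitted for brevity, and the values are copied from the table.

```typescript
// Hypothetical option type reconstructed from the Config Object table.
interface UseWhisperOptions {
  apiKey?: string
  autoStart?: boolean
  autoTranscribe?: boolean
  mode?: 'transcriptions' | 'translations'
  nonStop?: boolean
  removeSilence?: boolean
  stopTimeout?: number // milliseconds
  streaming?: boolean
  timeSlice?: number // milliseconds
}

// Default values as listed in the table.
const DEFAULTS: Required<UseWhisperOptions> = {
  apiKey: '',
  autoStart: false,
  autoTranscribe: true,
  mode: 'transcriptions',
  nonStop: false,
  removeSilence: false,
  stopTimeout: 5_000,
  streaming: false,
  timeSlice: 1_000,
}

// Fill in every option the caller left undefined.
function withDefaults(
  options: UseWhisperOptions
): Required<UseWhisperOptions> {
  return {
    apiKey: options.apiKey ?? DEFAULTS.apiKey,
    autoStart: options.autoStart ?? DEFAULTS.autoStart,
    autoTranscribe: options.autoTranscribe ?? DEFAULTS.autoTranscribe,
    mode: options.mode ?? DEFAULTS.mode,
    nonStop: options.nonStop ?? DEFAULTS.nonStop,
    removeSilence: options.removeSilence ?? DEFAULTS.removeSilence,
    stopTimeout: options.stopTimeout ?? DEFAULTS.stopTimeout,
    streaming: options.streaming ?? DEFAULTS.streaming,
    timeSlice: options.timeSlice ?? DEFAULTS.timeSlice,
  }
}
```

For example, `withDefaults({ nonStop: true })` keeps the 5,000 ms `stopTimeout` from the table while switching on non-stop recording.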

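WhisperApiConfig

Name	Type	Default Value	Description
prompt	string	undefined	Optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
response_format	string	json	The format of the transcript output, one of: json, text, srt, verbose_json, or vtt.
temperature	number	0	The sampling temperature, between 0 and 1. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. If set to 0, the model uses log probability to automatically increase the temperature until certain thresholds are hit.
language	string	en	The language of the input audio. Supplying the input language in ISO-639-1 format improves accuracy and latency.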

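Return Object

Name	Type	Description
recording	boolean	Speech recording state
speaking	boolean	Whether the user is currently detected as speaking
transcribing	boolean	True while silence is being removed from the speech and the request is being sent to the OpenAI Whisper API
transcript	Transcript	Object returned after Whisper transcription completes
pauseRecording	Promise	Pause speech recording
startRecording	Promise	Start speech recording
stopRecording	Promise	Stop speech recording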
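
The three boolean flags in the return object can be combined into a single status line for the UI. The following is a small sketch of one way to do that. `WhisperFlags` and `statusLabel` are hypothetical names, and giving `transcribing` priority over `recording` is an assumption made for illustration.

```typescript
// Hypothetical subset of the hook's return object.
interface WhisperFlags {
  recording: boolean
  speaking: boolean
  transcribing: boolean
}

// Derive one human-readable status from the three flags.
function statusLabel(flags: WhisperFlags): string {
  if (flags.transcribing) {
    return 'Transcribing'
  }
  if (flags.recording) {
    return flags.speaking ? 'Recording (speech detected)' : 'Recording (silence)'
  }
  return 'Idle'
}
```

Inside a component this could be rendered as `<p>{statusLabel({ recording, speaking, transcribing })}</p>`.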