use-whisper

useWhisper

A React Hook for the OpenAI Whisper API with a built-in speech recorder, real-time transcription, and silence removal.

Demo
Real-time transcription demo

use-whisper-real-time-transcription.mp4

Announcement
useWhisper for React Native is under development.

Repository: https://github.com/chengsokdara/use-whisper-native

Progress: chengsokdara/use-whisper-native#1

Install

 npm i @chengsokdara/use-whisper

 yarn add @chengsokdara/use-whisper

Usage

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const {
    recording,
    speaking,
    transcribing,
    transcript,
    pauseRecording,
    startRecording,
    stopRecording,
  } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
  })

  // React renders nothing for bare booleans, so convert the flags to strings
  return (
    <div>
      <p>Recording: {String(recording)}</p>
      <p>Speaking: {String(speaking)}</p>
      <p>Transcribing: {String(transcribing)}</p>
      <p>Transcribed Text: {transcript.text}</p>
      <button onClick={() => startRecording()}>Start</button>
      <button onClick={() => pauseRecording()}>Pause</button>
      <button onClick={() => stopRecording()}>Stop</button>
    </div>
  )
}

Custom server (keep your OpenAI API token secure)

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  /**
   * you have more control like this
   * do whatever you want with the recorded speech
   * send it to your own custom server
   * and return the response back to useWhisper
   */
  // must be async because the body awaits the file read and the request
  const onTranscribe = async (blob: Blob) => {
    // encode the recorded audio as a base64 data URL
    const base64 = await new Promise<string | ArrayBuffer | null>(
      (resolve) => {
        const reader = new FileReader()
        reader.onloadend = () => resolve(reader.result)
        reader.readAsDataURL(blob)
      }
    )
    const body = JSON.stringify({ file: base64, model: 'whisper-1' })
    const headers = { 'Content-Type': 'application/json' }
    // axios is imported lazily, only when a transcription is requested
    const { default: axios } = await import('axios')
    const response = await axios.post('/api/whisper', body, {
      headers,
    })
    const { text } = response.data
    // you must return result from your server in Transcript format
    return {
      blob,
      text,
    }
  }

  const { transcript } = useWhisper({
    // callback to handle transcription with custom server
    onTranscribe,
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}
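
The `/api/whisper` route itself is not shown above. As a minimal sketch of one piece of it, assuming the server receives the `file` field exactly as `readAsDataURL` produced it, the hypothetical helper `parseDataUrl` below separates the MIME type from the base64 payload. Neither the helper nor the `ParsedDataUrl` type is part of use-whisper.

```typescript
// Hypothetical server-side helper; not part of use-whisper.
interface ParsedDataUrl {
  mimeType: string // e.g. "audio/mpeg" or "audio/webm;codecs=opus"
  base64: string // raw base64 payload without the "data:...;base64," prefix
}

function parseDataUrl(dataUrl: string): ParsedDataUrl {
  const comma = dataUrl.indexOf(',')
  if (!dataUrl.startsWith('data:') || comma === -1) {
    throw new Error('Expected a data URL such as "data:audio/mpeg;base64,..."')
  }
  // Everything between "data:" and the first comma describes the payload
  const header = dataUrl.slice('data:'.length, comma)
  const suffix = ';base64'
  if (!header.endsWith(suffix)) {
    throw new Error('Expected a base64-encoded data URL')
  }
  return {
    mimeType: header.slice(0, header.length - suffix.length),
    base64: dataUrl.slice(comma + 1),
  }
}
```

On a Node server, `Buffer.from(parsed.base64, 'base64')` would then turn the payload back into bytes before it is forwarded to the Whisper endpoint.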

Examples
Real-time streaming transcription

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    streaming: true,
    timeSlice: 1_000, // 1 second
    whisperConfig: {
      language: 'en',
    },
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}

Remove silence before sending to Whisper to save cost

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    // use ffmpeg-wasm to remove silence from recorded speech
    removeSilence: true,
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}
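
To make the cost argument concrete, here is a rough sketch that assumes billing is proportional to audio duration. The function names and the price passed in are illustrative assumptions, not values from use-whisper; check the current OpenAI pricing before relying on any figure.

```typescript
// Illustrative only: the price is a caller-supplied assumption.
function estimateCostUsd(audioSeconds: number, pricePerMinuteUsd: number): number {
  if (audioSeconds < 0 || pricePerMinuteUsd < 0) {
    throw new RangeError('Duration and price must be non-negative')
  }
  return (audioSeconds / 60) * pricePerMinuteUsd
}

// Difference in cost between sending the full recording and sending it with silence removed
function estimateSavingUsd(
  totalSeconds: number,
  silentSeconds: number,
  pricePerMinuteUsd: number
): number {
  const spokenSeconds = Math.max(0, totalSeconds - silentSeconds)
  return (
    estimateCostUsd(totalSeconds, pricePerMinuteUsd) -
    estimateCostUsd(spokenSeconds, pricePerMinuteUsd)
  )
}
```

Under this assumption, a recording that is half silence costs half as much to transcribe once the silence is removed.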

Auto start recording on component mount

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    // will auto start recording speech upon component mounted
    autoStart: true,
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}

Keep recording as long as the user is speaking

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    nonStop: true, // keep recording as long as the user is speaking
    stopTimeout: 5000, // auto stop after 5 seconds
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}
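
The interaction between `nonStop` and `stopTimeout` can be pictured with a toy model. The sketch below is not use-whisper's internal code; it assumes the countdown starts when recording starts, is cancelled whenever speech is detected, and restarts when speech ends. `SpeechEvent` and `autoStopTime` are hypothetical names.

```typescript
// Toy model of the described behaviour; not the library's implementation.
interface SpeechEvent {
  atMs: number // time of the event, in ms since recording started
  speaking: boolean // true = user started speaking, false = user stopped
}

// Returns when recording would auto-stop, or null if the user is still speaking
function autoStopTime(events: SpeechEvent[], stopTimeoutMs: number): number | null {
  // The countdown is armed as soon as recording starts
  let deadline: number | null = stopTimeoutMs
  for (const event of events) {
    // The deadline passed before this event happened, so recording already stopped
    if (deadline !== null && event.atMs >= deadline) return deadline
    // Speech cancels the countdown; silence restarts it
    deadline = event.speaking ? null : event.atMs + stopTimeoutMs
  }
  return deadline
}
```

With `stopTimeout: 5000`, a user who speaks from second 1 to second 9 would, in this model, be recorded until second 14.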

Customize Whisper API config when autoTranscribe is true

import { useWhisper } from '@chengsokdara/use-whisper'

const App = () => {
  const { transcript } = useWhisper({
    apiKey: process.env.OPENAI_API_TOKEN, // YOUR_OPEN_AI_TOKEN
    autoTranscribe: true,
    whisperConfig: {
      prompt: 'previous conversation', // you can pass previous conversation for context
      response_format: 'text', // output text instead of json
      temperature: 0.8, // random output
      language: 'es', // Spanish
    },
  })

  return (
    <div>
      <p>{transcript.text}</p>
    </div>
  )
}

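Dependencies
- @chengsokdara/react-hooks-async: asynchronous React hooks
- recordrtc: cross-browser audio recorder
- lamejs: encodes WAV into MP3 for cross-browser support
- @ffmpeg/ffmpeg: for the silence removal feature
- hark: for speaking detection
- axios: since fetch does not work with the Whisper endpoint

Most of these dependencies are lazy loaded, so they are imported only when needed.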
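
Lazy loading of this kind is typically built on dynamic `import()`, as the custom server example does for axios. A minimal sketch of the general pattern follows, assuming a small memoising wrapper; `lazy` is an illustrative name, not a use-whisper export.

```typescript
// Generic memoised loader; illustrative, not part of use-whisper.
function lazy<T>(loader: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined
  return () => {
    // Start loading on first use and reuse the same promise afterwards
    if (cached === undefined) cached = loader()
    return cached
  }
}

// Example (commented out so the sketch has no dependencies):
// const loadAxios = lazy(() => import('axios').then((m) => m.default))
// const axios = await loadAxios()
```

The module is fetched only when the returned function is first called, and later calls share the same promise.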

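API
Config Object

Name	Type	Default Value	Description
apiKey	string	''	Your OpenAI API token
autoStart	boolean	false	Auto start speech recording on component mount
autoTranscribe	boolean	true	Whether to transcribe automatically after recording stops
mode	string	transcriptions	Controls the Whisper mode, either transcriptions or translations. Translation is currently supported only into English.
nonStop	boolean	false	If true, recording auto-stops after stopTimeout. However, if the user keeps speaking, the recorder continues recording.
removeSilence	boolean	false	Remove silence before sending the file to the OpenAI API
stopTimeout	number	5,000 ms	Required when nonStop is true. Controls when the recorder auto-stops.
streaming	boolean	false	Transcribe speech in real time based on timeSlice
timeSlice	number	1,000 ms	Interval between each onDataAvailable event
whisperConfig	WhisperApiConfig	undefined	Whisper API transcription configuration
onDataAvailable	(blob: Blob) => void	undefined	Callback function for getting the recorded blob at each timeSlice interval
onTranscribe	(blob: Blob) => Promise<Transcript>	undefined	Callback function to handle transcription on your own custom server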
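
The defaults in the table can be read as an object that caller options are merged over. The sketch below restates them that way; `WhisperOptions`, `DEFAULTS`, and `resolveOptions` are hypothetical names for illustration, and the hook's real merging logic may differ.

```typescript
// Restates the documented defaults; the names here are illustrative.
interface WhisperOptions {
  apiKey: string
  autoStart: boolean
  autoTranscribe: boolean
  mode: 'transcriptions' | 'translations'
  nonStop: boolean
  removeSilence: boolean
  stopTimeout: number // milliseconds
  streaming: boolean
  timeSlice: number // milliseconds
}

const DEFAULTS: WhisperOptions = {
  apiKey: '',
  autoStart: false,
  autoTranscribe: true,
  mode: 'transcriptions',
  nonStop: false,
  removeSilence: false,
  stopTimeout: 5000,
  streaming: false,
  timeSlice: 1000,
}

// Caller-supplied values win; everything else falls back to the defaults
function resolveOptions(overrides: Partial<WhisperOptions> = {}): WhisperOptions {
  return { ...DEFAULTS, ...overrides }
}
```

For example, `resolveOptions({ streaming: true })` keeps `timeSlice` at 1000 while turning streaming on.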

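WhisperApiConfig

Name	Type	Default Value	Description
prompt	string	undefined	Optional text to guide the model's style or continue a previous audio segment. The prompt should match the audio language.
response_format	string	json	The format of the transcript output: json, text, srt, verbose_json, or vtt.
temperature	number	0	The sampling temperature, between 0 and 1. Higher values like 0.8 make the output more random, while lower values like 0.2 make it more focused and deterministic. If set to 0, the model uses log probability to automatically increase the temperature until certain thresholds are hit.
language	string	en	The language of the input audio. Supplying the input language in ISO-639-1 format improves accuracy and latency.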
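
The constraints in this table can be checked before a request is sent. The sketch below redeclares the config shape locally from the table and adds a hypothetical `validateWhisperConfig` helper; the library does not necessarily perform these checks itself.

```typescript
// Field names follow the table above; the validator is hypothetical.
interface WhisperApiConfig {
  prompt?: string
  response_format?: string
  temperature?: number
  language?: string
}

const RESPONSE_FORMATS = ['json', 'text', 'srt', 'verbose_json', 'vtt']

// Returns a list of human-readable problems; an empty list means the config looks valid
function validateWhisperConfig(config: WhisperApiConfig): string[] {
  const problems: string[] = []
  const { response_format, temperature, language } = config
  if (response_format !== undefined && RESPONSE_FORMATS.indexOf(response_format) === -1) {
    problems.push('response_format must be one of: ' + RESPONSE_FORMATS.join(', '))
  }
  // Written as a negated range check so that NaN is rejected too
  if (temperature !== undefined && !(temperature >= 0 && temperature <= 1)) {
    problems.push('temperature must be between 0 and 1')
  }
  if (language !== undefined && !/^[a-z]{2}$/.test(language)) {
    problems.push('language should be a two-letter ISO-639-1 code such as "en"')
  }
  return problems
}
```

Calling it with `{ temperature: 1.5, language: 'spanish' }` reports two problems, while the config from the earlier Spanish example passes.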

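Return Object

Name	Type	Description
recording	boolean	Speech recording state
speaking	boolean	Detects when the user is speaking
transcribing	boolean	True while silence is being removed from the speech and the request is being sent to the OpenAI Whisper API
transcript	Transcript	Object returned after Whisper transcription completes
pauseRecording	Promise	Pause speech recording
startRecording	Promise	Start speech recording
stopRecording	Promise	Stop speech recording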
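
The three boolean flags can be folded into a single label for display. This is a small sketch under the assumption that `transcribing` should take priority over the recording flags; `WhisperStatus` and `statusLabel` are hypothetical names, not part of the hook.

```typescript
// Hypothetical UI helper built on the documented boolean flags.
interface WhisperStatus {
  recording: boolean
  speaking: boolean
  transcribing: boolean
}

function statusLabel({ recording, speaking, transcribing }: WhisperStatus): string {
  if (transcribing) return 'Transcribing...'
  if (recording && speaking) return 'Recording (speech detected)'
  if (recording) return 'Recording (waiting for speech)'
  return 'Idle'
}
```

In a component, `statusLabel({ recording, speaking, transcribing })` could be rendered directly inside a `<p>` element.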