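OpenAI Realtime Console

The OpenAI Realtime Console is intended as an inspector and interactive API reference for the OpenAI Realtime API. It comes packaged with two utility libraries: openai/openai-realtime-api-beta, which acts as a reference client (for the browser and Node.js), and /src/lib/wavtools, which allows for simple audio management in the browser.

Starting the console

This is a React project created with create-react-app and bundled via Webpack. Extract the contents of this package and install it with: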

$ npm i

Start your server with:

$ npm start

It should be available at localhost:3000.

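Using the console

The console requires an OpenAI API key (user key or project key) that has access to the Realtime API. You will be prompted to enter it on startup. It is saved via localStorage and can be changed at any time from the UI.

To start a session you need to connect. This requires microphone access. You can then choose between manual (push-to-talk) and vad (voice activity detection) conversation modes, and switch between them at any time.

Two functions are enabled:

get_weather: Ask for the weather anywhere and the model will do its best to pinpoint the location, show it on a map, and get the weather for that location. Note that it does not have location access, and coordinates are "guessed" from the model's training data, so accuracy might not be perfect.
set_memory: You can ask the model to remember information for you, and it will store it in a JSON blob on the left.

You can freely interrupt the model at any time in push-to-talk or VAD mode.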

Using a relay server

If you would like to build a more robust implementation and play around with the reference client using your own server, a Node.js relay server is included.

$ npm run relay

It will start automatically on localhost:8081.

You will need to create a .env file with the following configuration:

OPENAI_API_KEY=YOUR_API_KEY
REACT_APP_LOCAL_RELAY_SERVER_URL=http://localhost:8081

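Restart both the React app and the relay server for the .env changes to take effect. The local server URL is loaded via ConsolePage.tsx. To stop using the relay server at any time, delete the environment variable or set it to an empty string.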

/**
 * Running a local relay server will allow you to hide your API key
 * and run custom logic on the server
 *
 * Set the local relay server address to:
 * REACT_APP_LOCAL_RELAY_SERVER_URL=http://localhost:8081
 *
 * This will also require you to set OPENAI_API_KEY= in a `.env` file
 * You can run it with `npm run relay`, in parallel with `npm start`
 */
const LOCAL_RELAY_SERVER_URL: string =
  process.env.REACT_APP_LOCAL_RELAY_SERVER_URL || '';
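
The switch between relay and direct connection can be sketched as a small helper. This is an illustration only: `buildClientOptions` is a hypothetical name, and it assumes the reference client's constructor accepts a `url` option for a relay and a `dangerouslyAllowAPIKeyInBrowser` flag for direct browser use; check the client's own documentation before relying on either.

```javascript
// Hypothetical helper: choose RealtimeClient constructor options.
// An empty or unset relay URL falls back to a direct connection with the API key.
function buildClientOptions(relayUrl, apiKey) {
  if (relayUrl) {
    // Relay mode: the key stays on the server, so the browser only needs the URL.
    return { url: relayUrl };
  }
  // Direct mode (assumed option names): the key is sent from the browser.
  return { apiKey, dangerouslyAllowAPIKeyInBrowser: true };
}
```

With this shape, deleting REACT_APP_LOCAL_RELAY_SERVER_URL or setting it to an empty string selects the direct branch.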

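This server is only a simple message relay, but it can be extended to:

Hide API credentials if you would like to ship an app to play with online.
Handle certain calls you would like to keep secret (e.g. instructions) directly on the server.
Restrict the types of events the client can receive and send.

You will have to implement these features yourself.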
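
As one illustration of restricting event types, a relay could check each client message against an allow-list before forwarding it. The sketch below is not part of the included relay server: `ALLOWED_CLIENT_EVENTS` and `shouldForward` are hypothetical names, and the listed event types are an example selection that deliberately omits `session.update` so that instructions can only be set server-side.

```javascript
// Hypothetical allow-list: client event types the relay is willing to forward.
const ALLOWED_CLIENT_EVENTS = new Set([
  'input_audio_buffer.append',
  'input_audio_buffer.commit',
  'response.create',
  'response.cancel',
]);

// Returns true only for well-formed JSON messages whose `type` is allow-listed.
function shouldForward(rawMessage, allowed = ALLOWED_CLIENT_EVENTS) {
  let event;
  try {
    event = JSON.parse(rawMessage);
  } catch {
    return false; // not valid JSON: drop it
  }
  return Boolean(event) && typeof event.type === 'string' && allowed.has(event.type);
}
```

A relay would call `shouldForward` on each incoming WebSocket message and silently drop (or log) anything that returns false.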

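Realtime API reference client

The latest reference client and documentation are available on GitHub at openai/openai-realtime-api-beta.

You can use this client directly in any React (front-end) or Node.js project. For full documentation, refer to the GitHub repository, but you can use the guide here as a primer to get started.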

import { RealtimeClient } from '/src/lib/realtime-api-beta/index.js';

const client = new RealtimeClient({ apiKey: process.env.OPENAI_API_KEY });

// Can set parameters ahead of connecting
client.updateSession({ instructions: 'You are a great, upbeat friend.' });
client.updateSession({ voice: 'alloy' });
client.updateSession({ turn_detection: 'server_vad' });
client.updateSession({ input_audio_transcription: { model: 'whisper-1' } });

// Set up event handling
client.on('conversation.updated', ({ item, delta }) => {
  const items = client.conversation.getItems(); // can use this to render all items
  /* includes all changes to conversations, delta may be populated */
});

// Connect to Realtime API
await client.connect();

// Send an item and trigger a generation
client.sendUserMessageContent([{ type: 'text', text: `How are you?` }]);

Sending streaming audio

To send streaming audio, use the .appendInputAudio() method. If you are in turn_detection: 'disabled' mode, you need to call .createResponse() to tell the model to respond.

// Send user audio, must be Int16Array or ArrayBuffer
// Default audio format is pcm16 with sample rate of 24,000 Hz
// This populates 1s of noise in 0.1s chunks
for (let i = 0; i < 10; i++) {
  const data = new Int16Array(2400);
  for (let n = 0; n < 2400; n++) {
    const value = Math.floor((Math.random() * 2 - 1) * 0x8000);
    data[n] = value;
  }
  client.appendInputAudio(data);
}
// Pending audio is committed and model is asked to generate
client.createResponse();
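
Audio from other sources often arrives as Float32 samples in the range [-1, 1] rather than Int16. The following is a minimal sketch of converting such samples to PCM16 and splitting them into 0.1 s chunks at 24,000 Hz; `floatTo16BitPCM` and `chunkSamples` are helper names introduced here for illustration, not part of the reference client.

```javascript
// Convert Float32 samples in [-1, 1] to 16-bit signed PCM.
function floatTo16BitPCM(float32) {
  const out = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    const s = Math.max(-1, Math.min(1, float32[i])); // clamp to avoid overflow
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}

// Split PCM data into fixed-size chunks (2400 samples = 0.1 s at 24,000 Hz).
// slice() copies, so each chunk owns its own underlying buffer.
function chunkSamples(pcm, chunkSize = 2400) {
  const chunks = [];
  for (let start = 0; start < pcm.length; start += chunkSize) {
    chunks.push(pcm.slice(start, start + chunkSize));
  }
  return chunks;
}
```

Each chunk returned by `chunkSamples(floatTo16BitPCM(samples))` could then be passed to `client.appendInputAudio(chunk)` in a loop.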

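Adding and using tools

Working with tools is easy. Call .addTool() and pass a callback as the second parameter. The callback is executed with the tool's parameters, and the result is automatically sent back to the model.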

// We can add tools as well, with callbacks specified
client.addTool(
  {
    name: 'get_weather',
    description:
      'Retrieves the weather for a given lat, lng coordinate pair. Specify a label for the location.',
    parameters: {
      type: 'object',
      properties: {
        lat: {
          type: 'number',
          description: 'Latitude',
        },
        lng: {
          type: 'number',
          description: 'Longitude',
        },
        location: {
          type: 'string',
          description: 'Name of the location',
        },
      },
      required: ['lat', 'lng', 'location'],
    },
  },
  async ({ lat, lng, location }) => {
    const result = await fetch(
      `https://api.open-meteo.com/v1/forecast?latitude=${lat}&longitude=${lng}&current=temperature_2m,wind_speed_10m`
    );
    const json = await result.json();
    return json;
  }
);
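
Because the coordinates passed to the callback are estimated by the model, it can be worth validating them before making the request. The helper below is a sketch under that assumption; `buildWeatherUrl` is a hypothetical name, and it builds the same Open-Meteo URL as the callback above using URLSearchParams.

```javascript
// Hypothetical helper: validate model-supplied coordinates and build the request URL.
function buildWeatherUrl(lat, lng) {
  if (!Number.isFinite(lat) || lat < -90 || lat > 90) {
    throw new RangeError(`Invalid latitude: ${lat}`);
  }
  if (!Number.isFinite(lng) || lng < -180 || lng > 180) {
    throw new RangeError(`Invalid longitude: ${lng}`);
  }
  // URLSearchParams percent-encodes the values (the comma becomes %2C).
  const params = new URLSearchParams({
    latitude: String(lat),
    longitude: String(lng),
    current: 'temperature_2m,wind_speed_10m',
  });
  return `https://api.open-meteo.com/v1/forecast?${params}`;
}
```

Inside the tool callback, `fetch(buildWeatherUrl(lat, lng))` would replace the hand-built template string, and a thrown RangeError could be caught and returned to the model as an error object.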

Interrupting the model

You may want to interrupt the model manually, especially in turn_detection: 'disabled' mode. To do this, use:

// id is the id of the item currently being generated
// sampleCount is the number of audio samples that have been heard by the listener
client.cancelResponse(id, sampleCount);

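This method causes the model to stop generating immediately, and it also truncates the item being played by removing all audio after sampleCount and clearing the text response. Using this method, you can interrupt the model and prevent it from "remembering" anything it generated beyond what the user has actually heard.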
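
In a browser app, the values for id and sampleCount typically come from whatever is playing the audio. The sketch below assumes a player whose interrupt() resolves to an object with trackId and offset fields (as in the WavStreamPlayer quickstart in this document) and assumes audio was queued with the conversation item id as the track id. `interruptPlayback` is a hypothetical helper name.

```javascript
// Stop local playback, then tell the model how much audio was actually heard.
// `player` and `client` are passed in so the sketch does not depend on globals.
async function interruptPlayback(player, client) {
  const trackOffset = await player.interrupt();
  if (!trackOffset || !trackOffset.trackId) {
    return false; // nothing was playing, so there is nothing to cancel
  }
  // Assumption: the track id is the id of the item being generated.
  await client.cancelResponse(trackOffset.trackId, trackOffset.offset);
  return true;
}
```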

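Reference client events

RealtimeClient has five main client events for application control flow. Note that this is only an overview of using the client; the full Realtime API event specification is considerably larger. If you need more control, check out the GitHub repository (openai/openai-realtime-api-beta).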

// errors like connection failures
client.on('error', (event) => {
  // do thing
});

// in VAD mode, the user starts speaking
// we can use this to stop audio playback of a previous response if necessary
client.on('conversation.interrupted', () => {
  /* do something */
});

// includes all changes to conversations
// delta may be populated
client.on('conversation.updated', ({ item, delta }) => {
  // get all items, e.g. if you need to update a chat window
  const items = client.conversation.getItems();
  switch (item.type) {
    case 'message':
      // system, user, or assistant message (item.role)
      break;
    case 'function_call':
      // always a function call from the model
      break;
    case 'function_call_output':
      // always a response from the user / application
      break;
  }
  if (delta) {
    // Only one of the following will be populated for any given event
    // delta.audio = Int16Array, audio added
    // delta.transcript = string, transcript added
    // delta.arguments = string, function arguments added
  }
});

// only triggered after item added to conversation
client.on('conversation.item.appended', ({ item }) => {
  /* item status can be 'in_progress' or 'completed' */
});

// only triggered after item completed in conversation
// will always be triggered after conversation.item.appended
client.on('conversation.item.completed', ({ item }) => {
  /* item status will always be 'completed' */
});
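
To illustrate how the delta payload might be consumed, here is a minimal sketch that accumulates streamed transcript text per item. It assumes each item carries an `id` field; `applyTranscriptDelta` is a hypothetical helper name, and the Map is owned by the caller.

```javascript
// Accumulate streamed transcript text, keyed by item id.
// `transcripts` is a Map<string, string> owned by the caller.
function applyTranscriptDelta(transcripts, item, delta) {
  if (delta && typeof delta.transcript === 'string') {
    const previous = transcripts.get(item.id) || '';
    transcripts.set(item.id, previous + delta.transcript);
  }
  return transcripts;
}
```

It could be called from a 'conversation.updated' handler as `applyTranscriptDelta(transcripts, item, delta)`; events that carry audio or function arguments instead of a transcript leave the Map unchanged.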

Wavtools

Wavtools makes it easy to manage PCM16 audio streams in the browser, for both recording and playback.

WavRecorder quickstart

import { WavRecorder } from '/src/lib/wavtools/index.js';

const wavRecorder = new WavRecorder({ sampleRate: 24000 });
wavRecorder.getStatus(); // "ended"

// request permissions, connect microphone
await wavRecorder.begin();
wavRecorder.getStatus(); // "paused"

// Start recording
// This callback will be triggered in chunks of 8192 samples by default
// { mono, raw } are Int16Array (PCM16) mono & full channel data
await wavRecorder.record((data) => {
  const { mono, raw } = data;
});
wavRecorder.getStatus(); // "recording"

// Stop recording
await wavRecorder.pause();
wavRecorder.getStatus(); // "paused"

// outputs "audio/wav" audio file
const audio = await wavRecorder.save();

// clears current audio buffer and starts recording
await wavRecorder.clear();
await wavRecorder.record();

// get data for visualization
const frequencyData = wavRecorder.getFrequencies();

// Stop recording, disconnects microphone, output file
await wavRecorder.pause();
const finalAudio = await wavRecorder.end();

// Listen for device change; e.g. if somebody disconnects a microphone
// deviceList is array of MediaDeviceInfo[] + `default` property
wavRecorder.listenForDeviceChange((deviceList) => {});
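
The record() callback is a natural place to forward microphone audio to the Realtime client. The following is a sketch only: `streamMicrophone` is a hypothetical helper, and it assumes the recorder has already been started with begin() and the client is connected.

```javascript
// Forward each recorded mono chunk to the Realtime client as it arrives.
// `recorder` and `client` are passed in so the sketch does not depend on globals.
async function streamMicrophone(recorder, client) {
  await recorder.record((data) => {
    // data.mono is PCM16 (Int16Array), the format appendInputAudio() expects
    client.appendInputAudio(data.mono);
  });
}
```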

WavStreamPlayer quickstart

import { WavStreamPlayer } from '/src/lib/wavtools/index.js';

const wavStreamPlayer = new WavStreamPlayer({ sampleRate: 24000 });

// Connect to audio output
await wavStreamPlayer.connect();

// Create 1s of empty PCM16 audio
const audio = new Int16Array(24000);
// Queue 3s of audio, will start playing immediately
wavStreamPlayer.add16BitPCM(audio, 'my-track');
wavStreamPlayer.add16BitPCM(audio, 'my-track');
wavStreamPlayer.add16BitPCM(audio, 'my-track');

// get data for visualization
const frequencyData = wavStreamPlayer.getFrequencies();

// Interrupt the audio (halt playback) at any time
// To restart, need to call .add16BitPCM() again
const trackOffset = await wavStreamPlayer.interrupt();
trackOffset.trackId; // "my-track"
trackOffset.offset; // sample number
trackOffset.currentTime; // time in track
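
To play model audio as it streams in, the player can be fed from the client's 'conversation.updated' event. This is a minimal sketch: `makePlaybackHandler` is a hypothetical helper, and using the item id as the track id is an assumption made here so that interrupt() can report which item was playing.

```javascript
// Build a 'conversation.updated' handler that queues incoming audio for playback.
function makePlaybackHandler(player) {
  return ({ item, delta }) => {
    if (delta && delta.audio) {
      // Assumption: one track per conversation item, keyed by item id.
      player.add16BitPCM(delta.audio, item.id);
    }
  };
}
```

It would be registered with `client.on('conversation.updated', makePlaybackHandler(wavStreamPlayer))`.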

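Acknowledgements and contact

Thanks for checking out the Realtime Console. We hope you have fun with the Realtime API. Special thanks to the whole Realtime API team for making this possible. Please feel free to reach out, ask questions, or give feedback by creating an issue on the repository. You can also reach out directly and let us know what you think!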