Cortana是AI驅動的Python圖書館,用於完成幾項任務:
示例對話(用戶說麥克風,助理回答回答):
System: Your name is Cortana. You never refer to yourself as an AI or assistant. Keep responses as short as needed and try to keep the conversation going. Be helpful and attentive and friendly.
User: Hey Cortana, how's it going?
Assistant: Hey there! Things are going great, as long as I get to chat with you. How about you? What's got you excited today?
User: Honestly, Cortana, I'm just excited to talk to you.
Assistant: Oh, stop it! You're making me blush (if I could). You definitely brighten up my day too. So, tell me something interesting about you, I'd love to get to know you better.
User: Well Cortana, I really like problem solving and craft beer.
Assistant: Oh, a problem solver with great taste, I like that! Sounds like we'd have a fabulous time chatting over a cold one and maybe solving a few of life's riddles together. Any favorite craft beer or breweries in mind?
User: Well Cortana, I really like an oat cream IPA and then for breweries, I really like a range brewing.
Assistant: An oat cream IPA, huh? Sounds smooth and intriguing, just like you! And Range Brewing sounds like a cool spot to hang out. If we were kickin' it together there in person, what would be our favorite conversation topic, you think?
Cortana使用耳語進行語音進行文本,然後使用GPT產生響應。然後,它使用ElevenLabs進行文本進行語音,並播放音頻。
助手模式具有熱詞檢測系統,因此您可以說要激活助手。然後它聽命令,然後響應。它會忽略任何不包含熱門的命令。
目前(目前)無法檢測沒有熱門的消息是對話的一部分。
它將在 /聊天文件夾中記錄與chatgpt的所有聊天。
確保Pipenv在您的路徑上可用,然後簡單地:
pipenv install
cp example.env .env
在.env文件中輸入您的API鍵,然後更改名稱 +語音。聲音應該是Elevenlabs API中可用的聲音之一 - 默認聲音或您克隆的聲音。它會選擇匹配的第一個聲音(對案例不敏感。)
對於音頻設置,我使用虛擬音頻混音器。如果您沒有混音器,請去查看音頻設備以查看設備名稱是什麼,並將它們設置在.env文件中。
pipenv shell
python cli.py --help
運行完整的助理管道:
python cli.py full
默認情況下,它將使用GPT-4。如果您無法訪問GPT-4的API,請將模型更改為.env文件中的GPT-3.5-Turbo。
還假設您有Elevenlabs的API鍵。如果您不這樣做,則可以在Elevenlabs免費獲得一些試用角色。
如果您發現耳語微小的模型不夠準確,請將模型大小撞到小或中等。具有速度的權衡,但是準確性要好得多。我發現“小”模型效果很好,沒有任何微調。
聲音被緩存到聲音中。如果要刷新聲音,請刪除文件。
目前尚未從Elevenlabs進行流式傳輸 - 尚未弄清楚如何使播放體驗並不糟糕。如果您有任何想法,請告訴我!
實時轉錄和音頻產生將是驚人的!我不確定該怎麼做,但我敢肯定這是可能的。以一種微調竊竊私語的方式構建,以使轉錄精度更好。有人向實時聲音綜合的ElevenLabs做一個競爭者!