This is a no-nonsense async Scala client for the OpenAI API supporting all the available endpoints and params, including streaming, the newest chat completion, vision, and voice routines (as defined here), provided in a single, convenient service called OpenAIService. The supported calls are:
Note that, to be consistent with the OpenAI API naming, the service function names exactly match the API endpoint titles/descriptions, written in camelCase.
We also aimed for the lib to be self-contained with the fewest dependencies possible; therefore, we ended up using only two libs, play-ahc-ws-standalone and play-ws-standalone-json (at the top level). Additionally, if dependency injection is required, we use the scala-guice lib as well.
No time to read a lengthy tutorial? Sure, we hear you! Check out the examples to see how to use the lib in practice.
In addition to the OpenAI API, this library also supports API-compatible providers (see examples) such as:
For background information read an article about the lib/client on Medium.
Also try out our Scala client for the Pinecone vector database, or use both clients together! This demo project shows how to generate and store OpenAI embeddings (with the text-embedding-ada-002 model) into Pinecone and query them afterward. The OpenAI + Pinecone combo is commonly used for autonomous AI agents, such as babyAGI and AutoGPT.
✔️ Important: this is a "community-maintained" library and, as such, has no relation to the OpenAI company.
The currently supported Scala versions are 2.12, 2.13, and 3.
To install the library, add the following dependency to your build.sbt
"io.cequence" %% "openai-scala-client" % "1.1.0"
or to pom.xml (if you use Maven)
<dependency>
<groupId>io.cequence</groupId>
<artifactId>openai-scala-client_2.12</artifactId>
<version>1.1.0</version>
</dependency>
If you want streaming support, use "io.cequence" %% "openai-scala-client-stream" % "1.1.0" instead.
Set the env. variable OPENAI_SCALA_CLIENT_API_KEY and optionally also OPENAI_SCALA_CLIENT_ORG_ID (if you have one).
I. Obtaining OpenAIService
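First you need to provide an implicit execution context as well as an Akka materializer, e.g.:
import akka.actor.ActorSystem
import akka.stream.Materializer
implicit val ec: scala.concurrent.ExecutionContext = scala.concurrent.ExecutionContext.global
implicit val materializer: Materializer = Materializer(ActorSystem())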
Then you can obtain a service in one of the following ways.
Default config (expects the env. variables to be set as described in the Config section):
val service = OpenAIServiceFactory()
val config = ConfigFactory.load("path_to_my_custom_config")
val service = OpenAIServiceFactory(config)
val service = OpenAIServiceFactory(
apiKey = "your_api_key",
orgId = Some("your_org_id") // if you have one
)
val service = OpenAIServiceFactory.forAzureWithApiKey(
resourceName = "your-resource-name",
deploymentId = "your-deployment-id", // usually model name such as "gpt-35-turbo"
apiVersion = "2023-05-15", // newest version
apiKey = "your_api_key"
)
OpenAICoreService, supporting the listModels, createCompletion, createChatCompletion, and createEmbeddings calls, provided e.g. by a FastChat service running on port 8000:
val service = OpenAICoreServiceFactory("http://localhost:8000/v1/")
OpenAIChatCompletionService, providing solely createChatCompletion:
val service = OpenAIChatCompletionServiceFactory.forAzureAI(
endpoint = sys.env("AZURE_AI_COHERE_R_PLUS_ENDPOINT"),
region = sys.env("AZURE_AI_COHERE_R_PLUS_REGION"),
accessToken = sys.env("AZURE_AI_COHERE_R_PLUS_ACCESS_KEY")
)
Anthropic (requires the openai-scala-anthropic-client lib and ANTHROPIC_API_KEY):
val service = AnthropicServiceFactory.asOpenAI()
Google Vertex AI (requires the openai-scala-google-vertexai-client lib and VERTEXAI_LOCATION + VERTEXAI_PROJECT_ID):
val service = VertexAIServiceFactory.asOpenAI()
Groq (requires GROQ_API_KEY):
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.groq)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.groq)
Grok (requires GROK_API_KEY):
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.grok)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.grok)
Fireworks AI (requires FIREWORKS_API_KEY):
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.fireworks)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.fireworks)
OctoAI (requires OCTOAI_TOKEN):
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.octoML)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.octoML)
TogetherAI (requires TOGETHERAI_API_KEY):
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.togetherAI)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.togetherAI)
Cerebras (requires CEREBRAS_API_KEY):
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.cerebras)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.cerebras)
Mistral (requires MISTRAL_API_KEY):
val service = OpenAIChatCompletionServiceFactory(ChatProviderSettings.mistral)
// or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(ChatProviderSettings.mistral)
val service = OpenAIChatCompletionServiceFactory(
coreUrl = "http://localhost:11434/v1/"
)
or with streaming
val service = OpenAIChatCompletionServiceFactory.withStreaming(
coreUrl = "http://localhost:11434/v1/"
)
createCompletionStreamed and createChatCompletionStreamed, provided by OpenAIStreamedServiceExtra (requires the openai-scala-client-stream lib):
import io.cequence.openaiscala.service.StreamedServiceTypes.OpenAIStreamedService
import io.cequence.openaiscala.service.OpenAIStreamedServiceImplicits._
val service: OpenAIStreamedService = OpenAIServiceFactory.withStreaming()
similarly for a chat-completion service
import io.cequence.openaiscala.service.OpenAIStreamedServiceImplicits._
val service = OpenAIChatCompletionServiceFactory.withStreaming(
coreUrl = "https://api.fireworks.ai/inference/v1/",
authHeaders = Seq(("Authorization", s"Bearer ${sys.env("FIREWORKS_API_KEY")}"))
)
or only if streaming is required
val service: OpenAIChatCompletionStreamedServiceExtra =
OpenAIChatCompletionStreamedServiceFactory(
coreUrl = "https://api.fireworks.ai/inference/v1/",
authHeaders = Seq(("Authorization", s"Bearer ${sys.env("FIREWORKS_API_KEY")}"))
)
Via dependency injection (requires the openai-scala-guice lib):
class MyClass @Inject() (openAIService: OpenAIService) {...}
II. Calling functions
Full documentation of each call with its respective inputs and settings is provided in OpenAIService. Since all the calls are async, they return responses wrapped in Future.
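Since every call returns a Future, the result has to be either transformed without blocking or awaited explicitly. Below is a minimal sketch of both options using only the Scala standard library. Note that fakeListModels is a hypothetical stand-in that only mimics the shape of a call such as service.listModels; it is not part of the lib.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object FutureHandlingSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-in for a service call such as service.listModels;
  // it only mimics the "result wrapped in a Future" shape.
  def fakeListModels(): Future[Seq[String]] =
    Future(Seq("model-a", "model-b"))

  // Non-blocking: transform the result and fall back to 0 on failure.
  def modelCount(): Future[Int] =
    fakeListModels()
      .map(_.size)
      .recover { case _: Exception => 0 }

  // Blocking: acceptable in a script or a test, best avoided in server code.
  def modelCountBlocking(): Int =
    Await.result(modelCount(), 10.seconds)
}
```

In application code the non-blocking variant is usually preferable; blocking with Await is mostly handy in scripts and tests.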
There is a new project openai-scala-client-examples where you can find a lot of ready-to-use examples!
service.listModels.map(models =>
models.foreach(println)
)
service.retrieveModel(ModelId.text_davinci_003).map(model =>
println(model.getOrElse("N/A"))
)
val text = """Extract the name and mailing address from this email:
|Dear Kelly,
|It was great to talk to you at the seminar. I thought Jane's talk was quite good.
|Thank you for the book. Here's my address 2111 Ash Lane, Crestview CA 92002
|Best,
|Maya
""".stripMargin
service.createCompletion(text).map(completion =>
println(completion.choices.head.text)
)
val text = """Extract the name and mailing address from this email:
|Dear Kelly,
|It was great to talk to you at the seminar. I thought Jane's talk was quite good.
|Thank you for the book. Here's my address 2111 Ash Lane, Crestview CA 92002
|Best,
|Maya
""".stripMargin
service.createCompletion(
text,
settings = CreateCompletionSettings(
model = ModelId.gpt_4o,
max_tokens = Some(1500),
temperature = Some(0.9),
presence_penalty = Some(0.2),
frequency_penalty = Some(0.2)
)
).map(completion =>
println(completion.choices.head.text)
)
val source = service.createCompletionStreamed(
prompt = "Write me a Shakespeare poem about two cats playing baseball in Russia using at least 2 pages",
settings = CreateCompletionSettings(
model = ModelId.text_davinci_003,
max_tokens = Some(1500),
temperature = Some(0.9),
presence_penalty = Some(0.2),
frequency_penalty = Some(0.2)
)
)
source.map(completion =>
println(completion.choices.head.text)
).runWith(Sink.ignore)
For this to work, you need to use OpenAIServiceStreamedFactory from the openai-scala-client-stream lib.
val createChatCompletionSettings = CreateChatCompletionSettings(
model = ModelId.gpt_4o
)
val messages = Seq(
SystemMessage("You are a helpful assistant."),
UserMessage("Who won the world series in 2020?"),
AssistantMessage("The Los Angeles Dodgers won the World Series in 2020."),
UserMessage("Where was it played?"),
)
service.createChatCompletion(
messages = messages,
settings = createChatCompletionSettings
).map { chatCompletion =>
println(chatCompletion.choices.head.message.content)
}
val messages = Seq(
SystemMessage("You are a helpful assistant."),
UserMessage("What's the weather like in San Francisco, Tokyo, and Paris?")
)
// as a param type we can use "number", "string", "boolean", "object", "array", and "null"
val tools = Seq(
FunctionSpec(
name = "get_current_weather",
description = Some("Get the current weather in a given location"),
parameters = Map(
"type" -> "object",
"properties" -> Map(
"location" -> Map(
"type" -> "string",
"description" -> "The city and state, e.g. San Francisco, CA"
),
"unit" -> Map(
"type" -> "string",
"enum" -> Seq("celsius", "fahrenheit")
)
),
"required" -> Seq("location")
)
)
)
// if we want to force the model to use the above function as a response
// we can do so by passing: responseToolChoice = Some("get_current_weather")
service.createChatToolCompletion(
messages = messages,
tools = tools,
responseToolChoice = None, // means "auto"
settings = CreateChatCompletionSettings(ModelId.gpt_3_5_turbo_1106)
).map { response =>
val chatFunCompletionMessage = response.choices.head.message
val toolCalls = chatFunCompletionMessage.tool_calls.collect {
case (id, x: FunctionCallSpec) => (id, x)
}
println(
"tool call ids : " + toolCalls.map(_._1).mkString(", ")
)
println(
"function/tool call names : " + toolCalls.map(_._2.name).mkString(", ")
)
println(
"function/tool call arguments : " + toolCalls.map(_._2.arguments).mkString(", ")
)
}
val messages = Seq(
SystemMessage("Give me the most populous capital cities in JSON format."),
UserMessage("List only african countries")
)
val capitalsSchema = JsonSchema.Object(
properties = Map(
"countries" -> JsonSchema.Array(
items = JsonSchema.Object(
properties = Map(
"country" -> JsonSchema.String(
description = Some("The name of the country")
),
"capital" -> JsonSchema.String(
description = Some("The capital city of the country")
)
),
required = Seq("country", "capital")
)
)
),
required = Seq("countries")
)
val jsonSchemaDef = JsonSchemaDef(
name = "capitals_response",
strict = true,
structure = capitalsSchema
)
service
.createChatCompletion(
messages = messages,
settings = DefaultSettings.createJsonChatCompletion(jsonSchemaDef)
)
.map { response =>
val json = Json.parse(messageContent(response))
println(Json.prettyPrint(json))
}
Count the expected number of tokens before calling createChatCompletions or createChatFunCompletions; this helps you select a proper model and reduce costs. This is an experimental feature and it may not work for all models. It requires the openai-scala-count-tokens lib.
An example of how to count message tokens:
import io.cequence.openaiscala.service.OpenAICountTokensHelper
import io.cequence.openaiscala.domain.{AssistantMessage, BaseMessage, FunctionSpec, ModelId, SystemMessage, UserMessage}
class MyCompletionService extends OpenAICountTokensHelper {
def exec = {
val model = ModelId.gpt_4_turbo_2024_04_09
// messages to be sent to OpenAI
val messages: Seq[BaseMessage] = Seq(
SystemMessage("You are a helpful assistant."),
UserMessage("Who won the world series in 2020?"),
AssistantMessage("The Los Angeles Dodgers won the World Series in 2020."),
UserMessage("Where was it played?"),
)
val tokenCount = countMessageTokens(model, messages)
}
}
An example of how to count message tokens when a function is involved:
import io.cequence.openaiscala.service.OpenAICountTokensHelper
import io.cequence.openaiscala.domain.{BaseMessage, FunctionSpec, ModelId, SystemMessage, UserMessage}
class MyCompletionService extends OpenAICountTokensHelper {
def exec = {
val model = ModelId.gpt_4_turbo_2024_04_09
// messages to be sent to OpenAI
val messages: Seq[BaseMessage] =
Seq(
SystemMessage("You are a helpful assistant."),
UserMessage("What's the weather like in San Francisco, Tokyo, and Paris?")
)
// function to be called
val function: FunctionSpec = FunctionSpec(
name = "getWeather",
parameters = Map(
"type" -> "object",
"properties" -> Map(
"location" -> Map(
"type" -> "string",
"description" -> "The city to get the weather for"
),
"unit" -> Map("type" -> "string", "enum" -> List("celsius", "fahrenheit"))
)
)
)
val tokenCount = countFunMessageTokens(model, messages, Seq(function), Some(function.name))
}
}
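Once a token count is known, it can be used to pick a model, as mentioned above. The following is only a sketch of that idea, not something the lib provides: Candidate, pickModel, and the context window sizes passed to it are all hypothetical, and the real limits must be taken from the provider's documentation.

```scala
object ModelSelectionSketch {
  // Hypothetical candidate: a model name with its context window size.
  final case class Candidate(name: String, contextWindow: Int)

  // Returns the first candidate (in preference order, e.g. cheapest first)
  // whose context window fits the prompt plus the tokens reserved for the reply.
  def pickModel(
    promptTokens: Int,
    replyTokens: Int,
    candidates: Seq[Candidate]
  ): Option[Candidate] =
    candidates.find(_.contextWindow >= promptTokens + replyTokens)
}
```

Here promptTokens would come from a call such as countMessageTokens, and replyTokens corresponds to the max_tokens setting you plan to use.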
✔️ Important: After you are done using the service, you should close it by calling service.close. Otherwise, the underlying resources/threads won't be released.
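One way to make sure the service is closed even when a call fails is to attach the close step to the returned Future. The sketch below shows the pattern with the standard library only; FakeService is a hypothetical stand-in and not the lib's OpenAIService.

```scala
import scala.concurrent.{ExecutionContext, Future}

object CloseServiceSketch {
  // Hypothetical stand-in for a service; only the close() part matters here.
  final class FakeService {
    @volatile var closed = false
    def ask(prompt: String)(implicit ec: ExecutionContext): Future[String] =
      Future(s"echo: $prompt")
    def close(): Unit = closed = true
  }

  // Runs the call and closes the service once the Future completes,
  // whether it succeeded or failed.
  def askAndClose(service: FakeService, prompt: String)(
    implicit ec: ExecutionContext
  ): Future[String] =
    service.ask(prompt).andThen { case _ => service.close() }
}
```

This fits one-off scripts; a long-running application would rather keep a single service instance and close it on shutdown.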
III. Using adapters
Adapters for OpenAI services (chat completion, core, or full) are provided by OpenAIServiceAdapters. The adapters are used to distribute the load between multiple services, retry on transient errors, route, or provide additional functionality. See examples for more details.
Note that the adapters can be arbitrarily combined/stacked.
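To give an intuition for what distributing the load in a round-robin fashion means, here is a small, self-contained sketch. It is not the lib's implementation; RoundRobin is a hypothetical helper that simply cycles through a fixed list of items (which could be services) in a thread-safe way.

```scala
import java.util.concurrent.atomic.AtomicInteger

object RoundRobinSketch {
  // Cycles through the given items, one per call, in a thread-safe way.
  final class RoundRobin[A](items: IndexedSeq[A]) {
    require(items.nonEmpty, "at least one item is needed")
    private val counter = new AtomicInteger(0)

    def next(): A = {
      val i = counter.getAndIncrement()
      // floorMod keeps the index non-negative even after the counter overflows
      items(Math.floorMod(i, items.size))
    }
  }
}
```

With two items, consecutive calls to next() alternate between them, which is the behavior one would expect from a round-robin adapter wrapping two services.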
val adapters = OpenAIServiceAdapters.forFullService
val service1 = OpenAIServiceFactory("your-api-key1")
val service2 = OpenAIServiceFactory("your-api-key2")
val service = adapters.roundRobin(service1, service2)
val adapters = OpenAIServiceAdapters.forFullService
val service1 = OpenAIServiceFactory("your-api-key1")
val service2 = OpenAIServiceFactory("your-api-key2")
val service = adapters.randomOrder(service1, service2)
val adapters = OpenAIServiceAdapters.forFullService
val rawService = OpenAIServiceFactory()
val service = adapters.log(
rawService,
"openAIService",
logger.log
)
val adapters = OpenAIServiceAdapters.forFullService
implicit val retrySettings: RetrySettings = RetrySettings(maxRetries = 10).constantInterval(10.seconds)
val service = adapters.retry(
OpenAIServiceFactory(),
Some(println(_)) // simple logging
)
class MyCompletionService @Inject() (
val actorSystem: ActorSystem,
implicit val ec: ExecutionContext,
implicit val scheduler: Scheduler
)(val apiKey: String)
extends RetryHelpers {
val service: OpenAIService = OpenAIServiceFactory(apiKey)
implicit val retrySettings: RetrySettings =
RetrySettings(interval = 10.seconds)
def ask(prompt: String): Future[String] =
for {
completion <- service
.createChatCompletion(
List(MessageSpec(ChatRole.User, prompt))
)
.retryOnFailure
} yield completion.choices.head.message.content
}
val adapters = OpenAIServiceAdapters.forFullService
// OctoAI
val octoMLService = OpenAIChatCompletionServiceFactory(
coreUrl = "https://text.octoai.run/v1/",
authHeaders = Seq(("Authorization", s"Bearer ${sys.env("OCTOAI_TOKEN")}"))
)
// Anthropic
val anthropicService = AnthropicServiceFactory.asOpenAI()
// OpenAI
val openAIService = OpenAIServiceFactory()
val service: OpenAIService =
adapters.chatCompletionRouter(
// OpenAI service is default so no need to specify its models here
serviceModels = Map(
octoMLService -> Seq(NonOpenAIModelId.mixtral_8x22b_instruct),
anthropicService -> Seq(
NonOpenAIModelId.claude_2_1,
NonOpenAIModelId.claude_3_opus_20240229,
NonOpenAIModelId.claude_3_haiku_20240307
)
),
openAIService
)
val adapters = OpenAIServiceAdapters.forCoreService
val service = adapters.chatToCompletion(
OpenAICoreServiceFactory(
coreUrl = "https://api.fireworks.ai/inference/v1/",
authHeaders = Seq(("Authorization", s"Bearer ${sys.env("FIREWORKS_API_KEY")}"))
)
)
Wen Scala 3?
Feb 2023. You are right; we chose the shortest month to do so :)
Done!
I got a timeout exception. How can I change the timeout setting?
You can do it either by passing the timeouts param to OpenAIServiceFactory or, if you use your own configuration file, by simply adding it there as:
openai-scala-client {
timeouts {
requestTimeoutSec = 200
readTimeoutSec = 200
connectTimeoutSec = 5
pooledConnectionIdleTimeoutSec = 60
}
}
I got an exception like com.typesafe.config.ConfigException$UnresolvedSubstitution: openai-scala-client.conf @ jar:file:.../io/cequence/openai-scala-client_2.13/0.0.1/openai-scala-client_2.13-0.0.1.jar!/openai-scala-client.conf: 4: Could not resolve substitution to a value: ${OPENAI_SCALA_CLIENT_API_KEY}. What should I do?
Set the env. variable OPENAI_SCALA_CLIENT_API_KEY. If you don't have one, register here.
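If you prefer a readable error over the config exception above, you can check the variable yourself before creating the service. This is a minimal sketch using only the standard library; ApiKeyCheckSketch and its apiKey helper are hypothetical names and not part of the lib.

```scala
object ApiKeyCheckSketch {
  // Looks up the key in the given environment (defaults to the process env)
  // and returns either an explanatory error or the non-empty key.
  def apiKey(env: Map[String, String] = sys.env): Either[String, String] =
    env
      .get("OPENAI_SCALA_CLIENT_API_KEY")
      .map(_.trim)
      .filter(_.nonEmpty)
      .toRight("OPENAI_SCALA_CLIENT_API_KEY is not set")
}
```

Taking the environment as a parameter keeps the check easy to test without touching the real process environment.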
It all looks cool. Can I chat with you about your research and development?
Just shoot us an email at [email protected].
This library is available and published as open source under the terms of the MIT License.
This project is open-source and welcomes any contribution or feedback (here).
Development of this library has been supported by - Cequence.io - The future of contracting
Created and maintained by Peter Banda.