LLM.js is the fastest way to use Large Language Models in JavaScript. It's a single simple interface to hundreds of popular LLMs:
gpt-4, gpt-4-turbo-preview, gpt-3.5-turbo
gemini-1.5-pro, gemini-1.0-pro, gemini-pro-vision
claude-3-opus, claude-3-sonnet, claude-3-haiku, claude-2.1, claude-instant-1.2
mixtral-8x7b, llama2-70b, gemma-7b-it
llama-3-70b, llama-3-8b, nous-hermes-2, ...
mistral-medium, mistral-small, mistral-tiny
LLaVa-1.5, TinyLlama-1.1B, Phi-2, ...
llama-3, llama-2, gemma, dolphin-phi, ...
await LLM("the color of the sky is", { model: "gpt-4" }); // blue
Features
Supported services (OpenAI, Google, Anthropic, Mistral, Groq, Llamafile, Ollama, Together)
Options (temperature, max_tokens, seed, ...)
llm command for your shell
Install LLM.js from NPM:
npm install @themaximalist/llm.js
Setting up LLMs is easy: make sure the API key for each service you use is set in your environment.
export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...
export MISTRAL_API_KEY=...
export GOOGLE_API_KEY=...
export GROQ_API_KEY=...
export TOGETHER_API_KEY=...
For local models like llamafile and Ollama, ensure an instance is running.
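Before making a request, it can help to confirm that the key for your chosen service is present. The following is a minimal sketch and not part of LLM.js: `missingApiKeys` and the `KEY_FOR_SERVICE` table are hypothetical helpers that only reuse the environment variable names listed above.

```javascript
// Hypothetical helper (not part of LLM.js): maps each hosted service to the
// environment variable named in the setup instructions above.
const KEY_FOR_SERVICE = {
    openai: "OPENAI_API_KEY",
    anthropic: "ANTHROPIC_API_KEY",
    mistral: "MISTRAL_API_KEY",
    google: "GOOGLE_API_KEY",
    groq: "GROQ_API_KEY",
    together: "TOGETHER_API_KEY",
};

// Returns the variable names that are unset or empty for the given services.
// Local services (llamafile, ollama) have no entry, so they never need a key here.
function missingApiKeys(services, env = process.env) {
    return services
        .map((service) => KEY_FOR_SERVICE[service])
        .filter((name) => name !== undefined && !env[name]);
}

const missing = missingApiKeys(["openai", "llamafile"], { OPENAI_API_KEY: "" });
console.log(missing); // [ 'OPENAI_API_KEY' ]
```

Passing an explicit `env` object, as in the usage line, keeps the check easy to test; omitting it falls back to `process.env`.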
The simplest way to call LLM.js is as an async function.
const LLM = require("@themaximalist/llm.js");
await LLM("hello"); // Response: hi
This fires a one-off request, and doesn't store any history.
Initialize an LLM instance to build up message history.
const llm = new LLM();
await llm.chat("what's the color of the sky in hex value?"); // #87CEEB
await llm.chat("what about at night time?"); // #222d5a
Streaming provides a better user experience by returning results immediately, and it's as simple as passing {stream: true} as an option.
const stream = await LLM("the color of the sky is", { stream: true });
for await (const message of stream) {
    process.stdout.write(message);
}
Sometimes it's helpful to handle the stream in real-time and also process it once it's all complete. For example, providing real-time streaming in chat, and then parsing out semantic code blocks at the end.
LLM.js makes this easy with an optional stream_handler option.
const colors = await LLM("what are the common colors of the sky as a flat json array?", {
    model: "gpt-4-turbo-preview",
    stream: true,
    stream_handler: (c) => process.stdout.write(c),
    parser: LLM.parsers.json,
});
// ["blue", "gray", "white", "orange", "red", "pink", "purple", "black"]
Instead of the stream being returned as a generator, it's passed to the stream_handler. The response from LLM.js is the entire response, which can be parsed or handled as normal.
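The handler pattern can be sketched without any network calls. This is an illustration rather than LLM.js internals: `fakeStream` stands in for a model's token stream, and `collectStream` is a hypothetical helper that forwards each chunk to a handler while accumulating the full response.

```javascript
// Stand-in for a streaming LLM response (assumption: chunks arrive as strings).
async function* fakeStream() {
    yield '["blue", ';
    yield '"gray", ';
    yield '"white"]';
}

// Hypothetical helper: calls `handler` for every chunk as it arrives,
// then resolves with the complete text once the stream ends.
async function collectStream(stream, handler) {
    let full = "";
    for await (const chunk of stream) {
        handler(chunk);
        full += chunk;
    }
    return full;
}

// Usage: show chunks in real time, then parse the complete response.
async function main() {
    const text = await collectStream(fakeStream(), (c) => process.stdout.write(c));
    return JSON.parse(text);
}
```

Calling `main()` writes the chunks to standard output as they arrive and resolves with the parsed array once the stream has finished.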
LLM.js supports JSON Schema for OpenAI and LLaMa. You can ask for JSON with any LLM, but using JSON Schema will enforce the outputs.
const schema = {
    "type": "object",
    "properties": {
        "colors": { "type": "array", "items": { "type": "string" } }
    }
};
const obj = await LLM("what are the 3 primary colors in JSON format?", { schema, temperature: 0.1, service: "openai" });
Different models use different formats (JSON Schema, BNF), so LLM.js converts between these automatically.
Note that JSON Schema can still produce invalid JSON, for example when the response exceeds max_tokens.
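Because a response cut off by max_tokens may not parse, it can be worth guarding the parse step. A minimal sketch follows; `parseJsonOr` is a hypothetical helper and not something LLM.js provides.

```javascript
// Hypothetical helper (not part of LLM.js): returns the parsed value,
// or `fallback` when the text is not valid JSON (e.g. cut off by max_tokens).
function parseJsonOr(text, fallback = null) {
    try {
        return JSON.parse(text);
    } catch (err) {
        return fallback;
    }
}

const complete = parseJsonOr('{"colors": ["red", "green", "blue"]}');
const truncated = parseJsonOr('{"colors": ["red", "gre'); // response cut off mid-string
console.log(complete.colors.length); // 3
console.log(truncated); // null
```

A `null` result is a signal to retry, possibly with a larger max_tokens value.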
Create agents that specialize in specific tasks using llm.system(input).
const llm = new LLM();
llm.system("You are a friendly chat bot.");
await llm.chat("what's the color of the sky in hex value?"); // Response: sky blue
await llm.chat("what about at night time?"); // Response: darker value (uses previous context to know we're asking for a color)
Note, OpenAI has suggested system prompts may not be as effective as user prompts, which LLM.js supports with llm.user(input).
LLM.js supports simple string prompts, but also full message history. This is especially helpful to guide LLMs in a more precise way.
await LLM([
    { role: "user", content: "remember the secret codeword is blue" },
    { role: "assistant", content: "OK I will remember" },
    { role: "user", content: "what is the secret codeword I just told you?" },
]); // Response: blue
The OpenAI message format is used, and converted on-the-fly for specific services that use a different format (like Google, Mixtral and LLaMa).
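To give a feel for what such a conversion involves, here is a sketch under stated assumptions: the target is a Gemini-style shape in which the assistant role is called `model` and text lives in a `parts` array. `toGeminiContents` is a hypothetical function, and the conversion inside LLM.js may differ.

```javascript
// Hypothetical converter (not LLM.js internals): OpenAI-style messages in,
// Gemini-style `contents` out. Only user and assistant roles are handled.
function toGeminiContents(messages) {
    return messages.map(({ role, content }) => {
        if (role !== "user" && role !== "assistant") {
            throw new Error(`unsupported role: ${role}`);
        }
        return {
            role: role === "assistant" ? "model" : "user",
            parts: [{ text: content }],
        };
    });
}

const contents = toGeminiContents([
    { role: "user", content: "remember the secret codeword is blue" },
    { role: "assistant", content: "OK I will remember" },
]);
console.log(contents[1].role); // model
```

System messages are deliberately rejected here, since services differ in how they accept them.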
LLM.js supports most popular Large Language Models, including:
gpt-4, gpt-4-turbo-preview, gpt-3.5-turbo
gemini-1.0-pro, gemini-1.5-pro, gemini-pro-vision
claude-3-sonnet, claude-3-haiku, claude-2.1, claude-instant-1.2
mixtral-8x7b, llama2-70b, gemma-7b-it
llama-3-70b, llama-3-8b, nous-hermes-2, ...
mistral-medium, mistral-small, mistral-tiny
LLaVa 1.5, Mistral-7B-Instruct, Mixtral-8x7B-Instruct, WizardCoder-Python-34B, TinyLlama-1.1B, Phi-2, ...
Llama 2, Mistral, Code Llama, Gemma, Dolphin Phi, ...
LLM.js can guess the LLM provider based on the model, or you can specify it explicitly.
// defaults to Llamafile
await LLM("the color of the sky is");
// OpenAI
await LLM("the color of the sky is", { model: "gpt-4-turbo-preview" });
// Anthropic
await LLM("the color of the sky is", { model: "claude-2.1" });
// Mistral AI
await LLM("the color of the sky is", { model: "mistral-tiny" });
// Groq needs a specific service
await LLM("the color of the sky is", { service: "groq", model: "mixtral-8x7b-32768" });
// Google
await LLM("the color of the sky is", { model: "gemini-pro" });
// Ollama
await LLM("the color of the sky is", { model: "llama2:7b" });
// Together
await LLM("the color of the sky is", { service: "together", model: "meta-llama/Llama-3-70b-chat-hf" });
// Can optionally set service to be specific
await LLM("the color of the sky is", { service: "openai", model: "gpt-3.5-turbo" });
Being able to quickly switch between LLMs prevents you from getting locked in.
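One plausible way to guess a provider from a model name is a prefix lookup. The sketch below is an assumption about how such guessing could work, not a description of LLM.js's actual rules: `guessService` and the `PREFIXES` table are hypothetical, and only the llamafile default comes from the documentation.

```javascript
// Hypothetical prefix table (an assumption, not LLM.js's actual rules).
const PREFIXES = [
    ["gpt-", "openai"],
    ["claude-", "anthropic"],
    ["gemini-", "google"],
    ["mistral-", "mistral"],
];

// Returns the service for the first matching prefix, else the documented default.
function guessService(model, fallback = "llamafile") {
    if (!model) return fallback;
    const match = PREFIXES.find(([prefix]) => model.startsWith(prefix));
    return match ? match[1] : fallback;
}

console.log(guessService("gpt-4-turbo-preview")); // openai
console.log(guessService("claude-2.1"));          // anthropic
console.log(guessService());                      // llamafile
```

In this sketch a name such as mixtral-8x7b-32768 matches no prefix, which illustrates why some services need to be named explicitly. Ollama-style names like llama2:7b are not covered.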
LLM.js ships with a few helpful parsers that work with every LLM. These are separate from the typical JSON formatting with tool and schema that some LLMs (like those from OpenAI) support.
JSON Parsing
const colors = await LLM("Please return the primary colors in a JSON array", {
    parser: LLM.parsers.json
});
// ["red", "green", "blue"]
Markdown Code Block Parsing
const story = await LLM("Please return a story wrapped in a Markdown story code block", {
    parser: LLM.parsers.codeBlock("story")
});
// A long time ago...
XML Parsing
const code = await LLM("Please write a simple website, and put the code inside of a <WEBSITE></WEBSITE> xml tag", {
    parser: LLM.parsers.xml("WEBSITE")
});
// <html>....
Note: OpenAI works best with Markdown and JSON, while Anthropic works best with XML tags.
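For intuition, here is roughly what the code block and XML parsers have to do. These are hypothetical stand-ins written for illustration, not the LLM.js implementations. They assume the block type and tag are plain alphanumeric names, and they return `null` when nothing matches.

```javascript
// Hypothetical stand-ins for the parsers described above (not LLM.js's code).
const FENCE = "`".repeat(3); // a Markdown code fence, built indirectly

// Returns a parser that extracts the body of a fenced block of the given type.
function codeBlock(blockType) {
    const pattern = new RegExp(FENCE + blockType + "\\s*\\n([\\s\\S]*?)\\n?" + FENCE);
    return (content) => {
        const match = content.match(pattern);
        return match ? match[1].trim() : null;
    };
}

// Returns a parser that extracts the text between <tag> and </tag>.
function xml(tag) {
    const pattern = new RegExp("<" + tag + ">([\\s\\S]*?)</" + tag + ">");
    return (content) => {
        const match = content.match(pattern);
        return match ? match[1].trim() : null;
    };
}

const reply = "Here it is:\n" + FENCE + "story\nA long time ago...\n" + FENCE;
console.log(codeBlock("story")(reply)); // A long time ago...
console.log(xml("WEBSITE")("<WEBSITE><html></html></WEBSITE>")); // <html></html>
```

Both functions are curried, matching the way the parsers are configured in the examples above: the first call fixes the block type or tag, and the returned function is applied to the response.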
The LLM.js API provides a simple interface to dozens of Large Language Models.
new LLM(input, {          // Input can be string or message history array
    service: "openai",    // LLM service provider
    model: "gpt-4",       // Specific model
    max_tokens: 100,      // Maximum response length
    temperature: 1.0,     // "Creativity" of model
    seed: 1000,           // Stable starting point
    stream: false,        // Respond in real-time
    stream_handler: null, // Optional function to handle stream
    schema: { ... },      // JSON Schema
    tool: { ... },        // Tool selection
    parser: null,         // Content parser
});
The same API is supported in the short-hand interface of LLM.js, calling it as a function:
await LLM(input, options);
Input (required)
input <string> or Array: Prompt for LLM. Can be a text string or array of objects in Message History format.
Options
All config parameters are optional. Some config options are only available on certain models, and are specified below.
service <string>: LLM service to use. Default is llamafile.
model <string>: Explicit LLM to use. Defaults to service default model.
max_tokens <int>: Maximum token response length. No default.
temperature <float>: "Creativity" of a model. 0 typically gives more deterministic results, and higher values 1 and above give less deterministic results. No default.
seed <int>: Get more deterministic results. No default. Supported by openai, llamafile and mistral.
stream <bool>: Return results immediately instead of waiting for full response. Default is false.
stream_handler <function>: Optional function that is called when a stream receives new content. Function is passed the string chunk.
schema <object>: JSON Schema object for steering LLM to generate JSON. No default. Supported by openai and llamafile.
tool <object>: Instruct LLM to use a tool, useful for more explicit JSON Schema and building dynamic apps. No default. Supported by openai.
parser <function>: Handle formatting and structure of returned content. No default.
messages <array>: Array of message history, managed by LLM.js, but can be referenced and changed.
options <object>: Options config that was set on start, but can be modified dynamically.
async send(options=<object>)
Sends the current Message History to the current LLM with specified options. These local options will override the global default options.
Response will be automatically added to Message History.
await llm.send(options);
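The precedence of per-call options over instance defaults can be pictured as an object merge. This is an illustration of the idea only: `resolveOptions` is a hypothetical function and is not taken from LLM.js.

```javascript
// Hypothetical illustration of option precedence (not LLM.js internals):
// later spreads win, so per-call options override instance defaults.
function resolveOptions(defaults, overrides = {}) {
    return { ...defaults, ...overrides };
}

const defaults = { service: "openai", model: "gpt-4", temperature: 1.0 };
const resolved = resolveOptions(defaults, { temperature: 0, max_tokens: 100 });

console.log(resolved.temperature); // 0
console.log(resolved.model);       // gpt-4
console.log(defaults.temperature); // 1
```

The merge builds a new object, so the defaults stay untouched for the next call.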
async chat(input=<string>, options=<object>)
Adds the input to the current Message History and calls send with the current override options.
Returns the response directly to the user, while updating Message History.
const response = await llm.chat("hello");
console.log(response); // hi
abort()
Aborts an ongoing stream. Throws an AbortError.
user(input=<string>)
Adds a message from user to Message History.
llm.user("My favorite color is blue. Remember that");
system(input=<string>)
Adds a message from system to Message History. This is typically the first message.
llm.system("You are a friendly AI chat bot...");
assistant(input=<string>)
Adds a message from assistant to Message History. This is typically a response from the AI, or a way to steer a future response.
llm.user("My favorite color is blue. Remember that");
llm.assistant("OK, I will remember your favorite color is blue.");
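The three role helpers all append to the same message array. The miniature below shows that bookkeeping in isolation. `History` is a hypothetical class written for this example, and the method chaining is an addition of the sketch rather than documented LLM.js behavior.

```javascript
// Hypothetical miniature of the history helpers described above (not LLM.js itself).
class History {
    constructor() {
        this.messages = [];
    }

    // Appends one message in the { role, content } shape and returns `this`
    // so calls can be chained (chaining is an assumption of this sketch).
    add(role, content) {
        this.messages.push({ role, content });
        return this;
    }

    system(content) { return this.add("system", content); }
    user(content) { return this.add("user", content); }
    assistant(content) { return this.add("assistant", content); }
}

const history = new History()
    .system("You are a friendly AI chat bot...")
    .user("My favorite color is blue. Remember that")
    .assistant("OK, I will remember your favorite color is blue.");

console.log(history.messages.map((m) => m.role).join(", "));
// system, user, assistant
```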
LLAMAFILE <string>: llamafile
OPENAI <string>: openai
ANTHROPIC <string>: anthropic
MISTRAL <string>: mistral
GOOGLE <string>: google
MODELDEPLOYER <string>: modeldeployer
OLLAMA <string>: ollama
TOGETHER <string>: together
parsers <object>: List of default LLM.js parsers
codeBlock(<blockType>)(<content>) <function>: Parses out a Markdown code block
json(<content>) <function>: Parses out overall JSON or a Markdown JSON code block
xml(<tag>)(<content>) <function>: Parses the XML tag out of the response content
serviceForModel(model)
Return the LLM service for a particular model.
LLM.serviceForModel("gpt-4-turbo-preview"); // openai
modelForService(service)
Return the default LLM for a service.
LLM.modelForService("openai"); // gpt-4-turbo-preview
LLM.modelForService(LLM.OPENAI); // gpt-4-turbo-preview
Response
LLM.js returns results from llm.send() and llm.chat(), typically the string content from the LLM completing your prompt.
await LLM("hello"); // "hi"
But when you use schema and tools, LLM.js will typically return a JSON object.
const tool = {
    "name": "generate_primary_colors",
    "description": "Generates the primary colors",
    "parameters": {
        "type": "object",
        "properties": {
            "colors": {
                "type": "array",
                "items": { "type": "string" }
            }
        },
        "required": ["colors"]
    }
};
// Pass the tool in the options so the model is asked to use it
await LLM("what are the 3 primary colors in physics?", { tool });
// { colors: ["red", "green", "blue"] }
await LLM("what are the 3 primary colors in painting?", { tool });
// { colors: ["red", "yellow", "blue"] }
And by passing {stream: true} in options, LLM.js will return a generator and start yielding results immediately.
const stream = await LLM("Once upon a time", { stream: true });
for await (const message of stream) {
    process.stdout.write(message);
}
The response is based on what you ask the LLM to do, and LLM.js always tries to do the obviously right thing.
The Message History API in LLM.js is exactly the same as the OpenAI message history format.
await LLM([
    { role: "user", content: "remember the secret codeword is blue" },
    { role: "assistant", content: "OK I will remember" },
    { role: "user", content: "what is the secret codeword I just told you?" },
]); // Response: blue
Options
role <string>: Who is saying the content? user, system, or assistant
content <string>: Text content from message
LLM.js provides a useful llm command for your shell. llm is a convenient way to call dozens of LLMs and access the full power of LLM.js without programming.
Access it globally by installing from NPM:
npm install @themaximalist/llm.js -g
Then you can call the llm command from anywhere in your terminal.
> llm the color of the sky is
blue
Messages are streamed back in real time, so everything is really fast.
You can also initiate a --chat to remember message history and continue your conversation (Ctrl-C to quit).
> llm remember the codeword is blue. say ok if you understand --chat
OK, I understand.
> what is the codeword?
The codeword is blue.
Or easily change the LLM on the fly:
> llm the color of the sky is --model claude-v2
blue
See help with llm --help
Usage: llm [options] [input]
Large Language Model library for OpenAI, Google, Anthropic, Mistral, Groq and LLaMa
Arguments:
input Input to send to LLM service
Options:
-V, --version output the version number
-m, --model <model> Completion Model (default: llamafile)
-s, --system <prompt> System prompt (default: "I am a friendly accurate English speaking chat bot") (default: "I am a friendly accurate English speaking chat bot")
-t, --temperature <number> Model temperature (default 0.8) (default: 0.8)
-c, --chat Chat Mode
-h, --help display help for command
LLM.js and llm use the debug npm module with the llm.js namespace.
View debug logs by setting the DEBUG environment variable.
> DEBUG=llm.js* llm the color of the sky is
# debug logs
blue
> export DEBUG=llm.js*
> llm the color of the sky is
# debug logs
blue
LLM.js has lots of tests which can serve as a guide for seeing how it's used.
Using LLMs in production can be tricky because of tracking history, rate limiting, managing API keys and figuring out how to charge.
Model Deployer is an API in front of LLM.js that handles all of these details and more.
Using it is simple: specify modeldeployer as the service and your API key from Model Deployer as the model.
await LLM("hello world", { service: "modeldeployer", model: "api-key" });
You can also set up specific settings and optionally override some on the client.
await LLM("the color of the sky is usually", {
    service: "modeldeployer",
    model: "api-key",
    endpoint: "https://example.com/api/v1/chat",
    max_tokens: 1,
    temperature: 0
});
LLM.js can be used without Model Deployer, but if you're deploying LLMs to production it's a great way to manage them.
LLM.js has been under heavy development while LLMs are rapidly changing. We've started to settle on a stable interface, and will document changes here.
v0.6.6: Added browser support
v0.6.5: Added Llama 3 and Together
v0.6.4: Added Groq and abort()
v0.6.3: Added JSON/XML/Markdown parsers and a stream handler
v0.6.2: Fix bug with Google streaming
v0.6.1: Fix bug to not add empty responses
v0.6.0: Added Anthropic Claude 3
v0.5.9: Added Ollama
v0.5.4: Added Google Gemini
v0.5.3: Added Mistral
v0.5.0: Created website
v0.4.7: OpenAI Tools, JSON stream
v0.3.5: Added ModelDeployer
v0.3.2: Added Llamafile
v0.2.5: Added Anthropic, CLI
v0.2.4: Chat options
v0.2.2: Unified LLM() interface, streaming
v0.1.2: Docs, system prompt
v0.0.1: Created LLM.js with OpenAI support
LLM.js is currently used in the following projects:
MIT
Created by The Maximalist, see our open-source projects.