Simplified Chinese | English
The road to reproducing and surpassing ChatGPT with open-source models
Since the accidental leak of the LLaMA weights and the impressive results of Stanford Alpaca, which instruction-tuned LLaMA on data built from the GPT-3 API in a self-instruct manner, the open-source community has grown increasingly hopeful about reaching a ChatGPT-level large language model.
This repo records this process of reproduction and surpassing, and provides an overview for the community.
It covers related technical progress, base models, domain models, training and inference technology, data, multilinguality, multimodality, and more.
contributor | model/project | license | language | main feature |
---|---|---|---|---|
Meta | LLaMA/LLaMA2 | - | multi | LLaMA-13B outperforms GPT-3 (175B) and LLaMA-65B is competitive with PaLM-540B. Base model for most follow-up works. |
HuggingFace-BigScience | BLOOM | - | multi | an autoregressive Large Language Model (LLM) trained by HuggingFace BigScience. |
HuggingFace-BigScience | BLOOMZ | - | multi | instruction-finetuned version of the BLOOM & mT5 pretrained multilingual language models on a crosslingual task mixture. |
EleutherAI | GPT-J | - | en | transformer model trained using Ben Wang's Mesh Transformer JAX. |
Meta | OPT | - | en | Open Pre-trained Transformer language models; the aim in developing this suite of OPT models is to enable reproducible and responsible research at scale, and to bring more voices to the table in studying the impact of these LLMs. |
Cerebras Systems | Cerebras-GPT | - | en | pretrained LLM, GPT-3-like, commercially available, efficiently trained on the Andromeda AI supercomputer, trained in accordance with Chinchilla scaling laws (20 tokens per model parameter), which is compute-optimal (see the sketch after this table). |
EleutherAI | Pythia | - | en | combines interpretability analysis and scaling laws to understand how knowledge develops and evolves during training in autoregressive transformers. |
Stability-AI | StableLM | - | en | Stability AI Language Models. |
FDU | MOSS | - | en/zh | an open-source tool-augmented conversational language model from Fudan University. |
ssymmetry & FDU | BBT-2 | - | zh | 12B open-source LM. |
@mlfoundations | OpenFlamingo | - | en | an open-source framework for training large multimodal models. |
EleutherAI | GPT-NeoX-20B | - | en | its architecture intentionally resembles that of GPT-3, and is almost identical to that of GPT-J-6B. |
UCB | OpenLLaMA | Apache-2.0 | en | An Open Reproduction of LLaMA. |
MosaicML | MPT | Apache-2.0 | en | MPT-7B is a GPT-style model, and the first in the MosaicML Foundation Series of models. Trained on 1T tokens of a MosaicML-curated dataset, MPT-7B is open-source, commercially usable, and equivalent to LLaMa 7B on evaluation metrics. |
TogetherComputer | RedPajama-INCITE-Base-3B-v1 | Apache-2.0 | en | A 2.8B parameter pretrained language model, pretrained on RedPajama-Data-1T, together with an instruction-tuned version and a chat version. |
Lightning-AI | Lit-LLaMA | Apache-2.0 | - | Independent implementation of LLaMA that is fully open source under the Apache 2.0 license. |
@conceptofmind | PaLM | MIT License | en | An open-source implementation of Google PaLM models. |
TII | Falcon-7B | TII Falcon LLM License | en | a 7B parameters causal decoder-only model built by TII and trained on 1,500B tokens of RefinedWeb enhanced with curated corpora. |
TII | Falcon-40B | TII Falcon LLM License | multi | a 40B parameters causal decoder-only model built by TII and trained on 1,000B tokens of RefinedWeb enhanced with curated corpora. |
TigerResearch | TigerBot | Apache-2.0 | en/zh | a multi-language and multitask LLM. |
BAAI | Aquila/Aquila2 | BAAI_Aquila_Model_License | en/zh | The Aquila language model inherits the architectural design advantages of GPT-3 and LLaMA, replacing a batch of more efficient underlying operator implementations and redesigning the tokenizer for Chinese-English bilingual support. |
OpenBMB | CPM-Bee | Universal Model License Agreement-Source Statement-Publicity Restrictions-Commercial Authorization | en/zh | CPM-Bee is a fully open-source, commercially usable Chinese-English bilingual base model with ten billion parameters, pre-trained on an extensive corpus of trillion-scale tokens. |
Baichuan | baichuan-7B | Apache-2.0 | en/zh | It has achieved the best performance among models of the same size on standard Chinese and English authoritative benchmarks (C-EVAL, MMLU, etc). |
Tencent | lyraChatGLM | MIT License | en/zh | To the best of our knowledge, it is the first accelerated version of ChatGLM-6B . The inference speed of lyraChatGLM has achieved 300x acceleration upon the early original version. We are still working hard to further improve the performance. |
SalesForce | XGen | Apache-2.0 | multi | Salesforce open-source LLMs with 8k sequence length |
Shanghai AI Lab | InternLM | Apache-2.0 | en/zh | InternLM has open-sourced a 7 billion parameter base model and a chat model tailored for practical scenarios. The model has the following characteristics: It leverages trillions of high-quality tokens for training to establish a powerful knowledge base. It supports an 8k context window length, enabling longer input sequences and stronger reasoning capabilities. It provides a versatile toolset for users to flexibly build their own workflows. |
xverse-ai | XVERSE | Apache-2.0 | multi | Multilingual LLMs developed by XVERSE Technology Inc. |
Writer | palmyra | Apache-2.0 | en | extremely powerful while being extremely fast. This model excels at many nuanced tasks such as sentiment classification and summarization. |
Mistral AI | Mistral | Apache-2.0 | en | Mistral 7B is a 7.3B parameter model that: 1. Outperforms Llama 2 13B on all benchmarks 2. Outperforms Llama 1 34B on many benchmarks 3. Approaches CodeLlama 7B performance on code, while remaining good at English tasks 4. Uses Grouped-query attention (GQA) for faster inference 5. Uses Sliding Window Attention (SWA) to handle longer sequences at smaller cost |
SkyworkAI | Skywork | - | en/zh | In major evaluation benchmarks, Skywork-13B is at the forefront of Chinese open source models and is the optimal level under the same parameter scale; it can be used commercially without application; it has also open sourced a 600G (150 billion tokens) Chinese data set. |
01.AI | Yi | - | en/zh | The Yi series models are large language models trained from scratch by developers at 01.AI. |
IEIT Systems | Yuan-2.0 | - | en/zh | In this work, the Localized Filtering-based Attention (LFA) is introduced to incorporate prior knowledge of local dependencies of natural language into Attention. Based on LFA, we develop and release Yuan 2.0, a large language model with parameters ranging from 2.1 billion to 102.6 billion. A data filtering and generation method is presented to build pretraining and fine-tuning dataset in high quality. A distributed training method with non-uniform pipeline parallel, data parallel, and optimizer parallel is proposed, which greatly reduces the bandwidth requirements of intra-node communication, and achieves good performance in large-scale distributed training. Yuan 2.0 models display impressive ability in code generation, math problem-solving, and chat compared with existing models. |
Nanbeige | Nanbeige | Apache-2.0 | en/zh | Nanbeige-16B is a 16 billion parameter language model developed by Nanbeige LLM Lab. It uses 2.5T Tokens for pre-training. The training data includes a large amount of high-quality internet corpus, various books, code, etc. It has achieved good results on various authoritative evaluation data sets. This release includes the Base, Chat, Base-32k and Chat-32k. |
deepseek-ai | deepseek-LLM | MIT License | en/zh | an advanced language model comprising 67 billion parameters. It has been trained from scratch on a vast dataset of 2 trillion tokens in both English and Chinese. |
LLM360 | LLM360 | - | - | Most open-source LLM releases include model weights and evaluation results. However, additional information is often needed to genuinely understand a model's behavior—and this information is not typically available to most researchers. Hence, we commit to releasing all of the intermediate checkpoints ( up to 360!) collected during training, all of the training data (and its mapping to checkpoints), all collected metrics (eg, loss, gradient norm, evaluation results), and all source code for preprocessing data and model training. These additional artifacts can help researchers and practitioners to have a deeper look into LLM's construction process and conduct research such as analyzing model dynamics. We hope that LLM360 can help make advanced LLMs more transparent, foster research in smaller-scale labs, and improve reproducibility in AI research. |
FDU, etc. | CT-LLM | - | zh/en | focusing on the Chinese language. Starting from scratch, CT-LLM primarily uses Chinese data from a 1,200 billion token corpus, including 800 billion Chinese, 300 billion English, and 100 billion code tokens. By open-sourcing CT-LLM's training process, including data processing and the Massive Appropriate Pretraining Chinese Corpus (MAP-CC), and introducing the Chinese Hard Case Benchmark (CHC-Bench), we encourage further research and innovation, aiming for more inclusive and adaptable language models. |
TigerLab | MAP-NEO | - | zh/en | The first large model that is open source for the entire process from data processing to model training and model weights. |
DataComp | DCLM | - | - | Provides tools and guidance for processing raw data, tokenization, data shuffling, model training, and performance evaluation. The baseline 7B model has excellent performance. |
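The Cerebras-GPT row above follows the Chinchilla-style budget of roughly 20 training tokens per parameter. Below is a minimal back-of-the-envelope sketch of that rule; the 20:1 ratio and the common ~6·N·D FLOPs approximation are the only assumptions, and the listed model sizes are just examples.

```python
# Back-of-the-envelope Chinchilla-style budgeting: ~20 training tokens per parameter,
# with the usual ~6 * N * D FLOPs approximation for training a dense transformer.
def compute_optimal_budget(n_params: float, tokens_per_param: float = 20.0):
    """Return (training tokens, approximate training FLOPs) for a dense model of n_params."""
    tokens = tokens_per_param * n_params
    flops = 6.0 * n_params * tokens  # rough rule of thumb, not an exact cost model
    return tokens, flops

if __name__ == "__main__":
    for n in (111e6, 1.3e9, 6.7e9, 13e9):  # example sizes, roughly the Cerebras-GPT family
        tokens, flops = compute_optimal_budget(n)
        print(f"{n / 1e9:6.2f}B params -> {tokens / 1e9:8.1f}B tokens, ~{flops:.2e} FLOPs")
```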
contributor | model | domain | language | base model | main feature |
---|---|---|---|---|---|
UT Southwestern/ UIUC/OSU/HDU | ChatDoctor | medical | en | LLAMA | Maybe the first domain-specific chat model tuned on LLaMA. |
Cambridge | Visual Med-Alpaca | biomedical | en | LLaMA-7B | a multi-modal foundation model designed specifically for the biomedical domain. |
HIT | BenTsao/ChatGLM-Med | medical | zh | LLaMA/ChatGLM | fine-tuned with Chinese medical knowledge dataset, which is generated by using gpt3.5 api. |
ShanghaiTech, etc. | DoctorGLM | medical | en/zh | ChatGLM-6B | Chinese medical consultation model fine-tuned on ChatGLM-6B. |
THU AIR | BioMedGPT-1.6B | biomedical | en/zh | - | a pre-trained multi-modal molecular foundation model with 1.6B parameters that associates 2D molecular graphs with texts. |
@LiuHC0428 | LawGPT_en | legal | zh | ChatGLM-6B | a general model in Chinese legal domain, trained on data generated via Reliable-Self-Instruction. |
SJTU | MedicalGPT-zh | medical | zh | ChatGLM-6B | a general model in the Chinese medical domain, trained on diverse data generated via self-instruct. |
SJTU | PMC-LLaMA | medical | en | LLAMA | continue training LLaMA on medical papers. |
HuggingFace | StarCoder | code generation | en | - | a language model (LM) trained on source code and natural language text. Its training data incorporates more than 80 different programming languages as well as text extracted from GitHub issues and commits and from notebooks. |
@CogStack | NHS-LLM | medical | en | not clear | A conversational model for healthcare trained using OpenGPT. |
@pengxiao-song | LaWGPT | legal | zh | LLaMA/ChatGLM | expands the vocab with Chinese legal terminology and is instruction fine-tuned on data generated using self-instruct (see the vocabulary-expansion sketch at the end of this section). |
Duxiaoman | Xuanyuan | finance | zh | BLOOM-176B | A Large Chinese Financial Chat Model with Hundreds of Billions Parameters. |
CUHK | HuatuoGPT | medical | zh | not clear | HuatuoGPT, a large language model (LLM) trained on a vast Chinese medical corpus. Our objective with HuatuoGPT is to construct a more professional 'ChatGPT' for medical consultation scenarios. |
PKU | Lawyer LLaMA | legal | zh | LLAMA | continue pretraining on Chinese legal data, instruction tuned on legal exams and legal consulting QA pairs. |
THU | LexiLaw | legal | zh | ChatGLM-6B | trained on a mixture of general data (BELLE 1.5M) and legal data |
THU, etc. | taoli | education | zh | LLAMA | A large model for international Chinese education. It extends specific vocabulary on the base model, and uses the domain's proprietary data set for instruction fine-tuning. |
NUS | Goat | arithmetic | en | LLAMA | a fine-tuned LLaMA model that significantly outperforms GPT-4 on a range of arithmetic tasks. Fine-tuned on a synthetically generated dataset, Goat achieves state-of-the-art performance on the BIG-bench arithmetic sub-task. |
CU/NYU | FinGPT | finance | en | - | an end-to-end open-source framework for financial large language models (FinLLMs). |
microsoft | WizardCoder | code generation | en | StarCoder | trained with 78k evolved code instructions. surpasses Claude-Plus (+6.8) , Bard (+15.3) and InstructCodeT5+ (+22.3) on the HumanEval Benchmarks. |
UCAS | Cornucopia | finance | zh | LLAMA | finetune LLaMA on Chinese financial knowledge. |
PKU | ChatLaw | legal | zh | Ziya/Anima | Chinese legal domain model. |
@michael-wzhu | ChatMed | medical | zh | LLAMA | Chinese medical LLM based on LLaMA-7B. |
SCUT | SoulChat | mental health | zh | ChatGLM-6B | Chinese dialogue LLM in mental health domain, based on ChatGLM-6B. |
@shibing624 | MedicalGPT | medical | zh | ChatGLM-6B | Training Your Own Medical GPT Model with ChatGPT Training Pipeline. |
BJTU | TransGPT | transportation | zh | LLaMA-7B | Chinese transportation model. |
BAAI | AquilaCode | code generation | multi | Aquila | AquilaCode-multi is a multi-language model that supports high-accuracy code generation for various programming languages, including Python/C++/Java/Javascript/Go, etc. It has achieved impressive results in HumanEval (Python) evaluation, with Pass@1, Pass@10, and Pass@100 scores of 26/45.7/71.6, respectively. In the HumanEval-X multi-language code generation evaluation, it significantly outperforms other open-source models with similar parameters (as of July 19, 2023). AquilaCode-py, on the other hand, is a single-language Python version of the model that focuses on Python code generation. It has also demonstrated excellent performance in HumanEval evaluation, with Pass@1, Pass@10, and Pass@100 scores of 28.8/50.6/76.9 (as of July 19, 2023). |
Meta | CodeLLaMA | code generation | multi | LLaMA-2 | a family of large language models for code based onLlama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks. |
UNSW, etc. | Darwin | natural science | en | LLaMA-7B | the first open-source LLM for natural science, mainly in physics, chemistry and material science. |
alibaba | EcomGPT | e-commerce | en/zh | BLOOMZ | An Instruction-tuned Large Language Model for E-commerce. |
TIGER-AI-Lab | MAmmoTH | math | en | LLaMA2/CodeLLaMA | a series of open-source large language models (LLMs) specifically tailored for general math problem-solving. The MAmmoTH models are trained on MathInstruct, a meticulously curated instruction tuning dataset that is lightweight yet generalizable. MathInstruct is compiled from 13 math rationale datasets, six of which are newly curated by this work. It uniquely focuses on the hybrid use of chain-of-thought (CoT) and program-of-thought (PoT) rationales, and ensures extensive coverage of diverse mathematical fields. |
SJTU | abel | math | en | LLaMA2 | We propose Parental Oversight, a babysitting strategy for supervised fine-tuning. Parental Oversight is not limited to any specific data processing method; instead, it defines the data processing philosophy that should guide supervised fine-tuning in the era of generative AI (GAI). |
FDU | DISC-LawLLM | legal | zh | Baichuan-13B | FudanDISC has released DISC-LawLLM, a Chinese intelligent legal system driven by a large language model. The system can provide various legal services for different user groups. In addition, DISC-Law-Eval is constructed to evaluate the large legal language model from both objective and subjective aspects. The model has obvious advantages compared with existing large legal models. The team also made available a high-quality supervised fine-tuning (SFT) dataset of 300,000 examples, DISC-Law-SFT. |
HKU, etc. | ChatPsychiatrist | mental health | en | LLaMA-7B | This repo open-sources the instruct-tuned LLaMA-7B model that has been fine-tuned with counseling domain instruction data. To construct the 8K-example instruction-tuning dataset, the authors collected real-world counseling dialogue examples and employed GPT-4 as an extractor and filter. In addition, a comprehensive set of metrics, specifically tailored to the LLM+Counseling domain, is introduced by incorporating domain counseling evaluation criteria. These metrics enable the assessment of performance in generating language content that involves multi-dimensional counseling skills. |
CAS | StarWhisper | astronomical | zh | - | StarWhisper, a large astronomical model, significantly improves the reasoning logic and integrity of the model through fine-tuning on expert-labeled astrophysical corpora, logical long-text training, and direct preference optimization. In the CG-Eval jointly published by the Keguei AI Research Institute and LanguageX AI Lab, it reached second place overall, just below GPT-4, and its mathematical reasoning and astronomical capabilities are close to or exceed GPT-3.5 Turbo. |
ZhiPuAI | FinGLM | finance | zh | ChatGLM | solutions of SMP2023-ELMFT(The Evaluation of Large Model of Finance Technology). |
PKU, etc. | CodeShell | code generation | en/zh | - | CodeShell is a code large language model (LLM) developed jointly by the Knowledge Computing Lab at Peking University and the AI team of Sichuan Tianfu Bank. CodeShell has 7 billion parameters, was trained on 500 billion tokens, and has a context window length of 8192. On authoritative code evaluation benchmarks (HumanEval and MBPP), CodeShell achieves the best performance for models of its scale. |
FDU | DISC-FinLLM | finance | zh | Baichuan-13B-Chat | DISC-FinLLM is a large language model in the financial field. It is a multi-expert intelligent financial system composed of four modules for different financial scenarios: financial consulting, financial text analysis, financial calculation, and financial knowledge retrieval and question answering. |
Deepseek | Deepseek Coder | code generation | en/zh | - | Deepseek Coder comprises a series of code language models trained on both 87% code and 13% natural language in English and Chinese, with each model pre-trained on 2T tokens. For coding capabilities, Deepseek Coder achieves state-of-the-art performance among open-source code models on multiple programming languages and various benchmarks. |
microsoft | MathOctopus | math | multi | LLaMA2 | This work pioneers exploring and building powerful Multilingual Math Reasoning (xMR) LLMs. To accomplish this, we make the following works: 1. MGSM8KInstruct , the first multilingual math reasoning instruction dataset, encompassing ten distinct languages, thus addressing the issue of training data scarcity in xMR tasks. 2. MSVAMP , an out-of-domain xMR test dataset, to conduct a more exhaustive and comprehensive evaluation of the model's multilingual mathematical capabilities. 3. MathOctopus , our effective Multilingual Math Reasoning LLMs, training with different strategies, which notably outperform conventional open-source LLMs and exhibit superiority over ChatGPT in few-shot scenarios. |
ITREC | Zh-MT-LLM | maritime | en/zh | ChatGLM3-6b | The training data consist of the maritime-domain dataset Zh-mt-sft, organized around three main segments, and 300k general conversation examples from moss-003-sft-data. Zh-mt-sft specifically contains CrimeKgAssitant-1.8w, Zh-law-qa, and Zh-law-court for Q&A about maritime laws and regulations, Zh-edu-qa and Zh-edu-qb for maritime education and training, and Zh-mt-qa for Q&A about maritime specialized knowledge. |
@SmartFlowAI | EmoLLM | mental health | zh | - | EmoLLM is a series of large mental health models, fine-tuned with LLM instructions, that can support the understanding-supporting-helping chain of mental health counseling. |
some medical models: here
some domain llms: Awesome-Domain-LLM
healing models: Awesome-Healthcare-Foundation-Models
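Several rows above (e.g. LaWGPT and taoli) extend the base tokenizer with domain-specific terms before continued pretraining and instruction tuning. The snippet below is a minimal sketch of that step with Hugging Face `transformers`; the checkpoint id and the toy term list are placeholders, not the vocabularies those projects actually use.

```python
# Sketch: add domain-specific tokens to a tokenizer and resize the model's embeddings
# before continued pretraining / instruction tuning. Checkpoint id and terms are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "huggyllama/llama-7b"                     # placeholder base checkpoint
domain_terms = ["合同纠纷", "不可抗力", "诉讼时效"]     # placeholder legal terms

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

num_added = tokenizer.add_tokens(domain_terms)      # extend the vocabulary
model.resize_token_embeddings(len(tokenizer))       # grow the embedding / LM-head rows
print(f"added {num_added} tokens; new vocab size = {len(tokenizer)}")

# The new embedding rows are randomly initialized, so the usual recipe is continued
# pretraining on domain text before any instruction fine-tuning.
```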
contributor | model/project | language | base model | main feature |
---|---|---|---|---|
Stanford | Alpaca | en | LLaMA/OPT | use 52K instruction-following data generated by Self-Instruct techniques to fine-tune 7B LLaMA; the resulting model, Alpaca, behaves similarly to the text-davinci-003 model on the Self-Instruct instruction-following evaluation suite. Alpaca has inspired many follow-up models. |
LianJiaTech | BELLE | en/zh | BLOOMZ-7B1-mt | maybe the first Chinese model to follow Alpaca. |
THU | ChatGLM-6B | en/zh | - | well-known Chinese model. |
Databricks | Dolly | en | GPT-J 6B | use Alpaca data to fine-tune a 2-year-old model: GPT-J, which exhibits surprisingly high quality instruction following behavior not characteristic of the foundation model on which it is based. |
@tloen | Alpaca-LoRA | en | LLaMA-7B | trained within hours on a single RTX 4090, reproducing the Stanford Alpaca results using low-rank adaptation (LoRA), and can run on a Raspberry Pi (see the LoRA sketch after this table). |
ColossalAI | Coati7B | en/zh | LLaMA-7B | a large language model developed by the ColossalChat project |
Shanghai AI Lab | LLaMA-Adapter | en | LLaMA-7B | Fine-tuning LLaMA to follow instructions within 1 Hour and 1.2M Parameters |
AetherCortex | Llama-X | en | LLAMA | Open Academic Research on Improving LLaMA to SOTA LLM. |
TogetherComputer | OpenChatKit | en | GPT-NeoX-20B | OpenChatKit provides a powerful, open-source base to create both specialized and general purpose chatbots for various applications. The kit includes an instruction-tuned language model, a moderation model, and an extensible retrieval system for including up-to-date responses from custom repositories. |
nomic-ai | GPT4All | en | LLAMA | trained on a massive collection of clean assistant data including code, stories and dialogue |
@ymcui | Chinese-LLaMA-Alpaca | en/zh | LLaMA-7B/13B | expand the Chinese vocabulary based on the original LLaMA and use Chinese data for secondary pre-training, further enhance Chinese basic semantic understanding. Additionally, the project uses Chinese instruction data for fine-tuning on the basis of the Chinese LLaMA, significantly improving the model's understanding and execution of instructions. |
UC Berkeley/Stanford/CMU | Vicuna | en | LLaMA-13B | Impressing GPT-4 with 90% ChatGPT quality. |
UCSD/SYSU | baize | en/zh | LLAMA | fine-tuned with LoRA. It uses 100k dialogs generated by letting ChatGPT chat with itself. Alpaca's data is also used to improve its performance. |
UC Berkeley | Koala | en | LLAMA | Rather than maximizing quantity by scraping as much web data as possible, the team focuses on collecting a small high-quality dataset. |
@imClumsyPanda | langchain-ChatGLM | en/zh | ChatGLM-6B | local knowledge based ChatGLM with langchain. |
@yangjianxin1 | Firefly | zh | bloom-1b4-zh bloom-2b6-zh | Instruction Tuning on Chinese dataset. Vocabulary pruning, ZeRO, and tensor parallelism are used to effectively reduce memory consumption and improve training efficiency. |
microsoft | GPT-4-LLM | en/zh | LLAMA | aims to share data generated by GPT-4 for building instruction-following LLMs with supervised learning and reinforcement learning. |
Hugging Face | StackLLaMA | en | LLAMA | trained on StackExchange data; the main goal is to serve as a tutorial and walkthrough on how to train a model with RLHF, rather than to maximize model performance. |
Nebuly | ChatLLaMA | en | - | a library that allows you to create hyper-personalized ChatGPT-like assistants using your own data and the least amount of compute possible. |
@juncongmoo | ChatLLaMA | en | LLAMA | LLaMA-based RLHF model, runnable in a single GPU. |
@juncongmoo | minichatgpt | en | GPT/OPT... | To Train ChatGPT In 5 Minutes with ColossalAI. |
@LC1332 | Luotuo-Chinese-LLM | zh | LLaMA/ChatGLM | Instruction fine-tuned Chinese Language Models, with colab provided! |
@Facico | Chinese-Vicuna | zh | LLAMA | A Chinese Instruction-following LLaMA-based Model, fine-tuned with Lora, cpp inference supported, colab provided. |
@yanqiangmiffy | InstructGLM | en/zh | ChatGLM-6B | ChatGLM based instruction-following model, fine-tuned on a variety of data sources, supports deepspeed accelerating and LoRA. |
alibaba | Wombat | en | LLAMA | a novel learning paradigm called RRHF is proposed as an alternative to RLHF, which scores responses generated by different sampling policies and learns to align them with human preferences through a ranking loss. Performance is comparable to RLHF, with fewer models used in the process. |
@WuJunde | alpaca-glassoff | en | LLAMA | a mini image-accepting chat AI that can run on your own laptop, based on stanford-alpaca and alpaca-lora. |
@JosephusCheung | Guanaco | multi | LLaMA-7B | A Multilingual Instruction-Following Language Model. |
@FreedomIntelligence | LLM Zoo | multi | BLOOMZ/LLaMA | a project that provides data, models, and evaluation benchmark for large language models. model released: Phoenix, Chimera |
SZU | Linly | en/zh | LLAMA | expands the Chinese vocabulary; full fine-tuned models; the largest LLaMA-based Chinese models; aggregation of Chinese instruction data; reproducible details. |
@lamini-ai | lamini | multi | - | data generator for generating instructions to train instruction-following LLMs. |
Stability-AI | StableVicuna | en | LLAMA | a further instruction fine tuned and RLHF trained version of Vicuna v0 13b, with better performance than Vicuna. |
Hugging Face | HuggingChat | en | LLAMA | seems to be the first one available to access as a platform that appears similar to ChatGPT. |
microsoft | WizardLM | en | LLAMA | trained with 70k evolved instructions. Evol-Instruct is a novel method that uses LLMs instead of humans to automatically mass-produce open-domain instructions of various difficulty levels and skill ranges, to improve the performance of LLMs. |
FDU | OpenChineseLLaMA | en/zh | LLaMA-7B | further pretrain LLaMA on Chinese data, improving LLaMA performance on Chinese tasks. |
@chenfeng357 | open-Chinese-ChatLLaMA | en/zh | LLAMA | The complete training code of the open-source Chinese-Llama model, including the full process from pre-training, instruction tuning, and RLHF. |
@FSoft-AI4Code | CodeCapybara | en | LLAMA | Open Source LLaMA Model that Follow Instruction-Tuning for Code Generation. |
@mbzuai-nlp | LaMini-LM | en | LLaMA/Flan-T5... | A Diverse Herd of Distilled Models from Large-Scale Instructions. |
NTU | Panda | en/zh | LLAMA | further pretraining on Chinese data, full-size of LLaMA models. |
IBM/CMU/MIT | Dromedary | en | LLaMA-65B | Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision. |
@melodysdreamj | WizardVicunaLM | multi | Vicuna | Wizard's dataset + ChatGPT's conversation extension + Vicuna's tuning method, achieving approximately 7% performance improvement over Vicuna. |
sambanovasystems | BLOOMChat | multi | BLOOM | BLOOMChat is a 176 billion parameter multilingual chat model. It is instruction tuned from BLOOM (176B) on assistant-style conversation datasets and supports conversation, question answering and generative answers in multiple languages. |
TII | Falcon-7B-Instruct | en | Falcon-7B | a 7B parameters causal decoder-only model built by TII based on Falcon-7B and finetuned on a mixture of chat/instruct datasets. |
TII | Falcon-40B-Instruct | multi | Falcon-40B | a 40B parameters causal decoder-only model built by TII based on Falcon-40B and finetuned on a mixture of Baize. |
USTC, etc. | ExpertLLaMA | en | LLAMA | use in-context learning to automatically write customized expert identities, whose quality is quite satisfying. The corresponding expert identity is then prepended to each instruction to produce augmented instruction-following data. The overall framework is referred to as ExpertPrompting; find more details in the paper. |
ZJU | CaMA | en/zh | LLAMA | further pretrained on Chinese corpus without expansion of vocabulary; optimized for information extraction (IE) tasks. The pre-training script is available, which includes transformation, construction, and loading of large-scale corpora, as well as the LoRA instruction fine-tuning script. |
THU | UltraChat | en | LLAMA | First, the UltraChat dataset provides a rich resource for the training of chatbots. Second, by fine-tuning the LLaMA model, the researchers successfully created a dialogue model UltraLLaMA with superior performance. |
RUC | YuLan-Chat | en/zh | LLAMA | developed based on fine-tuning LLaMA with high-quality English and Chinese instructions. |
AI2 | Tulu | en | LLaMA/Pythia/OPT | a suite of LLaMa models fully-finetuned on a strong mix of datasets. |
KAIST | SelFee | en | LLAMA | Iterative Self-Revising LLM Empowered by Self-Feedback Generation. |
@lyogavin | Anima | en/zh | LLAMA | trained based on QLoRA's 33B guanaco, fine-tuned for 10000 steps. |
THU | ChatGLM2-6B | en/zh | - | ChatGLM2-6B is the second-generation version of the open-source bilingual (Chinese-English) chat model ChatGLM-6B. It retains the smooth conversation flow and low deployment threshold of the first-generation model, while introducing new features: stronger performance, longer context, more efficient inference, and a more open license. |
OpenChat | OpenChat | en | LLaMA, etc. | a series of open-source language models fine-tuned on a small, yet diverse and high-quality dataset of multi-round conversations. Specifically, we utilize only ~6K GPT-4 conversations directly filtered from the ~90K ShareGPT conversations. Despite the small size of the dataset, OpenLLMs has demonstrated remarkable performance. |
CAS | BayLing | multi | LLAMA | BayLing is an English/Chinese LLM equipped with advanced language alignment, showing superior capability in English/Chinese generation, instruction following and multi-turn interaction. |
stabilityai | FreeWilly/FreeWilly2 | en | LLaMA/LLaMA2 | FreeWilly is a LLaMA-65B model fine-tuned on an Orca-style dataset. FreeWilly2 is a LLaMA2-70B model fine-tuned on an Orca-style dataset. FreeWilly2 outperforms Llama2 70B on the Hugging Face Open LLM leaderboard. |
alibaba | Qwen-7B | en/zh | - | 7B-parameter version of the large language model series, Qwen (abbr. Tongyi Qianwen), proposed by Alibaba Cloud. |
ZJU | KnowLM | en/zh | LLAMA | With the rapid development of deep learning technology, large language models such as ChatGPT have made substantial strides in the realm of natural language processing. However, these expansive models still encounter several challenges in acquiring and comprehending knowledge, including the difficulty of updating knowledge and potential knowledge discrepancies and biases, collectively known as knowledge fallacies . The KnowLM project endeavors to tackle these issues by launching an open-source large-scale knowledgable language model framework and releasing corresponding models. |
NEU | TechGPT | en/zh | LLAMA | TechGPT mainly strengthens the following three types of tasks: - Various information extraction tasks such as relation triplet extraction with "knowledge graph construction" as the core - Various intelligent question-and-answer tasks centered on "reading comprehension". - Various sequence generation tasks such as keyword generation with "text understanding" as the core. |
@MiuLab | Taiwan-LLaMa | en/zh | LLaMA2 | Traditional Chinese LLMs for Taiwan. |
Xwin-LM | Xwin-LM | en | LLaMA2 | Xwin-LM aims to develop and open-source alignment technologies for large language models, including supervised fine-tuning (SFT), reward models (RM), rejection sampling, reinforcement learning from human feedback (RLHF), etc. Our first release, built upon the Llama2 base models, ranked TOP-1 on AlpacaEval. Notably, it's the first to surpass GPT-4 on this benchmark. |
wenge-research | Yayi | en/zh | LLaMA/LLaMA2 | YaYi was fine-tuned on millions of artificially constructed high-quality domain data. This training data covers five key domains: media publicity, public opinion analysis, public safety, financial risk control, and urban governance, encompassing over a hundred natural language instruction tasks. |
HuggingFace | zephyr | en | Mistral | Zephyr is a series of language models that are trained to act as helpful assistants. Zephyr-7B-α is the first model in the series, and is a fine-tuned version of mistralai/Mistral-7B-v0.1 that was trained on a mix of publicly available, synthetic datasets using Direct Preference Optimization (DPO). |
Cohere | Command-R / Command R+ | multi | - | Command-R has the capability for multilingual generation evaluated in 10 languages and highly performant RAG capabilities. |
xAI | grok | en | - | 314B MoE; context length: 8192. |
databricks | dbrx-instruct | - | - | a fine-grained mixture-of-experts (MoE) architecture with 132B total parameters, of which 36B parameters are active on any input. It was pre-trained on 12T tokens of text and code data. Compared to other open MoE models like Mixtral-8x7B and Grok-1, DBRX is fine-grained, meaning it uses a larger number of smaller experts. DBRX has 16 experts and chooses 4, while Mixtral-8x7B and Grok-1 have 8 experts and choose 2. |
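Many rows above (Alpaca-LoRA, baize, Chinese-Vicuna, Anima) use the same LoRA recipe: freeze the base model and train small low-rank adapter matrices instead. Below is a minimal sketch with `peft`; the checkpoint id and hyper-parameters are illustrative and not the exact values used by any listed project.

```python
# Sketch of the LoRA recipe used by several rows above: freeze the base model and
# train small low-rank adapters. Model id and hyper-parameters are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_id = "huggyllama/llama-7b"  # placeholder base checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)

lora_cfg = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections, as in Alpaca-LoRA
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of the base weights

# From here, training proceeds with an ordinary causal-LM loss over
# instruction/response pairs (e.g. the 52K Alpaca examples), using
# transformers.Trainer or trl's SFTTrainer.
```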
contributor | model/method | method | main feature |
---|---|---|---|
FuseAI | FuseChat | Firstly, it undertakes pairwise knowledge fusion for source LLMs to derive multiple target LLMs of identical structure and size via lightweight fine-tuning. Then, these target LLMs are merged within the parameter space, wherein we propose a novel method VaRM for determining the merging weights based on the variation ratio of parameter matrices before and after fine-tuning. | a fusion of three prominent chat LLMs with diverse architectures and scales, namely NH2-Mixtral-8x7B, NH2-Solar-10.7B, and OpenChat-3.5-7B. FuseChat-7B-VaRM achieves an average performance of 8.22 on MT-Bench, outperforming various powerful chat LLMs at 7B and 34B scales like Starling-7B and Yi-34B-Chat, even surpassing GPT-3.5 (March), Claude-2.1, and approaching Mixtral-8x7B-Instruct. |
arcee-ai | mergekit | - | Tools for merging pretrained large language models (see the parameter-averaging sketch after this table). |
SakanaAI | EvoLLM | - | Evolutionary optimization of model merging recipes. |
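FuseChat's second stage and mergekit both operate purely in parameter space on same-architecture checkpoints. The sketch below shows the simplest possible variant, uniform weight averaging; the model ids are placeholders, and FuseChat's VaRM would replace the uniform weights with per-matrix weights derived from how much each parameter changed during fine-tuning.

```python
# Minimal parameter-space merge: average the weights of same-architecture checkpoints.
# This is the simplest case of what mergekit automates; model ids are placeholders.
import torch
from transformers import AutoModelForCausalLM

model_ids = ["target-llm-a", "target-llm-b", "target-llm-c"]  # placeholders, same architecture
weights = [1 / len(model_ids)] * len(model_ids)               # uniform; VaRM sets these per matrix

models = [AutoModelForCausalLM.from_pretrained(m) for m in model_ids]
merged = models[0]

with torch.no_grad():
    merged_state = merged.state_dict()
    for name, tensor in merged_state.items():
        if not torch.is_floating_point(tensor):
            continue  # leave integer buffers (if any) from the first model as-is
        merged_state[name] = sum(
            w * m.state_dict()[name].to(torch.float32) for w, m in zip(weights, models)
        ).to(tensor.dtype)
    merged.load_state_dict(merged_state)

merged.save_pretrained("merged-model")
```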
(maybe successors to the transformer?)
contributor | method | main feature |
---|---|---|
BlinkDL | RWKV-LM | RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable), so it combines the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" context length, and free sentence embeddings. |
MSRA | RetNet | simultaneously achieves training parallelism, low-cost inference, and good performance. The authors theoretically derive the connection between recurrence and attention, then propose the retention mechanism for sequence modeling, which supports three computation paradigms: parallel, recurrent, and chunkwise recurrent. Specifically, the parallel representation allows for training parallelism; the recurrent representation enables low-cost O(1) inference, which improves decoding throughput, latency, and GPU memory without sacrificing performance; and the chunkwise recurrent representation facilitates efficient long-sequence modeling with linear complexity, where each chunk is encoded in parallel while recurrently summarizing the chunks. Experimental results on language modeling show that RetNet achieves favorable scaling results, parallel training, low-cost deployment, and efficient inference, making it a strong successor to the Transformer for large language models (see the toy recurrence sketch after this table). |
Stanford | Backpack | Backpack is a drop-in replacement for a transformer that provides new tools for interpretability-through-control. Backpacks decompose the predictive meaning of words into components non-contextually and aggregate them by a weighted sum, allowing for precise, predictable interventions. |
Stanford, etc. | Monarch Mixer (M2) | The basic idea is to replace the major elements of a transformer with Monarch matrices, which are sub-quadratic, hardware-efficient, and expressive. In Monarch Mixer, layers built up from Monarch matrices do both mixing across the sequence (replacing attention) and mixing across the model dimension (replacing the dense MLP). |
CMU, etc. | Mamba | Mamba is a new state space model architecture showing promising performance on information-dense data such as language modeling, where previous subquadratic models fall short of transformers. It builds on the line of progress on structured state space models, with an efficient hardware-aware design and implementation in the spirit of FlashAttention. |
TogetherComputer | StripedHyena | StripedHyena is the first alternative model competitive with the best open-source transformers of similar size in short- and long-context evaluations. It is a hybrid architecture composed of multi-head, grouped-query attention and gated convolutions arranged in Hyena blocks, different from traditional decoder-only transformers. 1. Constant memory decoding in Hyena blocks via representation of convolutions as state-space models (modal or canonical form), or as truncated filters. 2. Lower latency, faster decoding and higher throughput than transformers. 3. Improvements to training- and inference-optimal scaling laws, compared to optimized transformer architectures such as Llama-2. 4. Trained on sequences of up to 32k, allowing it to process longer prompts. |
microsoft | bGPT | bGPT supports generative modeling via next-byte prediction on any type of data and can perform any task executable on a computer, showcasing the capability to simulate all activities within the digital world, with its potential only limited by computational resources and our imagination. |
DeepMind | Griffin-jax | JAX + Flax implementation of Griffin ("Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models"); not the official code (the official code is not released yet). The RG-LRU layer is a novel gated linear recurrent layer, around which a new recurrent block is designed to replace MQA. Two new models are built using this block: Hawk, which interleaves MLPs with recurrent blocks, and Griffin, a hybrid model which interleaves MLPs with a mixture of recurrent blocks and local attention. Griffin-3B outperforms Mamba-3B, and Griffin-7B and Griffin-14B achieve performance competitive with Llama-2 despite being trained on far fewer tokens; Griffin can extrapolate on sequences significantly longer than those seen during training. |
AI21 | Jamba | Jamba is the first production-scale Mamba implementation. It is a pretrained mixture-of-experts (MoE) generative text model with 12B active parameters and a total of 52B parameters across all experts. It supports a 256K context length and can fit up to 140K tokens on a single 80GB GPU. |
Meta | Megalodon | Megalodon inherits the architecture of MEGA (exponential moving average with gated attention) and further introduces multiple technical components to improve its capability and stability, including the complex exponential moving average (CEMA), the timestep normalization layer, the normalized attention mechanism, and pre-norm with two-hop residual configuration. In a controlled head-to-head comparison with LLAMA2, Megalodon achieves better efficiency at the scale of 7 billion parameters and 2 trillion training tokens. |
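The common thread in the table above (RWKV, RetNet, Mamba, Griffin) is a linear recurrence that can be evaluated step by step with O(1) state at inference time or in closed form over the whole sequence at training time. The NumPy toy below illustrates that equivalence for a single-channel gated recurrence; it is a didactic sketch of the idea, not an implementation of any of these architectures.

```python
# Toy gated linear recurrence h_t = a_t * h_{t-1} + b_t * x_t, evaluated two ways:
# (1) sequentially with O(1) state (how these models decode), and
# (2) in closed form over the whole sequence (how they can train in parallel).
import numpy as np

rng = np.random.default_rng(0)
T = 8
x = rng.normal(size=T)
a = rng.uniform(0.5, 0.99, size=T)   # decay / gate per step
b = rng.normal(size=T)               # input gate per step

# (1) Recurrent evaluation: constant memory, one step at a time.
h = 0.0
recurrent = []
for t in range(T):
    h = a[t] * h + b[t] * x[t]
    recurrent.append(h)
recurrent = np.array(recurrent)

# (2) Closed form: h_t = sum_{s<=t} (prod_{r=s+1..t} a_r) * b_s * x_s.
A = np.cumprod(a)                     # A_t = a_1 * ... * a_t
scaled = b * x / A                    # b_s * x_s / A_s
parallel = A * np.cumsum(scaled)      # A_t * sum_{s<=t} b_s * x_s / A_s

print(np.allclose(recurrent, parallel))  # True: same outputs, very different cost profiles
```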
contributor | Model/Project | main feature |
---|---|---|
Mistral AI | Mixtral-8x7B | The Mixtral-8x7B large language model (LLM) is a pretrained generative sparse mixture-of-experts model. |
Shanghai AI Lab, etc. | LLaMA-MoE | A small and affordable MoE model based on LLaMA and SlimPajama. The number of activated model parameters is only 3.0-3.5B, which is friendly for deployment and research usage. |
NUS, etc. | OpenMoE | A family of open-sourced mixture-of-experts (MoE) large language models. |
Snowflake | Arctic | Arctic uses a unique dense-MoE hybrid transformer architecture. It combines a 10B dense transformer model with a residual 128x3.66B MoE MLP, resulting in 480B total and 17B active parameters chosen using top-2 gating (see the routing sketch after this table). |
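Mixtral, DBRX and Arctic above all route each token to a few experts via a learned gate (top-2 for Mixtral and Arctic, top-4 of 16 for DBRX). The PyTorch sketch below shows a minimal top-k gate over a bank of expert MLPs; dimensions and the renormalized-softmax detail follow the commonly described Mixtral-style formulation but are illustrative only.

```python
# Minimal top-k mixture-of-experts layer: a router picks k experts per token and the
# expert outputs are combined with renormalized router probabilities (Mixtral-style).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
             for _ in range(n_experts)]
        )

    def forward(self, x):                          # x: (tokens, d_model)
        logits = self.router(x)                    # (tokens, n_experts)
        top_vals, top_idx = logits.topk(self.k, dim=-1)
        gates = F.softmax(top_vals, dim=-1)        # renormalize over the selected experts only
        out = torch.zeros_like(x)
        for slot in range(self.k):                 # plain loops for clarity, not efficiency
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e       # tokens whose slot-th choice is expert e
                if mask.any():
                    out[mask] += gates[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(5, 64)                        # 5 toy "tokens"
print(TopKMoE()(tokens).shape)                     # torch.Size([5, 64])
```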
contributor | project | language | base model | main feature |
---|---|---|---|---|
Baihai AI | IDPChat | en/zh | LLaMA-13B / Stable Diffusion | Open Chinese multi-modal model, single-GPU runnable, easy to deploy, UI provided. |
KAUST | MiniGPT-4 | en/zh | LLAMA | MiniGPT-4 aligns a frozen visual encoder from BLIP-2 with a frozen LLM, Vicuna, using just one projection layer, and yields many emerging vision-language capabilities similar to those demonstrated in GPT-4 (see the projector sketch after this table). |
MSR, etc. | LLaVA | en | LLAMA | Visual instruction tuning is proposed, towards building large language and vision models with GPT-4-level capabilities. |
NUS/THU | VPGTrans | en | LLaMA/OPT/Flan-T5/BLIP-2... | Transferring VPG across LLMs to build VL-LLMs at significantly lower cost: the GPU hours can be reduced by over 10 times and the training data can be reduced to about 10%. Two novel VL-LLMs are released via VPGTrans, including VL-LLaMA and VL-Vicuna. VL-LLaMA is a multimodal version of LLaMA obtained by transferring BLIP-2 OPT-6.7B to LLaMA via VPGTrans. VL-Vicuna is a GPT-4-like multimodal chatbot based on the Vicuna LLM. |
CAS, etc. | X-LLM | en/zh | ChatGLM-6B | X-LLM converts multi-modalities (images, speech, video) into foreign languages using X2L interfaces and feeds them into a large language model (ChatGLM) to build a multimodal LLM, achieving impressive multimodal capabilities. |
NTU | Otter | en | OpenFlamingo | A multi-modal model based on OpenFlamingo (the open-source version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following ability and in-context learning. Furthermore, it optimizes OpenFlamingo's implementation, democratizing the required training resources from 1x A100 GPU to 4x RTX-3090 GPUs. |
XMU | LaVIN | en | LLAMA | Proposes a novel and affordable solution for vision-language instruction tuning, namely Mixture-of-Modality Adaptation (MMA). Particularly, MMA is an end-to-end optimization regime which connects the image encoder and the LLM via lightweight adapters. A novel routing algorithm in MMA helps the model automatically shift the reasoning paths for single- and multi-modal instructions. |
USTC | Woodpecker | - | - | The first work to correct hallucination in multimodal large language models. |
hpcaitech | Open-Sora | - | - | Open-source alternative to OpenAI's Sora. |
See also: Awesome-Multimodal-Large-Language-Models
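MiniGPT-4 and LLaVA in the table above share the same core trick: a frozen vision encoder, a frozen (or lightly tuned) LLM, and a small trainable projection that maps visual features into the LLM's embedding space. Below is a schematic PyTorch sketch of that glue layer; the feature dimensions are placeholders.

```python
# Schematic of the vision-to-LLM bridge used by MiniGPT-4 / LLaVA-style models:
# only the projection is trained; the encoder and the LLM stay frozen.
import torch
import torch.nn as nn

VISION_DIM, LLM_DIM = 1408, 4096     # placeholder dims (e.g. a ViT encoder and a 7B LLM)

class VisualProjector(nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(VISION_DIM, LLM_DIM)   # MiniGPT-4 uses a single linear layer;
                                                      # LLaVA-1.5 uses a small MLP instead.

    def forward(self, image_features):                # (batch, n_patches, VISION_DIM)
        return self.proj(image_features)              # (batch, n_patches, LLM_DIM)

# Training outline: the projected patch embeddings are prepended to the text token
# embeddings and the frozen LLM is trained with the usual next-token loss,
# with gradients flowing only into the projector.
image_features = torch.randn(2, 32, VISION_DIM)       # stand-in for frozen encoder output
visual_tokens = VisualProjector()(image_features)
print(visual_tokens.shape)                             # torch.Size([2, 32, 4096])
```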
contributor | data/project | language | main feature |
---|---|---|---|
TogetherComputer | RedPajama-Data | en | An open-source recipe to reproduce the LLaMA training dataset (see the streaming sketch below). |
@goldsmith | Wikipedia | multi | A Pythonic wrapper for the Wikipedia API. |
See the Alpaca-CoT data collection.
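RedPajama-Data-1T above is also published as a Hugging Face dataset, so the easiest way to inspect it is streaming. The snippet below is a sketch; the subset name and the record fields are assumptions that should be checked against the dataset card, and recent `datasets` versions may additionally require `trust_remote_code=True` for script-based datasets.

```python
# Sketch: stream a slice of RedPajama-1T without downloading the full corpus.
# Subset name and record fields are assumptions -- check the dataset card.
from datasets import load_dataset

ds = load_dataset(
    "togethercomputer/RedPajama-Data-1T",
    "arxiv",              # one of the source subsets (common_crawl, c4, github, arxiv, ...)
    split="train",
    streaming=True,
)
for i, record in enumerate(ds):
    print(record["text"][:200].replace("\n", " "))
    if i == 2:
        break
```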
contributor | data | language | main feature |
---|---|---|---|
salesforce | DialogStudio | en | DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection and Instruction-Aware Models for Conversational AI. |
contributor | method | main feature |
---|---|---|
UW, etc. | Self-Instruct | using the model's own generations to create a large collection of instructional data (see the sketch after this table). |
@LiuHC0428 | Reliable-Self-Instruction | use ChatGPT to generate questions and answers based on a given text. |
PKU | Evol-Instruct | a novel method, proposed in WizardLM, that uses LLMs instead of humans to automatically mass-produce open-domain instructions of various difficulty levels and skill ranges, to improve the performance of LLMs. |
KAUST, etc. | CAMEL | a novel communicative agent framework named role-playing is proposed, which uses inception prompting to guide chat agents toward task completion while maintaining consistency with human intentions. Role-playing can be used to generate conversational data for a specific task/domain. |
@chatarena | ChatArena | a library that provides multi-agent language game environments and facilitates research on autonomous LLM agents and their social interactions. It provides a flexible framework to define multiple players, environments and the interactions between them. |
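Self-Instruct, Reliable-Self-Instruction and Evol-Instruct above all boil down to prompting a strong model with seed examples and collecting new instruction/response pairs. The sketch below shows that loop against an OpenAI-compatible chat endpoint; the prompts, model name and lack of filtering are placeholders, and the original pipelines add deduplication (e.g. ROUGE-L overlap) and quality filters.

```python
# Minimal self-instruct-style loop: prompt a strong model with seed tasks and collect
# new instruction/response pairs. Prompts, model name and filtering are illustrative only.
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment
seed_tasks = [
    "Explain the difference between supervised and unsupervised learning.",
    "Write a Python function that reverses a string.",
]

def generate_batch(seeds, n_new=5, model="gpt-3.5-turbo"):
    prompt = (
        "Here are example tasks:\n"
        + "\n".join(f"- {s}" for s in seeds)
        + f"\n\nWrite {n_new} new, diverse tasks in the same style, one per line."
    )
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    lines = resp.choices[0].message.content.splitlines()
    return [line.lstrip("- ").strip() for line in lines if line.strip()]

dataset = []
for task in generate_batch(seed_tasks):
    answer = client.chat.completions.create(
        model="gpt-3.5-turbo", messages=[{"role": "user", "content": task}]
    ).choices[0].message.content
    dataset.append({"instruction": task, "output": answer})

print(json.dumps(dataset[:1], indent=2, ensure_ascii=False))
```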
contributor | method | main feature |
---|---|---|
- | Human Evaluation | - |
OpenAI | GPT-4/ChatGPT | - |
PKU/CMU/MSRA... | PandaLM | reproducible and automated language model assessment. |
UCB | Chatbot Arena | chat with two anonymous models side-by-side and vote for which one is better, then use the Elo rating system to calculate the relative performance of the models (see the worked Elo example after this table). |
Stanford | AlpacaEval | GPT-4/Claude evaluation on the AlpacaFarm dataset. |
CLUEAI | SuperCLUE-LYB | Chinese version of Chatbot Arena developed by CLUEAI. |
SJTU, etc. | Auto-J | a new open-source generative judge that can effectively evaluate different LLMs on how well they align with human preference. |
CMU | CodeBERTScore | an automatic metric for code generation, based on BERTScore. Like BERTScore, CodeBERTScore leverages pre-trained contextual embeddings from a model such as CodeBERT and matches words in candidate and reference sentences by cosine similarity. Unlike BERTScore, CodeBERTScore also encodes natural language input or other context along with the generated code, but does not use that context to compute cosine similarities. |
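Chatbot Arena above turns pairwise votes into a leaderboard with the Elo system: each vote moves both models' ratings toward the observed outcome, scaled by how surprising it was. A small worked sketch follows; the K-factor of 32 and the 1000-point start are conventional defaults, not necessarily Arena's exact parameters.

```python
# Elo update used (in spirit) by Chatbot Arena: expected score from the rating gap,
# then move both ratings toward the observed outcome. K and the 1000 start are conventions.
def expected(r_a, r_b):
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a, r_b, score_a, k=32):
    """score_a = 1 if A wins, 0 if A loses, 0.5 for a tie."""
    e_a = expected(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1 - score_a) - (1 - e_a))

ratings = {"model_a": 1000.0, "model_b": 1000.0}
votes = ["a", "a", "b", "a", "tie"]          # toy pairwise votes
for v in votes:
    s = {"a": 1.0, "b": 0.0, "tie": 0.5}[v]
    ratings["model_a"], ratings["model_b"] = update(ratings["model_a"], ratings["model_b"], s)

print({m: round(r, 1) for m, r in ratings.items()})
```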
The status quo of Chinese (domestic) large-model evaluation
contributor | benchmark | main feature |
---|---|---|
Princeton | SWE-bench | a benchmark for evaluating large language models on real-world software issues collected from GitHub. Given a codebase and an issue, a language model is tasked with generating a patch that resolves the described problem. |
microsoft | AGIEval | a human-centric benchmark specifically designed to evaluate the general abilities of foundation models in tasks pertinent to human cognition and problem-solving. |
CLUEAI | SuperCLUE-Agent | agent evaluation benchmark based on Chinese native tasks. |
bytedance | GPT-Fathom | GPT-Fathom is an open-source and reproducible LLM evaluation suite, benchmarking 10+ leading open-source and closed-source LLMs as well as OpenAI's earlier models on 20+ curated benchmarks under aligned settings. |
See also: OpenCompass, Hugging Face.
contributor | project | main feature |
---|---|---|
CAS | Alpaca-CoT | extends CoT data to Alpaca to boost its reasoning ability. Aims at building an instruction fine-tuning (IFT) platform with extensive instruction collection and a unified interface for various large language models. |
@hiyouga | ChatGLM-Efficient-Tuning | efficient fine-tuning of ChatGLM-6B with PEFT. |
@hiyouga | LLaMA-Efficient-Tuning | fine-tuning LLaMA with PEFT (PT+SFT+RLHF with QLoRA). |
@jianzhnie | Efficient-Tuning-LLMs | efficient fine-tuning of LLMs with QLoRA. |
ColossalAI | ColossalChat | an open-source low-cost solution for cloning ChatGPT with a complete RLHF pipeline. |
microsoft | DeepSpeed-Chat | easy, fast and affordable RLHF training of ChatGPT-like models at all scales (see the ZeRO config sketch after this table). |
LAION-AI | Open Assistant | a project meant to give everyone access to a great chat-based large language model. |
HKUST | LMFlow | an extensible, convenient, and efficient toolbox for finetuning large machine learning models, designed to be user-friendly, speedy and reliable, and accessible to the entire community. |
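Several toolkits above (DeepSpeed-Chat, and Firefly's use of ZeRO) rely on DeepSpeed's ZeRO partitioning to fit SFT/RLHF training into limited GPU memory. Below is a minimal configuration sketch written as the Python dict that `deepspeed.initialize()` accepts; the values are illustrative, and the wiring is shown in comments because a real run needs the `deepspeed` launcher and GPUs.

```python
# Minimal ZeRO-2 + fp16 config sketch for DeepSpeed, expressed as the Python dict that
# deepspeed.initialize() accepts (it can also live in a ds_config.json file).
# Values are illustrative; verify key names against the DeepSpeed configuration docs.
ds_config = {
    "train_batch_size": 32,
    "train_micro_batch_size_per_gpu": 4,
    "gradient_accumulation_steps": 8,          # 4 * 8 * world_size(1) = 32
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},         # shard optimizer states and gradients
    "optimizer": {"type": "AdamW", "params": {"lr": 2e-5}},
}

# Typical wiring inside a training script launched with the `deepspeed` launcher
# (left commented so the snippet runs without a GPU/distributed environment):
#
# import deepspeed
# engine, optimizer, _, _ = deepspeed.initialize(
#     model=model, model_parameters=model.parameters(), config=ds_config
# )
# ...
# engine.backward(loss)   # replaces loss.backward()
# engine.step()           # replaces optimizer.step()

if __name__ == "__main__":
    import json
    print(json.dumps(ds_config, indent=2))
```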