Registration link: https://xihe.mindspore.cn/course/foundation-model-v2/introduction
(Note: Registration is required to take part in the free course! Please also join the QQ group, where all follow-up course announcements will be posted.)
The second phase of the course will be livestreamed on Bilibili every Saturday from 14:00 to 15:00, starting October 14.
The slides and code for each lecture will be uploaded to GitHub as the course progresses, and video replays of the series will be archived on Bilibili. A review of each lecture's key points and a preview of the next lecture are published on the MindSpore official account. Everyone is also welcome to take on the series of large model tasks released by the MindSpore community.
Because the course runs over a long period, the schedule may be adjusted slightly along the way; the final announcement shall prevail. Thank you for your understanding!
Everyone is warmly welcome to help build the course. Interesting work developed on top of the course content can be submitted to the MindSpore large model platform.
If you find problems in the courseware or code while studying, would like us to cover specific topics, or have any suggestions for the course, feel free to open an issue directly in this repository.
The MindSpore technical open course is now in full swing. It is open to all developers interested in large models, and will guide you to combine theory with practice, progressively deepening your understanding of large model technology from the basics to advanced topics.
In the completed first phase (Lectures 1-10), we started from the Transformer, analyzed the evolution path of ChatGPT, and guided you step by step through building a simplified version of "ChatGPT".
The ongoing second phase (Lecture 11 onwards) has been comprehensively upgraded on the basis of the first phase. It focuses on end-to-end practice with large models, from development to application, covers more cutting-edge large model topics, and features a more diverse lineup of lecturers. We look forward to having you join us!
Lecture | Topic | Course Introduction | Video | Courseware and Code | Knowledge Point Summary |
---|---|---|---|---|---|
Lecture 1 | Transformer | The principle of multi-head self-attention. How masking is handled in masked self-attention. Training a Transformer-based machine translation task. | link | link | link |
Lecture 2 | BERT | BERT model design based on Transformer Encoder: MLM and NSP tasks. BERT's paradigm for fine-tuning downstream tasks. | link | link | link |
Lecture 3 | GPT | GPT model design based on Transformer Decoder: Next token prediction. GPT downstream task fine-tuning paradigm. | link | link | link |
Lecture 4 | GPT2 | The core innovations of GPT2: task conditioning and zero-shot learning; implementation details of the model, described as changes relative to GPT1. | link | link | link |
Lecture 5 | MindSpore automatic parallelism | Data parallelism, model parallelism, pipeline parallelism, memory optimization, and other techniques based on MindSpore's distributed parallelism capabilities. | link | link | link |
Lecture 6 | Code pre-training | The development history of code pre-training. Code data preprocessing. CodeGeeX, a large pre-trained model for code. | link | link | link |
Lecture 7 | Prompt Tuning | The shift from the pretrain-finetune paradigm to the prompt tuning paradigm. Techniques related to hard prompts and soft prompts. Adapting to tasks by changing only the prompt text. | link | link | link |
Lecture 8 | Multimodal pre-trained large model | The design, data processing, and advantages of the Zidong Taichu multimodal large model; a theoretical overview of speech recognition, its system framework, current state, and challenges. | link | / | / |
Lecture 9 | Instruction Tuning | The core idea of instruction tuning: enabling the model to understand task descriptions (instructions). Limitations of instruction tuning: it cannot support open-domain creative tasks and cannot align LM training objectives with human needs. Chain-of-thought: by providing examples in the prompt, the model can reason by analogy. | link | link | link |
Lecture 10 | RLHF | The core idea of RLHF: aligning LLMs with human behavior. A breakdown of RLHF: LLM fine-tuning, reward model training based on human feedback, and model fine-tuning via the PPO reinforcement learning algorithm. | link | link | Updating |
Lecture 11 | ChatGLM | GLM model structure, evolution from GLM to ChatGLM, ChatGLM inference deployment code demonstration | link | link | link |
Lecture 12 | Multimodal remote sensing intelligent interpretation foundation model | In this lecture, Mr. Sun Xian, researcher and deputy laboratory director at the Institute of Aerospace Information Innovation, Chinese Academy of Sciences, explains the multimodal remote sensing interpretation foundation model: the development and challenges of intelligent remote sensing technology in the era of large models, the technical routes and solutions of remote sensing foundation models, and typical application scenarios. | link | / | link |
Lecture 13 | ChatGLM2 | ChatGLM2 technical analysis, ChatGLM2 inference deployment code demonstration, ChatGLM3 feature introduction | link | link | link |
Lecture 14 | Text generation and decoding principles | Taking MindNLP as an example to explain the principles and implementation of search and sampling technology | link | link | link |
Lecture 15 | LLaMA | Background of LLaMA and an introduction to the "alpaca" family of models; analysis of the LLaMA model structure; a code demonstration of LLaMA inference deployment. | link | link | link |
Lecture 16 | LLaMA2 | An introduction to the LLaMA2 model structure; a code walkthrough demonstrating LLaMA2 chat deployment. | link | link | link |
Lecture 17 | Pengcheng Mind | The Pengcheng Mind 200B model is an autoregressive language model with 200 billion parameters. It was trained at large scale over a long period on the "Pengcheng Cloud Brain II" thousand-card cluster at the China Computing Network hub node, based on MindSpore's multi-dimensional distributed parallelism technology. The model focuses on core Chinese-language capabilities while also covering English and some multilingual capabilities, and has completed training on 1.8T tokens. | link | / | link |
Lecture 18 | CPM-Bee | Introducing CPM-Bee pre-training, inference, fine-tuning and live code demonstration | link | link | link |
Lecture 19 | RWKV1-4 | The decline of RNNs and the rise of Transformers. Are Transformers universal? The drawbacks of self-attention. RWKV, a new RNN that challenges the Transformer. Hands-on practice with the RWKV model based on MindNLP. | link | / | link |
Lecture 20 | MoE | The past and present of MoE. The implementation foundation of MoE: AllToAll communication. Mixtral 8x7B: currently the best open-source MoE large model. MoE and lifelong learning. An inference demonstration of Mixtral 8x7B based on MindSpore. | link | link | link |
Lecture 21 | Parameter-efficient fine-tuning | An introduction to the principles and code implementation of LoRA and P-Tuning. | link | link | link |
Lecture 22 | Prompt Engineering | Prompt engineering: 1. What is a prompt? 2. How to judge the quality of a prompt? 3. How to write a high-quality prompt? 4. How to produce high-quality prompts? 5. A brief discussion of some problems we encountered when putting prompts into practice. | link | / | link |
Lecture 23 | Automatic search optimization strategy for multi-dimensional hybrid parallelism | Topic 1: a time-cost model and an improved multi-dimensional bisection method / Topic 2: applying the APSS algorithm | link (Part 1 / Part 2) | link | |
Lecture 24 | Introduction to the Shusheng·Puyu large model open-source full-chain toolchain and hands-on intelligent agent development | In this lecture, we are fortunate to have Mr. Wen Xing, technical operator and technical evangelist of the Shusheng·Puyu community, and Mr. Geng Li, technical evangelist of MindSpore, explain in detail the open-source full-chain toolchain of the Shusheng·Puyu large model and demonstrate how to fine-tune, run inference, and develop intelligent agents with Shusheng·Puyu. | link | / | link |
Lecture 25 | RAG | | | | |
Lecture 26 | LangChain module analysis | Analysis of the Models, Prompts, Memory, Chains, Agents, Indexes, and Callbacks modules, with case studies. | | | |
Lecture 27 | RWKV5-6 | / | | | |
Lecture 28 | Quantization | An introduction to low-bit quantization and other related model quantization techniques. | | | |