SkyCode is a multilingual open-source code generation model released by Singularity Intelligence. It adopts the GPT-3 architecture and is trained on a large corpus of code. It supports mainstream programming languages such as Java, JavaScript, C, C++, Python, Go, and shell, and can understand Chinese comments. The model can complete code, solve programming problems, and more, freeing you from routine coding so you can focus on bigger problems.
Technical Advantage 1: Covering multiple programming languages
Different programming languages target different platforms and environments, and each exists for a reason. The code SkyCode can generate covers not only widely used languages such as JavaScript, Python, Java, and C, but also more than ten others including PHP, Go, and Swift, so users of different languages can all experience SkyCode's code generation capabilities.
Technical Advantage 2: Optimized for Chinese comments
The field of large pre-trained models has long been dominated by the English-language community, and GPT-3-based code generation models share this problem. Drawing on its experience building Chinese language models, Singularity Intelligence designed a Chinese-specific encoding scheme tailored to the characteristics of the language, which better matches Chinese usage habits and improves the model's understanding of Chinese comments.
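One way to see what a Chinese-specific encoding buys is to compare how many tokens the same Chinese comment costs under SkyCode's tokenizer versus a stock English-centric tokenizer. The sketch below is illustrative only; the comparison against the plain GPT-2 tokenizer is our own choice, and the exact token counts depend on the released vocabulary.

# -*- coding: utf-8 -*-
# Illustrative sketch (not from the official docs): count how many tokens a Chinese
# comment needs under SkyCode's tokenizer vs. the stock GPT-2 tokenizer. A tokenizer
# tuned for Chinese is expected to encode the same comment in fewer tokens.
from transformers import AutoTokenizer, GPT2Tokenizer

comment = "# 计算两个数的最大公约数"  # "compute the greatest common divisor of two numbers"

sky_tokenizer = AutoTokenizer.from_pretrained("SkyWork/SkyCode", trust_remote_code=True)
gpt2_tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

print("SkyCode tokens:", len(sky_tokenizer.encode(comment)))
print("GPT-2 tokens:  ", len(gpt2_tokenizer.encode(comment)))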
Technical Advantage 3: Excellent problem-solving ability
On the HumanEval dataset, which reflects the problem-solving ability of code generation models, SkyCode's pass rates are also well above those of other open-source models.
| Model | pass@1 | pass@10 | pass@100 |
|---|---|---|---|
| GPT-Neo 1.3B | 4.79% | 7.47% | 16.30% |
| GPT-Neo 2.7B | 6.41% | 11.27% | 21.37% |
| GPT-J 6B | 11.62% | 15.74% | 27.74% |
| SkyCode (2.6B) | 12.84% | 21.07% | 35.97% |
As the table shows, SkyCode, with 2.6B parameters, not only far outperforms the smaller GPT-Neo 1.3B but also clearly beats GPT-Neo 2.7B, a model of comparable size. Even against the larger GPT-J 6B, SkyCode shows stronger problem-solving ability: on pass@100, which better reflects a model's upper bound, SkyCode exceeds GPT-J by 8.23 percentage points.
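For reference, pass@k is normally computed with the unbiased estimator from the HumanEval paper (Chen et al., 2021): draw n ≥ k samples per problem, count the c that pass the unit tests, and estimate 1 - C(n-c, k)/C(n, k). The sketch below implements that estimator; whether the table above was produced exactly this way is an assumption, since the release does not say.

# Minimal sketch of the standard unbiased pass@k estimator (Chen et al., 2021).
# Assumption: the table above was computed this way; the release does not state it.
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem given n samples, c of which are correct."""
    if n - c < k:
        return 1.0  # every size-k subset must contain at least one correct sample
    # 1 - C(n-c, k) / C(n, k), computed as a numerically stable product
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Example: 200 samples per problem, 30 of them correct -> estimates for k = 1, 10, 100
print([round(pass_at_k(200, 30, k), 4) for k in (1, 10, 100)])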
---

Recommended environment:

transformers>=4.18.0
# -*- coding: utf-8 -*-
from transformers import GPT2LMHeadModel
from transformers import AutoTokenizer
from transformers import TextGenerationPipeline

# Load the SkyCode model and its tokenizer from the Hugging Face Hub
model = GPT2LMHeadModel.from_pretrained("SkyWork/SkyCode")
tokenizer = AutoTokenizer.from_pretrained("SkyWork/SkyCode", trust_remote_code=True)

# Build a text-generation pipeline; device=0 runs on the first GPU (use device=-1 for CPU)
text_generator = TextGenerationPipeline(model, tokenizer, device=0)

# Complete the prompt with up to 40 newly generated tokens, with sampling enabled
input_str = "if __name__"
max_new_tokens = 40
print(text_generator(input_str, max_new_tokens=max_new_tokens, do_sample=True))
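
Since the model is described as understanding Chinese comments, a natural follow-up is to prompt it with a Chinese comment plus a function signature. The snippet below reuses the text_generator pipeline defined above; the comment text and generation settings are illustrative, not taken from the official examples.

# Illustrative follow-up (not from the official examples): prompt with a Chinese comment
# and a function signature, reusing the text_generator pipeline defined above.
input_str = "# 判断一个数是否为质数\ndef is_prime(n):"  # "check whether a number is prime"
print(text_generator(input_str, max_new_tokens=64, do_sample=True))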
https://huggingface.co/SkyWork/SkyCode
MIT License