🚀 Leaderboard

 

Central Leaderboard (sorted by HumanEval Pass@1)

| Model | Params | HumanEval | MBPP | HF | Source |
| --- | --- | --- | --- | --- | --- |
| GPT-4 + Reflexion | ? | 91.0 | 77.1 | | paper |
| GPT-4 (latest) | ? | 84.1 | 80.0 | | github |
| DeepSeek-Coder-Instruct | 33B | 79.3 | 70.0 | ckpt | github |
| DeepSeek-Coder-Instruct | 7B | 78.6 | 65.4 | ckpt | github |
| GPT-3.5-Turbo (latest) | ? | 76.2 | 70.8 | | github |
| Code-Llama | 34B | 62.2 | 61.2 | | paper |
| Pangu-Coder2 | 15B | 61.6 | | | paper |
| WizardCoder-15B | 15B | 57.3 | 51.8 | ckpt | paper |
| Code-Davinci-002 | ? | 47.0 | | | paper |
| StarCoder-15B (Prompted) | 15B | 40.8 | 49.5 | ckpt | paper |
| PaLM 2-S | ? | 37.6 | 50.0 | | paper |
| PaLM-Coder-540B | 540B | 36.0 | 47.0 | | paper |
| InstructCodeT5+ | 16B | 35.0 | | | paper |
| StarCoder-15B | 15B | 33.6 | 52.7 | ckpt | paper |
| Code-Cushman-001 | ? | 33.5 | 45.9 | | paper |
| CodeT5+ | 16B | 30.9 | | | paper |
| LLaMA2-70B | 70B | 29.9 | | ckpt | paper |
| CodeGen-16B-Mono | 16B | 29.3 | 35.3 | | paper |
| PaLM-540B | 540B | 26.2 | 36.8 | | paper |
| LLaMA-65B | 65B | 23.7 | 37.7 | | paper |
| CodeGeeX | 13B | 22.9 | 24.4 | | paper |
| LLaMA-33B | 33B | 21.7 | 30.2 | | paper |
| CodeGen-16B-Multi | 16B | 18.3 | 20.9 | | paper |
| AlphaCode | 1.1B | 17.1 | | | paper |
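
The HumanEval column above reports Pass@1. As a rough illustration of what that metric measures, here is a minimal sketch of the unbiased pass@k estimator described in "Evaluating Large Language Models Trained on Code" (listed under Pre-Training): for each problem, draw n samples, count the c samples that pass all unit tests, and compute 1 - C(n-c, k) / C(n, k). The function names `pass_at_k` and `mean_pass_at_k` are illustrative and are not taken from any toolkit. The leaderboard entries come from different sources and may have been obtained with different sampling settings, so this shows the metric only, not how any particular row was computed.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that passed all unit tests
    k: the k in pass@k (must satisfy 1 <= k <= n)
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples failed, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 - probability that a random size-k subset contains only failures.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds (n, c) pairs (hypothetical layout)."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to c / n, the fraction of samples that pass, so `pass_at_k(10, 3, 1)` is approximately 0.3.
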
| Leaderboard | Access |
| --- | --- |
| Big Code Models Leaderboard | [Source] |
| BIRD | [Source] |
| CanAiCode Leaderboard | [Source] |
| Coding LLMs Leaderboard | [Source] |
| CRUXEval Leaderboard | [Source] |
| EvalPlus | [Source] |
| HumanEval.jl | [Source] |
| InfiCoder-Eval | [Source] |
| InterCode | [Source] |
| Program Synthesis Models Leaderboard | [Source] |
| Spider | [Source] |

💡 Evaluation Toolkit:

 

  • bigcode-evaluation-harness: A framework for the evaluation of autoregressive code generation language models.
  • code-eval: A framework for the evaluation of autoregressive code generation language models on HumanEval.
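
Both toolkits score a model by executing its generated code against unit tests. The sketch below illustrates that general idea only; it is not the API of bigcode-evaluation-harness or code-eval, and the name `passes_tests` and its parameters are hypothetical. It writes a candidate solution plus its test code to a temporary file, runs it in a separate Python process with a timeout, and treats exit status 0 as a pass. It provides no sandboxing, so it should not be used on untrusted model output as written.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_tests(candidate_src: str, test_src: str, timeout_s: float = 5.0) -> bool:
    """Return True if candidate_src followed by test_src runs and exits with status 0."""
    program = candidate_src + "\n\n" + test_src + "\n"
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "candidate.py"
        script.write_text(program, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,  # keep candidate output off our stdout
                timeout=timeout_s,    # treat hangs and infinite loops as failures
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            return False
    # A failed assertion or any uncaught exception gives a non-zero exit status.
    return proc.returncode == 0
```

Counting how many of a problem's sampled solutions pass such a check yields the per-problem pass count that pass@k style metrics are computed from.
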

📚 Paper

 

▶️ Pre-Training

 

  1. Evaluating Large Language Models Trained on Code Preprint

    [Paper] Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, et al. 2021.07

  2. CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis ICLR23

    [Paper] Erik Nijkamp, Bo Pang, Hiroaki Hayashi, Lifu Tu, Huan Wang, Yingbo Zhou, Silvio Savarese, Caiming Xiong. 2022.03

  3. ERNIE-Code: Beyond English-Centric Cross-lingual Pretraining for Programming Languages ACL23 (Findings)

    [Paper][Repo] Yekun Chai, Shuohuan Wang, Chao Pang, Yu Sun, Hao Tian, and Hua Wu. 2022.12

  4. SantaCoder: don't reach for the stars! Preprint

    [Paper] Loubna Ben Allal, Raymond Li, Denis Kocetkov, Chenghao Mou, Christopher Akiki, Carlos Munoz Ferrandis, Niklas Muennighoff, et al. 2023.01

  5. CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X Preprint

    [Paper] Qinkai Zheng, Xiao Xia, Xu Zou, Yuxiao Dong, Shan Wang, Yufei Xue, Zihan Wang, Lei Shen, Andi Wang, Yang Li, Teng Su, Zhilin Yang, Jie Tang. 2023.03

  6. CodeGen2: Lessons for Training LLMs on Programming and Natural Languages ICLR23

    [Paper] Erik Nijkamp, Hiroaki Hayashi, Caiming Xiong, Silvio Savarese, Yingbo Zhou. 2023.05

  7. StarCoder: may the source be with you! Preprint

    [Paper] Raymond Li, Loubna Ben Allal, Yangtian Zi, Niklas Muennighoff, Denis Kocetkov, Chenghao Mou, et al. 2023.05

  8. CodeT5+: Open Code Large Language Models for Code Understanding and Generation Preprint

    [Paper] Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D.Q. Bui, Junnan Li, Steven C.H. Hoi. 2023.05

  9. Textbooks Are All You Need Preprint

    [Paper] Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, et al. 2023.06

  10. Code Llama: Open Foundation Models for Code Preprint

    [Paper] Baptiste Rozière, Jonas Gehring, Fabian Gloeckle, Sten Sootla, Itai Gat, et al. 2023.08

  11. DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence Preprint

    [Paper] Daya Guo, Qihao Zhu, Dejian Yang, Zhenda Xie, Kai Dong, Wentao Zhang, Guanting Chen, et al. 2024.01

  12. StarCoder 2 and The Stack v2: The Next Generation Preprint

    [Paper] Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, et al. 2024.02

▶️ Instruction Tuning

 

  1. Code Alpaca: An Instruction-following LLaMA Model trained on code generation instructions

    [Repo] Sahil Chaudhary. 2023

  2. WizardCoder: Empowering Code Large Language Models with Evol-Instruct Preprint

    [Paper] Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang. 2023.07

  3. OctoPack: Instruction Tuning Code Large Language Models Preprint

    [Paper][Repo] Niklas Muennighoff, Qian Liu, Armel Zebaze, Qinkai Zheng, Binyuan Hui, Terry Yue Zhuo, Swayam Singh, Xiangru Tang, Leandro von Werra, Shayne Longpre. 2023.08

  4. Magicoder: Source Code Is All You Need Preprint

    [Paper][Repo] Yuxiang Wei, Zhe Wang, Jiawei Liu, Yifeng Ding, Lingming Zhang. 2023.12

▶️ Alignment with Feedback

 

  1. CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning NeurIPS22

    [Paper] Hung Le, Yue Wang, Akhilesh Deepak Gotmare, Silvio Savarese, Steven C.H. Hoi. 2022.07

  2. Execution-based Code Generation using Deep Reinforcement Learning TMLR23

    [Paper] Parshin Shojaee, Aneesh Jain, Sindhu Tipirneni, Chandan K. Reddy. 2023.01

  3. RLTF: Reinforcement Learning from Unit Test Feedback Preprint

    [Paper] Jiate Liu, Yiqin Zhu, Kaiwen Xiao, Qiang Fu, Xiao Han, Wei Yang, Deheng Ye. 2023.07

  4. PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback Preprint

    [Paper] Bo Shen, Jiaxin Zhang, Taihong Chen, Daoguang Zan, Bing Geng, An Fu, Muhan Zeng, Ailun Yu, Jichuan Ji, Jingyang Zhao, Yuenan Guo, Qianxiang Wang. 2023.07

▶️ Prompting

 

  1. CodeT: Code Generation with Generated Tests ICLR23

    [Paper] Bei Chen, Fengji Zhang, Anh Nguyen, Daoguang Zan, Zeqi Lin, Jian-Guang Lou, Weizhu Chen. 2022.07

  2. Coder Reviewer Reranking for Code Generation ICML23

    [Paper] Tianyi Zhang, Tao Yu, Tatsunori B Hashimoto, Mike Lewis, Wen-tau Yih, Daniel Fried, Sida I Wang. 2022.11

  3. LEVER: Learning to Verify Language-to-Code Generation with Execution ICML23

    [Paper] Ansong Ni, Srini Iyer, Dragomir Radev, Ves Stoyanov, Wen-tau Yih, Sida I. Wang, Xi Victoria Lin. 2023.02

  4. Teaching Large Language Models to Self-Debug Preprint

    [Paper] Xinyun Chen, Maxwell Lin, Nathanael Schärli, Denny Zhou. 2023.06

  5. Demystifying GPT Self-Repair for Code Generation Preprint

    [Paper] Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang, Jianfeng Gao, Armando Solar-Lezama. 2023.06

  6. SelfEvolve: A Code Evolution Framework via Large Language Models Preprint

    [Paper] Shuyang Jiang, Yuhao Wang, Yu Wang. 2023.06

▶️ Evaluation & Benchmark

 

  1. Measuring Coding Challenge Competence With APPS NeurIPS21

    Named APPS

    [Paper][Repo] Dan Hendrycks, Steven Basart, Saurav Kadavath, Mantas Mazeika, Akul Arora, Ethan Guo, Collin Burns, Samir Puranik, Horace He, Dawn Song, Jacob Steinhardt. 2021.05

  2. Program Synthesis with Large Language Models Preprint

    Named MBPP

    [Paper] Jacob Austin, Augustus Odena, Maxwell Nye, Maarten Bosma, Henryk Michalewski, David Dohan, Ellen Jiang, Carrie Cai, Michael Terry, Quoc Le, Charles Sutton. 2021.08

  3. DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation ICML23

    [Paper] Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida Wang, Tao Yu. 2022.11

  4. RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems Preprint

    [Paper] Tianyang Liu, Canwen Xu, Julian McAuley. 2023.06

  5. Can ChatGPT replace StackOverflow? A Study on Robustness and Reliability of Large Language Model Code Generation Preprint

    [Paper] Li Zhong, Zilong Wang. 2023.08

  6. RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation EMNLP23

    [Paper] Fengji Zhang, Bei Chen, Yue Zhang, Jacky Keung, Jin Liu, Daoguang Zan, Yi Mao, Jian-Guang Lou, Weizhu Chen. 2023.10

  7. CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion NeurIPS23

    [Paper] Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Hantian Ding, Ming Tan, Nihal Jain, Murali Krishna Ramanathan, et al. 2023.11

  8. SWE-bench: Can Language Models Resolve Real-World GitHub Issues? ICLR24

    [Paper] Carlos E. Jimenez, John Yang, Alexander Wettig, Shunyu Yao, Kexin Pei, Ofir Press, Karthik Narasimhan. 2023.10

  9. DevBench: A Comprehensive Benchmark for Software Development Preprint

    [Paper][Repo] Bowen Li, Wenhan Wu, Ziwei Tang, Lin Shi, John Yang, et al. 2024.03

  10. LongCoder: A Long-Range Pre-trained Language Model for Code Completion ICML23

    [Paper] Daya Guo, Canwen Xu, Nan Duan, Jian Yin, Julian McAuley. 2023.10

  11. Coeditor: Leveraging Contextual Changes for Multi-round Code Auto-editing Preprint

    [Paper] Jiayi Wei, Greg Durrett, Isil Dillig. 2023.05

  12. Automating Code Review Activities by Large-Scale Pre-training Preprint

    [Paper] Zhiyu Li, Shuai Lu, Daya Guo, Nan Duan, Shailesh Jannu, Grant Jenks, Deep Majumder, Jared Green, Alexey Svyatkovskiy, Shengyu Fu, Neel Sundaresan. 2022.10

▶️ Using LLMs while coding

 

  1. Awesome-DevAI: A list of resources about using LLMs while building software Awesome

    [Repo] Ty Dunn, Nate Sesti. 2023.10
