pgmr.cloud
19 May 2023
Timeline of LLMs
List of LLMs
| Category | Model | Release Time | Size (B) | Link |
|---|---|---|---|---|
| Publicly Accessible | T5 | 2019/10 | 11 | Paper |
| | mT5 | 2021/03 | 13 | Paper |
| | PanGu-α | 2021/05 | 13 | Paper |
| | CPM-2 | 2021/05 | 198 | Paper |
| | T0 | 2021/10 | 11 | Paper |
| | GPT-NeoX-20B | 2022/02 | 20 | Paper |
| | CodeGen | 2022/03 | 16 | Paper |
| | Tk-Instruct | 2022/04 | 11 | Paper |
| | UL2 | 2022/02 | 20 | Paper |
| | OPT | 2022/05 | 175 | Paper |
| | YaLM | 2022/06 | 100 | GitHub |
| | NLLB | 2022/07 | 55 | Paper |
| | BLOOM | 2022/07 | 176 | Paper |
| | GLM | 2022/08 | 130 | Paper |
| | Flan-T5 | 2022/10 | 11 | Paper |
| | mT0 | 2022/11 | 13 | Paper |
| | Galactica | 2022/11 | 120 | Paper |
| | BLOOMZ | 2022/11 | 176 | Paper |
| | OPT-IML | 2022/12 | 175 | Paper |
| | Pythia | 2023/01 | 12 | Paper |
| | LLaMA | 2023/02 | 65 | Paper |
| | Vicuna | 2023/03 | 13 | Blog |
| | ChatGLM | 2023/03 | 6 | GitHub |
| | CodeGeeX | 2023/03 | 13 | Paper |
| | Koala | 2023/04 | 13 | Blog |
| Closed Source | GShard | 2020/01 | 600 | Paper |
| | GPT-3 | 2020/05 | 175 | Paper |
| | LaMDA | 2021/05 | 137 | Paper |
| | HyperCLOVA | 2021/06 | 82 | Paper |
| | Codex | 2021/07 | 12 | Paper |
| | ERNIE 3.0 | 2021/07 | 10 | Paper |
| | Jurassic-1 | 2021/08 | 178 | Paper |
| | FLAN | 2021/10 | 137 | Paper |
| | MT-NLG | 2021/10 | 530 | Paper |
| | Yuan 1.0 | 2021/10 | 245 | Paper |
| | Anthropic | 2021/12 | 52 | Paper |
| | WebGPT | 2021/12 | 175 | Paper |
| | Gopher | 2021/12 | 280 | Paper |
| | ERNIE 3.0 Titan | 2021/12 | 260 | Paper |
| | GLaM | 2021/12 | 1200 | Paper |
| | InstructGPT | 2022/01 | 175 | Paper |
| | AlphaCode | 2022/02 | 41 | Paper |
| | Chinchilla | 2022/03 | 70 | Paper |
| | PaLM | 2022/04 | 540 | Paper |
| | Cohere | 2022/06 | 54 | Homepage |
| | AlexaTM | 2022/08 | 20 | Paper |
| | Luminous | 2022/09 | 70 | Docs |
| | Sparrow | 2022/09 | 70 | Paper |
| | WeLM | 2022/09 | 10 | Paper |
| | U-PaLM | 2022/10 | 540 | Paper |
| | Flan-PaLM | 2022/10 | 540 | Paper |
| | Flan-U-PaLM | 2022/10 | 540 | Paper |
| | Alpaca | 2023/03 | 7 | Blog |
| | GPT-4 | 2023/03 | - | Paper |
| | PanGu-Σ | 2023/03 | 1085 | Paper |
Resources of LLMs
Publicly Available Models
- T5: "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer". Colin Raffel et al. JMLR 2019. [Paper] [Checkpoint]
- mT5: "mT5: A massively multilingual pre-trained text-to-text transformer". Linting Xue et al. NAACL 2021. [Paper] [Checkpoint]
- PanGu-α: "PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation". Wei Zeng et al. arXiv 2021. [Paper] [Checkpoint]
- CPM-2: "CPM-2: Large-scale Cost-effective Pre-trained Language Models". Zhengyan Zhang et al. arXiv 2021. [Paper] [Checkpoint]
- T0: "Multitask Prompted Training Enables Zero-Shot Task Generalization". Victor Sanh et al. ICLR 2022. [Paper] [Checkpoint]
- GPT-NeoX-20B: "GPT-NeoX-20B: An Open-Source Autoregressive Language Model". Sid Black et al. arXiv 2022. [Paper] [Checkpoint]
- CodeGen: "CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis". Erik Nijkamp et al. arXiv 2022. [Paper] [Checkpoint]
- Tk-Instruct: "Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks". Yizhong Wang et al. EMNLP 2022. [Paper] [Checkpoint]
- UL2: "UL2: Unifying Language Learning Paradigms". Yi Tay et al. arXiv 2022. [Paper] [Checkpoint]
- OPT: "OPT: Open Pre-trained Transformer Language Models". Susan Zhang et al. arXiv 2022. [Paper] [Checkpoint]
- NLLB: "No Language Left Behind: Scaling Human-Centered Machine Translation". NLLB Team. arXiv 2022. [Paper] [Checkpoint]
- BLOOM: "BLOOM: A 176B-Parameter Open-Access Multilingual Language Model". BigScience Workshop. arXiv 2022. [Paper] [Checkpoint]
- GLM: "GLM-130B: An Open Bilingual Pre-trained Model". Aohan Zeng et al. arXiv 2022. [Paper] [Checkpoint]
- Flan-T5: "Scaling Instruction-Finetuned Language Models". Hyung Won Chung et al. arXiv 2022. [Paper] [Checkpoint]
- mT0 && BLOOMZ: "Crosslingual Generalization through Multitask Finetuning". Niklas Muennighoff et al. arXiv 2022. [Paper] [Checkpoint]
- Galactica: "Galactica: A Large Language Model for Science". Ross Taylor et al. arXiv 2022. [Paper] [Checkpoint]
- OPT-IML: "OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization". Srinivasan Iyer et al. arXiv 2022. [Paper] [Checkpoint]
- CodeGeeX: "CodeGeeX: A Pre-Trained Model for Code Generation with Multilingual Evaluations on HumanEval-X". Qinkai Zheng et al. arXiv 2023. [Paper] [Checkpoint]
- Pythia: "Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling". Stella Biderman et al. arXiv 2023. [Paper] [Checkpoint]
- LLaMA: "LLaMA: Open and Efficient Foundation Language Models". Hugo Touvron et al. arXiv 2023. [Paper] [Checkpoint]
Closed-source Models
- GShard: "GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding". Dmitry Lepikhin et al. ICLR 2021. [Paper]
- GPT-3: "Language Models are Few-Shot Learners". Tom B. Brown et al. NeurIPS 2020. [Paper]
- LaMDA: "LaMDA: Language Models for Dialog Applications". Romal Thoppilan et al. arXiv 2022. [Paper]
- HyperCLOVA: "What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers". Boseop Kim et al. EMNLP 2021. [Paper]
- CodeX: "Evaluating Large Language Models Trained on Code". Mark Chen et al. arXiv 2021. [Paper]
- ERNIE 3.0: "ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation". Yu Sun et al. arXiv 2021. [Paper]
- Jurassic-1: "Jurassic-1: Technical details and evaluation". Opher Lieber et al. 2021. [Paper]
- FLAN: "Finetuned Language Models Are Zero-Shot Learners". Jason Wei et al. ICLR 2022. [Paper]
- MT-NLG: "Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model". Shaden Smith et al. arXiv 2021. [Paper]
- Yuan 1.0: "Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning". Shaohua Wu et al. arXiv 2021. [Paper]
- Anthropic: "A General Language Assistant as a Laboratory for Alignment" . Amanda Askell et al. arXiv 2021. [Paper]
- WebGPT: "WebGPT: Browser-assisted question-answering with human feedback" . Reiichiro Nakano et al. arXiv 2021. [Paper]
- Gopher: "Scaling Language Models: Methods, Analysis & Insights from Training Gopher". Jack W. Rae et al. arXiv 2021. [Paper]
- ERNIE 3.0 Titan: "ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation". Shuohuan Wang et al. arXiv 2021. [Paper]
- GLaM: "GLaM: Efficient Scaling of Language Models with Mixture-of-Experts". Nan Du et al. ICML 2022. [Paper]
- InstructGPT: "Training language models to follow instructions with human feedback". Long Ouyang et al. arXiv 2022. [Paper]
- AlphaCode: "Competition-Level Code Generation with AlphaCode". Yujia Li et al. arXiv 2022. [Paper]
- Chinchilla: "Training Compute-Optimal Large Language Models". Jordan Hoffmann et al. arXiv 2022. [Paper]
- PaLM: "PaLM: Scaling Language Modeling with Pathways". Aakanksha Chowdhery et al. arXiv 2022. [Paper]
- AlexaTM: "AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model". Saleh Soltan et al. arXiv 2022. [Paper]
- Sparrow: "Improving alignment of dialogue agents via targeted human judgements". Amelia Glaese et al. arXiv 2022. [Paper]
- WeLM: "WeLM: A Well-Read Pre-trained Language Model for Chinese". Hui Su et al. arXiv 2022. [Paper]
- U-PaLM: "Transcending Scaling Laws with 0.1% Extra Compute". Yi Tay et al. arXiv 2022. [Paper]
- Flan-PaLM && Flan-U-PaLM: "Scaling Instruction-Finetuned Language Models". Hyung Won Chung et al. arXiv 2022. [Paper]
- GPT-4: "GPT-4 Technical Report". OpenAI. arXiv 2023. [Paper]
- PanGu-Σ: "PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing". Xiaozhe Ren et al. arXiv 2023. [Paper]
Commonly Used Corpora
- BookCorpus: "Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books". Yukun Zhu et al. ICCV 2015. [Paper] [Source]
- Gutenberg: [Source]
- CommonCrawl: [Source]
- C4: "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer". Colin Raffel et al. JMLR 2019. [Paper] [Source]
- CC-Stories-R: "A Simple Method for Commonsense Reasoning". Trieu H. Trinh et al. arXiv 2018. [Paper] [Source]
- CC-NEWS: "RoBERTa: A Robustly Optimized BERT Pretraining Approach". Yinhan Liu et al. arXiv 2019. [Paper] [Source]
- RealNews: "Defending Against Neural Fake News". Rowan Zellers et al. NeurIPS 2019. [Paper] [Source]
- OpenWebText: [Source]
- Pushshift.io: "The Pushshift Reddit Dataset". Jason Baumgartner et al. AAAI 2020. [Paper] [Source]
- Wikipedia: [Source]
- BigQuery: [Source]
- The Pile: "The Pile: An 800GB Dataset of Diverse Text for Language Modeling". Leo Gao et al. arXiv 2021. [Paper] [Source]
- ROOTS: "The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset". Laurençon et al. NeurIPS 2022 Datasets and Benchmarks Track. [Paper]
Library Resources
- Transformers: "Transformers: State-of-the-Art Natural Language Processing". Thomas Wolf et al. EMNLP 2020. [Paper] [Source]
- DeepSpeed: "DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters". Jeff Rasley et al. KDD 2020. [Paper] [Source]
- Megatron-LM: "Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism". Mohammad Shoeybi et al. arXiv 2019. [Paper] [Source]
- JAX: [Source]
- Colossal-AI: "Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training". Zhengda Bian et al. arXiv 2021. [Paper] [Source]
- BMTrain: [Source]
- FastMoE: "FastMoE: A Fast Mixture-of-Expert Training System". Jiaao He et al. arXiv 2021. [Paper] [Source]
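As a concrete entry point to the libraries above, here is a minimal sketch (not an official recipe) of loading one of the publicly available checkpoints with the Hugging Face `transformers` library and generating text; `google/flan-t5-small` is used purely as an example model ID.

```python
# Minimal sketch: load a public checkpoint with Hugging Face Transformers and generate.
# Assumes `pip install torch transformers`; the model ID below is only an example.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "google/flan-t5-small"          # any seq2seq checkpoint from the list above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("Translate to German: The house is small.", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```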
Deep Learning Frameworks
- PyTorch: "PyTorch: An Imperative Style, High-Performance Deep Learning Library". Adam Paszke et al. NeurIPS 2019. [Paper] [Source]
- TensorFlow: "TensorFlow: A system for large-scale machine learning". Martín Abadi et al. OSDI 2016. [Paper] [Source]
- MXNet: "MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems". Tianqi Chen et al. arXiv 2015. [Paper] [Source]
- PaddlePaddle: "PaddlePaddle: An Open-Source Deep Learning Platform from Industrial Practice". Yanjun Ma et al. Frontiers of Data and Computing 2019. [Paper] [Source]
- MindSpore: "Huawei MindSpore AI Development Framework". Huawei Technologies Co., Ltd. Artificial Intelligence Technology 2022. [Paper] [Source]
- OneFlow: "OneFlow: Redesign the Distributed Deep Learning Framework from Scratch". Jinhui Yuan et al. arXiv 2021. [Paper] [Source]
Pre-training
Data Collection
- "The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset". Laurençon et al. NeurIPS 2022 Datasets and Benchmarks Track. [paper]
- "Deduplicating Training Data Makes Language Models Better". Katherine Lee et al. ACL 2022. [paper]
- "Deduplicating Training Data Mitigates Privacy Risks in Language Models". Nikhil Kandpal et al. ICML 2022. [paper]
- "Scaling Laws and Interpretability of Learning from Repeated Data". Danny Hernandez et al. arXiv 2022. [paper]
Architecture
Mainstream Architectures
Causal Decoder
- "Language Models are Few-Shot Learners". Tom B. Brown et al. NeurIPS 2020. [paper]
- "OPT: Open Pre-trained Transformer Language Models". Susan Zhang et al. arXiv 2022. [paper]
- "BLOOM: A 176B-Parameter Open-Access Multilingual Language Model". Teven Le Scao et al. arXiv 2022. [paper]
- "Training Compute-Optimal Large Language Models". Jordan Hoffmann et al. arXiv 2022. [paper]
- "Scaling Language Models: Methods, Analysis & Insights from Training Gopher". Jack W. Rae et al. arXiv 2021. [paper]
- "Galactica: A Large Language Model for Science". Ross Taylor et al. arXiv 2022. [paper]
- "PaLM: Scaling Language Modeling with Pathways". Aakanksha Chowdhery et al. arXiv 2022. [paper]
- "Jurassic-1: Technical Details and Evaluation". Opher Lieber et al. AI21 Labs. [paper]
- "LaMDA: Language Models for Dialog Applications". Romal Thoppilan et al. arXiv 2022. [paper]
Prefix Decoder
- "GLM-130B: An Open Bilingual Pre-trained Model". Aohan Zeng et al. arXiv 2022. [paper]
- "GLM: General Language Model Pretraining with Autoregressive Blank Infilling". Zhengxiao Du et al. ACL 2022. [paper]
- "Transcending Scaling Laws with 0.1% Extra Compute". Yi Tay et al. arXiv 2022. [paper]
MoE
- "Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity". William Fedus et al. JMLR. [paper]
- "Unified Scaling Laws for Routed Language Models". Aidan Clark et al. ICML 2022. [paper]
SSM
- "Pretraining Without Attention". Junxiong Wang et al. arXiv 2022. [paper]
- "Efficiently Modeling Long Sequences with Structured State Spaces". Albert Gu et al. ICLR 2022. [paper]
- "Long Range Language Modeling via Gated State Spaces". Harsh Mehta et al. arXiv 2022. [paper]
Detailed Configuration
Layer Normalization
- "DeepNet: Scaling Transformers to 1,000 Layers". Hongyu Wang et al. arXiv 2022. [paper]
- "Root Mean Square Layer Normalization". Biao Zhang et al. NeurIPS 2019. [paper]
Position Encoding
- "Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation". Ofir Press et al. ICLR 2022. [paper]
- "RoFormer: Enhanced Transformer with Rotary Position Embedding". Jianlin Su et al. arXiv 2021. [paper]
Analysis
- "What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?". Thomas Wang et al. ICML 2022. [paper]
- "What Language Model to Train if You Have One Million GPU Hours?". Teven Le Scao et al. Findings of EMNLP 2022. [paper]
- "Examining Scaling and Transfer of Language Model Architectures for Machine Translation". Biao Zhang et al. ICML 2022. [paper]
- "Scaling Laws vs Model Architectures: How does Inductive Bias Influence Scaling?". Yi Tay et al. arXiv 2022. [paper]
- "Do Transformer Modifications Transfer Across Implementations and Applications?". Sharan Narang et al. EMNLP 2021. [paper]
Training Algorithms
- "Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism". Mohammad Shoeybi et al. arXiv 2019. [paper]
- "An Efficient 2D Method for Training Super-Large Deep Learning Models". Qifan Xu et al. arXiv 2021. [paper]
- "Tesseract: Parallelize the Tensor Parallelism Efficiently". Boxiang Wang et al. ICPP 2022. [paper]
- "Maximizing Parallelism in Distributed Training for Huge Neural Networks". Zhengda Bian et al. arXiv 2021. [paper]
- "GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism". Yanping Huang et al. NeurIPS 2019. [paper]
- "PipeDream: Fast and Efficient Pipeline Parallel DNN Training". Aaron Harlap et al. arXiv 2018. [paper]
- "ZeRO: Memory Optimizations Toward Training Trillion Parameter Models". Samyam Rajbhandari et al. SC 2020. [paper]
- "ZeRO-Offload: Democratizing Billion-Scale Model Training". Jie Ren et al. USENIX 2021. [paper]
Pre-training on Code
LLMs for Program Synthesis
- "Evaluating Large Language Models Trained on Code". Mark Chen et al. arXiv 2021. [paper]
- "Program Synthesis with Large Language Models". Jacob Austin et al. arXiv 2021. [paper]
- "Show Your Work: Scratchpads for Intermediate Computation with Language Models". Maxwell Nye et al. arXiv 2021. [paper]
- "A Systematic Evaluation of Large Language Models of Code". Frank F. Xu et al. arXiv 2022. [paper]
- "Competition-Level Code Generation with AlphaCode". Yujia Li et al. Science. [paper]
- "CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis". Erik Nijkamp et al. ICLR 2023. [paper]
- "InCoder: A Generative Model for Code Infilling and Synthesis". Daniel Fried et al. ICLR 2023. [paper]
- "CodeT: Code Generation with Generated Tests". Bei Chen et al. ICLR 2023. [paper]
NLP Tasks Formatted as Code
- "Language Models of Code are Few-Shot Commonsense Learners". Aman Madaan et al. EMNLP 2022. [paper]
- "Autoformalization with Large Language Models". Yuhuai Wu et al. NeurIPS 2022. [paper]
Adaptation Tuning
Instruction Tuning
- "Multi-Task Deep Neural Networks for Natural Language Understanding". Xiaodong Liu et al. ACL 2019. [Paper] [Homepage]
- "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer". Colin Raffel et al. JMLR 2020. [Paper] [Checkpoint]
- "Muppet: Massive Multi-task Representations with Pre-Finetuning". Armen Aghajanyan et al. EMNLP 2021. [Paper] [Checkpoint]
- "Cross-Task Generalization via Natural Language Crowdsourcing Instructions". Swaroop Mishra et al. ACL 2022. [Paper] [Collection]
- "CrossFit: A Few-shot Learning Challenge for Cross-task Generalization in NLP". Qinyuan Ye et al. EMNLP 2021. [Paper] [Collection]
- "Finetuned Language Models Are Zero-Shot Learners". Jason Wei et al. ICLR 2022. [Paper] [Homepage]
- "Multitask Prompted Training Enables Zero-Shot Task Generalization". Victor Sanh et al. ICLR 2022. [Paper] [Checkpoint]
- "ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning". Vamsi Aribandi et al. ICLR 2022. [Paper]
- "UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models". Tianbao Xie et al. EMNLP 2022. [Paper] [Collection] [Checkpoint]
- "PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts". Stephen H. Bach et al. ACL 2022. [Paper] [Collection]
- "Training language models to follow instructions with human feedback". Long Ouyang et al. arXiv 2022. [Paper]
- "Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks". Yizhong Wang et al. EMNLP 2022. [Paper] [Collection] [Checkpoint]
- "MVP: Multi-task Supervised Pre-training for Natural Language Generation". Tianyi Tang et al. arXiv 2022. [Paper] [Collection] [Checkpoint]
- "Crosslingual Generalization through Multitask Finetuning". Niklas Muennighoff et al. arXiv 2022. [Paper] [Collection] [Checkpoint]
- "Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization". Yuxian Gu et al. EMNLP 2022. [Paper] [Homepage]
- "Scaling Instruction-Finetuned Language Models". Hyung Won Chung et al. arXiv 2022. [Paper] [Homepage]
- "Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor". Or Honovich et al. arXiv 2022. [Paper] [Homepage]
- "Self-Instruct: Aligning Language Model with Self Generated Instructions". Yizhong Wang et al. arXiv 2022. [Paper] [Homepage]
- "OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization". Srinivasan Iyer et al. arXiv 2022. [Paper] [Checkpoint]
- "The Flan Collection: Designing Data and Methods for Effective Instruction Tuning". Shayne Longpre et al. arXiv 2023. [Paper] [Homepage]
- "Is Prompt All You Need No. A Comprehensive and Broader View of Instruction Learning". Renze Lou et al. arXiv 2023. [Paper]
Alignment Tuning
- "TAMER: Training an Agent Manually via Evaluative Reinforcement". W. Bradley Knox et al. ICDL 2008. [Paper]
- "Interactive Learning from Policy-Dependent Human Feedback". James MacGlashan et al. ICML 2017. [Paper]
- "Deep Reinforcement Learning from Human Preferences". Paul Christiano et al. NIPS 2017. [Paper]
- "Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces". Garrett Warnell et al. AAAI 2018. [Paper]
- "Fine-Tuning Language Models from Human Preferences". Daniel M. Ziegler et al. arXiv 2019. [Paper]
- "Learning to summarize from human feedback". Nisan Stiennon et al. NeurIPS 2020. [Paper]
- "Alignment of Language Agents". Zachary Kenton et al. arXiv 2021. [Paper]
- "Recursively Summarizing Books with Human Feedback". Jeff Wu et al. arXiv 2021. [Paper]
- "A General Language Assistant as a Laboratory for Alignment". Amanda Askell et al. arXiv 2021. [Paper]
- "WebGPT: Browser-assisted question-answering with human feedback". Reiichiro Nakano et al. arXiv 2021. [Paper]
- "Training language models to follow instructions with human feedback". Long Ouyang et al. arXiv 2022. [Paper]
- "Teaching language models to support answers with verified quotes". Jacob Menick et al. arXiv 2022. [Paper]
- "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback". Yuntao Bai et al. arXiv 2022. [Paper]
- "Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning". Deborah Cohen et al. arXiv 2022. [Paper]
- "Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned". Deep Ganguli et al. arXiv 2022. [Paper]
- "Improving alignment of dialogue agents via targeted human judgements". Amelia Glaese et al. arXiv 2022. [Paper]
- "Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization". Rajkumar Ramamurthy et al. arXiv 2022. [Paper]
- "Scaling Laws for Reward Model Overoptimization". Leo Gao et al. arXiv 2022. [Paper]
- "The Wisdom of Hindsight Makes Language Models Better Instruction Followers". Tianjun Zhang et al. arXiv 2023. [Paper]
Utilization
- "An Information-theoretic Approach to Prompt Engineering Without Ground Truth Labels". Taylor Sorensen et al. ACL 2022. [Paper]
- "What Makes Good In-Context Examples for GPT-3?". Jiachang Liu et al. ACL 2022. [Paper]
- "Learning to retrieve prompts for in-context learning". Ohad Rubin et al. NAACL 2022. [Paper]
- "Diverse demonstrations improve in-context compositional generalization". Itay Levy et al. arxiv 2022. [Paper]
- "Automatic Chain of Thought Prompting in Large Language Models". Zhuosheng Zhang et al. arxiv 2022. [Paper]
- "Demystifying Prompts in Language Models via Perplexity Estimation". Hila Gonen et al. arxiv 2022. [Paper]
- "Active Example Selection for In-Context Learning". Yiming Zhang et al. EMNLP 2022. [Paper]
- "Self-adaptive In-context Learning". Zhiyong Wu et al. arxiv 2022. [Paper]
- "Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity". Yao Lu et al. ACL 2022. [Paper]
- "Structured Prompting: Scaling In-Context Learning to 1,000 Examples". Hao, Yaru et al. arxiv 2022. [Paper]
- "The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning". Ye, Xi et al. arxiv 2022. [Paper]
- "Cross-Task Generalization via Natural Language Crowdsourcing Instructions". Swaroop Mishra et al. ACL 2022. [Paper]
- "Prompt-Augmented Linear Probing: Scaling Beyond the Limit of Few-shot In-Context Learner". Hyunsoo Cho et al. arxiv 2022. [Paper]
- "Self-instruct: Aligning language model with self generated instructions". Yizhong Wang et al. arxiv 2022. [Paper]
- "An Explanation of In-context Learning as Implicit Bayesian Inference". Sang Michael Xie et al. ICLR 2022. [Paper]
- "Calibrate Before Use: Improving Few-Shot Performance of Language Models". Zihao Zhao et al. ICML 2021. [Paper]
- "Data distributional properties drive emergent in-context learning in transformers". Stephanie C. Y. Chan et al. arxiv 2022. [Paper]
- "Emergent Abilities of Large Language Models". Jason Wei et al. arxiv 2022. [Paper]
- "In-context Learning and Induction Heads". Catherine Olsson et al. arxiv 2022. [Paper]
- "Language Models are Few-Shot Learners". Tom B. Brown et al. NeurIPS 2020. [Paper]
- "On the Effect of Pretraining Corpora on In-context Learning by a Large-scale Language Model". Seongjin Shin et al. NAACL 2022. [Paper]
- "Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?". Sewon Min et al. EMNLP 2022. [Paper]
- "Rethinking the Role of Scale for In-Context Learning: An Interpretability-based Case Study at 66 Billion Scale". Hritik Bansal et al. arxiv 2022. [Paper]
- "Transformers as algorithms: Generalization and implicit model selection in in-context learning". Yingcong Li et al. arxiv 2023. [Paper]
- "Transformers learn in-context by gradient descent". Johannes von Oswald et al. arxiv 2022. [Paper]
- "What learning algorithm is in-context learning? investigations with linear models". Ekin Aky{"{u}}rek et al. arxiv 2022. [Paper]
- "Chain of Thought Prompting Elicits Reasoning in Large Language Models". Jason Wei et al. arxiv 2022. [Paper]
- "STaR: Self-Taught Reasoner Bootstrapping Reasoning With Reasoning". Zelikman et al. arxiv 2022. [Paper]
- "Large language models are zero-shot reasoners". Takeshi Kojima et al. arxiv 2022. [Paper]
- "Automatic Chain of Thought Prompting in Large Language Models". Zhuosheng Zhang et al. arxiv. [Paper]
- "Complexity-Based Prompting for Multi-Step Reasoning". Yao Fu et al. arxiv 2022. [Paper]
- "Language Models are Multilingual Chain-of-Thought Reasoners". Freda Shi et al. arxiv 2022. [Paper]
- "Rationale-Augmented Ensembles in Language Models". Xuezhi Wang et al. arxiv 2022. [Paper]
- "Least-to-Most Prompting Enables Complex Reasoning in Large Language Models". Denny Zhou et al. arxiv 2022. [Paper]
- "Multimodal Chain-of-Thought Reasoning in Language Models". Zhuosheng Zhang et al. arxiv 2023. [Paper]
- "Self-Consistency Improves Chain of Thought Reasoning in Language Models". Xuezhi Wang et al. arxiv 2022. [Paper]
- "Large Language Models Can Self-Improve". Jiaxin Huang et al. arxiv 2022. [Paper]
- "Training Verifiers to Solve Math Word Problems". Karl Cobbe et al. arxiv 2021. [Paper]
- "On the Advance of Making Language Models Better Reasoners". Yifei Li et al. arxiv 2022. [Paper]
- "Large Language Models are reasoners with Self-Verification". Yixuan Weng et al. arxiv 2022. [Paper]
- "Teaching small language models to reason". Lucie Charlotte Magister et al. arxiv 2022. [Paper]
- "Large language models are reasoning teachers". Namgyu Ho et al. arxiv 2022. [Paper]
- "The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning". Ye, Xi et al. arxiv 2022. [Paper]
- "Scaling Instruction-Finetuned Language Models". Hyung Won Chung et al. arxiv 2022. [Paper]
- "Solving Quantitative Reasoning Problems with Language Models". Aitor Lewkowycz et al. arxiv 2022. [Paper]
- "Text and patterns: For effective chain of thought, it takes two to tango". Aman Madaan et al. arxiv 2022. [Paper]
- "Challenging BIG-Bench tasks and whether chain-of-thought can solve them". Mirac Suzgun et al. arxiv 2022. [Paper]
- "A Survey for In-context Learning". Qingxiu Dong et al. arxiv 2023. [Paper]
- "Reasoning with Language Model Prompting: A Survey". Shuofei Qiao et al. arxiv 2022. [Paper]
- "Towards Reasoning in Large Language Models: A Survey". Jie Huang et al. arxiv 2022. [Paper]
- "Reward Design with Language Models". Minae Kwon et al. arxiv 2023. [Paper]
- "Promptagator: Few-shot Dense Retrieval From 8 Examples". Zhuyun Dai et al. arxiv 2022. [Paper]
- "On the Feasibility of Specialized Ability Stealing for Large Language Code Models". Zongjie Li et al. arxiv 2023. [Paper]
- "MathPrompter: Mathematical Reasoning using Large Language Models". Imani, Shima et al. arxiv 2023. [Paper]
- "ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction". Jiabang He et al. arxiv 2023. [Paper]
- "Selective Annotation Makes Language Models Better Few-Shot Learners". Hongjin Su et al. arxiv 2022. [Paper]
Capacity Evaluation
- "Measuring Massive Multitask Language Understanding". Dan Hendrycks et al. ICLR 2021. [Paper]
- "Persistent Anti-Muslim Bias in Large Language Models". Abubakar Abid et al. AIES 2021. [Paper]
- "Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models". Alex Tamkin et al. arXiv 2021. [Paper]
- "BEHAVIOR: Benchmark for Everyday Household Activities in Virtual, Interactive, and Ecological Environments". Sanjana Srivastava et al. CoRL 2021. [Paper]
- "Program Synthesis with Large Language Models". Jacob Austin et al. arXiv 2021. [Paper]
- "Training Verifiers to Solve Math Word Problems". Karl Cobbe et al. arXiv 2021. [Paper]
- "Show Your Work: Scratchpads for Intermediate Computation with Language Models". Maxwell I. Nye et al. arXiv 2021. [Paper]
- "Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents". Wenlong Huang et al. ICML 2022. [Paper]
- "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models". Jason Wei et al. NeurIPS 2022. [Paper]
- "Training language models to follow instructions with human feedback". Long Ouyang et al. arXiv 2022. [Paper]
- "Competition-Level Code Generation with AlphaCode". Yujia Li et al. Science 2022. [Paper]
- "Do As I Can, Not As I Say: Grounding Language in Robotic Affordances". Michael Ahn et al. arXiv 2022. [Paper]
- "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback". Yuntao Bai et al. arXiv 2022. [Paper]
- "Autoformalization with Large Language Models". Yuhuai Wu et al. NeurIPS 2022. [Paper]
- "Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models". Aarohi Srivastava et al. arXiv 2022. [Paper]
- "Exploring Length Generalization in Large Language Models". Cem Anil et al. NeurIPS 2022. [Paper]
- "Few-shot Learning with Retrieval Augmented Language Models". Gautier Izacard et al. arXiv 2022. [Paper]
- "Limitations of Language Models in Arithmetic and Symbolic Induction". Jing Qian et al. arXiv 2022. [Paper]
- "Code as Policies: Language Model Programs for Embodied Control". Jacky Liang et al. arXiv 2022. [Paper]
- "ProgPrompt: Generating Situated Robot Task Plans using Large Language Models". Ishika Singh et al. arXiv 2022. [Paper]
- "Law Informs Code: A Legal Informatics Approach to Aligning Artificial Intelligence with Humans". John J. Nay et al. arXiv 2022. [Paper]
- "Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought". Abulhair Saparov et al. ICLR 2023. [Paper]
- "Language Models are Multilingual Chain-of-Thought Reasoners". Freda Shi et al. ICLR 2023. [Paper]
- "Re3: Generating Longer Stories With Recursive Reprompting and Revision". Kevin Yang et al. EMNLP 2022. [Paper]
- "Language Models of Code are Few-Shot Commonsense Learners". Aman Madaan et al. EMNLP 2022. [Paper]
- "Challenging BIG-Bench Tasks and Whether Chain-of-Thought Can Solve Them". Mirac Suzgun et al. arXiv 2022. [Paper]
- "Large Language Models Can Self-Improve". Jiaxin Huang et al. arXiv 2022. [Paper]
- "Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs". Albert Q. Jiang et al. ICLR 2023. [Paper]
- "Holistic Evaluation of Language Models". Percy Liang et al. arXiv 2022. [Paper]
- "PAL: Program-aided Language Models". Luyu Gao et al. arXiv 2022. [Paper]
- "Legal Prompt Engineering for Multilingual Legal Judgement Prediction". Dietrich Trautmann et al. arXiv 2022. [Paper]
- "How Does ChatGPT Perform on the Medical Licensing Exams? The Implications of Large Language Models for Medical Education and Knowledge Assessment". Aidan Gilson et al. medRxiv 2022. [Paper]
- "ChatGPT: The End of Online Exam Integrity?". Teo Susnjak et al. arXiv 2022. [Paper]
- "Large Language Models are reasoners with Self-Verification". Yixuan Weng et al. arXiv 2022. [Paper]
- "Self-Instruct: Aligning Language Model with Self Generated Instructions". Yizhong Wang et al. arXiv 2022. [Paper]
- "ChatGPT Makes Medicine Easy to Swallow: An Exploratory Case Study on Simplified Radiology Reports". Katharina Jeblick et al. arXiv 2022. [Paper]
- "The End of Programming". Matt Welsh et al. ACM 2023. [Paper]
- "Chatgpt goes to law school". Choi Jonathan H et al. SSRN 2023. [Paper]
- "How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection". Biyang Guo et al. arXiv 2023. [Paper]
- "Is ChatGPT A Good Translator? A Preliminary Study". Wenxiang Jiao et al. arXiv 2023. [Paper]
- "Could an Artificial-Intelligence agent pass an introductory physics course?". Gerd Kortemeyer et al. arXiv 2023. [Paper]
- "Mathematical Capabilities of ChatGPT". Simon Frieder et al. arXiv 2023. [Paper]
- "Synthetic Prompting: Generating Chain-of-Thought Demonstrations for Large Language Models". Zhihong Shao et al. arXiv 2023. [Paper]
- "Grounding Large Language Models in Interactive Environments with Online Reinforcement Learning". Thomas Carta et al. arXiv 2023. [Paper]
- "Evaluating ChatGPT as an Adjunct for Radiologic Decision-Making". Arya Yao et al. medRxiv 2023. [Paper]
- "Theory of Mind May Have Spontaneously Emerged in Large Language Models". Michal Kosinski et al. arXiv 2023. [Paper]
- "A Categorical Archive of ChatGPT Failures". Ali Borji et al. arXiv 2023. [Paper]
- "A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity". Yejin Bang et al. arXiv 2023. [Paper]
- "Toolformer: Language Models Can Teach Themselves to Use Tools". Timo Schick et al. arXiv 2023. [Paper]
- "Is ChatGPT a General-Purpose Natural Language Processing Task Solver?". Chengwei Qin et al. arXiv 2023. [Paper]
- "How Good Are GPT Models at Machine Translation? A Comprehensive Evaluation". Hendy Amr et al. arXiv 2023. [Paper]
- "Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT". Qihuang Zhong et al. arXiv 2023. [Paper]
- "Zero-Shot Information Extraction via Chatting with ChatGPT". Xiang Wei et al. arXiv 2023. [Paper]
- "ChatGPT: Jack of all trades, master of none". Jan Kocon et al. arXiv 2023. [Paper]
- "On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective". Jindong Wang et al. arXiv 2023. [Paper]
- "Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback". Baolin Peng et al. arXiv 2023. [Paper]
- "An Independent Evaluation of ChatGPT on Mathematical Word Problems (MWP)". Paulo Shakarian et al. arXiv 2023. [Paper]
- "How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks". Chen Xuanting et al. arXiv 2023. [Paper]
- "The utility of ChatGPT for cancer treatment information". Shen Chen et al. medRxiv 2023. [Paper]
- "Can ChatGPT Assess Human Personalities? A General Evaluation Framework". Haocong Rao et al. arXiv 2023. [Paper]
- "Will Affective Computing Emerge from Foundation Models and General AI? A First Evaluation on ChatGPT.". Mostafa M. Amin et al. arXiv 2023. [Paper]
- "Exploring the Feasibility of ChatGPT for Event Extraction.". Jun Gao et al. arXiv 2023. [Paper]
- "Does Synthetic Data Generation of LLMs Help Clinical Text Mining?". Tang Ruixiang et al. arXiv 2023. [Paper]
- "Consistency Analysis of ChatGPT". Myeongjun Jang et al. arXiv 2023. [Paper]
- "Self-planning Code Generation with Large Language Model". Shun Zhang et al. ICLR 2023. [Paper]
- "Evaluation of ChatGPT as a Question Answering System for Answering Complex Questions". Yiming Tan et al. arXiv 2023. [Paper]
- "GPT-4 Technical Report". OpenAI et al. OpenAI 2023. [Paper]
- "A Short Survey of Viewing Large Language Models in Legal Aspect". Zhongxiang Sun et al. arXiv 2023. [Paper]
- "ChatGPT Participates in a Computer Science Exam". Sebastian Bordt et al. arXiv 2023. [Paper]
- "A Comprehensive Capability Analysis of GPT-3 and GPT-3.5 Series Models". Junjie Ye et al. arXiv 2023. [Paper]
- "On the Educational Impact of ChatGPT: Is Artificial Intelligence Ready to Obtain a University Degree?". Kamil Malinka et al. arXiv 2023. [Paper]
- "Sparks of Artificial General Intelligence: Early experiments with GPT-4". S'ebastien Bubeck et al. arXiv 2023. [Paper]
- "Is ChatGPT A Good Keyphrase Generator? A Preliminary Study". Mingyang Song et al. arXiv 2023. [Paper]
- "Capabilities of GPT-4 on Medical Challenge Problems". Harsha Nori et al. arXiv 2023. [Paper]
- "Can we trust the evaluation on ChatGPT?". Rachith Aiyappa et al. arXiv 2023. [Paper]
- "ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks". Fabrizio Gilardi et al. arXiv 2023. [Paper]
- "Evaluation of ChatGPT for NLP-based Mental Health Applications". Bishal Lamichhane et al. arXiv 2023. [Paper]
- "ChatGPT is a Knowledgeable but Inexperienced Solver: An Investigation of Commonsense Problem in Large Language Models". Bian Ning et al. arXiv 2023. [Paper]
- "Evaluating GPT-3.5 and GPT-4 Models on Brazilian University Admission Exams". Desnes Nunes et al. arXiv 2023. [Paper]
- "Humans in Humans Out: On GPT Converging Toward Common Sense in both Success and Failure". Philipp Koralus et al. arXiv 2023. [Paper]
- "Yes but.. Can ChatGPT Identify Entities in Historical Documents?". Carlos-Emiliano González-Gallardo et al. arXiv 2023. [Paper]
- "Uncovering ChatGPT's Capabilities in Recommender Systems". Sunhao Dai et al. arXiv 2023. [Paper]