
This repository contains the code for developing, pretraining, and finetuning a GPT-like LLM and is the official code repository for the book *Build a Large Language Model (From Scratch)*.

(If you downloaded the code bundle from the Manning website, please consider visiting the official code repository on GitHub at https://github.com/rasbt/LLMs-from-scratch.)



In *Build a Large Language Model (From Scratch)*, you'll learn and understand how large language models (LLMs) work from the inside out by coding them from the ground up, step by step. In this book, I'll guide you through creating your own LLM, explaining each stage with clear text, diagrams, and examples.


The method described in this book for training and developing your own small-but-functional model for educational purposes mirrors the approach used in creating large-scale foundational models such as those behind ChatGPT.



Table of Contents

 

Please note that this README.md file is a Markdown (.md) file. If you have downloaded this code bundle from the Manning website and are viewing it on your local computer, I recommend using a Markdown editor or previewer for proper viewing. If you haven't installed a Markdown editor yet, MarkText is a good free option.

Alternatively, you can view this and other files on GitHub at https://github.com/rasbt/LLMs-from-scratch.


 

Tip

If you're seeking guidance on installing Python and Python packages and setting up your code environment, I suggest reading the README.md file located in the setup directory.

 

| Chapter Title | Main Code (for quick access) | All Code + Supplementary |
|---------------|------------------------------|--------------------------|
| Setup recommendations | - | - |
| Ch 1: Understanding Large Language Models | No code | - |
| Ch 2: Working with Text Data | ch02.ipynb<br>dataloader.ipynb (summary)<br>exercise-solutions.ipynb | ./ch02 |
| Ch 3: Coding Attention Mechanisms | ch03.ipynb<br>multihead-attention.ipynb (summary)<br>exercise-solutions.ipynb | ./ch03 |
| Ch 4: Implementing a GPT Model from Scratch | ch04.ipynb<br>gpt.py (summary)<br>exercise-solutions.ipynb | ./ch04 |
| Ch 5: Pretraining on Unlabeled Data | ch05.ipynb<br>gpt_train.py (summary)<br>gpt_generate.py (summary)<br>exercise-solutions.ipynb | ./ch05 |
| Ch 6: Finetuning for Text Classification | ch06.ipynb<br>gpt-class-finetune.py<br>exercise-solutions.ipynb | ./ch06 |
| Ch 7: Finetuning to Follow Instructions | ch07.ipynb | ./ch07 |
| Appendix A: Introduction to PyTorch | code-part1.ipynb<br>code-part2.ipynb<br>DDP-script.py<br>exercise-solutions.ipynb | ./appendix-A |
| Appendix B: References and Further Reading | No code | - |
| Appendix C: Exercise Solutions | No code | - |
| Appendix D: Adding Bells and Whistles to the Training Loop | appendix-D.ipynb | ./appendix-D |
| Appendix E: Parameter-efficient Finetuning with LoRA | appendix-E.ipynb | ./appendix-E |


 

Shown below is a mental model summarizing the contents covered in this book.


Hardware Requirements

 

The code in the main chapters of this book is designed to run on conventional laptops within a reasonable timeframe and does not require specialized hardware. This approach ensures that a wide audience can engage with the material. Additionally, the code automatically utilizes GPUs if they are available.
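As a rough illustration of this automatic device selection, here is a minimal sketch of the common PyTorch pattern (not the repository's exact implementation):

```python
import torch

# Prefer a CUDA GPU when present, otherwise fall back to the CPU.
# (Apple Silicon users can analogously check torch.backends.mps.is_available().)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(4, 2).to(device)  # move model parameters to the device
x = torch.randn(8, 4, device=device)      # create inputs on the same device
print(model(x).shape)                     # torch.Size([8, 2])
```

Because both the model and its inputs are placed on the same `device`, the identical script runs unchanged on a laptop CPU or a GPU machine.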

 

Bonus Material

 

Several folders contain optional materials as a bonus for interested readers: