Building-llama3-from-scratch

TLDR: This repository contains code to build the LLaMA 3 language model from scratch in Python. It explains the components of LLaMA 3, such as pre-normalization with RMSNorm, the SwiGLU activation function, Rotary Positional Embeddings (RoPE), and the Byte Pair Encoding (BPE) algorithm. The code shows how to tokenize input text, create embeddings for each token, implement self-attention and multi-head attention, apply the SwiGLU activation, and generate the output.
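
For illustration, here is a minimal PyTorch sketch of the RMSNorm-style pre-normalization mentioned above; it is not the repository's exact code, and the function name `rms_norm`, the epsilon value, and the tensor shapes are assumptions for this example.

```python
import torch

def rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Scale each token vector by the reciprocal of its root-mean-square,
    # then apply a learned per-dimension gain (no mean subtraction, unlike LayerNorm).
    rms = torch.sqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return x / rms * weight

# Example: normalize a batch of 4 token embeddings of dimension 8.
tokens = torch.randn(4, 8)
gain = torch.ones(8)  # learned parameter in a real model
print(rms_norm(tokens, gain).shape)  # torch.Size([4, 8])
```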

2024-05-27 GitHub

LLMs-from-scratch

38,600
@rasbt

TLDR: This repository contains code for developing, pretraining, and fine-tuning a GPT-like LLM. It is the official code repository for the book 'Build a Large Language Model (From Scratch)'. The code is designed to run on conventional laptops and automatically uses a GPU when one is available. The repository also includes bonus materials and documents its hardware requirements.
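
As a minimal sketch of the device-selection idiom behind "automatically uses a GPU when one is available" (assuming PyTorch, which the book's repository is based on); the linear layer and batch shapes below are placeholders, not code from the book.

```python
import torch

# Pick a GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = torch.nn.Linear(16, 4).to(device)      # placeholder model for the example
batch = torch.randn(8, 16, device=device)      # placeholder input batch
print(device, model(batch).shape)              # e.g. "cuda torch.Size([8, 4])"
```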

2023-07-23 GitHub