What Are Large Language Models?

Understanding the engine that powers everything you'll build

🔑 Key Concepts

📐 How LLMs Work (Simplified)

  1. Text is tokenised — split into tokens (words, subwords, characters)
  2. Tokens become embeddings — numerical vectors capturing meaning
  3. A Transformer processes the tokens using self-attention (each token attends to the other tokens in the context; in GPT-style models, only to the tokens before it)
  4. Model predicts the next token given all previous tokens
  5. Append the predicted token to the sequence and repeat step 4 until a stop token is generated or a length limit is reached
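
The five steps above can be sketched in a few lines of plain Python. This is a toy under loud assumptions, not how any production model is implemented: the six-word vocabulary, the random "weights", the whitespace tokeniser, and the function names (`tokenise`, `self_attention`, `next_token_probs`, `generate`) are all made up for illustration. Attention here is a single head with no learned projections. Because nothing is trained, the generated text is meaningless; only the shape of the loop matters.

```python
import math
import random

# Toy vocabulary (hypothetical). Real models learn subword vocabularies
# with tens of thousands of entries.
VOCAB = ["<stop>", "the", "cat", "sat", "on", "mat"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}
DIM = 4  # embedding size, tiny on purpose

# Random numbers standing in for learned embedding weights.
_rng = random.Random(0)
EMBEDDINGS = [[_rng.uniform(-1, 1) for _ in range(DIM)] for _ in VOCAB]


def tokenise(text):
    """Step 1: split text into token ids (whitespace split only)."""
    return [TOKEN_TO_ID[word] for word in text.lower().split()]


def softmax(scores):
    """Turn arbitrary scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attention(vectors):
    """Step 3: the last token attends to every token seen so far."""
    query = vectors[-1]
    scores = [dot(query, key) / math.sqrt(DIM) for key in vectors]
    weights = softmax(scores)
    # Weighted average of all token vectors, one dimension at a time.
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(DIM)]


def next_token_probs(token_ids):
    vectors = [EMBEDDINGS[t] for t in token_ids]         # step 2: embeddings
    context = self_attention(vectors)                    # step 3: attention
    logits = [dot(context, emb) for emb in EMBEDDINGS]   # step 4: score each vocab entry
    return softmax(logits)


def generate(prompt, max_new_tokens=5):
    token_ids = tokenise(prompt)
    for _ in range(max_new_tokens):                      # step 5: repeat
        probs = next_token_probs(token_ids)
        next_id = max(range(len(VOCAB)), key=probs.__getitem__)  # greedy pick
        if VOCAB[next_id] == "<stop>":
            break
        token_ids.append(next_id)
    return " ".join(VOCAB[t] for t in token_ids)


print(generate("the cat"))
```

The loop stops on either the stop token or `max_new_tokens`, which mirrors how hosted APIs cap output length. Words outside the toy vocabulary raise a `KeyError`; real tokenisers avoid this by falling back to subwords or bytes.
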
💡 Key Insight: LLMs are probability engines. They don't "know" facts — they generate statistically likely text. This is why they can be confident and wrong (hallucination).
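
To make the "probability engine" idea concrete, here is a minimal sketch of the final step of generation: raw scores (logits) become probabilities, and one token is sampled. The logits below are invented for illustration and do not come from any real model. They are chosen so that a plausible but wrong answer ("Sydney") gets the highest probability, which is roughly what a hallucination looks like from the inside. The `temperature` parameter is the usual name for the scaling knob; the helper names are hypothetical.

```python
import math
import random

# Made-up scores for the prompt "The capital of Australia is".
# Illustrative numbers only, not output from a real model.
LOGITS = {"Sydney": 3.2, "Canberra": 2.9, "Melbourne": 1.0, "Paris": -2.0}


def softmax(logits, temperature=1.0):
    """Convert scores to probabilities; lower temperature sharpens them."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())
    exps = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def sample(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


probs = softmax(LOGITS)
for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{tok:10s} {p:.2f}")

rng = random.Random(42)
print([sample(probs, rng) for _ in range(5)])
```

Nothing in this code checks whether "Sydney" is true. The model only ranks continuations by likelihood, so a fluent wrong answer can outrank the correct one (Canberra).
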

🛠️ The Major LLM Providers

Provider  | Model             | Context (tokens) | Best For
OpenAI    | GPT-4o            | 128K             | General purpose, best quality
OpenAI    | GPT-4o-mini       | 128K             | Cheaper, fast, good enough for most tasks
Anthropic | Claude Sonnet 4.6 | 1M               | Long context, coding, analysis
Google    | Gemini 2.5 Pro    | 1M               | Long context, reasoning
Meta      | Llama 4 Scout     | 10M              | Open weights, massive context
Mistral   | Mistral Large     | 128K             | EU-hosted, strong multilingual
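
One practical use of the Context column is checking whether a prompt will fit before you send it. The sketch below copies a few context sizes from the table and estimates token counts with a rough rule of thumb of about 4 characters per token for English text. That ratio is an assumption, as are the helper names and the `reserved_for_output` parameter; for real work, count tokens with the provider's own tokeniser.

```python
# Context windows in tokens, copied from a few rows of the table above.
CONTEXT_WINDOW = {
    "GPT-4o": 128_000,
    "Mistral Large": 128_000,
    "Claude Sonnet 4.6": 1_000_000,
    "Llama 4 Scout": 10_000_000,
}

# Rough heuristic for English text (an assumption, not an exact count).
CHARS_PER_TOKEN = 4


def estimate_tokens(text):
    """Approximate the token count from the character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)


def models_that_fit(text, reserved_for_output=4_000):
    """List models whose window holds the prompt plus room for a reply."""
    needed = estimate_tokens(text) + reserved_for_output
    return [name for name, window in CONTEXT_WINDOW.items() if window >= needed]


document = "word " * 200_000  # 1,000,000 characters
print(estimate_tokens(document))   # 250000
print(models_that_fit(document))   # ['Claude Sonnet 4.6', 'Llama 4 Scout']
```

The context window covers both the prompt and the generated reply, which is why the check reserves some tokens for output instead of comparing the prompt size alone.
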

✅ Check Your Understanding