Packages, venv & Project Structure

Real projects need more than one file. Learn how to install third-party packages, manage dependencies, structure your project, and import your own code.

Learning Objectives

pip: The Package Installer

Python has over 500,000 packages on PyPI. pip is how you install them. Think of it like an app store for Python code other people wrote.

bash
# Install a single package
pip install requests

# Install a specific version (pinning, important for reproducibility)
pip install requests==2.31.0

# Install a minimum version (2.31.0 or any later release)
# Quote the requirement so the shell does not treat > as a redirect
pip install "requests>=2.31.0"

# Install multiple at once
pip install fastapi uvicorn python-dotenv

# Upgrade an already-installed package
pip install --upgrade requests

# Uninstall a package
pip uninstall requests

# List everything installed
pip list

# Show details about a package
pip show requests
⚠️

Never use sudo with pip

sudo pip install installs packages into your system Python, which can break your OS (especially on Linux/macOS, where system tools depend on Python). Always use a virtual environment instead.
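
To confirm from inside Python which version pip actually installed (the same version pip show reports), the standard library's importlib.metadata can be queried. This is a minimal sketch; installed_version is a hypothetical helper name, not part of pip.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Package names taken from the pip examples above
    for name in ("requests", "fastapi"):
        found = installed_version(name)
        print(f"{name}: {found if found else 'not installed'}")
```

Run it with the same interpreter you ran pip with; a different interpreter may see a different set of packages.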

requirements.txt: The Dependency List

requirements.txt is a plain text file listing every package your project needs. Anyone (or any server) can recreate your exact setup with one command.

bash
# Save all installed packages with exact versions
pip freeze > requirements.txt

# Install everything from requirements.txt
pip install -r requirements.txt
text requirements.txt
# AI Engineering starter requirements
requests==2.31.0
fastapi==0.109.0
uvicorn==0.27.0
python-dotenv==1.0.0
pydantic==2.5.3
httpx==0.26.0
💡

Pin exact versions!

Use == to pin exact versions in requirements.txt. If you write requests>=2.31.0, a future breaking change could break your app silently. Pinned versions = reproducible builds = no surprises.
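
One way to enforce this rule is a small check that flags any requirement line without ==. The sketch below is deliberately simplified: find_unpinned is a hypothetical helper, it treats everything after # as a comment, and it skips option lines such as -r other.txt, so it does not cover the full requirements file syntax.

```python
def find_unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that are not pinned with ==."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):      # skip blanks and pip options
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


sample = """\
requests==2.31.0
fastapi>=0.109.0
uvicorn
pydantic==2.5.3  # pinned
"""
print(find_unpinned(sample))  # → ['fastapi>=0.109.0', 'uvicorn']
```

To check a real file, pass in the contents of requirements.txt, for example with Path("requirements.txt").read_text() from pathlib.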

Virtual Environments: Why and How

Every project should have its own virtual environment. Without it, ALL your projects share the same packages, and Project A needing requests==2.28 will conflict with Project B needing requests==2.31.

bash Full venv workflow
# Step 1: Create a new project
mkdir my-ai-app && cd my-ai-app

# Step 2: Create a virtual environment
python -m venv venv

# Step 3: Activate it (EVERY new terminal session!)
source venv/bin/activate        # macOS/Linux
venv\Scripts\activate           # Windows

# Step 4: Verify you're in the venv
which python                    # should point to venv/bin/python
pip list                        # should show only pip (plus setuptools on Python 3.11 and earlier)

# Step 5: Install packages
pip install requests fastapi

# Step 6: Save dependencies
pip freeze > requirements.txt

# Step 7: When done working
deactivate                      # exits the venv
🔍

What does a venv actually do?

A virtual environment is just a folder (venv/) that contains a copy of (or a symlink to) the Python interpreter and its own site-packages/ directory. When activated, your shell's python and pip point to the ones inside that folder. That's it: no magic, just PATH manipulation.
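
You can verify this from inside Python. The sketch below compares sys.prefix with sys.base_prefix, which differ when the interpreter belongs to a venv; in_virtualenv is a hypothetical helper name.

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the venv folder, while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("in a venv:  ", in_virtualenv())
```

Running this before and after source venv/bin/activate should show the interpreter path switch to the one inside venv/.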

Importing: How Python Finds Code

The import statement tells Python to load code from another file or package. Understanding how Python finds modules is essential.

python imports.py
# ─── Standard library (built into Python) ───
import json                        # whole module
import os
import sys
from pathlib import Path            # specific class
from datetime import datetime      # specific class
from collections import Counter   # specific class

# ─── Third-party packages (installed with pip) ───
import requests                     # whole module
from fastapi import FastAPI        # specific class
from pydantic import BaseModel     # specific class
from dotenv import load_dotenv     # specific function

# ─── Your own code (same project) ───
from utils import helpers             # from utils/helpers.py
from utils.helpers import tokenize   # specific function from that file
from models import Document          # from models.py or models/__init__.py

How Python finds your imports

When you write import foo, Python checks these locations in order:

python
import sys

# The search path: Python checks each directory in order
for path in sys.path:
    print(path)

# Typical order:
# 1. The directory of the script being run ("" in an interactive session), checked FIRST
# 2. Directories from the PYTHONPATH environment variable (if set)
# 3. Standard library directories
# 4. Your virtual environment's site-packages
⚠️

Never name your file the same as a standard library module

If you create json.py in your project, then import json will import YOUR file instead of the standard library. This causes incredibly confusing errors. Common names to avoid: json.py, os.py, sys.py, test.py, utils.py (at root level).
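
If you suspect a name clash, you can ask Python where an import would resolve without importing anything. This is a small sketch using importlib.util.find_spec; module_origin is a hypothetical helper name.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return where `import name` would load from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


for name in ("json", "os", "sys"):
    print(f"{name}: {module_origin(name)}")
```

If the path printed for json points into your project folder rather than your Python installation, a local json.py is shadowing the standard library. Modules compiled into the interpreter report a label such as built-in or frozen instead of a file path.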

Project Structure: How to Organize Your Code

As projects grow, one file isn't enough. Here's the standard structure for an AI engineering project:

text Project structure
my-ai-app/
├── venv/                   ← virtual environment (never commit this!)
├── .env                    ← API keys and secrets (never commit this!)
├── .gitignore              ← tells git what to skip
├── requirements.txt        ← package list
├── README.md               ← project documentation
├── main.py                 ← entry point: run this to start
├── config.py               ← configuration and settings
├── models/
│   ├── __init__.py         ← makes this folder a Python package
│   ├── document.py         ← Document class
│   └── chat.py             ← ChatMessage class
├── services/
│   ├── __init__.py
│   ├── llm_client.py       ← LLM API wrapper
│   ├── vector_store.py     ← vector database operations
│   └── embeddings.py       ← embedding generation
└── utils/
    ├── __init__.py
    ├── text_processing.py  ← chunking, cleaning text
    └── logging.py          ← logging setup

What is __init__.py?

An __init__.py file tells Python "this folder is a package." It can be empty, or it can contain code that runs when the package is imported.

python models/__init__.py
# This makes 'models' a package that can be imported
# You can also re-export classes for convenience:

from models.document import Document
from models.chat import ChatMessage

# Now users can write:
#   from models import Document, ChatMessage
# Instead of:
#   from models.document import Document
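
To see the re-export work end to end, the sketch below builds a throwaway package in a temporary folder and imports from it. The package name demo_models and its one-attribute Document class are stand-ins for illustration, not the real models package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_models"  # hypothetical package name
    pkg.mkdir()
    (pkg / "document.py").write_text(
        "class Document:\n"
        "    def __init__(self, content):\n"
        "        self.content = content\n"
    )
    # Re-export so callers can write: from demo_models import Document
    (pkg / "__init__.py").write_text("from demo_models.document import Document\n")

    sys.path.insert(0, tmp)  # make the temporary folder importable
    try:
        demo_models = importlib.import_module("demo_models")
        doc = demo_models.Document("hello")
        print(doc.content)                      # → hello
        print(demo_models.Document.__module__)  # → demo_models.document
    finally:
        sys.path.remove(tmp)
```

The second print shows that Document still lives in document.py; __init__.py only makes it reachable under the shorter name.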

Importing between your own modules

python services/llm_client.py
# Import from your own project
from models.document import Document
from config import OPENAI_API_KEY, DEFAULT_MODEL
from utils.text_processing import chunk_text

class LLMClient:
    def __init__(self):
        self.api_key = OPENAI_API_KEY  # from config.py
        self.model = DEFAULT_MODEL     # from config.py

    def process_document(self, doc: Document):
        chunks = chunk_text(doc.content)  # from utils
        # ...

.env Files: Storing Secrets Safely

API keys, passwords, and tokens should NEVER be in your code or committed to Git. Use .env files instead.

text .env
OPENAI_API_KEY=sk-proj-abc123...
ANTHROPIC_API_KEY=sk-ant-xyz789...
DEFAULT_MODEL=gpt-4o-mini
LOG_LEVEL=INFO
python config.py
import os
from dotenv import load_dotenv

# Load .env file into os.environ
load_dotenv()

# Access with os.environ.get(); provide a default for optional settings
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")
DEFAULT_MODEL = os.environ.get("DEFAULT_MODEL", "gpt-4o-mini")
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")

# Validate: fail fast if required keys are missing
if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY not found. Set it in .env")
⚠️

NEVER commit .env to Git!

Add .env to your .gitignore. If you accidentally push an API key, revoke it immediately: it stays in the git history even after you delete the file. GitHub scans public repositories for known key formats and alerts the provider, which may revoke the key, but do not count on that.
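
The fail-fast check in config.py can be pulled into small helpers so every required key gets the same treatment. This is a sketch using only os.environ; require_env, optional_env, and the DEMO_ variable names are hypothetical and not part of python-dotenv.

```python
import os


def require_env(name: str) -> str:
    """Return a required environment variable, or raise a clear error if it is missing or empty."""
    value = os.environ.get(name)
    if not value:
        raise ValueError(f"{name} not found. Set it in .env")
    return value


def optional_env(name: str, default: str) -> str:
    """Return an optional environment variable, falling back to a default."""
    return os.environ.get(name, default)


# Demonstration with throwaway variables (not real secrets)
os.environ["DEMO_REQUIRED_KEY"] = "abc123"
os.environ.pop("DEMO_LOG_LEVEL", None)

print(require_env("DEMO_REQUIRED_KEY"))        # → abc123
print(optional_env("DEMO_LOG_LEVEL", "INFO"))  # → INFO
```

In config.py these helpers would be called after load_dotenv(), so values from .env are already in os.environ.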

The .gitignore File

text .gitignore
# Python virtual environment
venv/
.venv/

# Python cache
__pycache__/
*.pyc
*.pyo

# Environment and secrets
.env
.env.local

# IDE files
.vscode/
.idea/

# OS files
.DS_Store
Thumbs.db

# Data and logs (project-specific)
data/
*.log

Essential Packages for AI Engineering

Here are the packages you'll install most often in AI engineering work:

text Common requirements.txt for AI apps
# ─── API & Web ───
requests          # Call any HTTP API
httpx             # Like requests but supports async
fastapi           # Build your own API endpoints
uvicorn           # Run FastAPI apps

# ─── AI/LLM Specific ───
openai            # Official OpenAI Python SDK
anthropic         # Official Anthropic Python SDK
langchain         # Orchestration framework
langchain-community # Community integrations

# ─── Data & Embeddings ───
chromadb          # Local vector database
sentence-transformers # Generate embeddings locally

# ─── Utilities ───
python-dotenv     # Load .env files
pydantic          # Data validation with types
rich              # Beautiful terminal output

🧪 Exercises

📝

Exercise 1: Environment Setup

Create a new project folder. Set up a venv, install requests and python-dotenv, freeze to requirements.txt. Then delete the venv, recreate it, and reinstall from requirements.txt. Verify pip list shows the same packages.

📝

Exercise 2: Multi-File Project

Create a project with this structure: main.py, config.py, utils/__init__.py, utils/text_processing.py. In text_processing.py, write a count_words(text) function. Import and call it from main.py.

📝

Exercise 3: .env Secrets

Create a .env file with MY_NAME=YourName. Create main.py that loads it with python-dotenv and prints a greeting using the variable. Add .env to .gitignore.

⚠️ Common Mistakes

✗

Forgetting to activate venv

You install a package, but your script can't find it, because you installed to the system Python, not the venv. Always check: does your prompt show (venv)?

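✗

Circular imports

a.py imports from b.py, and b.py imports from a.py. While a.py is still only partly loaded, b.py asks for names that a.py has not defined yet, which usually surfaces as an ImportError or AttributeError. Fix: move the shared code to a third file, or import inside functions instead of at the top level.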
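
The second fix can be sketched with two throwaway modules written to a temporary folder. The module names circ_a and circ_b are hypothetical; circ_b imports circ_a inside the function, so by the time the import runs, circ_a has finished loading.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # circ_a imports circ_b at the top level...
    (root / "circ_a.py").write_text(
        "import circ_b\n"
        "\n"
        "def greet():\n"
        "    return 'hello from a'\n"
        "\n"
        "def run():\n"
        "    return circ_b.shout()\n"
    )
    # ...while circ_b defers its import of circ_a until shout() is called.
    (root / "circ_b.py").write_text(
        "def shout():\n"
        "    import circ_a  # deferred import: circ_a is fully loaded by now\n"
        "    return circ_a.greet().upper()\n"
    )

    sys.path.insert(0, tmp)  # make the temporary folder importable
    try:
        circ_a = importlib.import_module("circ_a")
        result = circ_a.run()
        print(result)  # → HELLO FROM A
    finally:
        sys.path.remove(tmp)
```

Deferring the import works, but moving greet() into a third module that both files import is usually the cleaner long-term fix.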

✗

Committing venv/ or .env to Git

venv/ can be thousands of files: it bloats your repo and makes cloning slow. .env exposes your secrets. Both should ALWAYS be in .gitignore.

✗

Not pinning versions

A requirements.txt written by hand without == pins means a teammate installing tomorrow may get different versions. Run pip freeze > requirements.txt to capture the exact versions you tested with, and commit that file.

✅ You've completed this step when you can confirm: