Packages, venv & Project Structure
Real projects need more than one file. Learn how to install third-party packages, manage dependencies, structure your project, and import your own code.
Learning Objectives
- Install, upgrade, and uninstall packages with pip
- Pin exact versions in requirements.txt
- Reproduce an environment from requirements.txt
- Understand Python's import system and module search path
- Structure a project with multiple files and folders
- Understand __init__.py and how packages work
- Use .env files for secrets with python-dotenv
- Avoid the most common dependency pitfalls
pip: The Package Installer
Python has over 500,000 packages on PyPI. pip is how you install them. Think of it like an app store for Python code other people wrote.
# Install a single package
pip install requests
# Install a specific version (pinning, important for reproducibility)
pip install requests==2.31.0
# Install a minimum version (2.31.0 or anything newer); the quotes stop the shell from treating > as a redirect
pip install "requests>=2.31.0"
# Install multiple at once
pip install fastapi uvicorn python-dotenv
# Upgrade an already-installed package
pip install --upgrade requests
# Uninstall a package
pip uninstall requests
# List everything installed
pip list
# Show details about a package
pip show requests
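The version that pip show reports can also be read from inside Python. A minimal sketch using the standard library's importlib.metadata (Python 3.8+); the helper name installed_version is made up for illustration:

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Hypothetical usage: check whether requests is available in the active environment
print(installed_version("requests"))
```

This prints a version string such as the one you pinned, or None when the package is missing from the active environment.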
Never use sudo with pip
sudo pip install installs packages into your system Python, which can break your OS (especially on Linux/macOS, where system tools depend on Python). Always use a virtual environment instead.
requirements.txt: The Dependency List
requirements.txt is a plain text file listing every package your project needs. Anyone (or any server) can recreate your exact setup with one command.
# Save all installed packages with exact versions
pip freeze > requirements.txt
# Install everything from requirements.txt
pip install -r requirements.txt
# AI Engineering starter requirements
requests==2.31.0
fastapi==0.109.0
uvicorn==0.27.0
python-dotenv==1.0.0
pydantic==2.5.3
httpx==0.26.0
Pin exact versions!
Use == to pin exact versions in requirements.txt. If you write requests>=2.31.0, a future breaking change could break your app silently. Pinned versions = reproducible builds = no surprises.
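One way to enforce this rule is a small check that flags unpinned lines. The sketch below is a simplified illustration, not a full requirements parser: the function name find_unpinned is hypothetical, and it ignores edge cases such as URLs and environment markers.

```python
from typing import Iterable, List


def find_unpinned(lines: Iterable[str]) -> List[str]:
    """Return requirement lines that do not pin an exact version with ==."""
    unpinned = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options like -r
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned


sample = ["requests==2.31.0", "fastapi>=0.109.0", "# a comment", "uvicorn"]
print(find_unpinned(sample))  # → ['fastapi>=0.109.0', 'uvicorn']
```

To check a real file, you could pass it the lines of requirements.txt, for example via Path("requirements.txt").read_text().splitlines().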
Virtual Environments: Why and How
Every project should have its own virtual environment. Without it, ALL your projects share the same packages, so Project A needing requests==2.28 will conflict with Project B needing requests==2.31.
# Step 1: Create a new project
mkdir my-ai-app && cd my-ai-app
# Step 2: Create a virtual environment
python -m venv venv
# Step 3: Activate it (EVERY new terminal session!)
source venv/bin/activate # macOS/Linux
venv\Scripts\activate # Windows
# Step 4: Verify you're in the venv
which python # should point to venv/bin/python (on Windows, use: where python)
pip list # should show only pip (plus setuptools on Python 3.11 and older)
# Step 5: Install packages
pip install requests fastapi
# Step 6: Save dependencies
pip freeze > requirements.txt
# Step 7: When done working
deactivate # exits the venv
What does a venv actually do?
A virtual environment is just a folder (venv/) that contains a Python interpreter (a copy of, or a symlink to, the one that created it) and its own site-packages/ directory. When activated, your shell's python and pip point to the ones inside that folder. That's it: no magic, just PATH manipulation.
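You can confirm which interpreter is running from inside Python itself. A minimal sketch, assuming a venv-style environment (the helper name in_virtualenv is hypothetical):

```python
import sys


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a venv-style virtual environment."""
    # Inside a venv, sys.prefix points at the venv folder while
    # sys.base_prefix still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


print("interpreter:", sys.executable)
print("in a venv:  ", in_virtualenv())
```

If the second line prints False even though you meant to be inside the venv, the environment is probably not activated in this terminal session.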
Importing: How Python Finds Code
The import statement tells Python to load code from another file or package. Understanding how Python finds modules is essential.
# ─── Standard library (built into Python) ───
import json # whole module
import os
import sys
from pathlib import Path # specific class
from datetime import datetime # specific class
from collections import Counter # specific class
# ─── Third-party packages (installed with pip) ───
import requests # whole module
from fastapi import FastAPI # specific class
from pydantic import BaseModel # specific class
from dotenv import load_dotenv # specific function
# ─── Your own code (same project) ───
from utils import helpers # from utils/helpers.py
from utils.helpers import tokenize # specific function from that file
from models import Document # from models.py or models/__init__.py
How Python finds your imports
When you write import foo, Python checks these locations in order:
import sys
# The search path: Python checks each directory in order
for path in sys.path:
    print(path)
# Typical order:
# 1. The directory of the script you ran (or "" for the current directory in an interactive session), checked FIRST
# 2. Directories from the PYTHONPATH environment variable (if set)
# 3. Standard library directories
# 4. Your virtual environment's site-packages
Never name your file the same as a standard library module
If you create json.py in your project, then import json will import YOUR file instead of the standard library, because your script's directory is searched first. This causes incredibly confusing errors. Common names to avoid: json.py, os.py, sys.py, test.py, utils.py (at root level).
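When an import behaves strangely, it helps to ask Python which file it would actually load. A minimal sketch using importlib.util.find_spec from the standard library; the helper name module_origin is made up for illustration:

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


print(module_origin("json"))
```

If the printed path points into your project folder rather than the Python installation, a local file is shadowing the standard library module. Rename it.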
Project Structure: How to Organize Your Code
As projects grow, one file isn't enough. Here's the standard structure for an AI engineering project:
my-ai-app/
├── venv/                      ← virtual environment (never commit this!)
├── .env                       ← API keys and secrets (never commit this!)
├── .gitignore                 ← tells git what to skip
├── requirements.txt           ← package list
├── README.md                  ← project documentation
├── main.py                    ← entry point: run this to start
├── config.py                  ← configuration and settings
├── models/
│   ├── __init__.py            ← makes this folder a Python package
│   ├── document.py            ← Document class
│   └── chat.py                ← ChatMessage class
├── services/
│   ├── __init__.py
│   ├── llm_client.py          ← LLM API wrapper
│   ├── vector_store.py        ← vector database operations
│   └── embeddings.py          ← embedding generation
└── utils/
    ├── __init__.py
    ├── text_processing.py     ← chunking, cleaning text
    └── logging.py             ← logging setup
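If you would rather not create each file by hand, the layout can be scaffolded with pathlib. This is a minimal sketch under the assumption that empty placeholder files are enough to start with; the names FILES and scaffold are hypothetical, and venv/ is left out because python -m venv creates it.

```python
from pathlib import Path

# Files taken from the tree above; every file is created empty.
FILES = [
    ".env", ".gitignore", "requirements.txt", "README.md", "main.py", "config.py",
    "models/__init__.py", "models/document.py", "models/chat.py",
    "services/__init__.py", "services/llm_client.py",
    "services/vector_store.py", "services/embeddings.py",
    "utils/__init__.py", "utils/text_processing.py", "utils/logging.py",
]


def scaffold(root: Path) -> None:
    """Create the folder layout under root without overwriting existing files."""
    for relative in FILES:
        target = root / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch(exist_ok=True)
```

Calling scaffold(Path("my-ai-app")) creates the folders and empty files; files that already exist keep their contents.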
What is __init__.py?
An __init__.py file tells Python "this folder is a package." It can be empty, or it can contain code that runs when the package is imported.
# models/__init__.py
# The presence of this file makes 'models' a package that can be imported.
# You can also re-export classes for convenience:
from models.document import Document
from models.chat import ChatMessage
# Now users can write:
# from models import Document, ChatMessage
# Instead of:
# from models.document import Document
# from models.chat import ChatMessage
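To see the re-export in action without touching your real project, the sketch below builds a throwaway package in a temporary folder and imports it. The package name demo_models is hypothetical, and its __init__.py uses a relative import (from .document import ...), a common alternative to the absolute form shown above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway package: demo_models/document.py and demo_models/__init__.py
    pkg = Path(tmp) / "demo_models"
    pkg.mkdir()
    (pkg / "document.py").write_text(
        "class Document:\n"
        "    def __init__(self, content):\n"
        "        self.content = content\n"
    )
    # The re-export: __init__.py pulls Document up to the package level
    (pkg / "__init__.py").write_text("from .document import Document\n")

    sys.path.insert(0, tmp)  # make the temp folder importable
    importlib.invalidate_caches()
    try:
        demo_models = importlib.import_module("demo_models")
        doc = demo_models.Document("hello")
        print(doc.content)                      # → hello
        print(demo_models.Document.__module__)  # → demo_models.document
    finally:
        sys.path.remove(tmp)
```

The second print shows that Document still lives in the document module; the package's __init__.py only makes it reachable from the shorter path.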
Importing between your own modules
# Import from your own project
from models.document import Document
from config import API_KEY, DEFAULT_MODEL
from utils.text_processing import chunk_text

class LLMClient:
    def __init__(self):
        self.api_key = API_KEY      # from config.py
        self.model = DEFAULT_MODEL  # from config.py

    def process_document(self, doc: Document):
        chunks = chunk_text(doc.content)  # from utils
        # ...
.env Files: Storing Secrets Safely
API keys, passwords, and tokens should NEVER be in your code or committed to Git. Use .env files instead.
OPENAI_API_KEY=sk-proj-abc123...
ANTHROPIC_API_KEY=sk-ant-xyz789...
DEFAULT_MODEL=gpt-4o-mini
LOG_LEVEL=INFO
import os
from dotenv import load_dotenv
# Load .env file into os.environ
load_dotenv()
# Access with os.environ.get(); give optional settings a default
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
ANTHROPIC_API_KEY = os.environ.get("ANTHROPIC_API_KEY")
DEFAULT_MODEL = os.environ.get("DEFAULT_MODEL", "gpt-4o-mini")
LOG_LEVEL = os.environ.get("LOG_LEVEL", "INFO")
# Validate: fail fast if required keys are missing
if not OPENAI_API_KEY:
raise ValueError("OPENAI_API_KEY not found. Set it in .env")
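The get-then-validate pattern can be folded into one helper so every setting is handled the same way. A minimal sketch using only os.environ; the helper name get_setting and its env parameter are assumptions made for illustration, not part of python-dotenv.

```python
import os
from typing import Mapping, Optional


def get_setting(name: str, default: Optional[str] = None,
                env: Mapping[str, str] = os.environ) -> str:
    """Return a setting from env, falling back to default; raise if neither is set."""
    value = env.get(name) or default
    if not value:
        raise ValueError(f"{name} not found. Set it in .env")
    return value


# Hypothetical usage with a plain dict standing in for os.environ
fake_env = {"DEFAULT_MODEL": "gpt-4o-mini"}
print(get_setting("DEFAULT_MODEL", env=fake_env))      # → gpt-4o-mini
print(get_setting("LOG_LEVEL", "INFO", env=fake_env))  # → INFO
```

In a real config.py you would call load_dotenv() first and leave env at its default, so required keys without a default fail fast at import time.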
NEVER commit .env to Git!
Add .env to your .gitignore. If you accidentally push an API key, revoke it immediately: the key stays in the Git history even after you delete the file. GitHub scans public repositories for leaked keys and notifies some providers, who may revoke them, but don't count on that.
The .gitignore File
# Python virtual environment
venv/
.venv/
# Python cache
__pycache__/
*.pyc
*.pyo
# Environment and secrets
.env
.env.local
# IDE files
.vscode/
.idea/
# OS files
.DS_Store
Thumbs.db
# Data and logs (project-specific)
data/
*.log
Essential Packages for AI Engineering
Here are the packages you'll install most often in AI engineering work:
# ─── API & Web ───
requests # Call any HTTP API
httpx # Like requests but supports async
fastapi # Build your own API endpoints
uvicorn # Run FastAPI apps
# ─── AI/LLM Specific ───
openai # Official OpenAI Python SDK
anthropic # Official Anthropic Python SDK
langchain # Orchestration framework
langchain-community # Community integrations
# ─── Data & Embeddings ───
chromadb # Local vector database
sentence-transformers # Generate embeddings locally
# ─── Utilities ───
python-dotenv # Load .env files
pydantic # Data validation with types
rich # Beautiful terminal output
🧪 Exercises
Exercise 1: Environment Setup
Create a new project folder. Set up a venv, install requests and python-dotenv, freeze to requirements.txt. Then delete the venv, recreate it, and reinstall from requirements.txt. Verify pip list shows the same packages.
Exercise 2: Multi-File Project
Create a project with this structure: main.py, config.py, utils/__init__.py, utils/text_processing.py. In text_processing.py, write a count_words(text) function. Import and call it from main.py.
Exercise 3: .env Secrets
Create a .env file with MY_NAME=YourName. Create main.py that loads it with python-dotenv and prints a greeting using the variable. Add .env to .gitignore.
⚠️ Common Mistakes
Forgetting to activate venv
You install a package, but your script can't find it, because you installed it to the system Python, not the venv. Always check: does your prompt show (venv)?
Circular imports
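a.py imports from b.py, and b.py imports from a.py. One module is still only partly initialized when the other tries to use it, so Python raises an ImportError (or AttributeError). Fix: move the shared code to a third file, or import inside functions instead of at the top level.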
Committing venv/ or .env to Git
venv/ can be thousands of files: it bloats your repo and makes cloning slow. .env exposes your secrets. Both should ALWAYS be in .gitignore.
Not pinning versions
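A hand-written requirements.txt with bare package names or >= ranges instead of == pins means a teammate installing tomorrow may get different versions. Generate the file with pip freeze > requirements.txt so every package is pinned, and always install from it.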