Why Documentation Matters
Code tells you how. Documentation tells you why. Without documentation, every new team member spends weeks reverse-engineering decisions that took the original author 5 minutes to make.
Good documentation:
- Reduces onboarding time -- new developers become productive faster
- Prevents knowledge loss -- when people leave, knowledge stays
- Reduces interruptions -- "how does X work?" is answered by a doc, not a Slack message
- Improves decision-making -- knowing why past decisions were made prevents repeating mistakes
- Enables async work -- teams across time zones can understand context without meetings
The cost of not documenting is invisible but massive. Every time someone asks "how does this work?" because there's no doc, that's wasted time -- multiplied by every person who asks the same question.
README Files
The README is the front door of your project. It's the first thing anyone sees. A great README makes the difference between "I can start contributing in 30 minutes" and "I gave up and closed the tab."
README Structure
# Project Name
One-paragraph description of what this project does and why it exists.
## Quick Start
Fastest way to get running locally.
## Prerequisites
What you need installed before starting.
## Installation
Step-by-step setup instructions.
## Usage
How to use the project (commands, API, examples).
## Architecture
High-level overview of how the system is structured.
## Contributing
How to contribute (branching strategy, PR process, code style).
## License
License information.
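The template above can double as a checklist. As a minimal sketch, assuming the README is Markdown and that each required section appears as a `## ` heading, the following checker reports which sections are missing. The names `REQUIRED_README_SECTIONS` and `missingReadmeSections` are hypothetical, not part of any tool.

```typescript
// Hypothetical helper: the required section names mirror the template above.
const REQUIRED_README_SECTIONS = [
  "Quick Start",
  "Prerequisites",
  "Installation",
  "Usage",
  "Architecture",
  "Contributing",
  "License",
];

/** Returns the required section titles that have no matching "## " heading. */
function missingReadmeSections(markdown: string): string[] {
  const headings = new Set(
    markdown
      .split(/\r?\n/)
      .map((line) => /^##\s+(.+?)\s*$/.exec(line))
      .filter((match): match is RegExpExecArray => match !== null)
      .map((match) => (match[1] ?? "").trim().toLowerCase()),
  );
  return REQUIRED_README_SECTIONS.filter(
    (section) => !headings.has(section.toLowerCase()),
  );
}

// A deliberately incomplete draft README.
const draftReadme = [
  "# OrderFlow API",
  "REST API for e-commerce order management.",
  "## Quick Start",
  "## Installation",
  "## License",
].join("\n");

console.log(missingReadmeSections(draftReadme));
// -> ["Prerequisites", "Usage", "Architecture", "Contributing"]
```

A check like this could run in CI, but treat the section list as a starting point: a project may reasonably rename "Usage" to something like "Available Scripts".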
Complete README Example
# OrderFlow API
REST API for e-commerce order management. Handles order creation,
payment processing, inventory updates, and shipping notifications.
## Quick Start
git clone https://github.com/company/orderflow-api.git
cd orderflow-api
cp .env.example .env.local
docker compose up -d
npm install
npm run db:migrate
npm run dev
Server starts at http://localhost:4000. API docs at http://localhost:4000/docs.
## Prerequisites
- Node.js 20+
- Docker & Docker Compose
- PostgreSQL 16 (runs via Docker)
- Redis 7 (runs via Docker)
## Installation
1. Clone the repository
2. Copy environment template: `cp .env.example .env.local`
3. Fill in required values in `.env.local` (see .env.example for descriptions)
4. Start infrastructure: `docker compose up -d`
5. Install dependencies: `npm install`
6. Run migrations: `npm run db:migrate`
7. Seed development data: `npm run db:seed`
8. Start dev server: `npm run dev`
## Available Scripts
| Command | Description |
|---|---|
| `npm run dev` | Start development server with hot reload |
| `npm run build` | Build for production |
| `npm start` | Start production server |
| `npm test` | Run unit tests |
| `npm run test:e2e` | Run end-to-end tests |
| `npm run lint` | Run ESLint |
| `npm run db:migrate` | Run database migrations |
| `npm run db:seed` | Seed development data |
| `npm run db:reset` | Drop, recreate, migrate, and seed |
## Architecture
orderflow-api/
  src/
    routes/          # Express route handlers
    services/        # Business logic
    repositories/    # Database access
    middleware/      # Auth, validation, error handling
    lib/             # Shared utilities
    types/           # TypeScript type definitions
  prisma/
    schema.prisma    # Database schema
    migrations/      # Migration files
  tests/
    unit/            # Unit tests
    integration/     # Integration tests
    e2e/             # End-to-end tests
## API Overview
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/orders | List orders (paginated) |
| POST | /api/orders | Create new order |
| GET | /api/orders/:id | Get order details |
| PATCH | /api/orders/:id | Update order status |
| POST | /api/orders/:id/refund | Process refund |
Full API documentation: http://localhost:4000/docs (Swagger UI)
## Contributing
1. Create a branch from `main`: `git checkout -b feat/your-feature`
2. Make changes and write tests
3. Ensure all checks pass: `npm run lint && npm test`
4. Open a PR with a clear description
5. Get at least one approval
6. Squash and merge
See CONTRIBUTING.md for detailed guidelines.
## License
MIT
README Anti-Patterns
Bad: "See the wiki" (wikis are where docs go to die)
Bad: Empty README or only project name
Bad: Setup instructions that don't work
Bad: No mention of prerequisites
Bad: "TODO: add documentation"
API Documentation
API documentation tells consumers how to interact with your service. For REST APIs, the de facto standard is the OpenAPI Specification (formerly the Swagger Specification).
OpenAPI Specification
# openapi.yaml
openapi: 3.0.3
info:
  title: OrderFlow API
  description: E-commerce order management API
  version: 1.0.0
  contact:
    name: Engineering Team
    email: [email protected]
servers:
  - url: https://api.orderflow.com/v1
    description: Production
  - url: https://staging-api.orderflow.com/v1
    description: Staging
  - url: http://localhost:4000/v1
    description: Local development
paths:
  /orders:
    get:
      summary: List orders
      description: Returns a paginated list of orders for the authenticated user
      operationId: listOrders
      tags: [Orders]
      parameters:
        - name: page
          in: query
          schema:
            type: integer
            default: 1
          description: Page number
        - name: limit
          in: query
          schema:
            type: integer
            default: 20
            maximum: 100
          description: Items per page
        - name: status
          in: query
          schema:
            type: string
            enum: [pending, processing, shipped, delivered, cancelled]
          description: Filter by order status
      responses:
        '200':
          description: Successful response
          content:
            application/json:
              schema:
                type: object
                properties:
                  data:
                    type: array
                    items:
                      $ref: '#/components/schemas/Order'
                  pagination:
                    $ref: '#/components/schemas/Pagination'
        '401':
          description: Unauthorized
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
    post:
      summary: Create order
      description: Creates a new order with the provided items
      operationId: createOrder
      tags: [Orders]
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required: [items, shippingAddress]
              properties:
                items:
                  type: array
                  items:
                    type: object
                    required: [productId, quantity]
                    properties:
                      productId:
                        type: string
                      quantity:
                        type: integer
                        minimum: 1
                shippingAddress:
                  $ref: '#/components/schemas/Address'
            example:
              items:
                - productId: "prod_abc123"
                  quantity: 2
                - productId: "prod_def456"
                  quantity: 1
              shippingAddress:
                street: "123 Main St"
                city: "Portland"
                state: "OR"
                zip: "97201"
                country: "US"
      responses:
        '201':
          description: Order created
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Order'
        '400':
          description: Invalid request body
        '401':
          description: Unauthorized
components:
  schemas:
    Order:
      type: object
      properties:
        id:
          type: string
          example: "ord_abc123"
        status:
          type: string
          enum: [pending, processing, shipped, delivered, cancelled]
        items:
          type: array
          items:
            $ref: '#/components/schemas/OrderItem'
        total:
          type: number
          format: float
          example: 149.99
        createdAt:
          type: string
          format: date-time
        updatedAt:
          type: string
          format: date-time
    OrderItem:
      type: object
      properties:
        productId:
          type: string
        name:
          type: string
        quantity:
          type: integer
        unitPrice:
          type: number
          format: float
    Address:
      type: object
      required: [street, city, state, zip, country]
      properties:
        street:
          type: string
        city:
          type: string
        state:
          type: string
        zip:
          type: string
        country:
          type: string
    Pagination:
      type: object
      properties:
        page:
          type: integer
        limit:
          type: integer
        total:
          type: integer
        totalPages:
          type: integer
    Error:
      type: object
      properties:
        error:
          type: string
        message:
          type: string
        statusCode:
          type: integer
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: JWT
security:
  - bearerAuth: []
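A spec only helps if the server enforces it. As a rough sketch of how the `listOrders` query parameters could be parsed with the defaults and limits declared above (page defaults to 1, limit defaults to 20 with a maximum of 100, status must be one of the enum values), consider the following. The function and type names are hypothetical, and the lower bound of 1 for `page` and `limit` is an assumption, since the spec does not declare a minimum.

```typescript
// Enum values copied from the spec above; all names here are hypothetical.
const ORDER_STATUSES = ["pending", "processing", "shipped", "delivered", "cancelled"] as const;
type OrderStatus = (typeof ORDER_STATUSES)[number];

interface ListOrdersQuery {
  page: number;
  limit: number;
  status?: OrderStatus;
}

function isOrderStatus(value: string): value is OrderStatus {
  return ORDER_STATUSES.some((status) => status === value);
}

/** Parses raw query-string values, applying the spec's defaults and limits. */
function parseListOrdersQuery(raw: Record<string, string | undefined>): ListOrdersQuery {
  const rawPage = raw["page"];
  const rawLimit = raw["limit"];
  const page = rawPage === undefined ? 1 : Number(rawPage);
  const limit = rawLimit === undefined ? 20 : Number(rawLimit);

  // Assumption: both values must be at least 1 (the spec sets no minimum).
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be a positive integer");
  }
  if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
    throw new RangeError("limit must be an integer between 1 and 100");
  }

  const statusValue = raw["status"];
  if (statusValue !== undefined && !isOrderStatus(statusValue)) {
    throw new RangeError("status must be one of: " + ORDER_STATUSES.join(", "));
  }
  return statusValue === undefined ? { page, limit } : { page, limit, status: statusValue };
}

console.log(parseListOrdersQuery({}));
// -> { page: 1, limit: 20 }
```

In a real service you would more likely generate this validation from the spec than write it by hand; the sketch only shows what the declared constraints mean in practice.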
Serving API Docs
// Express with Swagger UI
import express from 'express';
import swaggerUi from 'swagger-ui-express';
import YAML from 'yamljs';
const app = express();
// Parse the spec once at startup
const swaggerDocument = YAML.load('./openapi.yaml');
app.use('/docs', swaggerUi.serve, swaggerUi.setup(swaggerDocument, {
  customCss: '.swagger-ui .topbar { display: none }',
  customSiteTitle: 'OrderFlow API Docs',
}));
// With the server listening on port 4000, visit http://localhost:4000/docs for interactive API documentation
API Doc from Code (JSDoc + TypeScript)
/**
* Create a new order
*
* @route POST /api/orders
* @param {CreateOrderRequest} req.body - Order details
* @returns {Order} 201 - Created order
* @returns {Error} 400 - Invalid request body
* @returns {Error} 401 - Unauthorized
*
* @example request - Create order with two items
* {
* "items": [
* { "productId": "prod_abc", "quantity": 2 }
* ],
* "shippingAddress": {
* "street": "123 Main St",
* "city": "Portland",
* "state": "OR",
* "zip": "97201",
* "country": "US"
* }
* }
*/
app.post('/api/orders', authenticate, validateBody(createOrderSchema), async (req, res) => {
  const order = await orderService.create(req.body, req.user.id);
  res.status(201).json(order);
});
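The route above relies on `validateBody(createOrderSchema)`, which is not shown. As a minimal sketch of the checks such a schema might enforce, based only on the request body documented for this endpoint (an `items` array with `productId` and a `quantity` of at least 1, plus a `shippingAddress` with five string fields), here is a hand-written validator. `validateCreateOrderBody` is a hypothetical stand-in, not the actual middleware.

```typescript
// Hypothetical stand-in for the checks createOrderSchema might enforce.
const ADDRESS_FIELDS = ["street", "city", "state", "zip", "country"] as const;

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

/** Returns a list of validation problems; an empty list means the body is acceptable. */
function validateCreateOrderBody(body: unknown): string[] {
  if (!isRecord(body)) return ["body must be a JSON object"];
  const problems: string[] = [];

  const items = body["items"];
  if (!Array.isArray(items)) {
    problems.push("items is required and must be an array");
  } else {
    items.forEach((item: unknown, index: number) => {
      if (!isRecord(item)) {
        problems.push(`items[${index}] must be an object`);
        return;
      }
      if (typeof item["productId"] !== "string") {
        problems.push(`items[${index}].productId must be a string`);
      }
      const quantity = item["quantity"];
      if (typeof quantity !== "number" || !Number.isInteger(quantity) || quantity < 1) {
        problems.push(`items[${index}].quantity must be an integer >= 1`);
      }
    });
  }

  const address = body["shippingAddress"];
  if (!isRecord(address)) {
    problems.push("shippingAddress is required and must be an object");
  } else {
    for (const field of ADDRESS_FIELDS) {
      if (typeof address[field] !== "string") {
        problems.push(`shippingAddress.${field} must be a string`);
      }
    }
  }
  return problems;
}
```

Returning every problem at once, rather than failing on the first, lets the 400 response tell the caller everything that needs fixing in one round trip.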
Code Comments
Comments explain why, not what. The code itself tells you what it does. Comments tell you why it does it that way.
Good Comments
// Good: explains WHY
// Using a 5-second timeout because the payment provider
// recommends at most 3s, but we've seen spikes up to 4.5s
// during peak hours. See incident report INC-2024-089.
const PAYMENT_TIMEOUT_MS = 5000;
// Good: explains business rule
// Tax-exempt if the buyer is a registered nonprofit
// with a valid EIN. See IRS Publication 557 for details.
if (buyer.taxExemptStatus === 'verified') {
  return calculateSubtotal(items);
}
// Good: explains non-obvious behavior
// We sort by created_at DESC with id as a tie-breaker before paginating.
// Without the tie-breaker, rows that share a timestamp can come back in a
// different order on each request. OFFSET pages can still shift when
// new rows are inserted between page requests.
const items = await db.query(
  'SELECT * FROM items ORDER BY created_at DESC, id DESC LIMIT $1 OFFSET $2',
  [limit, offset]
);
// Good: documents a workaround
// HACK: Safari doesn't support `scrollIntoView({ behavior: 'smooth' })`
// on dynamically created elements. Using setTimeout(0) to defer
// until after the DOM update completes. Remove when Safari 18+ has
// sufficient market share. Tracked in BUG-1234.
setTimeout(() => element.scrollIntoView({ behavior: 'smooth' }), 0);
// Good: warns about consequences
// WARNING: This function deletes data permanently. It bypasses
// the soft-delete mechanism and removes records from the database.
// Used only for GDPR right-to-erasure compliance.
async function permanentlyDeleteUserData(userId: string) { ... }
Bad Comments
// Bad: restates the code
// Increment counter by 1
counter++;
// Bad: obvious from the variable name
// Set the user's name
user.name = name;
// Bad: journal entries (use git history instead)
// 2024-01-15: Added validation
// 2024-02-20: Fixed edge case with empty arrays
// 2024-03-01: Refactored to use async/await
// Bad: commented-out code (delete it, git remembers)
// function oldCalculation(x) {
// return x * 2 + 1;
// }
// Bad: TODO that will never be done
// TODO: optimize this later
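Some of these anti-patterns are mechanical enough to detect automatically. The following is a rough sketch of a line-based checker for three of them: dated journal entries, TODOs with no ticket reference, and commented-out code. The rule names and regular expressions are hypothetical heuristics and will produce false positives and negatives. Treating a TODO with a ticket ID such as `BUG-1234` as acceptable is an assumption.

```typescript
interface CommentFinding {
  line: number; // 1-based line number
  rule: string;
}

// Hypothetical rules, each mirroring one of the "bad comment" examples above.
const COMMENT_RULES: ReadonlyArray<{ rule: string; pattern: RegExp }> = [
  // "// 2024-01-15: Added validation"
  { rule: "journal-entry", pattern: /^\s*\/\/\s*\d{4}-\d{2}-\d{2}:/ },
  // "// TODO ..." with no ticket ID like BUG-1234 on the same line
  { rule: "todo-without-ticket", pattern: /^\s*\/\/\s*TODO\b(?!.*\b[A-Z]+-\d+\b)/ },
  // "// function foo(", "// return x;", or a lone "// }" / "// {"
  { rule: "commented-out-code", pattern: /^\s*\/\/\s*(function\s+\w+\s*\(|return\b.*;|[{}]\s*$)/ },
];

/** Scans source text line by line and reports the first matching rule per line. */
function findBadComments(source: string): CommentFinding[] {
  const findings: CommentFinding[] = [];
  source.split(/\r?\n/).forEach((text, index) => {
    for (const { rule, pattern } of COMMENT_RULES) {
      if (pattern.test(text)) {
        findings.push({ line: index + 1, rule });
        break;
      }
    }
  });
  return findings;
}

const commentSample = [
  "// 2024-01-15: Added validation",
  "// TODO: optimize this later",
  "// TODO: remove after BUG-1234 ships",
  "// function oldCalculation(x) {",
  "// Tax-exempt if the buyer is a registered nonprofit",
  "counter++;",
].join("\n");

console.log(findBadComments(commentSample));
// -> lines 1 (journal-entry), 2 (todo-without-ticket), 4 (commented-out-code)
```

A checker like this cannot judge whether a comment restates the code; that still takes a human reviewer.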
JSDoc for Public APIs
/**
* Calculates the shipping cost for an order based on weight,
* destination, and shipping method.
*
* Uses the carrier's rate table for the selected method.
* Free shipping applied for orders over $100 (domestic only).
*
* @param weight - Total weight in kilograms
* @param destination - ISO 3166-1 alpha-2 country code
* @param method - Shipping method selection
* @returns Shipping cost in USD
* @throws {InvalidDestinationError} If country code is not supported
*
* @example
* calculateShipping(2.5, 'US', 'standard') // 5.99
* calculateShipping(2.5, 'US', 'express') // 14.99
* calculateShipping(2.5, 'DE', 'standard') // 24.99
*/
export function calculateShipping(
  weight: number,
  destination: string,
  method: 'standard' | 'express' | 'overnight'
): number {
  // ...
}
Architecture Decision Records (ADRs)
ADRs document the why behind significant technical decisions. They prevent the "why did we do it this way?" question that haunts every codebase.
ADR Format
# ADR-001: Use PostgreSQL as primary database
## Status
Accepted
## Date
2026-01-15
## Context
We need to choose a primary database for the OrderFlow platform.
Requirements:
- ACID transactions for order processing
- JSON support for flexible product metadata
- Full-text search for product catalog
- Strong ecosystem and community support
- Team familiarity
Options considered:
1. PostgreSQL
2. MySQL
3. MongoDB
## Decision
We will use PostgreSQL 16.
## Rationale
- **ACID compliance**: Critical for financial transactions (orders, payments)
- **JSONB support**: Handles flexible product metadata without schema changes
- **Full-text search**: Built-in tsvector/tsquery eliminates need for Elasticsearch
for our current scale
- **Team experience**: 4 of 5 engineers have production PostgreSQL experience
- **Ecosystem**: Excellent ORM support (Prisma, Drizzle), monitoring tools
MongoDB was rejected because:
- Eventual consistency is unacceptable for order processing
- Multi-document transactions add complexity we don't need
- Team has limited MongoDB experience
MySQL was a close second but lacks native JSONB querying and has
weaker full-text search.
## Consequences
- Must manage PostgreSQL infrastructure (or use managed service like RDS)
- Team must follow PostgreSQL best practices for indexing and query optimization
- Schema migrations required for structural changes (managed via Prisma)
- Connection pooling needed for production (PgBouncer or Prisma's built-in connection pool)
## Related
- ADR-002: Use Prisma as ORM
- ADR-005: Deploy to AWS (ECS + RDS)
ADR Directory Structure
docs/
  adr/
    ADR-001-use-postgresql.md
    ADR-002-use-prisma-orm.md
    ADR-003-adopt-github-flow.md
    ADR-004-use-stripe-for-payments.md
    ADR-005-deploy-to-aws.md
    ADR-006-adopt-typescript-strict-mode.md
    README.md    (index of all ADRs)
ADR Index
# Architecture Decision Records
| ADR | Title | Status | Date |
|---|---|---|---|
| [001](ADR-001-use-postgresql.md) | Use PostgreSQL as primary database | Accepted | 2026-01-15 |
| [002](ADR-002-use-prisma-orm.md) | Use Prisma as ORM | Accepted | 2026-01-15 |
| [003](ADR-003-adopt-github-flow.md) | Adopt GitHub Flow branching | Accepted | 2026-01-20 |
| [004](ADR-004-use-stripe-for-payments.md) | Use Stripe for payment processing | Accepted | 2026-02-01 |
| [005](ADR-005-deploy-to-aws.md) | Deploy to AWS (ECS + RDS) | Accepted | 2026-02-10 |
| [006](ADR-006-adopt-typescript-strict-mode.md) | Adopt TypeScript strict mode | Superseded by 009 | 2026-02-15 |
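Hand-maintained indexes drift: a link that no longer matches its file name is easy to miss. One option is to generate the table from a single list of ADR metadata. The sketch below assumes the `ADR-NNN-slug.md` naming scheme shown in the directory listing. The `AdrEntry` shape and both function names are hypothetical.

```typescript
interface AdrEntry {
  number: number;
  slug: string; // file-name part after the number, e.g. "use-postgresql"
  title: string;
  status: string;
  date: string; // YYYY-MM-DD
}

/** Zero-pads the ADR number to three digits, e.g. 1 -> "001". */
function adrId(entry: AdrEntry): string {
  return String(entry.number).padStart(3, "0");
}

/** Builds the file name so links and files share a single source of truth. */
function adrFileName(entry: AdrEntry): string {
  return `ADR-${adrId(entry)}-${entry.slug}.md`;
}

/** Renders the Markdown index table, sorted by ADR number. */
function renderAdrIndex(entries: readonly AdrEntry[]): string {
  const rows = [...entries]
    .sort((a, b) => a.number - b.number)
    .map((e) => `| [${adrId(e)}](${adrFileName(e)}) | ${e.title} | ${e.status} | ${e.date} |`);
  return ["| ADR | Title | Status | Date |", "|---|---|---|---|", ...rows].join("\n");
}

const adrEntries: AdrEntry[] = [
  { number: 2, slug: "use-prisma-orm", title: "Use Prisma as ORM", status: "Accepted", date: "2026-01-15" },
  { number: 1, slug: "use-postgresql", title: "Use PostgreSQL as primary database", status: "Accepted", date: "2026-01-15" },
];

console.log(renderAdrIndex(adrEntries));
```

The metadata could equally be read from a front-matter block in each ADR file; the point is that the link and the file name come from the same place.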
When to Write an ADR
Write an ADR when you're making a decision that:
- Is hard to reverse (database choice, language, framework)
- Affects multiple teams or services
- Will be questioned by future developers
- Involves significant tradeoffs
- Was debated by the team
Don't write an ADR for:
- Trivial decisions (variable naming, file structure within a module)
- Decisions that are easily reversed
- Industry-standard practices that don't need justification
Runbooks and Troubleshooting Guides
Runbooks document how to handle operational scenarios -- deployments, incidents, and common problems. They turn "call the person who knows" into "follow these steps."
Runbook Template
# Runbook: Database Connection Pool Exhaustion
## Symptoms
- API returns 503 errors
- Logs show "too many connections" or "connection pool exhausted"
- Database monitoring shows connections at max limit
## Severity
High -- affects all API requests
## Diagnosis
1. Check current connection count:
SELECT count(*) FROM pg_stat_activity WHERE datname = 'orderflow';
2. Check which queries are holding connections:
SELECT pid, state, query, query_start
FROM pg_stat_activity
WHERE datname = 'orderflow'
ORDER BY query_start;
3. Check for idle connections:
SELECT count(*), state
FROM pg_stat_activity
WHERE datname = 'orderflow'
GROUP BY state;
## Resolution
### Step 1: Kill idle connections (immediate relief)
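SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'orderflow'
  AND state = 'idle'
  AND pid <> pg_backend_pid()  -- never terminate the session running this query
  AND state_change < now() - interval '5 minutes';  -- idle for more than 5 minutes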
### Step 2: Restart application pods (if connections don't drop)
kubectl rollout restart deployment/api-server -n production
### Step 3: Increase pool size (if legitimate traffic increase)
Update DATABASE_POOL_SIZE in production config from 20 to 40.
Requires redeployment.
## Prevention
- Monitor connection count with alert at 80% of max
- Set idle connection timeout to 30 seconds
- Use connection pooling (PgBouncer) in production
- Review queries for missing connection releases
## Escalation
If the above steps don't resolve within 15 minutes:
1. Page the database team (PagerDuty: #db-oncall)
2. Consider enabling read-only mode
3. Notify affected teams in #incident-response
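The "alert at 80% of max" rule from the Prevention list is simple enough to state precisely. As a minimal sketch, assuming you already collect the active connection count and the configured maximum, the classification could look like this. The function name and the three state labels are hypothetical.

```typescript
type PoolState = "ok" | "warning" | "exhausted";

/**
 * Classifies connection usage against the pool maximum.
 * warnRatio defaults to 0.8, matching the 80% alert threshold in the runbook.
 */
function classifyPoolUsage(active: number, max: number, warnRatio = 0.8): PoolState {
  if (max <= 0) throw new RangeError("max must be positive");
  if (active >= max) return "exhausted";
  return active / max >= warnRatio ? "warning" : "ok";
}

// With the pool size of 20 mentioned in Step 3:
console.log(classifyPoolUsage(15, 20)); // -> "ok"
console.log(classifyPoolUsage(16, 20)); // -> "warning"
console.log(classifyPoolUsage(20, 20)); // -> "exhausted"
```

Writing the threshold down as code, or as an alert rule in your monitoring system, removes any ambiguity about whether exactly 80% should page someone.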
Incident Response Template
# Incident Response: [TITLE]
## Timeline
- **Detected**: 2026-03-15 14:32 UTC (PagerDuty alert)
- **Acknowledged**: 2026-03-15 14:35 UTC
- **Mitigated**: 2026-03-15 15:10 UTC
- **Resolved**: 2026-03-15 16:00 UTC
## Impact
- 38 minutes of degraded API performance
- ~2,400 failed requests (0.3% of daily traffic)
- 12 customer-facing error reports
## Root Cause
Memory leak in the search service caused by unclosed database
cursors in the autocomplete endpoint. Each request leaked ~2KB.
After 6 hours of traffic, the container exceeded its 512MB limit
and was OOM-killed by Kubernetes.
## Resolution
- Immediate: Restarted search service pods
- Permanent: Fixed cursor leak in PR #1234
- Added memory monitoring alert at 400MB
## Action Items
- [ ] Add cursor cleanup to all database queries (JIRA-456)
- [ ] Add integration test for cursor lifecycle (JIRA-457)
- [ ] Increase memory limit to 1GB as safety buffer (JIRA-458)
- [ ] Add memory usage to dashboard (JIRA-459)
## Lessons Learned
- Our load testing didn't catch this because tests run for <1 hour
- Need long-running soak tests for memory leak detection
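The numbers in a postmortem should be derivable from its timeline. As a small worked example, assuming the timestamps are recorded as ISO 8601 strings in UTC, the durations in the template above can be computed rather than typed by hand. The function names are hypothetical, and the leak estimate ignores the container's baseline memory use, so it is an upper bound.

```typescript
interface IncidentTimeline {
  detected: string;
  acknowledged: string;
  mitigated: string;
  resolved: string;
}

/** Whole minutes between two ISO 8601 timestamps. */
function minutesBetween(startIso: string, endIso: string): number {
  return Math.round((Date.parse(endIso) - Date.parse(startIso)) / 60000);
}

/** Durations measured from detection, in minutes. */
function summarizeTimeline(t: IncidentTimeline) {
  return {
    timeToAcknowledge: minutesBetween(t.detected, t.acknowledged),
    timeToMitigate: minutesBetween(t.detected, t.mitigated),
    timeToResolve: minutesBetween(t.detected, t.resolved),
  };
}

/** Upper bound on requests before a leak fills the limit (ignores baseline memory). */
function requestsUntilLimit(limitMb: number, leakKbPerRequest: number): number {
  return Math.floor((limitMb * 1024) / leakKbPerRequest);
}

// Timestamps taken from the template above.
const incidentTimeline: IncidentTimeline = {
  detected: "2026-03-15T14:32:00Z",
  acknowledged: "2026-03-15T14:35:00Z",
  mitigated: "2026-03-15T15:10:00Z",
  resolved: "2026-03-15T16:00:00Z",
};

console.log(summarizeTimeline(incidentTimeline));
// -> { timeToAcknowledge: 3, timeToMitigate: 38, timeToResolve: 88 }
console.log(requestsUntilLimit(512, 2));
// -> 262144
```

The 38 minutes to mitigation matches the "38 minutes of degraded API performance" stated under Impact, which is the kind of cross-check a reviewer should be able to make.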
Keeping Documentation Updated
The biggest challenge with documentation isn't writing it -- it's keeping it current. Stale docs are worse than no docs because they mislead.
Strategies for Keeping Docs Fresh
1. Docs Live Next to Code
src/
  services/
    payment/
      payment.service.ts
      payment.test.ts
      README.md    # Docs right next to the code
When you change the code, the docs are right there to update.
2. Docs as Part of PR Checklist
## PR Checklist
- [ ] Code changes complete
- [ ] Tests added/updated
- [ ] **Documentation updated** (if behavior changed)
- [ ] **README updated** (if setup/usage changed)
- [ ] **API docs updated** (if endpoints changed)
3. Automated Doc Checks
# GitHub Actions: check for docs
name: Docs Check
on: pull_request
jobs:
  check-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history, so origin/main exists to diff against
      - name: Check for doc changes
        run: |
          # If API routes changed, API docs should too
          # Three-dot diff: only changes made on this branch since it diverged from main
          API_CHANGED=$(git diff --name-only origin/main...HEAD | grep -c "src/routes/" || true)
          DOCS_CHANGED=$(git diff --name-only origin/main...HEAD | grep -c "openapi.yaml" || true)
          if [ "$API_CHANGED" -gt 0 ] && [ "$DOCS_CHANGED" -eq 0 ]; then
            echo "::warning::API routes changed but openapi.yaml was not updated"
          fi
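The rule in that workflow is easier to unit-test, and to extend with more path pairs, when it lives in a function instead of inline shell. The following is a sketch of the same rule, assuming the list of changed file paths is supplied by the caller (for example from `git diff --name-only`). The function name is hypothetical.

```typescript
/**
 * Returns a warning when API route files changed without a matching
 * openapi.yaml change, or null when nothing needs attention.
 */
function docsWarning(changedFiles: readonly string[]): string | null {
  const apiChanged = changedFiles.some((file) => file.includes("src/routes/"));
  const docsChanged = changedFiles.some(
    (file) => file === "openapi.yaml" || file.endsWith("/openapi.yaml"),
  );
  return apiChanged && !docsChanged
    ? "API routes changed but openapi.yaml was not updated"
    : null;
}

console.log(docsWarning(["src/routes/orders.ts", "src/services/order.service.ts"]));
// -> "API routes changed but openapi.yaml was not updated"
console.log(docsWarning(["src/routes/orders.ts", "openapi.yaml"]));
// -> null
```

Like the shell version, this only checks that the spec file was touched, not that the change is correct, which is why it warns rather than fails the build.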
4. Scheduled Doc Audits
Monthly: Review README for accuracy
Quarterly: Audit all ADRs -- are any superseded?
Per release: Update API changelog
Per incident: Update or create relevant runbook
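A schedule like this only works if something reminds you when a review is due. As a minimal sketch, assuming each tracked document records when it was last reviewed and how often it should be, overdue documents can be listed like this. The `TrackedDoc` shape, the function name, and the 30-day and 90-day intervals standing in for "monthly" and "quarterly" are all assumptions.

```typescript
interface TrackedDoc {
  path: string;
  lastReviewed: string; // YYYY-MM-DD
  reviewEveryDays: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

/** Returns the paths of documents whose review interval has elapsed. */
function overdueDocs(docs: readonly TrackedDoc[], todayIso: string): string[] {
  const today = Date.parse(todayIso);
  return docs
    .filter((doc) => today - Date.parse(doc.lastReviewed) > doc.reviewEveryDays * DAY_MS)
    .map((doc) => doc.path);
}

const trackedDocs: TrackedDoc[] = [
  { path: "README.md", lastReviewed: "2026-01-01", reviewEveryDays: 30 },          // monthly
  { path: "docs/adr/README.md", lastReviewed: "2026-01-15", reviewEveryDays: 90 }, // quarterly
];

console.log(overdueDocs(trackedDocs, "2026-03-01"));
// -> ["README.md"]
```

The review dates could live in a small metadata file or in each document's front matter; either way, the output is a concrete to-do list instead of a calendar reminder that everyone ignores.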
5. Generated Documentation
Automate what you can. Generated docs stay current as long as they are regenerated on every change, for example as a CI step.
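# Generate API docs from OpenAPI spec
npx @openapitools/openapi-generator-cli generate \
  -i openapi.yaml \
  -g html2 \
  -o docs/api
# Generate reference docs from TypeScript sources and their doc comments
# (separate output directory so it does not overwrite the API docs above)
npx typedoc --entryPointStrategy expand --out docs/code src/
# Regenerate the typed Prisma Client from the schema
npx prisma generate  # produces typed client code, not prose docs; schema docs need a separate generator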
Documentation Comparison
| Doc Type | Audience | Updates When | Location |
|---|---|---|---|
| README | New developers | Setup/usage changes | Root of repo |
| API Docs | API consumers | Endpoints change | /docs or hosted |
| Code Comments | Future maintainers | Logic changes | Inline with code |
| ADRs | Future architects | New decisions made | /docs/adr |
| Runbooks | Ops/on-call | Incidents happen | Wiki or /docs/runbooks |
| CHANGELOG | Users/consumers | Each release | Root of repo |
| CONTRIBUTING | Contributors | Process changes | Root of repo |
Key Takeaways
- README is the front door -- include quick start, prerequisites, installation, and architecture overview
- API documentation uses OpenAPI/Swagger -- define schemas, examples, and serve interactive docs
- Code comments explain why, not what -- the code tells you what, comments tell you why you did it that way
- ADRs capture the reasoning behind significant technical decisions -- they save hours of "why did we choose X?"
- Runbooks turn tribal knowledge into repeatable procedures -- critical for incident response
- Docs next to code stay fresher than docs in a wiki -- proximity drives updates
- Automate where possible -- docs generated from code, schemas, and specs stay current as long as they are regenerated on every change
- Add documentation to the PR checklist -- if behavior changed, docs should too
- Stale docs are worse than no docs -- they actively mislead people
- Start with the README and API docs -- those provide the highest value per effort