Nick Langa't — Product Engineer & Solutions Architect

Recap

This is the final article in the series. Here is what we have built together.

Article 1 established the mental model: the test pyramid, the three types of tests, and the properties that separate a useful test suite from noise.
Article 2 built a unit test suite with Django's TestCase: model methods, validators, custom managers, and service logic.
Article 3 migrated that suite to pytest: plain functions, composable fixtures, parametrize, and the mocker fixture.
Article 4 moved up to integration tests: views, APIs, URL routing, serializer validation, filtering, and pagination.
Article 5 covered every auth and permission boundary: login flows, token and JWT auth, custom permission classes, and role-based access.
Article 6 built a complete feature test-first: seven red-green-refactor cycles, from model to permissions.
Article 7 solved the test data problem with factory_boy, and covered fixture scoping, shared state bugs, and hermetic isolation.

The test suite works. Now we make it run automatically on every commit, with visibility into what it covers and enough speed that nobody avoids running it.

What coverage actually measures

Coverage measures which lines of source code were executed during the test run. A line is marked as covered if any test caused it to run. That is all.

It does not measure whether the line was tested correctly. It does not measure whether all the conditions on that line were exercised. It does not measure whether the assertions in your tests are meaningful.

Consider this function:

python

def calculate_discount(price, customer_type):
    if customer_type == "loyalty":
        return price * 0.9
    return price

python

# This test achieves 100% line coverage
def test_calculate_discount():
    result = calculate_discount(100, "loyalty")
    # No assertion — the test just runs the function

One hundred percent coverage, zero confidence. The test runs every line but asserts nothing. A bug could change 0.9 to 0.8 and this test would still pass.
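For contrast, here is a sketch of what the same test could look like with assertions that would catch that bug. The function is repeated so the block runs on its own; the second test and the use of math.isclose for the float comparison are illustrative choices, not part of the original example.

```python
import math


def calculate_discount(price, customer_type):
    if customer_type == "loyalty":
        return price * 0.9
    return price


def test_loyalty_customers_get_ten_percent_off():
    # Fails if 0.9 is changed to 0.8, unlike the assertion-free test
    assert math.isclose(calculate_discount(100, "loyalty"), 90.0)


def test_other_customers_pay_full_price():
    # Exercises the fall-through return and pins its value
    assert calculate_discount(100, "regular") == 100
```

Coverage is identical for both versions of the suite. Only the second one fails when the discount rate changes.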

Coverage is a floor, not a ceiling. Its value is in finding untested code, not in validating tested code. A path with 0% coverage is a problem. A path with 100% coverage is not automatically correct.

Coverage tells you where you have not looked. It does not tell you what you found when you did.

With that caveat established, coverage is still genuinely useful. A well-designed suite with 85% coverage and meaningful assertions is far more valuable than a poorly designed suite with 100% coverage and assertions that verify nothing. Use coverage to identify blind spots, not to chase a number.

Running coverage with pytest

The cleanest way to run coverage alongside pytest is with the pytest-cov plugin. It integrates directly with the pytest runner so you get a coverage report at the end of every test run without a separate command.

bash

pip install pytest-cov

bash

# Run tests and show a terminal coverage report
pytest --cov=. --cov-report=term-missing

# Run tests, show terminal report, and generate an HTML report
pytest --cov=. --cov-report=term-missing --cov-report=html

# Run tests and fail the suite if coverage drops below 80%
pytest --cov=. --cov-fail-under=80

The --cov=. flag tells pytest-cov to measure coverage for the entire current directory. You can narrow it to a specific app: --cov=orders.

The --cov-report=term-missing flag shows which specific line numbers were not executed. This is the most useful format for identifying gaps.

bash

---------- coverage: platform darwin, python 3.12 ----------
Name                          Stmts   Miss  Cover   Missing
-----------------------------------------------------------
orders/models.py                 45      3    93%   78, 91-92
orders/services.py               28      0   100%
orders/utils.py                  15      0   100%
orders/views.py                  52      8    85%   112-119
reviews/models.py                18      0   100%
reviews/services.py               8      0   100%
reviews/views.py                 34      2    94%   67-68
-----------------------------------------------------------
TOTAL                           200     13    94%

The Missing column tells you exactly which lines to investigate. Lines 112 to 119 in orders/views.py have not been executed by any test. Open the file, read those lines, and decide whether they need a test or whether they are genuinely unreachable dead code.

Reading the report

The HTML report gives you a line-by-line view of every file. Open htmlcov/index.html in a browser after running with --cov-report=html.

Green lines were executed. Red lines were not. Yellow lines have partial branch coverage: the line ran but not all branches on it were taken.

Branch coverage is more useful than line coverage. A line like if condition: can be green (the line ran) while the else branch was never tested. Enable branch coverage to catch this:

bash

pytest --cov=. --cov-branch --cov-report=term-missing

bash

Name                     Stmts   Miss Branch BrPart  Cover   Missing
--------------------------------------------------------------------
orders/models.py            45      3     12      2    88%   78, 91->93, 92
orders/services.py          28      0      8      0   100%

The BrPart column shows partially covered branches. 91->93 means line 91 has a branch that can jump to line 93, and that jump was never taken in any test. This is the gap that line coverage would have missed.
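To make the difference concrete, here is a small sketch using a hypothetical shipping_total function (it is not part of the orders app). With only the first test, every line executes and line coverage reads 100%, but the jump from the if straight to the return is never taken, which is what --cov-branch would flag.

```python
def shipping_total(subtotal, express):
    fee = 5
    if express:  # two exits: into the body, or straight to the return
        fee = 15
    return subtotal + fee


def test_express_shipping():
    # Runs every line of shipping_total, so line coverage is complete
    assert shipping_total(100, express=True) == 115


def test_standard_shipping():
    # Takes the if -> return jump that the first test never exercises
    assert shipping_total(100, express=False) == 105
```

An if without an else is the usual place for this gap to hide, because there is no separate line to show up red.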

Configuring coverage

Repeating coverage flags on the command line every time is tedious and error-prone. Move them into configuration files so the same settings apply everywhere: local development, CI, and any teammate's machine.

Add coverage settings to pyproject.toml:

pyproject.toml

[tool.pytest.ini_options]
DJANGO_SETTINGS_MODULE = "myproject.settings"
addopts = "--cov=. --cov-report=term-missing --cov-branch"

[tool.coverage.run]
source = ["."]
omit = [
    "*/migrations/*",
    "*/tests/*",
    "manage.py",
    "myproject/settings*.py",
    "myproject/wsgi.py",
    "myproject/asgi.py",
]
branch = true

[tool.coverage.report]
show_missing = true
skip_covered = false
fail_under = 80
exclude_lines = [
    "pragma: no cover",
    "if TYPE_CHECKING:",
    "raise NotImplementedError",
    "if __name__ == .__main__.:",
]

The omit list excludes directories and files that should not count toward coverage. Migrations are generated code. Test files themselves should not be measured (a test that runs covers test lines, but those are not your application). Settings files and entry points are configuration, not logic.

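The exclude_lines list holds regular expressions. Any line matching one is left out of the report entirely, so it counts neither as covered nor as missed, and if the matching line opens a block (an if or a def), the whole block is left out. pragma: no cover is the standard inline annotation for deliberately untested lines; it is normally built in, but setting exclude_lines replaces the defaults, so it has to be listed again. if TYPE_CHECKING: blocks exist only for static analysis and never run at runtime.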

✦ Tip

Use pragma: no cover sparingly

Adding # pragma: no cover to a line tells coverage to ignore it. This is appropriate for genuinely untestable lines like abstract method bodies (raise NotImplementedError) and platform-specific code that only runs on certain operating systems. It is not appropriate for business logic you have not gotten around to testing yet. Use it to silence coverage on lines that cannot be reached, not on lines you have chosen not to reach.
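Here is a sketch of how those patterns might line up with real code. The class and function names are hypothetical; the comments say which exclude_lines pattern from the configuration above would match each line.

```python
import sys
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Matched by "if TYPE_CHECKING:"; the block never runs at runtime
    from decimal import Decimal


class PaymentGateway:
    def charge(self, amount: "Decimal") -> str:
        # Matched by "raise NotImplementedError"
        raise NotImplementedError


class FakeGateway(PaymentGateway):
    def charge(self, amount: "Decimal") -> str:
        # Ordinary logic: measured as usual and expected to be tested
        return f"charged {amount}"


def config_dir() -> str:
    # The pragma sits on the line that opens the block, so the whole
    # platform-specific branch is excluded, not only the if line
    if sys.platform == "win32":  # pragma: no cover
        return "C:\\ProgramData\\myapp"
    return "/etc/myapp"
```

Note what is not annotated: FakeGateway.charge and the final return in config_dir are reachable, so they stay in the report.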

What good coverage looks like

There is no universal correct number. Coverage targets depend on the nature of the codebase, the team's confidence, and the consequences of bugs.

Some practical guidance:

Critical paths (payment, auth, data writes) should be close to 100%. A bug in these areas has immediate user impact. Every branch of every function in these paths deserves a test.
80% to 90% is a reasonable overall target for most Django applications. Below 80% usually indicates entire features or layers with no tests. Above 90% the marginal value of additional coverage drops while the cost of writing tests for diminishing-returns edge cases rises.
New code should not lower the overall coverage. --cov-fail-under gives you a floor in CI: if a pull request drops total coverage below the threshold, it does not merge until tests are added. It will not catch a drop that stays above the floor, so raise the threshold as coverage improves.
Track the trend, not the number. Coverage going from 75% to 80% over a month tells you the suite is improving. Coverage sitting at 60% for six months tells you the team has stopped caring. The trend matters more than where it sits at any given moment.

Running tests in GitHub Actions

GitHub Actions is a CI/CD platform that runs workflows defined in YAML files. A workflow that runs your test suite on every push and pull request means bugs are caught before they are reviewed or merged, not after.

Create the workflow file at .github/workflows/tests.yml:

.github/workflows/tests.yml

name: Tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_DB: test_db
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    env:
      DATABASE_URL: postgres://postgres:postgres@localhost:5432/test_db
      DJANGO_SETTINGS_MODULE: myproject.settings.test
      SECRET_KEY: ci-secret-key-not-used-in-production

    steps:
      - name: Check out code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: "pip"

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Run tests
        run: pytest --cov=. --cov-report=xml --cov-fail-under=80

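      - name: Upload coverage report
        uses: codecov/codecov-action@v4
        with:
          file: ./coverage.xml
          fail_ci_if_error: true
          # v4 of the action needs an upload token; this assumes a repository
          # secret named CODECOV_TOKEN.
          token: ${{ secrets.CODECOV_TOKEN }}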

This workflow does the following:

Triggers on every push to main and on every pull request targeting main.
Starts a PostgreSQL container with health checks so the database is ready before the tests run.
Sets environment variables for the database connection and Django settings.
Checks out the code, sets up Python 3.12 with pip caching, and installs dependencies.
Runs the full test suite with coverage, outputting an XML report.
Uploads the coverage report to Codecov for tracking over time.
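The workflow only sets DATABASE_URL; it relies on your settings reading it. Many projects use a package such as dj-database-url or django-environ for that. As a minimal sketch of the idea using only the standard library, a hypothetical helper could look like this (the function name is an assumption, and real packages handle more options and edge cases):

```python
import os
from urllib.parse import urlparse


def database_config_from_url(url):
    """Turn a postgres:// URL into a Django DATABASES entry."""
    parsed = urlparse(url)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parsed.path.lstrip("/"),
        "USER": parsed.username,
        "PASSWORD": parsed.password,
        "HOST": parsed.hostname,
        "PORT": parsed.port or 5432,
    }


# In settings: fall back to the CI value when the variable is not set
DATABASES = {
    "default": database_config_from_url(
        os.environ.get(
            "DATABASE_URL",
            "postgres://postgres:postgres@localhost:5432/test_db",
        )
    )
}
```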

✦ Tip

Use a separate test settings file

Create a myproject/settings/test.py that inherits from your base settings and overrides anything that should be different in CI: a faster password hasher (MD5 instead of the default PBKDF2), disabled throttling, disabled external integrations. Faster password hashing alone can shave seconds off a suite with many user creation operations.

myproject/settings/test.py

from .base import *  # noqa

# Use a fast hasher in tests: the default PBKDF2 hasher is intentionally slow
PASSWORD_HASHERS = ["django.contrib.auth.hashers.MD5PasswordHasher"]

# Run Celery tasks synchronously in the test process instead of sending them to a broker
CELERY_TASK_ALWAYS_EAGER = True
CELERY_TASK_EAGER_PROPAGATES = True

# Disable external integrations
EMAIL_BACKEND = "django.core.mail.backends.locmem.EmailBackend"
STRIPE_ENABLED = False

# Use an in-memory cache
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.locmem.LocMemCache",
    }
}

Caching dependencies

Downloading dependencies on every CI run is slow. The actions/setup-python@v5 action supports pip caching out of the box with cache: "pip". This saves pip's download cache and restores it on later runs while requirements.txt is unchanged.

If you need control over the cache key or path, the same download cache can be managed by hand with actions/cache, keyed on a hash of the requirements file. Use one approach or the other, not both:

yaml

      - name: Cache pip packages
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-

      - name: Install dependencies
        run: pip install -r requirements.txt

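When requirements.txt changes, the hash changes and the exact key misses. The restore-keys prefix then falls back to the most recent older cache, so pip only downloads what is new. When requirements have not changed, the cache hits and pip installs from local files without downloading anything. Packages are still installed on every run; only the download is skipped.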
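The keying idea can be sketched in a few lines. This is an illustration of content-based cache keys, not GitHub's actual hashFiles implementation; the function name and the sample requirements are assumptions.

```python
import hashlib


def pip_cache_key(runner_os: str, requirements_text: str) -> str:
    """Build a key that changes whenever the requirements content changes."""
    digest = hashlib.sha256(requirements_text.encode("utf-8")).hexdigest()
    return f"{runner_os}-pip-{digest}"


original = pip_cache_key("Linux", "Django==5.1.4\npytest==8.3.4\n")
unchanged = pip_cache_key("Linux", "Django==5.1.4\npytest==8.3.4\n")
edited = pip_cache_key("Linux", "Django==5.1.4\npytest==8.3.4\npytest-cov==6.0.0\n")

print(original == unchanged)  # same content, same key: cache hit
print(original == edited)     # content changed, new key: cache miss
```

Both keys still start with Linux-pip-, which is why the restore-keys prefix can fall back to an older cache after a miss.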

Matrix testing

A matrix strategy runs the same job against multiple versions of Python and Django in parallel. This catches compatibility issues before they reach production and verifies that your application works on the versions your users actually run.

yaml

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12", "3.13"]
        django-version: ["4.2", "5.0", "5.1"]

    steps:
      - uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

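      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          # Install the matrix version last so requirements.txt cannot override it.
          # ~=4.2.0 resolves to the latest 4.2.x patch release; ==4.2 would pin 4.2.0 exactly.
          pip install "django~=${{ matrix.django-version }}.0"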

      - name: Run tests
        run: pytest

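GitHub expands the matrix into all 9 combinations (3 Python versions times 3 Django versions) and runs them as separate jobs, in parallel as far as runners are available. You see which combinations pass and which fail without any additional configuration. Not every pairing is officially supported: Django 4.2 and 5.0 do not support Python 3.13, so drop those two with an exclude entry under matrix rather than chasing failures there.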
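The expansion is a plain cross product, which is easy to check by hand. The sketch below mirrors the matrix above; the excluded set is an assumption based on Django's published support table, where 4.2 and 5.0 do not list Python 3.13.

```python
from itertools import product

python_versions = ["3.11", "3.12", "3.13"]
django_versions = ["4.2", "5.0", "5.1"]

# Every (python, django) pairing the matrix would generate
all_jobs = list(product(python_versions, django_versions))

# Pairings to skip because Django does not support that Python release
excluded = {("3.13", "4.2"), ("3.13", "5.0")}
supported_jobs = [job for job in all_jobs if job not in excluded]

print(len(all_jobs), len(supported_jobs))
```

Each added version multiplies the job count, so prune unsupported pairings before the matrix grows.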

Coverage in CI

Running coverage in CI serves two purposes: enforcing a minimum threshold so pull requests cannot merge if they drop coverage, and tracking coverage trends over time so you can see whether the suite is improving or degrading.

Enforcing a minimum threshold

yaml

      - name: Run tests with coverage gate
        run: pytest --cov=. --cov-fail-under=80

If the total coverage drops below 80%, the step fails, the job fails, and the pull request shows a failing check. The developer must add tests before the PR can merge.

For tracking trends, upload to Codecov (free for open source, paid for private repos) or use GitHub's own summary output:

yaml

      - name: Run tests
        run: pytest --cov=. --cov-report=xml --cov-report=term-missing

      - name: Upload to Codecov
        uses: codecov/codecov-action@v4
        with:
          file: ./coverage.xml

      # Alternative: write coverage summary directly to the GitHub Actions job summary
      - name: Coverage summary
        run: |
          echo "## Coverage Report" >> $GITHUB_STEP_SUMMARY
          coverage report --format=markdown >> $GITHUB_STEP_SUMMARY

Making the suite fast

A suite that takes 5 minutes to run locally will not be run locally. Developers will push and wait for CI, creating a slow feedback loop and a culture of "fix it later." Speed is not a luxury. It directly determines how useful your test suite is in practice.

Here are the highest-leverage optimisations, roughly in order of impact.

Use a fast password hasher in tests

Django's default password hasher (PBKDF2) is intentionally slow. That is correct for production security. In tests, where hundreds of users are created, it adds seconds. Switch to MD5 in the test settings file:

python

# myproject/settings/test.py
PASSWORD_HASHERS = ["django.contrib.auth.hashers.MD5PasswordHasher"]

This single change commonly cuts 20 to 40 percent from suite runtime in applications that create many test users.
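You can get a feel for the gap without Django at all. The sketch below times the two underlying primitives from the standard library. The iteration count is an assumption chosen to keep the script quick; recent Django releases use a considerably higher one, so the real difference is larger.

```python
import hashlib
import time


def seconds_for(hash_once, repeats=20):
    """Time `repeats` calls of a zero-argument hashing function."""
    start = time.perf_counter()
    for _ in range(repeats):
        hash_once()
    return time.perf_counter() - start


password, salt = b"testpass123", b"per-user-salt"

# 100_000 iterations is illustrative, not Django's configured value
pbkdf2_seconds = seconds_for(
    lambda: hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)
)
md5_seconds = seconds_for(lambda: hashlib.md5(salt + password).hexdigest())

print(f"PBKDF2: {pbkdf2_seconds:.3f}s  MD5: {md5_seconds:.6f}s  (20 hashes each)")
```

Every create_user call and every login in a test pays the slow price once, which is why the saving scales with the number of users the suite creates.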

Reuse the test database

pytest-django creates the test database at the start of every run and drops it at the end by default. That means applying every migration from scratch each time. On projects with many migrations, this adds several seconds or more.

Use --reuse-db to skip this step and reuse the existing database:

bash

# Reuse the test database if it exists; schema changes are not detected automatically
pytest --reuse-db

# Force a rebuild when you need to apply new migrations
pytest --reuse-db --create-db

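Add --reuse-db to addopts in pyproject.toml for local development. TOML allows only one addopts key, so append the flag to the coverage options set earlier rather than adding a second entry.

pyproject.toml

[tool.pytest.ini_options]
# Reuse the test database between runs
addopts = "--cov=. --cov-report=term-missing --cov-branch --reuse-db"

In CI the flag is harmless when the database runs in a fresh service container, because there is nothing to reuse and the database is created anyway. To be explicit, pass --create-db on the CI command line, which forces a rebuild even when --reuse-db is set. Setting PYTEST_ADDOPTS to an empty string does not help: its contents are added to addopts, they do not replace it.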

Keep unit tests off the database

As covered in Article 7, using Model() or factory.build() instead of objects.create() for unit tests avoids all database I/O. A suite with 200 unit tests that never touch the database can run in under a second. The same 200 tests with unnecessary database access can take 30 seconds.

Run your unit tests in isolation to see the gap:

bash

# Run only tests with no database access (no @pytest.mark.django_db)
pytest -m "not django_db"

# Compare the time to running only database tests
pytest -m "django_db"

Parallel test execution

pytest-xdist runs tests in parallel across multiple CPU cores. Each worker gets its own database, so tests remain isolated. On a machine with 8 cores, a suite that takes 60 seconds can drop to 10 to 15 seconds.

bash

pip install pytest-xdist

bash

# Run with 4 workers
pytest -n 4

# Run with one worker per CPU core (auto)
pytest -n auto
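pytest-xdist exposes each worker's id in the PYTEST_XDIST_WORKER environment variable (gw0, gw1, and so on), and pytest-django uses that id as a suffix for the worker's test database. The sketch below shows the naming idea only; the helper is hypothetical and is not pytest-django's own code.

```python
import os


def worker_database_name(base_name, environ=None):
    """Return a per-worker database name, or the base name without xdist."""
    environ = os.environ if environ is None else environ
    worker_id = environ.get("PYTEST_XDIST_WORKER")
    if worker_id is None:
        # Not running under xdist: a single shared test database
        return base_name
    return f"{base_name}_{worker_id}"


print(worker_database_name("test_db", {"PYTEST_XDIST_WORKER": "gw0"}))
print(worker_database_name("test_db", {"PYTEST_XDIST_WORKER": "gw1"}))
print(worker_database_name("test_db", {}))
```

The same variable is handy for anything else that must not be shared between workers, such as temporary directories or cache key prefixes.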

⚠ Gotcha

Parallel tests expose hidden shared state bugs

Tests that pass when run sequentially sometimes fail under parallel execution because they depend on shared state that gets mutated concurrently. If your suite has flaky failures with -n auto but passes with -n 1, you have a shared state bug. Use the isolation techniques from Article 7 to fix it. Parallel execution is a useful tool for finding these bugs before they surface in more subtle ways.

Using xdist in GitHub Actions

yaml

      - name: Run tests in parallel
        run: pytest -n auto --cov=. --cov-report=xml

Running only what changed

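pytest-picked runs only the test files that Git reports as modified or untracked, that is, the tests you have touched since the last commit. It looks at the test files themselves and does not map source changes to the tests that cover them. On a large codebase, this can reduce a 3-minute suite to a 10-second feedback loop while you are writing tests.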

bash

pip install pytest-picked

bash

# Run only test files with uncommitted changes
pytest --picked

# Show which tests would run without running them
pytest --picked --collect-only
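As a rough mental model of the selection, the plugin starts from Git's view of the working tree and keeps the paths that look like test files. The sketch below approximates that with a hypothetical helper that parses git status --short output; it is not the plugin's code, and the file-name rules are an assumption based on pytest's default discovery patterns.

```python
from pathlib import PurePosixPath


def changed_test_files(git_status_short: str) -> list[str]:
    """Pick test files out of `git status --short` output."""
    picked = []
    for line in git_status_short.splitlines():
        if len(line) < 4:
            continue
        # Each line is two status columns, a space, then the path
        path = line[3:].strip()
        name = PurePosixPath(path).name
        if name.startswith("test_") and name.endswith(".py"):
            picked.append(path)
        elif name.endswith("_test.py"):
            picked.append(path)
    return picked


sample = (
    " M orders/models.py\n"
    " M orders/tests/test_models.py\n"
    "?? reviews/tests/test_moderation.py\n"
)
print(changed_test_files(sample))
```

In the sample, the change to orders/models.py selects nothing by itself, which is the reason to finish with a full run.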

Use this locally during active development. Always run the full suite before pushing to catch regressions in areas you did not expect your change to affect.

The --lf (last failed) and --ff (failed first) flags from Article 3 complement this well: after a full run reveals failures, use --lf to iterate quickly on fixing them, then run the full suite to confirm nothing else broke.

Where to go from here

This series covered unit tests, integration tests, authentication, TDD, test data, and CI. There is more ground to cover in the broader Django testing landscape. Here are the most valuable directions from here.

Async testing

Django 3.1 added async views, Django 4.1 extended async support to class-based views and the ORM, and Django Channels enables WebSocket support. Testing async code requires async-aware test utilities.

Django provides AsyncClient for calling views from async tests. pytest-asyncio supplies the @pytest.mark.asyncio marker for async test functions. Channels provides a WebsocketCommunicator for testing WebSocket consumers.

bash

pip install pytest-asyncio

python

import pytest
from django.test import AsyncClient


@pytest.mark.asyncio
@pytest.mark.django_db
async def test_async_view_returns_200():
    client = AsyncClient()
    response = await client.get("/api/async-endpoint/")
    assert response.status_code == 200

Browser testing with Playwright

For end-to-end tests that exercise the full browser stack, Playwright is the modern choice over Selenium. It is faster, more reliable, and has better async support.

The pytest-playwright plugin integrates Playwright with pytest and provides a page fixture: a fresh browser page in its own isolated browser context for each test.

bash

pip install pytest-playwright
playwright install

python

from playwright.sync_api import Page, expect


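def test_user_can_log_in_and_see_dashboard(page: Page, live_server, django_user_model):
    # Create the account first; logging in as a user that does not exist cannot succeed.
    # If this ORM call raises SynchronousOnlyOperation under Playwright's event loop,
    # setting DJANGO_ALLOW_ASYNC_UNSAFE=true for the test run is the usual workaround.
    django_user_model.objects.create_user(username="alice", password="testpass123")

    page.goto(f"{live_server.url}/login/")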
    page.fill("[name=username]", "alice")
    page.fill("[name=password]", "testpass123")
    page.click("[type=submit]")

    expect(page).to_have_url(f"{live_server.url}/dashboard/")
    expect(page.locator("h1")).to_contain_text("Welcome back")

Keep browser tests narrow. Test critical user journeys (login, checkout, account creation) and nothing else. They are slow and expensive to maintain. A handful of well-chosen browser tests on top of a strong unit and integration suite is the right balance.

Property-based testing with Hypothesis

Property-based testing generates hundreds of random inputs and runs your function against all of them, looking for inputs that violate an invariant. Hypothesis is the standard library for this in Python.

It is especially powerful for finding edge cases in validation logic, parsing, and mathematical functions that a human would never think to test manually.

bash

pip install hypothesis

python

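from hypothesis import given, strategies as st

# Assumed import path for the function under test; adjust it to your project.
from orders.services import calculate_loyalty_discount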


@given(
    price=st.decimals(min_value=0, max_value=10000, places=2),
    is_loyalty=st.booleans(),
)
def test_loyalty_discount_never_returns_more_than_input(price, is_loyalty):
    result = calculate_loyalty_discount(price, is_loyalty)
    assert result <= price


@given(st.integers(min_value=1, max_value=5))
def test_valid_rating_never_raises(rating):
    from reviews.models import Review
    review = Review(rating=rating)
    review.full_clean()  # should never raise for 1-5

Mutation testing

Mutation testing answers the question that coverage cannot: are your tests actually catching bugs? It works by introducing deliberate bugs (mutations) into your code, one at a time, and checking whether any test fails. A mutation that survives (no test catches it) is a gap in your suite.

mutmut is the most popular mutation testing tool for Python. It is slow by nature (it runs your suite once per mutation), but running it periodically on critical modules gives you a rigorous measure of test quality that coverage alone cannot provide.
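The idea fits in a few lines. Below is a hand-written sketch, not mutmut output: one mutant of the discount function from earlier in this article, checked against a test with no assertion and a test with one. The check functions return a boolean so the result can be printed; all names apart from calculate_discount are hypothetical.

```python
import math


def calculate_discount(price, customer_type):
    if customer_type == "loyalty":
        return price * 0.9
    return price


def mutant_discount(price, customer_type):
    if customer_type == "loyalty":
        return price * 0.8  # the mutation: 0.9 became 0.8
    return price


def weak_check(discount_fn):
    """Runs the code but asserts nothing, so it passes for any implementation."""
    discount_fn(100, "loyalty")
    return True


def strong_check(discount_fn):
    """Pins the expected value, so a changed rate is detected."""
    return math.isclose(discount_fn(100, "loyalty"), 90.0)


print("weak check passes on mutant:", weak_check(mutant_discount))      # mutant survives
print("strong check passes on mutant:", strong_check(mutant_discount))  # mutant is killed
```

Both checks give the original function 100% line coverage. Only mutation testing tells them apart.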

bash

pip install mutmut

# Run mutation testing on a specific module
# (flag as in mutmut 2.x; mutmut 3 reads paths_to_mutate from its config file instead)
mutmut run --paths-to-mutate orders/services.py

# Show surviving mutations (mutants that no test caught)
mutmut results

Series summary

Eight articles. One complete picture of testing in Django.

The mental model: tests are organised in a pyramid. Unit tests at the base are fast, isolated, and precise. Integration tests in the middle verify that components are wired together. End-to-end tests at the top exercise the full system. A healthy suite has many unit tests, a meaningful number of integration tests, and a small number of end-to-end tests for critical journeys.

The tools: Django's TestCase and pytest-django for running tests and accessing the database. factory_boy for creating test data without fragility or duplication. unittest.mock and pytest-mock for isolating external dependencies. pytest-cov for measuring coverage. pytest-xdist for parallel execution. GitHub Actions for running everything automatically on every push.

The practices: write tests at the right level. Use function-scoped fixtures by default. Test every branch, every error path, every permission boundary. Call refresh_from_db() before asserting on saved state. Patch at the point of use. Use factory.build() for tests that do not need the database. Use a fast password hasher in tests. Enforce a coverage threshold in CI. Track the trend over time.

The discipline: a test suite is not a checkbox. It is the machine that makes it safe to change code. Feed it, maintain it, and it will pay back every hour invested many times over.

Coverage, CI, and Shipping with Confidence
