
Django ORM Query Optimization

How to write efficient Django ORM queries for production. Covers select_related, prefetch_related, query analysis, N+1 detection, aggregation, raw SQL, and database indexing strategies.

⏱ 15 min read · production
[Figure: Database query analysis showing execution plans and optimization paths]

Django’s ORM makes database access convenient, but convenience has a cost when you stop paying attention to what SQL actually runs. The difference between an optimized Django view and a naive one can be 3 queries versus 300, the difference between a 50ms response and a 5-second timeout. This guide covers the query optimization patterns I apply during code review on every Django project: eager loading with select_related and prefetch_related, N+1 detection, queryset evaluation, annotation and aggregation, only() and defer(), raw SQL escape hatches, and the profiling tools that reveal where time actually goes. For a broader data layer view, see the ORM hub.

Query optimization is not premature optimization. If you serve web pages backed by a database, query efficiency is part of the user experience. A page that makes 200 database round trips will feel slow regardless of how fast each individual query runs.

The N+1 query problem

The most common ORM performance issue is N+1 queries. It happens when you fetch a list of objects and then access a related object on each one:

# BAD: 1 query for orders + N queries for customers
orders = Order.objects.all()
for order in orders:
    print(order.customer.name)  # Each access hits the database

Fix with select_related for foreign key and one-to-one relationships:

# GOOD: 1 query with JOIN
orders = Order.objects.select_related('customer').all()
for order in orders:
    print(order.customer.name)  # No additional query

Use prefetch_related for many-to-many and reverse foreign key relationships:

# GOOD: 2 queries total (1 for orders, 1 for items)
orders = Order.objects.prefetch_related('items').all()
for order in orders:
    for item in order.items.all():  # Uses prefetched data
        print(item.product_name)
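When you need to filter or order the prefetched rows, prefetch_related accepts a Prefetch object. A sketch, assuming the Order/items relationship above plus a hypothetical Item model with a shipped flag:

```python
from django.db.models import Prefetch

# Prefetch only shipped items, sorted, into a custom attribute.
# `Item` and its `shipped` field are illustrative assumptions.
orders = Order.objects.prefetch_related(
    Prefetch(
        'items',
        queryset=Item.objects.filter(shipped=True).order_by('product_name'),
        to_attr='shipped_items',
    )
)
for order in orders:
    for item in order.shipped_items:  # plain list, no extra queries
        print(item.product_name)
```

Without to_attr, calling order.items.filter(...) inside the loop would bypass the prefetch cache and issue a fresh query per order.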

Detecting query problems

Django Debug Toolbar shows exact SQL queries for every request during development:

# settings/dev.py
INSTALLED_APPS += ['debug_toolbar']
MIDDLEWARE.insert(0, 'debug_toolbar.middleware.DebugToolbarMiddleware')
INTERNAL_IPS = ['127.0.0.1']

In production, use django-querycount or logging to monitor query counts per request. Any endpoint generating more than 10 queries deserves investigation.
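As a lightweight alternative to a third-party package, here is a sketch of custom middleware that logs requests exceeding a query budget. It counts via connection.execute_wrapper, so it works with DEBUG=False; the threshold and logger name are assumptions:

```python
import logging

from django.db import connection

logger = logging.getLogger("querycount")  # logger name is an assumption


class QueryCountMiddleware:
    """Log a warning for any request that exceeds a query budget."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        count = 0

        def counter(execute, sql, params, many, context):
            nonlocal count
            count += 1
            return execute(sql, params, many, context)

        # execute_wrapper intercepts every query on this connection
        with connection.execute_wrapper(counter):
            response = self.get_response(request)

        if count > 10:  # budget from the guideline above
            logger.warning("%s issued %d queries", request.path, count)
        return response
```

Add the class to MIDDLEWARE in production settings; it adds one function call per query, which is negligible next to the query itself.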

You can also inspect queries from the shell. Note that connection.queries is only populated when DEBUG is True:

from django.db import connection
# ... run your code ...
for query in connection.queries:
    print(query['sql'][:100], query['time'])

Controlling queryset evaluation

Querysets are lazy. They do not hit the database until evaluated. Understanding when evaluation happens prevents unexpected query execution:

Evaluation triggers:

  • Iteration (for obj in queryset)
  • Slicing with a step (queryset[::2])
  • len(), list(), bool()
  • Printing or repr
  • Template rendering

Not evaluated:

  • Chaining filters (.filter().exclude().order_by())
  • Assigning to a variable
  • Passing to another function that chains further

Build your queryset completely before evaluation:

# Good: builds the full query, then evaluates once
qs = Product.objects.filter(is_active=True).select_related('category').order_by('-created_at')[:20]
products = list(qs)  # Single database hit

Using only() and defer()

When you need only specific fields, reduce the data transferred:

# Only fetch the fields needed for a listing page
products = Product.objects.only('id', 'name', 'slug', 'price').filter(is_active=True)

defer() is the inverse, fetching everything except specified fields:

# Fetch everything except the large description field
products = Product.objects.defer('description', 'specifications')

Accessing a deferred field triggers a separate query. Use these methods deliberately and verify they actually reduce response time for your specific case.
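The pitfall is easy to demonstrate: if a loop touches a field that only() excluded, you reintroduce the N+1 pattern you were trying to avoid. A sketch, assuming the Product model above:

```python
# BAD: `description` was not loaded, so each access costs one extra query
products = Product.objects.only('id', 'name')
for p in products:
    print(p.name, p.description)  # N additional queries

# GOOD: include every field the loop actually touches
products = Product.objects.only('id', 'name', 'description')
for p in products:
    print(p.name, p.description)  # no extra queries
```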

Aggregation and annotation

Compute values in the database rather than in Python:

from django.db.models import Count, Sum, Avg, Q

# Count products per category in a single query
categories = Category.objects.annotate(product_count=Count('products'))

# Conditional aggregation
stats = Order.objects.aggregate(
    total_revenue=Sum('total'),
    avg_order_value=Avg('total'),
    pending_count=Count('id', filter=Q(status='pending')),
)

Database-level aggregation is dramatically faster than fetching all rows and computing in Python. This matters especially for dashboard views and reporting endpoints.
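A related pattern: chaining values() before annotate() produces a SQL GROUP BY over the selected columns, giving you one row per group rather than one annotation per object. A sketch against the Order model above (the status values are assumptions):

```python
from django.db.models import Count, Sum

# One query: SELECT status, COUNT(id), SUM(total) ... GROUP BY status
by_status = (
    Order.objects
    .values('status')                                # GROUP BY status
    .annotate(n=Count('id'), revenue=Sum('total'))
    .order_by('-revenue')
)
for row in by_status:
    # rows are dicts: {'status': ..., 'n': ..., 'revenue': ...}
    print(row['status'], row['n'], row['revenue'])
```

The order of the chain matters: annotate() before values() annotates objects; values() before annotate() groups rows.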

Subqueries and expressions

For complex queries, use Subquery and OuterRef:

from django.db.models import Subquery, OuterRef

latest_order = Order.objects.filter(
    customer=OuterRef('pk')
).order_by('-created_at').values('total')[:1]

customers = Customer.objects.annotate(
    last_order_total=Subquery(latest_order)
)

This generates a single SQL query with a correlated subquery, avoiding the need to fetch data in multiple steps.
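A close cousin is Exists, which filters or annotates on whether a correlated subquery returns any rows; the database can stop at the first match, which is usually cheaper than a JOIN plus DISTINCT. A sketch using the same models:

```python
from django.db.models import Exists, OuterRef

pending = Order.objects.filter(customer=OuterRef('pk'), status='pending')

# Customers with at least one pending order, in a single query
customers = Customer.objects.annotate(
    has_pending=Exists(pending)
).filter(has_pending=True)
```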

Bulk operations

For mass updates and creates, use bulk methods:

# Bulk create
Product.objects.bulk_create([
    Product(name=f'Product {i}', price=i * 10)
    for i in range(1000)
], batch_size=100)

# Bulk update
from decimal import Decimal

products = list(Product.objects.filter(category='books'))
for p in products:
    p.price = p.price * Decimal('1.1')  # Decimal, not float, for a DecimalField
Product.objects.bulk_update(products, ['price'], batch_size=100)

bulk_create and bulk_update reduce database round trips from one per object to one per batch. Note that bulk_create does not call save() or send pre_save/post_save signals.
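For the price increase above, an even cheaper option is a single UPDATE using an F expression, which computes the new value in the database without fetching any rows into Python. Like bulk_update, queryset.update() skips save() and signals:

```python
from django.db.models import F

# One query: UPDATE ... SET price = price * 1.1 WHERE category = 'books'
Product.objects.filter(category='books').update(price=F('price') * 1.1)
```

Use the Python loop with bulk_update only when the new value depends on logic the database cannot express.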

When to use raw SQL

The ORM handles 95% of queries well. For the remaining 5%, raw SQL is the right tool:

from django.db import connection

def get_revenue_by_month():
    with connection.cursor() as cursor:
        cursor.execute("""
            SELECT date_trunc('month', created_at) as month,
                   SUM(total) as revenue
            FROM orders_order
            WHERE status = 'completed'
            GROUP BY month
            ORDER BY month DESC
            LIMIT 12
        """)
        return cursor.fetchall()

Use raw SQL for complex reporting queries, database-specific features not exposed by the ORM, or when the generated SQL is genuinely worse than what you can write by hand. Always use parameterized queries to prevent SQL injection.
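Parameterization means passing values separately from the SQL string so the database driver escapes them; never interpolate user input with f-strings or %-formatting. A sketch extending the reporting function above (the status and months arguments are assumptions):

```python
from django.db import connection

def get_revenue_by_month(status='completed', months=12):
    with connection.cursor() as cursor:
        cursor.execute(
            """
            SELECT date_trunc('month', created_at) AS month,
                   SUM(total) AS revenue
            FROM orders_order
            WHERE status = %s
            GROUP BY month
            ORDER BY month DESC
            LIMIT %s
            """,
            [status, months],  # values are escaped by the driver, not you
        )
        return cursor.fetchall()
```

Django cursors always use %s placeholders, regardless of database backend.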

Frequently asked questions

How do I know if my queries are fast enough? Measure. Use Django Debug Toolbar in development and application monitoring in production. A good target for web pages is under 50ms of total database time per request.

Should I always use select_related? No. Use it when you know you will access the related object. Adding select_related for relationships you do not access wastes bandwidth by fetching unnecessary data in the JOIN.

Is the Django ORM slower than raw SQL? The ORM adds minimal overhead per query. The performance difference between ORM and raw SQL is usually negligible compared to the cost of the database work itself. The ORM’s real risk is generating inefficient queries, not being slow to construct them.

How do I handle full-text search efficiently? For PostgreSQL, use django.contrib.postgres.search. For more complex search requirements, consider a dedicated search engine. The PostgreSQL guide covers database-level search in detail.