How to optimize the performance of Django admin with millions of records in MongoDB

In this article, I share how I sped up a Django admin site that has to list more than 20 million records stored in MongoDB.

Real Case Scenario

We ran MongoDB in Kubernetes pods and connected it to Django admin, which made CRUD operations easier and gave us better search through custom filters.

The problem was that records loaded slowly from MongoDB, and requests sometimes ended in timeout errors. The data grew daily, which made loading progressively worse.

We were also using Djongo, which translates Django ORM queries into MongoDB queries so that we could keep using the Django ORM.

Solution

Now, let's take a look at the general reasons for the slow performance.

Indexing DB

Without an index, MongoDB must scan every document in a collection to find the ones that match the query. With millions of records, that scan takes a long time to complete. By indexing the specific fields that we use in the query, MongoDB can use the index to limit the number of documents it must inspect.

This article covers only what was needed for this particular problem; see the MongoDB documentation on indexes for more detailed information.

  • Create indexes on the fields that you use for filtering.

  • If a filter involves multiple fields, consider creating a compound index.
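As a rough sketch of the two points above, the snippet below creates one single-field index and one compound index through PyMongo's `create_index`. The field names (`customer_id`, `status`, `created_at`) and the `ensure_indexes` helper are hypothetical, so substitute the fields that your own admin filters use.

```python
# Sort directions as MongoDB expects them (the same values as
# pymongo.ASCENDING and pymongo.DESCENDING).
ASCENDING = 1
DESCENDING = -1

# Hypothetical field names: replace them with the fields you filter by.
FILTER_INDEXES = [
    # Single-field index for a filter on one field.
    [("customer_id", ASCENDING)],
    # Compound index for a filter that combines two fields.
    [("status", ASCENDING), ("created_at", DESCENDING)],
]


def ensure_indexes(collection):
    """Create every index in FILTER_INDEXES on a pymongo Collection.

    Creating an index that already exists with the same definition
    does nothing, so this can be run repeatedly.
    Returns the index names reported by the driver.
    """
    return [collection.create_index(keys) for keys in FILTER_INDEXES]
```

With a real connection, `collection` would be a PyMongo collection object such as `MongoClient(uri)[db_name][collection_name]`, where the URI and names are your own.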

Note that indexes are special data structures that store a small subset of the data held in a collection's documents, separately from the documents themselves. Indexes work best when they fit in memory, so check RAM usage and increase it if required.
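To confirm that a query really uses an index, you can inspect the result of the cursor's `explain()` method. The helper below is a minimal sketch: `uses_index` is a hypothetical name, and it assumes the chosen plan appears under `queryPlanner.winningPlan` with `IXSCAN` marking an index scan. The exact shape of the explain output varies between MongoDB versions, so treat it as a starting point.

```python
def _has_stage(node, stage):
    """Recursively look for a plan stage name inside nested dicts and lists."""
    if isinstance(node, dict):
        if node.get("stage") == stage:
            return True
        return any(_has_stage(value, stage) for value in node.values())
    if isinstance(node, list):
        return any(_has_stage(item, stage) for item in node)
    return False


def uses_index(explain_output):
    """Return True if the winning plan of an explain() result scans an index."""
    winning_plan = explain_output.get("queryPlanner", {}).get("winningPlan", {})
    return _has_stage(winning_plan, "IXSCAN")
```

For example, `uses_index(collection.find({"status": "active"}).explain())` should return False when MongoDB falls back to a full collection scan (`COLLSCAN`) for that filter.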

Custom Paginator

By default, Django admin paginates the change list, and to build the paginator it first counts all matching records. With millions of documents, that count query takes a long time to finish.

First, let's take a look at the count property of the Paginator class:

    @cached_property
    def count(self):
        """Return the total number of objects, across all pages."""
        c = getattr(self.object_list, 'count', None)
        if callable(c) and not inspect.isbuiltin(c) and method_has_no_args(c):
            return c()
        return len(self.object_list)

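Since we don't need an exact count, let's create a custom paginator that overrides the default behavior and skips the actual counting. The hard-coded value is only a placeholder, so the number of pages shown in the admin will not reflect the real data: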

admin.py

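from django.core.paginator import Paginator


class NoCountPaginator(Paginator):
    """Paginator that skips the expensive count query."""

    @property
    def count(self):
        # Placeholder total, large enough that any page number is accepted.
        return 999999999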

Then set the paginator attribute on your admin class as below:

admin.py

@admin.register(models.Mymodel)
class MymodelAdmin(admin.ModelAdmin):

    paginator = NoCountPaginator
    ...

We should also stop the change list from running a second count query on filtered pages, which it uses to show the full, unfiltered total. To do that, set show_full_result_count to False.

admin.py

@admin.register(models.Mymodel)
class MymodelAdmin(admin.ModelAdmin):

    paginator = NoCountPaginator
    show_full_result_count = False

Custom Filters

Also check any custom filters ( list_filter ) that you may have added to Django admin earlier. In my case, one filter collected the distinct values of a field across all documents. That slows the page down dramatically as the number of documents increases.

As a solution, you can replace such a filter with a text box search, so that the admin no longer has to scan every document to build the list of filter values.

Don't use skip and limit

Adding an offset to a query does not speed it up and can make it much slower. When you specify an offset, MongoDB has to walk through all the skipped documents before it reaches the first one to return. The first few pages can be fast, but the larger the offset gets, the slower the query becomes.
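One common alternative, which was not part of the original setup, is range-based (keyset) pagination: remember the `_id` of the last document shown and ask only for documents after it. In this pattern `limit` on its own is cheap; the cost comes from `skip`. The sketch below uses hypothetical helpers (`next_page_filter`, `fetch_page`) and works at the PyMongo level. Wiring it into the Django admin change list is a separate task that is not covered here.

```python
def next_page_filter(base_filter, last_seen_id=None):
    """Build the filter for the page that follows last_seen_id.

    base_filter is the filter the user already applied; it is copied,
    not modified. Any "_id" condition in it is replaced.
    """
    query = dict(base_filter)
    if last_seen_id is not None:
        # Ask only for documents after the last one already shown,
        # instead of skipping over everything before it.
        query["_id"] = {"$gt": last_seen_id}
    return query


def fetch_page(collection, base_filter, last_seen_id=None, page_size=100):
    """Fetch one page from a pymongo Collection without calling skip()."""
    cursor = (
        collection.find(next_page_filter(base_filter, last_seen_id))
        .sort("_id", 1)
        .limit(page_size)
    )
    return list(cursor)
```

To get the next page, pass the `_id` of the last document from the previous page as `last_seen_id`. When `base_filter` includes other fields, a compound index on those fields followed by `_id` is likely needed for this to stay fast.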

Support 🌏

If you feel like you unlocked new skills, please share it with your friends and stay connected.