Benchmarking REST APIs in Python: A Practical Guide with Locust and Requests
Evaluating the performance of RESTful APIs under varying load conditions is essential for ensuring reliability, scalability, and a positive user experience. This process, known as API benchmarking or load testing, involves simulating multiple users interacting with an API simultaneously and measuring its response time, throughput, and error rate. Python is a widely adopted language for test automation, and two powerful libraries often used in conjunction for API benchmarking are Locust and Requests.
Understanding API Benchmarking
API benchmarking quantifies an API’s performance characteristics under specific load profiles. Key metrics typically measured include:
- Response Time (Latency): The time taken for the API to respond to a request. This is often measured in milliseconds (ms) and reported as average, median, minimum, maximum, and percentiles (e.g., 90th, 95th, 99th percentile). Lower response times indicate better performance.
- Throughput: The number of requests the API can handle per unit of time, usually measured in requests per second (Req/s). Higher throughput indicates greater capacity.
- Error Rate: The percentage of requests that result in an error response (e.g., HTTP status codes 5xx). A high error rate under load can indicate stability issues or resource exhaustion.
- Concurrency: The number of simultaneous active users or requests the API can handle effectively.
Benchmarking helps identify performance bottlenecks, determine infrastructure requirements for expected traffic, validate service level agreements (SLAs), and ensure stability before deploying or after updating an API.
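To make these metrics concrete, the short sketch below computes the average, median, and 95th-percentile latency from a list of response times using only Python's standard library; the sample values are illustrative, not real measurements.

```python
import statistics

# Illustrative response times in milliseconds (not real measurements)
response_times_ms = [112, 98, 105, 130, 97, 250, 101, 99, 340, 110]

average = statistics.mean(response_times_ms)
median = statistics.median(response_times_ms)

# quantiles(n=100) returns 99 cut points dividing the data into 100 groups;
# index 94 is the 95th percentile
p95 = statistics.quantiles(response_times_ms, n=100)[94]

print(f"avg={average:.1f} ms  median={median:.1f} ms  p95={p95:.1f} ms")
```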
Essential Tools: Locust and Requests
- Locust: Locust is an open-source, Python-based load testing tool. It defines user behavior with Python code, making test scenarios highly customizable and expressive. Key features include:
- Defining user tasks and their probability of execution.
- Simulating large numbers of concurrent users.
- Providing a web-based user interface for controlling tests and viewing real-time statistics.
- Supporting distributed testing across multiple machines.
- Automatic reporting of response times, requests per second, and failure rates.
- Requests: The Requests library is the de facto standard for making HTTP requests in Python. It provides a simple, user-friendly API for interacting with web services. While Locust provides a client (self.client) within its user classes that wraps the requests library and automatically logs statistics, understanding the underlying functionality of requests is beneficial for more complex scenarios.
Using Locust with Requests (specifically via Locust’s self.client) combines Locust’s load generation and reporting capabilities with Requests’ robust HTTP handling, creating a powerful and flexible benchmarking setup.
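For quick one-off checks outside of a load test, the Requests library can also be used directly to time a single call. A minimal sketch, reusing the hypothetical http://api.example.com host that appears in the examples below:

```python
import requests

# Hypothetical endpoint matching the example API used below
url = "http://api.example.com/items"

response = requests.get(url, timeout=10)

# .elapsed measures the time from sending the request until the response
# headers were parsed, giving a rough single-sample latency figure
print(f"Status: {response.status_code}")
print(f"Latency: {response.elapsed.total_seconds() * 1000:.1f} ms")
```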
Setting up the Benchmarking Environment
To begin benchmarking, ensure Python is installed on the system. The necessary libraries can be installed using pip, Python’s package installer:
```bash
pip install locust requests
```
This command installs both Locust and the Requests library.
Defining User Behavior with Locust
Benchmarking with Locust involves creating a Python file (conventionally named locustfile.py) that defines the simulated user behavior. This file contains one or more User classes, typically inheriting from HttpUser when testing HTTP APIs.
An HttpUser class represents a type of user interacting with the system. Within this class, tasks are defined using the @task decorator. Each @task method represents an action a simulated user might perform.
```python
from locust import HttpUser, task, between


class APIUser(HttpUser):
    """
    User class that makes requests to the API
    """
    # The host attribute specifies the base URL for requests
    host = "http://api.example.com"  # Replace with your API base URL

    # wait_time defines the time a user waits between executing tasks
    # between(1, 5) means wait between 1 and 5 seconds
    wait_time = between(1, 5)

    @task
    def get_items(self):
        """
        Task to perform a GET request to the /items endpoint
        """
        # self.client is a Requests-based client provided by Locust that logs stats
        self.client.get("/items")

    @task
    def create_item(self):
        """
        Task to perform a POST request to create an item
        """
        item_data = {"name": "test_item", "price": 10.0}
        self.client.post("/items", json=item_data)
```
- HttpUser: Specifies that this user interacts over HTTP.
- host: Defines the base URL for the API being tested. This allows tasks to use relative paths (e.g., /items).
- wait_time: Controls the “think time” between tasks for each simulated user, making the load pattern more realistic. between(min, max) sets a random wait time within that range.
- @task: Decorator indicating that a method represents a user task. Locust picks tasks to execute based on their weight (the default weight is 1 if none is specified); a short sketch of weighted tasks follows below.
- self.client: A key object provided by Locust’s HttpUser. It acts like a requests.Session object but automatically reports request performance (response time, status code, size, etc.) to Locust’s statistics. Methods like self.client.get(), self.client.post(), self.client.put(), and self.client.delete() are available and work similarly to their requests counterparts.
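As mentioned above, tasks can be given explicit weights so that some actions run more often than others. A minimal sketch, reusing the same hypothetical /items endpoints:

```python
from locust import HttpUser, task, between


class WeightedAPIUser(HttpUser):
    host = "http://api.example.com"  # Hypothetical API base URL
    wait_time = between(1, 5)

    @task(3)  # Weight 3: reads are picked roughly three times as often
    def get_items(self):
        self.client.get("/items")

    @task(1)  # Weight 1: occasional write
    def create_item(self):
        self.client.post("/items", json={"name": "test_item", "price": 10.0})
```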
Adding Data and Headers to Requests
The self.client methods support arguments similar to requests methods for sending data, setting headers, and handling other HTTP specifics.
```python
    @task
    def update_item(self):
        """
        Task to perform a PUT request to update an item
        """
        item_id = 123  # Example item ID
        update_data = {"price": 12.5}
        headers = {"X-My-Header": "value"}
        self.client.put(f"/items/{item_id}", json=update_data, headers=headers)

    @task
    def get_filtered_items(self):
        """
        Task to perform a GET request with query parameters
        """
        params = {"status": "active", "limit": 10}
        self.client.get("/items", params=params)
```
Running the Locust Test
Once the locustfile.py is created, the test can be run from the terminal.
Navigate to the directory containing locustfile.py and execute:
```bash
locust -f locustfile.py
```
By default, Locust starts a web interface accessible at http://localhost:8089. Open this URL in a web browser.
The web UI allows configuring the test:
- Number of users to simulate: The total number of concurrent users.
- Spawn rate: The number of users to start per second until the total user count is reached.
- Host: The base URL of the system under test (this can override the host attribute in locustfile.py).
After configuring, click “Start swarming!”.
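For automated or scripted runs (for example in CI), the same test can also be started without the web UI using Locust's headless mode, with the user count, spawn rate, and duration given on the command line; an illustrative invocation:

```bash
# Hypothetical run: 50 users, spawned 5 per second, for 2 minutes
locust -f locustfile.py --headless -u 50 -r 5 --run-time 2m --host http://api.example.com
```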
Interpreting Locust Results
The Locust web UI provides real-time statistics during the test run:
| Metric | Description | Insight |
|---|---|---|
| Requests per second | The rate at which requests are being processed by the API. | Indicates API throughput. Higher values suggest better capacity. |
| Response Time (Avg) | The arithmetic mean of response times for a specific endpoint. | Provides a general sense of typical latency. |
| Response Time (Median) | The middle value of response times. Less affected by outliers than the average. | A good indicator of the typical user experience regarding latency. |
| Response Time (Percentiles) | e.g., the 90th percentile is the time within which 90% of requests completed. | Crucial for understanding tail latency. High percentile values mean a meaningful share of requests are much slower than the average suggests, pointing to potential issues for those users. |
| Failure Rate (%) | The percentage of requests resulting in non-success HTTP status codes. | Directly indicates API stability and error handling under load. Should be 0% or near 0% for healthy APIs. |
| Total Requests | The cumulative count of requests sent. | Shows the volume of traffic generated during the test. |
| Total Failures | The cumulative count of failed requests. | Helps pinpoint the specific endpoint(s) experiencing issues. |
Analyzing these metrics together provides a comprehensive view of API performance under the simulated load. For instance, increasing the number of users and observing a significant rise in average response time or failure rate indicates that the API is struggling to handle the increased load.
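One way to automate this “step up the load and watch the metrics” approach is Locust's load test shape feature, in which a class placed in the locustfile controls the user count over time. A minimal step-load sketch with illustrative numbers:

```python
from locust import LoadTestShape


class StepLoadShape(LoadTestShape):
    """Add 10 users every 60 seconds up to 100 users, then stop the test."""

    step_time = 60    # seconds per step
    step_users = 10   # users added at each step
    spawn_rate = 10   # users started per second when stepping up
    max_users = 100

    def tick(self):
        run_time = self.get_run_time()
        # Stop once every step has had its full duration
        if run_time > self.step_time * (self.max_users // self.step_users):
            return None
        current_step = int(run_time // self.step_time) + 1
        users = min(current_step * self.step_users, self.max_users)
        return (users, self.spawn_rate)
```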
Concrete Example: Benchmarking JSONPlaceholder
JSONPlaceholder is a free online REST API for testing and prototyping. Benchmarking its /posts endpoint with GET requests serves as a practical example.
locustfile_jsonplaceholder.py:
```python
from locust import HttpUser, task, between


class JSONPlaceholderUser(HttpUser):
    host = "https://jsonplaceholder.typicode.com"
    wait_time = between(1, 5)

    @task
    def get_all_posts(self):
        """
        Fetch all posts
        """
        self.client.get("/posts")

    @task
    def get_first_post(self):
        """
        Fetch a specific post by ID
        """
        self.client.get("/posts/1")

    @task
    def get_posts_by_user(self):
        """
        Fetch posts filtered by user ID
        """
        self.client.get("/posts", params={"userId": 1})
```
To run this:
```bash
locust -f locustfile_jsonplaceholder.py
```
Open http://localhost:8089, set the desired user count and spawn rate, and start the test. The Locust UI will then show real-time performance metrics for the /posts and /posts/1 endpoints under the simulated load. For example, under moderate load, JSONPlaceholder’s response times are expected to be low (tens to hundreds of milliseconds) and the failure rate near zero, demonstrating typical performance for a simple, well-performing API endpoint. Increased load against a less robust API would reveal higher latencies and potential errors.
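Each simulated user can also perform one-time setup in on_start, a hook Locust calls when a user starts. As a variant of the user class above, the sketch below picks a random JSONPlaceholder user ID once per simulated user and reuses it across requests (the grouping name is an illustrative choice):

```python
import random

from locust import HttpUser, task, between


class RandomizedJSONPlaceholderUser(HttpUser):
    host = "https://jsonplaceholder.typicode.com"
    wait_time = between(1, 5)

    def on_start(self):
        # Called once when this simulated user starts; JSONPlaceholder has user IDs 1-10
        self.user_id = random.randint(1, 10)

    @task
    def get_my_posts(self):
        # Group every user-specific request under a single stats entry
        self.client.get(
            "/posts",
            params={"userId": self.user_id},
            name="/posts?userId=[id]",
        )
```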
Case Study: Benchmarking a Microservice API Gateway
Consider a scenario involving an API Gateway that aggregates data from several downstream microservices. Benchmarking this gateway requires simulating realistic user flows that involve multiple API calls in sequence or parallel.
A Locust task could be designed to mimic a user loading a dashboard:
```python
from locust import HttpUser, task, between


class DashboardUser(HttpUser):
    host = "http://api-gateway.internal.network"  # API Gateway URL
    wait_time = between(0.5, 3)

    @task
    def load_dashboard(self):
        """
        Simulate loading a user dashboard, requiring multiple API calls
        """
        user_id = 101  # Example user ID - could be randomized in a real test

        # Call 1: Get user profile
        self.client.get(f"/users/{user_id}/profile", name="Get User Profile")

        # Call 2: Get user's recent activity feed
        self.client.get(f"/users/{user_id}/activity", name="Get Activity Feed")

        # Call 3: Get user's notifications
        self.client.get(f"/users/{user_id}/notifications", name="Get Notifications")

        # Optionally, perform a POST request after fetching data
        # self.client.post(f"/users/{user_id}/log_view", json={"dashboard": "viewed"})
```
In this example, the name parameter in self.client.get() is used to group statistics in the Locust UI, making it easier to analyze the performance of individual API calls within a complex user flow. Running this test under increasing load reveals the performance characteristics of each underlying API call (/users/{user_id}/profile, /users/{user_id}/activity, etc.), helping pinpoint which specific service or endpoint causes bottlenecks under load. Locust does not report the total dashboard load time by default, since each request is logged individually, but custom timing of the whole flow is possible (a sketch follows below). Analysis would focus on which of these individual requests experiences the highest latency or failure rate as the number of concurrent users increases.
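As noted above, the total dashboard time is not reported automatically, but it can be recorded as its own entry by timing the calls and firing Locust's request event manually. A sketch of that pattern (the event arguments follow Locust 2.x conventions; the "FLOW" request type and entry name are illustrative):

```python
import time

from locust import HttpUser, task, between


class TimedDashboardUser(HttpUser):
    host = "http://api-gateway.internal.network"  # API Gateway URL
    wait_time = between(0.5, 3)

    @task
    def load_dashboard(self):
        user_id = 101  # Example user ID
        start = time.perf_counter()

        self.client.get(f"/users/{user_id}/profile", name="Get User Profile")
        self.client.get(f"/users/{user_id}/activity", name="Get Activity Feed")
        self.client.get(f"/users/{user_id}/notifications", name="Get Notifications")

        total_ms = (time.perf_counter() - start) * 1000
        # Report the whole flow as its own entry in the Locust statistics
        self.environment.events.request.fire(
            request_type="FLOW",
            name="Load Dashboard (total)",
            response_time=total_ms,
            response_length=0,
            exception=None,
            context={},
        )
```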
Key Takeaways and Actionable Insights
- API Benchmarking is Critical: Understanding how an API performs under load is non-negotiable for building reliable and scalable systems.
- Locust and Requests Synergy: Locust provides the load generation framework and reporting, while Requests (via self.client) handles the HTTP communication seamlessly, offering a powerful and Pythonic approach to benchmarking.
- Realistic User Behavior: Define tasks in locustfile.py that accurately reflect how real users interact with the API, including request types, data payloads, headers, and wait times.
- Analyze Key Metrics: Focus on Requests per second, Response Time (especially percentiles), and Failure Rate in the Locust UI to diagnose performance characteristics and identify bottlenecks.
- Iterative Testing: Start with a small number of users and gradually increase the load to observe how performance degrades.
- Test Environment Matters: Perform benchmarking on an environment that closely mirrors the production setup in terms of hardware, network, and data volume for meaningful results.
- Group Requests: Use the name parameter in self.client calls to logically group requests for clearer statistics when simulating complex multi-step user flows.
- Don’t Just Load, Validate: While self.client automatically records response status codes, incorporate checks within tasks (e.g., verifying response content) to ensure the API is not just responding quickly but also responding correctly under load; a sketch of response validation follows this list.
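For the last point, Locust's catch_response option lets a task decide for itself whether a request should count as a success or a failure based on the response content. A minimal sketch against the hypothetical /items endpoint used earlier:

```python
from locust import HttpUser, task, between


class ValidatingAPIUser(HttpUser):
    host = "http://api.example.com"  # Hypothetical API base URL
    wait_time = between(1, 5)

    @task
    def get_items_and_validate(self):
        # catch_response=True lets the task mark the request as failed
        # even when the HTTP status code is 2xx
        with self.client.get("/items", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Unexpected status code: {response.status_code}")
            elif not response.json():
                response.failure("Response contained no items")
            else:
                response.success()
```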