Benchmarking REST APIs in Python: A Practical Guide with Locust and Requests
Evaluating the performance of RESTful APIs under varying load conditions is essential for ensuring reliability, scalability, and a positive user experience. This process, known as API benchmarking or load testing, involves simulating multiple users interacting with an API simultaneously and measuring its response time, throughput, and error rate. Python is a widely adopted language for test automation, and two powerful libraries often used in conjunction for API benchmarking are Locust and Requests.
Understanding API Benchmarking
API benchmarking quantifies an API’s performance characteristics under specific load profiles. Key metrics typically measured include:
- Response Time (Latency): The time taken for the API to respond to a request. This is often measured in milliseconds (ms) and reported as average, median, minimum, maximum, and percentiles (e.g., 90th, 95th, 99th percentile). Lower response times indicate better performance.
- Throughput: The number of requests the API can handle per unit of time, usually measured in requests per second (Req/s). Higher throughput indicates greater capacity.
- Error Rate: The percentage of requests that result in an error response (e.g., HTTP status codes 5xx). A high error rate under load can indicate stability issues or resource exhaustion.
- Concurrency: The number of simultaneous active users or requests the API can handle effectively.
Benchmarking helps identify performance bottlenecks, determine infrastructure requirements for expected traffic, validate service level agreements (SLAs), and ensure stability before deploying or after updating an API.
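To make these metrics concrete, the short sketch below computes the average, median, and 95th-percentile latency from a list of response times using only Python's standard library; the sample values are illustrative, not real measurements.

```python
import statistics

# Illustrative response times in milliseconds (not real measurements)
response_times_ms = [112, 98, 105, 130, 97, 250, 101, 99, 340, 110]

average = statistics.mean(response_times_ms)
median = statistics.median(response_times_ms)

# quantiles(n=100) returns 99 cut points dividing the data into 100 groups;
# index 94 is the 95th percentile
p95 = statistics.quantiles(response_times_ms, n=100)[94]

print(f"avg={average:.1f} ms  median={median:.1f} ms  p95={p95:.1f} ms")
```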
Essential Tools: Locust and Requests
- Locust: Locust is an open-source, Python-based load testing tool. It defines user behavior with Python code, making test scenarios highly customizable and expressive. Key features include:
- Defining user tasks and their probability of execution.
- Simulating large numbers of concurrent users.
- Providing a web-based user interface for controlling tests and viewing real-time statistics.
- Supporting distributed testing across multiple machines.
- Automatic reporting of response times, requests per second, and failure rates.
- Requests: The Requests library is the de facto standard for making HTTP requests in Python. It provides a simple, user-friendly API for interacting with web services. While Locust provides a client (self.client) within its user classes that wraps the requests library and automatically logs statistics, understanding the underlying functionality of requests is beneficial for more complex scenarios.
Using Locust with Requests (specifically via Locust’s self.client) combines Locust’s load generation and reporting capabilities with Requests’ robust HTTP handling, creating a powerful and flexible benchmarking setup.
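For quick one-off checks outside of a load test, the Requests library can also be used directly to time a single call. A minimal sketch, reusing the hypothetical http://api.example.com host that appears in the examples below:

```python
import requests

# Hypothetical endpoint matching the example API used below
url = "http://api.example.com/items"

response = requests.get(url, timeout=10)

# .elapsed measures the time from sending the request until the response
# headers were parsed, giving a rough single-sample latency figure
print(f"Status: {response.status_code}")
print(f"Latency: {response.elapsed.total_seconds() * 1000:.1f} ms")
```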
Setting up the Benchmarking Environment
To begin benchmarking, ensure Python is installed on the system. The necessary libraries can be installed using pip, Python’s package installer:
```bash
pip install locust requests
```
This command installs both Locust and the Requests library.
Defining User Behavior with Locust
Benchmarking with Locust involves creating a Python file (conventionally named locustfile.py) that defines the simulated user behavior. This file contains one or more User classes, typically inheriting from HttpUser when testing HTTP APIs.
An HttpUser class represents a type of user interacting with the system. Within this class, tasks are defined using the @task decorator. Each @task method represents an action a simulated user might perform.
```python
from locust import HttpUser, task, between


class APIUser(HttpUser):
    """
    User class that makes requests to the API
    """
    # The host attribute specifies the base URL for requests
    host = "http://api.example.com"  # Replace with your API base URL

    # wait_time defines the time a user waits between executing tasks
    # between(1, 5) means wait between 1 and 5 seconds
    wait_time = between(1, 5)

    @task
    def get_items(self):
        """
        Task to perform a GET request to the /items endpoint
        """
        # self.client is a Requests-based client provided by Locust that logs stats
        self.client.get("/items")

    @task
    def create_item(self):
        """
        Task to perform a POST request to create an item
        """
        item_data = {"name": "test_item", "price": 10.0}
        self.client.post("/items", json=item_data)
```
- HttpUser: Specifies that this user interacts over HTTP.
- host: Defines the base URL for the API being tested. This allows tasks to use relative paths (e.g., /items).
- wait_time: Controls the “think time” between tasks for each simulated user, making the load pattern more realistic. between(min, max) sets a random wait time within that range.
- @task: Decorator indicating that a method represents a user task. Locust picks tasks to execute based on their weight (the default weight is 1 if none is specified); a short sketch of weighted tasks follows below.
- self.client: A key object provided by Locust’s HttpUser. It acts like a requests.Session object but automatically reports request performance (response time, status code, size, etc.) to Locust’s statistics. Methods like self.client.get(), self.client.post(), self.client.put(), and self.client.delete() are available and work similarly to their requests counterparts.
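As mentioned above, tasks can be given explicit weights so that some actions run more often than others. A minimal sketch, reusing the same hypothetical /items endpoints:

```python
from locust import HttpUser, task, between


class WeightedAPIUser(HttpUser):
    host = "http://api.example.com"  # Hypothetical API base URL
    wait_time = between(1, 5)

    @task(3)  # Weight 3: reads are picked roughly three times as often
    def get_items(self):
        self.client.get("/items")

    @task(1)  # Weight 1: occasional write
    def create_item(self):
        self.client.post("/items", json={"name": "test_item", "price": 10.0})
```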
Adding Data and Headers to Requests
The self.client methods support arguments similar to requests methods for sending data, setting headers, and handling other HTTP specifics.
```python
    @task
    def update_item(self):
        """
        Task to perform a PUT request to update an item
        """
        item_id = 123  # Example item ID
        update_data = {"price": 12.5}
        headers = {"X-My-Header": "value"}
        self.client.put(f"/items/{item_id}", json=update_data, headers=headers)

    @task
    def get_filtered_items(self):
        """
        Task to perform a GET request with query parameters
        """
        params = {"status": "active", "limit": 10}
        self.client.get("/items", params=params)
```
Running the Locust Test
Once the locustfile.py is created, the test can be run from the terminal.
Navigate to the directory containing locustfile.py and execute:
```bash
locust -f locustfile.py
```
By default, Locust starts a web interface accessible at http://localhost:8089. Open this URL in a web browser.
The web UI allows configuring the test:
- Number of users to simulate: The total number of concurrent users.
- Spawn rate: The number of users to start per second until the total user count is reached.
- Host: The base URL of the system under test (this can override the host attribute in locustfile.py).
After configuring, click “Start swarming!”.
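For automated or scripted runs (for example in CI), the same test can also be started without the web UI using Locust's headless mode, with the user count, spawn rate, and duration given on the command line; an illustrative invocation:

```bash
# Hypothetical run: 50 users, spawned 5 per second, for 2 minutes
locust -f locustfile.py --headless -u 50 -r 5 --run-time 2m --host http://api.example.com
```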
Interpreting Locust Results
The Locust web UI provides real-time statistics during the test run:
| Metric | Description | Insight |
|---|---|---|
| Requests per second | The rate at which requests are being processed by the API. | Indicates API throughput. Higher values suggest better capacity. |
| Response Time (Avg) | The arithmetic mean of response times for a specific endpoint. | Provides a general sense of typical latency. |
| Response Time (Median) | The middle value of response times. Less affected by outliers than the average. | A good indicator of the typical user experience regarding latency. |
| Response Time (Percentiles) | e.g., the 90th percentile is the time within which 90% of requests completed. | Crucial for understanding tail latency. High percentile values mean a meaningful share of requests are much slower than the average suggests, pointing to potential issues for those users. |
| Failure Rate (%) | The percentage of requests resulting in non-success HTTP status codes. | Directly indicates API stability and error handling under load. Should be 0% or near 0% for healthy APIs. |
| Total Requests | The cumulative count of requests sent. | Shows the volume of traffic generated during the test. |
| Total Failures | The cumulative count of failed requests. | Helps pinpoint the specific endpoint(s) experiencing issues. |
Analyzing these metrics together provides a comprehensive view of API performance under the simulated load. For instance, increasing the number of users and observing a significant rise in average response time or failure rate indicates that the API is struggling to handle the increased load.
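One way to automate this “step up the load and watch the metrics” approach is Locust's load test shape feature, in which a class placed in the locustfile controls the user count over time. A minimal step-load sketch with illustrative numbers:

```python
from locust import LoadTestShape


class StepLoadShape(LoadTestShape):
    """Add 10 users every 60 seconds up to 100 users, then stop the test."""

    step_time = 60    # seconds per step
    step_users = 10   # users added at each step
    spawn_rate = 10   # users started per second when stepping up
    max_users = 100

    def tick(self):
        run_time = self.get_run_time()
        # Stop once every step has had its full duration
        if run_time > self.step_time * (self.max_users // self.step_users):
            return None
        current_step = int(run_time // self.step_time) + 1
        users = min(current_step * self.step_users, self.max_users)
        return (users, self.spawn_rate)
```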
Concrete Example: Benchmarking JSONPlaceholder
JSONPlaceholder is a free online REST API for testing and prototyping. Benchmarking its /posts endpoint with GET requests serves as a practical example.
locustfile_jsonplaceholder.py:
```python
from locust import HttpUser, task, between


class JSONPlaceholderUser(HttpUser):
    host = "https://jsonplaceholder.typicode.com"
    wait_time = between(1, 5)

    @task
    def get_all_posts(self):
        """
        Fetch all posts
        """
        self.client.get("/posts")

    @task
    def get_first_post(self):
        """
        Fetch a specific post by ID
        """
        self.client.get("/posts/1")

    @task
    def get_posts_by_user(self):
        """
        Fetch posts filtered by user ID
        """
        self.client.get("/posts", params={"userId": 1})
```
To run this:
```bash
locust -f locustfile_jsonplaceholder.py
```
Open http://localhost:8089, set the desired user count and spawn rate, and start the test. The Locust UI will then show real-time performance metrics for the /posts and /posts/1 endpoints under the simulated load. For example, under moderate load, JSONPlaceholder’s response times are expected to be low (tens to hundreds of milliseconds) and the failure rate near zero, demonstrating typical performance for a simple, well-performing API endpoint. Increased load against a less robust API would reveal higher latencies and potential errors.
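Each simulated user can also perform one-time setup in on_start, a hook Locust calls when a user starts. As a variant of the user class above, the sketch below picks a random JSONPlaceholder user ID once per simulated user and reuses it across requests (the grouping name is an illustrative choice):

```python
import random

from locust import HttpUser, task, between


class RandomizedJSONPlaceholderUser(HttpUser):
    host = "https://jsonplaceholder.typicode.com"
    wait_time = between(1, 5)

    def on_start(self):
        # Called once when this simulated user starts; JSONPlaceholder has user IDs 1-10
        self.user_id = random.randint(1, 10)

    @task
    def get_my_posts(self):
        # Group every user-specific request under a single stats entry
        self.client.get(
            "/posts",
            params={"userId": self.user_id},
            name="/posts?userId=[id]",
        )
```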
Case Study: Benchmarking a Microservice API Gateway
Consider a scenario involving an API Gateway that aggregates data from several downstream microservices. Benchmarking this gateway requires simulating realistic user flows that involve multiple API calls in sequence or parallel.
A Locust task could be designed to mimic a user loading a dashboard:
```python
from locust import HttpUser, task, between


class DashboardUser(HttpUser):
    host = "http://api-gateway.internal.network"  # API Gateway URL
    wait_time = between(0.5, 3)

    @task
    def load_dashboard(self):
        """
        Simulate loading a user dashboard, requiring multiple API calls
        """
        user_id = 101  # Example user ID - could be randomized in a real test

        # Call 1: Get user profile
        self.client.get(f"/users/{user_id}/profile", name="Get User Profile")

        # Call 2: Get user's recent activity feed
        self.client.get(f"/users/{user_id}/activity", name="Get Activity Feed")

        # Call 3: Get user's notifications
        self.client.get(f"/users/{user_id}/notifications", name="Get Notifications")

        # Optionally, perform a POST request after fetching data
        # self.client.post(f"/users/{user_id}/log_view", json={"dashboard": "viewed"})
```
In this example, the name parameter in self.client.get() is used to group statistics in the Locust UI, making it easier to analyze the performance of individual API calls within a complex user flow. Running this test under increasing load reveals the performance characteristics of each underlying API call (/users/{user_id}/profile, /users/{user_id}/activity, etc.), helping pinpoint which specific service or endpoint causes bottlenecks under load. Locust does not report the total dashboard load time by default, since each request is logged individually, but custom timing of the whole flow is possible (a sketch follows below). Analysis would focus on which of these individual requests experiences the highest latency or failure rate as the number of concurrent users increases.
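As noted above, the total dashboard time is not reported automatically, but it can be recorded as its own entry by timing the calls and firing Locust's request event manually. A sketch of that pattern (the event arguments follow Locust 2.x conventions; the "FLOW" request type and entry name are illustrative):

```python
import time

from locust import HttpUser, task, between


class TimedDashboardUser(HttpUser):
    host = "http://api-gateway.internal.network"  # API Gateway URL
    wait_time = between(0.5, 3)

    @task
    def load_dashboard(self):
        user_id = 101  # Example user ID
        start = time.perf_counter()

        self.client.get(f"/users/{user_id}/profile", name="Get User Profile")
        self.client.get(f"/users/{user_id}/activity", name="Get Activity Feed")
        self.client.get(f"/users/{user_id}/notifications", name="Get Notifications")

        total_ms = (time.perf_counter() - start) * 1000
        # Report the whole flow as its own entry in the Locust statistics
        self.environment.events.request.fire(
            request_type="FLOW",
            name="Load Dashboard (total)",
            response_time=total_ms,
            response_length=0,
            exception=None,
            context={},
        )
```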
Key Takeaways and Actionable Insights
- API Benchmarking is Critical: Understanding how an API performs under load is non-negotiable for building reliable and scalable systems.
- Locust and Requests Synergy: Locust provides the load generation framework and reporting, while Requests (via self.client) handles the HTTP communication seamlessly, offering a powerful and Pythonic approach to benchmarking.
- Realistic User Behavior: Define tasks in locustfile.py that accurately reflect how real users interact with the API, including request types, data payloads, headers, and wait times.
- Analyze Key Metrics: Focus on Requests per second, Response Time (especially percentiles), and Failure Rate in the Locust UI to diagnose performance characteristics and identify bottlenecks.
- Iterative Testing: Start with a small number of users and gradually increase the load to observe how performance degrades.
- Test Environment Matters: Perform benchmarking on an environment that closely mirrors the production setup in terms of hardware, network, and data volume for meaningful results.
- Group Requests: Use the name parameter in self.client calls to logically group requests for clearer statistics when simulating complex multi-step user flows.
- Don’t Just Load, Validate: While self.client automatically records response status codes, incorporate checks within tasks (e.g., verifying response content) to ensure the API is not just responding quickly but also responding correctly under load; a sketch of response validation follows this list.
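For the last point, Locust's catch_response option lets a task decide for itself whether a request should count as a success or a failure based on the response content. A minimal sketch against the hypothetical /items endpoint used earlier:

```python
from locust import HttpUser, task, between


class ValidatingAPIUser(HttpUser):
    host = "http://api.example.com"  # Hypothetical API base URL
    wait_time = between(1, 5)

    @task
    def get_items_and_validate(self):
        # catch_response=True lets the task mark the request as failed
        # even when the HTTP status code is 2xx
        with self.client.get("/items", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Unexpected status code: {response.status_code}")
            elif not response.json():
                response.failure("Response contained no items")
            else:
                response.success()
```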