
Building an API Uptime Dashboard with Python and Streamlit#

Monitoring the availability and performance of application programming interfaces (APIs) is critical for maintaining reliable systems. Failures in unmonitored APIs can go undetected, leading to prolonged outages, degraded user experience, and significant business impact. An API uptime dashboard provides a centralized view of the status of various API endpoints, offering immediate visibility into their health. This article outlines the process of building such a dashboard using Python and the Streamlit framework.

Understanding API Uptime Monitoring#

API uptime refers to the percentage of time an API is available and functioning correctly. Monitoring involves repeatedly checking an API endpoint to determine its responsiveness and the nature of its response. Key metrics collected during monitoring include:

  • Availability: Determined by the HTTP status code returned. A status code in the 2xx range (e.g., 200 OK) typically indicates success, while 4xx (client errors) or 5xx (server errors) indicate issues.
  • Latency: The time taken for the API to respond to a request. High latency can indicate performance degradation even if the API is technically “available”.
  • Response Content: Verification that the API returns the expected data or structure, providing a deeper check than just the status code.

Consistent monitoring generates data points over time, which can be analyzed and visualized to understand trends, identify recurring issues, and calculate uptime percentages.
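
For the response-content check mentioned above, a monitor can parse the body and verify an expected field rather than relying on the status code alone. A minimal sketch, assuming a hypothetical endpoint that returns a JSON body with a "status" field:

import requests

def check_response_content(url):
    """Return True if the endpoint responds with the expected JSON structure."""
    try:
        response = requests.get(url, timeout=10)
        payload = response.json()
        # Hypothetical contract: a healthy endpoint returns {"status": "ok", ...}
        return response.ok and payload.get("status") == "ok"
    except (requests.RequestException, ValueError):
        # Network failure or a non-JSON body counts as a failed check
        return False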

Why Python and Streamlit for an Uptime Dashboard?#

Python is a versatile programming language with robust libraries for making HTTP requests (requests) and handling data (pandas). Streamlit is an open-source Python library that simplifies the creation of custom web applications and dashboards with minimal code, making it ideal for rapidly building interactive data visualizations without needing extensive web development knowledge.

Combining Python’s monitoring capabilities with Streamlit’s ease of dashboard creation offers an efficient approach to developing a custom API uptime monitoring tool tailored to specific needs.

Essential Concepts for the Dashboard#

Developing an API uptime dashboard involves several core components:

  1. API Endpoint List: A defined list of URLs to be monitored.
  2. Monitoring Logic: A script or function that sends requests to each API endpoint and records the result (status code, latency, timestamp).
  3. Data Storage: A method to store the monitoring results persistently or in-memory for display.
  4. Scheduling: A mechanism to run the monitoring logic at regular intervals.
  5. Visualization: The dashboard interface that displays the collected data in a user-friendly format (tables, charts).

This guide focuses on the monitoring logic, data handling, and visualization using Python and Streamlit. External scheduling is a crucial operational consideration but sits outside the scope of the core Streamlit application code itself.

Building the Dashboard: A Step-by-Step Walkthrough#

This section details the process of creating the Python script and Streamlit application.

Step 1: Set Up the Environment#

Begin by installing the necessary Python libraries. A virtual environment is recommended to manage dependencies.

# Create a virtual environment
python -m venv venv
# Activate the environment (Linux/macOS)
source venv/bin/activate
# Activate the environment (Windows)
.\venv\Scripts\activate
# Install required libraries
pip install requests pandas streamlit
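
Alternatively, the dependencies can be recorded in a requirements.txt so the environment is reproducible (the version pins below are illustrative, not required):

# requirements.txt (illustrative pins)
requests>=2.31
pandas>=2.0
streamlit>=1.30

# Install from the file
pip install -r requirements.txt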

Step 2: Define the API Endpoints#

Create a Python list containing the URLs of the APIs to monitor.

api_endpoints = [
    "https://www.example.com/api/status",
    "https://api.another-service.com/v1/health",
    "https://api.yet-another.org/check"
]

(Note: Replace these with actual API endpoints to monitor).
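If the list grows or changes often, it can also be loaded from a configuration file instead of being hard-coded. A sketch assuming a hypothetical endpoints.json next to the script:

import json
from pathlib import Path

# endpoints.json is assumed to contain a JSON array of URL strings, e.g.
# ["https://www.example.com/api/status", "https://api.another-service.com/v1/health"]
api_endpoints = json.loads(Path("endpoints.json").read_text())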

Step 3: Implement the Monitoring Logic#

Write a Python function using the requests library to check a single API endpoint. This function should handle potential errors (like network issues) and record relevant data.

import requests
import time
from datetime import datetime

def check_api_status(url):
    """Checks the status and latency of a given URL."""
    status_code = None
    latency = None
    error = None
    timestamp = datetime.now().isoformat()
    try:
        start_time = time.time()
        response = requests.get(url, timeout=10)  # Set a timeout
        end_time = time.time()
        status_code = response.status_code
        latency = (end_time - start_time) * 1000  # Latency in milliseconds
        # Basic check for successful status codes
        if not 200 <= status_code < 300:
            error = f"HTTP Status: {status_code}"
    except requests.exceptions.Timeout:
        status_code = "Timeout"
        error = "Request timed out"
    except requests.exceptions.ConnectionError:
        status_code = "Error"
        error = "Connection Error"
    except Exception as e:
        status_code = "Error"
        error = f"An unexpected error occurred: {e}"
    return {
        "timestamp": timestamp,
        "url": url,
        "status_code": status_code,
        "latency_ms": round(latency, 2) if latency is not None else None,
        "error": error,
        "is_available": error is None  # True only when no error of any kind was recorded
    }
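
A quick standalone test of the function might look like this (the endpoint is a placeholder and the printed values are illustrative):

if __name__ == "__main__":
    result = check_api_status("https://www.example.com/api/status")
    print(result)
    # Example shape of the returned dictionary:
    # {'timestamp': '2025-06-30T12:00:00.000000', 'url': 'https://www.example.com/api/status',
    #  'status_code': 200, 'latency_ms': 142.37, 'error': None, 'is_available': True}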

Step 4: Store the Monitoring Results#

For a simple Streamlit application, storing data in-memory is the easiest approach. As the application runs, a list can accumulate the results of each check. For persistence across runs or historical analysis, a file (like CSV or JSON) or a simple database (like SQLite) would be necessary. This example uses an in-memory list, which resets each time the Streamlit app restarts.

import pandas as pd

# In-memory storage for demonstration
monitoring_results = []

# Function to perform checks for all APIs and store results
def perform_all_checks(endpoints):
    current_results = []
    for url in endpoints:
        result = check_api_status(url)
        current_results.append(result)
        monitoring_results.append(result)  # Add to historical list
    return current_results  # Return only the latest results for immediate display
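
If results need to survive an application restart, one lightweight option mentioned above is appending each check as a line of JSON to a file. A sketch (the filename is arbitrary):

import json

def append_result(result, path="monitoring_results.jsonl"):
    """Append a single check result as one JSON object per line."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(result) + "\n")

def load_results(path="monitoring_results.jsonl"):
    """Load all previously recorded results, if the file exists."""
    try:
        with open(path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]
    except FileNotFoundError:
        return []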

Step 5: Create the Streamlit Interface#

Build the Streamlit application file (e.g., dashboard.py). This file defines how the collected data is presented.

import streamlit as st
import pandas as pd

# Assume check_api_status and api_endpoints are defined above (Steps 2 and 3)

st.set_page_config(layout="wide")  # Use wide layout
st.title("API Uptime Dashboard")

# In-memory storage kept in session state: it persists across reruns triggered
# by user interaction, but resets when the Streamlit app restarts.
if 'monitoring_results' not in st.session_state:
    st.session_state.monitoring_results = []
if 'latest_check_results' not in st.session_state:
    st.session_state.latest_check_results = []

# Function to display the results of the most recent batch of checks
def display_latest_status(latest_results):
    st.subheader("Latest Check Results")
    if latest_results:
        latest_df = pd.DataFrame(latest_results)
        latest_df_display = latest_df[['url', 'status_code', 'latency_ms', 'error', 'is_available']]
        st.dataframe(latest_df_display, use_container_width=True)
    else:
        st.info("No latest check data available. Click 'Run Checks Now'.")

# Function to display historical data (from session state)
def display_historical_data():
    st.subheader("Historical Data")
    if not st.session_state.monitoring_results:
        st.info("No historical data yet. Run some checks first.")
        return
    history_df = pd.DataFrame(st.session_state.monitoring_results)
    st.dataframe(history_df, use_container_width=True)

    # Basic uptime percentage calculation per endpoint
    st.subheader("Uptime Summary")
    uptime_summary = history_df.groupby('url')['is_available'].value_counts(normalize=True).unstack(fill_value=0)
    uptime_summary['uptime_percentage'] = uptime_summary.get(True, 0) * 100  # Use .get(True, 0) in case no 'True'
    st.dataframe(uptime_summary[['uptime_percentage']].round(2), use_container_width=True)

    # Example plot: latency over time for one API (select the first one for simplicity)
    if api_endpoints:
        st.subheader(f"Latency Trend for {api_endpoints[0]}")
        # Filter for the first API and successful checks with latency data
        first_api_history = history_df[
            (history_df['url'] == api_endpoints[0])
            & (history_df['latency_ms'].notna())
            & (history_df['is_available'])
        ].copy()
        if not first_api_history.empty:
            # Convert timestamp to datetime for plotting
            first_api_history['timestamp'] = pd.to_datetime(first_api_history['timestamp'])
            st.line_chart(first_api_history.set_index('timestamp')['latency_ms'])
        else:
            st.info(f"No successful checks with latency data for {api_endpoints[0]} to plot.")

# --- Streamlit App Structure ---
# Button to manually trigger a round of checks
if st.button("Run Checks Now"):
    st.session_state.latest_check_results = []  # Keep only the most recent batch here
    with st.status("Running checks...", expanded=True) as status:
        for url in api_endpoints:
            st.write(f"Checking {url}...")
            result = check_api_status(url)
            st.session_state.monitoring_results.append(result)    # Add to historical list
            st.session_state.latest_check_results.append(result)  # Add to the latest results list
        status.update(label="Checks Complete!", state="complete", expanded=False)

# Display the latest results, then the historical data and summary
display_latest_status(st.session_state.latest_check_results)
display_historical_data()

# Note: For automatic, periodic checks, this Streamlit app is not the scheduler.
# A separate process/script would need to run the checks and potentially
# update a persistent data store (file, DB) that this app reads.


Step 6: Run the Streamlit App#

Save the code as a Python file (e.g., dashboard.py) and run it from your terminal within the activated virtual environment:

streamlit run dashboard.py

This command starts a local web server and opens the dashboard in your browser. The “Run Checks Now” button will execute the monitoring logic and update the display with the latest and historical data stored in the session state.

Data Storage and Visualization Options#

Data Storage:

  • In-Memory (st.session_state): Simplest for demonstration. Data is lost when the Streamlit app restarts.
  • File Storage (CSV/JSON): Data persists between runs. Requires reading the file on startup and writing after each check. Scalability limited by file size and complexity of concurrent access if used with an external scheduler.
  • Simple Database (SQLite): More robust persistence. Supports concurrent access better than files. Requires a library like sqlite3. Ideal when an external script performs checks and writes to the DB, and the Streamlit app reads from it.
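
For the SQLite option, a minimal schema plus write/read helpers might look like the following sketch (the database filename, table, and column names are illustrative and mirror the fields returned by check_api_status):

import sqlite3
import pandas as pd

DB_PATH = "uptime.db"  # arbitrary filename

def init_db():
    """Create the results table if it does not exist yet."""
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("""
            CREATE TABLE IF NOT EXISTS checks (
                timestamp TEXT,
                url TEXT,
                status_code TEXT,
                latency_ms REAL,
                error TEXT,
                is_available INTEGER
            )
        """)

def save_result(result):
    """Insert one check result produced by check_api_status."""
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute(
            "INSERT INTO checks VALUES (?, ?, ?, ?, ?, ?)",
            (result["timestamp"], result["url"], str(result["status_code"]),
             result["latency_ms"], result["error"], int(result["is_available"])),
        )

def load_history():
    """Return the full check history as a DataFrame for the dashboard."""
    with sqlite3.connect(DB_PATH) as conn:
        return pd.read_sql_query("SELECT * FROM checks", conn)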

Visualization:

  • st.dataframe(): Excellent for displaying raw tabular data (latest checks, historical log).
  • st.line_chart() / st.bar_chart(): Useful for visualizing trends like latency over time or the distribution of status codes.
  • Conditional Formatting: Pandas DataFrames displayed by st.dataframe can be styled to highlight status codes or errors using DataFrame styling methods before passing to Streamlit (see the sketch after this list).
  • Custom Components: Streamlit allows embedding HTML or using custom components for more advanced visualizations if needed.
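
For the conditional formatting mentioned above, a pandas Styler can be passed directly to st.dataframe. A sketch that shades failed checks, with column names following the earlier result dictionaries:

import pandas as pd
import streamlit as st

def shade_failures(row):
    # Highlight the whole row when the check failed
    style = "background-color: #ffd6d6" if not row["is_available"] else ""
    return [style] * len(row)

history_df = pd.DataFrame(st.session_state.monitoring_results)
if not history_df.empty:
    st.dataframe(history_df.style.apply(shade_failures, axis=1), use_container_width=True)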

Real-World Application and Considerations#

Consider a software company providing a service that integrates with multiple third-party APIs (e.g., payment gateways, shipping carriers, social media platforms). The reliability of these external APIs directly impacts the company’s service quality.

  • Use Case: Building a simple API uptime dashboard allows the operations or development team to have immediate visibility into the status of these critical dependencies. Instead of waiting for customer reports or internal errors, they can proactively identify outages or performance issues with a quick glance at the dashboard.
  • Implementation:
    1. A separate Python script runs periodically (e.g., every 5 minutes via cron or a cloud scheduler like AWS EventBridge).
    2. This script uses the check_api_status function for each required endpoint.
    3. Results are appended to a persistent data store, such as a SQLite database file or written to a simple log file (e.g., in JSON format, one object per line).
    4. The Streamlit application dashboard.py reads the latest data from this persistent store on startup and provides a button to refresh the view, querying the store again. It displays the current status, historical logs, and potentially calculates uptime percentages and plots latency trends from the stored data.

This decoupled architecture (scheduler + monitoring script separate from the dashboard) is more scalable and reliable for continuous monitoring than relying solely on a Streamlit app’s interactivity.
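
A minimal sketch of such a standalone monitoring script follows. It assumes that check_api_status (Step 3) and the SQLite helpers from the earlier sketch live in hypothetical modules named monitoring and storage; a scheduler such as cron then runs the script every few minutes.

# monitor.py -- run by a scheduler, e.g. cron:
# */5 * * * * /path/to/venv/bin/python /path/to/monitor.py
from monitoring import check_api_status   # hypothetical module containing Step 3's function
from storage import init_db, save_result  # hypothetical module containing the SQLite helpers

api_endpoints = [
    "https://www.example.com/api/status",
    "https://api.another-service.com/v1/health",
]

if __name__ == "__main__":
    init_db()
    for url in api_endpoints:
        save_result(check_api_status(url))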

Further Considerations:

  • Authentication: Monitoring private APIs requires handling authentication (API keys, OAuth tokens). The requests library supports various authentication methods (see the sketch after this list).
  • Advanced Checks: Beyond basic HTTP GET, monitoring might involve checking specific response headers, validating JSON structure, or performing POST requests.
  • Alerting: A dashboard provides visibility, but for critical APIs, integrating alerting (email, Slack, PagerDuty) based on failed checks is essential. This requires additional logic in the monitoring script.
  • Scaling: Monitoring hundreds or thousands of APIs requires a more robust architecture, potentially involving message queues, distributed workers, and a more powerful database, moving beyond the scope of a simple Python/Streamlit app towards dedicated monitoring solutions.
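
For authenticated endpoints, the requests calls inside the monitoring function can carry credentials. Two common patterns, sketched below; the URLs, header names, and credentials are placeholders:

import requests

# API key or bearer token sent as a request header
response = requests.get(
    "https://api.example.com/private/health",
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},
    timeout=10,
)

# HTTP Basic authentication, supported directly by requests
response = requests.get(
    "https://api.example.com/private/health",
    auth=("monitor-user", "monitor-password"),
    timeout=10,
)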

Key Takeaways#

  • Building an API uptime dashboard provides crucial visibility into the health of critical service dependencies.
  • Python, with libraries like requests and pandas, is effective for implementing monitoring logic and data handling.
  • Streamlit enables rapid development of interactive web dashboards with minimal code, ideal for visualizing API status.
  • A basic dashboard can be built using Streamlit’s data display (st.dataframe) and charting (st.line_chart) capabilities.
  • For continuous monitoring, the check scheduling should ideally be handled by a separate process or service, not the Streamlit application itself.
  • Persistent data storage (files, SQLite) is necessary for retaining historical monitoring data across application runs.
  • Real-world applications require considering authentication, more complex checks, alerting, and potential scaling requirements.