A Modern Guide to Prometheus Docker Compose
March 17, 2026 • CloudCops

Pairing Prometheus with Docker Compose is a fast, no-nonsense way to stand up a powerful monitoring stack. Using a single docker-compose.yml file, you can define and launch your entire observability setup—including Prometheus, Grafana, and all your exporters. It’s the perfect solution for teams that need rapid deployment without jumping into the deep end with Kubernetes.
This approach gives you the best of both worlds: the simplicity of Docker Compose and the production-grade power of Prometheus.
Why Use Prometheus with Docker Compose
For many teams, especially in startups and small to mid-sized businesses, this combination hits a perfect sweet spot. It delivers robust, open-source monitoring without the operational overhead that comes with a full-blown container orchestrator like Kubernetes.
The real magic is in its declarative nature. Your entire monitoring configuration lives in version control, making your setup reproducible, portable, and dead simple to manage. With a single docker-compose up command, you can spin up a complete observability stack—a process that would otherwise take hours of manual, error-prone configuration. That kind of agility is a massive advantage when you need to get monitoring up and running for a new project today, not next week.
The Core Benefits for Modern Teams
This setup is more than just a convenience; it delivers real advantages that fit right into modern DevOps workflows. We see teams choose this path for a few critical reasons:
- Rapid Deployment: You can go from zero to a fully functional monitoring stack in minutes. The docker-compose.yml file acts as a blueprint, ensuring total consistency across your development, staging, and production environments.
- Version-Controlled Configuration: Storing your docker-compose.yml and prometheus.yml files in Git means you get a crystal-clear audit trail. You can track every change, collaborate with teammates, and roll back to a known-good state without breaking a sweat.
- Portability and Flexibility: A Docker Compose stack isn't locked into a specific cloud provider or piece of hardware. You can run it on your laptop, an on-prem server, or any cloud VM that has Docker installed. For a deeper look into container runtimes, our guide on Docker vs. Podman is a great resource.
- Cost-Effectiveness: Using open-source titans like Prometheus and Grafana gives you immense power without the eye-watering price tags of proprietary tools. This is a huge driver for adoption, especially for growing companies.
The numbers don't lie. Recent industry surveys show 76% of organizations using open-source observability point to cost savings as the number one benefit. Many are replacing six-figure bills from proprietary vendors with a simple, self-hosted stack like this one.
A Look at the Key Components
A typical Prometheus Docker Compose stack isn't just one service; it's a few essential components working together in harmony. Docker Compose is the conductor, making sure each piece plays its part.
This table gives a quick overview of the essential services in our monitoring setup and their roles.
Core Components in a Prometheus Docker Compose Stack
| Component | Primary Function | Default Port |
|---|---|---|
| Prometheus | The heart of the stack. It scrapes and stores time-series metrics from all your configured targets. | 9090 |
| Grafana | The visualization layer. It queries Prometheus and transforms raw metrics into beautiful, actionable dashboards. | 3000 |
| Node Exporter | An official exporter that provides deep, host-level system metrics like CPU, memory, disk I/O, and network stats. | 9100 |
| cAdvisor | A tool from Google that dives into container-specific resource usage and performance metrics. Essential for monitoring Docker itself. | 8080 |
| Alertmanager | Manages alerts sent by Prometheus. It deduplicates, groups, and routes them to the right places, like Slack, PagerDuty, or email. | 9093 |
Each of these components is a best-in-class tool on its own. When orchestrated by Docker Compose, they form a cohesive and powerful monitoring platform that's ready for production workloads.
Alright, we've covered the theory. Now it's time to get our hands dirty and build the monitoring stack. The entire setup is orchestrated from a single docker-compose.yml file—think of it as the blueprint for our infrastructure.
We'll start with the essentials: Prometheus to scrape and store metrics, Node Exporter for host-level data, and cAdvisor for container-specific insights. The goal here is a self-contained stack you can launch with one command. That means defining the services, the network they communicate on, and the volumes that keep their data from disappearing.
Here’s a quick look at the data flow we're building. It's a simple, effective pattern that's incredibly reliable.

Exporters collect raw data, Prometheus pulls it in for storage, and Grafana (which we'll add later) queries it for visualization. This linear flow is what makes the stack so easy to troubleshoot when things go wrong.
The Docker Compose Blueprint
First, create a docker-compose.yml file. This is where we'll define our three core services.
```yaml
version: '3.8'  # optional with Compose v2, kept for older docker-compose binaries

volumes:
  prometheus_data: {}

networks:
  monitoring:
    driver: bridge

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      # Note: the old --web.console.libraries and --web.console.templates
      # flags were removed in Prometheus 3.x (which :latest now ships),
      # so they are omitted here.
    ports:
      - "9090:9090"
    networks:
      - monitoring
    restart: always

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc|rootfs/var/lib/docker/containers|rootfs/var/lib/docker/overlay2|rootfs/run/docker/netns|rootfs/var/lib/docker/aufs)($$|/)'
    ports:
      - "9100:9100"
    networks:
      - monitoring
    restart: always

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    ports:
      - "8080:8080"
    networks:
      - monitoring
    restart: always
```
Notice how all three services are placed on a shared Docker network named monitoring. This is a critical detail. It allows Prometheus to discover and scrape the exporters using their service names (node-exporter:9100 and cadvisor:8080) as if they were DNS-resolvable hostnames.
Why Persistent Data Is Non-Negotiable
Pay close attention to the volume configuration in the Prometheus service:
prometheus_data:/prometheus
This line maps a Docker-managed named volume called prometheus_data to the /prometheus directory inside the container, which is where Prometheus stores its time-series database (TSDB).
Without a persistent volume, all your collected metrics would be wiped out every time the Prometheus container restarts. Using a named volume ensures your historical data survives container restarts, updates, and even system reboots. This is the difference between a toy project and a production-ready monitoring setup.
I learned this the hard way years ago after a server crash wiped out a week's worth of crucial performance data. It's a mistake you only make once.
Telling Prometheus What to Scrape
With our services defined, we now need to tell Prometheus what to actually monitor. This is handled in a separate configuration file, which we'll call prometheus.yml.
Create this file in the same directory as your docker-compose.yml.
```yaml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']

  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
```
Let's quickly break this down.
- global.scrape_interval: This sets the default frequency for scraping metrics to every 15 seconds. It’s a reasonable default for most use cases.
- scrape_configs: This is an array where each entry defines a scrape "job."
- The prometheus job is for self-monitoring—Prometheus scrapes its own metrics.
- The node-exporter and cadvisor jobs tell Prometheus to hit our two exporters using their container names and default ports.
These foundational ideas about collecting and storing time-series data extend far beyond just server metrics. The same principles apply to more complex areas, like enhancing data observability for user behavior or application events.
With these two files—docker-compose.yml and prometheus.yml—in the same directory, you're ready to go. Open a terminal, navigate to your project folder, and run docker-compose up -d. Your entire monitoring platform will spin up, ready to start collecting data.
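With the stack running, a quick way to confirm that Prometheus can actually reach its targets is to query its HTTP API at /api/v1/targets. Here is a minimal sketch of parsing that response — the endpoint and field names are Prometheus's real API, but the sample payload below is fabricated for illustration:

```python
# Hedged sketch: list unhealthy scrape targets from the JSON that Prometheus
# serves at /api/v1/targets. The sample payload is fabricated for illustration.
def down_targets(payload):
    """Return the scrape URLs of every target whose health is not 'up'."""
    return [
        t["scrapeUrl"]
        for t in payload["data"]["activeTargets"]
        if t["health"] != "up"
    ]

sample = {"data": {"activeTargets": [
    {"scrapeUrl": "http://node-exporter:9100/metrics", "health": "up"},
    {"scrapeUrl": "http://cadvisor:8080/metrics", "health": "down"},
]}}
print(down_targets(sample))  # -> ['http://cadvisor:8080/metrics']
```

Against a live stack, you would fetch the payload with something like json.load(urllib.request.urlopen("http://localhost:9090/api/v1/targets")) instead of using a hardcoded sample.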
Visualizing Data and Creating Smart Alerts
Collecting metrics is just the first step. Raw numbers sitting in a time-series database are useless until you can see them, and alerts are just noise until they're intelligent. This is where we turn our passive data collection stack into a proactive, responsive system.
By adding Grafana for visualization and Alertmanager for smart notifications, we close the loop. We’re about to transform raw data into dashboards that tell a story and alerts that pinpoint real problems.

We'll expand our existing docker-compose setup to bring in these two new services. Grafana will pull data from Prometheus, and Alertmanager will take alerts from Prometheus and route them to places where humans will actually see them—like a Slack channel.
Integrating Grafana and Alertmanager
It’s time to update our docker-compose.yml to bring these new services online. We’ll add service definitions for both Grafana and Alertmanager and make sure they join our monitoring network so they can talk to Prometheus.
Here are the new services to add to your docker-compose.yml file:
```yaml
# In your docker-compose.yml
volumes:
  # ... existing volumes
  grafana_data: {}

services:
  # ... existing prometheus, node-exporter, cadvisor services

  grafana:
    image: grafana/grafana-oss:latest
    container_name: grafana
    volumes:
      - grafana_data:/var/lib/grafana
    ports:
      - "3000:3000"
    networks:
      - monitoring
    restart: always

  alertmanager:
    image: prom/alertmanager:latest
    container_name: alertmanager
    volumes:
      - ./alertmanager.yml:/etc/alertmanager/alertmanager.yml
    command:
      - '--config.file=/etc/alertmanager/alertmanager.yml'
      - '--storage.path=/alertmanager'
    ports:
      - "9093:9093"
    networks:
      - monitoring
    restart: always
```
Take note of the grafana_data persistent volume. This is absolutely critical. Without it, you'll lose all your dashboards and configurations every time the container restarts. A quick docker-compose up -d will launch these new containers alongside our existing ones.
Configuring Your Alerting Pipeline
With the services running, we now have to configure the actual pipeline. This is a two-part job: first, we tell Prometheus what to fire an alert for, and second, we tell Alertmanager where to send that alert.
Start by creating a basic alertmanager.yml file in the same directory. This example configures a default route that sends all notifications to a Slack webhook.
```yaml
# alertmanager.yml
global:
  slack_api_url: 'YOUR_SLACK_WEBHOOK_URL'

route:
  receiver: 'slack-notifications'

receivers:
  - name: 'slack-notifications'
    slack_configs:
      - channel: '#alerts-channel'
        send_resolved: true
```
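Once Alertmanager is up, you can exercise the Slack route end to end by pushing a synthetic alert at its v2 API (POST /api/v2/alerts). Below is a hedged sketch that only builds the payload — the endpoint is Alertmanager's real API, but the alert name is made up:

```python
# Hedged sketch: build a synthetic alert in the shape Alertmanager's
# v2 API expects, so you can verify the notification route works.
from datetime import datetime, timedelta, timezone
import json

def make_test_alert(alertname, severity="warning", minutes=5):
    """Return a one-element alert list for POST /api/v2/alerts."""
    now = datetime.now(timezone.utc)
    return [{
        "labels": {"alertname": alertname, "severity": severity},
        "annotations": {"summary": f"Synthetic test alert: {alertname}"},
        "startsAt": now.isoformat(),
        "endsAt": (now + timedelta(minutes=minutes)).isoformat(),
    }]

print(json.dumps(make_test_alert("PipelineSmokeTest"), indent=2))
# With the stack running, send the printed JSON with e.g.:
#   curl -XPOST -H 'Content-Type: application/json' \
#     -d '<the JSON above>' http://localhost:9093/api/v2/alerts
```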
Next, we need to give Prometheus some rules. Create a new file named alert.rules.yml to define what constitutes a problem.
```yaml
# alert.rules.yml
groups:
  - name: HostAlerts
    rules:
      - alert: HighCpuUsage
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100) > 80
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High CPU usage detected on {{ $labels.instance }}"
          description: "CPU usage is above 80% for the last 5 minutes."
```
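The expr line is doing simple arithmetic: rate() over the "idle seconds" counter yields each core's idle fraction, avg by (instance) averages the cores per host, and 100 minus that (times 100) gives busy percent. A minimal Python sketch mirroring the same arithmetic, with made-up per-core values:

```python
# Hedged sketch of the arithmetic inside the HighCpuUsage expression:
# 100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[1m])) * 100)
def cpu_busy_percent(idle_rates):
    """idle_rates: per-core idle fractions (0.0-1.0) over the rate window."""
    avg_idle = sum(idle_rates) / len(idle_rates)
    return 100 - avg_idle * 100

# Four cores averaging 12.5% idle means the host is 87.5% busy, which
# would trip the > 80 threshold once it persists for the 5m "for" window:
print(cpu_busy_percent([0.125, 0.125, 0.125, 0.125]))  # -> 87.5
```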
Finally, we have to make Prometheus aware of our new rules file and the Alertmanager service. Two changes are needed: mount the rules file into the container by adding - ./alert.rules.yml:/etc/prometheus/alert.rules.yml to the prometheus service's volumes in docker-compose.yml (otherwise Prometheus can't see it), and update your prometheus.yml to tie it all together.
```yaml
# prometheus.yml
# ... global config

rule_files:
  - 'alert.rules.yml'

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 'alertmanager:9093'

# ... scrape_configs
```
This addition tells Prometheus to load our rules and where to forward any firing alerts. Restart your containers, and your new alerting pipeline will be active. For more advanced alerting ideas, check out our collection of Prometheus useful queries.
Visualizing Metrics with Grafana
Once you run docker-compose up -d, Grafana will be waiting for you at http://localhost:3000. The default login is admin / admin. The first thing you need to do is tell it where to get its data.
- Add Prometheus as a Data Source: Go to Configuration > Data Sources, click "Add data source," and choose Prometheus. Set the URL to http://prometheus:9090 (using the Docker service name). Click "Save & Test" to confirm it can connect.
- Import a Dashboard: Navigate to Dashboards > Browse and click "Import." The Grafana community has thousands of pre-built dashboards. A great one for Node Exporter is dashboard ID 1860, and for cAdvisor, try ID 13978.
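If you'd rather not click through the UI, Grafana can also pick up the data source declaratively: it reads files from /etc/grafana/provisioning/datasources at startup. A hedged sketch — the host-side directory is an assumption you'd mount into the grafana service (e.g., - ./grafana/provisioning:/etc/grafana/provisioning):

```yaml
# ./grafana/provisioning/datasources/prometheus.yml (hypothetical local path)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
```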
The ability to import community-built dashboards is a massive time-saver. You can go from zero to a comprehensive view of your host and container metrics in minutes, without writing a single query yourself. This is the real power of a strong open-source ecosystem.
The pairing of Prometheus and Docker Compose has become a cornerstone of modern open-source monitoring. Its 75% adoption rate in Kubernetes environments has led to 60% faster incident detection and 45% fewer production issues, according to 2026 DevOps benchmarks. In a CNCF survey of 628 IT professionals, 77% of organizations use Prometheus in their cloud-native stacks, almost always alongside Grafana and Alertmanager. This complete setup moves your system from being a passive data store to a proactive solution that finds problems for you.
Hardening Your Monitoring Stack for Production
Leaving Prometheus, Grafana, and Alertmanager open to the public internet is a mistake you can't afford to make. In development, it's tempting to expose the ports for quick access, but in production, it’s a massive security hole. Anyone could stumble upon your infrastructure metrics, alert configurations, or even start messing with your dashboards.
When you move a monitoring setup to production, security has to be the first thought, not an afterthought. The goal is to build a secure perimeter around your observability tools so that only authorized people can get in. This process, often called hardening, isn't optional for any serious deployment.

Implementing a Reverse Proxy
The single most effective way to lock down your monitoring UIs is to put them behind a reverse proxy. Tools like Nginx, Traefik, or Caddy are built for this. The proxy acts as a gatekeeper, intercepting every single request before it ever reaches your monitoring services.
This approach lets you centralize all your security rules in one place. Instead of trying to secure each application individually, you manage access at a single choke point. The simplest and most common security layer to add is HTTP Basic Authentication, which forces users to provide a username and password.
By removing the ports section for services like Prometheus (9090) and Grafana (3000) in your docker-compose.yml, you force all traffic through the proxy. This one change dramatically cuts down your attack surface and is a fundamental step in production hardening.
To get this done, you'll yank the ports mapping from your Prometheus, Grafana, and Alertmanager services. Then, you'll add a new service for your reverse proxy, which exposes ports 80 and 443 and handles all the routing and authentication internally.
Securing With Basic Auth and TLS
A reverse proxy is just the first step; it needs to be configured to actually do something. Your two main goals are enforcing authentication and encrypting traffic.
- Basic Authentication: This is your first line of defense. You can generate a user with a hashed password and tell your proxy to demand those credentials before granting access. It’s a simple but surprisingly effective way to keep out casual intruders.
- TLS Encryption (HTTPS): All traffic between your users and your monitoring stack has to be encrypted. A reverse proxy simplifies managing TLS certificates, especially with automated tools like Let's Encrypt. This is non-negotiable for preventing eavesdropping and man-in-the-middle attacks.
For example, if you're using Traefik, you can add a few labels to your Grafana service right inside the docker-compose.yml to automatically enable HTTPS and apply authentication middleware. This keeps your security configuration declarative and version-controlled, living right alongside your infrastructure.
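As a rough sketch of what those labels look like — the router name, hostname, entrypoint, and certificate resolver below are all assumptions you'd adapt to your own Traefik setup:

```yaml
# Hypothetical Traefik v2 labels on the grafana service; the hostname and
# the "letsencrypt" resolver name are placeholders for your own config.
grafana:
  # ... image, volumes, networks as before — but no "ports:" mapping
  labels:
    - "traefik.enable=true"
    - "traefik.http.routers.grafana.rule=Host(`grafana.example.com`)"
    - "traefik.http.routers.grafana.entrypoints=websecure"
    - "traefik.http.routers.grafana.tls.certresolver=letsencrypt"
    - "traefik.http.services.grafana.loadbalancer.server.port=3000"
```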
The Principle of Least Privilege
Security doesn't stop at the network edge. You have to apply the same principles to the files and processes themselves. This is known as the principle of least privilege, and it’s where many teams get lazy.
A common mistake I’ve seen is running containers as the root user or leaving config files with wide-open 777 permissions. If an attacker manages to get a foothold in one of those containers, they suddenly have full control. It's a huge risk.
Always define a specific, non-root user in your Dockerfiles or use the user directive in docker-compose.yml to run your services with limited permissions. At the same time, lock down your configuration files (prometheus.yml, alertmanager.yml) with the tightest permissions possible, allowing only the necessary user to read them.
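In Compose terms, that's a one-line sketch per service — the UID here is an assumption (65534 is the conventional "nobody" user; some images expect a specific UID instead, e.g. Grafana's official image uses 472, so check each image's documentation):

```yaml
# Hypothetical least-privilege override for a service
node-exporter:
  user: "65534:65534"   # run as an unprivileged user instead of root
```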
Making your monitoring stack truly production-ready goes beyond just these steps. To build systems that are not just secure but also resilient and transparent, it's worth exploring broader concepts like observability best practices. These principles will help you build a far more robust and insightful system.
Advanced Management and Troubleshooting Tips
Getting your monitoring stack running is just the beginning. The real work starts now: maintaining a healthy, reliable observability platform as your systems scale and grow more complex. This is where we shift from setup guides to the operational wisdom that keeps things running smoothly.
A prometheus docker compose setup has become incredibly popular for a reason. Container usage has soared to 92% among IT professionals, and developer adoption of Docker jumped by 17 points in just the last year. While not every workload needs the full weight of Kubernetes, Compose is perfect for streamlined monitoring. It’s no surprise that Prometheus has 77% adoption among cloud-native users; this approach offers a fast, consistent way to pull metrics together. For more on this, you can check out this overview of popular open-source developer tools.
Backing Up and Managing Prometheus Data
Your Prometheus time-series database (TSDB) holds the history of your system's performance. It’s an invaluable asset, and losing it is not an option.
Thankfully, because we used a named volume (prometheus_data) in our configuration, backing it up is fairly straightforward. You can run a temporary container to archive the volume’s data with a single Docker command:
```bash
docker run --rm -v prometheus_data:/data -v $(pwd):/backup ubuntu tar cvf /backup/prometheus-backup.tar /data
```
This command mounts your prometheus_data volume and your current working directory into a disposable container, creating a tar archive of all your metrics. From there, you can move this backup file to secure, off-site storage like an S3 bucket or a similar service.
A common pitfall I’ve seen take down production servers is neglecting Docker’s own data. Leftover images and volumes can pile up and fill the disk. I once had to debug an outage where a critical application went down simply because old data in /var/lib/docker/overlay2 consumed all available disk space. Run docker image prune -f and docker container prune -f regularly to clean up dangling images and stopped containers. Be cautious with docker system prune --volumes—it removes all unused volumes, which can destroy important data if containers are stopped or recreated. Only use it when you’re certain no stopped containers hold data you need. Keep an eye on your disk usage—it’s an easy outage to prevent.
Scaling Your Prometheus Setup
A single Prometheus instance works great for small to medium deployments, but it will eventually hit a performance wall. When you're monitoring hundreds of hosts or thousands of containers, it's time to think about a scaling strategy.
Two main patterns have emerged for scaling Prometheus:
- Federation: This is a hierarchical model. A global Prometheus server scrapes aggregated, high-level metrics from several downstream Prometheus instances. It’s perfect for getting a centralized overview without overwhelming the main server with every single low-level metric from every target.
- Long-Term Storage (LTS): For massive scale and high availability, integrating a dedicated LTS solution is the standard approach. Tools like Thanos or Cortex work alongside Prometheus, offering a global query view, virtually unlimited storage, and data redundancy.
For most teams running a prometheus docker compose stack, Thanos is the logical next step. It’s a powerful solution that dramatically enhances your monitoring capabilities without forcing a complete architectural overhaul. For a deeper dive, we wrote a detailed case study on scaling monitoring with Thanos that walks through a real-world implementation.
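For the federation pattern, the global server's prometheus.yml gains a scrape job pointed at each downstream instance's /federate endpoint. A minimal sketch — the downstream hostname and the match[] selector below are placeholders for your own topology:

```yaml
# Hypothetical federation job on the global Prometheus server
scrape_configs:
  - job_name: 'federate'
    honor_labels: true
    metrics_path: '/federate'
    params:
      'match[]':
        - '{job="node-exporter"}'   # pull only the series you actually need
    static_configs:
      - targets: ['downstream-prometheus:9090']  # placeholder hostname
```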
Common Troubleshooting Scenarios
No matter how well you plan, things will break. Here are a few of the most common issues you'll run into and how to start diagnosing them.
Target Scrape Errors
You open the Prometheus UI and see a target marked as "DOWN." This simply means Prometheus can't reach the exporter. The cause is almost always one of three things.
- Networking: Is the exporter on the same Docker network as Prometheus? Check your docker-compose.yml file. All services need to share a common network to communicate with each other using their service names.
- Firewall Rules: If your exporter is running outside of Docker (e.g., on a separate host), make sure no firewall is blocking the port. For Node Exporter, that’s typically port 9100.
- Configuration Mismatch: This is the most common culprit. Double-check your prometheus.yml file for a simple typo in the target's service name or port. It happens to everyone.
High Cardinality Performance Issues
High cardinality is what happens when your metrics have too many unique label combinations. Think about using a user_id or request_id as a metric label. Every unique ID creates a brand new time series, and Prometheus can quickly get overwhelmed.
This simple PromQL query is a lifesaver for finding the biggest offenders:
```promql
topk(10, count by (__name__)({__name__=~".+"}))
```
It will show you the top 10 metrics with the highest number of series. If you spot a metric with hundreds of thousands or even millions of series, you've found a cardinality problem. You'll need to rethink its labels and find a better way to aggregate that data. High cardinality doesn't just slow down queries; it inflates memory usage and can bring your entire monitoring system to its knees.
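To make the mechanics concrete, here is a hedged Python sketch of what that query computes — count the distinct time series behind each metric name. The metric and label names below are fabricated examples:

```python
# Hedged sketch of topk(10, count by (__name__)({__name__=~".+"})):
# count how many distinct time series each metric name owns.
from collections import Counter

def top_metrics(series, k=10):
    """series: one label-dict per time series, as Prometheus stores them."""
    counts = Counter(s["__name__"] for s in series)
    return counts.most_common(k)

# A user_id label manufactures 1000 series for a single metric name:
series = (
    [{"__name__": "http_requests_total", "user_id": str(i)} for i in range(1000)]
    + [{"__name__": "node_cpu_seconds_total", "cpu": str(c)} for c in range(8)]
)
print(top_metrics(series, k=2))
# -> [('http_requests_total', 1000), ('node_cpu_seconds_total', 8)]
```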
Alright, you've got your basic prometheus docker compose stack up and running. But getting the YAML to validate is just the start. Now the real-world questions begin to pop up.
Here are the answers to the questions we see most often after an initial setup.
How Do I Monitor Other Applications with This Setup?
This is the whole point, right? To get your own apps monitored. The process always follows the same pattern.
First, you need a Prometheus exporter for whatever you want to monitor. Running a Java app? You'll probably use a JMX exporter. Got a PostgreSQL database? There's an exporter for that, too. A quick search for "[your technology] prometheus exporter" will almost always give you a solid option.
Once you've found your exporter, you just need to:
- Add the exporter as a new service in your docker-compose.yml. Make sure you connect it to the same monitoring network so Prometheus can see it.
- Jump over to your prometheus.yml and add a new scrape job. The target will be the exporter's service name and port (e.g., my-app-exporter:9101).
And what about apps running outside of Docker, maybe on a separate VM? No problem. As long as your firewall allows the connection, you can add its ip:port directly to a static_configs block in your prometheus.yml, and Prometheus will scrape it just the same.
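Such a static job for an external host might look like this — the job name is illustrative and the address is a documentation-reserved placeholder you'd replace with your VM's real IP:

```yaml
# In prometheus.yml, under scrape_configs
- job_name: 'external-vm-node'
  static_configs:
    - targets: ['203.0.113.10:9100']   # placeholder; use your VM's address
```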
What's the Difference Between Node Exporter and cAdvisor?
This one trips up a lot of people. It’s easy to think they do the same thing, but they monitor completely different layers. You absolutely need both.
- Node Exporter: This is for your host machine—the actual server Docker is running on. It gives you the big picture: total CPU load, memory usage for the entire system, disk I/O, and network stats. Think of it as the physical health check for your server.
- cAdvisor (Container Advisor): This focuses only on the containers. It breaks down resource usage—CPU, memory, network—on a per-container basis.
You need Node Exporter to find out if your server is running hot. You need cAdvisor to figure out which specific container is causing the problem. Without both, you’re flying blind.
Can I Use This for a Large-Scale Production Environment?
Yes and no. This prometheus docker compose setup is perfect for small-to-medium production environments, startups, or for monitoring a specific project. Its simplicity is its biggest strength—you can get it running in minutes.
But let's be realistic. For a very large-scale, multi-host production system, a single Prometheus instance will eventually hit a wall. As you add hundreds of targets and the metric volume explodes, it will become a bottleneck for both data collection and querying.
When you feel that performance start to drag, it's time to graduate to a more robust architecture. You generally have two paths forward:
- Prometheus Federation: You can set up a global, top-level Prometheus server that scrapes aggregated data from your existing Prometheus instances. This gives you a high-level, centralized view without bogging down the main server with every single metric from every single target.
- Long-Term Storage Integration: This is the most common path for serious scale. You integrate a dedicated long-term storage solution like Thanos or Cortex. These tools build on your existing Prometheus setup to deliver high availability, a global query view across all your instances, and practically unlimited data retention.
Think of this Docker Compose stack as an incredible starting point. Knowing when to scale up with tools like Thanos is what separates a short-term fix from a long-term observability strategy.
As experts in building resilient and cost-effective observability platforms, CloudCops GmbH helps teams like yours design and implement monitoring solutions that scale. From initial setup to advanced Thanos deployments, we build the infrastructure that gives you clarity and control. Find out how we can help at https://cloudcops.com.
Ready to scale your cloud infrastructure?
Let's discuss how CloudCops can help you build secure, scalable, and modern DevOps workflows. Schedule a free discovery call today.