DevOps Tools Roadmap: What to Learn, and in What Order
Most DevOps tools roadmaps are lists. This one is a learning sequence with explicit reasoning behind the order, so you know why you are learning each tool before moving to the next, and which tools genuinely matter for getting hired versus which are overhyped in blog posts.
The mistake most learners make with DevOps tools
DevOps has an enormous number of tools, and the ecosystem changes constantly. The most common mistake is learning many tools at a shallow level — enough to follow a tutorial, not enough to use them under pressure or explain the decisions behind them. Employers are not impressed by a CV that lists fifteen tools. They are impressed by candidates who can demonstrate real competency in a core set and explain why they made architectural choices.
The goal of this roadmap is to give you a prioritised learning sequence — not a comprehensive list of every tool in existence. At each stage, the tool recommended is the one that gives you the highest career return per hour of learning. Other tools exist; you will encounter them later. Master the ones listed here first.
If you are aiming for a DevOps engineering role, the DevOps engineer roadmap covers the full career path. This page covers only the tools and the order in which to learn them.
Tier 1: Non-negotiable foundations (learn these first)
Tier 1 tools are prerequisites for everything else. Do not start learning CI/CD before you have Git. Do not start learning Kubernetes before you understand Docker. The order given here is deliberate: each tool depends on the ones before it.
1. Git — version control (learn this before anything else)
Git is the foundation of every other tool in this list. CI/CD pipelines trigger on Git events. IaC configuration is stored in Git. Container builds are driven by Dockerfiles committed to Git. If you do not have solid Git skills, no other tool knowledge matters.
What “solid Git skills” means in a DevOps context:
- Branching strategies: understand trunk-based development and GitFlow. Know when each is appropriate.
- Conflict resolution: resolve merge conflicts without losing changes.
- Rebasing: git rebase vs git merge, and why teams have opinions about this.
- Git hooks: pre-commit hooks for linting and formatting. Relevant in CI/CD contexts.
- Pull request workflows: opening PRs, writing useful PR descriptions, reviewing diffs.
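To make the Git hooks item concrete, here is a minimal sketch of the checks a pre-commit hook might run, written in Python. The function names and the two rules it enforces (no trailing whitespace, no leftover conflict markers) are illustrative choices rather than a standard, and for simplicity it reads files from the working tree instead of the staged snapshot.

```python
import subprocess
import sys

# Markers Git writes into a file when a merge conflict is left unresolved.
CONFLICT_MARKERS = ("<<<<<<< ", ">>>>>>> ")


def find_problems(filename, text):
    """Return a list of human-readable problems found in one file's text."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line != line.rstrip():
            problems.append(f"{filename}:{number}: trailing whitespace")
        if line.startswith(CONFLICT_MARKERS):
            problems.append(f"{filename}:{number}: conflict marker")
    return problems


def staged_files():
    """List files staged for commit (added, copied, or modified)."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [name for name in result.stdout.splitlines() if name]


def main():
    """Check every staged file; return 1 to block the commit, 0 to allow it."""
    problems = []
    for name in staged_files():
        try:
            with open(name, encoding="utf-8") as handle:
                problems.extend(find_problems(name, handle.read()))
        except (UnicodeDecodeError, FileNotFoundError):
            continue  # skip binary or missing files in this sketch
    for problem in problems:
        print(problem, file=sys.stderr)
    return 1 if problems else 0
```

To try it as a real hook, you would save it as an executable file at .git/hooks/pre-commit with a Python shebang line and end the file with `raise SystemExit(main())`. Git aborts the commit whenever the hook exits with a non-zero status.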
Time investment: 1–2 weeks to get to functional competence if you are new to Git. Do not rush past this.
2. Linux command line — the environment for everything
Almost every DevOps tool runs on Linux. You need to be comfortable in a Linux shell before you can use the tools effectively. This is not a tool per se, but it is a Tier 1 prerequisite.
Core skills: navigate the filesystem, manage file permissions (chmod, chown), manage processes (ps, kill, systemctl), pipe commands, write basic shell scripts (loops, conditionals, functions), use ssh, and read logs with journalctl and tail -f.
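File permissions are the item in that list most often memorised without being understood. As a small worked example, the sketch below uses Python's standard stat module to translate the octal modes you pass to chmod into the rwx strings that ls -l prints. The helper name describe_mode is hypothetical.

```python
import stat


def describe_mode(octal_mode):
    """Render a chmod-style octal mode (e.g. 0o640) as an ls-style string."""
    # S_IFREG marks the entry as a regular file, so the string starts with "-".
    return stat.filemode(stat.S_IFREG | octal_mode)


for mode in (0o644, 0o600, 0o755):
    print(oct(mode), describe_mode(mode))
```

Each octal digit covers one audience (owner, group, others) and is the sum of read (4), write (2), and execute (1), which is why 0o755 comes out as rwx for the owner and r-x for everyone else.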
3. Docker — containerisation
Docker is the entry point to the container ecosystem. Learn it before Kubernetes because Kubernetes orchestrates containers — if you do not understand what a container is, Kubernetes is confusing. Docker is also used independently of Kubernetes in many environments.
What to learn:
- Writing Dockerfiles: understand FROM, RUN, COPY, ENV, EXPOSE, CMD, and ENTRYPOINT. Know the difference between CMD and ENTRYPOINT.
- Image layers: understand how layer caching works and why layer order matters for build performance.
- Docker Compose: define multi-container applications for local development. Know how to link services, share volumes, and set environment variables.
- Image registries: push to and pull from Docker Hub, Amazon ECR, GCR, or GHCR. Understand image tagging conventions (latest vs semantic versions vs commit SHAs).
- Security basics: do not run containers as root. Use official base images. Scan images for vulnerabilities with Trivy.
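Why layer order matters is easier to see with a toy model. The sketch below is not how Docker's builder actually computes cache keys. It only assumes that a layer's key depends on the previous layer's key, the instruction text, and (for COPY) the content of the copied file. The function names, file names, and file contents are made up for illustration.

```python
import hashlib


def layer_keys(instructions, file_contents):
    """Return one cache key per instruction, chained to the previous key."""
    keys = []
    parent = ""
    for instruction in instructions:
        material = parent + "\n" + instruction
        if instruction.startswith("COPY "):
            source = instruction.split()[1]
            # A COPY layer also depends on the content of the copied file.
            material += "\n" + file_contents.get(source, "")
        parent = hashlib.sha256(material.encode()).hexdigest()
        keys.append(parent)
    return keys


def reused_layers(old_keys, new_keys):
    """Count leading layers whose keys match, i.e. layers served from cache."""
    count = 0
    for old, new in zip(old_keys, new_keys):
        if old != new:
            break
        count += 1
    return count


# Same four instructions, two different orders.
slow = [
    "FROM python:3.12-slim",
    "COPY app.py /app/",
    "COPY requirements.txt /app/",
    "RUN pip install -r /app/requirements.txt",
]
fast = [
    "FROM python:3.12-slim",
    "COPY requirements.txt /app/",
    "RUN pip install -r /app/requirements.txt",
    "COPY app.py /app/",
]

before = {"app.py": "print('v1')", "requirements.txt": "flask"}
after = {"app.py": "print('v2')", "requirements.txt": "flask"}  # only app code changed

print(reused_layers(layer_keys(slow, before), layer_keys(slow, after)))  # 1
print(reused_layers(layer_keys(fast, before), layer_keys(fast, after)))  # 3
```

In the first ordering a one-line code change invalidates the dependency install. In the second, the install layer is reused because everything above it is unchanged. This is why Dockerfiles usually copy dependency manifests before application code.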
Project: Containerise an application you have written, write a Docker Compose file for local development with a database, and push the image to a registry. This is a foundational portfolio piece.
Tier 2: Core DevOps tools (learn after Tier 1)
Tier 2 tools are the ones that appear in nearly every DevOps job description and that interviewers expect you to have real experience with. Learn these after you have solid Tier 1 fundamentals — rushing to Tier 2 without Tier 1 causes shallow, brittle knowledge.
4. CI/CD — start with GitHub Actions
CI/CD (Continuous Integration / Continuous Delivery) automates building, testing, and deploying software. There are many CI/CD tools: Jenkins, GitLab CI, GitHub Actions, CircleCI, Cloud Build. Start with GitHub Actions. It is tightly integrated with the most widely used code hosting platform, has a large ecosystem of community Actions, and is used heavily by both startups and enterprise organisations.
Learn Git before CI/CD because CI/CD pipelines trigger on Git events (push, pull request, tag). If you do not understand branching, you will not understand why pipelines behave the way they do.
What to learn with GitHub Actions:
- Workflow syntax: triggers (on: push, on: pull_request), jobs, steps, and actions.
- Secrets and environment variables: store credentials as GitHub Secrets, not in workflow files.
- Matrix builds: run the same job against multiple versions of a language or OS.
- Reusable workflows: extract common pipeline logic into shared workflows.
- Deployment environments: configure staging and production environments with approval gates.
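Matrix builds are the item in this list that most benefits from seeing the arithmetic. The sketch below models a matrix as the Cartesian product of its axes, which is the basic behaviour. It is a toy model, not GitHub's implementation, and it ignores the include and exclude keys that real workflows can use to adjust the set. The function name expand_matrix and the axis values are illustrative.

```python
from itertools import product


def expand_matrix(matrix):
    """Return one dict per combination of the matrix axes."""
    axes = list(matrix)
    combos = product(*(matrix[axis] for axis in axes))
    return [dict(zip(axes, combo)) for combo in combos]


matrix = {
    "os": ["ubuntu-latest", "windows-latest"],
    "python": ["3.11", "3.12", "3.13"],
}
jobs = expand_matrix(matrix)
print(len(jobs))  # 2 operating systems x 3 versions = 6 jobs
print(jobs[0])
```

The practical lesson is that job counts multiply: adding a third operating system to this matrix adds three jobs, not one, which matters for both feedback time and CI minutes.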
Once you have GitHub Actions experience, Jenkins is worth understanding conceptually because many large enterprises still run it. Jenkins uses Groovy-based Jenkinsfiles that are less readable than YAML-based pipelines, and it requires more operational overhead. Do not learn Jenkins first — learn GitHub Actions, then learn Jenkins differences when you encounter it in a job.
GitLab CI is the other major alternative. Like GitHub Actions, it defines pipelines in YAML, so most concepts carry over even though the keywords differ. It is the better choice if your target employer uses GitLab for code hosting.
5. Terraform — infrastructure as code
Terraform is the most important IaC tool to learn. Learn it before you worry about Terragrunt, Pulumi, or CDK. The Terraform roadmap covers the full learning path in detail — follow that for the deep dive. For this roadmap’s purposes, understand where Terraform fits in the sequence: after Docker and CI/CD, because you need to understand what you are provisioning infrastructure to run (containers) and how you will deploy changes (CI/CD pipelines that run terraform apply).
Terraform sits in the IaC category alongside CloudFormation (AWS-only), Pulumi (any language), Bicep (Azure-only), and Ansible (configuration management, not IaC in the strict sense). Terraform is the right one to start with because it is cloud-agnostic and the most widely used across organisations.
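The core idea behind declarative IaC is a comparison between what you have declared and what currently exists. The sketch below shows that idea in miniature. It is not how Terraform is implemented (real plans involve providers, a state file, and a dependency graph), and the function name plan and the resource names are hypothetical.

```python
def plan(desired, current):
    """Compare desired and current resources and return the actions needed."""
    actions = {"create": [], "update": [], "delete": []}
    for name, config in desired.items():
        if name not in current:
            actions["create"].append(name)   # declared but does not exist yet
        elif current[name] != config:
            actions["update"].append(name)   # exists but settings differ
    for name in current:
        if name not in desired:
            actions["delete"].append(name)   # exists but no longer declared
    return actions


desired = {
    "bucket_logs": {"versioning": True},
    "vm_web": {"size": "small"},
}
current = {
    "vm_web": {"size": "medium"},
    "vm_old": {"size": "small"},
}
print(plan(desired, current))
```

Reviewing a diff like this before applying it is the habit to build: the plan is where you catch an unintended delete.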
6. Kubernetes — container orchestration
Learn Kubernetes after Docker. Kubernetes orchestrates containers, so it will not make sense until you understand what a container is and how it works. The Kubernetes roadmap covers the full learning path. In the context of the overall DevOps toolchain, Kubernetes slots in after CI/CD because you will configure your CI/CD pipeline to deploy to Kubernetes, and understanding that integration requires knowing both tools.
Not every DevOps role requires deep Kubernetes expertise. If your target role is at a company that runs serverless or PaaS workloads, Kubernetes may be peripheral. If your target role is at a company running containerised microservices, Kubernetes knowledge is core.
Tier 3: Observability and operations tooling
Tier 3 tools support running systems in production. You need to understand how to observe and debug running systems, not just deploy them. Learn these after you have experience with Tier 1 and Tier 2 tools, because observability becomes meaningful once you have real systems to observe.
7. Prometheus and Grafana — metrics and dashboards
Prometheus + Grafana is the standard open-source monitoring stack. Prometheus scrapes metrics from application endpoints and stores them as time-series data. Grafana provides dashboards and alerting on top of Prometheus data.
What to learn:
- Prometheus data model: metrics types are Counter, Gauge, Histogram, and Summary.
- PromQL (Prometheus Query Language): write queries to calculate rates, aggregations, and percentiles. Know rate(), increase(), sum() by(), and histogram_quantile().
- Alertmanager: configure alerts with routing rules and notification channels (Slack, PagerDuty).
- Grafana dashboards: build dashboards from Prometheus queries. Import community dashboards from Grafana Labs (the Kubernetes and Node Exporter dashboards are standard starting points).
- Service instrumentation: understand how to add Prometheus metrics to an application using a client library.
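The reason rate() exists is that counters only ever go up, except when a process restarts and the counter resets to zero. The sketch below is a simplified version of that calculation in plain Python. It handles resets but deliberately skips the extrapolation Prometheus applies at the edges of the query window, so treat it as an aid to intuition rather than a reimplementation. The name simple_rate and the sample values are made up.

```python
def simple_rate(samples):
    """Per-second increase of a counter over (timestamp_seconds, value) samples."""
    if len(samples) < 2:
        return None  # a rate needs at least two points
    increase = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            increase += current - previous
        else:
            # A drop means the counter reset, so count up from zero.
            increase += current
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Requests counter scraped every 15 seconds; the process restarts before t=45.
samples = [(0, 100), (15, 130), (30, 160), (45, 10), (60, 40)]
print(simple_rate(samples))  # 100 requests over 60 seconds
```

Subtracting the first value from the last would give a negative number here, which is exactly the mistake rate() and increase() protect you from.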
Cloud-native alternatives exist (AWS CloudWatch, GCP Cloud Monitoring, Datadog) and are often used in production instead of self-hosted Prometheus. But Prometheus + Grafana teaches the underlying concepts. If you understand Prometheus, CloudWatch and Datadog are learnable by extension.
8. Logging — ELK stack or cloud-native
Logs from applications and infrastructure need to be aggregated, stored, and searched. The ELK stack (Elasticsearch, Logstash, Kibana) is the most widely known open-source logging solution. Alternatives: Loki (Grafana’s lightweight log aggregation, integrates naturally with Prometheus/Grafana), cloud-native services (CloudWatch Logs, GCP Cloud Logging, Azure Monitor Logs), and commercial options (Datadog, Splunk).
For learning purposes, start with Loki + Grafana if you are already learning Prometheus/Grafana — the stack is consistent and Loki is simpler to operate than ELK. Learn ELK conceptually (it is still widely deployed in enterprise environments) but do not invest heavily in self-hosting Elasticsearch unless your target role requires it.
Key concepts regardless of tool: structured logging (JSON format is searchable), log levels (DEBUG/INFO/WARN/ERROR), correlation IDs for tracing requests across services, and log retention policies.
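These concepts fit in a few lines of code. The sketch below uses only Python's standard logging module to emit one JSON object per log line, with a level and a correlation ID attached. The JsonFormatter class, the logger name checkout, and the field names are illustrative choices, not a standard schema.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Format each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # correlation_id is attached per call via the `extra` argument.
            "correlation_id": getattr(record, "correlation_id", None),
        }
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("checkout")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# One ID per incoming request, passed along to every service that handles it.
correlation_id = str(uuid.uuid4())
logger.info("payment accepted", extra={"correlation_id": correlation_id})
logger.debug("not emitted at INFO level", extra={"correlation_id": correlation_id})
```

Because every line is valid JSON with the same keys, a log backend can filter on level or correlation_id directly instead of relying on text search.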
Tier 4: Advanced and specialised tooling
Tier 4 tools are important in specific contexts but should not be learned until you have Tiers 1–3 solid. Learning these too early leads to context-free knowledge that does not stick.
Secret management — HashiCorp Vault or cloud-native
Secrets (API keys, database passwords, certificates) need secure storage and access control. Learn cloud-native first: AWS Secrets Manager, GCP Secret Manager, or Azure Key Vault depending on your primary cloud. These are simpler to operate and sufficient for most environments.
HashiCorp Vault is the tool to learn if you are working in multi-cloud environments or targeting organisations that prefer a vendor-neutral stack. Vault’s killer feature is dynamic secrets: it generates short-lived credentials for databases and cloud providers on demand, each tied to a lease that expires or can be revoked, which dramatically reduces the risk of credential exposure. Understand this concept even if you do not need deep Vault expertise immediately.
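The dynamic secrets idea can be modelled in a few lines. The class below is a toy stand-in, not Vault's API: it only shows that each credential is generated on request, is valid for a fixed time, and can be revoked early. The class name LeaseIssuer, its methods, and the 60-second lifetime are all hypothetical.

```python
import secrets
import time


class LeaseIssuer:
    """Toy dynamic-secrets backend: every credential has a limited lifetime."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock   # injectable so expiry can be shown without sleeping
        self.leases = {}     # credential -> expiry time

    def issue(self):
        """Create a fresh random credential that expires after the TTL."""
        credential = secrets.token_urlsafe(16)
        self.leases[credential] = self.clock() + self.ttl_seconds
        return credential

    def is_valid(self, credential):
        """A credential is valid only while its lease exists and is unexpired."""
        expiry = self.leases.get(credential)
        return expiry is not None and self.clock() < expiry

    def revoke(self, credential):
        """End a lease early, for example after a suspected leak."""
        self.leases.pop(credential, None)


now = [0.0]                                   # fake clock for the demonstration
issuer = LeaseIssuer(ttl_seconds=60, clock=lambda: now[0])
credential = issuer.issue()
print(issuer.is_valid(credential))            # valid while the lease is live
now[0] = 61.0
print(issuer.is_valid(credential))            # expired after the TTL
```

A leaked credential in this model is only useful until its lease ends, which is the property that makes dynamic secrets safer than a long-lived password stored in a config file.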
Artifact management — Docker registries and Nexus/Artifactory
Artifact registries store build outputs: Docker images, npm packages, Maven JARs, Python wheels. For Docker images, you will already have a registry from Tier 1 learning (ECR, GCR, GHCR, Docker Hub). Nexus Repository and JFrog Artifactory are enterprise-grade universal artifact repositories that handle all artifact types. You will encounter one of these in larger organisations. Understand their role without needing to install and administer one as part of your initial learning.
Service mesh — Istio or Linkerd
A service mesh adds a layer of network infrastructure between services in a Kubernetes cluster, providing mutual TLS, observability, traffic management, and access control without modifying application code. Istio is the most feature-rich; Linkerd is simpler to operate. Service meshes are relevant in large microservices environments and in zero-trust security architectures. This is specialist knowledge — do not invest time here until you are working with Kubernetes in production and encounter a real problem that a service mesh solves.
GitOps — ArgoCD or Flux
GitOps is a pattern where the desired state of your infrastructure and deployments is always stored in Git, and a controller continuously reconciles the running state with Git. ArgoCD is the most widely adopted GitOps tool for Kubernetes. It watches a Git repository and applies changes to the cluster when they are committed. GitOps sits at the intersection of CI/CD and Kubernetes — learn both before investing in GitOps tooling.
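The word that distinguishes GitOps from a one-off deploy is "continuously". The sketch below models a single reconciliation pass as a function that forces the live state to match what Git declares, then runs it twice to show that manual drift is reverted and that a second pass is a no-op. It is a conceptual model only, not how ArgoCD or Flux are implemented, and the function name, the application name web, and the image tag are made up.

```python
def reconcile(desired, live):
    """Make `live` match `desired` in place; return the names that changed."""
    changed = []
    for name, spec in desired.items():
        if live.get(name) != spec:
            live[name] = dict(spec)   # create or overwrite to match Git
            changed.append(name)
    for name in list(live):
        if name not in desired:
            del live[name]            # remove anything Git no longer declares
            changed.append(name)
    return changed


desired = {"web": {"image": "web:1.4.2", "replicas": 3}}  # as committed to Git
live = {"web": {"image": "web:1.4.2", "replicas": 3}}     # what the cluster runs

live["web"]["replicas"] = 1        # someone edits the cluster by hand
print(reconcile(desired, live))    # drift is detected and reverted
print(reconcile(desired, live))    # second pass finds nothing to do
```

The consequence for day-to-day work is that changes made directly against the cluster do not last. The way to change production is to change Git.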
Tools that are overhyped or situational
Not every tool that appears in a “DevOps tools 2025” blog post is worth your time. Here are some honest assessments of tools that generate a lot of content but are either situational or genuinely secondary.
Ansible — a configuration management tool that uses YAML playbooks to manage server state. Ansible was important before IaC and containers became standard. It is still widely used in organisations with large fleets of VMs or on-premises infrastructure. If your target role involves managing traditional VM infrastructure, learn Ansible. If your target role is cloud-native containers and Kubernetes, Ansible is less relevant and Terraform covers the IaC need.
Jenkins — still widely deployed, but most greenfield projects choose GitHub Actions, GitLab CI, or cloud-native CI. Worth understanding conceptually and being able to read a Jenkinsfile, but do not build your initial CI/CD knowledge on Jenkins.
Chef and Puppet — configuration management tools that are largely legacy in cloud-native contexts. Unless your target employer specifically requires them, do not invest learning time here.
Spinnaker — a multi-cloud continuous delivery platform developed at Netflix. Complex to operate and largely supplanted by GitOps tools (ArgoCD, Flux) and simpler CD pipelines in most organisations. Situational.
Kubernetes alternatives (Nomad, Swarm) — HashiCorp Nomad and Docker Swarm are alternative container orchestrators. Kubernetes has won. Learn Kubernetes; these are not worth prioritising.
The recommended learning sequence
To summarise the prioritised sequence explicitly:
- Git — before everything. Without this, nothing else makes sense.
- Linux command line — the environment all tools run in.
- Docker — understand containers before container orchestration.
- GitHub Actions — learn CI/CD after Git, before Kubernetes and Terraform.
- Terraform — IaC, after you understand what you are provisioning and how it will be deployed.
- Kubernetes — orchestration, after Docker and after CI/CD (so you can understand CI/CD-to-Kubernetes deployment flows).
- Prometheus + Grafana — observability, once you have real systems running.
- Logging (Loki or ELK) — log aggregation to complement metrics.
- Secrets management — cloud-native first, Vault when needed.
- GitOps (ArgoCD) — once you have Kubernetes and CI/CD experience.
The skills gap analysis on the entry-level cloud jobs page is useful context for how these tools map to job expectations at different seniority levels.
For a career-level view of how DevOps skills compound into a senior role, the DevOps engineer roadmap covers the progression from junior to senior, including what employers look for at each level. The DevOps engineer salary data shows how tool competency correlates with compensation.
Summary
- Learn tools in the right order: Git → Linux → Docker → GitHub Actions → Terraform → Kubernetes → Prometheus/Grafana → Logging. Each layer builds on the previous one.
- The most common mistake is learning many tools shallowly. Master Git, Docker, one CI/CD tool, Terraform, and Kubernetes before adding anything else.
- Tier 1 (Git, Linux, Docker) is non-negotiable. You cannot meaningfully use Tier 2 tools without solid Tier 1 foundations.
- GitHub Actions is the right CI/CD tool to start with. Learn Jenkins and GitLab CI conceptually, but build your real skills in GitHub Actions first.
- Prometheus + Grafana is the standard open-source observability stack. Understanding it transfers to cloud-native and commercial alternatives.
- Ignore Ansible, Chef, Puppet, and Spinnaker unless your specific target role requires them. They are legacy or situational tools in most cloud-native environments.