GitOps in Action: Managing Infrastructure and Deployments with Declarative Code

Imagine your entire infrastructure described in a Git repository. Every change is a pull request, every merge triggers an automated sync, and your production environment always matches what's in the repo. That's the promise of GitOps. But is it just buzzword bingo, or does it actually solve real problems? This article breaks down GitOps from first principles, walks through a concrete example, and explores where it shines and where it stumbles. Whether you're a DevOps engineer evaluating new workflows or a team lead looking to reduce deployment chaos, you'll get a practical, honest take.

Why GitOps Matters Now: The Pain of Manual Drift

Every team that manages infrastructure has felt the slow creep of configuration drift. Someone SSHes into a server to fix a quick issue, a colleague updates a config file directly on the instance, and within weeks, the production environment is a snowflake that nobody fully understands. Deployments become nerve-wracking rituals, and rollbacks feel like gambling.

Traditional approaches like imperative scripts or configuration management tools (Ansible, Chef) help, but they are usually run on demand by a person or a pipeline, and nothing guarantees that the live system still matches the last reviewed change. Without continuous reconciliation against a version-controlled declaration, the real state of your infrastructure ends up living on the servers themselves, where it is hard to audit. GitOps flips this: the desired state is declared in Git, and an automated operator reconciles the real world with that declaration. The result is a system that self-heals, provides a clear audit trail, and makes most rollbacks as simple as reverting a commit.

For teams adopting Kubernetes, GitOps has become almost synonymous with deployment best practices. But the principles apply broadly—to cloud resources, databases, and even configuration files. The core insight is that declarative configuration plus version control plus automated reconciliation eliminates an entire class of human errors.

The Cost of Configuration Drift

When environments diverge, debugging becomes a nightmare. A bug that only appears in production might trace back to a manual change no one remembers. GitOps prevents this by making Git the single source of truth. If production differs from the repo, the operator fixes it—or alerts you.

Auditability and Compliance

Every change is a commit with an author, a timestamp, and a review trail. For regulated industries, this is a big win. You can prove exactly who changed what and when, without relying on logs that might be incomplete.

Core Idea in Plain Language: Git as the Source of Truth

At its heart, GitOps is simple: you store everything needed to run your application in a Git repository. That includes Kubernetes manifests, Terraform files, Helm charts, or even plain YAML. Then you have a process that ensures the live environment matches that repository.

There are three key components: a Git repository that holds the desired state, an operator that watches the repo and applies changes, and a feedback loop that reports discrepancies. The operator (like Argo CD or Flux) continuously compares the live state with the repo. If someone pushes a new manifest, the operator updates the cluster. If someone manually changes the cluster, the operator either reverts the change or flags it as drift, depending on how it is configured.

This is different from CI/CD pipelines that push artifacts. In GitOps, the pipeline's job is to update the repo (e.g., change a deployment image tag), and the operator handles the rest. The deployment is a pull-based model: the operator pulls changes from Git, rather than a pipeline pushing to the cluster. This improves security because the cluster doesn't need to expose credentials to a CI system.

Declarative vs. Imperative: A Simple Analogy

Think of declarative as describing the finished dish and imperative as dictating the recipe step by step. Declaratively, you say 'I want a moist chocolate cake' and the kitchen figures out how to produce it. Imperatively, you say 'preheat the oven to 350°F, mix for 5 minutes, then bake for 30.' GitOps is declarative: you describe the desired end state, and the operator works out the steps needed to get there.

Pull-Based Deployments

In a traditional CI/CD pipeline, the CI server pushes changes to the cluster. That means the CI server must hold credentials with write access to the cluster. In GitOps, the cluster pulls from Git. The operator runs inside the cluster and, for plain deployments, only needs read access to the repo. This reduces the attack surface and simplifies security audits.

How It Works Under the Hood: The Reconciliation Loop

The magic of GitOps lies in the reconciliation loop. An operator (like Argo CD or Flux) runs as a set of controller pods in your cluster. It periodically polls the Git repository (or reacts to webhooks) to fetch the latest desired state. It then queries the Kubernetes API to see what's actually running. If there's a difference, it applies changes to align the cluster with the repo.
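As a rough illustration of where the polling interval lives, the fragment below assumes Argo CD is installed in the argocd namespace. The timeout.reconciliation key in the argocd-cm ConfigMap controls how often Argo CD re-checks Git. The 180s value is only an example, and the default differs between Argo CD versions, so treat this as a sketch rather than a recommended setting.

```yaml
# Fragment of the argocd-cm ConfigMap (assumes Argo CD runs in the "argocd" namespace).
# Merge this key into the existing ConfigMap rather than replacing the whole object.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argocd
  labels:
    app.kubernetes.io/part-of: argocd
data:
  # How often Argo CD polls Git for changes when no webhook fires (example value).
  timeout.reconciliation: 180s
```

Webhooks from your Git host can shorten the delay between a push and a sync, with polling left as the fallback.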

This loop runs continuously. If someone accidentally deletes a deployment and self-healing is enabled, the operator recreates it from Git. If a new version of an app is committed to the repo, the operator rolls it out. The operator also reports status back: sync status, health checks, and any errors. This feedback is visible in a dashboard or via notifications.

Key Components: The Operator and the Repo Structure

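The operator needs to understand your configuration format. For Kubernetes, it reads YAML manifests, often rendered through Kustomize or Helm. For cloud infrastructure, tools like Crossplane (which models cloud resources as Kubernetes objects) or a Terraform controller (which runs Terraform from inside the cluster) extend GitOps beyond application workloads. The repo structure matters: you typically have separate directories for environments (dev, staging, prod) and apps. Branching strategies (like GitFlow or trunk-based development) affect how changes flow between those environments.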

Sync Strategies: Automatic vs. Manual

Automatic sync means the operator applies changes as soon as they appear in Git. Manual sync requires a human to approve the sync via the operator's UI or CLI. Many teams start with manual sync for production to add a safety gate. The operator still detects drift but waits for approval to fix it.
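In Argo CD, the difference between the two modes comes down to the syncPolicy block of an Application. The two fragments below are a minimal sketch of that block only, not complete resources:

```yaml
# Automatic sync: Argo CD applies changes as soon as it sees them in Git.
spec:
  syncPolicy:
    automated:
      prune: true     # delete resources that were removed from Git
      selfHeal: true  # revert manual changes made in the cluster
---
# Manual sync: leave out "automated". Argo CD still reports the app as
# OutOfSync when it drifts, but waits for someone to trigger the sync.
spec:
  syncPolicy: {}
```

A common compromise is automatic sync for dev and staging and manual sync for production.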

Worked Example: Deploying an App with Argo CD

Let's walk through a realistic scenario. You have a simple web app (Node.js) and a Kubernetes cluster. You want to deploy version 1.0.0 and later upgrade to 1.1.0 using GitOps.

First, create a Git repository with a directory structure:

app/
  overlays/
    production/
      deployment-patch.yaml
      kustomization.yaml
    staging/
      deployment-patch.yaml
      kustomization.yaml
  base/
    deployment.yaml
    service.yaml
    kustomization.yaml

The base directory contains common manifests. Overlays add environment-specific changes (like replica count or image tag).
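To make the layout concrete, here is a minimal sketch of the two kustomization.yaml files, shown as separate YAML documents in one block. It assumes the file names from the tree above and nothing else; your own overlays may carry more settings.

```yaml
# app/base/kustomization.yaml
# Lists the manifests shared by every environment.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
---
# app/overlays/production/kustomization.yaml
# Starts from the base and layers the production patch on top.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base
patches:
  - path: deployment-patch.yaml
```

The staging overlay would look the same and differ only in the contents of its own deployment-patch.yaml.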

Install Argo CD in your cluster and add the repo. Create an Application resource pointing to the path for production:

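apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: myapp-production
  namespace: argocd  # assumes Argo CD is installed in its default argocd namespace
spec:
  destination:
    namespace: production  # target namespace for the app's resources
    server: https://kubernetes.default.svc  # the cluster Argo CD itself runs in
  project: default
  source:
    path: app/overlays/production  # Kustomize overlay to render
    repoURL: https://github.com/example/myapp.git
    targetRevision: main  # branch to track
  syncPolicy:
    automated:
      prune: true  # delete resources that were removed from Git
      selfHeal: true  # revert manual changes made in the cluster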

Once you apply this, Argo CD syncs the cluster to match the repo. To upgrade to version 1.1.0, update the image tag in the production overlay (e.g., in deployment-patch.yaml), commit, and push. Argo CD detects the change and rolls out the new version. If something goes wrong, revert the commit and the operator rolls back.
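For illustration, the production patch after the upgrade might look like the sketch below. The Deployment and container name (myapp), the image location (ghcr.io/example/myapp), and the replica count are assumptions made for this example. The only line that changes between the two releases is the image tag.

```yaml
# app/overlays/production/deployment-patch.yaml (hypothetical contents)
# Strategic merge patch: only the fields listed here override the base Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp  # must match the name used in base/deployment.yaml
spec:
  replicas: 3  # production-specific scale (example value)
  template:
    spec:
      containers:
        - name: myapp  # matched by name against the base container
          image: ghcr.io/example/myapp:1.1.0  # was 1.0.0 before the upgrade commit
```

Because the change is a one-line diff, the pull request that ships the release is easy to review and easy to revert.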

Rollback in Action

Suppose version 1.1.0 causes a spike in error rates. Instead of running kubectl rollout undo, you revert the commit in Git. The operator sees the previous desired state and applies it, and the cluster converges back through a normal rolling update. The rollback is a single auditable commit, and it is usually fast.

Handling Secrets

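Secrets are a challenge because you don't want plaintext secrets in Git. There are two common patterns. Tools like Sealed Secrets and SOPS encrypt secrets before they are committed, and the ciphertext is decrypted at deploy time using a key that is available only inside the cluster or in a key management service. External Secrets Operator takes a different route: Git holds only a reference to a secret stored in an external manager (such as a cloud secrets service or Vault), and the operator fetches the value at runtime. Either way, Git stays the source of truth for what secrets exist while the sensitive values stay protected.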
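As a sketch of the reference-based pattern, this is roughly what an External Secrets Operator resource looks like. The store name (aws-secrets), the remote key (prod/myapp/db), and the secret names are hypothetical, and the apiVersion depends on the release you have installed (newer releases serve external-secrets.io/v1), so check it against your cluster.

```yaml
# Safe to commit: contains only a pointer to the secret, never its value.
apiVersion: external-secrets.io/v1beta1  # may be external-secrets.io/v1 on newer installs
kind: ExternalSecret
metadata:
  name: myapp-db
  namespace: production
spec:
  refreshInterval: 1h        # how often to re-read the external store
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets        # hypothetical store configured separately
  target:
    name: myapp-db           # Kubernetes Secret the operator creates
  data:
    - secretKey: password    # key inside the generated Secret
      remoteRef:
        key: prod/myapp/db   # hypothetical path in the external manager
        property: password
```

The application then mounts or references the generated myapp-db Secret like any other Kubernetes Secret.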

Edge Cases and Exceptions: When GitOps Gets Tricky

GitOps isn't a silver bullet. Several scenarios require careful handling.

Stateful Applications

Databases and StatefulSets don't fit the declarative model as neatly. Git describes configuration, not data: recreating a database from its manifests gives you an empty database. GitOps works best for stateless services. For stateful components, you might use GitOps for the configuration but manage data separately (e.g., via backups and persistent volumes).

Emergency Hotfixes

In a production incident, waiting for a pull request to merge might be too slow. Some teams allow emergency overrides via a separate 'hotfix' branch or a manual kubectl command, but that defeats the purpose. A better approach is to have a fast-track Git process: a dedicated emergency branch with reduced review requirements, or a feature flag that can disable problematic code without a deploy.

Multi-Environment Drift

If your staging and production environments are identical in code but differ in scale or configuration, you need to manage overlays carefully. A common mistake is to apply a change to staging but forget production. GitOps helps because both environments are in the same repo, but you still need to ensure changes propagate correctly. Using Kustomize or Helm with environment-specific values is the standard approach.

Large-Scale Repos

As your infrastructure grows, a single repo can become unwieldy. Monorepo vs. multirepo is a trade-off. Monorepo simplifies cross-service changes but can slow down CI. Multirepo isolates changes but makes coordination harder. GitOps tools support both, but you need to set up proper access controls and CI pipelines.

Limits of the Approach: What GitOps Can't Do

GitOps excels at managing the state of your infrastructure, but it has blind spots.

Real-Time Configuration Changes

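If you need to adjust a setting dynamically (e.g., scaling based on load), GitOps is too slow, because the loop typically runs every few minutes and every change requires a commit. For autoscaling, you should use the native Kubernetes HPA or event-driven scaling, and let GitOps manage the HPA configuration itself. When you do this, leave the replica count out of the Deployment manifest in Git (or tell the operator to ignore that field); otherwise the operator and the autoscaler will keep overwriting each other.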
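A minimal sketch of an HPA that could live in the repo alongside the app is shown below. The Deployment name (myapp), the namespace, and the thresholds are assumptions chosen for illustration.

```yaml
# Git owns the scaling policy; the HPA owns the live replica count.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp             # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70  # scale out above 70% average CPU (example value)
```

Changing the scaling limits is still a pull request, while the minute-to-minute replica count is left to the cluster.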

Infrastructure That Doesn't Support Declarative APIs

Some legacy systems or cloud resources lack a robust declarative API. If you can't describe the desired state in a config file, GitOps won't help. In those cases, you might use a hybrid approach: GitOps for what you can, and imperative scripts wrapped in operators for the rest.

Team Maturity

GitOps requires discipline. Everyone must commit changes to Git, not SSH into servers. Code reviews become mandatory. If your team is used to cowboy debugging, adopting GitOps will be a cultural shift. Start with a single non-critical service and expand gradually.

Tool Complexity

Operators like Argo CD and Flux are powerful but have a learning curve. You need to understand Kubernetes RBAC, CRDs, and networking. The initial setup can be daunting. However, once configured, the day-to-day operations become simpler.

Reader FAQ: Common Questions About GitOps

Is GitOps only for Kubernetes?

No. While GitOps is most popular with Kubernetes, the principles apply to any infrastructure that can be managed declaratively. Tools like Terraform, Pulumi, and AWS CloudFormation can be integrated with GitOps workflows. The key is having a tool that can reconcile desired state from Git.

How does GitOps differ from CI/CD?

CI/CD pipelines build and test code, then push artifacts. GitOps adds a layer: the pipeline updates the Git repo with new artifact references (e.g., a new image tag), and the GitOps operator deploys from there. The deployment step is pull-based, not push-based. This separates build from deploy and gives your team control over when changes hit production, since nothing ships until the commit is merged.

What happens if someone manually changes the cluster?

With auto-sync and self-heal enabled, the operator reverts the change. If manual intervention is needed (e.g., for a hotfix), you can temporarily disable auto-sync or use a different branch. The drift is detected and logged, so you can investigate.

Can I use GitOps with multiple clusters?

Yes. Tools like Argo CD support managing multiple clusters from a single instance. You define applications per cluster, and the operator syncs each cluster independently. This is common for multi-region or multi-cloud setups.
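One way to express this in Argo CD is an ApplicationSet with a list generator, which stamps out one Application per cluster. The sketch below assumes two clusters that have already been registered with Argo CD; the cluster names and API server URLs are placeholders.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: myapp
  namespace: argocd
spec:
  generators:
    - list:
        elements:
          # Placeholder clusters; each URL must match a cluster registered in Argo CD.
          - cluster: eu-west
            url: https://eu-west.example.com
          - cluster: us-east
            url: https://us-east.example.com
  template:
    metadata:
      name: 'myapp-{{cluster}}'   # one Application per list element
    spec:
      project: default
      source:
        repoURL: https://github.com/example/myapp.git
        targetRevision: main
        path: app/overlays/production
      destination:
        server: '{{url}}'
        namespace: production
```

Each generated Application is synced and reported on independently, so a failure in one region does not block the others.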

How do I handle database schema migrations?

Database migrations are stateful and often require careful ordering. GitOps can trigger a migration job as part of a deployment (e.g., using a Kubernetes Job), but the migration itself is usually handled by a separate tool (like Flyway or Liquibase). The key is to ensure the migration runs before the app pods update, which you can orchestrate with Helm hooks or Argo CD sync waves.
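As one possible shape for this, the sketch below uses an Argo CD resource hook rather than sync waves: a Job annotated as PreSync runs to completion before the rest of the manifests are applied. The image name is hypothetical and stands in for whatever container wraps your Flyway or Liquibase migrations.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myapp-db-migrate
  namespace: production
  annotations:
    argocd.argoproj.io/hook: PreSync                          # run before the app is synced
    argocd.argoproj.io/hook-delete-policy: BeforeHookCreation  # replace the old Job on the next sync
spec:
  backoffLimit: 1            # fail the sync quickly instead of retrying a bad migration
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: ghcr.io/example/myapp-migrations:1.1.0  # hypothetical migration image
```

If the Job fails, the sync stops and the application pods keep running the previous version, which is usually what you want for a failed migration.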

What are the security implications?

GitOps improves security by reducing the attack surface. The operator in the cluster only needs read access to Git, and the CI system no longer needs cluster credentials at all. However, you must secure the Git repo (branch protection, signed commits) and the operator's access, because anyone who can merge to the tracked branch can effectively deploy. Secrets management also requires careful implementation.

Is GitOps worth it for small teams?

It depends. If you have a single developer and a simple app, the overhead might not be justified. But even small teams benefit from version control and automated deployments. Start with a lightweight setup (Flux with a single repo) and see if it reduces errors. The time saved on debugging drift often outweighs the initial setup cost.
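For a sense of how small a lightweight setup can be, here is a sketch of the two Flux resources needed to sync one path from one repo. It assumes Flux is already bootstrapped into the flux-system namespace and reuses the example repository URL and overlay path from earlier in this article; the intervals are arbitrary.

```yaml
# Tells Flux where the desired state lives.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: myapp
  namespace: flux-system
spec:
  interval: 1m               # how often to check Git for new commits
  url: https://github.com/example/myapp.git
  ref:
    branch: main
---
# Tells Flux which path to render and apply to the cluster.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: myapp-production
  namespace: flux-system
spec:
  interval: 10m              # how often to re-apply and correct drift
  sourceRef:
    kind: GitRepository
    name: myapp
  path: ./app/overlays/production
  prune: true                # delete resources that were removed from Git
  targetNamespace: production
```

A private repository would additionally need a secretRef on the GitRepository pointing at read-only credentials.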
