Reference Architectures

This section documents reference architectures I've built and deployed for production Kubernetes platforms. These are not theoretical examples. They are adaptable starting points shaped by real constraints, trade-offs, and organizational context.

Each architecture applies the best practices laid out earlier in the handbook. Think of them as blueprints that show how design principles translate into concrete topology, tooling choices, and delivery flows.

Common Components

Every architecture includes these foundational elements.

  • Infrastructure provisioning
    • IAM roles for service accounts and provider access
    • Cluster provisioning (EKS, AKS, GKE, or self-managed)
    • Network setup (VPCs, subnets, NAT, private endpoints)
    • Storage and observability backend resources
  • GitOps delivery
    • Automated cluster onboarding
    • App-of-apps pattern for platform and workload apps
  • Crossplane for app team self-service infrastructure
    • Compositions for common resources (Postgres, S3, queues, cache)
    • Claims submitted as Kubernetes manifests (see the example after this list)
  • Policy enforcement with Kyverno
  • Secrets management (External Secrets Operator, Sealed Secrets, or a cloud provider's native secrets service)
  • Observability stack (Prometheus, Loki, Grafana, Alertmanager)
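
To make the self-service flow concrete, here is a minimal sketch of a Crossplane claim an app team could submit through GitOps. The `PostgreSQLInstance` kind, the `platform.example.org` API group, and the parameter names are hypothetical; the real schema comes from whatever XRDs and Compositions your platform team publishes.

```yaml
# Hypothetical claim: the kind, group, and parameters are defined by the
# platform team's CompositeResourceDefinition and Composition (not shown here).
apiVersion: platform.example.org/v1alpha1
kind: PostgreSQLInstance
metadata:
  name: orders-db
  namespace: team-orders
spec:
  parameters:
    storageGB: 20
    version: "15"
  compositionSelector:
    matchLabels:
      provider: aws            # selects the AWS Composition for this claim
  writeConnectionSecretToRef:
    name: orders-db-conn       # connection details land in this namespace-local Secret
```

The app team merges this manifest into their repo, Argo CD syncs it, and Crossplane reconciles the underlying cloud resources behind the abstraction.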

Centralized vs Decentralized

The real difference between the two patterns is where the control plane components run (Argo CD, Crossplane, policy engines).

Centralized runs them in a dedicated management cluster or account. Workload clusters stay lean and declarative. The platform team owns the single source of truth and can enforce consistent guardrails.
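
As a sketch of what centralized looks like in practice, a single Argo CD instance in the management cluster can fan the platform baseline out to every registered workload cluster with an ApplicationSet. The repo URL, path, labels, and namespaces below are placeholders, and the sketch assumes workload clusters are already registered with Argo CD:

```yaml
# Sketch: one Application per registered workload cluster matching the label selector.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: platform-baseline
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            env: production          # only clusters labeled for production
  template:
    metadata:
      name: 'platform-{{name}}'      # cluster name supplied by the generator
    spec:
      project: platform
      source:
        repoURL: https://github.com/example/platform-config
        targetRevision: main
        path: baseline
      destination:
        server: '{{server}}'         # the workload cluster's API endpoint
        namespace: platform-system
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

With the cluster generator, onboarding a new cluster is just registering and labeling it; the baseline lands automatically.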

Decentralized runs them on each workload cluster. Teams get local autonomy and faster iteration, but the platform fragments. You trade central visibility and consistency for flexibility and fault isolation.

Both patterns are valid. The right choice depends on team size, regulatory needs, blast-radius tolerance, and whether you value consistency over autonomy. I lean centralized for production systems because it scales operationally and simplifies compliance. Decentralized makes sense for highly autonomous teams, regulated environments, or edge fleets where isolation is non-negotiable.

Hybrid Approaches

These patterns are not mutually exclusive. You can mix and match components based on which parts of your platform need central control and which benefit from local autonomy.

  • Central Argo CD + Decentralized Crossplane
    • Description: The platform team runs a single Argo CD instance to manage cluster lifecycle and core platform components. Each workload cluster runs its own Crossplane to provision cloud resources locally.
    • Wins: Central delivery visibility and audit. Local resource provisioning flexibility.
    • Tradeoffs: Still need policy enforcement on every cluster. Fragmented infra inventory.
  • Platform Argo CD + Team Argo CD
    • Description: The platform Argo CD deploys shared infra (ingress controllers, cert manager, observability) across all clusters. Teams run their own Argo CD instances for application delivery and experimentation.
    • Wins: Platform controls the base. Teams iterate fast on apps without waiting for central approval.
    • Tradeoffs: Duplicate Argo CD operations. Need clear boundaries on what each instance manages (see the sketch after this list).
  • Central Everything + Edge Decentralized
    • Description: Core production clusters use the centralized pattern (management cluster with Argo CD and Crossplane). Edge or restricted environments (air gapped, regulated) run decentralized.
    • Wins: Optimize for each environment's constraints. Single model for core production.
    • Tradeoffs: Platform team supports two operational models.
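
For the Platform Argo CD + Team Argo CD pattern, the boundary question is typically enforced with Argo CD projects. Below is a minimal sketch of an AppProject the platform team might hand to a team instance; the project name, repo pattern, and namespaces are placeholders.

```yaml
# Sketch: limits what a team's Argo CD instance may deploy, and where.
apiVersion: argoproj.io/v1alpha1
kind: AppProject
metadata:
  name: team-orders
  namespace: argocd
spec:
  description: Application delivery for the orders team only
  sourceRepos:
    - 'https://github.com/example/orders-*'    # only the team's repositories
  destinations:
    - server: https://kubernetes.default.svc
      namespace: 'orders-*'                    # only the team's namespaces
  clusterResourceWhitelist: []                 # no cluster-scoped resources
  namespaceResourceBlacklist:
    - group: ''
      kind: ResourceQuota                      # platform-owned guardrails stay off limits
    - group: ''
      kind: LimitRange
```

Shared infrastructure (ingress, cert manager, observability) stays in the platform Argo CD's projects, so the boundary lives in Git rather than in tribal knowledge.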

When to consider hybrid:

  • When organizational boundaries are clear (platform vs apps, core vs edge).
  • If you need to migrate gradually from decentralized to centralized.
  • When regulatory or latency constraints force local control in some environments.
  • If team maturity varies (centralize for new teams, decentralize for advanced ones).

The key is intentionality. Pick hybrid only when the operational cost of running both patterns is justified by clear business or technical constraints, not as a compromise that defers hard architecture decisions.