Kubernetes

Deploy and scale BroxiAI applications on Kubernetes for enterprise-grade container orchestration

This guide shows how to deploy, scale, and manage BroxiAI applications on Kubernetes, using automatic scaling, self-healing, and service discovery for production workloads.

Overview

Kubernetes provides:

  • Automatic scaling and load balancing

  • Self-healing and fault tolerance

  • Service discovery and networking

  • Rolling updates and rollbacks

  • Secret and configuration management

  • Multi-cloud and hybrid deployments

Basic Kubernetes Setup

Namespace and Resources

Namespace Configuration

# namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: broxi-ai
  labels:
    name: broxi-ai
    environment: production
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: broxi-quota
  namespace: broxi-ai
spec:
  hard:
    requests.cpu: "10"
    requests.memory: 20Gi
    limits.cpu: "20"
    limits.memory: 40Gi
    persistentvolumeclaims: "5"
    pods: "50"
---
apiVersion: v1
kind: LimitRange
metadata:
  name: broxi-limits
  namespace: broxi-ai
spec:
  limits:
  - default:
      cpu: "500m"
      memory: "512Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    type: Container

ConfigMaps and Secrets

Configuration Management

# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: broxi-config
  namespace: broxi-ai
data:
  # Application configuration
  APP_ENV: "production"
  LOG_LEVEL: "info"
  MAX_WORKERS: "4"
  REQUEST_TIMEOUT: "30"
  
  # Database configuration
  DB_HOST: "postgres-service"
  DB_PORT: "5432"
  DB_NAME: "broxi_production"
  
  # Redis configuration
  REDIS_HOST: "redis-service"
  REDIS_PORT: "6379"
  REDIS_DB: "0"
  
  # BroxiAI configuration
  BROXI_API_URL: "https://api.broxi.ai/v1"
  BROXI_TIMEOUT: "60"
  BROXI_RETRY_ATTEMPTS: "3"

---
apiVersion: v1
kind: Secret
metadata:
  name: broxi-secrets
  namespace: broxi-ai
type: Opaque
data:
  # Base64 encoded secrets
  BROXI_API_TOKEN: <base64-encoded-token>
  DB_PASSWORD: <base64-encoded-password>
  REDIS_PASSWORD: <base64-encoded-password>
  SECRET_KEY: <base64-encoded-secret>
  JWT_SECRET: <base64-encoded-jwt-secret>

---
apiVersion: v1
kind: Secret
metadata:
  name: broxi-tls
  namespace: broxi-ai
type: kubernetes.io/tls
data:
  tls.crt: <base64-encoded-certificate>
  tls.key: <base64-encoded-private-key>
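
The `data` field above expects values that are already base64-encoded. As a minimal sketch of an alternative, the same Secret can be written with `stringData`, which accepts plain text and lets the API server do the encoding. The values below are placeholders, not real credentials, and the file name is hypothetical.

```yaml
# secret-stringdata.yaml (illustrative; placeholder values only)
apiVersion: v1
kind: Secret
metadata:
  name: broxi-secrets
  namespace: broxi-ai
type: Opaque
stringData:
  # Plain-text values; Kubernetes stores them base64-encoded under `data`
  BROXI_API_TOKEN: "replace-with-api-token"
  DB_PASSWORD: "replace-with-db-password"
  REDIS_PASSWORD: "replace-with-redis-password"
  SECRET_KEY: "replace-with-secret-key"
  JWT_SECRET: "replace-with-jwt-secret"
```

Either form keeps credentials in a manifest file, so avoid committing real values to version control.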

Application Deployment

Main Application

Deployment Configuration

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: broxi-app
  namespace: broxi-ai
  labels:
    app: broxi-app
    version: v1.0.0
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: broxi-app
  template:
    metadata:
      labels:
        app: broxi-app
        version: v1.0.0
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "8000"
        prometheus.io/path: "/metrics"
    spec:
      serviceAccountName: broxi-service-account
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
        runAsGroup: 1001
        fsGroup: 1001
      containers:
      - name: broxi-app
        image: broxi/app:v1.0.0
        imagePullPolicy: Always
        ports:
        - containerPort: 8000
          name: http
          protocol: TCP
        env:
        - name: PORT
          value: "8000"
        envFrom:
        - configMapRef:
            name: broxi-config
        - secretRef:
            name: broxi-secrets
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi
        livenessProbe:
          httpGet:
            path: /health
            port: http
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: http
          initialDelaySeconds: 5
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 2
        startupProbe:
          httpGet:
            path: /startup
            port: http
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 10
        volumeMounts:
        - name: app-logs
          mountPath: /app/logs
        - name: temp-storage
          mountPath: /tmp
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
      volumes:
      - name: app-logs
        emptyDir: {}
      - name: temp-storage
        emptyDir: {}
      terminationGracePeriodSeconds: 30
      restartPolicy: Always

Service Configuration

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: broxi-app-service
  namespace: broxi-ai
  labels:
    app: broxi-app
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
spec:
  type: LoadBalancer
  selector:
    app: broxi-app
  ports:
  - name: http
    port: 80
    targetPort: 8000
    protocol: TCP
  sessionAffinity: None

---
apiVersion: v1
kind: Service
metadata:
  name: broxi-app-internal
  namespace: broxi-ai
  labels:
    app: broxi-app
spec:
  type: ClusterIP
  selector:
    app: broxi-app
  ports:
  - name: http
    port: 8000
    targetPort: 8000
    protocol: TCP

Database Deployment

PostgreSQL StatefulSet

# postgres.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
  namespace: broxi-ai
  labels:
    app: postgres
spec:
  serviceName: postgres-headless
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15-alpine
        ports:
        - containerPort: 5432
          name: postgres
        env:
        - name: POSTGRES_DB
          valueFrom:
            configMapKeyRef:
              name: broxi-config
              key: DB_NAME
        - name: POSTGRES_USER
          value: "broxi"
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: broxi-secrets
              key: DB_PASSWORD
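        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        # Point PostgreSQL at the mounted config file; without this flag the
        # file mounted at /etc/postgresql/postgresql.conf is never read.
        # The file must set listen_addresses = '*' so other pods can connect.
        args:
        - -c
        - config_file=/etc/postgresql/postgresql.conf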
        resources:
          requests:
            cpu: 200m
            memory: 512Mi
          limits:
            cpu: 1000m
            memory: 2Gi
        volumeMounts:
        - name: postgres-storage
          mountPath: /var/lib/postgresql/data
        - name: postgres-config
          mountPath: /etc/postgresql/postgresql.conf
          subPath: postgresql.conf
        livenessProbe:
          exec:
            command:
            - pg_isready
            - -U
            - broxi
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - pg_isready
            - -U
            - broxi
          initialDelaySeconds: 5
          periodSeconds: 5
      volumes:
      - name: postgres-config
        configMap:
          name: postgres-config
  volumeClaimTemplates:
  - metadata:
      name: postgres-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: "fast-ssd"
      resources:
        requests:
          storage: 50Gi

---
apiVersion: v1
kind: Service
metadata:
  name: postgres-service
  namespace: broxi-ai
  labels:
    app: postgres
spec:
  type: ClusterIP
  selector:
    app: postgres
  ports:
  - port: 5432
    targetPort: 5432

---
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
  namespace: broxi-ai
  labels:
    app: postgres
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app: postgres
  ports:
  - port: 5432
    targetPort: 5432
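
The StatefulSet mounts a ConfigMap named `postgres-config`, which this page does not define. The following is a minimal sketch of what it might contain; the tuning values are illustrative assumptions and should be sized for your workload. PostgreSQL reads this file only when the server is started with `-c config_file=/etc/postgresql/postgresql.conf`.

```yaml
# postgres-config.yaml (illustrative values)
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config
  namespace: broxi-ai
data:
  postgresql.conf: |
    # Accept connections from other pods, not only localhost
    listen_addresses = '*'
    max_connections = 100
    # Roughly a quarter of the 2Gi container memory limit
    shared_buffers = 512MB
    # Log statements that run longer than one second
    log_min_duration_statement = 1000
```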

Redis Deployment

# redis.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
  namespace: broxi-ai
  labels:
    app: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:7-alpine
        command:
        - redis-server
        - /etc/redis/redis.conf
        ports:
        - containerPort: 6379
          name: redis
        env:
        - name: REDIS_PASSWORD
          valueFrom:
            secretKeyRef:
              name: broxi-secrets
              key: REDIS_PASSWORD
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 200m
            memory: 256Mi
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis
        - name: redis-data
          mountPath: /data
        livenessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          exec:
            command:
            - redis-cli
            - ping
          initialDelaySeconds: 5
          periodSeconds: 5
      volumes:
      - name: redis-config
        configMap:
          name: redis-config
      - name: redis-data
        persistentVolumeClaim:
          claimName: redis-pvc

---
apiVersion: v1
kind: Service
metadata:
  name: redis-service
  namespace: broxi-ai
  labels:
    app: redis
spec:
  type: ClusterIP
  selector:
    app: redis
  ports:
  - port: 6379
    targetPort: 6379

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-pvc
  namespace: broxi-ai
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: "fast-ssd"
  resources:
    requests:
      storage: 10Gi
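
The Redis Deployment starts the server from `/etc/redis/redis.conf`, supplied by a ConfigMap named `redis-config` that this page does not define. A minimal sketch follows, with illustrative settings chosen to fit the 256Mi memory limit above. Note that `redis.conf` does not expand environment variables, so the `REDIS_PASSWORD` variable is not applied by this file; how the password reaches the server is left as an assumption to resolve for your setup.

```yaml
# redis-config.yaml (illustrative values)
apiVersion: v1
kind: ConfigMap
metadata:
  name: redis-config
  namespace: broxi-ai
data:
  redis.conf: |
    bind 0.0.0.0
    port 6379
    # Persist to the volume mounted at /data
    dir /data
    appendonly yes
    # Stay below the container memory limit and evict least recently used keys
    maxmemory 200mb
    maxmemory-policy allkeys-lru
```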

Auto-scaling Configuration

Horizontal Pod Autoscaler

HPA Configuration

# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: broxi-app-hpa
  namespace: broxi-ai
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: broxi-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
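  # Requires a custom metrics adapter (for example, Prometheus Adapter)
  # that exposes http_requests_per_second through the custom metrics API.
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "1000"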
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60
      selectPolicy: Min

Vertical Pod Autoscaler

VPA Configuration

# vpa.yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: broxi-app-vpa
  namespace: broxi-ai
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: broxi-app
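  updatePolicy:
    # "Off" produces recommendations only. Do not run "Auto" alongside an HPA
    # that scales the same Deployment on CPU or memory; the two controllers conflict.
    updateMode: "Off"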
  resourcePolicy:
    containerPolicies:
    - containerName: broxi-app
      maxAllowed:
        cpu: 1
        memory: 2Gi
      minAllowed:
        cpu: 100m
        memory: 128Mi
      mode: Auto

Cluster Autoscaler

Node Scaling Configuration

# cluster-autoscaler.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system
  labels:
    app: cluster-autoscaler
spec:
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '8085'
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
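      # k8s.gcr.io is frozen; images are served from registry.k8s.io.
      # Use the cluster-autoscaler release that matches your Kubernetes minor version.
      - image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.21.0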
        name: cluster-autoscaler
        resources:
          limits:
            cpu: 100m
            memory: 300Mi
          requests:
            cpu: 100m
            memory: 300Mi
        command:
        - ./cluster-autoscaler
        - --v=4
        - --stderrthreshold=info
        - --cloud-provider=aws
        - --skip-nodes-with-local-storage=false
        - --expander=least-waste
        - --node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled,k8s.io/cluster-autoscaler/broxi-cluster
        - --balance-similar-node-groups
        - --scale-down-enabled=true
        - --scale-down-delay-after-add=10m
        - --scale-down-unneeded-time=10m
        - --scale-down-utilization-threshold=0.5
        env:
        - name: AWS_REGION
          value: us-west-2
        volumeMounts:
        - name: ssl-certs
          mountPath: /etc/ssl/certs/ca-certificates.crt
          readOnly: true
        imagePullPolicy: "Always"
      volumes:
      - name: ssl-certs
        hostPath:
          path: "/etc/ssl/certs/ca-bundle.crt"

Advanced Kubernetes Features

Jobs and CronJobs

Background Processing Jobs

# jobs.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: broxi-data-migration
  namespace: broxi-ai
spec:
  backoffLimit: 3
  completions: 1
  parallelism: 1
  template:
    metadata:
      labels:
        app: broxi-migration
    spec:
      restartPolicy: Never
      containers:
      - name: migration
        image: broxi/app:v1.0.0
        command: ["python", "manage.py", "migrate"]
        envFrom:
        - configMapRef:
            name: broxi-config
        - secretRef:
            name: broxi-secrets
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: 500m
            memory: 512Mi

---
apiVersion: batch/v1
kind: CronJob
metadata:
  name: broxi-cleanup
  namespace: broxi-ai
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: cleanup
            image: broxi/app:v1.0.0
            command: ["python", "scripts/cleanup.py"]
            envFrom:
            - configMapRef:
                name: broxi-config
            - secretRef:
                name: broxi-secrets
            resources:
              requests:
                cpu: 100m
                memory: 128Mi
              limits:
                cpu: 200m
                memory: 256Mi
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1

Network Policies

Security Network Policies

# network-policies.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: broxi-app-policy
  namespace: broxi-ai
spec:
  podSelector:
    matchLabels:
      app: broxi-app
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          # Label set automatically on every namespace (Kubernetes 1.22+)
          kubernetes.io/metadata.name: ingress-nginx
    - podSelector:
        matchLabels:
          app: broxi-app
    ports:
    - protocol: TCP
      port: 8000
  egress:
  - to:
    - podSelector:
        matchLabels:
          app: postgres
    ports:
    - protocol: TCP
      port: 5432
  - to:
    - podSelector:
        matchLabels:
          app: redis
    ports:
    - protocol: TCP
      port: 6379
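  - to: []  # Allow external API calls
    ports:
    - protocol: TCP
      port: 443
    - protocol: TCP
      port: 80
  # Allow DNS lookups; without this rule the pod cannot resolve
  # postgres-service or redis-service once egress is restricted.
  - ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53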

---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: postgres-policy
  namespace: broxi-ai
spec:
  podSelector:
    matchLabels:
      app: postgres
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: broxi-app
    ports:
    - protocol: TCP
      port: 5432
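
The policies above restrict only the pods they select; any other pod in the namespace still accepts all traffic. A common complement, sketched here as an assumption rather than part of the original setup, is a namespace-wide default-deny ingress policy. Because Redis has no policy of its own on this page, the sketch also adds a hypothetical `redis-policy` so the application can still reach it.

```yaml
# default-deny.yaml (illustrative)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: broxi-ai
spec:
  # An empty selector matches every pod in the namespace
  podSelector: {}
  policyTypes:
  - Ingress

---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: redis-policy
  namespace: broxi-ai
spec:
  podSelector:
    matchLabels:
      app: redis
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: broxi-app
    ports:
    - protocol: TCP
      port: 6379
```

Network policies are additive, so the allow rules for `broxi-app`, `postgres`, and `redis` continue to apply on top of the default deny.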

Pod Disruption Budgets

PDB Configuration

# pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: broxi-app-pdb
  namespace: broxi-ai
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: broxi-app

---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: postgres-pdb
  namespace: broxi-ai
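spec:
  # With a single replica, maxUnavailable: 0 blocks voluntary evictions,
  # so node drains will wait until this budget is relaxed or removed.
  maxUnavailable: 0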
  selector:
    matchLabels:
      app: postgres

Ingress and Load Balancing

Ingress Configuration

NGINX Ingress

# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: broxi-ingress
  namespace: broxi-ai
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: "50m"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "60"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "60"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
    # Limit each client IP to 100 requests per minute
    nginx.ingress.kubernetes.io/limit-rpm: "100"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    # Snippet annotations must be enabled on the controller (allow-snippet-annotations)
    nginx.ingress.kubernetes.io/configuration-snippet: |
      more_set_headers "X-Frame-Options: DENY";
      more_set_headers "X-Content-Type-Options: nosniff";
      more_set_headers "X-XSS-Protection: 1; mode=block";
spec:
  # Replaces the deprecated kubernetes.io/ingress.class annotation
  ingressClassName: nginx
  tls:
  - hosts:
    - api.broxi-app.com
    - app.broxi-app.com
    secretName: broxi-tls
  rules:
  - host: api.broxi-app.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: broxi-app-service
            port:
              number: 80
  - host: app.broxi-app.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: broxi-frontend-service
            port:
              number: 80

---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: broxi-websocket-ingress
  namespace: broxi-ai
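  annotations:
    nginx.ingress.kubernetes.io/proxy-http-version: "1.1"
    # ingress-nginx proxies WebSocket upgrades without extra header settings;
    # raise the timeouts so idle connections are not closed after 60 seconds.
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  # Replaces the deprecated kubernetes.io/ingress.class annotation
  ingressClassName: nginx
  rules: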
  - host: ws.broxi-app.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: broxi-websocket-service
            port:
              number: 8080

Service Mesh (Istio)

Istio Configuration

# istio-config.yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: broxi-gateway
  namespace: broxi-ai
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: broxi-tls
    hosts:
    - api.broxi-app.com

---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: broxi-vs
  namespace: broxi-ai
spec:
  hosts:
  - api.broxi-app.com
  gateways:
  - broxi-gateway
  http:
  - match:
    - uri:
        prefix: "/v1/"
    route:
    - destination:
        host: broxi-app-service
        port:
          number: 80  # Service port; the Service forwards to container port 8000
    timeout: 60s
    retries:
      attempts: 3
      perTryTimeout: 20s

---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: broxi-dr
  namespace: broxi-ai
spec:
  host: broxi-app-service
  trafficPolicy:
    loadBalancer:
      simple: LEAST_REQUEST  # LEAST_CONN is deprecated
    connectionPool:
      tcp:
        maxConnections: 50
      http:
        http1MaxPendingRequests: 10
        maxRequestsPerConnection: 2
    # Circuit breaking is configured through outlierDetection;
    # DestinationRule has no circuitBreaker field.
    outlierDetection:
      consecutive5xxErrors: 3
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
  subsets:
  - name: v1
    labels:
      version: v1.0.0
    trafficPolicy:
      portLevelSettings:
      - port:
          number: 80  # Service port of broxi-app-service
        loadBalancer:
          simple: ROUND_ROBIN

Monitoring and Observability

Prometheus Monitoring

ServiceMonitor Configuration

# monitoring.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: broxi-app-monitor
  namespace: broxi-ai
  labels:
    app: broxi-app
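spec:
  # Use the Service's `app` label as the Prometheus job name so that
  # job="broxi-app" in the alert rules below matches these targets.
  jobLabel: app
  selector:
    matchLabels:
      app: broxi-app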
  endpoints:
  - port: http
    path: /metrics
    interval: 30s
    scrapeTimeout: 10s

---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: broxi-alerts
  namespace: broxi-ai
spec:
  groups:
  - name: broxi.rules
    rules:
    - alert: BroxiAppDown
      expr: up{job="broxi-app"} == 0
      for: 5m
      labels:
        severity: critical
      annotations:
        summary: "BroxiAI application is down"
        description: "BroxiAI application has been down for more than 5 minutes"
    
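    - alert: HighCPUUsage
      # CPU usage as a fraction of the container's CPU limit (requires kube-state-metrics)
      expr: |
        sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="broxi-ai", pod=~"broxi-app-.*", container="broxi-app"}[5m]))
          / sum by (pod) (kube_pod_container_resource_limits{namespace="broxi-ai", pod=~"broxi-app-.*", container="broxi-app", resource="cpu"}) > 0.8
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "High CPU usage detected"
        description: "CPU usage is above 80% of the container limit for more than 10 minutes"
    
    - alert: HighMemoryUsage
      # Filter on the app container so pod-level series with no limit are excluded
      expr: |
        container_memory_usage_bytes{namespace="broxi-ai", pod=~"broxi-app-.*", container="broxi-app"}
          / container_spec_memory_limit_bytes{namespace="broxi-ai", pod=~"broxi-app-.*", container="broxi-app"} > 0.9
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High memory usage detected"
        description: "Memory usage is above 90% of the container limit for more than 5 minutes"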
    
    - alert: PodCrashLooping
      expr: rate(kube_pod_container_status_restarts_total[15m]) > 0
      for: 0m
      labels:
        severity: critical
      annotations:
        summary: "Pod is crash looping"
        description: "Pod {{ $labels.pod }} is restarting frequently"

Distributed Tracing

Jaeger Configuration

# jaeger.yaml
apiVersion: jaegertracing.io/v1
kind: Jaeger
metadata:
  name: broxi-jaeger
  namespace: broxi-ai
spec:
  strategy: production
  storage:
    type: elasticsearch
    elasticsearch:
      nodeCount: 3
      redundancyPolicy: SingleRedundancy
      resources:
        requests:
          cpu: 200m
          memory: 1Gi
        limits:
          cpu: 1
          memory: 2Gi
  ingress:
    enabled: true
    annotations:
      kubernetes.io/ingress.class: nginx
    hosts:
    - jaeger.broxi-app.com

Security and RBAC

Service Accounts and RBAC

RBAC Configuration

# rbac.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: broxi-service-account
  namespace: broxi-ai

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: broxi-ai
  name: broxi-role
rules:
- apiGroups: [""]
  resources: ["configmaps", "secrets"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: broxi-role-binding
  namespace: broxi-ai
subjects:
- kind: ServiceAccount
  name: broxi-service-account
  namespace: broxi-ai
roleRef:
  kind: Role
  name: broxi-role
  apiGroup: rbac.authorization.k8s.io

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: broxi-cluster-role
rules:
- apiGroups: [""]
  resources: ["nodes", "namespaces"]
  verbs: ["get", "list", "watch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: broxi-cluster-role-binding
subjects:
- kind: ServiceAccount
  name: broxi-service-account
  namespace: broxi-ai
roleRef:
  kind: ClusterRole
  name: broxi-cluster-role
  apiGroup: rbac.authorization.k8s.io

Pod Security Standards

Pod Security Policy

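# security-policy.yaml
# PodSecurityPolicy was removed in Kubernetes 1.25. Apply this manifest only on
# clusters running 1.24 or earlier; newer clusters use Pod Security Admission.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy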
metadata:
  name: broxi-psp
spec:
  privileged: false
  allowPrivilegeEscalation: false
  requiredDropCapabilities:
    - ALL
  volumes:
    - 'configMap'
    - 'emptyDir'
    - 'projected'
    - 'secret'
    - 'downwardAPI'
    - 'persistentVolumeClaim'
  runAsUser:
    rule: 'MustRunAsNonRoot'
  seLinux:
    rule: 'RunAsAny'
  fsGroup:
    rule: 'RunAsAny'
  readOnlyRootFilesystem: true

---
apiVersion: v1
kind: Pod
metadata:
  name: broxi-secure-pod
  namespace: broxi-ai
  # The seccomp profile is set through securityContext.seccompProfile below;
  # the older seccomp annotation is deprecated and no longer needed.
spec:
  serviceAccountName: broxi-service-account
  securityContext:
    runAsNonRoot: true
    runAsUser: 1001
    runAsGroup: 1001
    fsGroup: 1001
    seccompProfile:
      type: RuntimeDefault
  containers:
  - name: broxi-app
    image: broxi/app:v1.0.0
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      readOnlyRootFilesystem: true
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 500m
        memory: 512Mi
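
On Kubernetes 1.25 and later, Pod Security Standards are enforced per namespace through Pod Security Admission labels rather than a PodSecurityPolicy object. The sketch below assumes the `broxi-ai` namespace from the start of this page. It enforces the `baseline` profile and only warns and audits against `restricted`, because the PostgreSQL and Redis manifests on this page do not set every field the `restricted` profile requires.

```yaml
# namespace-pod-security.yaml (illustrative)
apiVersion: v1
kind: Namespace
metadata:
  name: broxi-ai
  labels:
    name: broxi-ai
    environment: production
    # Reject pods that violate the baseline profile
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: latest
    # Report, but do not block, pods that fall short of the restricted profile
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```

Once every workload in the namespace passes the `restricted` checks without warnings, the `enforce` label can be raised to `restricted`.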

GitOps and CI/CD

ArgoCD Application

ArgoCD Configuration

# argocd-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: broxi-app
  namespace: argocd
  finalizers:
  - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/your-org/broxi-k8s-manifests
    targetRevision: main
    path: overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: broxi-ai
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
      allowEmpty: false
    syncOptions:
    - CreateNamespace=true
    - PrunePropagationPolicy=foreground
    - PruneLast=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
  revisionHistoryLimit: 10
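
The Application above reads manifests from `overlays/production`, which suggests a Kustomize layout. As a rough sketch, assuming a hypothetical `base` directory that holds the manifests from this page, the overlay's `kustomization.yaml` might look like this:

```yaml
# overlays/production/kustomization.yaml (hypothetical repository layout)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: broxi-ai
resources:
# Shared manifests (Deployment, Services, ConfigMap, and so on)
- ../../base
images:
# Pin the image tag deployed to production
- name: broxi/app
  newTag: v1.0.0
```

With automated sync enabled, changing `newTag` in the repository is enough for ArgoCD to roll out the new image.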

Helm Charts

Helm Chart Structure

# Chart.yaml
apiVersion: v2
name: broxi-app
description: BroxiAI application Helm chart
type: application
version: 1.0.0
appVersion: "v1.0.0"

dependencies:
- name: postgresql
  version: 12.1.2
  repository: https://charts.bitnami.com/bitnami
  condition: postgresql.enabled
- name: redis
  version: 17.3.7
  repository: https://charts.bitnami.com/bitnami
  condition: redis.enabled

---
# values.yaml
replicaCount: 3

image:
  repository: broxi/app
  tag: v1.0.0
  pullPolicy: Always

service:
  type: LoadBalancer
  port: 80
  targetPort: 8000

ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
  - host: api.broxi-app.com
    paths:
    - path: /
      pathType: Prefix
  tls:
  - secretName: broxi-tls
    hosts:
    - api.broxi-app.com

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 20
  targetCPUUtilizationPercentage: 70
  targetMemoryUtilizationPercentage: 80

resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 200m
    memory: 256Mi

postgresql:
  enabled: true
  auth:
    postgresPassword: "changeme"
    database: "broxi_production"
  primary:
    persistence:
      enabled: true
      size: 50Gi

redis:
  enabled: true
  auth:
    enabled: true
    password: "changeme"
  master:
    persistence:
      enabled: true
      size: 10Gi

nodeSelector: {}
tolerations: []
affinity: {}
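
The `changeme` passwords above are placeholders. One way to keep real passwords out of `values.yaml`, sketched here under the assumption that the pinned Bitnami chart versions support the `auth.existingSecret` options, is to point both subcharts at the `broxi-secrets` Secret defined earlier. The key names should be verified against the chart documentation for the versions in `Chart.yaml`.

```yaml
# values-production.yaml (illustrative override; verify keys against the chart versions)
postgresql:
  auth:
    # Read the admin password from an existing Secret instead of postgresPassword
    existingSecret: broxi-secrets
    secretKeys:
      adminPasswordKey: DB_PASSWORD

redis:
  auth:
    enabled: true
    # Read the Redis password from the same Secret
    existingSecret: broxi-secrets
    existingSecretPasswordKey: REDIS_PASSWORD
```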

Disaster Recovery

Backup and Restore

Velero Backup Configuration

# backup.yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: broxi-daily-backup
  namespace: velero
spec:
  schedule: "0 1 * * *"  # Daily at 1 AM
  template:
    includedNamespaces:
    - broxi-ai
    excludedResources:
    - events
    - events.events.k8s.io
    snapshotVolumes: true
    ttl: 720h  # 30 days
    storageLocation: default
    volumeSnapshotLocations:
    - default

---
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: broxi-pre-deployment-backup
  namespace: velero
spec:
  includedNamespaces:
  - broxi-ai
  snapshotVolumes: true
  storageLocation: default

Database Backup CronJob

# db-backup.yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: postgres-backup
  namespace: broxi-ai
spec:
  schedule: "0 2 * * *"  # Daily at 2 AM
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
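          - name: postgres-backup
            # NOTE: postgres:15-alpine does not include the AWS CLI. Use an image
            # that provides both pg_dump and aws for the upload step to work.
            image: postgres:15-alpine
            command:
            - /bin/sh
            - -c
            - |
              # Stop on the first failed step so a bad dump is never uploaded
              set -e

              # Compute the file name once so every step uses the same path
              BACKUP_FILE="/backup/backup-$(date +%Y%m%d_%H%M%S).sql"

              PGPASSWORD=$POSTGRES_PASSWORD pg_dump \
                -h $POSTGRES_HOST \
                -U $POSTGRES_USER \
                -d $POSTGRES_DB \
                -f "$BACKUP_FILE"

              # Upload to S3
              aws s3 cp "$BACKUP_FILE" s3://broxi-backups/postgres/

              # Cleanup local backup
              rm "$BACKUP_FILE"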
            env:
            - name: POSTGRES_HOST
              value: postgres-service
            - name: POSTGRES_USER
              value: broxi
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: broxi-secrets
                  key: DB_PASSWORD
            - name: POSTGRES_DB
              value: broxi_production
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: aws-credentials
                  key: access-key-id
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: aws-credentials
                  key: secret-access-key
            volumeMounts:
            - name: backup-storage
              mountPath: /backup
          volumes:
          - name: backup-storage
            emptyDir: {}
  successfulJobsHistoryLimit: 7
  failedJobsHistoryLimit: 3

Best Practices

Resource Management

Resource Optimization

  • Set appropriate resource requests and limits

  • Use horizontal and vertical pod autoscaling

  • Implement cluster autoscaling for node management

  • Monitor resource utilization continuously

  • Use pod disruption budgets for availability

Security

Security Best Practices

  • Run containers as non-root users

  • Use network policies for traffic control

  • Implement RBAC with least privilege

  • Scan container images for vulnerabilities regularly

  • Enforce Pod Security Standards (Pod Security Admission on Kubernetes 1.25+)

  • Encrypt secrets and sensitive data

High Availability

HA Configuration

  • Deploy across multiple availability zones

  • Use pod anti-affinity rules

  • Implement health checks and probes

  • Configure ingress with load balancing

  • Use persistent volumes for stateful data

  • Test backups and disaster recovery procedures regularly
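
The zone-spreading and anti-affinity items above can be sketched as a fragment for `spec.template.spec` of the `broxi-app` Deployment. This is one possible configuration, not the only one; it uses soft rules so pods still schedule when a zone or node is unavailable, and it assumes nodes carry the standard `topology.kubernetes.io/zone` label.

```yaml
# Fragment for spec.template.spec of the broxi-app Deployment (illustrative)
affinity:
  podAntiAffinity:
    # Prefer placing replicas on different nodes
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: broxi-app
        topologyKey: kubernetes.io/hostname
topologySpreadConstraints:
# Keep the replica count per availability zone within one of each other
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: ScheduleAnyway
  labelSelector:
    matchLabels:
      app: broxi-app
```

Switching `whenUnsatisfiable` to `DoNotSchedule` makes the zone spread a hard requirement, at the cost of pending pods when a zone is full.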

Troubleshooting

Common Issues

Pod Issues

# Check pod status
kubectl get pods -n broxi-ai

# Describe pod for events
kubectl describe pod <pod-name> -n broxi-ai

# Check pod logs
kubectl logs <pod-name> -n broxi-ai

# Debug running pod
kubectl exec -it <pod-name> -n broxi-ai -- /bin/sh

Service Discovery Issues

# Check services
kubectl get svc -n broxi-ai

# Test service connectivity
kubectl run debug --image=busybox -it --rm --restart=Never -- \
  nslookup broxi-app-service.broxi-ai.svc.cluster.local

# Check endpoints
kubectl get endpoints -n broxi-ai

Networking Issues

# Check network policies
kubectl get networkpolicy -n broxi-ai

# Test pod-to-pod connectivity
kubectl exec -it <pod1> -n broxi-ai -- \
  wget -qO- <pod2-ip>:8000/health

Next Steps

After Kubernetes deployment:

  1. Monitoring Enhancement: Implement comprehensive observability

  2. Security Hardening: Regular security assessments and updates

  3. Performance Tuning: Optimize for your specific workload

  4. Disaster Recovery: Test backup and recovery procedures

  5. Cost Optimization: Monitor and optimize resource usage

