Phase 1 - Observability with Prometheus, Grafana & Loki

Series: Kubernetes Homelab on VMware Workstation
Prerequisites: Phase 0 - Argo CD & GitOps complete
Source Code: jmartinez-homelab-gitops

What We’re Building

By the end of this guide, you will have:

  • Prometheus collecting metrics from nodes, pods, and Kubernetes objects
  • Grafana with dashboards for cluster and application monitoring
  • Loki + Promtail for centralized log aggregation
  • All accessible via Traefik ingress at grafana.lab.local

Why Observability

You can’t manage what you can’t measure. In Kubernetes, you need visibility into:

  • Metrics — CPU, memory, network, request rates, error rates
  • Logs — Application output, system events, errors
  • Dashboards — Visual representation of cluster health

Prometheus + Grafana is the de facto standard for Kubernetes monitoring. Loki provides logging without the resource overhead of the ELK stack — ideal for a homelab.

Before You Start

Verify Phase 0 is complete:

# Argo CD running
kubectl get pods -n argocd
 
# Online Boutique deployed and synced
kubectl get application -n argocd
# NAME              SYNC STATUS   HEALTH STATUS
# online-boutique   Synced        Healthy

Step 0: Resize VMs

The monitoring stack requires more resources than the default VM allocations. If you’re running VMware Workstation on a laptop with 40 GB of RAM, allocate:

VM           Memory   Rationale
k3s-server   8 GB     Control plane + Argo CD + scheduling
k3s-agent-1  4 GB     Monitoring stack (Prometheus, Grafana)
k3s-agent-2  4 GB     Application workloads
k3s-agent-3  4 GB     Stateful workloads + overflow
Host         ~20 GB   VMware + OS overhead

Total VM allocation: 20 GB, leaving ~20 GB for the host OS and VMware overhead.

How to Resize

  1. Shut down the VM: sudo shutdown -h now
  2. VMware Workstation → right-click VM → Settings → Hardware → Memory
  3. Adjust to recommended value
  4. Start the VM
  5. Verify: kubectl describe node <node-name> | grep -A 5 Capacity

Step 1: Install Helm

Helm is the package manager for Kubernetes — similar to apt or brew, but for cluster applications.

curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version

Step 2: Add Helm Repos

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

Step 3: Deploy Prometheus + Grafana

The kube-prometheus-stack Helm chart bundles Prometheus, Grafana, Alertmanager, kube-state-metrics, and node-exporter in a single install.

kubectl create namespace monitoring
 
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --set prometheus.prometheusSpec.resources.requests.memory=512Mi \
  --set prometheus.prometheusSpec.resources.limits.memory=1Gi \
  --set prometheus.prometheusSpec.resources.requests.cpu=200m \
  --set prometheus.prometheusSpec.resources.limits.cpu=500m \
  --set prometheus.prometheusSpec.retention=7d \
  --set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=5Gi \
  --set grafana.resources.requests.memory=128Mi \
  --set grafana.resources.limits.memory=256Mi \
  --set grafana.resources.requests.cpu=100m \
  --set alertmanager.alertmanagerSpec.resources.requests.memory=128Mi \
  --set alertmanager.alertmanagerSpec.resources.limits.memory=256Mi \
  --set kubeStateMetrics.resources.requests.memory=64Mi \
  --set prometheus-node-exporter.resources.requests.memory=32Mi

Resource limits are tuned for a homelab with ~18 GB total cluster RAM. Adjust if your setup differs.
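If the long chain of --set flags gets unwieldy, the same tuning can live in a values file instead. A sketch — the keys mirror the flags above, and monitoring-values.yaml is just an example file name:

```shell
# Capture the same overrides in a values file (monitoring-values.yaml is an arbitrary name)
cat > monitoring-values.yaml <<'EOF'
prometheus:
  prometheusSpec:
    retention: 7d
    resources:
      requests: {memory: 512Mi, cpu: 200m}
      limits: {memory: 1Gi, cpu: 500m}
    storageSpec:
      volumeClaimTemplate:
        spec:
          resources:
            requests:
              storage: 5Gi
grafana:
  resources:
    requests: {memory: 128Mi, cpu: 100m}
    limits: {memory: 256Mi}
alertmanager:
  alertmanagerSpec:
    resources:
      requests: {memory: 128Mi}
      limits: {memory: 256Mi}
kubeStateMetrics:
  resources:
    requests: {memory: 64Mi}
prometheus-node-exporter:
  resources:
    requests: {memory: 32Mi}
EOF

# Then install (or later upgrade) with:
#   helm upgrade --install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
#     --namespace monitoring -f monitoring-values.yaml
```

A values file also versions cleanly in Git, which pays off when you migrate the stack to Argo CD later.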

What this deploys:

Component           Purpose
Prometheus          Scrapes and stores time-series metrics
Grafana             Dashboarding and visualization
Alertmanager        Routes alerts to notification channels
kube-state-metrics  Exposes Kubernetes object states as metrics
node-exporter       DaemonSet collecting hardware/OS metrics from each node

Verify:

kubectl get pods -n monitoring
# All pods should be Running

Step 4: Access Grafana

Get the Admin Password

kubectl get secret -n monitoring kube-prometheus-stack-grafana \
  -o jsonpath="{.data.admin-password}" | base64 -d; echo

Default username: admin

Option A: Port-forward (quick)

kubectl port-forward -n monitoring svc/kube-prometheus-stack-grafana 3000:80
# Open http://localhost:3000

Option B: Ingress (permanent)

kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: monitoring
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: grafana.lab.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kube-prometheus-stack-grafana
                port:
                  number: 80
EOF

Update your /etc/hosts file:

<NODE_IP> boutique.lab.local argocd.lab.local grafana.lab.local

Access at http://grafana.lab.local

Step 5: Deploy Loki + Promtail

Loki is a log aggregation system inspired by Prometheus: it indexes only labels rather than full log content, which keeps it far lighter than the ELK stack. Promtail is the agent that ships logs from each node to Loki.

# Loki — single-binary mode for small clusters
helm install loki grafana/loki \
  --namespace monitoring \
  --set deploymentMode=SingleBinary \
  --set loki.commonConfig.replication_factor=1 \
  --set loki.storage.type=filesystem \
  --set singleBinary.replicas=1 \
  --set singleBinary.resources.requests.memory=256Mi \
  --set singleBinary.resources.limits.memory=512Mi \
  --set singleBinary.resources.requests.cpu=100m \
  --set singleBinary.persistence.size=5Gi \
  --set monitoring.selfMonitoring.grafanaAgent.installOperator=false \
  --set gateway.enabled=false \
  --set read.replicas=0 \
  --set write.replicas=0 \
  --set backend.replicas=0
 
# Promtail — collects and ships logs
helm install promtail grafana/promtail \
  --namespace monitoring \
  --set config.clients[0].url=http://loki.monitoring.svc:3100/loki/api/v1/push \
  --set resources.requests.memory=64Mi \
  --set resources.limits.memory=128Mi

Step 6: Connect Loki to Grafana

kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-datasource
  namespace: monitoring
  labels:
    grafana_datasource: "1"
data:
  loki-datasource.yaml: |
    apiVersion: 1
    datasources:
      - name: Loki
        type: loki
        access: proxy
        url: http://loki.monitoring.svc:3100
        isDefault: false
        editable: true
EOF

Grafana auto-discovers this ConfigMap and adds Loki as a data source.
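Once the data source is in place, you can run LogQL queries from Grafana’s Explore view. The same queries also work against Loki’s HTTP API directly — a sketch, assuming the port-forward from Step 8 (kubectl port-forward -n monitoring svc/loki 3100:3100) is active:

```shell
# List the label names Loki has indexed — a quick sanity check that logs are arriving
curl -s http://localhost:3100/loki/api/v1/labels

# Fetch a few recent log lines from the monitoring namespace (LogQL stream selector)
curl -s -G http://localhost:3100/loki/api/v1/query_range \
  --data-urlencode 'query={namespace="monitoring"}' \
  --data-urlencode 'limit=5'
```

If the labels call returns an empty list, Promtail likely isn’t shipping logs yet — see the troubleshooting table below.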

Step 7: Import Dashboards

In Grafana → Dashboards → Import, enter these community dashboard IDs:

Dashboard                      ID     Description
Kubernetes Cluster Monitoring  315    Node/pod CPU, memory, network overview
Node Exporter Full             1860   Detailed hardware and OS metrics
Kubernetes Pods                6417   Per-pod resource usage
Loki Logs                      13639  Log search and filtering interface

Step 8: Verify

# All monitoring pods running
kubectl get pods -n monitoring
 
# Prometheus scraping targets
kubectl port-forward -n monitoring svc/kube-prometheus-stack-prometheus 9090:9090
# Open http://localhost:9090/targets — all targets should be UP
 
# Loki is ready
kubectl port-forward -n monitoring svc/loki 3100:3100
curl -s http://localhost:3100/ready
 
# Resource usage
kubectl top nodes
kubectl top pods -n monitoring
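The targets page can also be checked from the command line — a sketch, assuming the Prometheus port-forward above is still running and jq is installed on your workstation:

```shell
# Count scrape targets by health — every line should read "up"
curl -s http://localhost:9090/api/v1/targets \
  | jq -r '.data.activeTargets[].health' | sort | uniq -c

# List any targets that are NOT up, with the error Prometheus recorded
curl -s http://localhost:9090/api/v1/targets \
  | jq -r '.data.activeTargets[] | select(.health != "up") | "\(.scrapeUrl) \(.lastError)"'
```

An empty result from the second command means all targets are healthy.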

Expected Resource Usage

Component               Memory          CPU
Prometheus              512 Mi – 1 Gi   200m – 500m
Grafana                 128 – 256 Mi    100m
Alertmanager            128 – 256 Mi
Loki                    256 – 512 Mi    100m
Promtail                64 – 128 Mi
node-exporter (3 pods)  ~96 Mi
kube-state-metrics      64 Mi
Total                   ~1.2 – 2.3 Gi   ~400m – 700m

Access Summary

Service          URL                        Credentials
Online Boutique  http://boutique.lab.local
Argo CD UI       http://argocd.lab.local    admin / bootstrap password
Grafana          http://grafana.lab.local   admin / Helm-generated password

Troubleshooting

Issue                        Fix
Grafana ingress returns 404  Verify the ingress exists: kubectl get ingress -n monitoring
Prometheus targets DOWN      Check pod logs (Prometheus runs as an operator-managed StatefulSet): kubectl logs -n monitoring prometheus-kube-prometheus-stack-prometheus-0
Loki not receiving logs      Verify Promtail: kubectl logs -n monitoring daemonset/promtail
Pods stuck in Pending        Node out of resources: kubectl describe pod <name> -n monitoring

GitOps Integration

To manage this stack via Argo CD instead of manual Helm commands, migrate to Kustomize overlays:

infrastructure/
└── monitoring/
    ├── base/
    │   ├── kustomization.yaml
    │   └── namespace.yaml
    └── overlays/lab/
        ├── kustomization.yaml
        ├── values.yaml         # Helm values
        └── ingress.yaml        # Grafana ingress

Then add an Argo CD Application in bootstrap/argocd/apps/ to manage it (see Phase 0 for the App of Apps pattern).
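As a starting point, an Argo CD Application that delegates the chart install to Argo CD might look like the sketch below — the chart version is a placeholder, and in practice you’d point helm values at your own GitOps repo (e.g. via a multi-source Application) rather than inline them:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: monitoring
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://prometheus-community.github.io/helm-charts
    chart: kube-prometheus-stack
    targetRevision: <chart-version>   # pin an explicit chart version
    helm:
      releaseName: kube-prometheus-stack
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```

With automated sync enabled, Argo CD will reconcile any drift between the chart and the cluster — the same behavior you saw with Online Boutique in Phase 0.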