Phase 1 - Observability with Prometheus, Grafana & Loki
Series: Kubernetes Homelab on VMware Workstation
Prerequisites: Phase 0 - Argo CD & GitOps complete
Source Code: jmartinez-homelab-gitops
What We’re Building
By the end of this guide, you will have:
- Prometheus collecting metrics from nodes, pods, and Kubernetes objects
- Grafana with dashboards for cluster and application monitoring
- Loki + Promtail for centralized log aggregation
- All accessible via Traefik ingress at grafana.lab.local
Why Observability
You can’t manage what you can’t measure. In Kubernetes, you need visibility into:
- Metrics — CPU, memory, network, request rates, error rates
- Logs — Application output, system events, errors
- Dashboards — Visual representation of cluster health
Prometheus + Grafana is the de facto standard for Kubernetes monitoring. Loki provides logging without the resource overhead of the ELK stack — ideal for a homelab.
Before You Start
Verify Phase 0 is complete:
# Argo CD running
kubectl get pods -n argocd
# Online Boutique deployed and synced
kubectl get application -n argocd
# NAME SYNC STATUS HEALTH STATUS
# online-boutique   Synced        Healthy
Step 0: Resize VMs
The monitoring stack requires more resources than the default VM allocations. If you’re running VMware Workstation on a laptop with 40 GB of RAM, allocate:
| VM | Memory | Rationale |
|---|---|---|
| k3s-server | 8 GB | Control-plane + Argo CD + scheduling |
| k3s-agent-1 | 4 GB | Monitoring stack (Prometheus, Grafana) |
| k3s-agent-2 | 4 GB | Application workloads |
| k3s-agent-3 | 4 GB | Stateful workloads + overflow |
| Host | ~20 GB | VMware + OS overhead |
Total VM allocation: 20 GB, leaving ~20 GB for the host OS and VMware overhead.
How to Resize
- Shut down the VM: sudo shutdown -h now
- VMware Workstation → right-click VM → Settings → Hardware → Memory
- Adjust to recommended value
- Start the VM
- Verify:
kubectl describe node <node-name> | grep -A 5 Capacity
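To check all nodes at once instead of one describe at a time, kubectl’s custom columns work too. The kubelet reports capacity in Ki; the sketch below converts a made-up sample value (8148276Ki) to GiB — an 8 GB VM reports slightly less than 8 GiB because the kernel reserves some memory:

```shell
# List every node's total memory capacity (reported in Ki):
#   kubectl get nodes -o custom-columns='NAME:.metadata.name,MEMORY:.status.capacity.memory'
# Converting a sample reported value to GiB:
awk -v ki=8148276 'BEGIN { printf "%.1f GiB\n", ki / 1048576 }'
# → 7.8 GiB
```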
Step 1: Install Helm
Helm is the package manager for Kubernetes — similar to apt or brew, but for cluster applications.
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version
Step 2: Add Helm Repos
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
Step 3: Deploy Prometheus + Grafana
The kube-prometheus-stack Helm chart bundles Prometheus, Grafana, Alertmanager, kube-state-metrics, and node-exporter in a single install.
kubectl create namespace monitoring
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--set prometheus.prometheusSpec.resources.requests.memory=512Mi \
--set prometheus.prometheusSpec.resources.limits.memory=1Gi \
--set prometheus.prometheusSpec.resources.requests.cpu=200m \
--set prometheus.prometheusSpec.resources.limits.cpu=500m \
--set prometheus.prometheusSpec.retention=7d \
--set prometheus.prometheusSpec.storageSpec.volumeClaimTemplate.spec.resources.requests.storage=5Gi \
--set grafana.resources.requests.memory=128Mi \
--set grafana.resources.limits.memory=256Mi \
--set grafana.resources.requests.cpu=100m \
--set alertmanager.alertmanagerSpec.resources.requests.memory=128Mi \
--set alertmanager.alertmanagerSpec.resources.limits.memory=256Mi \
--set kubeStateMetrics.resources.requests.memory=64Mi \
--set prometheus-node-exporter.resources.requests.memory=32Mi
Resource limits are tuned for a homelab with ~18 GB total cluster RAM. Adjust if your setup differs.
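If you’d rather keep these settings in version control than on the command line, the same values can live in a Helm values file. A sketch covering the Prometheus and Grafana subset (the filename is arbitrary; the alertmanager and exporter values above would be added the same way):

```shell
# Capture the --set flags as a values file (filename is arbitrary).
cat > values-monitoring.yaml <<'EOF'
prometheus:
  prometheusSpec:
    retention: 7d
    resources:
      requests:
        memory: 512Mi
        cpu: 200m
      limits:
        memory: 1Gi
        cpu: 500m
grafana:
  resources:
    requests:
      memory: 128Mi
      cpu: 100m
    limits:
      memory: 256Mi
EOF
# Then install/upgrade with:
#   helm upgrade --install kube-prometheus-stack \
#     prometheus-community/kube-prometheus-stack \
#     --namespace monitoring -f values-monitoring.yaml
```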
What this deploys:
| Component | Purpose |
|---|---|
| Prometheus | Scrapes and stores time-series metrics |
| Grafana | Dashboarding and visualization |
| Alertmanager | Routes alerts to notification channels |
| kube-state-metrics | Exposes Kubernetes object states as metrics |
| node-exporter | DaemonSet collecting hardware/OS metrics from each node |
Verify:
kubectl get pods -n monitoring
# All pods should be Running
Step 4: Access Grafana
Get the Admin Password
kubectl get secret -n monitoring kube-prometheus-stack-grafana \
-o jsonpath="{.data.admin-password}" | base64 -d; echo
Default username: admin
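The jsonpath output is base64-encoded, which is why the pipeline ends in base64 -d. As an illustration of the decoding step (the chart has historically defaulted the admin password to prom-operator, but always trust the value from your own secret):

```shell
# Secret data is stored base64-encoded; base64 -d recovers the plaintext.
# The encoded string below is the chart's long-standing default password,
# shown here only to demonstrate the decoding step.
echo 'cHJvbS1vcGVyYXRvcg==' | base64 -d; echo
# → prom-operator
```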
Option A: Port-forward (quick)
kubectl port-forward -n monitoring svc/kube-prometheus-stack-grafana 3000:80
# Open http://localhost:3000
Option B: Ingress (permanent)
kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: monitoring
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
  - host: grafana.lab.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: kube-prometheus-stack-grafana
            port:
              number: 80
EOF
Update your /etc/hosts file:
<NODE_IP> boutique.lab.local argocd.lab.local grafana.lab.local
Access at http://grafana.lab.local
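If the page doesn’t load, you can take /etc/hosts out of the equation by sending the Host header straight to the node (same <NODE_IP> placeholder as above; a 200 or a 302 redirect to the login page means the ingress route works and any remaining issue is name resolution):

```shell
# Hit Traefik directly with the expected Host header and print the status code.
curl -s -o /dev/null -w '%{http_code}\n' -H 'Host: grafana.lab.local' http://<NODE_IP>/
```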
Step 5: Deploy Loki + Promtail
Loki is a log aggregation system designed to be Prometheus-like but for logs. Promtail is the agent that ships logs from nodes to Loki.
# Loki — single-binary mode for small clusters
helm install loki grafana/loki \
--namespace monitoring \
--set deploymentMode=SingleBinary \
--set loki.commonConfig.replication_factor=1 \
--set loki.storage.type=filesystem \
--set singleBinary.replicas=1 \
--set singleBinary.resources.requests.memory=256Mi \
--set singleBinary.resources.limits.memory=512Mi \
--set singleBinary.resources.requests.cpu=100m \
--set singleBinary.persistence.size=5Gi \
--set monitoring.selfMonitoring.grafanaAgent.installOperator=false \
--set gateway.enabled=false \
--set read.replicas=0 \
--set write.replicas=0 \
--set backend.replicas=0
# Promtail — collects and ships logs
helm install promtail grafana/promtail \
--namespace monitoring \
--set config.clients[0].url=http://loki.monitoring.svc:3100/loki/api/v1/push \
--set resources.requests.memory=64Mi \
--set resources.limits.memory=128Mi
Step 6: Connect Loki to Grafana
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-datasource
  namespace: monitoring
  labels:
    grafana_datasource: "1"
data:
  loki-datasource.yaml: |
    apiVersion: 1
    datasources:
    - name: Loki
      type: loki
      access: proxy
      url: http://loki.monitoring.svc:3100
      isDefault: false
      editable: true
EOF
Grafana auto-discovers this ConfigMap and adds Loki as a data source.
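To confirm log flow end to end, you can also query Loki’s HTTP API directly. A sketch assuming a port-forward to the loki service; the label names come from Promtail’s default Kubernetes relabeling, so yours may differ:

```shell
# In another terminal:
#   kubectl port-forward -n monitoring svc/loki 3100:3100
# List the label names Loki has indexed (expect namespace, pod, etc.):
curl -sG http://localhost:3100/loki/api/v1/labels
# Fetch a few recent log lines from the monitoring namespace:
curl -sG http://localhost:3100/loki/api/v1/query_range \
  --data-urlencode 'query={namespace="monitoring"}' \
  --data-urlencode 'limit=5'
```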
Step 7: Import Dashboards
In Grafana → Dashboards → Import, enter these community dashboard IDs:
| Dashboard | ID | Description |
|---|---|---|
| Kubernetes Cluster Monitoring | 315 | Node/pod CPU, memory, network overview |
| Node Exporter Full | 1860 | Detailed hardware and OS metrics |
| Kubernetes Pods | 6417 | Per-pod resource usage |
| Loki Logs | 13639 | Log search and filtering interface |
Step 8: Verify
# All monitoring pods running
kubectl get pods -n monitoring
# Prometheus scraping targets
kubectl port-forward -n monitoring svc/kube-prometheus-stack-prometheus 9090:9090
# Open http://localhost:9090/targets — all targets should be UP
# Loki is ready
kubectl port-forward -n monitoring svc/loki 3100:3100
curl -s http://localhost:3100/ready
# Resource usage
kubectl top nodes
kubectl top pods -n monitoring
Expected Resource Usage
| Component | Memory | CPU |
|---|---|---|
| Prometheus | 512 Mi – 1 Gi | 200m – 500m |
| Grafana | 128 – 256 Mi | 100m |
| Alertmanager | 128 – 256 Mi | — |
| Loki | 256 – 512 Mi | 100m |
| Promtail | 64 – 128 Mi | — |
| node-exporter (3 pods) | ~96 Mi | — |
| kube-state-metrics | 64 Mi | — |
| Total | ~1.2 – 2.3 Gi | ~400m – 700m |
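The /targets page can also be checked from the command line: Prometheus’ HTTP API at /api/v1/targets returns JSON whose health fields you can tally. A minimal sketch, run here against a trimmed sample payload rather than the live endpoint (with the port-forward from Step 8 active, pipe the real curl output through the same filter):

```shell
# Against a live cluster you would feed this pipeline from:
#   curl -s http://localhost:9090/api/v1/targets
# A trimmed sample payload stands in for the real response here.
payload='{"data":{"activeTargets":[{"health":"up"},{"health":"up"},{"health":"down"}]}}'
echo "$payload" | grep -o '"health":"[a-z]*"' | sort | uniq -c
# → counts per health state, e.g. 2 up / 1 down for this sample
```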
Access Summary
| Service | URL | Credentials |
|---|---|---|
| Online Boutique | http://boutique.lab.local | — |
| Argo CD UI | http://argocd.lab.local | admin / bootstrap password |
| Grafana | http://grafana.lab.local | admin / helm-generated password |
Troubleshooting
| Issue | Fix |
|---|---|
| Grafana ingress returns 404 | Verify ingress exists: kubectl get ingress -n monitoring |
| Prometheus targets DOWN | Check Prometheus logs (it runs as a StatefulSet, not a Deployment): kubectl logs -n monitoring statefulset/prometheus-kube-prometheus-stack-prometheus |
| Loki not receiving logs | Verify Promtail: kubectl logs -n monitoring daemonset/promtail |
| Pods stuck in Pending | Node out of resources: kubectl describe pod <name> -n monitoring |
GitOps Integration
To manage this stack via Argo CD instead of manual Helm commands, migrate to Kustomize overlays:
infrastructure/
└── monitoring/
├── base/
│ ├── kustomization.yaml
│ └── namespace.yaml
└── overlays/lab/
├── kustomization.yaml
├── values.yaml # Helm values
└── ingress.yaml # Grafana ingress
Then add an Argo CD Application in bootstrap/argocd/apps/ to manage it (see Phase 0 for the App of Apps pattern).
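As a starting point, that Application manifest might look like the sketch below. The repoURL placeholder, branch, and path are assumptions — adjust them to the repo layout you set up in Phase 0:

```shell
# Sketch of an Argo CD Application pointing at the monitoring overlay.
cat > monitoring-app.yaml <<'EOF'
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: monitoring
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/<your-user>/jmartinez-homelab-gitops
    targetRevision: main
    path: infrastructure/monitoring/overlays/lab
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF
# Commit it under bootstrap/argocd/apps/ so the App of Apps picks it up.
```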