Migrated from the existing kube-prometheus to kube-prometheus-stack, which is managed as a Helm chart.
Deployment is done with Helm; for examples, refer to kube-prometheus/manifests.
Helm usage example
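• Prerequisite environment setup
MetalLB: a load-balancer implementation that provides Services of type LoadBalancer in on-premises environments.
ingress-nginx: an ingress controller for Ingress, the API object that manages external access to the services in a cluster (typically HTTP).
• Access via NodePort
• Access via Ingress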
Deploying kube-prometheus-stack
Once MetalLB and the ingress controller are deployed as described above, the ingress-controller Service changes to type LoadBalancer and an external IP appears.
$ k get svc -n external-ingress
NAME                                        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
ex-ing-ingress-nginx-controller             LoadBalancer   10.108.209.70   10.0.0.101    80:30080/TCP,443:30443/TCP   6d5h
ex-ing-ingress-nginx-controller-admission   ClusterIP      10.106.15.229   <none>        443/TCP                      6d5h
Now deploy kube-prometheus-stack.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
k create namespace monitoring
helm install kube-prometheus prometheus-community/kube-prometheus-stack -n monitoring
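Once the deployment completes, the Ingress resources are created automatically, provided ingress is enabled in the chart values (see the changed values below).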
$ k get ing -A
NAMESPACE    NAME                                      CLASS            HOSTS                  ADDRESS      PORTS     AGE
monitoring   kube-prometheus-grafana                   external-nginx   grafana.syseng.kr      10.0.0.101   80, 443   6d5h
monitoring   kube-prometheus-kube-prome-alertmanager   external-nginx   alert.syseng.kr        10.0.0.101   80, 443   6d5h
monitoring   kube-prometheus-kube-prome-prometheus     external-nginx   prometheus.syseng.kr   10.0.0.101   80, 443   6d5h
Changed values
# The kube-prometheus-stack values file runs to nearly 3,400 lines,
# too many to explain one by one, so here we only confirm that the URLs come up correctly through the ingress.
# Below is the alertmanager ingress example.
ingress:
  enabled: true
  # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
  # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
  ingressClassName: external-nginx
  annotations: {}
  labels: {}
  ## Redirect ingress to an additional defined port on the service
  # servicePort: 8081
  ## Hosts must be provided if Ingress is enabled.
  ##
  #hosts: []
  hosts:
    - alert.syseng.kr
  ## Paths to use for ingress rules - one path should match the alertmanagerSpec.routePrefix
  ##
  paths: []
  # - /
  ## For Kubernetes >= 1.18 you should specify the pathType (determines how Ingress paths should be matched)
  ## See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#better-path-matching-with-path-types
  # pathType: ImplementationSpecific
  ## TLS configuration for Alertmanager Ingress
  ## Secret must be manually created in the namespace
  ##
  tls:
    - secretName: syseng.kr
      hosts:
        - alert.syseng.kr
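The alertmanager ingress values above can be repeated for the other two hosts that appear in the `k get ing -A` output. The following is a minimal sketch that generates such an override, assuming the `prometheus` and `grafana` sections of the chart accept an `ingress` block of the same shape as the alertmanager one; that assumption, and the helper names `ingress_block` and `build_overrides`, are illustrative and should be checked against the chart's own values file.

```python
import json

# Hostnames taken from the ingress listing in this document.
HOSTS = {
    "alertmanager": "alert.syseng.kr",
    "prometheus": "prometheus.syseng.kr",
    "grafana": "grafana.syseng.kr",
}


def ingress_block(host, class_name="external-nginx", tls_secret="syseng.kr"):
    """Return an ingress block shaped like the alertmanager example."""
    return {
        "enabled": True,
        "ingressClassName": class_name,
        "hosts": [host],
        # The TLS secret must already exist in the namespace.
        "tls": [{"secretName": tls_secret, "hosts": [host]}],
    }


def build_overrides(hosts):
    """Map each component name to {"ingress": ...} (assumed key layout)."""
    return {component: {"ingress": ingress_block(host)}
            for component, host in hosts.items()}


overrides = build_overrides(HOSTS)

if __name__ == "__main__":
    # JSON is a subset of YAML, so the output can be saved as a values file.
    print(json.dumps(overrides, indent=2))
```

The printed document could be saved to a file and passed to Helm with `-f`, which keeps the three ingress definitions consistent instead of editing the full values file by hand.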
alertmanager-alertmanager.yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  labels:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.24.0
  name: main
  namespace: monitoring
spec:
  image: quay.io/prometheus/alertmanager:v0.24.0
  nodeSelector:
    kubernetes.io/os: linux
  podMetadata:
    labels: # finely categorized; these are the labels recommended by Kubernetes
      app.kubernetes.io/component: alert-router
      app.kubernetes.io/instance: main
      app.kubernetes.io/name: alertmanager
      app.kubernetes.io/part-of: kube-prometheus
      app.kubernetes.io/version: 0.24.0
  replicas: 3
  resources:
    limits: # a container exceeding the memory limit is OOM-killed; CPU above the limit is throttled, not killed
      cpu: 100m # 1 core = 1000m (unit: millicores)
      memory: 100Mi
    requests:
      cpu: 4m
      memory: 100Mi
  securityContext:
    fsGroup: 2000 # supplementary group
    runAsNonRoot: true # refuse to run as root
    runAsUser: 1000 # UID for all processes in the pod
  serviceAccountName: alertmanager-main
  version: 0.24.0
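To make the units in `resources` concrete, here is a small sketch that converts the quantities used above into cores and bytes. It handles only the `m` CPU suffix and the binary memory suffixes `Ki`, `Mi`, `Gi`; the function names are hypothetical, and the full Kubernetes quantity grammar supports more forms than this.

```python
# Binary (power-of-two) memory suffixes; a deliberately partial table.
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3}


def cpu_to_cores(quantity: str) -> float:
    """Convert a CPU quantity such as '100m' or '2' into cores."""
    if quantity.endswith("m"):
        # Millicores: 1000m equals one core.
        return int(quantity[:-1]) / 1000
    return float(quantity)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '100Mi' into bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    # No suffix: the value is already in bytes.
    return int(quantity)


if __name__ == "__main__":
    print(cpu_to_cores("100m"), "cores limit")
    print(cpu_to_cores("4m"), "cores requested")
    print(memory_to_bytes("100Mi"), "bytes")
```

With the manifest above, each Alertmanager pod requests 4m CPU but may use up to 100m, while the memory request and limit are both 100Mi.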
alertmanager-networkPolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  labels:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.24.0
  name: alertmanager-main
  namespace: monitoring
spec:
  egress: # traffic leaving the selected pods; the empty rule allows all egress
  - {}
  ingress: # traffic entering the selected pods
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: prometheus
    ports:
    - port: 9093
      protocol: TCP
    - port: 8080
      protocol: TCP
  - from:
    - podSelector:
        matchLabels:
          app.kubernetes.io/name: alertmanager
    ports:
    - port: 9094
      protocol: TCP
    - port: 9094
      protocol: UDP
  podSelector:
    matchLabels:
      app.kubernetes.io/component: alert-router
      app.kubernetes.io/instance: main
      app.kubernetes.io/name: alertmanager
      app.kubernetes.io/part-of: kube-prometheus
  policyTypes:
  - Egress
  - Ingress
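The ingress rules in this NetworkPolicy can be read as "which source pod may reach which port". The sketch below encodes the two rules as plain data and evaluates a connection against them. It is a simplified model, not how a CNI plugin enforces policy: it ignores namespaces (a bare `podSelector` only matches pods in the policy's own namespace) and the name `ingress_allowed` is illustrative.

```python
# The two ingress rules from alertmanager-networkPolicy.yaml, as data.
INGRESS_RULES = [
    {
        "from": {"app.kubernetes.io/name": "prometheus"},
        "ports": {(9093, "TCP"), (8080, "TCP")},
    },
    {
        "from": {"app.kubernetes.io/name": "alertmanager"},
        "ports": {(9094, "TCP"), (9094, "UDP")},
    },
]


def ingress_allowed(source_labels, port, protocol):
    """Return True if any rule admits this source pod on this port."""
    for rule in INGRESS_RULES:
        # matchLabels semantics: every listed label must be present and equal.
        selector_matches = all(
            source_labels.get(key) == value
            for key, value in rule["from"].items()
        )
        if selector_matches and (port, protocol) in rule["ports"]:
            return True
    # Ingress is listed in policyTypes, so unmatched traffic is denied.
    return False


if __name__ == "__main__":
    prometheus = {"app.kubernetes.io/name": "prometheus"}
    print(ingress_allowed(prometheus, 9093, "TCP"))  # web port
    print(ingress_allowed(prometheus, 9094, "TCP"))  # cluster gossip port
```

In this model Prometheus pods reach the web and reloader ports, while port 9094 is reserved for Alertmanager peers talking to each other.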
alertmanager-podDisruptionBudget.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  labels:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.24.0
  name: alertmanager-main
  namespace: monitoring
spec:
  maxUnavailable: 1 # maximum number of pods, relative to spec.replicas, that may be unavailable at once during voluntary disruptions such as node drains
  selector:
    matchLabels:
      app.kubernetes.io/component: alert-router
      app.kubernetes.io/instance: main
      app.kubernetes.io/name: alertmanager
      app.kubernetes.io/part-of: kube-prometheus
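The effect of `maxUnavailable: 1` with three replicas can be worked through as simple arithmetic. This is a minimal sketch of that calculation for an integer `maxUnavailable`; the function name is hypothetical and percentage values are not handled.

```python
def disruptions_allowed(expected_pods: int, healthy_pods: int,
                        max_unavailable: int) -> int:
    """How many more pods may be evicted without violating the budget."""
    # At least this many pods must stay healthy.
    desired_healthy = max(expected_pods - max_unavailable, 0)
    # Evictions are permitted only for the surplus above that floor.
    return max(healthy_pods - desired_healthy, 0)


if __name__ == "__main__":
    # Three replicas, as in the Alertmanager spec above.
    for healthy in (3, 2, 1):
        print(healthy, "healthy ->",
              disruptions_allowed(3, healthy, 1), "eviction(s) allowed")
```

With all three pods healthy, one eviction is allowed; as soon as one pod is already down, further evictions are refused until it recovers.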
alertmanager-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  labels:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.24.0
  name: alertmanager-main
  namespace: monitoring
stringData:
  alertmanager.yaml: |-
    "global":
      "resolve_timeout": "5m"
    "inhibit_rules":
    - "equal":
      - "namespace"
      - "alertname"
      "source_matchers":
      - "severity = critical"
      "target_matchers":
      - "severity =~ warning|info"
    - "equal":
      - "namespace"
      - "alertname"
      "source_matchers":
      - "severity = warning"
      "target_matchers":
      - "severity = info"
    - "equal":
      - "namespace"
      "source_matchers":
      - "alertname = InfoInhibitor"
      "target_matchers":
      - "severity = info"
    "receivers":
    - "name": "Default"
    - "name": "Watchdog"
    - "name": "Critical"
    - "name": "null"
    "route":
      "group_by":
      - "namespace"
      "group_interval": "5m"
      "group_wait": "30s"
      "receiver": "Default"
      "repeat_interval": "12h"
      "routes":
      - "matchers":
        - "alertname = Watchdog"
        "receiver": "Watchdog"
      - "matchers":
        - "alertname = InfoInhibitor"
        "receiver": "null"
      - "matchers":
        - "severity = critical"
        "receiver": "Critical"
type: Opaque
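The `inhibit_rules` in this Secret decide which alerts are muted while a more severe alert is firing. The sketch below is a simplified model of that logic, not Alertmanager's implementation: it supports only the `=` and `=~` operators used in the config, treats regexes as fully anchored, and the helper names are hypothetical.

```python
import re

# The three inhibit rules from alertmanager-secret.yaml.
RULES = [
    {"source_matchers": ["severity = critical"],
     "target_matchers": ["severity =~ warning|info"],
     "equal": ["namespace", "alertname"]},
    {"source_matchers": ["severity = warning"],
     "target_matchers": ["severity = info"],
     "equal": ["namespace", "alertname"]},
    {"source_matchers": ["alertname = InfoInhibitor"],
     "target_matchers": ["severity = info"],
     "equal": ["namespace"]},
]


def parse_matcher(text):
    """Split 'label = value' or 'label =~ regex' into its three parts."""
    parsed = re.fullmatch(r"\s*(\w+)\s*(=~|=)\s*(.+?)\s*", text)
    if parsed is None:
        raise ValueError(f"unsupported matcher: {text!r}")
    return parsed.group(1), parsed.group(2), parsed.group(3)


def matches(alert, matcher_text):
    """Check one matcher against an alert's labels (a plain dict)."""
    label, operator, value = parse_matcher(matcher_text)
    actual = alert.get(label, "")
    if operator == "=":
        return actual == value
    return re.fullmatch(value, actual) is not None


def is_inhibited(target, firing, rules=RULES):
    """True if some other firing alert mutes `target` under any rule."""
    for rule in rules:
        if not all(matches(target, m) for m in rule["target_matchers"]):
            continue
        for source in firing:
            if source is target:
                continue
            source_ok = all(matches(source, m) for m in rule["source_matchers"])
            # Missing and empty labels compare as equal.
            same_labels = all(source.get(name, "") == target.get(name, "")
                              for name in rule["equal"])
            if source_ok and same_labels:
                return True
    return False


if __name__ == "__main__":
    critical = {"alertname": "DiskFull", "namespace": "db", "severity": "critical"}
    warning = {"alertname": "DiskFull", "namespace": "db", "severity": "warning"}
    print(is_inhibited(warning, [critical, warning]))
```

The `DiskFull` alert name and `db` namespace are made-up sample data. The warning is muted because a critical alert with the same `namespace` and `alertname` is firing, which is the first rule in the config.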
alertmanager-service.yaml
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.24.0
  name: alertmanager-main
  namespace: monitoring
spec:
  ports:
  - name: web
    port: 9093
    targetPort: web
  - name: reloader-web
    port: 8080
    targetPort: reloader-web
  selector:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: kube-prometheus
  sessionAffinity: ClientIP # keeps the session sticky so that requests from one client IP reach the same instance
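The idea behind `sessionAffinity: ClientIP` can be illustrated with a toy load balancer that remembers which backend each client IP was sent to. This is only a conceptual sketch: the class name, the round-robin choice for new clients, and the refresh-on-use behaviour are assumptions for illustration and do not describe how kube-proxy is implemented. The 10800-second default matches the default of `sessionAffinityConfig.clientIP.timeoutSeconds`.

```python
class ClientIPAffinity:
    """Toy model of ClientIP session affinity."""

    def __init__(self, backends, timeout_seconds=10800):
        self.backends = list(backends)
        self.timeout_seconds = timeout_seconds
        self._sticky = {}  # client IP -> (backend, time last seen)
        self._next = 0     # round-robin cursor for new clients (assumption)

    def pick(self, client_ip, now):
        """Return the backend for this client at time `now` (seconds)."""
        entry = self._sticky.get(client_ip)
        if entry is not None and now - entry[1] <= self.timeout_seconds:
            backend = entry[0]  # affinity still valid: reuse the backend
        else:
            backend = self.backends[self._next % len(self.backends)]
            self._next += 1
        self._sticky[client_ip] = (backend, now)
        return backend


if __name__ == "__main__":
    # Hypothetical pod names for the three Alertmanager replicas.
    balancer = ClientIPAffinity(["am-0", "am-1", "am-2"])
    first = balancer.pick("10.0.0.5", now=0)
    balancer.pick("10.0.0.6", now=1)
    print(balancer.pick("10.0.0.5", now=2) == first)
```

For Alertmanager with three replicas, stickiness of this kind means a user browsing the web UI keeps talking to the same instance instead of hopping between replicas on each request.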
alertmanager-serviceMonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/component: alert-router
    app.kubernetes.io/instance: main
    app.kubernetes.io/name: alertmanager
    app.kubernetes.io/part-of: kube-prometheus
    app.kubernetes.io/version: 0.24.0
  name: alertmanager-main
  namespace: monitoring
spec:
  endpoints:
  - interval: 30s
    port: web
  - interval: 30s
    port: reloader-web
  selector:
    matchLabels:
      app.kubernetes.io/component: alert-router
      app.kubernetes.io/instance: main
      app.kubernetes.io/name: alertmanager
      app.kubernetes.io/part-of: kube-prometheus