I am trying to set up Prometheus logging. I am deploying the YAML files below, but the pod fails with "Back-off restarting failed container".
Full description:
Name: prometheus-75dd748df4-wrwlr
Namespace: monitoring
Priority: 0
Node: kbs-vm-02/172.16.1.8
Start Time: Tue, 28 Apr 2020 06:13:22 +0000
Labels: app=prometheus
pod-template-hash=75dd748df4
Annotations: <none>
Status: Running
IP: 10.44.0.7
IPs:
IP: 10.44.0.7
Controlled By: ReplicaSet/prometheus-75dd748df4
Containers:
prom:
Container ID: docker://50fb273836c5522bbbe01d8db36e18688e0f673bc54066f364290f0f6854a74f
Image: quay.io/prometheus/prometheus:v2.4.3
Image ID: docker-pullable://quay.io/prometheus/prometheus@sha256:8e0e85af45fc2bcc18bd7221b8c92fe4bb180f6bd5e30aa2b226f988029c2085
Port: 9090/TCP
Host Port: 0/TCP
Args:
--config.file=/prometheus-cfg/prometheus.yml
--storage.tsdb.path=/data
--storage.tsdb.retention=$(STORAGE_LOCAL_RETENTION)
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 28 Apr 2020 06:14:08 +0000
Finished: Tue, 28 Apr 2020 06:14:08 +0000
Ready: False
Restart Count: 3
Limits:
memory: 1Gi
Requests:
cpu: 200m
memory: 500Mi
Environment Variables from:
prometheus-config-flags ConfigMap Optional: false
Environment: <none>
Mounts:
/data from storage (rw)
/prometheus-cfg from config-file (rw)
/var/run/secrets/kubernetes.io/serviceaccount from prometheus-token-bt7dw (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-file:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: prometheus-config-file
Optional: false
storage:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: prometheus-storage-claim
ReadOnly: false
prometheus-token-bt7dw:
Type: Secret (a volume populated by a Secret)
SecretName: prometheus-token-bt7dw
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 76s (x3 over 78s) default-scheduler running "VolumeBinding" filter plugin for pod "prometheus-75dd748df4-wrwlr": pod has unbound immediate PersistentVolumeClaims
Normal Scheduled 73s default-scheduler Successfully assigned monitoring/prometheus-75dd748df4-wrwlr to kbs-vm-02
Normal Pulled 28s (x4 over 72s) kubelet, kbs-vm-02 Container image "quay.io/prometheus/prometheus:v2.4.3" already present on machine
Normal Created 28s (x4 over 72s) kubelet, kbs-vm-02 Created container prom
Normal Started 27s (x4 over 71s) kubelet, kbs-vm-02 Started container prom
Warning BackOff 13s (x6 over 69s) kubelet, kbs-vm-02 Back-off restarting failed container
Deployment file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
  labels:
    app: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      securityContext:
        fsGroup: 1000
      serviceAccountName: prometheus
      containers:
      - image: quay.io/prometheus/prometheus:v2.4.3
        name: prom
        args:
        - '--config.file=/prometheus-cfg/prometheus.yml'
        - '--storage.tsdb.path=/data'
        - '--storage.tsdb.retention=$(STORAGE_LOCAL_RETENTION)'
        envFrom:
        - configMapRef:
            name: prometheus-config-flags
        ports:
        - containerPort: 9090
          name: prom-port
        resources:
          limits:
            memory: 1Gi
          requests:
            cpu: 200m
            memory: 500Mi
        volumeMounts:
        - name: config-file
          mountPath: /prometheus-cfg
        - name: storage
          mountPath: /data
      volumes:
      - name: config-file
        configMap:
          name: prometheus-config-file
      - name: storage
        persistentVolumeClaim:
          claimName: prometheus-storage-claim
PV YAML:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: prometheus-storage
  namespace: monitoring
  labels:
    app: prometheus
spec:
  capacity:
    storage: 12Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/data"
PVC YAML:
[vidya@KBS-VM-01 7-1_prometheus]$ cat prometheus/prom-pvc.yml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-storage-claim
  namespace: monitoring
  labels:
    app: prometheus
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
Do you know what the problem is and how to fix it? Please also let me know if any more files need to be shared.
Judging by the event log, my guess is that something is wrong with the storage configuration:
Warning FailedScheduling 76s (x3 over 78s) default-scheduler running "VolumeBinding" filter plugin for pod "prometheus-75dd748df4-wrwlr": pod has unbound immediate PersistentVolumeClaims
I am using local storage.
[vidya@KBS-VM-01 7-1_prometheus]$ kubectl describe pvc prometheus-storage-claim -n monitoring
Name: prometheus-storage-claim
Namespace: monitoring
StorageClass:
Status: Bound
Volume: prometheus-storage
Labels: app=prometheus
Annotations: pv.kubernetes.io/bind-completed: yes
pv.kubernetes.io/bound-by-controller: yes
Finalizers: [kubernetes.io/pvc-protection]
Capacity: 12Gi
Access Modes: RWO
VolumeMode: Filesystem
Mounted By: prometheus-75dd748df4-wrwlr
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal FailedBinding 37m persistentvolume-controller no persistent volumes available for this claim and no storage class is set
[vidya@KBS-VM-01 7-1_prometheus]$ kubectl logs prometheus-75dd748df4-zlncv -n monitoring
level=info ts=2020-04-28T07:49:07.885529914Z caller=main.go:238 msg="Starting Prometheus" version="(version=2.4.3, branch=HEAD, revision=167a4b4e73a8eca8df648d2d2043e21bdb9a7449)"
level=info ts=2020-04-28T07:49:07.885635014Z caller=main.go:239 build_context="(go=go1.11.1, user=root@1e42b46043e9, date=20181004-08:42:02)"
level=info ts=2020-04-28T07:49:07.885812014Z caller=main.go:240 host_details="(Linux 3.10.0-1062.1.1.el7.x86_64 #1 SMP Fri Sep 13 22:55:44 UTC 2019 x86_64 prometheus-75dd748df4-zlncv (none))"
level=info ts=2020-04-28T07:49:07.885833214Z caller=main.go:241 fd_limits="(soft=1048576, hard=1048576)"
level=info ts=2020-04-28T07:49:07.885849614Z caller=main.go:242 vm_limits="(soft=unlimited, hard=unlimited)"
level=info ts=2020-04-28T07:49:07.888695413Z caller=main.go:554 msg="Starting TSDB ..."
level=info ts=2020-04-28T07:49:07.889017612Z caller=main.go:423 msg="Stopping scrape discovery manager..."
level=info ts=2020-04-28T07:49:07.889033512Z caller=main.go:437 msg="Stopping notify discovery manager..."
level=info ts=2020-04-28T07:49:07.889041112Z caller=main.go:459 msg="Stopping scrape manager..."
level=info ts=2020-04-28T07:49:07.889048812Z caller=main.go:433 msg="Notify discovery manager stopped"
level=info ts=2020-04-28T07:49:07.889071612Z caller=main.go:419 msg="Scrape discovery manager stopped"
level=info ts=2020-04-28T07:49:07.889083112Z caller=main.go:453 msg="Scrape manager stopped"
level=info ts=2020-04-28T07:49:07.889098012Z caller=manager.go:638 component="rule manager" msg="Stopping rule manager..."
level=info ts=2020-04-28T07:49:07.889109912Z caller=manager.go:644 component="rule manager" msg="Rule manager stopped"
level=info ts=2020-04-28T07:49:07.889124912Z caller=notifier.go:512 component=notifier msg="Stopping notification manager..."
level=info ts=2020-04-28T07:49:07.889137812Z caller=main.go:608 msg="Notifier manager stopped"
level=info ts=2020-04-28T07:49:07.889169012Z caller=web.go:397 component=web msg="Start listening for connections" address=0.0.0.0:9090
level=error ts=2020-04-28T07:49:07.889653412Z caller=main.go:617 err="opening storage failed: lock DB directory: open /data/lock: permission denied"
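
The last log line suggests that the Prometheus process cannot write to /data, which would explain the crash loop even though the PVC now shows as Bound. One possible workaround, sketched below under assumptions I have not verified on this cluster: the image runs as a non-root user (assumed here to be nobody, UID 65534), the hostPath directory on the node is owned by root, and fsGroup does not change ownership of a hostPath volume. The init container name and the busybox image are hypothetical choices, not something from the manifests above.

```yaml
# Fragment to merge into the existing Deployment (spec.template.spec).
# Assumption: Prometheus runs as UID/GID 65534 inside the container.
spec:
  template:
    spec:
      initContainers:
      - name: fix-data-permissions      # hypothetical name
        image: busybox                  # assumption: any small image that provides chown
        command: ["sh", "-c", "chown -R 65534:65534 /data"]
        securityContext:
          runAsUser: 0                  # root is needed to change ownership
        volumeMounts:
        - name: storage                 # same volume the prom container mounts
          mountPath: /data
```

If this assumption holds, the init container would hand ownership of the directory to the Prometheus user before the main container starts. Checking the directory on the node first (for example with ls -ld /data on kbs-vm-02) would confirm whether ownership is really the cause.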