Kubernetes Monitoring
Kubernetes is an open-source container-orchestration system for automating application deployment, scaling, and management. It was originally designed by Google, and is now maintained by the Cloud Native Computing Foundation.
What You Can Monitor
Opsview provides all-in-one monitoring for Kubernetes clusters hosted locally or in the cloud. Monitor live usage metrics such as CPU, memory, disk and network status from your cluster down to your individual pods. Additionally, this Opspack collects other useful metrics such as HTTP statistics, file descriptors and more.
Host Templates
The following Host Templates are provided within this Opspack. Click the name of each Host Template to be taken to the relevant information page, including a full Service Check description and usage instructions.
Application - Kubernetes - Cluster
Service Check Name | Description | Default Thresholds (Warning, Critical) | UOM |
---|---|---|---|
Kubernetes - Cluster - File Descriptors | Summary of the open file descriptors against the total number available. | file_descriptor_usage=70,90 | % |
Kubernetes - Cluster - HTTP Stats | Summary of the HTTP requests made, divided based on their status codes. | N/A | N/A |
Kubernetes - Cluster - Process Time | Total amount of user and system CPU time spent in seconds. | N/A | s |
Kubernetes - Cluster - ETCD Helper Stats | Number of ETCD Cache hits and misses. | cache_hit_percentage=25:,10: | % |
Kubernetes - Cluster - Namespaces | High level summary of the namespaces in the Active state on this cluster. | namespaces_inactive=0,0 | N/A |
Kubernetes - Cluster - Nodes | High level summary of the nodes in Ready status on this cluster. | nodes_not_running=0,0 | N/A |
Application - Kubernetes - Namespace
Service Check Name | Description | Default Thresholds (Warning, Critical) | UOM |
---|---|---|---|
Kubernetes - Namespace - Pods | Summary of the status of the pods within this namespace. | pods_not_running=0,0 pods_total=1:, | N/A |
Kubernetes - Namespace - Active | Summary of whether the namespace is active or not. | namespace_critical=0,0 | N/A |
Application - Kubernetes - Node
Service Check Name | Description | Default Thresholds (Warning, Critical) | UOM |
---|---|---|---|
Kubernetes - Node - CPU Usage | The amount of CPU resources currently being used by the node. | cpu_usage=70,90 | % |
Kubernetes - Node - Memory Usage | The current memory usage of the node. | memory_usage=70,90 | % |
Kubernetes - Node - Pods | The number of pods and their current state. | pods_not_ready=0,0 pods_total=1:, | N/A |
Kubernetes - Node - Allocatable Storage | The current ephemeral storage usage for the node, and volume usage if volumes are in use. | N/A | B |
Kubernetes - Node - Pod Capacity | The maximum number of pods that can be created on this node. | N/A | N/A |
Kubernetes - Node - Ready | Check if the node is healthy and ready to accept pods. | node_ready=0,0 | N/A |
Kubernetes - Node - PID Pressure | Check if pressure exists on the processes - that is, if there are too many processes on the node. | node_pid_pressure=0,0 | N/A |
Kubernetes - Node - Disk Pressure | Check if pressure exists on the disk - that is, if the disk capacity is low. | node_disk_pressure=0,0 | N/A |
Kubernetes - Node - Memory Pressure | Check if pressure exists on the node memory - that is, if the node memory is low. | node_memory_pressure=0,0 | N/A |
Application - Kubernetes - Pod
Service Check Name | Description | Default Thresholds (Warning, Critical) | UOM |
---|---|---|---|
Kubernetes - Pod - CPU Usage | The amount of virtual CPU resources, measured in MilliCores, currently being used by the pod. | N/A | n |
Kubernetes - Pod - Phase | High level summary of where the pod is in its lifecycle: Pending, Running, Succeeded, Failed or Unknown. | N/A | N/A |
Kubernetes - Pod - Memory Usage | The current memory usage and capacity of the pod, in bytes. | N/A | B |
Kubernetes - Pod - Ephemeral Storage Usage | Cumulative count of bytes read and written in the ephemeral storage for this pod. | N/A | B |
Kubernetes - Pod - Volume Usage | Cumulative count of bytes read and written in the volumes for this pod. | N/A | B |
Kubernetes - Pod - Network Bytes | Cumulative count of bytes received and sent. | N/A | B |
Kubernetes - Pod - Network Errors | Cumulative count of errors encountered during network usage. | N/A | N/A |
Kubernetes Monitoring Prerequisites
To access live usage metrics, you must install metrics-server on your cluster and follow the correct authentication setup for your host.
It is assumed that kubectl is installed and configured for use with your cluster.
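For example, you can confirm that kubectl is installed and able to reach your cluster with:
kubectl cluster-info # confirms kubectl can reach the API server
kubectl get nodes # lists the nodes visible to kubectl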
Kubernetes Monitoring Setup
- Install Metrics Server on the cluster
- Retrieve the API Server address and port number for the cluster
- Set up the appropriate authentication depending on your environment
Install Metrics Server
Local cluster
If you are using a local Kubernetes cluster, run the following commands from the machine hosting your cluster:
git clone https://github.com/kubernetes-incubator/metrics-server.git
# deploy the latest metrics-server
cd metrics-server
kubectl create -f deploy/1.8+/
kubectl edit deploy -n kube-system metrics-server
When the edit window opens, add the following flags underneath spec.containers.name:
args:
- --kubelet-insecure-tls # only required if using self signed certificates
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
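Once the metrics-server pod has restarted with these flags, you can confirm that live usage metrics are available, for example:
# wait for the metrics-server deployment to finish rolling out
kubectl rollout status deploy -n kube-system metrics-server
# these should report CPU and memory figures once metrics are being collected
kubectl top nodes
kubectl top pods --all-namespaces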
AWS
If you are using a Kubernetes cluster hosted on AWS / EKS, refer to the Installing Metrics Server on AWS guide.
Google Cloud Platform (GCP) or Microsoft Azure
If you are using a GCP or Azure Kubernetes cluster, the Metrics Server is installed and configured by default. Ensure you have set up the read-only service account and role bindings shown in the steps below.
Retrieve API Server address and the port number
From a terminal with kubectl access to your cluster, run:
kubectl config view
This will give you a list of all the configuration information for your Kubernetes environment.
It will look something like:
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: DATA+OMITTED
server: https://1.1.1.1:6443 # COPY THIS ADDRESS
name: kubernetes
Note the cluster server address shown above. The port may or may not be present; copy the entire URL (including the port, if present) to the API server address field of the KUBERNETES_CLUSTER_DETAILS variable.
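Alternatively, the server URL can be extracted directly from the configuration (the same jsonpath expression is used in the AWS steps further down):
# print only the API server address, including the port if present
kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'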
Set up an authentication mechanism
This Opspack supports client authentication through X509 Client Certs and Bearer Tokens.
For more details, refer to Kubernetes authentication strategies.
Client authentication using X509 Client Certs
Client certificate authentication is enabled by supplying the CA path, client certificate and client key arguments in the KUBERNETES_CERTIFICATES variable.
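If your kubeconfig embeds the certificates as base64 data rather than file paths, one way to write them out to files is shown below. This is a sketch only; your kubeconfig layout, and the exact form in which KUBERNETES_CERTIFICATES expects the paths, may differ.
# decode the CA, client certificate and client key from the current kubeconfig context
kubectl config view --minify --raw -o jsonpath='{.clusters[0].cluster.certificate-authority-data}' | base64 --decode > ca.crt
kubectl config view --minify --raw -o jsonpath='{.users[0].user.client-certificate-data}' | base64 --decode > client.crt
kubectl config view --minify --raw -o jsonpath='{.users[0].user.client-key-data}' | base64 --decode > client.key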
Client authentication using Bearer Tokens
Set up a service account for authentication
To create a service account for authentication, copy and paste the following commands into your Kubernetes cluster terminal.
kubectl create sa opsview # create the service account
# create the read only role
cat <<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: 'true'
  labels:
  name: opsview-read-only
  namespace: default
rules:
- apiGroups: ['*']
  resources: ['*']
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - extensions
  resources: ['*']
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - apps
  resources: ['*']
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - /metrics
  - /api/*
  verbs:
  - get
  - list
  - watch
EOF
# bind the role to the service account
cat <<EOF | kubectl apply -f -
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: opsview-binding
subjects:
- kind: ServiceAccount
  name: opsview
  namespace: default
roleRef:
  kind: ClusterRole
  name: opsview-read-only
  apiGroup: rbac.authorization.k8s.io
EOF
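You can verify that the binding grants the expected read-only access by impersonating the service account, for example:
# expect "yes" for read access and "no" for write access
kubectl auth can-i get pods --as=system:serviceaccount:default:opsview
kubectl auth can-i delete pods --as=system:serviceaccount:default:opsview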
Retrieve the bearer token for authentication
Local
If your Kubernetes environment has been set up locally, you will need to run the following commands:
SECRET_NAME=$(kubectl get serviceaccount opsview -o jsonpath='{.secrets[0].name}')
TOKEN=$(kubectl get secret $SECRET_NAME -o jsonpath='{.data.token}' | base64 --decode)
echo $TOKEN
Copy the value of $TOKEN to your KUBERNETES_CLUSTER_DETAILS Opsview variable.
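Note that on Kubernetes 1.24 and later a token secret is no longer created automatically for a service account, so SECRET_NAME above may be empty. In that case you can request a token explicitly; the duration below is illustrative and the cluster may cap the maximum lifetime:
# request a bearer token for the opsview service account (Kubernetes 1.24+)
TOKEN=$(kubectl create token opsview --duration=8760h)
echo $TOKEN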
AWS
If your Kubernetes environment has been set up on AWS, you will need to run the following commands:
Ensure you have the AWS CLI installed. For details on how to install the AWS CLI, refer to: Installing the AWS CLI
# update kubectl config with your AWS setup
aws eks --region YOUR_REGION update-kubeconfig --name YOUR_CLUSTER_NAME
# download the aws kubernetes config map
curl -o aws-auth-cm.yaml https://amazon-eks.s3-us-west-2.amazonaws.com/cloudformation/2019-02-11/aws-auth-cm.yaml
# edit the config map, replacing the rolearn variable with the Role ARN shown in your EKS dashboard
nano aws-auth-cm.yaml
# apply the config map
kubectl apply -f aws-auth-cm.yaml
APISERVER=$(kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}')
SECRET_NAME=$(kubectl get serviceaccount opsview -o jsonpath='{.secrets[0].name}')
TOKEN=$(kubectl get secret $SECRET_NAME -o jsonpath='{.data.token}' | base64 --decode)
echo $TOKEN
Copy the value of $TOKEN to your KUBERNETES_CLUSTER_DETAILS Opsview variable.
To allow communication between the cluster and its nodes, AWS requires you to add inbound and outbound rules to the node pool's security group permitting HTTPS connections on port 443 with a source of 0.0.0.0/0.
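If you manage the security group with the AWS CLI, the inbound rule would look something like the following sketch (the security group ID is a placeholder for your node pool's security group; the matching outbound rule can be added with authorize-security-group-egress):
# allow inbound HTTPS to the node pool security group (sg-0123456789abcdef0 is a placeholder)
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 443 --cidr 0.0.0.0/0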
Google Cloud Platform (GCP)
If your Kubernetes environment has been set up on GCP, you will need to run the following commands:
SECRET_NAME=$(kubectl get serviceaccount opsview -o jsonpath='{.secrets[0].name}')
TOKEN=$(kubectl get secret $SECRET_NAME -o jsonpath='{.data.token}' | base64 --decode)
echo $TOKEN
Copy the value of $TOKEN to your KUBERNETES_CLUSTER_DETAILS Opsview variable.
Microsoft Azure
If your Kubernetes environment has been set up on Azure, you will need to run the following commands:
Ensure you have the Azure CLI installed. For details on how to install the Azure CLI, refer to: Installing the Azure CLI
# login to azure
az login
# get kube config for azure
az aks get-credentials --resource-group YOUR_RESOURCE_GROUP --name YOUR_CLUSTER_NAME
SECRET_NAME=$(kubectl get serviceaccount opsview -o jsonpath='{.secrets[0].name}')
TOKEN=$(kubectl get secret $SECRET_NAME -o jsonpath='{.data.token}' | base64 --decode)
echo $TOKEN
Copy the value of $TOKEN to your KUBERNETES_CLUSTER_DETAILS Opsview variable.
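Whichever platform you are using, you can check the token against the API server before entering it into Opsview. In the sketch below, APISERVER is assumed to hold the API server address copied earlier (it is set explicitly in the AWS steps), and -k skips certificate verification, so drop it if your CA is trusted:
# a JSON listing of nodes indicates the token is accepted
curl -sk -H "Authorization: Bearer $TOKEN" "$APISERVER/api/v1/nodes" | head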
Importing this Opspack
Download the application-kubernetes.opspack file from the Releases section of this repository and import it into your Opsview Monitor instance. You can then add the Host Templates you need by following the information links in the tables above.
For more information, refer to Opsview Knowledge Center - Importing an Opspack.