
Mastering AKS Troubleshooting #3: Kernel view and AKS Observability

This post has been republished via RSS; it originally appeared at: Microsoft Tech Community - Latest Blogs.

Introduction

This blog post concludes the three-part series addressing common networking problems that may occur while working with Azure Kubernetes Service (AKS). Although AKS is a managed container orchestration service, issues can still arise and require troubleshooting.

 

The earlier blog post covered endpoint connectivity issues across virtual networks and port configuration problems with services and their associated pods. This article focuses on solving issues using Linux toolsets to get a kernel view of the Kubernetes layout, and on using Container Insights to view logging and diagnostics and take remedial actions.

 

Prerequisites

Before setting up AKS, ensure that you have the necessary Azure account and subscription permissions, as well as PowerShell installed on your system. Follow the setup and scenario instructions found in this GitHub link. It's important to be familiar with troubleshooting inbound and outbound networking scenarios in AKS. The environment shown in the figure uses a custom VNet with an NSG attached to its subnet. Additionally, AKS uses this custom subnet and creates its own NSG attached to the Nodepool's network interface.

 

Scenario 5: Using Linux toolset to analyze failed application

Objective: Within the cluster, there are two applications running: one functions correctly, while the other is experiencing issues that cause ‘curl’ to fail with a timeout. In this lab we will use tools available on the Linux node hosting the application to diagnose the problem with the faulty application.

 

Step 1: Set up the environment.

  1. Enable Cloud Shell within the Azure Portal. Select the Bash option, set the storage, and allow setup to complete.
  2. From the AKS blade > Overview > Connect, run the ‘az account ..’ and ‘az aks get-credentials ..’ commands in the Cloud Shell, then verify that kubectl commands work.
  3. Create and switch to the newly created namespace.

kubectl create ns student
kubectl config set-context --current --namespace=student

# Verify current namespace
kubectl config view --minify --output 'jsonpath={..namespace}'

  4. Download the kubectl-node_shell utility, used later to open a privileged shell on a node (get node names with 'kubectl get nodes').

curl -LO https://github.com/kvaps/kubectl-node-shell/raw/master/kubectl-node_shell
chmod +x ./kubectl-node_shell
./kubectl-node_shell <node-name>

  5. From the cloned solutions repository, deploy the applications.

cd Lab5; .\working.ps1

The scripts, shown below, set up two deployments with 3 Pod replicas each and a service of type LoadBalancer listening on port 4000. The applications are identical except for the image: one deployment works and the other does not.

 

Working
---
apiVersion: v1
kind: Service
metadata:
  name: working-app-clusterip
spec:
  type: LoadBalancer
  ports:
  - port: 4000
    protocol: TCP
    targetPort: 4000
  selector:
    app: working-app
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: working-app-deployment
  labels:
    app: working-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: working-app
  template:
    metadata:
      labels:
        app: working-app
    spec:
      containers:
      - name: working-app
        image: jvargh/nodejs-app:working
        ports:
        - containerPort: 4000
---
Faulty
---
apiVersion: v1
kind: Service
metadata:
  name: faulty-app-clusterip
spec:
  type: LoadBalancer
  ports:
  - port: 4000
    protocol: TCP
    targetPort: 4000
  selector:
    app: faulty-app
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: faulty-app-deployment
  labels:
    app: faulty-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: faulty-app
  template:
    metadata:
      labels:
        app: faulty-app
    spec:
      containers:
      - name: faulty-app
        image: jvargh/nodejs-app:faulty
        ports:
        - containerPort: 4000
---

 

# Create a test pod for in-cluster connectivity checks
kubectl run test-pod --image=nginx --port=80 --restart=Never

# Locate the NSG created for the custom subnet and its resource group
$custom_aks_nsg="custom_aks_nsg" # <- verify
$nsg_list=az network nsg list `
--query "[?contains(name,'$custom_aks_nsg')].{ ResourceGroup:resourceGroup}" --output json

# Extract NSG Resource Group
$resource_group=$(echo $nsg_list | jq -r '.[].ResourceGroup')
echo $nsg_list, $custom_aks_nsg, $resource_group

az network nsg rule create --name AllowHTTPInbound `
--resource-group $resource_group --nsg-name $custom_aks_nsg `
--destination-port-range * --destination-address-prefix * `
--source-address-prefixes Internet --protocol tcp `
--priority 100 --access allow
# Test internal access within cluster
kubectl exec -it test-pod -- curl working-app-clusterip:4000 # works
kubectl exec -it test-pod -- curl faulty-app-clusterip:4000 # fails with Connection refused

# Test external access to cluster
curl <Working-App-External-IP>:4000 # works
curl <Faulty-App-External-IP>:4000 # fails with `Unable to connect to the remote server`

 

Step 2: Walk through the Kubernetes view

Use this step to confirm that the Kubernetes setup is configured correctly along the path from the Internet through to the Pod.

  1. The curl request hits the load balancer public IP assigned to the service. The service IP gets added as a frontend IP rule to the existing AKS load balancer.
  2. The service ties into its endpoints, forwarding requests to the Pods and, in turn, to the application container. The commands after this list confirm the chain.
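The kubectl commands below are a quick way to confirm this chain; they assume the ‘student’ namespace and the service and deployment names created in Step 1.

kubectl get svc working-app-clusterip faulty-app-clusterip -o wide   # EXTERNAL-IP is the load balancer frontend IP
kubectl get endpoints working-app-clusterip faulty-app-clusterip     # both should list the three Pod IPs on port 4000
kubectl get pods -l app=faulty-app -o wide                           # Pods are Running, so the issue is not scheduling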

 

Step 3: Verify Loadbalancer Insights and Metrics

From the Load Balancer blade, go to AKS LB > Insights. Ensure the load balancer is functional and capturing metrics. From the view below you can see there’s an issue with the backend pool.

 

From Detailed metrics > ‘Frontend and Backend Availability’, you should see that the Failing app’s frontend IP is Red for Availability while the Working app’s frontend IP is Green. Change ‘Time Range’ to 5m.
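As an optional check, the sketch below pulls the same availability metrics with the Azure CLI. It assumes the default AKS load balancer named ‘kubernetes’ in the node resource group, and the metric names DipAvailability (Health Probe Status) and VipAvailability (Data Path Availability); adjust names to your environment.

LB_ID=$(az network lb show -g <MC_node_resource_group> -n kubernetes --query id -o tsv)
az monitor metrics list --resource $LB_ID --metric DipAvailability --interval PT1M -o table   # backend (health probe) availability
az monitor metrics list --resource $LB_ID --metric VipAvailability --interval PT1M -o table   # frontend (data path) availability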

 

Step 4: Perform network trace to the Faulty app

Use this step to confirm whether the Faulty app is even listening. We should see the Working app responding while the Faulty app does not.

# IP addresses listed below apply to this example and are for reference only. Replace them with your own, collected as shown after this block.
test-pod-ip =       #10.244.0.61
working-app-svc =   #10.0.81.248
faulty-app-svc =    #10.0.189.236
working-pod-IPs =   #10.244.0.54, 55, 56
faulty-pod-IPs =    #10.244.0.57, 58, 59
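The values above can be collected with commands along these lines (namespace ‘student’ assumed):

kubectl get pod test-pod -o wide                              # test-pod IP
kubectl get svc working-app-clusterip faulty-app-clusterip    # service cluster IPs and external IPs
kubectl get pods -l app=working-app -o wide                   # working Pod IPs and their node
kubectl get pods -l app=faulty-app -o wide                    # faulty Pod IPs and their node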

 

  1. Get the test-pod IP and the destination service IP, and run tcpdump on the node hosting the Pods.
  2. For example, the Working app produces a trace when curled:
kubectl exec -it test-pod -- curl <working-app-svc>:4000
  3. From Cloud Shell in the Azure Portal, run the command below. Get the node name from ‘kubectl get pods -o wide’.
kubectl-node_shell <Node associated with pods>
  4. Set up a trace between test-pod and the Pod network.
tcpdump -en -i any src <test-pod IP> and dst net 10.244.0.0/16
  5. From another terminal, curl the Faulty and Working apps’ services.
kubectl exec -it test-pod -- curl <faulty-app-svc>:4000
kubectl exec -it test-pod -- curl <working-app-svc>:4000

Working App

Faulty App

From the trace above, there is a response only from the Working App Pod; there is no response from the Faulty App Pod.

 

Step 5: Collect tcpdump trace for Wireshark view

This section captures the trace to a file, copies it from the nsenter pod to the local desktop, and visualizes it in Wireshark. Two consoles are needed.

  1. From Cloud Shell, run kubectl-node_shell <Node associated with pods>, then run the commands below.
cd /tmp

tcpdump -nn -s0 -vvv -i any -w capture.cap
where,
     -nn: show IP addresses and port numbers in numeric form
     -s0: snapshot length 0, i.e., capture the entire packet
     -vvv: maximum verbosity
     -w: write the raw capture to the named file
  2. From the 2nd console, run the command below to view the HTML output.
kubectl exec -it <test-pod> -- curl <working-app-pod>
  3. On Cloud Shell, stop tcpdump (CTRL+C); capture.cap is written to /tmp.
  4. From the 2nd console, download capture.cap with the command below. Use ‘kubectl get pods’ to get the nsenter pod name.
kubectl cp <nsenter-pod>:/tmp/capture.cap capture.cap

# Wireshark needs to be installed on the local desktop for the next step.

  5. Open capture.cap in Wireshark. Use the filter below to refine the view.
ip.addr == <test-pod IP> # optionally add: and ip.addr == <working-app-pod IP>
  6. Use Analyze > Follow > HTTP Stream to view the HTTP flow as seen below.
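If you prefer the command line, the same capture can be inspected with tshark (Wireshark’s command-line companion, if installed); a minimal sketch:

tshark -r capture.cap -Y "ip.addr == <test-pod IP>"   # apply the same display filter
tshark -r capture.cap -Y "http"                       # show only the HTTP requests/responses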

 

  7. For long-running traces that need to be saved to a storage account, use the utility below. The Helm install creates a storage account, and a DaemonSet creates tcpdump Pods on all nodes that continuously write the capture to the storage account.

https://github.com/amjadaljunaidi/tcpdump/blob/main/README.md

Uninstall the Helm chart to stop tracing; the capture remains intact in the storage account.

  8. To focus on a single node rather than all nodes as above, use Lab5 > tcpdump-pod.yaml. Change the node name and apply it with the command below. The storage account’s file share should then contain the tcpdump output.
kubectl apply -f tcpdump-pod.yaml

 

View from Storage

View from Pod

On completion, delete it using “kubectl delete -f tcpdump-pod.yaml”. Delete the storage account to remove the file share.

 

Step 6: Walk through the Linux Kernel view

Use this step to confirm that everything is configured correctly at the Linux kernel level, allowing packets to flow. Note that this is not a firewall issue, since the Working-App Pods can be reached from the Internet.

 

  1. Run the commands below. kubectl-node_shell provides elevated privileges on the node.
kubectl get pods -o wide # gives the node name to use below
kubectl-node_shell <Node associated with pods>

# iptables has a chain structure, managed by the kube-proxy pods in the cluster.
  2. View the faulty app’s iptables NAT table and show the KUBE-SERVICES chain, using the command below to display the service’s internal and external IPs.
iptables -t nat -nL KUBE-SERVICES | grep faulty-app
  3. Walk down the chain with the command below, which lists the endpoints for the service along with each endpoint’s selection probability. Running it again against an endpoint chain gives the Pod IP associated with that endpoint.
iptables -t nat -nL <kube-service id>

In the KUBE-SERVICES chain, if src=0.0.0.0/0 (ANY) and protocol=TCP, the packet is forwarded from KUBE-SERVICES to the KUBE-SVC chain as its next hop. KUBE-SVC represents the service’s cluster IP.

In the KUBE-SVC chain, if src=0.0.0.0/0 (ANY) and protocol=TCP, the packet is forwarded from KUBE-SVC to a KUBE-SEP chain as its next hop. KUBE-SEP represents an endpoint. Notice there are 3 rules for the 3 Pods: Pod1 gets 1/3 of the traffic, Pod2 gets 1/2 of the remainder, and the rest goes to Pod3, so each Pod receives roughly a third overall. This statistical, probability-based load balancing can affect latency, especially in multi-zone clusters where Pods are distributed across zones and extra hops add latency.

In the KUBE-SEP chain, if IP=<Pod-IP>, the incoming packet is directed to the designated Pod.
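Putting the chain walk together, a hedged sketch (the KUBE-SVC/KUBE-SEP hashes are cluster-specific placeholders):

iptables -t nat -nL KUBE-SERVICES | grep faulty-app   # find the KUBE-SVC-<hash> chain for the service
iptables -t nat -nL KUBE-SVC-<hash>                   # lists KUBE-SEP-<hash> endpoint chains with their probabilities
iptables -t nat -nL KUBE-SEP-<hash>                   # shows the DNAT rule to <Pod-IP>:4000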

 

  4. Validate the route associated with the Pod network and its eth0 interface. This route should map to the AKS route table used for kubenet networking.

  5. Use the ‘crictl ps’ command to list the running containers on the node and interact with the container runtime interface (CRI). The command below lists the containers labelled faulty.

crictl ps | grep faulty

Map this to ‘kubectl get pods -o wide | grep faulty’ to get a match on Pod names.

# Use ‘kubectl get pods’ to get one of the faulty pod names to grep for here.
# Use the obtained Container ID of one of the faulty app containers to return the Process ID.

crictl inspect --output go-template --template '{{.info.pid}}' <container_id>

  6. Use the process ID to enter the Pod’s network namespace with the nsenter command, which allows executing commands inside the Pod’s namespace. Here, ‘ip address show’ displays the Pod IP, and running ‘kubectl get pods -o wide’ confirms from the IP that we’re on the right Pod, as sketched below.
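A minimal sketch, using the PID returned by crictl inspect above:

nsenter -t <pid> -n ip address show        # run in the node shell; eth0 shows the Pod IP
kubectl get pods -o wide | grep <Pod-IP>   # run from Cloud Shell; confirms which Pod owns that IP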

 

Step 7: Confirm if App is listening

This step uses the lsof (List Open Files) utility with the following parameters:

Command to use:   nsenter -t <Process ID of Working or Faulty container> -n lsof -i -P -n

From the output below, the working container is listening on ANY IP, i.e., *:4000.

The faulty container is bound to the local loopback address (127.0.0.1) instead of ANY, as above. The sketch below shows how this can be confirmed from the node.
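A hedged way to confirm the loopback-only bind, reusing nsenter with the faulty container’s PID; it assumes curl is available on the node, and the 3-second timeout is arbitrary:

nsenter -t <faulty-pid> -n curl -s --max-time 3 http://127.0.0.1:4000         # responds: the app is up, but only on loopback
nsenter -t <faulty-pid> -n curl -s --max-time 3 http://<faulty-pod-IP>:4000   # no response: unreachable on the Pod IP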

 

Step 8: Fixing the issue

The issue was in the Dockerfile: the working app is set to bind to 0.0.0.0 (the default/any address), while the faulty app is set to bind to the fixed loopback address 127.0.0.1, as seen below.

Working

Faulty

 

Step 9: Challenge                                                                                                           

From the docker-app folder, fix the Dockerfile for the Faulty app, build a new image, and create a new Pod using this image to check whether it resolves the issue, along the lines sketched below.
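A hedged sketch of that flow; the registry name and image tag are placeholders, not part of the lab:

docker build -t <your-registry>/nodejs-app:fixed ./docker-app     # rebuild after fixing the bind address
docker push <your-registry>/nodejs-app:fixed
kubectl set image deployment/faulty-app-deployment faulty-app=<your-registry>/nodejs-app:fixed
kubectl rollout status deployment/faulty-app-deployment
kubectl exec -it test-pod -- curl faulty-app-clusterip:4000       # should now return a response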

 

Step 10: Cleanup                                                                         

k delete ns student
az network nsg rule delete --name AllowHTTPInbound `
--resource-group $resource_group --nsg-name $custom_aks_nsg

 

Scenario 6: Enable AKS Monitoring and Logging

Objective: Enable Container Insights to provide container performance and health monitoring. Also enable Container Diagnostics to collect container logs and metrics and make them available for analysis and troubleshooting.

 

Step 1: Set up the environment.

  1. Set up AKS as outlined in this script. Clone the solutions GitHub link and cd to the Lab6 folder.
  2. Create and switch to the newly created namespace.
kubectl create ns student
kubectl config set-context --current --namespace=student

# Verify current namespace
kubectl config view --minify --output 'jsonpath={..namespace}'
  3. Confirm Container Insights has been set up; this was done during AKS cluster creation in the Lab setup section. From the AKS blade in the portal > Monitor > Insights, confirm metrics collection. A CLI check is sketched below.
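The add-on state can also be checked, or enabled, from the CLI. This is a sketch: the resource group and cluster names are placeholders, and the add-on profile key may vary by CLI version.

az aks show -g <aksrg> -n <aks-cluster> --query addonProfiles.omsagent.enabled -o tsv   # 'true' if Container Insights is on
az aks enable-addons -g <aksrg> -n <aks-cluster> --addons monitoring                    # enable it if not already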

 

Step 2: Deploy and Monitor apps that spike CPU/Memory utilization

  1. Assuming the namespace ‘student’ still exists, run working.ps1, shown below, to generate CPU and memory load.

 

$kubectl_apply = @"
---
# deployment to generate high cpu
apiVersion: apps/v1
kind: Deployment 
metadata:
    name: openssl-loop 
    namespace: student
spec:
    replicas: 3 
    selector: 
      matchLabels:
        app: openssl-loop 
    template: 
      metadata: 
        labels:
          app: openssl-loop 
      spec:
        containers:
        - args:
          - |
            while true; do
              openssl speed >/dev/null; 
            done 
          command:
          - /bin/bash
          - -c
          image: polinux/stress 
          name: openssl-loop
---
# deployment to generate high memory
apiVersion: apps/v1
kind: Deployment 
metadata:
    name: stress-memory
    namespace: student
spec:
    replicas: 3 
    selector: 
      matchLabels:
        app: stress-memory
    template: 
      metadata: 
        labels:
          app: stress-memory
      spec:
        containers:
        - image: polinux/stress
          name: stress-memory-container
          resources:
            requests:
              memory: 50Mi          
            limits:
              memory: 50Mi          
          command: ["stress"]
          args: ["--vm", "1", "--vm-bytes", "250M", "--vm-hang", "1"]
---
"@
$kubectl_apply | kubectl apply -f -

 

‘kubectl get pods’ should show the stress-memory pods in ‘CrashLoopBackOff’ and the openssl-loop pods in ‘Pending’.

 

  2. From the Insights tab, validate the CPU/Memory consumption.

From the Nodes tab, see if the top consuming Pods match those deployed. The commands below give an equivalent view from kubectl.
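The same spikes can be cross-checked from kubectl (metrics-server is enabled by default on AKS):

kubectl top pods -n student                                        # per-Pod CPU/memory usage
kubectl top nodes                                                  # node-level pressure
kubectl get events -n student --sort-by=.lastTimestamp | tail -20  # OOMKilled / scheduling events behind the Pod states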

 

Step 3: View container logs and generate an alert resulting in email

  1. From Logs, run the query below against the KubePodInventory table to get the Pod results.

 

KubePodInventory
| where TimeGenerated > ago(2h)
| where ContainerStatusReason == "CrashLoopBackOff"
| where Namespace == "student"
| project TimeGenerated, Name, ContainerStatus, ContainerStatusReason

  2. Create an alert as highlighted above.
  3. Confirm email receipt on the next occurrence of the alert threshold. The same query can also be run from the CLI, as sketched below.
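A sketch of running the query outside the portal with the Azure CLI’s log-analytics extension; the workspace GUID placeholder is your Log Analytics workspace ID:

az monitor log-analytics query `
--workspace <log-analytics-workspace-guid> `
--analytics-query "KubePodInventory | where TimeGenerated > ago(2h) | where ContainerStatusReason == 'CrashLoopBackOff' | where Namespace == 'student' | project TimeGenerated, Name, ContainerStatus, ContainerStatusReason" `
--output table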

 

Step 4: Search Diagnostics logs

  1. Ensure the AzureDiagnostics table is visible in Logs. If available, run the commands below to create and delete objects; this generates additional log data.

 

k create ns test-diag
k create deploy deploy-diag-alert --image busybox -n test-diag
k delete deploy deploy-diag-alert -n test-diag

The Queries section should lead to the query finder for locating AzureDiagnostics logs, if they exist. The CLI sketch below checks whether diagnostic settings are configured on the cluster.
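If the AzureDiagnostics table is missing, a diagnostic setting that routes the kube-audit category to the Log Analytics workspace is probably not configured. A hedged check from the CLI (resource group and cluster names are placeholders):

$aks_id=$(az aks show -g <aksrg> -n <aks-cluster> --query id -o tsv)
az monitor diagnostic-settings list --resource $aks_id --output table   # should include a setting with kube-audit logs enabled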

 

  2. Run the queries below to view the log data. Log details are found in the log_s column; using parse_json() you can drill down into embedded fields, objects, or arrays.
AzureDiagnostics
| where Category contains "kube-audit" 
| extend log=parse_json(log_s)
| extend verb=log.verb
| extend resource=log.objectRef.resource
| extend ns=log.objectRef.namespace
| extend name=log.objectRef.name
| where resource == "pods"
| where ns=="test-diag"
| project TimeGenerated, verb, resource, name, log_s

To get a graphical view, run the query below, which renders a line chart of the Pods created in the ‘test-diag’ namespace.

AzureDiagnostics
| where Category contains "kube-audit" 
| extend log=parse_json(log_s)
| extend verb=log.verb
| extend resource=log.objectRef.resource
| extend name=log.objectRef.name
| extend ns=log.objectRef.namespace
| where resource == "pods"
| where verb=="create"
| where ns=="test-diag"
| summarize count() by bin(TimeGenerated, 1m), tostring(name), tostring(verb)
| render timechart

 

Step 5: Challenge                                                              

Repeat labs 1 to 5 and use the Logs section above to query and analyze the logs.

 

Step 6: Final cleanup

az group delete -n <aksrg> -y

 

Conclusion

This post illustrated how Linux tools provide a kernel-level view of Kubernetes processes and help get to the root cause of a faulty application. We also saw how Container Insights and diagnostics support analysis and troubleshooting using logs and metrics. We hope this three-part series has been helpful in your AKS troubleshooting journey, and that the techniques and best practices discussed will help you resolve issues efficiently and effectively.

 

Disclaimer

The sample scripts are not supported by any Microsoft standard support program or service. The sample scripts are provided AS IS without a warranty of any kind. Microsoft further disclaims all implied warranties including, without limitation, any implied warranties of merchantability or of fitness for a particular purpose. The entire risk arising out of the use or performance of the sample scripts and documentation remains with you. In no event shall Microsoft, its authors, or anyone else involved in the creation, production, or delivery of the scripts be liable for any damages whatsoever (including, without limitation, damages for loss of business profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to use the sample scripts or documentation, even if Microsoft has been advised of the possibility of such damages.

 
