Container abruptly killed with warning "cleaning up after killed shim"

We recently upgraded Docker from version 17.06.0-ce to 18.09.2 in our deployment environment. Since then, we have experienced containers being killed abruptly after running for a few days, without much information in the Docker logs.

We monitored memory usage: the affected containers stay well below their per-container limits, and the host itself has plenty of free memory.
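
For reference, a typical way to verify this (a generic suggestion, not verbatim from our setup) is to snapshot per-container usage with docker stats and scan the kernel log for OOM-killer activity:

    # Per-container memory usage versus its limit
    docker stats --no-stream --format "table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}"

    # Any kernel OOM-killer activity on the host?
    dmesg -T | grep -i -E "out of memory|oom|killed process"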

Setup observations during the issue:

  1. Docker version 18.09.2 with around 30 running containers.
  2. The affected container was killed after running for a few days.

Docker logs observed during the container crash:

Nov 16 15:42:11 site1 containerd[1762]: time="2020-11-16T15:42:11.171040904Z" level=info msg="shim reaped" id=d39355d3061d461ad4a305c717b699bd332aae50d47c2bf2b547bef50f767c7d
Nov 16 15:42:11 site1 containerd[1762]: time="2020-11-16T15:42:11.171156262Z" level=warning msg="cleaning up after killed shim" id=d39355d3061d461ad4a305c717b699bd332aae50d47c2bf2b547bef50f767c7d namespace=moby
Nov 16 15:42:11 site1 dockerd[3022]: time="2020-11-16T15:42:11.171164295Z" level=warning msg="failed to delete process" container=d39355d3061d461ad4a305c717b699bd332aae50d47c2bf2b547bef50f767c7d error="ttrpc: client shutting down: ttrpc: closed: unknown" module=libcontainerd namespace=moby process=b0d77b1ebf2c82b09c152530a5e24491d76e216b852e385686c46128c94e7f5a
Nov 16 15:42:11 site1 c73920e3476c[3022]: INFO: 2020/11/16 15:42:11.396872 [nameserver a6:0c:6a:18:69:1f] container d39355d3061d461ad4a305c717b699bd332aae50d47c2bf2b547bef50f767c7d died; tombstoning entry test-endpoint-s104.weave.local. -> 10.44.0.14
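
If this happens again, it may help to capture the dead container's recorded exit state before it is cleaned up; OOMKilled=true with ExitCode=137 would point to an in-container memory kill, whereas the "cleaning up after killed shim" warning above suggests the shim process itself died. A minimal check (using the container ID from the logs above):

    docker inspect --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}} FinishedAt={{.State.FinishedAt}}' \
        d39355d3061d461ad4a305c717b699bd332aae50d47c2bf2b547bef50f767c7d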


Output of Docker version:

Client:
 Version:           18.09.2
 API version:       1.39
 Go version:        go1.10.6
 Git commit:        6247962
 Built:             Sun Feb 10 04:13:50 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.2
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.6
  Git commit:       6247962
  Built:            Sun Feb 10 03:42:13 2019
  OS/Arch:          linux/amd64
  Experimental:     false



Output of Docker Info:

Containers: 30
 Running: 25
 Paused: 0
 Stopped: 5
Images: 236
Server Version: 18.09.2
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: journald
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9754871865f7fe2f4e74d43e2fc7ccd237edcbce
runc version: 09c8266bf2fcf9519a651b04ae54c967b9ab86ec
init version: fec3683
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-171-generic
Operating System: Ubuntu 16.04.6 LTS
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 62.92GiB
Name: fpas-site1-dra-director-a
ID: KKSM:3YNF:LE7N:NVFE:Y5C4:C6CN:LAQT:QRRZ:VYQS:O4PP:VQKG:DXTK
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Labels:
 com.broadhop.swarm.uuid=uuid4:d96aef99-b5fc-44e3-b7fa-65b08b7e30f3
 com.broadhop.swarm.role=endpoint-role
 com.broadhop.swarm.node=
 com.broadhop.swarm.hostname=site1
 com.broadhop.swarm.mode=
 com.broadhop.network.interfaces=internal:172.26.50.13
Experimental: false
Insecure Registries:
 registry:5000
 127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine

WARNING: API is accessible on http://127.0.0.1:2375 without encryption.
         Access to the remote API is equivalent to root access on the host. Refer
         to the 'Docker daemon attack surface' section in the documentation for
         more information: https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface
WARNING: No swap limit support
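
As an aside (a general note, unrelated to the crash itself): the "No swap limit support" warning means the kernel is not accounting swap for cgroups, so any --memory-swap limits on containers are not enforced. On Ubuntu this is usually enabled via kernel boot parameters, roughly:

    # /etc/default/grub (assuming the standard Ubuntu layout)
    GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"

    # Apply and reboot
    sudo update-grub && sudo reboot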

NOTE: This deployment is on critical infrastructure, so we want to understand why this happened and ensure it does not occur again. Has anyone faced the same kind of issue in any environment? Please let us know if there are known issues with the Docker versions in use.

1 Answer

Answered by Aziz F Dagli

The Go version your Docker build uses (go1.10.6) is quite old; you may want to update (newer Docker releases are built with newer Go). I found this related issue on GitHub:

https://github.com/moby/moby/issues/38742
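
One thing worth checking (a generic suggestion, not taken from the linked issue): whether the containerd-shim process for the container was itself killed on the host, e.g. by the kernel OOM killer, since the "cleaning up after killed shim" warning indicates the shim died rather than exiting cleanly. For example:

    # Kernel messages around the crash window
    journalctl -k --since "2020-11-16 15:30" | grep -i -E "oom|containerd-shim"

    # containerd's view of the shim lifecycle for this container
    journalctl -u containerd --since "2020-11-16 15:30" | grep d39355d3061d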