I have successfully upgraded Deis to v1.0.1 with 3 nodes cluster, with each node having 2GB ram, hosted by Digital Ocean.
I then nse'ed into a deis-store-monitor
service, ran ceph -s
, and realized it has entered active+undersized+degraded
state, and never get back to the active+clean
state.
Detail messages follow:
root@deis-2:/# ceph -s
libust[276/276]: Warning: HOME environment variable not set. Disabling LTTng-UST per-user tracing. (in setup_local_apps() at lttng-ust-comm.c:305)
cluster dfa09ba0-66f2-46bb-8d84-12795f281f7d
health HEALTH_WARN 1536 pgs degraded; 1536 pgs stuck unclean; 1536 pgs undersized; recovery 1314/3939 objects degraded (33.359%)
monmap e3: 3 mons at {deis-1=10.132.183.190:6789/0,deis-2=10.132.183.191:6789/0,deis-3=10.132.183.192:6789/0}, election epoch 28, quorum 0,1,2 deis-1,deis-2,deis-3
mdsmap e32: 1/1/1 up {0=deis-1=up:active}, 2 up:standby
osdmap e77: 3 osds: 2 up, 2 in
pgmap v109093: 1536 pgs, 12 pools, 897 MB data, 1313 objects
27342 MB used, 48256 MB / 77175 MB avail
1314/3939 objects degraded (33.359%)
1536 active+undersized+degraded
client io 817 B/s wr, 0 op/s
I am totally new to ceph. I wonder:
- Is it a big deal to fix this issue, or could I let it be in this state?
- If it is recommended to fix this, would you point out how should I go about it?
I read about Ceph troubleshooting section and POOL, PG AND CRUSH CONFIG REFERENCE, but still have no idea what I should do next.
Thanks a lot!
From this output:
osdmap e77: 3 osds: 2 up, 2 in
. It sounds like one of yourdeis-store-daemons
isn't responding.deisctl restart store-daemon
should recover your cluster, but I'd be curious about what happened to that daemon. I'd love to seejournalctl --no-pager -u deis-store-daemon
on all of your hosts. If you could add your logs to https://github.com/deis/deis/issues/2520 that'd help us figure out why the daemon isn't responding.Also, 2GB nodes on DO will likely result in performance issues (and Ceph may be unhappy).