We have an XMPP system used by our software that uses an ejabberd server to send realtime messages. Think of this as a 2010 era homegrown version of Firebase Cloud Messaging.
We recently upgraded from ejabberd-16 to ejabberd-22.10 (we had to jump because of Let's Encrypt issues with v18 through v20).
Our normal load is 3000 to 4000 active users.
Since the upgrade, when our server gets above 1000 active users, the number of running beam.smp processes explodes. Each one takes 10-20% of CPU, which pulls our server down. I can work around this by stopping ejabberd for a few minutes and restarting it, which knocks the number of active users down. But I really need to get back to our full volume of 3000-4000 active users.
top - 08:05:09 up 20:50, 2 users, load average: 40.03, 22.40, 15.82
Tasks: 643 total, 11 running, 497 sleeping, 0 stopped, 0 zombie
%Cpu(s): 61.1 us, 35.8 sy, 0.0 ni, 0.1 id, 0.0 wa, 0.0 hi, 0.4 si, 2.7 st
KiB Mem : 16367432 total, 186740 free, 3427940 used, 12752752 buff/cache
KiB Swap: 262140 total, 258300 free, 3840 used. 12440420 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
11019 ejabberd 20 0 2781448 38864 12584 S 19.0 0.2 0:00.64 beam.smp
10096 ejabberd 20 0 2787856 45624 15536 S 15.7 0.3 0:01.10 beam.smp
10543 ejabberd 20 0 2781700 39608 13056 S 15.7 0.2 0:00.74 beam.smp
10678 ejabberd 20 0 2783768 39916 12892 S 15.4 0.2 0:00.66 beam.smp
10749 ejabberd 20 0 2781712 39396 14616 S 14.8 0.2 0:00.87 beam.smp
10745 ejabberd 20 0 2782452 37120 12688 S 12.8 0.2 0:00.50 beam.smp
2088 ejabberd 20 0 2893856 148116 44624 S 12.5 0.9 11:26.30 beam.smp
10755 ejabberd 20 0 2785552 40760 12472 S 12.1 0.2 0:00.44 beam.smp
9260 ejabberd 20 0 2786804 49224 17136 S 11.5 0.3 0:00.95 beam.smp
11319 ejabberd 20 0 2782480 31788 11204 S 11.1 0.2 0:00.34 beam.smp
10093 ejabberd 20 0 2782224 42140 15008 S 10.8 0.3 0:00.91 beam.smp
9986 ejabberd 20 0 2782704 43572 15112 S 10.5 0.3 0:00.87 beam.smp
10169 ejabberd 20 0 2782736 38956 12904 S 9.8 0.2 0:00.73 beam.smp
10407 ejabberd 20 0 2781700 39708 13052 S 9.8 0.2 0:00.72 beam.smp
What configuration am I missing to get my active user count higher? We are using the Mnesia database and wish to keep using it.
I don't have a clear answer, so I'll give several ideas, hoping that one of them points you to something useful. If none of them gives you a clue, you can update your original post with answers to these questions, and somebody else may spot something.
A) Around 1000 concurrent user connections? What a curious number: it reminds me of the "ulimit -n" limit, which was 1024 by default, see https://www.ejabberd.im/benchmark/index.html
B) You are now using Mnesia. I imagine it was also being used in the old deployment, so probably this isn't the problem
C) Are you using some custom module not included in standard ejabberd? Maybe from ejabberd-contrib or elsewhere. Maybe it has some limit, or some incompatibility with the new ejabberd version.
D) Are those clients idle (and just consuming the TCP connection and some RAM), or are they actively doing things (like sending messages to MUC rooms, or changing presences, which consume CPU)?
E) Do all the users use the same XMPP client? Maybe that client behaves strangely with the new ejabberd version.
F) Does the problem increase slowly from 1 client up to 1000? Or does the problem appear suddenly around 1000 connections?
G) BEAM is a virtual machine, which internally has "erlang processes" that you can inspect using something similar to "top". Maybe there is one erlang process, or a few of them, consuming all this CPU...
I can think of two methods to view the erlang processes that exist inside the erlang virtual machine:
An easy method is using the "etop" tool. Simply run:
Alternatively, you can install ejabberd_observer_cli which provides more details:
1 Install it:
2 Now run:
3 In that shell run:
4 Press H and then Enter to view the Home screen
What you are looking for: processes that have a lot of Reds/Reductions, which means they are executing many functions many times; or that have a large Message Queue, which means they are saturated and can't handle the load fast enough; or that consume a lot of memory.
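Going back to idea A: the limit that matters is the one applied to the running ejabberd process, which may differ from what "ulimit -n" shows in your login shell. Here is a minimal sketch for checking it, assuming a Linux server with /proc available. The function names are my own, and PID 2088 is just the long-running beam.smp from your top output, so substitute whatever PID your node has now.

```python
import resource


def parse_max_open_files(limits_text):
    """Return (soft, hard) for "Max open files" from the text of a
    /proc/<pid>/limits file, or None if the line is not present."""
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Layout: "Max open files  <soft>  <hard>  files"
            fields = line.split()
            soft, hard = fields[3], fields[4]
            # Values are numbers or the word "unlimited"
            as_value = lambda v: int(v) if v.isdigit() else v
            return as_value(soft), as_value(hard)
    return None


def read_process_nofile_limit(pid):
    """Read the open-file limit of a running process (Linux only)."""
    with open(f"/proc/{pid}/limits") as f:
        return parse_max_open_files(f.read())


if __name__ == "__main__":
    # Limit of the current shell/session, for comparison.
    print("this session:", resource.getrlimit(resource.RLIMIT_NOFILE))
    # Assumption: 2088 is the PID of the main beam.smp process.
    try:
        print("beam.smp:", read_process_nofile_limit(2088))
    except OSError as err:
        print("could not read limits:", err)
```

If the soft limit reported for beam.smp is 1024, that would fit the "around 1000 users" symptom, since every client connection uses a file descriptor.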