Erlang supervisor shuts down after running child

I have a test module and a one_for_one supervisor.

test.erl

-module(test).

-export([do_job/1,run/2, start_worker/1]).


run(Id, Fun) ->
    test_sup:start_child(Id, [Fun]).


do_job(Fun) ->
    Fun().


start_worker(Args) ->
    Pid = spawn_link(test, do_job, Args),
    io:format("started ~p~n",[Pid]),
    {ok, Pid}.

test_sup.erl

-module(test_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).
-export([start_child/2]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).


init(_Args) ->
    SupFlags = #{strategy => one_for_one, intensity => 2, period => 20},
    {ok, {SupFlags, []}}.


start_child(Id, Args) ->
    ChildSpecs = #{id => Id,
                   start => {test, start_worker, [Args]},
                   restart => transient,
                   shutdown => brutal_kill,
                   type => worker,
                   modules => [test]},

    supervisor:start_child(?MODULE, ChildSpecs).

Now I start the supervisor in the shell and run test:run(id, fun() -> erlang:throw(err) end). It works at first: start_worker/1 runs three times (the initial start plus two restarts), but after that an exception occurs, the supervisor process shuts down, and I have to start it again manually with test_sup:start_link(). What is the problem?

Shell:

1> test_sup:start_link().
{ok,<0.36.0>}
2> test:run(id, fun() -> erlang:throw(err) end).
started <0.38.0>
started <0.39.0>
started <0.40.0>
{ok,<0.38.0>}

=ERROR REPORT==== 16-Dec-2016::23:31:50 ===
Error in process <0.38.0> with exit value:
{{nocatch,err},[{test,do_job,1,[]}]}

=ERROR REPORT==== 16-Dec-2016::23:31:50 ===
Error in process <0.39.0> with exit value:
{{nocatch,err},[{test,do_job,1,[]}]}

=ERROR REPORT==== 16-Dec-2016::23:31:50 ===
Error in process <0.40.0> with exit value:
{{nocatch,err},[{test,do_job,1,[]}]}
** exception exit: shutdown
1>

1 Answer

Answered by Ryan Stewart (accepted):

What is the problem?

There is no "problem". It's working exactly as you told it to:

To prevent a supervisor from getting into an infinite loop of child process terminations and restarts, a maximum restart intensity is defined using two integer values specified with keys intensity and period in the above map. Assuming the values MaxR for intensity and MaxT for period, then, if more than MaxR restarts occur within MaxT seconds, the supervisor terminates all child processes and then itself.
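
Those two values are just knobs in your SupFlags map. If you wanted the supervisor to tolerate more crashes before giving up, you could loosen them; a minimal sketch, with illustrative values only:

init(_Args) ->
    %% Tolerate up to 10 restarts within 5 seconds before the
    %% supervisor terminates all children and then itself.
    SupFlags = #{strategy => one_for_one, intensity => 10, period => 5},
    {ok, {SupFlags, []}}.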

Your supervisor's current configuration says, "If I have to restart a child more than two times (intensity) within 20 seconds (period), then something is wrong, so just shut down." You can count this in your shell session: <0.38.0> is the initial start, <0.39.0> and <0.40.0> are the two allowed restarts, and when <0.40.0> crashes, a third restart within 20 seconds would exceed the limit, so the supervisor terminates itself instead. As for why you have to restart it manually: your supervisor isn't supervised itself. If it were the child of another supervisor, that parent would try to restart it based on its own configuration, as sketched below.
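
For example, a minimal parent supervisor might look like this (the module name top_sup is made up for illustration, and the flags are again only illustrative):

-module(top_sup).
-behaviour(supervisor).

-export([start_link/0]).
-export([init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init(_Args) ->
    SupFlags = #{strategy => one_for_one, intensity => 2, period => 20},
    %% test_sup is a permanent child of type supervisor, so the
    %% parent restarts it whenever it terminates, including when it
    %% shuts down after exceeding its own restart intensity.
    ChildSpecs = [#{id => test_sup,
                    start => {test_sup, start_link, []},
                    restart => permanent,
                    shutdown => infinity,
                    type => supervisor,
                    modules => [test_sup]}],
    {ok, {SupFlags, ChildSpecs}}.

Note that this only moves the problem up a level: if the child keeps crashing, test_sup keeps hitting its restart limit, and eventually top_sup will exceed its own intensity and shut down too.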