system() occasionally returns 2


I wrote a function that wraps the system() library function, as below:

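#include <stdlib.h>    /* system() */
#include <sys/wait.h>  /* WIFEXITED(), WEXITSTATUS() */

/* LINFO is the application's own logging macro, defined elsewhere. */
int execute(const char* cmd)
{
  int ret = system(cmd);

  if (ret != -1)
  {
    if (WIFEXITED(ret))
      ret = WEXITSTATUS(ret);  /* normal exit: keep the exit code */
    else
      ret = -1;                /* abnormal termination, e.g. a signal */
  }

  LINFO("execute %s, ret = %d", cmd, ret);  // logging
  return ret;
}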

Then, I called it with a shell script as below:

#!/bin/sh
PATH=$PATH:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin:/bin
cd $(dirname $0)

agent_name=`grep 'agent_name' ../etc/config.ini |awk '{ print $3 }'`
py='../../python26/bin/python'

check_alive()
{
    status=`ps -ef | grep "$agent_name" | grep -v "grep" |wc -l`

    if [ $status -ne 0 ]; then
        # process exist
        echo "$agent_name already exist"
        exit 1
    fi    

}

check_alive
eval '$py ../bin/agent.py -d'

status=`ps -ef | grep "$agent_name" | grep -v "grep" |wc -l`
if  [ $status -lt 1 ] 
then
    echo "run failed"
    exit -1
else
    echo "run succ"
    exit 0
fi

But sometimes there was an odd return code of 2, as below:

[INFO]execute ./admin/trystart.sh, ret = 1
[INFO]execute ./admin/trystart.sh, ret = 1
[INFO]execute ./admin/trystart.sh, ret = 2
[INFO]execute ./admin/trystart.sh, ret = 2
[INFO]execute ./admin/trystart.sh, ret = 1
[INFO]execute ./admin/trystart.sh, ret = 1
[INFO]execute ./admin/trystart.sh, ret = 1
[INFO]execute ./admin/trystart.sh, ret = 1
[INFO]execute ./admin/trystart.sh, ret = 1
[INFO]execute ./admin/trystart.sh, ret = 2

I would like to understand why there was a return code of 2.

================new 2015/06/12 13:54============================

I have found that whenever system() returns 2, bash prints an error message like the one below:

bash: xmalloc: locale.c:73: cannot allocate 2 bytes (0 bytes allocated)

There is 1 answer

Answer by rici (best answer)

The status code of 2 is being returned by your shell. Without a lot more details, it will be difficult to diagnose. Adding the actual error message produced by the shell was useful.

If your shell is bash (as it now appears to be), then the return status of 2 indicates a memory allocation error, which is confirmed by the error message generated by bash:

bash: xmalloc: locale.c:73: cannot allocate 2 bytes (0 bytes allocated)

Line 73 is, as far as I can see, the first memory allocation performed by a newly-started bash process (which is confirmed by the error message's indication that no bytes have yet been allocated), so it seems likely that the problem is that malloc cannot allocate any memory.

It is possible that there really is no memory available, particularly if you are running on a heavily-congested system with no swap configured. But there are a few hints scattered around the internet which suggest that this might have to do with memory protection options; in particular, configurations in which sbrk is not available but the malloc library expects to be able to use it.

You might want to start further diagnosis by verifying whether it is possible to reliably start new shells:

for i in {0..999}; do sh -c 'exit 0' || echo Failure $?; done
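Note that {0..999} is brace expansion, which bash and zsh support but a plain POSIX sh such as dash does not. A rough portable sketch of the same check follows; the function name count_shell_failures is made up for illustration, and it assumes sh on the PATH is the shell you want to test:

```shell
#!/bin/sh
# Hypothetical helper: start "sh -c 'exit 0'" $1 times and print how many
# of those shells came back with a non-zero status.
count_shell_failures() {
    total=$1
    i=0
    failures=0
    while [ "$i" -lt "$total" ]; do
        rc=0
        sh -c 'exit 0' || rc=$?
        if [ "$rc" -ne 0 ]; then
            # Per-attempt detail goes to stderr so stdout carries only the count.
            echo "Failure $rc on attempt $i" >&2
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

count_shell_failures 1000
```

On a healthy system this should print 0. Any "Failure 2" lines on stderr would suggest that shell startup itself is failing intermittently, independently of the script.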

What follows is an earlier guess, which might be of use to someone else. Fixing the invocation of exit is recommended, even though it probably had nothing to do with the particular problem in this question.

The dash shell, used as a /bin/sh implementation by a number of distributions, returns status code 2 when the shell exits as a result of an error condition.

One possible culprit from the above script is

 exit -1

Status return codes are eight-bit unsigned values; in other words, legal return codes range from 0 to 255. -1 is not in that range, and you shouldn't use it in a call to exit.

Bash's exit builtin (like the C exit function) simply uses the low order byte of the supplied return code, so with bash you would have seen a return code of 255. But the dash builtin exit expects its argument to be an unsigned number, and complains that -1 is not a valid number.
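The low-order-byte arithmetic can be checked without relying on how any particular shell parses a negative argument to exit. This is a small sketch only; low_byte is a hypothetical helper name, not part of any shell:

```shell
#!/bin/sh
# Hypothetical helper: print the status a parent would see if a child
# passed the integer $1 to the C exit() function (low-order 8 bits only).
low_byte() {
    echo $(( $1 & 255 ))
}

low_byte -1     # prints 255
low_byte 256    # prints 0

# Portable way to get the same status: name a value in 0..255 explicitly.
sh -c 'exit 255' || echo "status: $?"   # prints "status: 255"
```

Writing the intended value explicitly (for example exit 255, or more conventionally exit 1) gives the same result in every POSIX shell.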

Since exit is a special builtin and the shell is not interactive, the error causes the shell to exit, as per the POSIX standard. Dash sets the exit code to 2 when it exits because of a shell error.
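One way the final check in the script could be written so that it behaves the same under bash and dash is sketched here. The function name report_status is invented for illustration, and the choice of 1 as the failure status is an assumption, since any value from 1 to 255 would do:

```shell
#!/bin/sh
# Hypothetical helper: turn a process count ($1) into a message and a
# status in the valid 0..255 range, instead of calling "exit -1".
report_status() {
    if [ "$1" -lt 1 ]; then
        echo "run failed"
        return 1
    fi
    echo "run succ"
    return 0
}

report_status 0 || echo "exit status would be $?"   # run failed, then 1
report_status 3 && echo "exit status would be 0"    # run succ, then 0

# In the original script the last lines would then become:
#   report_status "$status"
#   exit $?
```

Note that a failure status of 1 would be indistinguishable from the "already exist" case in check_alive, so a different value may be preferable if the caller needs to tell the two apart.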