I have a console application written in C# (.NET 4.5) running on Mono 3.2.8 on Linux (Ubuntu 14.04 LTS).
The console app is started as a service with Upstart, and I am logging output using log4net v2.0.5 with a console appender.
Upstart redirects all the output to /var/log/upstart/{appname}.log.
At random, after anywhere from a few hours to a couple of days, the application hangs and nothing more appears in the logs.
What I know:
- Disk is not full
- Memory is not full and there is no swap
- ps shows the process is still running
- The application is no longer sending data to the external server. When it is working properly, the external server receives data every few seconds.
- The log file is no longer being written to
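For reference, the checks in the list above could be scripted roughly as follows. This is only a sketch: the helper names (disk_used_pct, proc_alive, log_age_seconds) are made up for illustration, and it assumes GNU coreutils (df -P, stat -c) as shipped on Ubuntu.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of any tool.

# Print the used-space percentage (without the % sign) of the
# filesystem that holds the path given as $1.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if a process with PID $1 exists and this user may signal it.
proc_alive() {
    kill -0 "$1" 2>/dev/null
}

# Print how many seconds ago the file $1 was last modified (GNU stat).
log_age_seconds() {
    now=$(date +%s)
    mtime=$(stat -c %Y "$1") || return 1
    echo $((now - mtime))
}

# Example calls (PID and paths taken from this question):
#   disk_used_pct /var/log
#   proc_alive 5602 && echo "still running"
#   log_age_seconds "/var/log/upstart/{appname}.log"
```

A log age that keeps growing while proc_alive still succeeds matches the symptom described here: the process exists but has stopped producing output.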
I used the command strace to see the last syscall and this is what I get:
$ strace -p 5602
Process 5602 attached
poll([{fd=6, events=POLLIN}, {fd=8, events=POLLIN}], 2, -1) = ? ERESTART_RESTARTBLOCK (Interrupted by signal)
--- SIGPWR {si_signo=SIGPWR, si_code=SI_TKILL, si_pid=5602, si_uid=1001} ---
rt_sigprocmask(SIG_BLOCK, [XCPU], NULL, 8) = 0
futex(0x26b18c, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigsuspend(~[XCPU RTMIN RT_1]) = ? ERESTARTNOHAND (To be restarted if no handler)
--- SIGXCPU {si_signo=SIGXCPU, si_code=SI_TKILL, si_pid=5602, si_uid=1001} ---
rt_sigreturn() = -1 EINTR (Interrupted system call)
rt_sigprocmask(SIG_UNBLOCK, [XCPU], NULL, 8) = 0
futex(0x26b18c, FUTEX_WAKE_PRIVATE, 1) = 1
rt_sigreturn() = -1 EINTR (Interrupted system call)
File descriptors 6 and 8 are both pipes:
$ file /proc/5602/fd/6
/proc/5602/fd/6: broken symbolic link to `pipe:[6562495]'
$ file /proc/5602/fd/8
/proc/5602/fd/8: broken symbolic link to `pipe:[6562496]'
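file calls these links broken only because pipe:[...] is not a real filesystem path. A sketch of another way to look at the same descriptors, assuming Linux /proc; the helper names fd_target and same_pipe_fds are hypothetical:

```shell
# Print what file descriptor $2 of process $1 points to,
# for example "pipe:[6562495]".
fd_target() {
    readlink "/proc/$1/fd/$2"
}

# List every descriptor of process $1 that has the same target as its
# descriptor $2. For a pipe, a second match would mean the other end
# of the pipe is held by the same process.
same_pipe_fds() {
    target=$(fd_target "$1" "$2") || return 1
    for link in /proc/"$1"/fd/*; do
        [ "$(readlink "$link")" = "$target" ] && echo "${link##*/}"
    done
    return 0
}

# Example calls (PID and descriptors taken from this question):
#   fd_target 5602 6
#   same_pipe_fds 5602 6
```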
I do not understand the root cause of the issue and I do not know what to try to fix it.
EDIT:
As suggested by @sushihangover, I ran the app in a terminal with the --debug option. Eventually the app hung again, with no exception and no error.
The code acquires frames from MJPEG streams and does some work on a background worker to detect objects in the frame.
The work only happens when the background worker is not busy. The last log trace I get is:
[Background worker] nothing to do
I suspect the app is hanging while trying to get the next frame from the camera. I am using AForge.NET to read the MJPEG stream. AForge raises an event every time a new frame arrives.
Here is some code:
private static void Camera_NewFrame(object sender, NewImageEventArgs e)
{
    var bmp = (Bitmap)e.Frame;
    log.DebugFormat("got image " + DateTime.Now.Ticks + " {0} x {1}", bmp.Width, bmp.Height);
    if (!bWorker.IsBusy)
    {
        // Run the background operation to check image and update cloud
        log.Debug("Starting background work");
        bWorker.RunWorkerAsync();
    }
    else { } // skip frame
}

private static void BWorker_DoWork(object sender, DoWorkEventArgs e)
{
    log.Debug("[Background worker] enter");
    if (registeredCar)
    {
        log.Debug("[Background worker] opening the gate");
        OpenGate();
    }
    else
    {
        log.Debug("[Background worker] nothing to do");
    }
}
And here is the backtrace given by gdb attached to the hanging process:
(gdb) bt
#0  0xb6e19fc0 in poll () at ../sysdeps/unix/syscall-template.S:81
#1  0xb4b264be in Mono_Unix_UnixSignal_WaitAny () from /usr/lib/libMonoPosixHelper.so
#2  0xb4b6f740 in ?? ()
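Note that bt prints only the currently selected thread; in gdb, "thread apply all bt" prints a backtrace for every thread. As a lighter-weight complement, the sketch below lists the state of each thread straight from /proc. It assumes Linux, and thread_states is a hypothetical helper name.

```shell
# Print "TID STATE NAME" for each thread of process $1.
# STATE is the one-letter code from /proc (R running, S sleeping,
# D uninterruptible wait, and so on).
thread_states() {
    for task in /proc/"$1"/task/*; do
        tid=${task##*/}
        state=$(awk '/^State:/ { print $2 }' "$task/status")
        name=$(cat "$task/comm")
        echo "$tid $state $name"
    done
}

# Example call (PID taken from this question):
#   thread_states 5602
```

Comparing this listing between a healthy run and a hung one might show which thread stopped making progress, which would narrow down where to look in the full gdb output.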