My program dies with an access violation on Windows (Windows 7, 32-bit). It is C code compiled with gcc 4.8.1 under MinGW, and it uses pthreads-w32 2.9.1. Several threads work concurrently with no other apparent issues. The program can run well for days or fail within a couple of hours. The code also compiles on several Linux architectures, where I am not having problems. It is also very difficult to run the program under a debugger.
The following is the function where the crash happens.
    static void *timeout_lector(void *indice)
    {
        int e;
        pthread_mutex_t fm;
        pthread_cond_t fc;
        struct timespec ts;
        struct timeval tv;
        struct equipo *eq;

        eq = &estacion.equipos[*((int *)indice)];
        if (pthread_mutex_init(&fm, NULL) || pthread_cond_init(&fc, NULL)) {
            LOG_PRINT("Error creating mutex or cond in timeout_lector.\n");
            exit(1);
        }
        pthread_mutex_lock(&fm);
        gettimeofday(&tv, NULL);
        ts.tv_sec = tv.tv_sec;
        ts.tv_nsec = tv.tv_usec * 1000;
        siguiente_tmseg(&ts, 150); /* Increments ts by 150 ms */
        pthread_cleanup_push(thread_cleanup_fc, (void *)(&fc));
        pthread_cleanup_push(thread_cleanup_fm, (void *)(&fm));
        if ((e = pthread_cond_timedwait(&fc, &fm, &ts)) != ETIMEDOUT) {
            LOG_PRINT("Error waiting in timeout_lector: %d.\n", e);
            exit(1);
        }
        pthread_mutex_lock(&eq->mutexto);
        eq->to = 1;
        pthread_mutex_unlock(&eq->mutexto);
        e = pthread_cond_wait(&fc, &fm); /* wait until we are cancelled */
        LOG_PRINT("Error in timeout_lector: %d.\n", e);
        exit(1);
        /* Next lines are never executed and just for correct syntax */
        pthread_cleanup_pop(1);
        pthread_cleanup_pop(1);
        return(NULL);
    }
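siguiente_tmseg is not shown above. As a rough illustration only of what "increments ts by 150 ms" involves, a helper of that kind could look like the sketch below. The name timespec_add_ms is hypothetical and this is not the actual siguiente_tmseg; it assumes a non-negative increment and an already normalized ts. The point it shows is that tv_nsec has to stay below one second, since pthread_cond_timedwait may reject an out-of-range value with EINVAL.

```c
#include <time.h>

/* Hypothetical helper, for illustration only: add `ms` milliseconds
 * (assumed >= 0) to *ts and keep tv_nsec in the range [0, 1e9). */
static void timespec_add_ms(struct timespec *ts, long ms)
{
    ts->tv_sec  += ms / 1000;
    ts->tv_nsec += (ms % 1000) * 1000000L;
    if (ts->tv_nsec >= 1000000000L) {
        /* Carry the overflow into the seconds field. */
        ts->tv_sec  += 1;
        ts->tv_nsec -= 1000000000L;
    }
}
```

With a starting value of 10 s and 900000000 ns, adding 150 ms carries into the seconds field and leaves 11 s and 50000000 ns.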
These are the cleanup functions:
    void thread_cleanup_fc(void *fc)
    {
        pthread_cond_destroy(fc);
    }

    void thread_cleanup_fm(void *fm)
    {
        pthread_mutex_unlock(fm);
        pthread_mutex_destroy(fm);
    }
I have been running it with Dr.MinGW 0.7.3 and here is the report:
estacion.exe caused an Access Violation at location 62489D38 in module pthreadGC2.dll Writing to location 0101FCFC.
Registers:
eax=00000000 ebx=0444febc ecx=ffffffff edx=0101fcfc esi=0444fee0 edi=0444fcf0
eip=62489d38 esp=0444fcd0 ebp=0444fd18 iopl=0 nv up ei pl zr na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
AddrPC Params
62489D38 0444FEE0 00000000 FFFFFFFF pthreadGC2.dll!pthread_cond_destroy
004194E0 0444FEE0 0444FEBC 0444FD68 estacion.exe!thread_cleanup_fc [C:/codigo/CeltaDAS/estacion/tiempo.c @ 223]
6248ABB5 00000684 FFFFFFFF 0444FE08 pthreadGC2.dll!ptw32_pop_cleanup.constprop.3
6248BAAF 00000000 00000000 03C63EE0 pthreadGC2.dll!sem_timedwait
6248512D 003E1138 0444FEB0 767DA53A pthreadGC2.dll!pthread_getspecific
62485D22 003E1138 00000078 00000000 pthreadGC2.dll!ptw32_push_cleanup
62485D22 0444FEE0 0444FEE4 0444FED0 pthreadGC2.dll!ptw32_push_cleanup
0040C0D1 014908EC 014AAD70 00000000 estacion.exe!timeout_lector [C:/codigo/CeltaDAS/estacion/equipos.c @ 214]
62485BD3 014B1190 0805AA4A 00000000 pthreadGC2.dll!ptw32_threadStart@4
767E1287 0444FF94 773DEE1C 03CF0048 msvcrt.dll!itow_s
767E1328 03CF0048 0444FFD4 777037EB msvcrt.dll!endthreadex
773DEE1C 03CF0048 319EB544 00000000 kernel32.dll!BaseThreadInitThunk
777037EB 767E12E5 03CF0048 00000000 ntdll.dll!RtlInitializeExceptionChain
777037BE 767E12E5 03CF0048 00000000 ntdll.dll!RtlInitializeExceptionChain
Windows 6.1.7601
DrMingw 0.7.3
The stack trace reported by Dr.MinGW passes through line 214 in timeout_lector, which corresponds to the pthread_cond_wait line in the code.
These threads are created continuously, and several run in parallel. Each one waits for some time, during which it can be cancelled. If the time passes, it sets a variable and then waits until it is cancelled, which another thread does almost immediately. There are not many cancellable functions that I can use for waiting and that are portable across different systems, so I chose pthread_cond_timedwait. But the problem seems to be in the second, indefinite wait.
I have searched for similar problems, but none of them appears to be the same as mine.
The most similar ones are:
pthread_cond_wait: random segmentation fault
Cancelling pthread_cond_wait() hangs with PRIO_INHERIT mutex
I would really appreciate it if someone could help me with this.
First destroy the mutex and the condition variable that the thread is waiting on, and only then cancel the thread. If you destroy the thread first, this kind of problem shows up on Windows. Have a look at this.
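A related way to make the order of thread termination and destruction explicit is to avoid pthread_cancel altogether, so that no cleanup handler ever calls pthread_cond_destroy while the thread is being torn down. The following is only a sketch under assumptions, not the code from the question and not the exact technique described above: the struct timeout_ctx and the functions timeout_start and timeout_stop are hypothetical names, and the mutex and condition variable are assumed to be owned by the creating thread rather than being locals of the timer thread. The owner sets a stop flag, signals, joins the thread, and only then destroys the objects.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical shared state, owned by the thread that creates the timer. */
struct timeout_ctx {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int  stop;       /* set by the owner to ask the timer thread to finish */
    int  timed_out;  /* set by the timer thread if the deadline passes first */
    long wait_ms;    /* timeout in milliseconds */
};

static void *timeout_thread(void *arg)
{
    struct timeout_ctx *ctx = arg;
    struct timeval tv;
    struct timespec ts;
    int e = 0;

    /* Absolute deadline = now + wait_ms, with tv_nsec kept below 1e9. */
    gettimeofday(&tv, NULL);
    ts.tv_sec  = tv.tv_sec + ctx->wait_ms / 1000;
    ts.tv_nsec = tv.tv_usec * 1000L + (ctx->wait_ms % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
        ts.tv_sec  += 1;
        ts.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&ctx->mutex);
    /* Wait for the stop request or the deadline, tolerating spurious wakeups. */
    while (!ctx->stop && e == 0)
        e = pthread_cond_timedwait(&ctx->cond, &ctx->mutex, &ts);
    if (!ctx->stop && e == ETIMEDOUT)
        ctx->timed_out = 1;
    /* After the timeout, keep waiting until the owner asks us to stop. */
    while (!ctx->stop)
        pthread_cond_wait(&ctx->cond, &ctx->mutex);
    pthread_mutex_unlock(&ctx->mutex);
    return NULL;
}

/* Initialise the context and start the timer thread; returns 0 on success. */
int timeout_start(struct timeout_ctx *ctx, pthread_t *tid, long wait_ms)
{
    ctx->stop = 0;
    ctx->timed_out = 0;
    ctx->wait_ms = wait_ms;
    if (pthread_mutex_init(&ctx->mutex, NULL))
        return -1;
    if (pthread_cond_init(&ctx->cond, NULL)) {
        pthread_mutex_destroy(&ctx->mutex);
        return -1;
    }
    if (pthread_create(tid, NULL, timeout_thread, ctx)) {
        pthread_cond_destroy(&ctx->cond);
        pthread_mutex_destroy(&ctx->mutex);
        return -1;
    }
    return 0;
}

/* Ask the thread to stop, join it, then destroy the objects.
 * Returns 1 if the timeout had expired before the stop request, else 0. */
int timeout_stop(struct timeout_ctx *ctx, pthread_t tid)
{
    int timed_out;

    pthread_mutex_lock(&ctx->mutex);
    ctx->stop = 1;
    pthread_cond_signal(&ctx->cond);
    pthread_mutex_unlock(&ctx->mutex);

    pthread_join(tid, NULL);          /* the thread no longer uses cond/mutex */
    timed_out = ctx->timed_out;
    pthread_cond_destroy(&ctx->cond); /* safe: no waiter can exist any more */
    pthread_mutex_destroy(&ctx->mutex);
    return timed_out;
}
```

Because the thread leaves through a normal return, nothing is destroyed while a wait is still in progress, and the owner can read timed_out after the join without extra locking. The cost is that the owner has to keep one timeout_ctx per running timer thread.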