Cancelling a thread in pthread_cond_wait leads to an access violation under MinGW

My program dies with an access violation on Windows (Windows 7, 32-bit). It is C code compiled with gcc 4.8.1 under MinGW, and it uses pthreads-w32 2.9.1. Several threads work concurrently with no other apparent issues. The program can run well for days or fail within a couple of hours. The same code also compiles on several Linux architectures, where I am not having problems. It is also very difficult to run the program under a debugger.

The following is the function where the crash happens.

static void *timeout_lector(void *indice)
{
    int e;
    pthread_mutex_t fm;
    pthread_cond_t fc;
    struct timespec ts;
    struct timeval tv;
    struct equipo *eq;

    eq = &estacion.equipos[*((int *)indice)];

    if (pthread_mutex_init(&fm, NULL) || pthread_cond_init(&fc, NULL)) {
        LOG_PRINT("Error creating mutex or cond in timeout_lector.\n");
        exit(1);
    }
    pthread_mutex_lock(&fm);
    gettimeofday(&tv, NULL);
    ts.tv_sec  = tv.tv_sec;
    ts.tv_nsec = tv.tv_usec * 1000;
    siguiente_tmseg(&ts, 150);  /* Increments ts by 150 ms */
    pthread_cleanup_push(thread_cleanup_fc, (void *)(&fc));
    pthread_cleanup_push(thread_cleanup_fm, (void *)(&fm));
    if ((e = pthread_cond_timedwait(&fc, &fm, &ts)) != ETIMEDOUT) {
        LOG_PRINT("Error waiting in timeout_lector: %d.\n", e);
        exit(1);
    }
    pthread_mutex_lock(&eq->mutexto);
    eq->to = 1;
    pthread_mutex_unlock(&eq->mutexto);
    e = pthread_cond_wait(&fc, &fm);  /* wait until we are cancelled */
    LOG_PRINT("Error in timeout_lector: %d.\n", e);
    exit(1);
    /* Next lines are never executed and just for correct syntax */
    pthread_cleanup_pop(1);
    pthread_cleanup_pop(1);
    return(NULL);
}
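
The helper siguiente_tmseg is not shown in the question; only its comment ("Increments ts by 150 ms") is known. As a rough sketch of what such a helper might look like, the hypothetical function timespec_add_ms below adds a millisecond offset to a timespec and carries any overflow of tv_nsec into tv_sec. It assumes a non-negative offset and an already normalised input, and it is not the author's implementation.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for siguiente_tmseg(): advance *ts by ms milliseconds.
 * Assumes ms >= 0 and 0 <= ts->tv_nsec < 1000000000 on entry. */
static void timespec_add_ms(struct timespec *ts, long ms)
{
    assert(ms >= 0);

    ts->tv_sec  += ms / 1000;               /* whole seconds */
    ts->tv_nsec += (ms % 1000) * 1000000L;  /* remaining milliseconds as ns */

    /* At most one carry is needed, because both addends are below 1 s. */
    if (ts->tv_nsec >= 1000000000L) {
        ts->tv_sec  += 1;
        ts->tv_nsec -= 1000000000L;
    }
}
```

The carry matters because pthread_cond_timedwait may reject a deadline whose tv_nsec is 1000000000 or more with EINVAL.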

These are the cleanup functions:

void thread_cleanup_fc(void *fc)
{
    pthread_cond_destroy(fc);
}

void thread_cleanup_fm(void *fm)
{
    pthread_mutex_unlock(fm);
    pthread_mutex_destroy(fm);
}

I have been running it with Dr.MinGW 0.7.3 and here is the report:

estacion.exe caused an Access Violation at location 62489D38 in module pthreadGC2.dll Writing to location 0101FCFC.

Registers:
eax=00000000 ebx=0444febc ecx=ffffffff edx=0101fcfc esi=0444fee0 edi=0444fcf0
eip=62489d38 esp=0444fcd0 ebp=0444fd18 iopl=0         nv up ei pl zr na po nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246

AddrPC   Params
62489D38 0444FEE0 00000000 FFFFFFFF  pthreadGC2.dll!pthread_cond_destroy
004194E0 0444FEE0 0444FEBC 0444FD68  estacion.exe!thread_cleanup_fc  [C:/codigo/CeltaDAS/estacion/tiempo.c @ 223]
6248ABB5 00000684 FFFFFFFF 0444FE08  pthreadGC2.dll!ptw32_pop_cleanup.constprop.3
6248BAAF 00000000 00000000 03C63EE0  pthreadGC2.dll!sem_timedwait
6248512D 003E1138 0444FEB0 767DA53A  pthreadGC2.dll!pthread_getspecific
62485D22 003E1138 00000078 00000000  pthreadGC2.dll!ptw32_push_cleanup
62485D22 0444FEE0 0444FEE4 0444FED0  pthreadGC2.dll!ptw32_push_cleanup
0040C0D1 014908EC 014AAD70 00000000  estacion.exe!timeout_lector  [C:/codigo/CeltaDAS/estacion/equipos.c @ 214]
62485BD3 014B1190 0805AA4A 00000000  pthreadGC2.dll!ptw32_threadStart@4
767E1287 0444FF94 773DEE1C 03CF0048  msvcrt.dll!itow_s
767E1328 03CF0048 0444FFD4 777037EB  msvcrt.dll!endthreadex
773DEE1C 03CF0048 319EB544 00000000  kernel32.dll!BaseThreadInitThunk
777037EB 767E12E5 03CF0048 00000000  ntdll.dll!RtlInitializeExceptionChain
777037BE 767E12E5 03CF0048 00000000  ntdll.dll!RtlInitializeExceptionChain

Windows 6.1.7601
DrMingw 0.7.3

The Dr.MinGW stack trace passes through line 214 in timeout_lector, which corresponds to the pthread_cond_wait line in the code.

These threads are created continuously, several of them in parallel. Each one waits for some time, during which it can be cancelled; if the time elapses, it sets a variable and then waits until it is cancelled, which another thread does almost immediately. There are not many cancellable functions that I can use for waiting and that are portable across different systems, so I chose pthread_cond_timedwait. But the problem seems to be in the second, indefinite wait.
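
For comparison, the same "timed wait, set a flag, then wait to be cancelled" pattern can be arranged so that the mutex and condition variable belong to the thread that creates and cancels the waiter, and are destroyed only after pthread_join returns. The sketch below assumes a POSIX system; struct waiter, waiter_thread and run_demo are hypothetical names, the 50 ms timeout is arbitrary, and nothing here has been shown to avoid the pthreads-w32 crash described above. Both waits are also looped, since POSIX allows condition waits to return spuriously.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical container: owned by the creating thread, not by the waiter. */
struct waiter {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    int timed_out;   /* set by the waiter once the timeout has elapsed */
    int cleaned_up;  /* set by the cleanup handler */
};

/* Runs on cancellation; the mutex is held again at this point. */
static void waiter_cleanup(void *arg)
{
    struct waiter *w = arg;
    w->cleaned_up = 1;
    pthread_mutex_unlock(&w->mutex);  /* unlock only, never destroy here */
}

static void *waiter_thread(void *arg)
{
    struct waiter *w = arg;
    struct timeval tv;
    struct timespec ts;
    int e;

    /* Absolute deadline 50 ms from now, normalised. */
    gettimeofday(&tv, NULL);
    ts.tv_sec  = tv.tv_sec;
    ts.tv_nsec = tv.tv_usec * 1000L + 50 * 1000000L;
    if (ts.tv_nsec >= 1000000000L) { ts.tv_sec += 1; ts.tv_nsec -= 1000000000L; }

    pthread_mutex_lock(&w->mutex);
    pthread_cleanup_push(waiter_cleanup, w);
    do {  /* a return of 0 is a wakeup before the deadline: wait again */
        e = pthread_cond_timedwait(&w->cond, &w->mutex, &ts);
    } while (e == 0);
    if (e == ETIMEDOUT)
        w->timed_out = 1;
    for (;;)  /* wait until cancelled, tolerating spurious wakeups */
        pthread_cond_wait(&w->cond, &w->mutex);
    pthread_cleanup_pop(1);  /* never reached; pairs with the push above */
    return NULL;
}

/* Returns 1 if the waiter timed out, was cancelled and ran its cleanup. */
static int run_demo(void)
{
    struct waiter w;
    struct timespec pause;
    pthread_t tid;
    void *result = NULL;
    int i, seen = 0;

    w.timed_out = 0;
    w.cleaned_up = 0;
    pause.tv_sec = 0;
    pause.tv_nsec = 10 * 1000000L;  /* 10 ms between polls */

    if (pthread_mutex_init(&w.mutex, NULL) != 0)
        return 0;
    if (pthread_cond_init(&w.cond, NULL) != 0) {
        pthread_mutex_destroy(&w.mutex);
        return 0;
    }
    if (pthread_create(&tid, NULL, waiter_thread, &w) != 0) {
        pthread_cond_destroy(&w.cond);
        pthread_mutex_destroy(&w.mutex);
        return 0;
    }

    /* Poll the flag under the mutex, for at most about two seconds. */
    for (i = 0; i < 200 && !seen; i++) {
        nanosleep(&pause, NULL);
        pthread_mutex_lock(&w.mutex);
        seen = w.timed_out;
        pthread_mutex_unlock(&w.mutex);
    }

    pthread_cancel(tid);
    pthread_join(tid, &result);

    /* The waiter no longer exists, so nothing can still be using these. */
    pthread_cond_destroy(&w.cond);
    pthread_mutex_destroy(&w.mutex);

    return seen && w.cleaned_up && result == PTHREAD_CANCELED;
}
```

The design choice here is that the cleanup handler only unlocks the mutex, while the objects are destroyed by their owner after the join, so no destroy call can overlap with a wait that is still being unwound.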

I have searched for similar problems, but none appears to be the same. The closest ones are:
pthread_cond_wait: random segmentation fault
Cancelling pthread_cond_wait() hangs with PRIO_INHERIT mutex

I would really appreciate it if someone could help me with this.

There is 1 answer

Mohan

First destroy the mutex and the condition variable on which the thread is waiting, and then cancel the thread. If you destroy the thread first, this kind of problem occurs on Windows. Have a look at this.