Finding time slice for threads using gettimeofday

Question

Finding time slice for threads using gettimeofday

451 views Asked by SA.93 At 14 September 2024 at 03:24

I'm trying to find the time slice of a program, first I assigned to each of the threads an ID to help identify them in their f function. In the f function, I used the timeval struct to check for the start and finish times, subtract them from each other and multiply by 1000 to convert to ms.
The result I expected (for example on 2 threads) was an infinite loop that would print that time for each thread. The result I got was an infinite loop, only going through the second thread and the time is always 0 or 0.001ms (why?).
I'd appreciate hints on how I should think of the solution \ what I've done wrong?
The code:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <time.h> //For gettimeofday()

int i; //Global variable to act as an ID for the threads
void *f(void *p);

void main(int argc, char *argv[]){
    if(argc < 2){
        perror("Illegal input");
        exit(1);
    }

    int num;
    num = atoi(argv[1]);
    int index[num];

    pthread_t arr[num];

    for(i=0; i<num; i++){
        index[i] = i+1;
        pthread_create (&arr[i], NULL, f, (void *) &index[i]);
    }

    for(i=0; i<num; i++) //Wait for all threads to finish
        pthread_join(arr[i], NULL);
}

void *f(void *p){
    int pid = *((int *)p); //Thread ID
    struct timeval start, finish;
    double elapsedTime;

    while(1){
        gettimeofday(&start, NULL);
        gettimeofday(&finish, NULL);

        elapsedTime = (finish.tv_sec - start.tv_sec) * 1000.0; //sec to ms
        elapsedTime += (finish.tv_usec - start.tv_usec) / 1000.0;   // us to ms
        fprintf(stdout, "Time slice for thread %d = %lf ms.\n", pid, elapsedTime);
    }
}

The output (infinite loop):

Time slice for thread 2 = 0.000000 ms.
Time slice for thread 2 = 0.000000 ms.
Time slice for thread 2 = 0.000000 ms.
Time slice for thread 2 = 0.000000 ms.
Time slice for thread 2 = 0.000000 ms.
Time slice for thread 2 = 0.000000 ms.
Time slice for thread 2 = 0.000000 ms.
Time slice for thread 2 = 0.001000 ms.
Time slice for thread 2 = 0.001000 ms.
Time slice for thread 2 = 0.001000 ms.
Time slice for thread 2 = 0.001000 ms.
Time slice for thread 2 = 0.000000 ms.
Time slice for thread 2 = 0.001000 ms.
Time slice for thread 2 = 0.001000 ms.
Time slice for thread 2 = 0.001000 ms.
Time slice for thread 2 = 0.001000 ms.
Time slice for thread 2 = 0.001000 ms.
Time slice for thread 2 = 0.001000 ms.

Original Q&A

There are 3 answers

**KamilCuk** · Answer 1 · 2020-12-04 17:03:18

Try:

int main() {
    // here we start
    struct timeval start;
    gettimeofday(&start, NULL);

    while(1){
        // here we are right now
        struct timeval now;
        gettimeofday(&now, NULL);

        double elapsedTime = (now.tv_sec - start.tv_sec) * 1000.0; //sec to ms
        elapsedTime += (now.tv_usec - start.tv_usec) / 1000.0;   // us to ms
        fprintf(stdout, "Time passed since thread started %d = %lf ms.\n", pid, elapsedTime);
    }
}

Note that calculating difference between two calls of gettimeofday makes little sense - the clock is not monotonic. In anything serious use clock_gettime(CLOCK_MONOTONIC for a monotonic clock.

**Brendan** · Answer 2 · 2020-12-05 03:44:44

I'm trying to find the time slice of a program, first I assigned to each of the threads an ID to help identify them in their f function. In the f function, I used the timeval struct to check for the start and finish times, subtract them from each other and multiply by 1000 to convert to ms.

This (the general idea) cannot work. Elapsed time can include:

IRQs that are not task switches
time slices that were cut short by the task blocking and not using the whole time slice (including blocking while waiting to acquire a mutex on stdout in the lower levels of a fprintf()).
any number of tasks all having time slices (e.g. elapsed time might be the sum of 20 different time slices)
any combination of the above (e.g. elapsed time might be the sum of 5 IRQs that interrupted the thread's time slice, plus 20 different partial time slices of different lengths).
"migration" to a different CPU (e.g. ending a time slice on one CPU then almost immediately starting a new time slice on a different CPU).

For these reasons; any value you get for elapsed time can never be assumed to accurately reflect the time slice length; and (due to "high risk of accidentally repeating timing patterns") any "frequently repeated" value you get for elapsed time can never be assumed to accurately reflect the time slice length.

In addition; most operating systems don't have one fixed length time slice in the first place. For example, Modern Linux uses a "auto-diddling of a thread's priority that influences that thread's time slice length" combined with a tickless scheduler.

**Glärbo** · Answer 3 · 2020-12-04 21:44:06

A better measurement procedure would be to use clock_gettime(CLOCK_MONOTONIC, &now) or clock_gettime(CLOCK_BOOTTIME, &now) in a loop. Whenever the difference between consecutive iterations is larger than some limit (say, 0.025ms or 25µs), you start a new period. The period ends in the measurement before the next such bigger difference.

Here is an example program that lets you name the clock to be used, the number of threads, and the number of seconds to measure:

#define  _POSIX_C_SOURCE  200809L
#include <stdlib.h>
#include <inttypes.h>
#include <pthread.h>
#include <limits.h>
#include <string.h>
#include <strings.h>
#include <stdio.h>
#include <time.h>
#include <errno.h>

/* Maximum interval between iterations */
#ifndef  NS_PER_ITERATION_MAX
#define  NS_PER_ITERATION_MAX  25000    /* 25 us, or 1/40,000ths of a second */
#endif

/* Number of slots per second to reserve per thread. */
#ifndef  SLOTS_PER_SEC
#define  SLOTS_PER_SEC  1000
#endif

static inline int64_t difftimespec_ns(const struct timespec after, const struct timespec before)
{
    return (int64_t)(after.tv_sec - before.tv_sec) * INT64_C(1000000000) + (int64_t)(after.tv_nsec - before.tv_nsec);
}

struct measurements {
    struct measurements *next;
    pthread_t            thread;
    clockid_t            clk;
    int64_t              ns_runtime;
    size_t               ns_max;
    size_t               ns_num;
    uint32_t             ns[];
};

void *measure_slices(void *payload) {
    struct measurements *const m = payload;
    const clockid_t            clk = m->clk;
    const int64_t              ns_runtime = m->ns_runtime;
    const size_t               ns_max = m->ns_max;
    size_t                     ns_num = 0;
    uint32_t *const            ns_array = m->ns;

    struct timespec            started, mark, prev, curr;
    int64_t                    ns;

    if (clock_gettime(clk, &started) == -1)
        return (void *)(intptr_t)errno;
    if (clock_gettime(clk, &mark) == -1)
        return (void *)(intptr_t)errno;
    if (clock_gettime(clk, &prev) == -1)
        return (void *)(intptr_t)errno;

    while (1) {

        if (clock_gettime(clk, &curr) == -1)
            return (void *)(intptr_t)errno;

        /* Ran for long enough? */
        if (difftimespec_ns(curr, started) >= ns_runtime)
            break;

        /* Interval between this and previous iteration */
        ns = difftimespec_ns(curr, prev);
        if (ns <= NS_PER_ITERATION_MAX) {
            /* Still within the same timeslice */
            prev = curr;
            continue;
        }

        /* We started a new timeslice. */
        ns = difftimespec_ns(prev, mark);
        if (ns > NS_PER_ITERATION_MAX && ns < INT64_C(4294967296)) {
            /* Skip the very first measurement. */
            if (ns_num > 0)
                ns_array[ns_num - 1] = ns;
            if (++ns_num >= ns_max)
                 break;
        }

        mark = prev = curr;
    }

    m->ns_num = ns_num - 1;
    return 0;
}

static int parse_clock(const char *src, clockid_t *dst)
{
    if (!src || !*src || !dst)
        return -1;

    /* Skip optional CLOCK_ prefix. */
    if (!strncasecmp(src, "CLOCK_", 6))
        src += 6;

    if (!strcasecmp(src, "REALTIME")) {
        *dst = CLOCK_REALTIME;
        return 0;
    } else
    if (!strcasecmp(src, "REALTIME_COARSE")) {
        *dst = CLOCK_REALTIME_COARSE;
        return 0;
    } else
    if (!strcasecmp(src, "MONOTONIC")) {
        *dst = CLOCK_MONOTONIC;
        return 0;
    } else
    if (!strcasecmp(src, "MONOTONIC_COARSE")) {
        *dst = CLOCK_MONOTONIC_COARSE;
        return 0;
    } else
    if (!strcasecmp(src, "MONOTONIC_RAW")) {
        *dst = CLOCK_MONOTONIC_RAW;
        return 0;
    } else
    if (!strcasecmp(src, "BOOTTIME")) {
        *dst = CLOCK_BOOTTIME;
        return 0;
    }

    /* We deliberately omit CLOCK_PROCESS_CPUTIME_ID and CLOCK_THREAD_CPUTIME_ID,
       because they reflect CPU time used, and our algorithm needs a wall clock. */
    return -1;
}

static int parse_double(const char *src, double *dst)
{
    const char *end;
    double      val;

    if (!src || !*src || !dst)
        return -1;

    end = src;
    errno = 0;
    val = strtod(src, (char **)(&end));
    if (errno || *end)
        return -1;

    *dst = val;
    return 0;
}

static int parse_size(const char *src, size_t *dst)
{
    const char    *end;
    unsigned long  val;

    if (!src || !*src || !dst)
        return -1;

    end = src;
    errno = 0;
    val = strtoul(src, (char **)(&end), 0);
    if (errno || *end)
        return -1;
    if ((unsigned long)(size_t)(val) != val)
        return -1;

    *dst = val;
    return 0;
}

static int compare_uint32_t(const void *ptr1, const void *ptr2)
{
    const uint32_t val1 = *(const uint32_t *)ptr1;
    const uint32_t val2 = *(const uint32_t *)ptr2;
    return (val1 < val2) ? -1 :
           (val1 > val2) ? +1 : 0;
}

int main(int argc, char *argv[])
{
    int64_t              ns;
    double               seconds;
    size_t               threads, n, i;
    clockid_t            clk;
    struct measurements *list, *curr;
    uint32_t            *allns;
    pthread_attr_t       attrs;
    int                  err;

    if (argc != 4 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
        const char *argv0 = (argc > 0 && argv && argv[0] && argv[0][0]) ? argv[0] : "(this)";
        fprintf(stderr, "\n");
        fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv0);
        fprintf(stderr, "       %s CLOCK THREADS SECONDS\n", argv0);
        fprintf(stderr, "\n");
        fprintf(stderr, "This program measures the named clock,\n");
        fprintf(stderr, "       REALTIME\n");
        fprintf(stderr, "       MONOTONIC\n");
        fprintf(stderr, "       BOOTTIME\n");
        fprintf(stderr, "in a tight loop, in order to determine how long such\n");
        fprintf(stderr, "a tight loop can run uninterrupted (technically, with\n");
        fprintf(stderr, "at most %.1f microseconds (%.3f ms) between iterations)\n",
                         NS_PER_ITERATION_MAX / 1000.0, NS_PER_ITERATION_MAX / 1000000.0);
        fprintf(stderr, "collecting statistics of such durations.\n");
        fprintf(stderr, "Output is the measured durations in nanoseconds.\n");
        fprintf(stderr, "\n");
        return (argc < 3) ? EXIT_SUCCESS : EXIT_FAILURE;
    }

    if (parse_clock(argv[1], &clk)) {
        fprintf(stderr, "%s: Invalid clock name.\n", argv[1]);
        return EXIT_FAILURE;
    }

    if (parse_size(argv[2], &threads) || threads < 1) {
        fprintf(stderr, "%s: Invalid number of threads.\n", argv[2]);
        return EXIT_FAILURE;
    }

    if (parse_double(argv[3], &seconds) || seconds <= 0.010) {
        fprintf(stderr, "%s: Invalid number of seconds to run.\n", argv[3]);
        return EXIT_FAILURE;
    }
    ns = (int64_t)(1000000000.0 * seconds);
    if (ns <= 0) {
        fprintf(stderr, "%s: Invalid number of seconds to run.\n", argv[3]);
        return EXIT_FAILURE;
    }

    /* Number of duration slots per second per thread. */
    n = (size_t)(SLOTS_PER_SEC * seconds);

    /* Allocate a measurement structure per thread. */
    list = NULL;
    for (i = 0; i < threads; i++) {
        curr = malloc(n * sizeof (uint32_t) + sizeof (struct measurements));
        if (!curr) {
            fprintf(stderr, "Out of memory.\n");
            return EXIT_FAILURE;
        }

        /* Initialize. */
        curr->clk = clk;
        curr->ns_runtime = ns;
        curr->ns_max = n;
        curr->ns_num = 0;

        /* Prepend to list of measurement structures. */
        curr->next = list;
        list = curr;
    }

    /* Measurement threads don't need much stack. */
    pthread_attr_init(&attrs);
    pthread_attr_setstacksize(&attrs, 2 * PTHREAD_STACK_MIN);
    for (curr = list; curr != NULL; curr = curr->next) {
        err = pthread_create(&(curr->thread), &attrs, measure_slices, curr);
        if (err) {
            fprintf(stderr, "Cannot create measurement threads: %s.\n", strerror(err));
            exit(EXIT_FAILURE); /* This kills also the already running threads. */
        }
    }
    pthread_attr_destroy(&attrs);

    fprintf(stderr, "Measuring ... ");
    n = 0;
    for (curr = list; curr != NULL; curr = curr->next) {
        void *retval = (void *)0;
        err = pthread_join(curr->thread, &retval);
        if (err) {
            fprintf(stderr, "Lost a measurement thread: %s.\n", strerror(err));
            exit(EXIT_FAILURE); /* Also kills still running threads. */
        }
        if (retval != (void *)0) {
            fprintf(stderr, "Measurement thread failed: %s.\n", strerror((int)(intptr_t)retval));
            exit(EXIT_FAILURE);
        }
        n += curr->ns_num;
    }
    fprintf(stderr, "Done; obtained %zu durations total.\n", n);

    /* Collate results. */
    allns = malloc(n * sizeof allns[0]);
    if (!allns) {
        fprintf(stderr, "Cannot collate results: %s.\n", strerror(ENOMEM));
        return EXIT_FAILURE;
    }
    i = 0;
    while (list) {
        curr = list;
        list = list->next;
        curr->next = NULL;

        if (i + curr->ns_num > n) {
            fprintf(stderr, "Internal bug when collating data: buffer overrun.\n");
            return EXIT_FAILURE;
        }
        memcpy(allns + i, curr->ns, curr->ns_num * sizeof (uint32_t));
        i += curr->ns_num;

        free(curr);
    }
    if (i != n) {
        fprintf(stderr, "Internal bug when collating data: buffer underrun.\n");
        return EXIT_FAILURE;
    }
    qsort(allns, n, sizeof (uint32_t), compare_uint32_t);

    /* Output in increasing order. */
    for (i = 0; i < n; i++)
        printf("%lu\n", (unsigned long)allns[i]);

    return EXIT_SUCCESS;
}

If you save it as e.g. example.c, you can compile it to ex using gcc -Wall -Wextra -O2 example.c -pthread -o ex . Run ./ex --help to see the usage.

Here is an example output (durations between larger clock differences, in nanoseconds, in increasing order) when running ./ex BOOTTIME 4 1 on an i5-7200U with two cores with two threads each, so four threads of execution:

Note that these are in nanoseconds, or 1/1,000,000,000ths of a second; and 1 ms = 1,000,000 ns. As you can see, there is no particularly typical "time slice" length; they vary from very short (perhaps cache effects, perhaps a kernel hardware interrupt) to 0.39 seconds.

https://pastebin.com/cvLfrWxw has the output of ./ex BOOTTIME 8 10 for a larger dataset. With 8 threads, the 60 Hz refresh rate no longer comes up (probably because the scheduler then also switches to a different task, as there are much more tasks running than there are hardware threads).

As you can see, there is no typical duration (that a thread can run without being interrupted for longer than 25µs).

This is as I expected, since the current Linux task scheduler (CFS Scheduler) has no time slices in the traditional sense.

TechQA.

Finding time slice for threads using gettimeofday

There are 3 answers

Related Questions in C

Related Questions in LINUX

Related Questions in PTHREADS

Related Questions in GETTIMEOFDAY

Popular Questions

Popular Tags

Trending Questions