Why is functional-styled code so much faster than imperative code in C?

110 views Asked by At

I've been working on an algorithm for calculating maximum depth of an expression (i.e. how many nested parentheses there are) in various languages just for fun/practice.

I noticed there's a huge performance discrepancy in the performance of the functional styled C code and the imperative styled C code and was wondering why that is.

Given the string "(1+(2*3)+((8)/4))+1" the imperative code finishes consistently in about 10-13us but the functional code takes 2-3us, more than twice as fast. Both algorithms are compiled with -O2 and gcc, so I found this extremely surprising, but I don't know enough about the compiler's implementation to understand why.

So can anyone tell me why the functional code is so significantly faster?

Functional code (note the _ERR stuff are just #define's with integers):

const int max_depth_functional(
        const char *expr, const int sum, const int max) {
    switch(*expr) {
        case '\0':
            return sum == 0 ? max : UNTERM_PARENTH_ERR;
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
        case '+': case '-': case '*': case '/': case '^':
            return max_depth_functional(expr + 1, sum, max);
        case '(':
            return max_depth_functional(
                expr + 1, sum + 1, sum + 1 > max ? sum + 1 : max
            );
        case ')':
            return max_depth_functional(expr + 1, sum - 1, max);
        default:
            return INVALID_EXPR_ERR;
    }
}

Imperative code:

const int max_depth_imperative(const char *expr) {
    int curr_sum = 0, curr_max = 0;
    while(*expr != '\0') {
        switch(*expr++) {
            case '0': case '1': case '2': case '3': case '4':
            case '5': case '6': case '7': case '8': case '9':
            case '+': case '-': case '*': case '/': case '^':
                break;
            case '(':
                curr_sum++;
                curr_max = curr_sum > curr_max ? curr_sum : curr_max;
                break;
            case ')':
                curr_sum--;
                break;
            default:
                return INVALID_EXPR_ERR;
        }
    }
    return curr_sum == 0 ? curr_max : UNTERM_PARENTH_ERR;
}

Both are called like:

const clock_t start = clock();
const int func_result = max_depth_func(args[1]);
const clock_t end = clock();

Also, I'm using Linux x86_64 to build and run

1

There are 1 answers

1
Dylan Turner On

As per comments, I ran the code using:

double imperative_time_sum = 0, functional_sum_time = 0;
for(int i = 0; i < 100000; i++) {
    const clock_t start_imp = clock();
    max_depth(args[1]);
    const clock_t end_imp = clock();
    max_depth_functional_fast(args[1], 0, 0);
    const clock_t end_func = clock();

    imperative_time_sum +=
        1000 * (double) (end_imp - start_imp) / CLOCKS_PER_SEC;
    functional_sum_time +=
        1000 * (double) (end_func - end_imp) / CLOCKS_PER_SEC;
}
printf("Average imperative: %fms\n", imperative_time_sum / 100000);
printf("Average functional: %fms\n", functional_sum_time / 100000);

Which produced results:

Average imperative: 0.002412ms
Average functional: 0.002421ms

Although I reran the program upwards of 100 times before, I hadn't run anywhere close to 100000 times. After that, the times were very close to eachother.