I've been working on an algorithm for calculating maximum depth of an expression (i.e. how many nested parentheses there are) in various languages just for fun/practice.
I noticed there's a huge performance discrepancy in the performance of the functional styled C code and the imperative styled C code and was wondering why that is.
Given the string "(1+(2*3)+((8)/4))+1" the imperative code finishes consistently in about 10-13us but the functional code takes 2-3us, more than twice as fast. Both algorithms are compiled with -O2
and gcc, so I found this extremely surprising, but I don't know enough about the compiler's implementation to understand why.
So can anyone tell me why the functional code is so significantly faster?
Functional code (note the _ERR stuff are just #define's with integers):
const int max_depth_functional(
const char *expr, const int sum, const int max) {
switch(*expr) {
case '\0':
return sum == 0 ? max : UNTERM_PARENTH_ERR;
case '0': case '1': case '2': case '3': case '4':
case '5': case '6': case '7': case '8': case '9':
case '+': case '-': case '*': case '/': case '^':
return max_depth_functional(expr + 1, sum, max);
case '(':
return max_depth_functional(
expr + 1, sum + 1, sum + 1 > max ? sum + 1 : max
);
case ')':
return max_depth_functional(expr + 1, sum - 1, max);
default:
return INVALID_EXPR_ERR;
}
}
Imperative code:
const int max_depth_imperative(const char *expr) {
int curr_sum = 0, curr_max = 0;
while(*expr != '\0') {
switch(*expr++) {
case '0': case '1': case '2': case '3': case '4':
case '5': case '6': case '7': case '8': case '9':
case '+': case '-': case '*': case '/': case '^':
break;
case '(':
curr_sum++;
curr_max = curr_sum > curr_max ? curr_sum : curr_max;
break;
case ')':
curr_sum--;
break;
default:
return INVALID_EXPR_ERR;
}
}
return curr_sum == 0 ? curr_max : UNTERM_PARENTH_ERR;
}
Both are called like:
const clock_t start = clock();
const int func_result = max_depth_func(args[1]);
const clock_t end = clock();
Also, I'm using Linux x86_64 to build and run
As per comments, I ran the code using:
Which produced results:
Although I reran the program upwards of 100 times before, I hadn't run anywhere close to 100000 times. After that, the times were very close to eachother.