I have a program that requires me to find all primes up to 10**10 - 1, that is, every prime below 10,000,000,000. I wrote a Sieve of Eratosthenes to do this, and it worked very well (and accurately) as high as 10**9 (1,000,000,000). I confirmed its accuracy by having it count the number of primes it found, and it matched the value of 50,847,534 on the chart found here. I used unsigned int as the storage type and it successfully found all the primes in approximately 30 seconds.

However, 10**10 requires that I use a larger storage type: long long int. Once I switched to this, the program ran significantly slower (it's been 3+ hours and it's still working). Here is the relevant code:
typedef unsigned long long ul_long;
typedef unsigned int u_int;
ul_long max = 10000000000;
u_int blocks = 1250000000;
char memField[1250000000];
char mapBit(char place) { //convert 0->0x80, 1->0x40, 2->0x20, and so on
    return 0x80 >> (place);
}

for (u_int i = 2; i*i < max; i++) {
    if (memField[i / 8] & activeBit) { //Use correct memory block
        for (ul_long n = 2 * i; n < max; n += i) {
            char secondaryBit = mapBit(n % 8); //Determine bit position of n
            u_int activeByte = n / 8; //Determine correct memory block
            if (n < 8) { //Manual override memory block and bit for first block
                secondaryBit = mapBit(n);
                activeByte = 0;
            }
            memField[activeByte] &= ~(secondaryBit); //Set the flag to false
        }
    }
    activeBit = activeBit >> 1; //Check the next bit
    if (activeBit == 0x00) activeBit = 0x80;
}
I figure that since 10**10 is 10x larger than 10**9, it should take roughly 10 times as long. Where is the flaw in that reasoning? Why did changing to long long cause such a significant performance hit, and how can I fix it? I recognize that the numbers get larger, so it should be somewhat slower, but only towards the end. Is there something I'm missing?

Note: I realize long int should technically be large enough, but my limits.h says it isn't, even though I'm compiling 64-bit. That's why I use long long int, in case anyone was wondering. Also, keep in mind that I have no computer science training; I'm just a hobbyist.
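For reference, this is the kind of quick check that shows what the compiler actually provides. On 64-bit Windows (the LLP64 model) long stays 32 bits even though pointers are 64-bit, which is presumably what limits.h is reporting here; on Linux or macOS (LP64) long would be 64 bits.

#include <stdio.h>
#include <limits.h>

int main(void) {
    //On LLP64 systems (64-bit Windows) LONG_MAX is only 2147483647,
    //so long cannot hold 10**10; long long is always at least 64 bits.
    printf("sizeof(long)      = %zu, LONG_MAX  = %ld\n", sizeof(long), LONG_MAX);
    printf("sizeof(long long) = %zu, LLONG_MAX = %lld\n", sizeof(long long), LLONG_MAX);
    return 0;
}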
edit: I just ran it in "Release" as x86-64 with some of the debug statements suggested, and from the output it looks like I hit the u_int bound. I don't know why i is getting that large.
Your program has an infinite loop in for (u_int i = 2; i*i < max; i++). i is an unsigned int, so i*i wraps around at 32 bits and is therefore always less than max (10^10 is larger than anything a 32-bit value can hold), which means the condition never becomes false. Make i an ul_long.
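To see the overflow concretely, here is a small standalone check. The value 100000 is just an example (it is the first i whose square reaches 10^10); the same wraparound happens for any i above 65535.

#include <stdio.h>

typedef unsigned long long ul_long;

int main(void) {
    ul_long max = 10000000000ULL;    //10^10
    unsigned int i = 100000;         //100000 * 100000 = 10^10, which needs 34 bits

    //i*i is evaluated in 32-bit unsigned arithmetic, so it wraps modulo 2^32
    printf("i*i as u_int   = %u\n", i * i);                        //prints 1410065408
    printf("i*i < max ?      %s\n", (i * i < max) ? "yes" : "no"); //always yes -> loop never ends

    //Widening i (or casting before the multiply) gives the real value
    printf("i*i as ul_long = %llu\n", (ul_long)i * i);             //prints 10000000000
    return 0;
}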
Note also that you could use the simpler bit pattern 1 << (n % 8), which maps bits 0 through 7 to 1 up to 0x80, instead of the reversed 0x80 >> mapping.
A complete version along those lines completes in 10 seconds for 10^9 and 131 seconds for 10^10.
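A minimal sketch of such a sieve is shown below. It assumes a heap-allocated bit array (1.25 GB is too large for the stack), the 1 << (n % 8) bit layout, an inner loop that starts at i*i, and a simple counting pass at the end; those details are assumptions of this sketch and may differ from the exact code behind the timings above.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned long long ul_long;

int main(void) {
    const ul_long max = 10000000000ULL;    //sieve limit: 10^10
    const ul_long bytes = max / 8;         //one bit per number -> 1.25 GB

    //Take the bit field from the heap; a 1.25 GB automatic array would overflow the stack
    unsigned char *field = malloc(bytes);
    if (!field) {
        fprintf(stderr, "out of memory\n");
        return 1;
    }
    memset(field, 0xFF, bytes);            //start with every number marked as prime
    field[0] &= ~1u;                       //0 is not prime (bit 0 of byte 0)
    field[0] &= ~2u;                       //1 is not prime (bit 1 of byte 0)

    //i is ul_long, so i*i cannot wrap before reaching max
    for (ul_long i = 2; i * i < max; i++) {
        if (field[i / 8] & (1 << (i % 8))) {              //i is still marked prime
            for (ul_long n = i * i; n < max; n += i)      //multiples below i*i were already cleared
                field[n / 8] &= (unsigned char)~(1 << (n % 8));
        }
    }

    //Count the survivors to check against known values (e.g. 50,847,534 below 10^9)
    ul_long count = 0;
    for (ul_long n = 2; n < max; n++)
        if (field[n / 8] & (1 << (n % 8)))
            count++;

    printf("primes below %llu: %llu\n", max, count);
    free(field);
    return 0;
}

Starting the inner loop at i*i rather than 2*i skips the multiples that smaller primes have already crossed off, which saves a noticeable amount of work at this scale.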