I'm trying to write code that is done over a lot of values, and my code is doing a ton of extra work. I'm not even quite sure mathematically how to describe the values I want to skip, because as i
gets larger I want to skip a larger and larger percentage of i
s.
it will be easiest to explain with psudocode:
f(n)
while n > 0
if 32 < n%250 < 129 do work, else discard
n = floor( n/250 )
g(n)
while n > 0
if n%3 == 0 do work, else discard
n = floor( n/288 )
main()
for i in range
print f(i) and g(i) if neither function hits discard & both same total # of loops
I want a range
that wastes as little computation as possible. the optimizations I've been able to think of so far are using (0..n).step_by(3)
to at least make the first loop of g(n)
will always be good, and running+checking f(n)
first so you don't need to run g(n)
if it hits discard. what I'm looking for is a way to define a range similarly to how .step_by(3)
only tries values that work for the first loop of g(n)
for all the loops in the while
s. optimally, I would like to not need any if statements in the while
loops at all and instead have a range that only tries successful values.
basically what I'm asking is if there's a way to try fewer values so that the big O of the code is something to do with to 91 instead of 250 and 288 (or if you can incorporate some logic from the final if and bring it even lower than 91 that'd be awesome). I'm getting correct outputs, but it is taking a very long time. I am trying to run this on as large a range as possible; end goal would be to run it up to values on the order of 250^40==8.27e95==2^319.7, so determining which values are skippable before needing to run the functions would save a large amount of computation.
to help me make sense of describing the values I want, I have drawn a grid for valid values only considering the g(n) case and with smaller numbers [n%2
and floor(n/6)
]. the highlighted full blue columns are the numbers that would go through in this case and the highlighted white rows are the points where the values are discarded.
here's the same thing but with [n%3
and floor(n/6)
]
here's the f(n) function but with 1<n%6<4
if you want to look at the full code or recommend any other optimizations you can see that below as well:
use arrayvec::ArrayVec;
fn main() {
'outer: for n in (0..18446744073709551615u64).step_by(3) { //2^64-1; make smaller if you want the code to finish
let mut base250: ArrayVec<u8, 8> = ArrayVec::new(); //using ArrayVec makes reallocation slowdowns go away
let mut tmp:u8;
let mut m = n;
while m > 0 { //f(n) of the psudocode; inlined here
tmp = (m%250).try_into().unwrap(); //type goes from u64(n) into u8(tmp)
if tmp-33 < 96 { //takes advantage of uint rollover to just do 1 comparison
base250.push(tmp-1);
} else { continue 'outer; } //condition broke; discard this run and try the next value
m /= 250;
}
m = n;
let mut base288: ArrayVec<u8, 8> = ArrayVec::new();
while m > 0 { //g(n) of the psudocode; inlined here
if m%3==0 {
base288.push((m/3%96+32).try_into().unwrap()); //type goes from u64(n) into u8(t)
} else { continue 'outer; } //condition broke; discard this run and try the next value
m /= 288;
}
base250.reverse();
if base250.iter().zip(base288.iter()).filter(|&(a,b)| a==b).count() > 3 { //checks how many values are matching
let long:String = base250.iter().map(|&c|{if c == 127u8 {'¶'} // convert the vector to a string
else if c == 32u8 {'█'} // with readable newlines and spaces
else {c.try_into().unwrap()} }).collect();
let short:String = base288.iter().map(|&c|{if c == 127u8 {'¶'}
else if c == 32u8 {'█'}
else {c.try_into().unwrap()} }).collect();
println!("{long} {short} {n}");
}
}
}
a commenter suggested I try saving the vector one value at a time to get rid of the while loops, but since the while loops are much smaller than the for loop I think it is more worthwhile to try to do fewer big loops. I tried implementing it, but it is much slower on all ranges my computer can run; here it is anyways if you have any suggestions that would speed it up. mathematically, this code is identical to the previous code; ie it gives the exact same output. I've seen places saying that the .clone()
call is expensive but I don't know enough about *
and &
to get it to compile without a .clone()
fn main() {
let mut base250known: HashMap<u64, ArrayVec<u8>> = HashMap::from([(0u64,ArrayVec::new())]);
let mut base288known: HashMap<u64, ArrayVec<u8>> = HashMap::from([(0u64,ArrayVec::new())]);
for n in 1..18446744073709551615u64 {
let mut check: bool = true;
if n%250-33 < 96 {
match base250known.get(&(n/250)) {
Some(old) => {
let mut new = old.clone();
new.push((n%250-1).try_into().unwrap());
base250known.insert(n,new);
},
None => { check = false; }
};
} else { check = false; }
if n%3==0 {
match base288known.get(&(n/288)) {
Some(old) => {
let mut new = old.clone();
new.push((n/3%96+32).try_into().unwrap());
base288known.insert(n,new);
},
None => { check = false; }
};
} else { check = false; }
if check { if base250known[&n].iter().zip(base288known[&n].iter().rev()).filter(|&(a,b)| a==b).count() >= 0 {
let long:String = base250known[&n].iter().map(|&c|{if c == 127u8 {'¶'}
else if c == 32u8 {'█'}
else {c.try_into().unwrap()} }).collect();
let short:String = base288known[&n].iter().rev().map(|&c|{if c == 127u8 {'¶'}
else if c == 32u8 {'█'}
else {c.try_into().unwrap()} }).collect();
println!("{long} {short} {n}"); } } } }
this is a self-answer, and a bad one, but I technically got the code to do what I said I wanted. essentially, this makes the first check the range, so each time the while would have looped in the original code I just make a new for loop. after creating the range, the second check becomes a filter which I broke out into a second function living at the end of the code. as long as you create enough loops to catch the whole range it's the same code.
the code is stupidly repetitive and very long, so I feel like it should be able to be rewritten to be one loop of the form
and so on but I do not know how to phrase that in rust. I still welcome any more comments about speeding things up, especially for this check here:
because I suspect that this is a significant amount of the work that the computer is doing. I'm just trying to check for similarity between the two vecs, if you know a faster way to check for 'close but not necessarily exact', even if it is a different implementation of that concept from this line, I'd be very grateful.