Code efficiency and accuracy

I'm trying to solve a problem on Codewars, and the unit tests provided make absolutely no sense to me.

The problem is as follows, and it sounds simple enough that something should be working in five minutes:

Consider a sequence u where u is defined as follows:

The number u(0) = 1 is the first one in u.
For each x in u, then y = 2 * x + 1 and z = 3 * x + 1 must be in u too.
There are no other numbers in u.
Ex: u = [1, 3, 4, 7, 9, 10, 13, 15, 19, 21, 22, 27, ...]

1 gives 3 and 4, then 3 gives 7 and 10, 4 gives 9 and 13, then 7 gives 15 and 22 and so on...

Task:

Given parameter n the function dbl_linear (or dblLinear...) returns the element u(n) of the ordered (with <) sequence u.

Example:

dbl_linear(10) should return 22

At first I used a SortedSet with a LINQ query, as I didn't really care about efficiency. I quickly learned that this operation has to handle ranges where n can reach ~100000 in under 12 seconds.
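For what it's worth, a brute-force SortedSet version along those lines (my reconstruction — this is not the original code, which wasn't kept) can be made correct, just slow:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BruteForce {
    public static int DblLinear(int n) {
        // SortedSet keeps its elements in ascending order and deduplicates.
        var u = new SortedSet<int> { 1 };
        for (int i = 0; i <= n; i++) {
            // ElementAt(i) is the i-th smallest value generated so far.
            // Every child added below is larger than its parent, so the
            // first i + 1 positions never change after step i.
            int x = u.ElementAt(i);
            u.Add(2 * x + 1);
            u.Add(3 * x + 1);
        }
        return u.ElementAt(n);
    }
}
```

This passes the examples (DblLinear(10) is 22) but is quadratic — ElementAt re-walks the set on every call — which is exactly why this style of solution blows the 12-second budget near n = 100000.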

So this abomination was born, then butchered time and time again, since a for loop would generate issues for some reason. It was then "upgraded" to a while loop, which passed slightly more unit tests (4 -> 8).

using System.Collections.Generic;

public class DoubleLinear {
    public static int DblLinear(int n) {
        ListSet<int> table = new ListSet<int> {1};
        for (int i = 0; i < n; i++) {
            table.Put(Y(table[i]));
            table.Put(Z(table[i]));
        }
        table.Sort();
        return table[n];
    }

    private static int Y(int y) {
        return 2 * y + 1;
    }

    private static int Z(int z) {
        return 3 * z + 1;
    }

}

public class ListSet<T> : List<T> {
    public void Put(T item) {
        if (!this.Contains(item))
            this.Add(item);
    }

}

With this code it still fails on calculations in excess of n = 75000, but it passes up to 8 tests.

I've checked whether other people have passed this, and they have; however, I cannot see what they wrote in order to learn from it.

Can anyone provide insight into what could be wrong here? I'm sure the answer is blatantly obvious and I'm being dumb.

Also, is using a custom list in this way a bad idea? Is there a better way?

1 Answer

Answered by patrickjp93

ListSet is slow to sort, and you constantly incur memory reallocation as you build the set. I would start by allocating the table at its full size; though honestly, a bare-bones array of the size you need is best for performance.

If you know you need n = 75,000+, allocate a ListSet (or an ARRAY!) of that size. If the unit tests start taking you into the stratosphere, there is a binary segmentation technique we can discuss, but that's a bit involved and logically tougher to build.
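Concretely, List&lt;T&gt; accepts an initial capacity in its constructor, so the repeated reallocation is a one-line fix; a minimal sketch (the 2n + 1 bound is just "n expansion steps, at most two new values each, plus the initial 1"):

```csharp
using System.Collections.Generic;

public static class Prealloc {
    public static List<int> NewTable(int n) {
        // One allocation up front instead of repeated doubling reallocations
        // as the list grows toward its final size.
        return new List<int>(2 * n + 1) { 1 };
    }
}
```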

I don't see anything logically wrong with the code. The numbers it generates are correct from where I'm standing.

EDIT: Since u is just the two streams y = 2*x + 1 and z = 3*x + 1 (for x running over u itself) merged in ascending order, you only ever have to maintain the prefix of u generated so far plus a handful of values:

Target index in u
The values of u generated so far
Read index into u for the y stream
Read index into u for the z stream
Current candidate val_y
Current candidate val_z

// requires: using System; using System.Collections.Generic;
public static int DblLinear(int target) {
    var u = new List<int>(target + 1) { 1 };
    int ind_y = 0;                     // read position into u for the y stream
    int ind_z = 0;                     // read position into u for the z stream

    while (u.Count <= target) {
        int val_y = 2 * u[ind_y] + 1;  // next candidate from the y stream
        int val_z = 3 * u[ind_z] + 1;  // next candidate from the z stream
        int next = Math.Min(val_y, val_z);
        u.Add(next);
        if (val_y == next) ind_y++;    // advance whichever stream produced it;
        if (val_z == next) ind_z++;    // advancing both on a tie skips duplicates
    }

    return u[target];                  // e.g. DblLinear(10) == 22
}

Note that the candidates must be computed from u[ind_y] and u[ind_z], not from the indices themselves, and that ties (the same value arriving from both streams, e.g. 31 = 2*15 + 1 = 3*10 + 1) must advance both read indices, or the sequence picks up duplicates.

You could turn the val_y branch into a while loop (a tighter critical path) if you either widen the branch to two conditions or add a backstep for when you blow past your target index.

Minimizing memory allocation will definitely speed your calculations up, even if people want to (incorrectly) bellyache about branch prediction in such an easily predictable case.

Also, did you turn optimization on in your Visual Studio project? If you're submitting a binary and not a code file, then that can also shave quite a bit of time.