== vs equals vs XOR benchmark

654 views Asked by At

I was doing a benchmark to find out which of ==,equals and ^ (XOR) could be potentially faster while comparing Objects. for this purpose I wrote this program, and got the following results:

First Iteration:

equals  took : 345
==  took : 625
xor  took : 284  

Second Iteration:

equals  took : 346
==  took : 182
xor  took : 254  

total after 100 iterations:

average equals:  305
average ==:  164
average XOR:  246

I have couple of questions here:

  1. using equals method is faster than using hashCode() for the first time but using hashcode becomes faster after the second iteration. so is it safe to say if I'm using the same dataset over time to do some calculation its faster to use hashCode() rather than equals()?
  2. after checking String.hashCode() implementation I know why hashCode() becomes faster for the second time, but why XOR gets faster? I can't see any logic for that.

here is the program just for refrence:

public static void main(String... args) throws Exception{

    String[] set1=new String[10000000];
    String[] set2=new String[10000000];
    Random r=new Random();
    for(int i=0;i<10000000;i++){
        int next=r.nextInt(1000);
        set1[i]=next+"";
        set2[i]=next+"";
    }


    for(int q=0;q<2;q++){  // change 2 to another number for more iterations
        long start0=System.currentTimeMillis();
        int equals0=0;
        for(int i=0;i<1;i++){
            for(int j=set2.length-1;j>=0;j--){
                if(set1[i].equals(set2[j])){
                    equals0++;
                }
            }
        }
        System.out.println("equals  took : " + (System.currentTimeMillis()-start0));


        long start1=System.currentTimeMillis();
        int equals1=0;
        for(int i=0;i<1;i++){
            for(int j=set2.length-1;j>=0;j--){
                if(set1[i].hashCode()==set2[j].hashCode()){
                    equals1++;
                }
            }
        }
        System.out.println("==  took : " + (System.currentTimeMillis()-start1));



        int equals2=0;
        long start2=System.currentTimeMillis();
        for(int i=0;i<1;i++){
            for(int j=set2.length-1;j>=0;j--){
                if((set1[i].hashCode() ^ set2[j].hashCode())==0){
                    equals2++;
                }
            }
        }
        System.out.println("xor  took : " + (System.currentTimeMillis()-start2));

    }

EDIT: After running the program running for 100 iterations, I realized only the last iteration would speed up using XOR . Still and I'm not quite sure what's so special about last iteration?

2

There are 2 answers

0
Durandal On

Since the only way to compare objects for equality is to use equals() the question is moot, the only alternative that may be applicable depending on circumstances and use-case is identity comparison (==).

But identity comparison and equality are not the same concept, so you need to carefully decide which one is the right tool for the use case.

0
Westranger On

Let's have a look into the implementations of the two mehtods

hashCode()

When looking at the implementation of the Hashcode function we can see that is is dependant on the String length, which influences the time needed for computing the hash code. This explains why in your case it is sometimes faster. Anyway this method is not suited to compare very long String of different lengths, because the has value is limited to 32bit. It can happen that two differne (very long) Strings have the same has value. The hashCode() method also only does pure math to compute the hashvalue.

public int hashCode() {
    int h = hash;
    if (h == 0) {
        int off = offset;
        char val[] = value;
        int len = count;

        for (int i = 0; i < len; i++) {
            h = 31*h + val[off++];
        }
        hash = h;
    }
    return h;
}

equals()

When looking at the implementation of equals, we can also see that this one is length dependant. But unlike the hasCode() implementation does this method quit the loop at the moment when the Strings are not equal. This could make the comparison faster in some occasions. The hashcode method is always computed for the whole String. The equals method has an if statement in it's while loop, which makes the computation slower than the hashCode() loop.

public boolean equals(Object anObject) {
        if (this == anObject) {
            return true;
        }
        if (anObject instanceof String) {
            String anotherString = (String)anObject;
            int n = count;
            if (n == anotherString.count) {
                char v1[] = value;
                char v2[] = anotherString.value;
                int i = offset;
                int j = anotherString.offset;
                while (n-- != 0) {
                    if (v1[i++] != v2[j++])
                        return false;
                }
                return true;
            }
        }
        return false;
    }

XOR

The only explanation why the XOR part is slower that your == part is that the logic-expressions differ. The == expression is A == B where the XOR one is (A XOR B) == 0. But I'm not totally sure with this one because the timing difference is almost 50%. In the end you XOR comparision is useless,because it also calls the hashCode() methods like you == implemenation does, in which the most time is spend.

you should stick to the equals method which ist the only reliable approach here.