In an interview I was asked to write a method that checks whether two words are equal, ignoring character case.
I answered by comparing the difference between the ASCII values of each pair of characters. But at home, when I looked at the actual implementation in String.class, I was puzzled: why is it implemented that way?
I tried to compare the built-in method with my custom method like this:
    public class EqualsIgnoreCase {
        public static void main(String[] args) {
            String str1 = "Srimant @$ Sahu 959s";
            String str2 = "sriMaNt @$ sAhu 959s";
            System.out.println("Avg millisecs with inbuilt () - " + averageOfTenForInbuilt(str1, str2));
            System.out.println("\nAvg millisecs with custom () - " + averageOfTenForCustom(str1, str2));
        }

        public static int averageOfTenForInbuilt(String str1, String str2) {
            int avg = 0;
            for (int itr = 0; itr < 10; itr++) {
                long start1 = System.currentTimeMillis();
                for (int i = 0; i < 100000; i++) {
                    str1.equalsIgnoreCase(str2);
                }
                avg += System.currentTimeMillis() - start1;
            }
            return avg / 10;
        }

        public static int averageOfTenForCustom(String str1, String str2) {
            int avg = 0;
            for (int itr = 0; itr < 10; itr++) {
                long start2 = System.currentTimeMillis();
                for (int i = 0; i < 100000; i++) {
                    isEqualsIgnoreCase(str1, str2);
                }
                avg += System.currentTimeMillis() - start2;
            }
            return avg / 10;
        }

        public static boolean isEqualsIgnoreCase(String str1, String str2) {
            int length = str1.length();
            if (str2.length() != length) {
                return false;
            }
            for (int i = 0; i < length; i++) {
                char ch1 = str1.charAt(i);
                char ch2 = str2.charAt(i);
                int val = Math.abs(ch1 - ch2);
                if (val != 0) {
                    if (isInAlphabetsRange(ch1, ch2)) {
                        if (val != 32) {
                            return false;
                        }
                    } else {
                        return false;
                    }
                }
            }
            return true;
        }

        public static boolean isInAlphabetsRange(char ch1, char ch2) {
            return (((ch1 <= 122 && ch1 >= 97) || (ch1 <= 90 && ch1 >= 65)) && ((ch2 <= 122 && ch2 >= 97) || (ch2 <= 90 && ch2 >= 65)));
        }
    }
Output:
Avg millisecs with inbuilt () - 14
Avg millisecs with custom () - 5
It looks like the built-in method is slower because of its many checks and method calls. Is there a specific reason behind this implementation, or am I missing something in my logic?
Any suggestions would be greatly appreciated!
Your routine only handles ASCII characters. The system one handles all Unicode characters.
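To make that concrete, here is a minimal sketch. The class name `UnicodeCaseDemo`, the helper names, and the test strings are my own illustrative choices, and `asciiOnlyEqualsIgnoreCase` is a condensed version of the question's `isEqualsIgnoreCase`, not the original code. The accented letters are written as Unicode escapes so the source file's encoding does not matter.

```java
public class UnicodeCaseDemo {

    // Condensed from the question's approach: only A-Z and a-z are
    // treated as letters, and case is matched by a difference of 32.
    public static boolean asciiOnlyEqualsIgnoreCase(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        for (int i = 0; i < a.length(); i++) {
            char c1 = a.charAt(i);
            char c2 = b.charAt(i);
            if (c1 == c2) {
                continue; // identical characters always match
            }
            boolean bothAsciiLetters = isAsciiLetter(c1) && isAsciiLetter(c2);
            if (!bothAsciiLetters || Math.abs(c1 - c2) != 32) {
                return false;
            }
        }
        return true;
    }

    public static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static void main(String[] args) {
        // U+00C9 is capital E with acute, U+00E9 is the lowercase form.
        String upper = "\u00C9COLE";
        String lower = "\u00E9cole";

        System.out.println(upper.equalsIgnoreCase(lower));               // true
        System.out.println(asciiOnlyEqualsIgnoreCase(upper, lower));     // false
        System.out.println(asciiOnlyEqualsIgnoreCase("HELLO", "hello")); // true
    }
}
```

The two accented characters even differ by exactly 32, but they fall outside the A-Z / a-z ranges, so the ASCII-only check rejects them while `String.equalsIgnoreCase` accepts them.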
Consider the following example: