Simple solution for Isomorphic string check which just coulnd't pass 1 test case

748 views Asked by At

The following is my solution for Isomorphic string problem given in leetcode:

public bool IsIsomorphic(string s, string t)
    {
        int[] s1 = new int[s.Length];
        int[] t1 = new int[t.Length];
        bool isI = true;
        if (s1.Length > 0 && t1.Length > 0)
        {
            s1[0] = 0;
            int i = 1;
            int j = 1;
            for (i = 1; i < s.Length; i++)
            {
                if (s[i] == s[i - 1])
                {
                    s1[i] = 1;
                }
                else
                {
                    s1[i] = 0;
                }
            }
            for (j = 1; j < t.Length; j++)
            {
                if (t[j] == t[j - 1])
                {
                    t1[j] = 1;
                }
                else
                {
                    t1[j] = 0;
                }
            }
            for (int k = 0; k < s1.Length; k++)
            {
                if(s1[k] != t1[k])
                {
                    isI = false;
                }
            }
        }
        else
        {
            isI = true;
        }
        return isI;
    }

which passed 29/30 test cases but didn't pass the following test case which is ridiculously long. Shared the code with the input in google drive:

https://docs.google.com/document/d/1UkG8Rc6VItiihwvqzJdM3uMHIX-BCsslJ_lVklkxvq8/edit?usp=sharing

Any help would be great.

1

There are 1 answers

12
Orel Eraki On

Ok, because you haven't included how your algorithm works, and let us do all the heavy lifting, then i took it on myself to understand it, examine it and also suggest another more pleasant alternative.

Assumption you probably did that led you to errors.

You code algorithm + a guess on what you've tried.

You iterating on every string, and only checks if duplicate appears in the same index but on opposite sides, if they are you're marking them, as 1 otherwise 0. Same goes to the other string. When you're finish iterating on both strings you're checking those values are equal.

This is a false assumption.

False Assumption #1: Laying only on 2 characters
Only laying on two numbers 0, 1 to represent the repetition on a string is impossible and thus incorrect.
What happens when we have couple repeating characters ? let's say Banana on the current algorithm, you can not represent both B, a and n with only 2 characters.
Insight: Now we understand that for every unique character we need new number.

False Assumption #2: Marking repeating characters
Let's work on the assumption that for every new character we allocate a digit.

  1. How could we identify new characters ? That means we need to scan for the old character to see if it already existed, which means we can not do on ONE fast sweep on our string, that means we need to scan for every current I(index) character for 0-(I-1) to see if that word already existed. Which Leads us to another problem, We have Digits and not actual characters ? What we do now ? Create another array to hold for every unique number the character ? that means another array (Complexity and Memory).

  2. Let say we have 6 long string, you only thinking you can scan repeating characters by comparing the values by their indices 0,5 | 1,4 | 2, 3 this is not enough to establish they are repeating, because you're working on the assumption you know what order(index) the repeating characters in one string suppose to be, which we can not foresee. e.g: banana => b != a, a != n, n != a That means this word doesn't have repeating characters ? This is obviously wrong.

As you're seeing we can smell something funny going on, lot of hassle for such a small part.

It all comes to Design if we would have invest more time on how to attack this problem in the more efficient way, and the tools that we have for our disposal(We can make tools by our self [Array over Dictionary]) we would have found out all those problems.

Always Invest in Design, it is worth investing and you will not be sorry.

Solution #1:
Why using digits ? let's use the actual characters. In this case another solution alternative is a better choice.

public bool IsIsomorphic(string s, string t)
{
    if (s.Length != t.Length)
    {
        return false;
    }

    bool result = true;
    var relation = new Dictionary<char, char>();
    for (int i = 0; i < s.Length; i++)
    {
        char sChar = s[i];
        char tChar = t[i];
        char c;
        if (relation.TryGetValue(sChar, out c))
        {
            if (c != tChar)
            {
                result = false;
                break;
            }
        }
        else if (relation.ContainsValue(tChar))
        {
            result = false;
            break;
        }
        else
        {
            relation.Add(sChar, tChar);
        }
    }
    return result;
}

Solution #2:
More memory, less readability and without using external tools (the check at the end is simple array value comparing)

public bool IsIsomorphic3(string s, string t)
{
    if (s.Length != t.Length)
    {
        return false;
    }

    int[] firstNumbering = new int[26];
    int[] firstStringOrder = new int[s.Length];
    int firstCounter = 0;
    for (int i = 0; i < s.Length; i++)
    {
        char c = s[i];
        if (firstNumbering[c - 'a'] == 0)
        {
            firstNumbering[c - 'a'] = ++firstCounter;
        }
        firstStringOrder[i] = firstNumbering[c - 'a'];
    }

    int[] secondNumbering = new int[26];
    int[] secondStringOrder = new int[t.Length];
    int secondCounter = 0;
    for (int j = 0; j < t.Length; j++)
    {
        char c = t[j];
        if (secondNumbering[c - 'a'] == 0)
        {
            secondNumbering[c - 'a'] = ++secondCounter;
        }
        secondStringOrder[j] = secondNumbering[c - 'a'];
    }

    return firstStringOrder.SequenceEqual(secondStringOrder); // Comparing values using Linq
}