Transform a short word into its original words


I used a word-counting algorithm, and on closer inspection I noticed that it reported fewer words than the text actually contains, because it counts a contraction such as "it's" as one word. I tried to find a solution without success, so I am asking: does anything exist that transforms a "short word" like "it's" into its "base words", in this case "it is"?


There are 2 answers

4
user1438038 On

Well, basically you need to provide a data structure that maps abbreviated terms to their corresponding long versions. However, this is not as simple as it sounds. For example, you would not want to transform "The client's car." into "The client is car."

To handle such cases, you will probably need a heuristic with a deeper understanding of the language you are processing and its grammar rules.
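
As a rough illustration of the mapping idea, here is a minimal sketch. The class name ContractionExpander, the expand method, and the four table entries are all made up for this example, and a real table would need far more entries. It only expands forms it finds in the table, so a possessive like "client's" is left alone, but it does not solve genuinely ambiguous cases ("it's" can also mean "it has") and it does not preserve capitalization.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContractionExpander {

    // Hypothetical, deliberately tiny lookup table (lowercase keys).
    private static final Map<String, String> EXPANSIONS = new LinkedHashMap<>();
    static {
        EXPANSIONS.put("it's", "it is");
        EXPANSIONS.put("don't", "do not");
        EXPANSIONS.put("i'm", "I am");
        EXPANSIONS.put("they're", "they are");
    }

    // Matches letters, one apostrophe, letters: e.g. "it's" or "client's".
    private static final Pattern CONTRACTION = Pattern.compile("[A-Za-z]+'[A-Za-z]+");

    public static String expand(String text) {
        Matcher m = CONTRACTION.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String expansion = EXPANSIONS.get(m.group().toLowerCase());
            // Forms not in the table (such as the possessive "client's") stay unchanged.
            String replacement = (expansion != null) ? expansion : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("it's the client's car and they don't mind."));
    }
}
```

Because only known contractions are replaced, the word count after expansion grows only for entries you have explicitly listed; anything else still needs the grammar-aware heuristic mentioned above.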

2
Drew Kennedy On

I just built this from scratch for the challenge. It seems to be working on my end. Let me know how it works for you.

public static void main(String[] args) {

    String s = "it's such a lovely day! it's really amazing!";

    System.out.println(convertText(s));
    //output: it is such a lovely day! it is really amazing!

}

public static String convertText(String text) {
    // Start from the full text so input without contractions is returned unchanged.
    String replaced = text;
    // String.split takes a regex String in Java, not a char.
    String[] words = text.split(" ");

    for (String word : words) {
        if (word.contains("'s")) {
            // Expand "'s" to " is", keeping any trailing punctuation on the word.
            String noContraction = word.replace("'s", " is");
            // Replace in the accumulated result so earlier replacements are kept.
            replaced = replaced.replace(word, noContraction);
        }
    }
    return replaced;
}

I did this in C# and tried to convert it into Java. If you see any syntax errors, please point them out.