Count occurrences and remove duplicates from a String

1.5k views Asked by At

So I am building a huffman tree and I am in need of taking a String as input, and then create 2 arrays containing each letter and the number of occurrences of that letter in the original string, like this:

String s = "mississippi"

Should result in:

char[] charArr = {'m','i', 's', 'p'};
int[] count = {1,4,4,2};

There are alot of question regarding this and alot of examples on how to solve this, especially here on stackoverflow but the only one that I managed to get working was this:

private void findOccurences(String s) {
        List<Character> original = new ArrayList<Character>(s.length());
        List<Character> duplicateRemoved;

        for (int i = 0; i < s.length(); i++) {
            original.add(s.charAt(i));
        }
        duplicateRemoved = new ArrayList<Character>(original);

        // Remove duplicates from second list.
        Set<Character> hs = new HashSet<Character>();
        hs.addAll(duplicateRemoved);
        duplicateRemoved.clear();
        duplicateRemoved.addAll(hs);

        charFreqs = new int[duplicateRemoved.size()];
        charArr = new char[duplicateRemoved.size()];

        for (int i = 0; i < charArr.length; i++) {
            char c = duplicateRemoved.get(i);
            int count = Collections.frequency(original, c);
            charArr[i] = c;
            charFreqs[i] = count;
        }
    }

But it feels very cumberstone and it also scrambles the order of the letters in the array. If I use this my resulting array is as follows:

char[] charArr = {'p','s', 'i', 'm'};

Are there any better way to do what I want?

2

There are 2 answers

0
navnath On
String s = "Navnath Jadhav";
Function<Map<Character, Long>, Map<Character, Long>> filterDuplicates = (m) -> m.entrySet()
        .stream()
        .filter(e -> e.getValue() == 1)
        .collect(toMap(Map.Entry::getKey, Map.Entry::getValue, (e1, e2) -> e1, LinkedHashMap::new));

final Map<Character, Long> collect = s.chars().mapToObj(c -> (char) c).collect(
        collectingAndThen(
                groupingBy(c -> c, LinkedHashMap::new, counting()),
                filterDuplicates
        )
);

Using Java 8 :

  1. chars() method of String returns us surrogate code point of each character in String

  2. Using mapToObject map those code point to character c -> (char) c

  3. using Collectors.groupingBy and Collectors.counting() we will calculate occurrences of each character in string

  4. After counting occurrences we want to remove duplicate characters.collectingAndThen help us to do some computation on the final result, It accepts first parameter as Collector so we will provide groupingBy and second parameter as a Function which accepts result generated by groupingBy and returns new result

    Function<Map<Character, Long>, Map<Character, Long>> This function will drop characters which are duplicate and only allow character which has occurred only once filter(e -> e.getValue() == 1)

*If you don't want to preserve order of character then you can remove LinkedHashMap, So it will be like toMap(Map.Entry::getKey, Map.Entry::getValue) and groupingBy(c -> c, counting())

Demo

0
A Sdi On

I will do it this way

String s = "mississippi";
List<String> original = Arrays.stream(s.split(""))
                     .collect(Collectors.toList());
List<String> duplicateRemoved = Arrays.stream(s.split(""))
                      .distinct()
                      .collect(Collectors.toList());
ArrayList<Integer> Occurrences = new ArrayList<>();
int counter = 1;

for (String aList : duplicateRemoved) {
    counter = (int) original.stream().filter(s1 -> s1.equals(aList)).count();
            Occurrences.add(counter);
        }
System.out.println(duplicateRemoved);
System.out.println(Occurrences);

and the output

enter image description here