Reference: LeetCode
Difficulty: Medium

My Post: Java Solution Summary of HashMap + Sorting / Buckets (why not using TreeSet?)

Problem

Given a string, sort it in decreasing order based on the frequency of characters.

Example:

1
2
3
4
5
6
7
8
9
10
Input:
"tree"
Output:
"eert"

Input:
"cccaaa"

Output:
"cccaaa"

Analysis

HashMap + Sorting

  • Build the map to count frequency.
  • Build an array list to sort elements by frequency.
  • Build the string.

Note: Actually you can also use PriorityQueue to sort.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
public String frequencySort(String s) {
if (s == null) {
return null;
}

Map<Character, Integer> map = new HashMap<>();

for (int i = 0; i < s.length(); ++i) {
Character c = s.charAt(i);
map.put(c, map.getOrDefault(c, 0) + 1);
}

List<Character> arrayList = new ArrayList<>(map.keySet());
Collections.sort(arrayList, (n1, n2) -> (map.get(n2) - map.get(n1)));

// build the string
StringBuilder sb = new StringBuilder();
for (Character c : arrayList) {
int count = map.get(c);
while (count > 0) {
sb.append(c);
count -= 1;
}
}

return sb.toString();
}

Time: $O(N\log{M})$ where $M$ is the number of distinct characters.
Space: $O(N)$

  • $O(M)$ the array and the map (with greater overhead).
  • $O(N)$ the output string.

Using TreeSet here is not a good idea!

The code will not give correct results. In the tree example, r will be missing. It is because equals and compareTo methods should be consistent. Since we have a comparator (n1, n2) - > (map.get(n2)- map.get(n1)), when t and r have the same count, they are treated as the “same” characters in the set. Thus, t won’t be added into the set.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
public String frequencySort(String s) {
if (s == null) {
return null;
}

Map<Character, Integer> map = new HashMap<>();
Comparator<Character> comp = (n1, n2) -> (map.get(n2) - map.get(n1));
Set<Character> set = new TreeSet<>(comp);

for (int i = 0; i < s.length(); ++i) {
Character c = s.charAt(i);
map.put(c, map.getOrDefault(c, 0) + 1); // before set
set.add(c);
}

// build the string
StringBuilder sb = new StringBuilder();
for (Character c : set) {
int count = map.get(c);
while (count > 0) {
sb.append(c);
count -= 1;
}
}

return sb.toString();
}

Bucket Sort

We can use n buckets to store characters with specific frequencies. It is a typical way to use much space to gain running time improvement.

Note:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
public String frequencySort(String s) {
int n = s.length();
Map<Character, Integer> map = new HashMap<>();

// Count frequency
for (int i = 0; i < s.length(); ++i) {
Character c = s.charAt(i);
map.put(c, map.getOrDefault(c, 0) + 1);
}

List<Character>[] buckets = new ArrayList[n + 1]; // or 256

// Build buckets
for (char c: map.keySet()) {
int count = map.get(c);
if (buckets[count] == null) {
buckets[count] = new ArrayList<>();
}
buckets[count].add(c);
}

// Build output string
StringBuilder sb = new StringBuilder();

for (int i = n; i >= 0; --i) {
if (buckets[i] != null) {
continue;
}
for (char c : buckets[i]) {
int count = i;
while (count > 0) {
sb.append(c);
count -= 1;
}
}
}
return sb.toString();
}

Time: $O(N)$
Space: $O(N)$

  • $O(M)$ the buckets and the map (with greater overhead).
  • $O(N)$ the output string.

Comment