[Neetcode/Array, Hash Table] Group Anagrams
https://neetcode.io/problems/anagram-groups
NeetCode
neetcode.io
class Solution:
def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
res = {}
for word in strs:
count = [0] * 26
for w in word:
count[ord(w) - ord('a')] += 1
key = tuple(count)
# print(key)
if key in res:
res[key].append(word)
else:
res[key] = [word]
return list(res.values())
🔐 Why tuple(count)?
- We use tuple(count) because Python dictionaries require immutable (hashable) keys, and a list is not hashable, while a tuple is.
- A list cannot be used as a dictionary key because it's mutable (can be changed).
- A tuple is an immutable version of a list.
- When you do tuple(count), you convert the list to a format that can be used as a dictionary key.
- Because two anagrams will produce the same letter counts, so their tuple(count) will be identical.
Example:
count = [1, 0, 0, 0, 1, 0, ..., 1] # can't use this directly
key = tuple(count) # now it's hashable and usable
res[key].append(word) # use it as key to group anagrams
Tuples
A tuple is a data type used to store multiple items in a single variable. In most cases tuples are used to group together 2-3 items. Tuples that store 3 items are sometimes referred to as a 'triple'. Tuples are commonly used to store pairs of data together or return multiple values inside a function.
🧠 What ord() does:
ord() returns the ASCII (Unicode) code of a character.
For example:
ord('a') = 97
ord('b') = 98
...
ord('z') = 122
✅ Why ord(w) - ord('a')?
It converts the character into an index from 0 to 25:
ord('c') - ord('a') = 99 - 97 = 2
ord('t') - ord('a') = 116 - 97 = 19
So we can use this number as an index in our list of size 26.
ord('a') = 97
ord('b') = 98
ord('c') = 99
...
ord('z') = 122
If we subtract ord('a') = 97:
Character | ord(c) | ord(c) - ord('a') |
'a' | 97 | 0 |
'b' | 98 | 1 |
'c' | 99 | 2 |
'z' | 122 | 25 |
Perfect for indexing into [0] * 26.
🎯 Example:
Suppose w = "eat".
Initialize count = [0] * 26 (for each letter of alphabet).
Then for each character in "eat":
- 'e' → index = ord('e') - ord('a') = 4 → count[4] += 1
- 'a' → index = 0 → count[0] += 1
- 't' → index = 19 → count[19] += 1
Final count looks like:
[1, 0, 0, 0, 1, 0, ..., 1, ..., 0]
→ 'a', 'e', and 't' appear once, others zero.