Saturday, April 20, 2013

Anagrams


Given a pair of words, check if they are anagrams or not.

Two words are anagrams if the second word can be got by a rearrangement of letters in the first word. For example, the words 'orchestra' and 'carthorse' are anagrams. The letters and their frequencies appearing in both the words are the same.

One way to check for anagrams is to compute the letters and their frequencies in both the words. If they match, the two words are anagrams. For the above example, the frequency counts are:

O
R
C
H
E
S
T
A
1
2
1
1
1
1
1
1


C
A
R
T
H
O
S
E
1
1
2
1
1
1
1
1

This approach requires one pass over the words to compute and store the frequency counts in a data structure like a map. These map contents are then compared. This requires additional space proportional to O(n), where n is the number of unique characters in the words – this will be capped at 256 (8 bits = 1 byte) for ASCII strings. If the strings are significantly longer for some reason, then this makes sense.

A simpler approach is to sort the two words and compare the result. The two words are anagrams if their results match. For our example, the sorted outcome for both the words is 'acehorrst'.

Following is the C++ code to check for anagrams:

#include <cstdio>
#include <iostream>
#include <cstring>
#include <algorithm>

bool IsAnagrams(string s, string t) {
if(s.length() != t.length()) { //no need to compare further
  return false;
}
  sort(s.begin(),s.end());
  sort(t.begin(),t.end());
  if(s == t) {
    return true;
  }
  return false;
}


The sort function performs an in-place sort, which can be done in O(nlogn). Hence, the overall algorithm takes O(nlogn) time where n is the number of letters in the given words and does not require extra space.

No comments:

Post a Comment