Sunday, April 28, 2013

Rotate an array in-place


Given an array, rotate it to the right by 'k' positions. For example, given an input array A = {1,2,3,4,5,6} and k = 4, the rotated array is {5,6,1,2,3,4}.

A straight-forward method is to have a new array R of the same size as A. Starting from the index k, copy all the elements from A to R. Then copy the elements between the indexes 0 and k-1 in A to R. This approach uses O(n) additional space.

To rotate an array in-place, we need to perform a sequence of in-place reversal operations.
Given A = {1,2,3,4,5,6} and k = 4, perform the following reversals:

Step 1: Reverse the elements between indexes 0 and k-1 in-place

4
3
2
1
5
6

Step 2: Reverse the elements between indexes k and n-1 in-place, where n is the size of A

4
3
2
1
6
5

Step 3: Reverse the entire array (between 0 and n-1) in-place

5
6
1
2
3
4

The outcome of the step 3 is the rotated array.

The time complexity of this method is O(n) with no additional space required. Here is the C++ code:

int a[] = {1,2,3,4,5,6};
int size = 6;

void Rotate(int index) {
  Reverse(0,index-1);
  Reverse(index,size-1);
  Reverse(0,size-1);
  Print();
}

void Reverse(int begin, int end) {
if(begin < 0 || end >= size) {
  return;
}
for(int i = begin, j = end; i <= j; ++i, --j) {
  int tmp;
  tmp = a[i];
  a[i] = a[j];
  a[j] = tmp;
}
}

void Print() {
for(int i = 0; i < size; ++i) {
  cout << a[i] << " ";
}
}

This method does 2n swaps in all. There is a different divide-and-conquer approach which can be used to get an algorithm that does the rotate in n swaps. I will try to discuss that in a future post.

Maximum sub-array


Given an array, find the contiguous sequence which has the largest sum. The array can contain positive or negative numbers or a combination of positive and negative numbers.

A simple approach is to construct a table representing the sum of all the subproblems. The subproblem with the largest sum is the solution to this problem.

Consider an 'input' array of five integers - {-2, 4, 5, -9, -8}. Construct a table 'Sum' as shown below. Start by initializing the diagonal to the input elements. We are only interested in the half of the matrix above the diagonal. Let us start filling up the rest of the entries.

Going over each of the rows, for each cell add the value corresponding to the column with the value in the previous column of that row.

For example,


-2
4
5
-9
-8
-2
-2
2
7
-2
-10
4
0
4
9
0
-8
5
0
0
5
-4
-12
-9
0
0
0
-9
-17
-8
0
0
0
0
-8

When filling the first row, the first element -2 has already been filled in as part of the diagonal.
Next the second element is (-2 + 4) = 2 , the third element is (2 + 5) = 7, the fourth element is (7 + -9 ) = -2 and the last element is (-2 + -8) = -10

Repeat similarly for each row independently.

Each cell (i, j) now represents the sum of all numbers start from the ith element till the jth element including both of them.

Pick the cell with the maximum value. In this case, it is 9, which is the largest sum of any subarray. In this example, the sub-array with the largest sum is {4, 5}.

The time complexity of this algorithm is O(n^2). It also requires O(n^2) additional space. Here is the C++ code:

int input[5] = {-2, 4, 5, -9, -8};
int SubArrayMaxSum(int n) {
if(n == 0) return 0; //empty array
if(n == 1) return input[0]; //only one element
int sum[n][n];
//initialize the matrix to 0
for(int i = 0; i < n; ++i) {
  for(int j = 0; j < n; ++j) {
    sum[i][j] = 0;
  }
}
//initialize the diagonal elements to the input array
for(int i = 0; i < n; ++i) {
  sum[i][i] = input[i];
}
int max_sum = sum[0][0];
for(int i = 0; i < n; ++i) {
  for(int j = i; j < n; ++j) {
    if(i == 0 && j == 0) continue; //skip
      sum[i][j] = sum[i][j-1] + input[j];
      if(sum[i][j] > max_sum) { //update the maximum sum
        max_sum = sum[i][j];
      }
  }
}
return max_sum;
}

Invoke the function as SubArrayMaxSum(5).

A better approach is to use dynamic programming and memoize the results in an array S. At each index i, we solve a subproblem that corresponds to the the maximum possible sub-array sum ending at i. As we iterate over the array for each element, say at index i, we make a decision if the maximum sub-array sum is got either by expanding the previous best sub-array ending at (i-1) or to start a new sub-array from this element. The value is hence the larger of the two values – current element or the sum of the maximal sub-array ending at the previous element and the current element. The maximum value in this result array represents the largest possible sub-array sum.

Si = max (Si-1+input[i], input[i]) and the maximum sub-array sum is
Smax = max (Si), 0 < i < n

Consider the same example input = {-2, 4, 5, -9, -8}. Construct the array as below. Initially, the sub-array with the largest sum is the first input element.

-2
4
9
0
-8

index 0: initialize to the first input element = -2
index 1: max(-2+4, 4) = 4
index 2: max(4+5, 5) = 9
index 3: max(9-9, -9) = 0
index 4: max(0-8,-8) = -8

Smax = 9, which is the largest possible sub-array sum.

In this method, we will visit all the array values exactly once. The time complexity is O(n). As we are using an additional array, the space complexity is O(n).
Here is the C++ code:

int input[5] = {-2, 4, 5, -9, -8};
int CalculateMaxSubArraySum(int n) {
if(n == 0) return 0; //empty array
if(n == 1) return input[0]; //only one element
int local_sum[n];
local_sum[0] = input[0];
for(int i = 1; i < n; ++i) {
  local_sum[i] = 0;
}
int max_sum = input[0];
for(int i = 1; i < n; ++i) {
//either include the element to the previous max sub-array
//or start a new max sub-array from here
  local_sum[i] = max(input[i], local_sum[i-1] + input[i]);
  if(max_sum < local_sum[i]) {
    max_sum = local_sum[i];
  }
}
for(int i = 0; i < n; ++i) {
  cout << local_sum[i] << endl;
}
return max_sum;
}

Invoke the function as: CalculateMaxSubArraySum(5).

Sunday, April 21, 2013

Search in a sorted matrix


Search for an element in an nxm matrix in which the elements in the rows and columns individually sorted in ascending order.

Eg:

1     2    3   9
4     5    6   12
7     8   11  22
10  18  30  35

Consider a sorted matrix and a 'key' to search. Start by comparing the key with the last element in the first row. If it is the element we are searching for, we are done. Otherwise, continue to search by eliminating either a row or a column on each comparison. The algorithm stops when either we have found the element or do not have any more elements to compare, in which case the 'key' is not in the matrix.

Here is a step-by-step process to search for an element '7' in the matrix above.

Step 1: Compare with 9. Eliminate the current column and move left as 7 < 9.
1
2
3
9
4
5
6
12
7
8
11
22
10
18
30
35

Step 2: Compare with 3. Eliminate the current row and move down as 7 > 3.

1
2
3
9
4
5
6
12
7
8
11
22
10
18
30
35

Step 3: Compare with 6. Eliminate the current row and move down as 7 > 3.

1
2
3
9
4
5
6
12
7
8
11
22
10
18
30
35

Step 4: Compare with 11. Eliminate the current column and move left as 7 < 11.

1
2
3
9
4
5
6
12
7
8
11
22
10
18
30
35

Step 5: Compare with 8. Eliminate the current column and move left as 7 < 8

1
2
3
9
4
5
6
12
7
8
11
22
10
18
30
35

Step 6: Compare with 7. Print the position and stop.

1
2
3
9
4
5
6
12
7
8
11
22
10
18
30
35

Here is the C++ code for this algorithm:

int sorted_matrix[4][4] = { {1,2,3,9}, {4,5,6,12}, {7,8,11,22}, {10,18,30,35} };
bool FindInSortedMatrix(int key, int row_size, int col_size) {
int row = 0, col = col_size; //start at the first row, last column
while(row <= row_size && col >= 0) {
  if(sorted_matrix[row][col] == key) { //key found
    cout << "Found at " << row << " " << col << endl;
    return true;
  } else if(sorted_matrix[row][col] > key) {
      --col; //eliminate this column
  } else {
      ++row; //eliminate this row
    }
  }
return false; //no more elements to compare
}


To find the element 7, invoke the function as: FindInSortedMatrix(7,3,3);

In the best case, only one comparison has to be performed, which takes O(1) time. In the worst case, we will have to compare the key with all the elements in the last column and all the elements in the last row. The complexity in this case is O(row_size + col_size). If it is a square matrix, the complexity is O(row_size), as row_size = col_size in this case.

This approach is simple. A divide-and-conquer approach can be used to get a more efficient solution. I'll discuss it in a future post.

Saturday, April 20, 2013

Anagrams


Given a pair of words, check if they are anagrams or not.

Two words are anagrams if the second word can be got by a rearrangement of letters in the first word. For example, the words 'orchestra' and 'carthorse' are anagrams. The letters and their frequencies appearing in both the words are the same.

One way to check for anagrams is to compute the letters and their frequencies in both the words. If they match, the two words are anagrams. For the above example, the frequency counts are:

O
R
C
H
E
S
T
A
1
2
1
1
1
1
1
1


C
A
R
T
H
O
S
E
1
1
2
1
1
1
1
1

This approach requires one pass over the words to compute and store the frequency counts in a data structure like a map. These map contents are then compared. This requires additional space proportional to O(n), where n is the number of unique characters in the words – this will be capped at 256 (8 bits = 1 byte) for ASCII strings. If the strings are significantly longer for some reason, then this makes sense.

A simpler approach is to sort the two words and compare the result. The two words are anagrams if their results match. For our example, the sorted outcome for both the words is 'acehorrst'.

Following is the C++ code to check for anagrams:

#include <cstdio>
#include <iostream>
#include <cstring>
#include <algorithm>

bool IsAnagrams(string s, string t) {
if(s.length() != t.length()) { //no need to compare further
  return false;
}
  sort(s.begin(),s.end());
  sort(t.begin(),t.end());
  if(s == t) {
    return true;
  }
  return false;
}


The sort function performs an in-place sort, which can be done in O(nlogn). Hence, the overall algorithm takes O(nlogn) time where n is the number of letters in the given words and does not require extra space.