Student: Shiri Azenkot
School: Pomona College
Email: shiri.azenkot@pomona.edu
shiriaz@dimax.rutgers.edu
Project Name: String Edit Distance with Moves
Research Advisor: Dr. Graham Cormode, DIMACS Postdoc
Rutgers University




Problem Description

Given two strings x and y, the edit distance, d(x, y), is the smallest number of operations required to turn one string into the other. Allowed operations are inserting a character, deleting a character, or replacing a character.

Example:

x = abcdef = defabc
d(x, y) = 6

One way to turn x into y in six operation is to first delete the characters d, e, and f with three operations. Next, insert each of them at the beginning of the string, using another three operations.

A dynamic programming algorithm exists to solve the string edit distance problem in time O(mn), where m is the length of one string, and n is the length of the other.

My research involves a slightly different problem: the string edit distance with moves, where we allow the additional operation of moving a substring. The edit distance with moves of the strings x and y as defined above is only 1. The substring def can be moved from the end of x to the beginning in just one operation.

The edit distance with moves problem has applications in many areas, including computational biology, text editing, and webpage updating. The long term goal of this project is to apply the edit distance with moves problem to genetic sequences.