
In our analysis, we also wanted to know how robust our results were to errors in the biological data (specifically, the scoring matrices). To simulate such errors, we generated additional CPLEX problems after first perturbing the real distance vector. Our method of perturbation was to vary each value in the distance vector up or down by a random amount within P% of the original distance value. Algebraically, this is:
DP = D0 + (s * r * (P / 100) * D0)
where D0 is the original distance, s is either 1 or -1 (chosen at random), r is a random number between 0 and 1, P is the perturbation percent, and DP is the perturbed distance.
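As a rough illustration, the perturbation step could be sketched in Python as follows. This is not our original code. The function name perturb_distances and its parameters are hypothetical, and the sketch assumes that P is a percentage of the original distance, so each offset is s * r * (P / 100) * D0.

```python
import random

def perturb_distances(distances, percent, rng=None):
    """Return a copy of `distances` with each value randomly perturbed.

    Assumption: each value moves up or down by a random amount of at most
    `percent` percent of its own original value.
    """
    if rng is None:
        rng = random.Random()
    perturbed = []
    for d0 in distances:
        s = rng.choice((1, -1))   # random sign
        r = rng.random()          # uniform random number in [0, 1)
        perturbed.append(d0 + s * r * (percent / 100.0) * d0)
    return perturbed

# Example: perturb a small distance vector by up to 10%.
original = [12.0, 7.5, 30.0]
print(perturb_distances(original, 10, random.Random(42)))
```

Passing a seeded random.Random instance makes a perturbed dataset reproducible, which helps when comparing results at different perturbation levels.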
In our datasets, we found that up to 10% perturbation had no effect on the minimum possible distortion. Only very high perturbations (around 50%) had any effect on it.