Information About

Subsequence




Formally, suppose that ''X'' is a set and that (''a''''k'')''k'' ∈ ''K'' is a sequence in ''X'', where ''K'' = {1,2,3,...,''n''} if (''a''''k'') is a finite sequence and ''K'' = N if (''a''''k'') is an infinite sequence. Then, a subsequence of (''ak'') is a sequence of the form (a_{n_r}) where (''nr'') is a strictly increasing sequence in the index set ''K''.


EXAMPLE

As an example,
: < B,C,D,B >
is a subsequence of
: < A,C,B,D,E,G,C,E,D,B,G > ,
with corresponding index sequence <3,7,9,10>.

Given two sequences ''X'' and ''Y'', a sequence ''G'' is said to be a ''common subsequence'' of ''X'' and ''Y'', if ''G'' is a subsequence of both ''X'' and ''Y''. For example, if
: X = < A,C,B,D,E,G,C,E,D,B,G > and
: Y = < B,E,G,C,F,E,U,B,K >
then common subsequence of ''X'' and ''Y'' could be
: G = < B,E,E >

This would ''not'' be the '' Longest Common Subsequence '', since ''G'' only has length 3, and the common subsequence < B,E,E,B > has length 4. The longest common subsequence of ''X'' and ''Y'' is < B,E,G,C,E,B >


APPLICATIONS

Subsequences have applications to Computer Science , especially in the discipline of Bioinformatics , where computers are used to compare, analyze, and store DNA strands.

Take two strands of DNA, say :

ORG1 = ACGGTGTCGTGCTATGCTGATGCTGACTTATATGCTA

ORG2 = CGTTCGGCTATCGTACGTTCTATTCTATGATTTCTAA

Subsequences are used to determine how similar the two strands of DNA are, using the DNA bases: Adenine , Guanine , Cytosine and Thymine .


SUBSTRING VS. SUBSEQUENCE

In computer science, '' String '' is often used as a synonym for ''sequence'', but it is important to note that '' Substring '' and ''subsequence'' are not synonyms. Substrings are consecutive parts of a string, while subsequences need not be. This means that a substring of a string is always a subsequence of the string, but the opposite is not true.


SEE ALSO



REFERENCES

  Last Gusfield
  First Dan
  Origyear 1997
  Year 1999
  Title Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology
  Publisher Cambridge University Press
  Location USA
  Id ISBN 0-521-58519-8
  Pages 4