String Matching Algorithms Spring-2005, Monday, 15-18

 

Professor Gad M. Landau

 

Office hours: Wednesday, 09-10

8240103 (office), 052-4895100 (cell)

landau@cs.haifa.ac.il

 

Text Book:

Gusfield, D., Algorithms on Strings, Trees, and Sequences. Cambridge University Press, (1997).

 

List of subjects:

 

1.       Exact String Matching

Knuth, D.E., J.H. Morris and V.R. Pratt, Fast Pattern Matching in Strings, SIAM J. Computing, 6, 323-350, (1977).

Karp, R.M. and M.O. Rabin, Efficient randomized pattern-matching algorithms, IBM Journal Of Research and Development, 31, 2, 249{260, (1987).

 

2.       Approximate String Matching

Abrahamson, K.R. Generalized String Matching. SIAM J. Comput., 16, 6, 1039-1051 (1987).

Landau, G.M. and U. Vishkin, Fast parallel and serial approximate string matching, Journal of Algorithms, 10, 2, 157{169 (1989).

Galil, Z. and R. Giancarlo, Improved String Matching with k Mismatches, SIGACT NEWS, 62, 52-54, (1986).

Amir, A., M. Lewenstein and E. Porat, Faster algorithms for string matching with k mis-

matches, SODA, 794-803 (2000).

Presentation by A. Amir,  Presentation by M. Lewenstein.

 

3.       LCA

Berkman, O. and U. Vishkin, Recursive star-tree parallel data-structure,

SIAM J. Computing, 22, 221-242. (1989).

Presentation by Danny Hermelin.  

Bender, M. and M. Farach-Colton, The LCA Problem Revisited, Proceedings of the 4th Latin American Symposium on Theoretical Informatics, Pages: 88 – 94

 

 

 

4.       Suffix trees

Ukkonen, E., On-line Construction of Suffix Trees, Algorithmica, 14, 3, 249-260 (1995).

Presentation by Ora Arbel,

Weiner, P.,  Linear pattern matching algorithm, Proc. 14 IEEE Symposium on Switching and Automata  Theory, 1--11, 1973.

 Farach, M. Optimal suffix tree construction with large alphabets.  Proc. 38th IEEE Symposium on Foundations of Computer Science, pp. 137--143, 1997.

 

5.                   Suffix arrays

 Martin Farach-Colton

Karkkaaainen, J. and P. Sanders. Simple linear work suffix array construction. Proc. 30th International Colloquium on Automata, Languages and Programming (ICALP 03)}, pp. 943--955, 2003. LNCS 2719.

 Kim, D.K., J.S. Sim, H. Park and K. Park. Linear-time construction of suffix arrays. Symp. Combinatorial Pattern Matching (CPM 03)}, pp. 186-- 199, 2003. LNCS 2676.

 Ko, P. and S. Aluru. Space-efficient linear-time construction of suffix arrays. Symp. Combinatorial Pattern Matching (CPM 03)}, pp. 200--210, 2003. LNCS 2676.                                                                 

                     

6.                   Construction of Aho Corasick Automaton       

                      Aho, A.V. and M.J. Corasick. Efficient string matching. Comm. ACM, 18(6):333--340, 1975. 

                      Dori, S. and G.M. Landau, Construction of Aho Corasick automaton in linear time for integer alphabets, Proc. 16th

                      Combinatorial Pattern Matching Conference (CPM), Lecture Notes in Computer Science,  Springer-Verlag,

                      to appear (2005).

                      Presentation by  S. Dori. Presentation by  G. Landau.

 

7.                      Dynamic Programming, Hamming,  LCS,  Edit Distance

 

           

8.       Local Alignment

Smith, T.f., M.S. Waterman, The identification of common molecular subsequences, J. Mol. Biol., 147, 195-197 (1981).

Efraty, N., and G.M. Landau, Sparse Normalized Local Alignment, CPM 2004.

Presentation by Nadav Efraty

Arslan, A.N., Ö. Eğecioğlu and P.A. Pevzner, A new approach to sequence comparison:

normalized sequence alignment, Bioinformatics, 17, 4, 327-337, (2001).

Presentation by Dekel Tsur

 

 

9.     Monge

                   

Crochemore, M., G.M. Landau and M. Ziv-Ukelson, A Sub-quadratic sequence

alignment algorithm for unrestricted scoring matrices, SIAM J. Comput., 32, 5, 1654 -- 1673 (2003).

Presentation by  M.  Ziv-Ukelson.

 

Aggarwal, A., M. Klawe, S. Moran, P. Shor, and R. Wilber: Geometric Applications of a Matrix-Searching Algorithm, Algorithmica, 2, 195-208 (1987).

 

Lempel, A., and J. Ziv: On the complexity of finite sequences, IEEE Transactions on Information Theory, 22, 75--81 (1976).

 

Ziv, J., and A. Lempel: A Universal Algorithm for Sequential Data Compression, IEEE Transactions on Information Theory, IT-23(3), 337--343 (1977).     

 

 

10. Grade: Test.

 

Remarks:

 

1. Workshop, April 3-8.

 

2. Seminar, Tuesday 12-14.

 

3. Presentations.