Book contents
- Frontmatter
- Contents
- List of code fragments
- Preface
- Part I Basic concepts
- Part II Pattern analysis algorithms
- Part III Constructing kernels
- 9 Basic kernels and kernel types
- 10 Kernels for text
- 11 Kernels for structured data: strings, trees, etc.
- 12 Kernels from generative models
- Appendix A Proofs omitted from the main text
- Appendix B Notational conventions
- Appendix C List of pattern analysis methods
- Appendix D List of kernels
- References
- Index
11 - Kernels for structured data: strings, trees, etc.
from Part III - Constructing kernels
Published online by Cambridge University Press: 29 March 2011
Summary
Probably the most important data type after vectors and free text is that of symbol strings of varying lengths. This type of data is commonplace in bioinformatics applications, where it can be used to represent proteins as sequences of amino acids, genomic DNA as sequences of nucleotides, promoters and other structures. Partly for this reason a great deal of research has been devoted to it in the last few years. Many other application domains also handle data in the form of sequences, so many of the techniques have a history of development within computer science, for example in stringology, the study of string algorithms.
Kernels have been developed to compute the inner product between images of strings in high-dimensional feature spaces using dynamic programming techniques. Although sequences can be regarded as a special case of a more general class of structures for which kernels have been designed, we will discuss them separately for most of the chapter in order to emphasise their importance in applications and to aid understanding of the computational methods. In the last part of the chapter, we will show how these concepts and techniques can be extended to cover more general data structures, including trees, arrays, graphs and so on.
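To make the idea of a dynamic programming string kernel concrete, here is a minimal sketch. It assumes one particular feature space chosen for illustration, not stated in this summary: each feature counts how often a given (not necessarily contiguous) subsequence occurs in a string, with the empty subsequence included. The function name `all_subsequences_kernel` is hypothetical. The table entry `dp[i][j]` holds the kernel value for the prefixes `s[:i]` and `t[:j]`, so the inner product is obtained without ever enumerating the exponentially many features.

```python
def all_subsequences_kernel(s: str, t: str) -> int:
    """Inner product of subsequence-count feature vectors of s and t.

    Illustrative sketch: counts pairs of matching subsequence
    occurrences (including the empty subsequence) in O(len(s) * len(t)).
    """
    rows, cols = len(s) + 1, len(t) + 1
    # dp[i][j] is the kernel between prefixes s[:i] and t[:j].
    # If either prefix is empty, only the empty subsequence matches: value 1.
    dp = [[1] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            # Matching pairs that do not use both final symbols
            # (inclusion-exclusion over the two shorter prefixes).
            dp[i][j] = dp[i - 1][j] + dp[i][j - 1] - dp[i - 1][j - 1]
            # Matching pairs that end on both final symbols exist
            # only when those symbols are equal.
            if s[i - 1] == t[j - 1]:
                dp[i][j] += dp[i - 1][j - 1]
    return dp[-1][-1]


if __name__ == "__main__":
    # Shared subsequences of "cat" and "car": "", "c", "a", "ca".
    print(all_subsequences_kernel("cat", "car"))  # 4
```

Repeated symbols are counted with multiplicity: for `"aa"` and `"a"` the result is 3, since the subsequence `"a"` occurs twice in `"aa"` and once in `"a"`, plus one for the empty subsequence.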
Certain kernels for strings based on probabilistic modelling of the data-generating source will not be discussed here, since Chapter 12 is entirely devoted to these kinds of methods. There is, however, some overlap between the structure kernels presented here and those arising from probabilistic modelling covered in Chapter 12.
- Type: Chapter
- Information: Kernel Methods for Pattern Analysis, pp. 344-396. Publisher: Cambridge University Press. Print publication year: 2004