Book contents
- Frontmatter
- Contents
- List of code fragments
- Preface
- Part I Basic concepts
- Part II Pattern analysis algorithms
- 5 Elementary algorithms in feature space
- 6 Pattern analysis using eigen-decompositions
- 7 Pattern analysis using convex optimisation
- 8 Ranking, clustering and data visualisation
- Part III Constructing kernels
- Appendix A Proofs omitted from the main text
- Appendix B Notational conventions
- Appendix C List of pattern analysis methods
- Appendix D List of kernels
- References
- Index
8 - Ranking, clustering and data visualisation
from Part II - Pattern analysis algorithms
Published online by Cambridge University Press: 29 March 2011
Summary
In this chapter we conclude our presentation of kernel-based pattern analysis algorithms by discussing three further common tasks in data analysis: ranking, clustering and data visualisation.
Ranking is the problem of learning a ranking function from a training set of ranked data. The number of ranks need not be specified, though typically the training data comes with a relative ordering specified by assignment to one of an ordered sequence of labels.
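As a rough illustration of ranking by assignment to an ordered sequence of labels, the sketch below maps a real-valued score to a rank by comparing it with an increasing sequence of thresholds. The function name `assign_ranks`, the scores and the threshold values are all hypothetical choices made for this example; in practice the scores would come from a learned (possibly kernel-based) function, and the summary above does not prescribe this particular rule.

```python
import numpy as np

def assign_ranks(scores, thresholds):
    """Map real-valued scores to ordinal ranks 1..len(thresholds)+1.

    A score receives rank 1 + (number of thresholds that are <= score).
    `thresholds` is assumed to be sorted in increasing order.
    """
    thresholds = np.asarray(thresholds, dtype=float)
    return np.searchsorted(thresholds, scores, side="right") + 1

# Hypothetical scores, e.g. outputs of some learned scoring function.
scores = np.array([-1.2, 0.1, 0.7, 2.5])
# Two thresholds define three ordered ranks.
thresholds = [0.0, 1.0]

ranks = assign_ranks(scores, thresholds)
print(ranks.tolist())  # [1, 2, 2, 3]
```

Learning then amounts to choosing the scoring function and the thresholds so that the training items receive their given ranks as often as possible.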
Clustering is perhaps the most important and widely used method of unsupervised learning: it is the problem of identifying groupings of similar points that are relatively ‘isolated’ from each other or, in other words, of partitioning the data into dissimilar groups of similar items. The number of such clusters may not be specified a priori. As exact solutions are often computationally hard to find, effective approximations via relaxation procedures need to be sought.
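To make the idea of clustering in a kernel-defined feature space concrete, here is a minimal sketch of kernel k-means, a local-search heuristic that works only with the kernel matrix. It is offered as an assumption-laden illustration, not as the relaxation procedure the summary refers to. The squared distance from a point to a cluster centre is computed from kernel entries alone. The function names, the Gaussian kernel, its width `gamma`, the toy data and the initial labels are all hypothetical.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_kmeans(K, n_clusters, init_labels=None, n_iter=50, seed=0):
    """Alternate between assigning points and (implicitly) updating centres.

    Uses ||phi(x_i) - mu_c||^2
        = K_ii - (2/m) sum_{j in c} K_ij + (1/m^2) sum_{j,l in c} K_jl,
    where m is the size of cluster c.
    """
    n = K.shape[0]
    if init_labels is None:
        labels = np.random.default_rng(seed).integers(n_clusters, size=n)
    else:
        labels = np.asarray(init_labels).copy()
    diag = np.diag(K)
    for _ in range(n_iter):
        dist = np.full((n, n_clusters), np.inf)
        for c in range(n_clusters):
            members = labels == c
            m = members.sum()
            if m == 0:
                continue  # an empty cluster attracts no points
            cross = K[:, members].sum(axis=1) / m
            within = K[members][:, members].sum() / m ** 2
            dist[:, c] = diag - 2.0 * cross + within
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable
        labels = new_labels
    return labels

# Toy data: two tight, well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [10.0, 10.0], [10.1, 10.0], [10.0, 10.1]])
K = rbf_kernel(X, gamma=1.0)
# A deliberately mixed starting assignment, fixed for reproducibility.
labels = kernel_kmeans(K, n_clusters=2, init_labels=[0, 0, 1, 1, 1, 0])
print(labels.tolist())  # [0, 0, 0, 1, 1, 1]
```

Like ordinary k-means, this procedure only finds a local optimum that depends on the starting assignment, which is one reason why approximations with better guarantees are of interest.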
Data visualisation is often overlooked in pattern analysis and machine learning textbooks, despite being very popular in the data mining literature. It is a crucial step in the process of data analysis, enabling an understanding of the relations that exist within the data by displaying them in such a way that the discovered patterns are emphasised. The methods presented here allow us to visualise the data in the kernel-defined feature space, which is very valuable for the kernel selection process. Technically, visualisation reduces to finding low-dimensional embeddings of the data that approximately retain the relevant information.
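One way to obtain such a low-dimensional embedding from a kernel matrix is kernel principal components analysis. The sketch below is a minimal illustration under stated assumptions rather than the chapter's own method: it centres the kernel matrix in feature space, takes the leading eigenvectors, and scales them to give coordinates for the training points. The function name `kernel_pca_embedding`, the linear kernel and the toy data are hypothetical.

```python
import numpy as np

def kernel_pca_embedding(K, n_components=2):
    """Embed the training points using the top eigenvectors of the
    centred kernel matrix.

    Returns an (n, n_components) array Z such that Z @ Z.T is the best
    rank-n_components approximation of the centred kernel matrix.
    """
    n = K.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    Kc = J @ K @ J                           # kernel of the centred data
    eigvals, eigvecs = np.linalg.eigh(Kc)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[order], 0.0)    # clip tiny negative round-off
    return eigvecs[:, order] * np.sqrt(lam)

# Toy data with a linear kernel, so the result can be checked directly.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 3.0]])
K = X @ X.T
Z = kernel_pca_embedding(K, n_components=2)
print(Z.shape)  # (5, 2)
```

The two columns of `Z` can be passed straight to a scatter plot. Swapping in a different kernel matrix changes only the line that computes `K`, which makes it easy to compare how candidate kernels arrange the same data.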
- Type: Chapter
- Information: Kernel Methods for Pattern Analysis, pp. 252-288. Publisher: Cambridge University Press. Print publication year: 2004