Book contents
- Frontmatter
- Contents
- Contributors
- Preface
- 1 Scaling Up Machine Learning: Introduction
- Part One Frameworks for Scaling Up Machine Learning
- Part Two Supervised and Unsupervised Learning Algorithms
- Part Three Alternative Learning Settings
- 14 Parallel Online Learning
- 15 Parallel Graph-Based Semi-Supervised Learning
- 16 Distributed Transfer Learning via Cooperative Matrix Factorization
- 17 Parallel Large-Scale Feature Selection
- Part Four Applications
- Subject Index
- References
15 - Parallel Graph-Based Semi-Supervised Learning
from Part Three - Alternative Learning Settings
Published online by Cambridge University Press: 05 February 2012
Summary
Semi-supervised learning (SSL) is the process of training decision functions using small amounts of labeled and relatively large amounts of unlabeled data. In many applications, annotating training data is time consuming and error prone. Speech recognition is the typical example, which requires large amounts of meticulously annotated speech data (Evermann et al., 2005) to produce an accurate system. In the case of document classification for internet search, it is not even feasible to accurately annotate a relatively large number of web pages for all categories of potential interest. SSL lends itself as a useful technique in many machine learning applications because one need annotate only relatively small amounts of the available data. SSL is related to the problem of transductive learning (Vapnik, 1998). In general, a learner is transductive if it is designed for prediction on only a closed dataset, where the test set is revealed at training time. In practice, however, transductive learners can be modified to handle unseen data (Sindhwani, Niyogi, and Belkin, 2005; Zhu, 2005a). Chapter 25 in Chapelle, Scholkopf, and Zien (2007) gives a full discussion on the relationship between SSL and transductive learning. In this chapter, SSL refers to the semi-supervised transductive classification problem.
Let x ∈ X denote the input to the decision function (classifier), f, and y ∈ Y denote its output label, that is, f : X → Y. In most cases, f(x) = argmax_{y ∈ Y} p(y | x).
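As a concrete illustration of graph-based SSL in the transductive setting, the sketch below runs iterative label propagation on a tiny similarity graph: each unlabeled node repeatedly averages its neighbors' label distributions while labeled nodes stay clamped, and the final decision is f(x) = argmax_{y ∈ Y} p(y | x). This is a minimal, hypothetical example of the general technique (the graph, weights, and iteration count are invented for illustration), not the specific parallel algorithm developed in this chapter.

```python
import numpy as np

# Symmetric weight matrix W for a 5-node chain graph 0-1-2-3-4;
# the edge between nodes 1 and 2 is given a larger weight.
# (Toy graph invented for illustration.)
W = np.array([
    [0, 1, 0, 0, 0],
    [1, 0, 2, 0, 0],
    [0, 2, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

n_nodes, n_classes = 5, 2
Y = np.zeros((n_nodes, n_classes))
Y[0, 0] = 1.0                     # node 0 is labeled with class 0
Y[4, 1] = 1.0                     # node 4 is labeled with class 1
labeled = np.array([True, False, False, False, True])

# Iterative label propagation: average neighbors' distributions,
# then clamp the labeled nodes back to their known labels.
F = Y.copy()
d_inv = 1.0 / W.sum(axis=1)       # inverse node degrees
for _ in range(100):
    F = d_inv[:, None] * (W @ F)  # one step of neighborhood averaging
    F[labeled] = Y[labeled]       # keep labeled nodes fixed

pred = F.argmax(axis=1)           # f(x) = argmax_{y in Y} p(y | x)
print(pred)
```

At convergence the unlabeled scores form a harmonic function on the graph, so nodes 1 and 2 (which sit closer, in graph distance, to node 0) are assigned class 0 and node 3 is assigned class 1. Clamping the labeled nodes at every iteration is what makes the procedure transductive: it predicts only for the fixed, known set of unlabeled nodes.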
- Type: Chapter
- Book: Scaling Up Machine Learning: Parallel and Distributed Approaches, pp. 307 - 330
- Publisher: Cambridge University Press
- Print publication year: 2011