L1 Identification from L2 Speech Using Neural Spectrogram Analysis

25 November 2020, Version 1

Abstract

The main objective of this project is to model L1-L2 interaction and to uncover discriminative speech features that can identify the L1 background of a speaker from their non-native English speech. Traditional phonetic analyses of L1-L2 interaction tend to rely on a pre-selected set of acoustic features, which may not capture every trace of the L1 in the L2 speech needed for accurate classification. Deep learning has the potential to address this by exploring the feature space automatically. In this talk I report a series of classification experiments with a deep convolutional neural network (CNN) trained on spectrogram images. The classification problem is to determine whether English speech samples from a large spontaneous speech corpus were produced by a native speaker of Standard Southern British English (SSBE) or by a native speaker of Japanese, Dutch, French, or Polish.
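
To make the approach concrete, below is a minimal sketch of the kind of spectrogram-based CNN classifier the abstract describes. All details here are illustrative assumptions rather than the authors' actual model: the log-mel front end, the layer sizes, the 16 kHz sample rate, and the class list ordering are placeholders chosen for the example.

```python
# Sketch only: architecture and preprocessing parameters are assumptions,
# not the model reported in the talk.
import torch
import torch.nn as nn
import torchaudio

# Assumed label set based on the abstract's five L1 backgrounds.
L1_CLASSES = ["SSBE", "Japanese", "Dutch", "French", "Polish"]

# Assumed front end: log-mel spectrogram of a fixed-length mono clip.
mel = torchaudio.transforms.MelSpectrogram(
    sample_rate=16000, n_fft=400, hop_length=160, n_mels=64
)
to_db = torchaudio.transforms.AmplitudeToDB()

class SpectrogramCNN(nn.Module):
    def __init__(self, n_classes=len(L1_CLASSES)):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool over time and frequency
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):  # x: (batch, 1, n_mels, n_frames)
        h = self.features(x).flatten(1)
        return self.classifier(h)  # logits over the five L1 classes

# Usage on a 3-second clip at 16 kHz (dummy waveform for illustration):
waveform = torch.randn(1, 48000)
spec = to_db(mel(waveform)).unsqueeze(0)  # shape (1, 1, 64, n_frames)
logits = SpectrogramCNN()(spec)
print(L1_CLASSES[logits.argmax(dim=1).item()])
```

The design point this sketch illustrates is the one the abstract argues for: instead of hand-picking acoustic features, the convolutional layers learn discriminative time-frequency patterns directly from the spectrogram, and the final linear layer maps them to the five L1 classes.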

Keywords

L1 Identification
English as a Second Language
Phonetics
Pronunciation
Deep learning
Phonetic modelling
