Beyond the long-term Mean: Exploring the Potential of F0 Distribution Parameters in Forensic Speaker Recognition

Yuko Kinoshita, Shunichi Ishihara, Phil Rose

Research output: A Conference proceeding or a Chapter in Book > Conference contribution

Abstract

Despite its many prima facie attractive properties for Forensic Speaker Recognition, F0 is regarded as having limited forensic value due to its large within-speaker variability. However, its forensic use to date has been limited mostly to its long-term mean and standard deviation. This paper examines the discriminatory potential, within a Likelihood Ratio-based approach, of additional parametric features from the distribution of long-term F0: its skew, kurtosis, modal F0 and modal density. Motivated by the observation that the overall long-term F0 distribution shows less within-speaker occasion-to-occasion difference, we report a forensic discrimination experiment with noncontemporaneous speech samples from 201 male Japanese speakers. Using a multivariate LR as discriminant distance with the six LTF0 distribution parameters, an EER of 10.7% is obtained from 201 target and 80400 non-target trials. We also investigate how the EER degrades as a function of amount of voiced speech.
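The six long-term F0 (LTF0) distribution parameters named in the abstract (mean, standard deviation, skew, kurtosis, modal F0 and modal density) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ltf0_parameters`, the population-moment formulas, the use of excess kurtosis, and the fixed-width histogram used to estimate the mode and its density are all assumptions made for illustration, since the abstract does not say how these quantities were estimated.

```python
import math
from collections import Counter


def ltf0_parameters(f0_hz, bin_width_hz=5.0):
    """Summarise a long-term F0 distribution with six parameters.

    f0_hz        -- F0 values in Hz, taken from voiced frames only.
    bin_width_hz -- histogram bin width used to locate the mode
                    (hypothetical choice, not taken from the paper).
    """
    n = len(f0_hz)
    if n < 2:
        raise ValueError("need at least two voiced F0 values")

    mean = sum(f0_hz) / n

    # Central moments in population form (dividing by n): an assumption.
    m2 = sum((x - mean) ** 2 for x in f0_hz) / n
    m3 = sum((x - mean) ** 3 for x in f0_hz) / n
    m4 = sum((x - mean) ** 4 for x in f0_hz) / n
    if m2 == 0:
        raise ValueError("F0 values are constant; skew and kurtosis undefined")

    sd = math.sqrt(m2)
    skew = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2 - 3.0  # excess kurtosis (0 for a normal distribution)

    # Histogram-based mode: the bin holding the most F0 values.
    counts = Counter(math.floor(x / bin_width_hz) for x in f0_hz)
    modal_bin, modal_count = max(counts.items(), key=lambda kv: kv[1])
    modal_f0 = (modal_bin + 0.5) * bin_width_hz       # centre of the modal bin
    modal_density = modal_count / (n * bin_width_hz)  # probability density at the mode

    return {
        "mean": mean,
        "sd": sd,
        "skew": skew,
        "kurtosis": kurtosis,
        "modal_f0": modal_f0,
        "modal_density": modal_density,
    }


if __name__ == "__main__":
    # Toy F0 track in Hz; real input would contain thousands of voiced frames.
    print(ltf0_parameters([100.0, 110.0, 110.0, 120.0]))
```

A kernel density estimate would give a smoother mode than a fixed-width histogram; the histogram is used here only to keep the sketch dependency-free.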
Original language: English
Title of host publication: Odyssey 2008: The Speaker and Language Recognition Workshop
Editors: Niko Brummer
Place of Publication: Stellenbosch, South Africa
Publisher: International Speech Communication Association
Pages: 1-8
Number of pages: 8
Volume: 1
ISBN (Print): 9780620403313
Publication status: Published - 2008
Event: Odyssey 2008, The Speaker and Language Recognition Workshop - Stellenbosch, South Africa
Duration: 21 Jan 2008 to 24 Jan 2008

Conference

Conference: Odyssey 2008, The Speaker and Language Recognition Workshop
Country: South Africa
City: Stellenbosch
Period: 21/01/08 to 24/01/08


Cite this

Kinoshita, Y., Ishihara, S., & Rose, P. (2008). Beyond the long-term Mean: Exploring the Potential of F0 Distribution Parameters in Forensic Speaker Recognition. In N. Brummer (Ed.), Odyssey 2008: The Speaker and Language Recognition Workshop (Vol. 1, pp. 1-8). Stellenbosch, South Africa: International Speech Communication Association.
@inproceedings{a8ceb79a3c48448ab48d4c859f45b1c6,
title = "Beyond the long-term Mean: Exploring the Potential of F0 Distribution Parameters in Forensic Speaker Recognition",
abstract = "Despite its many prima facie attractive properties for Forensic Speaker Recognition, F0 is regarded as having limited forensic value due to its large within-speaker variability. However, its forensic use to date has been limited mostly to its long-term mean and standard deviation. This paper examines the discriminatory potential, within a Likelihood Ratio-based approach, of additional parametric features from the distribution of long-term F0: its skew, kurtosis, modal F0 and modal density. Motivated by the observation that the overall long-term F0 distribution shows less within-speaker occasion-to-occasion difference, we report a forensic discrimination experiment with noncontemporaneous speech samples from 201 male Japanese speakers. Using a multivariate LR as discriminant distance with the six LTF0 distribution parameters, an EER of 10.7{\%} is obtained from 201 target and 80400 non-target trials. We also investigate how the EER degrades as a function of amount of voiced speech.",
author = "Yuko Kinoshita and Shunichi Ishihara and Phil Rose",
year = "2008",
language = "English",
isbn = "9780620403313",
volume = "1",
pages = "1--8",
editor = "Niko Brummer",
booktitle = "Odyssey 2008: The Speaker and Language Recognition Workshop",
publisher = "International Speech Communication Association"
}


