Age-group and gender classification through class-dependent phone recognition

Michael Norris, Michael Wagner

    Research output: A Conference proceeding or a Chapter in Book › Conference contribution › peer-review

    Abstract

    This study proposes a method to determine the gender and age group of a speaker by means of an automatic speech recognition system that is trained on six different sets of phones: one for each intersection of the two gender and three age-group classes. The study uses the Australian National Database of Spoken Language (ANDOSL) with 18 speakers in each class reading a set of 200 phonetically rich sentences. The system trains 44 context-independent phone models for each of the six classes and determines the gender and age group of an unknown utterance by finding the best matching phone sequence against the combined set of 264 phone models. Two methods of utilising the resulting phone sequences for gender and age-group recognition are evaluated: firstly, simple counting of the number of phones that belong to each class is used as the basis for the six-way class decision; secondly, the recognised phone sequence is converted to a 264-dimensional vector, whose components contain the phone counts in the phone sequence for each of the 6 x 44 phones in the combined set. An artificial neural network is trained to make the final gender and age-group decision using the count vectors as input. The artificial neural network outperforms the simple counting method with an average correct recall for gender of 97.7%, an average correct recall for age group of 60.5% and an average correct recall for combined gender and age group of 58.9%.
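The two ways of using the recognised phone sequence can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the recogniser output is already available as a list of `(class_id, phone_id)` pairs, one per recognised phone. The class labels, the class ordering, the function names and the tie-breaking rule are all hypothetical, since the abstract does not specify them. The neural network itself is left out because the abstract gives no architecture; the sketch stops at the 264-dimensional count vector that would be its input.

```python
# Hypothetical labels and ordering: the abstract states two gender and three
# age-group classes but does not name them.
GENDERS = ["female", "male"]
AGE_GROUPS = ["young", "middle", "old"]
CLASSES = [(g, a) for g in GENDERS for a in AGE_GROUPS]  # 6 classes
N_PHONES = 44  # context-independent phone models per class (from the abstract)


def class_counts(phone_sequence):
    """Count how many recognised phones belong to each of the six classes."""
    counts = [0] * len(CLASSES)
    for class_id, _phone_id in phone_sequence:
        counts[class_id] += 1
    return counts


def classify_by_counting(phone_sequence):
    """Method 1: pick the class whose phone models were recognised most often.

    Assumption: ties go to the lowest class index; the abstract does not say
    how ties are resolved.
    """
    if not phone_sequence:
        raise ValueError("empty phone sequence")
    counts = class_counts(phone_sequence)
    best = max(range(len(CLASSES)), key=lambda c: counts[c])
    return CLASSES[best]


def count_vector(phone_sequence):
    """Method 2 input: a 6 x 44 = 264-dimensional vector of phone counts."""
    vec = [0] * (len(CLASSES) * N_PHONES)
    for class_id, phone_id in phone_sequence:
        # Assumed layout: each class occupies one contiguous block of 44 entries.
        vec[class_id * N_PHONES + phone_id] += 1
    return vec


# Made-up recogniser output: four phones, three of them from class 3.
example = [(3, 10), (3, 5), (0, 10), (3, 10)]
predicted = classify_by_counting(example)
features = count_vector(example)
```

In this sketch the counting method discards which phones were recognised and keeps only their class membership, whereas the count vector preserves both, which is the extra information a classifier trained on the vectors could exploit.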
    Original language: English
    Title of host publication: 13th Australasian International Conference on Speech Science and Technology
    Place of Publication: Online
    Publisher: Australasian Speech Science and Technology Association (ASSTA)
    Pages: 38-41
    Number of pages: 4
    ISBN (Print): 9780958194631
    Publication status: Published - 2010
    Event: 13th Australasian International Conference on Speech Science and Technology - Melbourne, Australia
    Duration: 14 Dec 2010 to 16 Dec 2010

    Conference

    Conference: 13th Australasian International Conference on Speech Science and Technology
    Country/Territory: Australia
    City: Melbourne
    Period: 14/12/10 to 16/12/10
