Predicting the presence of hepatitis B virus surface antigen in Chinese patients by pathology data mining

Guifang Shang, Alice RICHARDSON, Michelle GAHAN, Simon Easteal, Stephen Ohms, Brett A. Lidbury

    Research output: Contribution to journal, Article

    8 Citations (Scopus)

    Abstract

    Hepatitis B virus (HBV) is a pathogen of worldwide health significance, associated with liver disease. A vaccine is available, yet HBV prevalence remains a concern, particularly in developing countries. Pathology laboratories have a primary role in the diagnosis and monitoring of HBV infection, through hepatitis B surface antigen (HBsAg) immunoassay and associated tests. Analysis of HBsAg immunoassay and associated pathology data from 821 Chinese patients applied 10-fold cross-validation to establish classification decision trees (CDTs), with CDT results used subsequently to develop a logistic regression model. The robustness of the logistic regression model was confirmed by the Hosmer-Lemeshow test, pseudo-R2, and an area under the receiver operating characteristic curve (AUROC) result that showed the logistic regression model was capable of accurately discriminating HBsAg-positive from HBsAg-negative patients at 95% accuracy. Overall CDT sensitivity and specificity were 94.7% (+/- 5.0%) and 89.5% (+/- 5.7%), respectively, close to the sensitivity and specificity of the immunoassay, providing an alternative means of predicting HBsAg status. Both the CDT and logistic regression modeling demonstrated the importance of the routine pathology variables alanine aminotransferase (ALT), serum albumin (ALB), and alkaline phosphatase (ALP) for accurately predicting HBsAg status in a Chinese patient cohort. The study demonstrates that CDTs and a linked logistic regression model applied to routine pathology data were an effective supplement to HBsAg immunoassay, and a possible replacement method where immunoassays are not requested or not easily available for the laboratory diagnosis of HBV infection.
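    The abstract reports sensitivity, specificity, and AUROC without defining them. As a minimal sketch only (this is not the authors' code; the function names and the toy labels and scores below are hypothetical and are not study data), these three metrics could be computed from true HBsAg status and model outputs roughly as follows:

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int],
                            y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity); label 1 means HBsAg positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUROC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 positive and 4 negative patients.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]  # model probabilities
preds = [1 if s >= 0.5 else 0 for s in scores]      # assumed 0.5 cut-off

sens, spec = sensitivity_specificity(labels, preds)
area = auroc(labels, scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUROC={area:.4f}")
```

    Sensitivity and specificity depend on the chosen cut-off (0.5 here is an assumption), whereas AUROC summarises discrimination across all cut-offs, which is why the two kinds of figure are reported separately.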
    Original language: English
    Pages (from-to): 1334-1339
    Number of pages: 6
    Journal: Journal of Medical Virology
    Volume: 85
    Issue number: 8
    DOIs: https://doi.org/10.1002/jmv.23609
    Publication status: Published - 2013

    Fingerprint

    Data Mining
    Hepatitis B Surface Antigens
    Hepatitis B virus
    Logistic Models
    Pathology
    Decision Trees
    Immunoassay
    Virus Diseases
    Sensitivity and Specificity
    Clinical Laboratory Techniques
    Alanine Transaminase
    Serum Albumin
    ROC Curve
    Developing Countries
    Alkaline Phosphatase
    Liver Diseases
    Vaccines

    Cite this

    Shang, Guifang ; RICHARDSON, Alice ; GAHAN, Michelle ; Easteal, Simon ; Ohms, Stephen ; Lidbury, Brett A. / Predicting the presence of hepatitis B virus surface antigen in Chinese patients by pathology data mining. In: Journal of Medical Virology. 2013 ; Vol. 85, No. 8. pp. 1334-1339.
    @article{b8cbf612ead74848bf8d16ef226e44ec,
    title = "Predicting the presence of hepatitis B virus surface antigen in Chinese patients by pathology data mining",
    abstract = "Hepatitis B virus (HBV) is a pathogen of worldwide health significance, associated with liver disease. A vaccine is available, yet HBV prevalence remains a concern, particularly in developing countries. Pathology laboratories have a primary role in the diagnosis and monitoring of HBV infection, through hepatitis B surface antigen (HBsAg) immunoassay and associated tests. Analysis of HBsAg immunoassay and associated pathology data from 821 Chinese patients applied 10-fold cross-validation to establish classification decision trees (CDTs), with CDT results used subsequently to develop a logistic regression model. The robustness of the logistic regression model was confirmed by the Hosmer-Lemeshow test, pseudo-R2, and an area under the receiver operating characteristic curve (AUROC) result that showed the logistic regression model was capable of accurately discriminating HBsAg-positive from HBsAg-negative patients at 95{\%} accuracy. Overall CDT sensitivity and specificity were 94.7{\%} (+/- 5.0{\%}) and 89.5{\%} (+/- 5.7{\%}), respectively, close to the sensitivity and specificity of the immunoassay, providing an alternative means of predicting HBsAg status. Both the CDT and logistic regression modeling demonstrated the importance of the routine pathology variables alanine aminotransferase (ALT), serum albumin (ALB), and alkaline phosphatase (ALP) for accurately predicting HBsAg status in a Chinese patient cohort. The study demonstrates that CDTs and a linked logistic regression model applied to routine pathology data were an effective supplement to HBsAg immunoassay, and a possible replacement method where immunoassays are not requested or not easily available for the laboratory diagnosis of HBV infection.",
    keywords = "Decision tree, Hepatitis B virus, Logistic regression, Machine learning",
    author = "Guifang Shang and Alice RICHARDSON and Michelle GAHAN and Simon Easteal and Stephen Ohms and Lidbury, {Brett A.}",
    year = "2013",
    doi = "10.1002/jmv.23609",
    language = "English",
    volume = "85",
    pages = "1334--1339",
    journal = "Journal of Medical Virology",
    issn = "0146-6615",
    publisher = "Wiley-Liss Inc.",
    number = "8",

    }

    Shang, G, RICHARDSON, A, GAHAN, M, Easteal, S, Ohms, S & Lidbury, BA 2013, 'Predicting the presence of hepatitis B virus surface antigen in Chinese patients by pathology data mining', Journal of Medical Virology, vol. 85, no. 8, pp. 1334-1339. https://doi.org/10.1002/jmv.23609

    Predicting the presence of hepatitis B virus surface antigen in Chinese patients by pathology data mining. / Shang, Guifang; RICHARDSON, Alice; GAHAN, Michelle; Easteal, Simon; Ohms, Stephen; Lidbury, Brett A.

    In: Journal of Medical Virology, Vol. 85, No. 8, 2013, p. 1334-1339.

    TY - JOUR

    T1 - Predicting the presence of hepatitis B virus surface antigen in Chinese patients by pathology data mining

    AU - Shang, Guifang

    AU - RICHARDSON, Alice

    AU - GAHAN, Michelle

    AU - Easteal, Simon

    AU - Ohms, Stephen

    AU - Lidbury, Brett A.

    PY - 2013

    Y1 - 2013

    AB - Hepatitis B virus (HBV) is a pathogen of worldwide health significance, associated with liver disease. A vaccine is available, yet HBV prevalence remains a concern, particularly in developing countries. Pathology laboratories have a primary role in the diagnosis and monitoring of HBV infection, through hepatitis B surface antigen (HBsAg) immunoassay and associated tests. Analysis of HBsAg immunoassay and associated pathology data from 821 Chinese patients applied 10-fold cross-validation to establish classification decision trees (CDTs), with CDT results used subsequently to develop a logistic regression model. The robustness of the logistic regression model was confirmed by the Hosmer-Lemeshow test, pseudo-R2, and an area under the receiver operating characteristic curve (AUROC) result that showed the logistic regression model was capable of accurately discriminating HBsAg-positive from HBsAg-negative patients at 95% accuracy. Overall CDT sensitivity and specificity were 94.7% (+/- 5.0%) and 89.5% (+/- 5.7%), respectively, close to the sensitivity and specificity of the immunoassay, providing an alternative means of predicting HBsAg status. Both the CDT and logistic regression modeling demonstrated the importance of the routine pathology variables alanine aminotransferase (ALT), serum albumin (ALB), and alkaline phosphatase (ALP) for accurately predicting HBsAg status in a Chinese patient cohort. The study demonstrates that CDTs and a linked logistic regression model applied to routine pathology data were an effective supplement to HBsAg immunoassay, and a possible replacement method where immunoassays are not requested or not easily available for the laboratory diagnosis of HBV infection.

    KW - Decision tree

    KW - Hepatitis B virus

    KW - Logistic regression

    KW - Machine learning

    U2 - 10.1002/jmv.23609

    DO - 10.1002/jmv.23609

    M3 - Article

    VL - 85

    SP - 1334

    EP - 1339

    JO - Journal of Medical Virology

    JF - Journal of Medical Virology

    SN - 0146-6615

    IS - 8

    ER -