Tightrope walking: Using predictors of 25(OH)D concentration based on multivariable linear regression to infer associations with health risks

Ning DING, Keith Dear, Shuyu Guo, Fan Xiang, Robyn Lucas

    Research output: Contribution to journal › Article

    Abstract

    The debate on the causal association between vitamin D status, measured as serum concentration of 25-hydroxyvitamin D (25[OH]D), and various health outcomes warrants investigation in large-scale health surveys. Measuring the 25(OH)D concentration of each participant is not always feasible, because of the logistics of blood collection and the cost of vitamin D testing. To address this problem, past research has used predicted 25(OH)D concentration, based on multivariable linear regression, as a proxy for unmeasured vitamin D status. We restate this approach in a mathematical framework to deduce its possible pitfalls. Monte Carlo simulation and real data from the National Health and Nutrition Examination Survey 2005-2006 are used to confirm the deductions. The results indicate that variables used in the prediction model for 25(OH)D concentration but not in the model for the health outcome (called instrumental variables) play an essential role in the identification of an effect. Such variables should be unrelated to the health outcome other than through vitamin D; otherwise the estimate of interest will be biased. The approach of using predicted 25(OH)D concentration derived from multivariable linear regression may be valid, but it requires careful verification that the instrumental variables are unrelated to the health outcome.
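
The mechanism described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design: the variable names (`z` for an instrumental variable, `x` for a covariate shared by both models, `d` for 25(OH)D, `y` for a continuous health outcome), the coefficients, and the sample sizes are all assumptions chosen for illustration. The first stage fits the prediction model on a subsample with "measured" 25(OH)D, and the second stage regresses the outcome on the predicted concentration. The parameter `gamma` is the direct effect of the instrument on the outcome; the approach is only expected to recover the true effect `beta` when `gamma` is zero.

```python
import numpy as np


def simulate_once(rng, n=5000, beta=-0.5, gamma=0.0, a1=4.0, a2=1.0):
    """One simulated survey; returns the second-stage coefficient on predicted 25(OH)D.

    All names and values are illustrative assumptions:
      z     instrumental variable (in the prediction model only)
      x     covariate used in both models
      d     "true" 25(OH)D concentration
      y     continuous health outcome
      gamma direct effect of z on y (0 means the instrument is valid)
    """
    z = rng.normal(size=n)
    x = rng.normal(size=n)
    d = 50.0 + a1 * z + a2 * x + rng.normal(scale=5.0, size=n)
    y = 120.0 + beta * d + 1.5 * x + gamma * z + rng.normal(scale=3.0, size=n)

    # Stage 1: fit the prediction model on a subsample with measured 25(OH)D.
    m = n // 5
    design_1 = np.column_stack([np.ones(m), z[:m], x[:m]])
    coef_1, *_ = np.linalg.lstsq(design_1, d[:m], rcond=None)

    # Predict 25(OH)D for every participant, measured or not.
    d_hat = np.column_stack([np.ones(n), z, x]) @ coef_1

    # Stage 2: regress the outcome on predicted 25(OH)D and the shared covariate.
    design_2 = np.column_stack([np.ones(n), d_hat, x])
    coef_2, *_ = np.linalg.lstsq(design_2, y, rcond=None)
    return coef_2[1]


def monte_carlo(gamma, reps=200, seed=0, **kwargs):
    """Average the second-stage estimate over repeated simulated surveys."""
    rng = np.random.default_rng(seed)
    estimates = [simulate_once(rng, gamma=gamma, **kwargs) for _ in range(reps)]
    return float(np.mean(estimates))


if __name__ == "__main__":
    print("valid instrument   (gamma = 0):", monte_carlo(gamma=0.0))
    print("invalid instrument (gamma = 1):", monte_carlo(gamma=1.0))
```

In this linear setup the second-stage coefficient should settle near `beta` when `gamma = 0` and drift toward roughly `beta + gamma / a1` otherwise, which is the bias the abstract warns about when an instrumental variable affects the outcome through a path other than vitamin D.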
    Original language: English
    Article number: e0125551
    Pages (from-to): 1-13
    Number of pages: 13
    Journal: PLoS One
    Volume: 10
    Issue number: 5
    DOI: 10.1371/journal.pone.0125551
    Publication status: Published - 1 May 2015

    Fingerprint

    Health risks
    Vitamin D
    Linear regression
    Walking
    Linear Models
    Health
    National Health and Nutrition Examination Survey
    Nutrition Surveys
    Proxy
    Blood serum
    Health Surveys
    Nutrition
    Logistics
    Costs and Cost Analysis
    Blood
    Prediction
    Serum

    Cite this

    DING, Ning; Dear, Keith; Guo, Shuyu; Xiang, Fan; Lucas, Robyn. Tightrope walking: Using predictors of 25(OH)D concentration based on multivariable linear regression to infer associations with health risks. In: PLoS One. 2015; Vol. 10, No. 5. pp. 1-13.
    @article{47e0f2be648543c582b2e28678357612,
    title = "Tightrope walking: Using predictors of 25(OH)D concentration based on multivariable linear regression to infer associations with health risks",
    keywords = "Adult, Blood Pressure, Computer Simulation, Female, Humans, Linear Models, Male, Middle Aged, Monte Carlo Method, Multivariate Analysis, Nutrition Surveys, Regression Analysis, Risk Factors, United States, Vitamin D/analogs & derivatives",
    author = "Ning DING and Keith Dear and Shuyu Guo and Fan Xiang and Robyn Lucas",
    year = "2015",
    month = "5",
    day = "1",
    doi = "10.1371/journal.pone.0125551",
    language = "English",
    volume = "10",
    pages = "1--13",
    journal = "PLoS One",
    issn = "1932-6203",
    publisher = "Public Library of Science",
    number = "5",

    }

    TY - JOUR

    T1 - Tightrope walking: Using predictors of 25(OH)D concentration based on multivariable linear regression to infer associations with health risks

    AU - DING, Ning

    AU - Dear, Keith

    AU - Guo, Shuyu

    AU - Xiang, Fan

    AU - Lucas, Robyn

    PY - 2015/5/1

    Y1 - 2015/5/1

    N2 - The debate on the causal association between Vitamin D status, measured as serum concentration of 25-hydroxyVitamin D (25[OH]D), and various health outcomes warrants investigation in large-scale health surveys. Measuring the 25(OH)D concentration for each participant is not always feasible, because of the logistics of blood collection and the costs of Vitamin D testing. To address this problem, past research has used predicted 25(OH)D concentration, based on multivariable linear regression, as a proxy for unmeasured Vitamin D status. We restate this approach in a mathematical framework, to deduce its possible pitfalls. Monte Carlo simulation and real data from the National Health and Nutrition Examination Survey 2005-06 are used to confirm the deductions. The results indicate that variables that are used in the prediction model (for 25[OH]D concentration) but not in the model for the health outcome (called instrumental variables), play an essential role in the identification of an effect. Such variables should be unrelated to the health outcome other than through Vitamin D; otherwise the estimate of interest will be biased. The approach of predicted 25 (OH)D concentration derived from multivariable linear regression may be valid. However, careful verification that the instrumental variables are unrelated to the health outcome is required.

    KW - Adult

    KW - Blood Pressure

    KW - Computer Simulation

    KW - Female

    KW - Humans

    KW - Linear Models

    KW - Male

    KW - Middle Aged

    KW - Monte Carlo Method

    KW - Multivariate Analysis

    KW - Nutrition Surveys

    KW - Regression Analysis

    KW - Risk Factors

    KW - United States

    KW - Vitamin D/analogs & derivatives

    UR - http://www.scopus.com/inward/record.url?scp=84959569394&partnerID=8YFLogxK

    U2 - 10.1371/journal.pone.0125551

    DO - 10.1371/journal.pone.0125551

    M3 - Article

    VL - 10

    SP - 1

    EP - 13

    JO - PLoS One

    JF - PLoS One

    SN - 1932-6203

    IS - 5

    M1 - e0125551

    ER -