On a partial least squares regression model for asymmetric data with a chemical application in mining

Mauricio Huerta, Víctor Leiva, Shuangzhe Liu, Marcelo Rodríguez, Danny Villegas

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

In chemometrical applications, covariates in regression models are often correlated, causing a collinearity problem that can be solved by partial least squares (PLS) regression. In addition, high dimensionality in the space of covariates is also a problem with more parameters than cases, a phenomenon usually found in chemical spectral data that can also be solved by PLS regression. The Birnbaum-Saunders distribution has theoretical justifications for modeling chemical data. In this paper, a new methodology based on PLS regression models is proposed considering a reparameterized Birnbaum-Saunders (RBS) distribution for the response, which is useful for describing asymmetric data frequently found in chemical phenomena. We estimate the RBS-PLS model parameters using the maximum likelihood method. A bootstrap approach is employed to obtain the optimal number of PLS components. Quantile residuals and Cook and Mahalanobis type distances are utilized for detecting possible anomalies in the modeling. We conduct perturbation studies to assess the performance of these diagnostic tools. The proposed methodology is applied to real-world kaolinite data and compared to other competing models. This provides a useful illustration of chemical analysis in the mining industry.
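
To illustrate the kind of asymmetric response the abstract refers to, the following minimal sketch simulates Birnbaum-Saunders data under a mean/precision reparameterization. The abstract does not state which reparameterization the paper uses, so the mapping here (mean `mu`, precision `delta`, with classical shape `alpha = sqrt(2/delta)` and scale `beta = delta*mu/(delta+1)`) is an assumption, and the function names `rbs_to_classical`, `sample_bs`, and `sample_rbs` are hypothetical. It is not the authors' code and does not implement the RBS-PLS model itself.

```python
import math
import random


def rbs_to_classical(mu, delta):
    """Map an assumed mean/precision pair (mu, delta) to the classical
    Birnbaum-Saunders shape/scale pair (alpha, beta)."""
    alpha = math.sqrt(2.0 / delta)
    beta = delta * mu / (delta + 1.0)
    return alpha, beta


def sample_bs(alpha, beta, rng):
    """Draw one Birnbaum-Saunders variate from a standard normal draw,
    using T = beta * (a*Z/2 + sqrt((a*Z/2)^2 + 1))^2."""
    w = alpha * rng.gauss(0.0, 1.0) / 2.0
    return beta * (w + math.sqrt(w * w + 1.0)) ** 2


def sample_rbs(mu, delta, n, seed=0):
    """Draw n variates under the assumed mean/precision parameterization."""
    rng = random.Random(seed)
    alpha, beta = rbs_to_classical(mu, delta)
    return [sample_bs(alpha, beta, rng) for _ in range(n)]


draws = sample_rbs(mu=2.0, delta=4.0, n=20000, seed=1)
sample_mean = sum(draws) / len(draws)
sample_median = sorted(draws)[len(draws) // 2]
# Right-skewed data: the mean sits above the median.
print(f"mean={sample_mean:.3f} median={sample_median:.3f}")
```

Under this parameterization the sample mean should be close to `mu`, while the median falls below it, which is the positive skew that motivates replacing a Gaussian response with a Birnbaum-Saunders one.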

Original language: English
Pages (from-to): 55-68
Number of pages: 14
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 190
DOI: 10.1016/j.chemolab.2019.04.013
Publication status: Published - 15 Jul 2019

Fingerprint

Kaolin
Kaolinite
Mineral industry
Maximum likelihood
Chemical analysis

Cite this

Huerta, Mauricio ; Leiva, Víctor ; Liu, Shuangzhe ; Rodríguez, Marcelo ; Villegas, Danny. / On a partial least squares regression model for asymmetric data with a chemical application in mining. In: Chemometrics and Intelligent Laboratory Systems. 2019 ; Vol. 190. pp. 55-68.
@article{56052968e5ac44aba49e0eb0dbb23bf7,
title = "On a partial least squares regression model for asymmetric data with a chemical application in mining",
abstract = "In chemometrical applications, covariates in regression models are often correlated, causing a collinearity problem that can be solved by partial least squares (PLS) regression. In addition, high dimensionality in the space of covariates is also a problem with more parameters than cases, a phenomenon usually found in chemical spectral data that can also be solved by PLS regression. The Birnbaum-Saunders distribution has theoretical justifications for modeling chemical data. In this paper, a new methodology based on PLS regression models is proposed considering a reparameterized Birnbaum-Saunders (RBS) distribution for the response, which is useful for describing asymmetric data frequently found in chemical phenomena. We estimate the RBS-PLS model parameters using the maximum likelihood method. A bootstrap approach is employed to obtain the optimal number of PLS components. Quantile residuals and Cook and Mahalanobis type distances are utilized for detecting possible anomalies in the modeling. We conduct perturbation studies to assess the performance of these diagnostic tools. The proposed methodology is applied to real-world kaolinite data and compared to other competing models. This provides a useful illustration of chemical analysis in the mining industry.",
keywords = "Bootstrapping, Cook and Mahalanobis distances, Diagnostic analysis, GLM, Likelihood method, NIR spectral data, PCA, R software, Statistical residuals",
author = "Mauricio Huerta and V{\'i}ctor Leiva and Shuangzhe Liu and Marcelo Rodr{\'i}guez and Danny Villegas",
year = "2019",
month = "7",
day = "15",
doi = "10.1016/j.chemolab.2019.04.013",
language = "English",
volume = "190",
pages = "55--68",
journal = "Chemometrics and Intelligent Laboratory Systems",
issn = "0169-7439",
publisher = "Elsevier",

}

On a partial least squares regression model for asymmetric data with a chemical application in mining. / Huerta, Mauricio; Leiva, Víctor; Liu, Shuangzhe; Rodríguez, Marcelo; Villegas, Danny.

In: Chemometrics and Intelligent Laboratory Systems, Vol. 190, 15.07.2019, p. 55-68.


TY - JOUR

T1 - On a partial least squares regression model for asymmetric data with a chemical application in mining

AU - Huerta, Mauricio

AU - Leiva, Víctor

AU - Liu, Shuangzhe

AU - Rodríguez, Marcelo

AU - Villegas, Danny

PY - 2019/7/15

Y1 - 2019/7/15

N2 - In chemometrical applications, covariates in regression models are often correlated, causing a collinearity problem that can be solved by partial least squares (PLS) regression. In addition, high dimensionality in the space of covariates is also a problem with more parameters than cases, a phenomenon usually found in chemical spectral data that can also be solved by PLS regression. The Birnbaum-Saunders distribution has theoretical justifications for modeling chemical data. In this paper, a new methodology based on PLS regression models is proposed considering a reparameterized Birnbaum-Saunders (RBS) distribution for the response, which is useful for describing asymmetric data frequently found in chemical phenomena. We estimate the RBS-PLS model parameters using the maximum likelihood method. A bootstrap approach is employed to obtain the optimal number of PLS components. Quantile residuals and Cook and Mahalanobis type distances are utilized for detecting possible anomalies in the modeling. We conduct perturbation studies to assess the performance of these diagnostic tools. The proposed methodology is applied to real-world kaolinite data and compared to other competing models. This provides a useful illustration of chemical analysis in the mining industry.


KW - Bootstrapping

KW - Cook and Mahalanobis distances

KW - Diagnostic analysis

KW - GLM

KW - Likelihood method

KW - NIR spectral data

KW - PCA

KW - R software

KW - Statistical residuals

UR - http://www.scopus.com/inward/record.url?scp=85066243590&partnerID=8YFLogxK

UR - http://www.mendeley.com/research/partial-least-squares-regression-model-asymmetric-data-chemical-application-mining

U2 - 10.1016/j.chemolab.2019.04.013

DO - 10.1016/j.chemolab.2019.04.013

M3 - Article

VL - 190

SP - 55

EP - 68

JO - Chemometrics and Intelligent Laboratory Systems

JF - Chemometrics and Intelligent Laboratory Systems

SN - 0169-7439

ER -