Abstract
Most real-world datasets in biomedical research contain outliers and leverage points. To define these terms, let us assume a regression of Y on X, where Y is the outcome variable and X the independent covariate(s). Outliers are observations of the outcome Y that lie far from the majority of the other observations along the y-axis. Outliers can sometimes be influential, meaning that they substantially change the results of a regression analysis, i.e., the estimated b-coefficients and, consequently, the predicted values of the outcome y. We therefore have to distinguish between (a) “non-influential” outliers, i.e., those that have a minimal impact on the estimated regression model but still lead to an overestimation of the standard error, and (b) “influential” outliers, which seriously affect the estimated model because they “pull” the regression line towards themselves [1].
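The distinction between the two kinds of outlier can be illustrated with a small simulation. The following is only a sketch under stated assumptions and is not taken from the article: the data are synthetic, there is a single covariate, the model is fitted by ordinary least squares, and the helper name `fit_line`, the shift of 15 units and the positions of the shifted points are all hypothetical choices made for illustration. The same shift is applied once to the observation at the centre of the x range and once to the observation at its upper end.

```python
import numpy as np


def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x.

    Returns the intercept b0, the slope b1 and the standard error of b1.
    """
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(x) - 2)   # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)   # covariance matrix of (b0, b1)
    return beta[0], beta[1], np.sqrt(cov[1, 1])


# Synthetic data (assumption): true line y = 1 + 2x plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 21)                  # the mean of x is 5
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, size=x.size)

# Same shift in y, applied at two different x positions.
y_mid = y.copy()
y_mid[10] += 15                             # observation at x = 5 (centre)
y_end = y.copy()
y_end[-1] += 15                             # observation at x = 10 (upper end)

b0_base, b1_base, se_base = fit_line(x, y)
b0_mid, b1_mid, se_mid = fit_line(x, y_mid)
b0_end, b1_end, se_end = fit_line(x, y_end)

print(f"clean data      : slope={b1_base:.3f}  SE={se_base:.3f}")
print(f"outlier at x=5  : slope={b1_mid:.3f}  SE={se_mid:.3f}")
print(f"outlier at x=10 : slope={b1_end:.3f}  SE={se_end:.3f}")
```

In this setup the outlier at the centre of the x range leaves the slope essentially unchanged but inflates its standard error, which matches case (a). The same shift at the end of the x range also inflates the standard error and, in addition, moves the slope towards the shifted point, which matches case (b).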
| Original language | English |
|---|---|
| Pages (from-to) | 1267-1269 |
| Number of pages | 3 |
| Journal | Archives of Medical Science |
| Volume | 16 |
| Issue number | 5 |
| Publication status | Published - 6 Aug 2019 |
| Externally published | Yes |