The concept of compositional data analysis in practice

Total major element concentrations in agricultural and grazing land soils of Europe

Clemens Reimann, Peter Filzmoser, Karl Fabian, Karel Hron, Manfred Birke, Alecos Demetriades, Enrico Dinelli, Anna Ladenberger, S. Albanese, M. Andersson, A. Arnoldussen, R. Baritz, M. J. Batista, A. Bel-lan, D. Cicchella, B. De Vivo, W. De Vos, M. Duris, A. Dusza-Dobek, O. A. Eggen, M. Eklund, V. Ernstsen, T. E. Finne, D. Flight, S. Forrester, M. Fuchs, U. Fugedi, A. Gilucis, M. Gosar, V. Gregorauskiene, A. Gulan, J. Halamic, E. Haslinger, P. Hayoz, G. Hobiger, R. Hoffmann, J. Hoogewerff, H. Hrvatovic, S. Husnjak, L. Janik, C. C. Johnson, G. Jordan, J. Kirby, J. Kivisilla, V. Klos, F. Krone, P. Kwecko, L. Kuti, A. Lima, J. Locutura, P. Lucivjansky, D. Mackovych, B. I. Malyuk, R. Maquil, M. J. McLaughlin, R. G. Meuli, N. Miosic, G. Mol, P. Négrel, P. O'Connor, K. Oorts, R. T. Ottesen, A. Pasieczna, V. Petersell, S. Pfleiderer, M. Ponavic, C. Prazeres, U. Rauch, Salpeteur, A. Schedl, A. Scheib, I. Schoeters, P. Sefcik, E. Sellersjö, F. Skopljak, I. Slaninka, A. Šorša, R. Srvkota, T. Stafilov, T. Tarvainen, V. Trendavilov, P. Valera, V. Verougstraete, D. Vidojevic, A. M. Zissimos, Z. Zomeni

Research output: Contribution to journal › Article

110 Citations (Scopus)

Abstract

Applied geochemistry and environmental sciences invariably deal with compositional data. Classically, the original or log-transformed absolute element concentrations are studied. However, compositional data do not vary independently, and a concentration-based approach to data analysis can lead to faulty conclusions. For this reason a better statistical approach was introduced in the 1980s, exclusively based on relative information. Because the difference between the two methods should be most pronounced in large-scale, and therefore highly variable, datasets, here a new dataset of agricultural soils, covering all of Europe (5.6 million km²) at an average sampling density of 1 site/2500 km², is used to demonstrate and compare both approaches. Absolute element concentrations are certainly of interest in a variety of applications and can be provided in tabulations or concentration maps. Maps for the opened data (ratios to other elements) provide more specific additional information. For compositional data, XY plots for raw or log-transformed data should only be used with care in an exploratory data analysis (EDA) sense, to detect unusual data behaviour, candidate subgroups of samples, or to compare pre-defined groups of samples. Correlation analysis and the Euclidean distance are not mathematically meaningful concepts for this data type. Element relationships have to be investigated via a stability measure of the (log-)ratios of elements. Logratios are also the key ingredient for an appropriate multivariate analysis of compositional data.
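The abstract's point about log-ratios can be illustrated with a minimal sketch. The abstract does not say which stability measure the authors use, so the sketch assumes one common choice: the variance of log(x_i / x_j) across samples. The sample values and the function names (`closure`, `clr`, `logratio_variance`) are invented for illustration and are not taken from the paper.

```python
import math
import statistics

def closure(parts, total=100.0):
    """Rescale a sample so that its parts sum to `total` (e.g. wt%)."""
    s = sum(parts)
    return [total * p / s for p in parts]

def clr(parts):
    """Centred log-ratio: log of each part relative to the geometric mean."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def logratio_variance(samples, i, j):
    """Variance of log(x_i / x_j) across samples; near zero means a stable ratio."""
    return statistics.variance([math.log(s[i] / s[j]) for s in samples])

# Hypothetical raw amounts: parts 0 and 1 always occur in a 2:1 ratio,
# while part 2 varies strongly from sample to sample.
raw = [(20.0, 10.0, 5.0), (20.0, 10.0, 40.0), (20.0, 10.0, 70.0)]
closed = [closure(s) for s in raw]

# After closure the percentages of parts 0 and 1 change from sample to sample,
# although their relationship is fixed; the log-ratio still sees that.
stable = logratio_variance(closed, 0, 1)
unstable = logratio_variance(closed, 0, 2)
print(stable, unstable)
```

In this toy data the closed percentages of the first two parts differ between samples only because the third part changes, which is the kind of dependence the abstract warns about. The log-ratio variance is unaffected by closure, so it gives the same answer on the raw amounts and on the percentages.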

Original language: English
Pages (from-to): 196-210
Number of pages: 15
Journal: Science of the Total Environment
Volume: 426
DOI: 10.1016/j.scitotenv.2012.02.032
Publication status: Published - 2012
Externally published: Yes

Fingerprint

grazing
soil
geochemistry
sampling
data analysis
Europe
land
multivariate analysis
agricultural soil

Cite this

Reimann, Clemens ; Filzmoser, Peter ; Fabian, Karl ; Hron, Karel ; Birke, Manfred ; Demetriades, Alecos ; Dinelli, Enrico ; Ladenberger, Anna ; Albanese, S. ; Andersson, M. ; Arnoldussen, A. ; Baritz, R. ; Batista, M. J. ; Bel-lan, A. ; Cicchella, D. ; De Vivo, B. ; De Vos, W. ; Duris, M. ; Dusza-Dobek, A. ; Eggen, O. A. ; Eklund, M. ; Ernstsen, V. ; Finne, T. E. ; Flight, D. ; Forrester, S. ; Fuchs, M. ; Fugedi, U. ; Gilucis, A. ; Gosar, M. ; Gregorauskiene, V. ; Gulan, A. ; Halamic, J. ; Haslinger, E. ; Hayoz, P. ; Hobiger, G. ; Hoffmann, R. ; Hoogewerff, J. ; Hrvatovic, H. ; Husnjak, S. ; Janik, L. ; Johnson, C. C. ; Jordan, G. ; Kirby, J. ; Kivisilla, J. ; Klos, V. ; Krone, F. ; Kwecko, P. ; Kuti, L. ; Lima, A. ; Locutura, J. ; Lucivjansky, P. ; Mackovych, D. ; Malyuk, B. I. ; Maquil, R. ; McLaughlin, M. J. ; Meuli, R. G. ; Miosic, N. ; Mol, G. ; Négrel, P. ; O'Connor, P. ; Oorts, K. ; Ottesen, R. T. ; Pasieczna, A. ; Petersell, V. ; Pfleiderer, S. ; Ponavic, M. ; Prazeres, C. ; Rauch, U. ; Salpeteur ; Schedl, A. ; Scheib, A. ; Schoeters, I. ; Sefcik, P. ; Sellersjö, E. ; Skopljak, F. ; Slaninka, I. ; Šorša, A. ; Srvkota, R. ; Stafilov, T. ; Tarvainen, T. ; Trendavilov, V. ; Valera, P. ; Verougstraete, V. ; Vidojevic, D. ; Zissimos, A. M. ; Zomeni, Z. / The concept of compositional data analysis in practice : Total major element concentrations in agricultural and grazing land soils of Europe. In: Science of the Total Environment. 2012 ; Vol. 426. pp. 196-210.
@article{5e6e71ade4a840e3826cb2a47d40b279,
title = "The concept of compositional data analysis in practice: Total major element concentrations in agricultural and grazing land soils of Europe",
abstract = "Applied geochemistry and environmental sciences invariably deal with compositional data. Classically, the original or log-transformed absolute element concentrations are studied. However, compositional data do not vary independently, and a concentration-based approach to data analysis can lead to faulty conclusions. For this reason a better statistical approach was introduced in the 1980s, exclusively based on relative information. Because the difference between the two methods should be most pronounced in large-scale, and therefore highly variable, datasets, here a new dataset of agricultural soils, covering all of Europe (5.6 million km²) at an average sampling density of 1 site/2500 km², is used to demonstrate and compare both approaches. Absolute element concentrations are certainly of interest in a variety of applications and can be provided in tabulations or concentration maps. Maps for the opened data (ratios to other elements) provide more specific additional information. For compositional data, XY plots for raw or log-transformed data should only be used with care in an exploratory data analysis (EDA) sense, to detect unusual data behaviour, candidate subgroups of samples, or to compare pre-defined groups of samples. Correlation analysis and the Euclidean distance are not mathematically meaningful concepts for this data type. Element relationships have to be investigated via a stability measure of the (log-)ratios of elements. Logratios are also the key ingredient for an appropriate multivariate analysis of compositional data.",
keywords = "Agricultural soil, Compositional data, Europe, Geochemistry, Major elements, XRF",
author = "Clemens Reimann and Peter Filzmoser and Karl Fabian and Karel Hron and Manfred Birke and Alecos Demetriades and Enrico Dinelli and Anna Ladenberger and S. Albanese and M. Andersson and A. Arnoldussen and R. Baritz and Batista, {M. J.} and A. Bel-lan and D. Cicchella and {De Vivo}, B. and {De Vos}, W. and M. Duris and A. Dusza-Dobek and Eggen, {O. A.} and M. Eklund and V. Ernstsen and Finne, {T. E.} and D. Flight and S. Forrester and M. Fuchs and U. Fugedi and A. Gilucis and M. Gosar and V. Gregorauskiene and A. Gulan and J. Halamic and E. Haslinger and P. Hayoz and G. Hobiger and R. Hoffmann and J. Hoogewerff and H. Hrvatovic and S. Husnjak and L. Janik and Johnson, {C. C.} and G. Jordan and J. Kirby and J. Kivisilla and V. Klos and F. Krone and P. Kwecko and L. Kuti and A. Lima and J. Locutura and P. Lucivjansky and D. Mackovych and Malyuk, {B. I.} and R. Maquil and McLaughlin, {M. J.} and Meuli, {R. G.} and N. Miosic and G. Mol and P. N{\'e}grel and P. O'Connor and K. Oorts and Ottesen, {R. T.} and A. Pasieczna and V. Petersell and S. Pfleiderer and M. Ponavic and C. Prazeres and U. Rauch and Salpeteur and A. Schedl and A. Scheib and I. Schoeters and P. Sefcik and E. Sellersj{\"o} and F. Skopljak and I. Slaninka and A. Šorša and R. Srvkota and T. Stafilov and T. Tarvainen and V. Trendavilov and P. Valera and V. Verougstraete and D. Vidojevic and Zissimos, {A. M.} and Z. Zomeni",
year = "2012",
doi = "10.1016/j.scitotenv.2012.02.032",
language = "English",
volume = "426",
pages = "196--210",
journal = "Science of the Total Environment",
issn = "0048-9697",
publisher = "Elsevier",

}

TY - JOUR

T1 - The concept of compositional data analysis in practice

T2 - Total major element concentrations in agricultural and grazing land soils of Europe

AU - Reimann, Clemens

AU - Filzmoser, Peter

AU - Fabian, Karl

AU - Hron, Karel

AU - Birke, Manfred

AU - Demetriades, Alecos

AU - Dinelli, Enrico

AU - Ladenberger, Anna

AU - Albanese, S.

AU - Andersson, M.

AU - Arnoldussen, A.

AU - Baritz, R.

AU - Batista, M. J.

AU - Bel-lan, A.

AU - Cicchella, D.

AU - De Vivo, B.

AU - De Vos, W.

AU - Duris, M.

AU - Dusza-Dobek, A.

AU - Eggen, O. A.

AU - Eklund, M.

AU - Ernstsen, V.

AU - Finne, T. E.

AU - Flight, D.

AU - Forrester, S.

AU - Fuchs, M.

AU - Fugedi, U.

AU - Gilucis, A.

AU - Gosar, M.

AU - Gregorauskiene, V.

AU - Gulan, A.

AU - Halamic, J.

AU - Haslinger, E.

AU - Hayoz, P.

AU - Hobiger, G.

AU - Hoffmann, R.

AU - Hoogewerff, J.

AU - Hrvatovic, H.

AU - Husnjak, S.

AU - Janik, L.

AU - Johnson, C. C.

AU - Jordan, G.

AU - Kirby, J.

AU - Kivisilla, J.

AU - Klos, V.

AU - Krone, F.

AU - Kwecko, P.

AU - Kuti, L.

AU - Lima, A.

AU - Locutura, J.

AU - Lucivjansky, P.

AU - Mackovych, D.

AU - Malyuk, B. I.

AU - Maquil, R.

AU - McLaughlin, M. J.

AU - Meuli, R. G.

AU - Miosic, N.

AU - Mol, G.

AU - Négrel, P.

AU - O'Connor, P.

AU - Oorts, K.

AU - Ottesen, R. T.

AU - Pasieczna, A.

AU - Petersell, V.

AU - Pfleiderer, S.

AU - Ponavic, M.

AU - Prazeres, C.

AU - Rauch, U.

AU - Salpeteur

AU - Schedl, A.

AU - Scheib, A.

AU - Schoeters, I.

AU - Sefcik, P.

AU - Sellersjö, E.

AU - Skopljak, F.

AU - Slaninka, I.

AU - Šorša, A.

AU - Srvkota, R.

AU - Stafilov, T.

AU - Tarvainen, T.

AU - Trendavilov, V.

AU - Valera, P.

AU - Verougstraete, V.

AU - Vidojevic, D.

AU - Zissimos, A. M.

AU - Zomeni, Z.

PY - 2012

Y1 - 2012

N2 - Applied geochemistry and environmental sciences invariably deal with compositional data. Classically, the original or log-transformed absolute element concentrations are studied. However, compositional data do not vary independently, and a concentration-based approach to data analysis can lead to faulty conclusions. For this reason a better statistical approach was introduced in the 1980s, exclusively based on relative information. Because the difference between the two methods should be most pronounced in large-scale, and therefore highly variable, datasets, here a new dataset of agricultural soils, covering all of Europe (5.6 million km²) at an average sampling density of 1 site/2500 km², is used to demonstrate and compare both approaches. Absolute element concentrations are certainly of interest in a variety of applications and can be provided in tabulations or concentration maps. Maps for the opened data (ratios to other elements) provide more specific additional information. For compositional data, XY plots for raw or log-transformed data should only be used with care in an exploratory data analysis (EDA) sense, to detect unusual data behaviour, candidate subgroups of samples, or to compare pre-defined groups of samples. Correlation analysis and the Euclidean distance are not mathematically meaningful concepts for this data type. Element relationships have to be investigated via a stability measure of the (log-)ratios of elements. Logratios are also the key ingredient for an appropriate multivariate analysis of compositional data.

KW - Agricultural soil

KW - Compositional data

KW - Europe

KW - Geochemistry

KW - Major elements

KW - XRF

UR - http://www.scopus.com/inward/record.url?scp=84861519032&partnerID=8YFLogxK

U2 - 10.1016/j.scitotenv.2012.02.032

DO - 10.1016/j.scitotenv.2012.02.032

M3 - Article

VL - 426

SP - 196

EP - 210

JO - Science of the Total Environment

JF - Science of the Total Environment

SN - 0048-9697

ER -