The schema last approach to data fusion

Neil Brittliff, Dharmendra Sharma

Research output: A Conference proceeding or a Chapter in Book - Conference contribution

Abstract

Big Data presents challenges that require novel approaches to resolve the problems associated with the variability and variety of data obtained from multiple sources. This paper focuses on how to manage the variety and eclectic nature of big data using a technique known as 'Schema Last'. The 'Schema Last' approach is a framework that defers the application of a descriptive model until it is required. This paper also provides a formal definition of the 'Schema Last' methodology and demonstrates its effectiveness over the more traditional Extract-Transform-Load methodologies employed in many organizations. The 'Schema Last' approach can be used as input to Map Reduction, index creation and various data mining techniques. Ultimately, the 'Schema Last' approach provides the framework to 'fuse' semi-structured data into a single coherent view.
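The idea of deferring the descriptive model can be illustrated with a minimal sketch. This is not the paper's formal definition or the authors' implementation; the names RAW_STORE, PERSON_SCHEMA, apply_schema and fuse, and the sample records, are hypothetical and chosen only for illustration. Raw semi-structured records are stored untouched, and a schema is applied only when a fused view is requested.

```python
import json
from typing import Any, Callable, Dict, Iterable, Iterator, List, Tuple

# Raw records are kept exactly as received; no schema is imposed at load time.
# (Hypothetical sample data from two imagined sources.)
RAW_STORE: List[str] = [
    '{"source": "crm", "name": "Ada Lovelace", "dob": "1815-12-10"}',
    '{"source": "billing", "full_name": "Ada Lovelace", "balance": "42.50"}',
    '{"source": "crm", "name": "Alan Turing"}',
]

# A hypothetical descriptive model: each unified field maps to the raw keys
# that may carry it, plus a converter applied when the value is read.
Schema = Dict[str, Tuple[Tuple[str, ...], Callable[[Any], Any]]]

PERSON_SCHEMA: Schema = {
    "name": (("name", "full_name"), str),
    "balance": (("balance",), float),
}


def apply_schema(raw_lines: Iterable[str], schema: Schema) -> Iterator[Dict[str, Any]]:
    """Apply the schema lazily, at read time, to each raw record."""
    for line in raw_lines:
        record = json.loads(line)
        view: Dict[str, Any] = {}
        for field, (keys, convert) in schema.items():
            for key in keys:
                if key in record:
                    view[field] = convert(record[key])
                    break  # first matching raw key wins
        yield view


def fuse(views: Iterable[Dict[str, Any]], key: str) -> Dict[Any, Dict[str, Any]]:
    """Merge views that share the same key value into one coherent record."""
    fused: Dict[Any, Dict[str, Any]] = {}
    for view in views:
        if key not in view:
            continue  # record cannot be identified under this schema
        fused.setdefault(view[key], {}).update(view)
    return fused


fused = fuse(apply_schema(RAW_STORE, PERSON_SCHEMA), key="name")
print(fused)
```

Because the raw store is never rewritten, a different schema can later be applied to the same records without reloading them, which is the contrast with an Extract-Transform-Load pipeline that this sketch is meant to suggest.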

Original language: English
Title of host publication: Twelfth Australasian Data Mining Conference (AusDM14)
Subtitle of host publication: Proceedings of the 12th Australasian Data Mining Conference, AusDM 2014
Editors: Xue Li, Lin Liu, Kok-Leong Ong, Yanchang Zhao
Place of Publication: Brisbane, Australia
Publisher: Australian Computer Society
Pages: 51-58
Number of pages: 8
Volume: 158
ISBN (Print): 9781921770173
Publication status: Published - 2014

Publication series

Name: Conferences in Research and Practice in Information Technology Series
Volume: 158
ISSN (Print): 1445-1336

Fingerprint

Data fusion
Electric fuses
Data mining
Mathematical transformations
Big data

Cite this

Brittliff, N., & Sharma, D. (2014). The schema last approach to data fusion. In X. Li, L. Liu, K-L. Ong, & Y. Zhao (Eds.), Twelfth Australasian Data Mining Conference (AusDM14): Proceedings of the 12th Australasian Data Mining Conference, AusDM 2014 (Vol. 158, pp. 51-58). (Conferences in Research and Practice in Information Technology Series; Vol. 158). Brisbane, Australia: Australian Computer Society.
@inproceedings{9dbb62ad3a3e4facbf2fa70884c099f4,
title = "The schema last approach to data fusion",
keywords = "Data fusion",
author = "Neil Brittliff and Dharmendra Sharma",
year = "2014",
language = "English",
isbn = "9781921770173",
volume = "158",
series = "Conferences in Research and Practice in Information Technology Series",
publisher = "Australian Computer Society",
pages = "51--58",
editor = "Xue Li and Lin Liu and Kok-Leong Ong and Yanchang Zhao",
booktitle = "Twelfth Australasian Data Mining Conference (AusDM14)",
address = "Brisbane, Australia",
}

TY - GEN

T1 - The schema last approach to data fusion

AU - Brittliff, Neil

AU - Sharma, Dharmendra

PY - 2014

Y1 - 2014

KW - Data fusion

UR - http://www.scopus.com/inward/record.url?scp=84992623411&partnerID=8YFLogxK

UR - http://www.mendeley.com/research/schema-last-approach-data-fusion

M3 - Conference contribution

SN - 9781921770173

VL - 158

T3 - Conferences in Research and Practice in Information Technology Series

SP - 51

EP - 58

BT - Twelfth Australasian Data Mining Conference (AusDM14)

A2 - Li, Xue

A2 - Liu, Lin

A2 - Ong, Kok-Leong

A2 - Zhao, Yanchang

PB - Australian Computer Society

CY - Brisbane, Australia

ER -
