dartr: An R package to facilitate analysis of SNP data generated from reduced representation genome sequencing

Bernd Gruber, Peter Unmack, Oliver F. Berry, Arthur Georges

Research output: Contribution to journal, Article, peer-review

454 Citations (Scopus)


Although vast technological advances have been made and genetic software packages are growing in number, analysing SNP data is still not a trivial task. We announce a new R package, dartr, that enables the analysis of single nucleotide polymorphism (SNP) data for population genomic and phylogenomic applications. dartr provides user-friendly functions for data quality control and marker selection, and permits rigorous evaluation of conformity to Hardy–Weinberg equilibrium, gametic-phase disequilibrium and neutrality. The package reports standard descriptive statistics, supports exploration of patterns in the data through principal components analysis, and computes standard F-statistics. It also performs basic phylogenetic analyses, population assignment and tests of isolation by distance, and exports data to a variety of commonly used downstream applications outside the R environment (e.g., NewHybrids, fastStructure and phylogeny applications). The package serves two main purposes. First, it offers a user-friendly approach that lowers the hurdle to analysing such data; to that end, it comes with a detailed tutorial aimed at the R beginner, so that data can be analysed without deep knowledge of R. Second, all functions take a single, well-established format as input, the genlight format from the adegenet package, which avoids data reformatting. By strictly using the genlight format, we hope to promote it as the de facto standard for future software development and hence reduce the format jungle of genetic data sets. The dartr package is available via CRAN and GitHub.
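To make the Hardy–Weinberg evaluation mentioned in the abstract concrete, the following is a minimal sketch of a chi-square test for a single biallelic SNP. It is not dartr's implementation (dartr is an R package and its own routines may use a different test); the function name `hwe_chi_square` is hypothetical, and the only assumption carried over is the genlight-style coding of genotypes as the number of copies of the second allele (0, 1 or 2, with missing calls represented here as `None`).

```python
import math
from collections import Counter


def hwe_chi_square(genotypes):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic SNP.

    genotypes: iterable of 0, 1 or 2 (copies of the second allele),
               with None marking a missing call.
    Returns (chi2, p_value) using 1 degree of freedom.
    Hypothetical helper for illustration; not part of dartr or adegenet.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    if any(g not in (0, 1, 2) for g in called):
        raise ValueError("genotypes must be coded 0, 1, 2 or None")

    n = len(called)
    counts = Counter(called)
    observed = (counts[0], counts[1], counts[2])

    # Frequency of the second allele, estimated from the called genotypes.
    q = (counts[1] + 2 * counts[2]) / (2 * n)
    p = 1.0 - q

    # A monomorphic locus carries no information about HWE.
    if q == 0.0 or q == 1.0:
        return 0.0, 1.0

    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # 25 homozygotes of each kind and 50 heterozygotes, plus one missing call.
    locus = [0] * 25 + [1] * 50 + [2] * 25 + [None]
    print(hwe_chi_square(locus))
```

In practice such a test is run per locus and per population, and the resulting p-values are usually corrected for multiple testing before loci are filtered.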

Original language: English
Pages (from-to): 691-699
Number of pages: 9
Journal: Molecular Ecology Resources
Issue number: 3
Publication status: Published - 1 May 2018

