Application of DArT seq derived SNP tags for comparative genome analysis in fishes; An alternative pipeline using sequence data from a non-traditional model species, Macquaria ambigua

Foyez Shams, Fiona Dyer, Ross Thompson, Richard P. Duncan, Jason D. Thiem, Andrzej Kilian, Tariq Ezaz

Research output: Contribution to journalArticle

2 Downloads (Pure)

Abstract

Bi-allelic Single Nucleotide Polymorphism (SNP) markers are widely used in population genetic studies. In most studies, sequences either side of the SNPs remain unused, although these sequences contain information beyond that used in population genetic studies. In this study, we show how these sequence tags either side of a single nucleotide polymorphism can be used for comparative genome analysis. We used DArTseq (Diversity Array Technology) derived SNP data for a non-model Australian native freshwater fish, Macquaria ambigua, to identify genes linked to SNP associated sequence tags, and to discover homologies with evolutionarily conserved genes and genomic regions. We concatenated 6,776 SNP sequence tags to create a hypothetical genome (representing 0.1–0.3% of the actual genome), which we used to find sequence homologies with 12 model fish species using the Ensembl genome browser with stringent filtering parameters. We identified sequence homologies for 17 evolutionarily conserved genes (cd9b, plk2b, rhot1b, sh3pxd2aa, si:ch211-148f13.1, si:dkey-166d12.2, zgc:66447, atp8a2, clvs2, lyst, mkln1, mnd1, piga, pik3ca, plagl2, rnf6, sec63) along with an ancestral evolutionarily conserved syntenic block (euteleostomi Block_210). Our analysis also revealed repetitive sequences covering approximately 12% of the hypothetical genome where DNA transposon, LTR and non-LTR retrotransposons were most abundant. A hierarchical pattern of the number of sequence homologies with phylogenetically close species validated the approach for repeatability. This new approach of using SNP associated sequence tags for comparative genome analysis may provide insight into the genome evolution of non-model species where whole genome sequences are unavailable.

Original languageEnglish
Article numbere0226365
Pages (from-to)1-13
Number of pages13
JournalPLoS One
Volume14
Issue number12
DOIs
Publication statusPublished - 1 Jan 2019

Fingerprint

Polymorphism
Fish
single nucleotide polymorphism
Single Nucleotide Polymorphism
Fishes
Nucleotides
Pipelines
Genes
Genome
genome
fish
Sequence Homology
sequence homology
Population Genetics
population genetics
Retroelements
DNA Transposable Elements
repetitive sequences
genes
Macquaria ambigua

Cite this

@article{703ee5f939ce4fb4aeec1dbb493dd41f,
title = "Application of DArT seq derived SNP tags for comparative genome analysis in fishes; An alternative pipeline using sequence data from a non-traditional model species, Macquaria ambigua",
abstract = "Bi-allelic Single Nucleotide Polymorphism (SNP) markers are widely used in population genetic studies. In most studies, sequences either side of the SNPs remain unused, although these sequences contain information beyond that used in population genetic studies. In this study, we show how these sequence tags either side of a single nucleotide polymorphism can be used for comparative genome analysis. We used DArTseq (Diversity Array Technology) derived SNP data for a non-model Australian native freshwater fish, Macquaria ambigua, to identify genes linked to SNP associated sequence tags, and to discover homologies with evolutionarily conserved genes and genomic regions. We concatenated 6,776 SNP sequence tags to create a hypothetical genome (representing 0.1–0.3{\%} of the actual genome), which we used to find sequence homologies with 12 model fish species using the Ensembl genome browser with stringent filtering parameters. We identified sequence homologies for 17 evolutionarily conserved genes (cd9b, plk2b, rhot1b, sh3pxd2aa, si:ch211-148f13.1, si:dkey-166d12.2, zgc:66447, atp8a2, clvs2, lyst, mkln1, mnd1, piga, pik3ca, plagl2, rnf6, sec63) along with an ancestral evolutionarily conserved syntenic block (euteleostomi Block_210). Our analysis also revealed repetitive sequences covering approximately 12{\%} of the hypothetical genome where DNA transposon, LTR and non-LTR retrotransposons were most abundant. A hierarchical pattern of the number of sequence homologies with phylogenetically close species validated the approach for repeatability. This new approach of using SNP associated sequence tags for comparative genome analysis may provide insight into the genome evolution of non-model species where whole genome sequences are unavailable.",
author = "Foyez Shams and Fiona Dyer and Ross Thompson and Duncan, {Richard P.} and Thiem, {Jason D.} and Andrzej Kilian and Tariq Ezaz",
year = "2019",
month = "1",
day = "1",
doi = "10.1371/journal.pone.0226365",
language = "English",
volume = "14",
pages = "1--13",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "12",

}

TY - JOUR

T1 - Application of DArT seq derived SNP tags for comparative genome analysis in fishes; An alternative pipeline using sequence data from a non-traditional model species, Macquaria ambigua

AU - Shams, Foyez

AU - Dyer, Fiona

AU - Thompson, Ross

AU - Duncan, Richard P.

AU - Thiem, Jason D.

AU - Kilian, Andrzej

AU - Ezaz, Tariq

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Bi-allelic Single Nucleotide Polymorphism (SNP) markers are widely used in population genetic studies. In most studies, sequences either side of the SNPs remain unused, although these sequences contain information beyond that used in population genetic studies. In this study, we show how these sequence tags either side of a single nucleotide polymorphism can be used for comparative genome analysis. We used DArTseq (Diversity Array Technology) derived SNP data for a non-model Australian native freshwater fish, Macquaria ambigua, to identify genes linked to SNP associated sequence tags, and to discover homologies with evolutionarily conserved genes and genomic regions. We concatenated 6,776 SNP sequence tags to create a hypothetical genome (representing 0.1–0.3% of the actual genome), which we used to find sequence homologies with 12 model fish species using the Ensembl genome browser with stringent filtering parameters. We identified sequence homologies for 17 evolutionarily conserved genes (cd9b, plk2b, rhot1b, sh3pxd2aa, si:ch211-148f13.1, si:dkey-166d12.2, zgc:66447, atp8a2, clvs2, lyst, mkln1, mnd1, piga, pik3ca, plagl2, rnf6, sec63) along with an ancestral evolutionarily conserved syntenic block (euteleostomi Block_210). Our analysis also revealed repetitive sequences covering approximately 12% of the hypothetical genome where DNA transposon, LTR and non-LTR retrotransposons were most abundant. A hierarchical pattern of the number of sequence homologies with phylogenetically close species validated the approach for repeatability. This new approach of using SNP associated sequence tags for comparative genome analysis may provide insight into the genome evolution of non-model species where whole genome sequences are unavailable.

AB - Bi-allelic Single Nucleotide Polymorphism (SNP) markers are widely used in population genetic studies. In most studies, sequences either side of the SNPs remain unused, although these sequences contain information beyond that used in population genetic studies. In this study, we show how these sequence tags either side of a single nucleotide polymorphism can be used for comparative genome analysis. We used DArTseq (Diversity Array Technology) derived SNP data for a non-model Australian native freshwater fish, Macquaria ambigua, to identify genes linked to SNP associated sequence tags, and to discover homologies with evolutionarily conserved genes and genomic regions. We concatenated 6,776 SNP sequence tags to create a hypothetical genome (representing 0.1–0.3% of the actual genome), which we used to find sequence homologies with 12 model fish species using the Ensembl genome browser with stringent filtering parameters. We identified sequence homologies for 17 evolutionarily conserved genes (cd9b, plk2b, rhot1b, sh3pxd2aa, si:ch211-148f13.1, si:dkey-166d12.2, zgc:66447, atp8a2, clvs2, lyst, mkln1, mnd1, piga, pik3ca, plagl2, rnf6, sec63) along with an ancestral evolutionarily conserved syntenic block (euteleostomi Block_210). Our analysis also revealed repetitive sequences covering approximately 12% of the hypothetical genome where DNA transposon, LTR and non-LTR retrotransposons were most abundant. A hierarchical pattern of the number of sequence homologies with phylogenetically close species validated the approach for repeatability. This new approach of using SNP associated sequence tags for comparative genome analysis may provide insight into the genome evolution of non-model species where whole genome sequences are unavailable.

UR - http://www.scopus.com/inward/record.url?scp=85076431746&partnerID=8YFLogxK

UR - http://www.mendeley.com/research/application-dart-seq-derived-snp-tags-comparative-genome-analysis-fishes-alternative-pipeline-using

U2 - 10.1371/journal.pone.0226365

DO - 10.1371/journal.pone.0226365

M3 - Article

VL - 14

SP - 1

EP - 13

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 12

M1 - e0226365

ER -