-
Whole-genome modeling accurately predicts quantitative traits in plants
Authors:
Laurent Gentzbittel,
Cecile Ben,
Melanie Mazurier,
Min-Gyoung Shin,
Martin Triska,
Martina Rickauer,
Yuri Nikolsky,
Paul Marjoram,
Sergey Nuzhdin,
Tatiana Tatarinova
Abstract:
Understanding the relationship between genomic variation and variation in phenotypes for quantitative traits such as physiology, yield, fitness or behavior will provide important insights for both predicting adaptive evolution and for breeding schemes. A particular question is whether the genetic variation that influences quantitative phenotypes is typically the result of one or two mutations of large effect, or multiple mutations of small effect. In this paper we explore this issue using the wild model legume Medicago truncatula. We show that phenotypes, such as quantitative disease resistance, can be well-predicted using genome-wide patterns of admixture, from which it follows that there must be many mutations of small effect. Our findings prove the potential of our novel 'whole-genome modeling' (WhoGEM) method and experimentally validate, for the first time, the infinitesimal model as a mechanism for adaptation of quantitative phenotypes in plants. This insight can accelerate breeding and biomedicine research programs.
Submitted 8 November, 2015;
originally announced November 2015.
-
The mysterious orphans of Mycoplasmataceae
Authors:
Tatiana V. Tatarinova,
Inna Lysnyansky,
Yuri V. Nikolsky,
Alexander Bolshoy
Abstract:
Background: The length of a protein sequence is largely determined by its function, i.e. each functional group is associated with an optimal size. However, comparative genomics revealed that protein length may be affected by additional factors. In 2002 it was shown that in the bacterium Escherichia coli and the archaeon Archaeoglobus fulgidus, protein sequences with no homologs are, on average, shorter than those with homologs. Most experts now agree that the length distributions are distinctly different between protein sequences with and without homologs in bacterial and archaeal genomes. In this study, we examine this postulate through a comprehensive analysis of all annotated prokaryotic genomes, focusing on certain exceptions.
Results: We compared the length distributions of proteins having homologs (HHPs) and proteins having no homologs (orphans or ORFans) in all currently annotated completely sequenced prokaryotic genomes. As expected, the HHPs and ORFans have strikingly different length distributions in almost all genomes. As previously established, the HHPs are, on average, longer than the ORFans, and the length distributions for the ORFans have a relatively narrow peak, in contrast to the HHPs, whose lengths spread over a wider range of values. However, about thirty genomes do not obey these rules. Practically all genomes of Mycoplasma and Ureaplasma have atypical ORFan distributions, with the mean length of ORFans larger than the mean length of HHPs. These genera constitute over 80% of atypical genomes.
Conclusions: We confirmed on a ubiquitous set of genomes the previous observation that HHPs and ORFans have different gene length distributions. We also showed that Mycoplasmataceae genomes have distinctive distributions of ORFan lengths. We offer several possible biological explanations of this phenomenon.
Submitted 24 August, 2015;
originally announced August 2015.
-
Genomic study of the Ket: a Paleo-Eskimo-related ethnic group with significant ancient North Eurasian ancestry
Authors:
Pavel Flegontov,
Piya Changmai,
Anastassiya Zidkova,
Maria D. Logacheva,
Olga Flegontova,
Mikhail S. Gelfand,
Evgeny S. Gerasimov,
Ekaterina E. Khrameeva,
Olga P. Konovalova,
Tatiana Neretina,
Yuri V. Nikolsky,
George Starostin,
Vita V. Stepanova,
Igor V. Travinsky,
Martin Tříska,
Petr Tříska,
Tatiana V. Tatarinova
Abstract:
The Kets, an ethnic group in the Yenisei River basin, Russia, are considered the last nomadic hunter-gatherers of Siberia, and the Ket language has no transparent affiliation with any language family. We investigated connections between the Kets and Siberian and North American populations, with emphasis on the Mal'ta and Paleo-Eskimo ancient genomes, using original data from 46 unrelated samples of Kets and 42 samples of their neighboring ethnic groups (Uralic-speaking Nganasans, Enets, and Selkups). We genotyped over 130,000 autosomal SNPs, determined mitochondrial and Y-chromosomal haplogroups, and performed high-coverage genome sequencing of two Ket individuals. We established that the Kets belong to the cluster of Siberian populations related to Paleo-Eskimos. Unlike other members of this cluster (Nganasans, Ulchi, Yukaghirs, and Evens), Kets and closely related Selkups have a high degree of Mal'ta ancestry. Implications of these findings for the linguistic hypothesis uniting Ket and Na-Dene languages into a language macrofamily are discussed.
Submitted 12 August, 2015;
originally announced August 2015.
-
Elucidation of differential response networks from toxicogenomics data
Authors:
Z. Dezso,
R. Welch,
V. Kazandaev,
A. Naito,
J. Fuscoe,
C. Melvin,
Y. Dragan,
Y. Nikolsky,
T. Nikolskaya,
A. Bugrim
Abstract:
We describe a novel approach to the analysis of toxicogenomics data and elucidation of biological networks affected by drug treatments. In this method, approximately 15,000 linear pathway modules were generated from manually assembled pathway maps from MetaCore (GeneGo, Inc.). Microarray expression data from livers of rats exposed to phenobarbital, mestranol and tamoxifen were mapped onto these modules. Using different analytical techniques, we have identified sets of "differential" pathways featuring highly correlated expression among multiple repeats of the same treatment while showing strong anti-correlation across different treatments. Network modules distinguishing chemical treatments were re-assembled based on these pathways. Unlike traditional statistical and clustering procedures in expression profiling, our method takes into account both network connectivity and gene expression in the course of the analysis. We demonstrate that it enables identification of important cellular mechanisms involved in drug response that would have been missed by the analysis based on individual gene expression profiles.
Submitted 23 May, 2008;
originally announced May 2008.