Modeling of African population history using f-statistics is biased when applying all previously proposed SNP ascertainment schemes
P. Flegontov, U. Işıldak, R. Maier, E. Yüncü, P. Changmai, D. Reich
Language: English
Country: United States
Document type: Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't
Grant support
R01 HG012287
NHGRI NIH HHS - United States
NLK
Directory of Open Access Journals
from 2005
Free Medical Journals
from 2005
Public Library of Science (PLoS)
from 2005-07-01
PubMed Central
from 2005
Europe PubMed Central
from 2005
ProQuest Central
from 2005-07-01
Open Access Digital Library
from 2005-01-01
Open Access Digital Library
from 2005-07-01
Medline Complete (EBSCOhost)
from 2005-07-01
Health & Medicine (ProQuest)
from 2005-07-01
- MeSH
- African People * genetics
- Biological Variation, Population genetics
- Black People genetics
- Demography * history
- Phylogeny *
- Genotype
- Polymorphism, Single Nucleotide * genetics
- Humans
- Chromosome Mapping
- Neanderthals genetics
- Models, Statistical
- Bias
- Animals
- Check Tag
- Humans
- Animals
- Publication type
- Journal Article
- Research Support, Non-U.S. Gov't
- Research Support, N.I.H., Extramural
f-statistics have emerged as a first line of analysis for making inferences about demographic history from genome-wide data. Not only are they guaranteed to allow robust tests of the fits of proposed models of population history to data when analyzing full genome sequencing data, that is, all single nucleotide polymorphisms (SNPs) in the individuals being analyzed, but they are also guaranteed to allow robust tests of models for SNPs ascertained as polymorphic in a population that is an outgroup in a phylogenetic sense to all groups being analyzed. True "outgroup ascertainment" is in practice impossible in humans because our species has arisen from a substructured ancestral population that does not descend from a homogeneous ancestral population going back many hundreds of thousands of years into the past. However, initial studies suggested that non-outgroup-ascertainment schemes might produce robust enough results using f-statistics, and that motivated widespread fitting of models to data using non-outgroup-ascertained SNP panels such as the "Affymetrix Human Origins array" which has been genotyped on thousands of modern individuals from hundreds of populations, or the "1240k" in-solution enrichment reagent which has been the source of about 70% of published genome-wide data for ancient humans. In this study, we show that while analyses of population history using such panels work well for studies of relationships among non-African populations and one African outgroup, when co-modeling more than one sub-Saharan African and/or archaic human groups (Neanderthals and Denisovans), fitting of f-statistics to such SNP sets is expected to frequently lead to false rejection of true demographic histories, and failure to reject incorrect models. Analyzing panels of SNPs polymorphic in archaic humans, which has been suggested as a solution for the ascertainment problem, has limited statistical power and retains important biases.
However, by carrying out simulations of diverse demographic histories, we show that bias in inferences based on f-statistics can be minimized by ascertaining on variants common in a union of diverse African groups; such ascertainment retains high statistical power while allowing co-analysis of archaic and modern groups.
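As a rough illustration of the quantities the abstract refers to, the Python sketch below computes f2 and f4 from per-population allele frequencies and applies a simple ascertainment filter. It is a minimal sketch, not the authors' pipeline: the function names, the toy frequencies, and the `min_freq` threshold are assumptions made for illustration, and it uses the plain estimators without the finite-sample corrections that f2 estimators normally include.

```python
from statistics import mean

def f2(p_a, p_b):
    """Mean squared allele-frequency difference between two populations."""
    return mean((a - b) ** 2 for a, b in zip(p_a, p_b))

def f4(p_a, p_b, p_c, p_d):
    """Mean product of frequency differences, f4(A, B; C, D)."""
    return mean((a - b) * (c - d)
                for a, b, c, d in zip(p_a, p_b, p_c, p_d))

def ascertain(freqs, panel_pop, min_freq=0.0):
    """Keep only SNPs whose frequency in panel_pop lies strictly between
    min_freq and 1 - min_freq (hypothetical helper; with min_freq=0.0 this
    means 'polymorphic in the ascertainment population')."""
    keep = [i for i, p in enumerate(freqs[panel_pop])
            if min_freq < p < 1 - min_freq]
    return {name: [ps[i] for i in keep] for name, ps in freqs.items()}

# Toy allele frequencies at five SNPs (made-up values, not real data).
freqs = {
    "A": [0.0, 0.5, 1.0, 0.25, 0.0],
    "B": [0.0, 0.5, 0.5, 0.75, 1.0],
    "C": [0.5, 0.0, 1.0, 0.25, 0.0],
    "D": [0.5, 1.0, 0.0, 0.5, 0.0],
}

# f4 on all sites versus f4 on sites polymorphic in population "A".
f4_all = f4(freqs["A"], freqs["B"], freqs["C"], freqs["D"])
panel = ascertain(freqs, "A")
f4_panel = f4(panel["A"], panel["B"], panel["C"], panel["D"])
print(f4_all, f4_panel)
```

Comparing `f4_all` with `f4_panel` shows how the choice of ascertainment population changes which SNPs contribute to the statistic, which is the mechanism behind the biases the abstract describes.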
Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America
Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia
Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts, United States of America
Kalmyk Research Center of the Russian Academy of Sciences, Elista, Russia
- 000
- 00000naa a2200000 a 4500
- 001
- bmc23016266
- 003
- CZ-PrNML
- 005
- 20231026110057.0
- 007
- ta
- 008
- 231013s2023 xxu f 000 0|eng||
- 009
- AR
- 024 7_
- $a 10.1371/journal.pgen.1010931 $2 doi
- 035 __
- $a (PubMed)37676865
- 040 __
- $a ABA008 $b cze $d ABA008 $e AACR2
- 041 0_
- $a eng
- 044 __
- $a xxu
- 100 1_
- $a Flegontov, Pavel $u Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America $u Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia $u Kalmyk Research Center of the Russian Academy of Sciences, Elista, Russia $1 https://orcid.org/0000000197594981
- 245 10
- $a Modeling of African population history using f-statistics is biased when applying all previously proposed SNP ascertainment schemes / $c P. Flegontov, U. Işıldak, R. Maier, E. Yüncü, P. Changmai, D. Reich
- 520 9_
- $a f-statistics have emerged as a first line of analysis for making inferences about demographic history from genome-wide data. Not only are they guaranteed to allow robust tests of the fits of proposed models of population history to data when analyzing full genome sequencing data, that is, all single nucleotide polymorphisms (SNPs) in the individuals being analyzed, but they are also guaranteed to allow robust tests of models for SNPs ascertained as polymorphic in a population that is an outgroup in a phylogenetic sense to all groups being analyzed. True "outgroup ascertainment" is in practice impossible in humans because our species has arisen from a substructured ancestral population that does not descend from a homogeneous ancestral population going back many hundreds of thousands of years into the past. However, initial studies suggested that non-outgroup-ascertainment schemes might produce robust enough results using f-statistics, and that motivated widespread fitting of models to data using non-outgroup-ascertained SNP panels such as the "Affymetrix Human Origins array" which has been genotyped on thousands of modern individuals from hundreds of populations, or the "1240k" in-solution enrichment reagent which has been the source of about 70% of published genome-wide data for ancient humans. In this study, we show that while analyses of population history using such panels work well for studies of relationships among non-African populations and one African outgroup, when co-modeling more than one sub-Saharan African and/or archaic human groups (Neanderthals and Denisovans), fitting of f-statistics to such SNP sets is expected to frequently lead to false rejection of true demographic histories, and failure to reject incorrect models. Analyzing panels of SNPs polymorphic in archaic humans, which has been suggested as a solution for the ascertainment problem, has limited statistical power and retains important biases.
However, by carrying out simulations of diverse demographic histories, we show that bias in inferences based on f-statistics can be minimized by ascertaining on variants common in a union of diverse African groups; such ascertainment retains high statistical power while allowing co-analysis of archaic and modern groups.
- 650 _2
- $a zvířata $7 D000818
- 650 _2
- $a lidé $7 D006801
- 650 _2
- $a černoši $x genetika $7 D044383
- 650 _2
- $a mapování chromozomů $7 D002874
- 650 _2
- $a genotyp $7 D005838
- 650 _2
- $a neandertálci $x genetika $7 D059125
- 650 12
- $a fylogeneze $7 D010802
- 650 12
- $a jednonukleotidový polymorfismus $x genetika $7 D020641
- 650 12
- $a Afričané $x genetika $7 D000094842
- 650 12
- $a demografie $x dějiny $7 D003710
- 650 _2
- $a biologická variabilita populace $x genetika $7 D000073537
- 650 _2
- $a statistické modely $7 D015233
- 650 _2
- $a zkreslení výsledků (epidemiologie) $7 D015982
- 655 _2
- $a časopisecké články $7 D016428
- 655 _2
- $a Research Support, N.I.H., Extramural $7 D052061
- 655 _2
- $a práce podpořená grantem $7 D013485
- 700 1_
- $a Işıldak, Ulaş $u Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia $1 https://orcid.org/0000000164976254
- 700 1_
- $a Maier, Robert $u Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America
- 700 1_
- $a Yüncü, Eren $u Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia
- 700 1_
- $a Changmai, Piya $u Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czechia
- 700 1_
- $a Reich, David $u Department of Human Evolutionary Biology, Harvard University, Cambridge, Massachusetts, United States of America $u Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America $u Howard Hughes Medical Institute, Harvard Medical School, Boston, Massachusetts, United States of America $u Broad Institute of Harvard and MIT, Cambridge, Massachusetts, United States of America $1 https://orcid.org/0000000270375292 $7 jo20191025226
- 773 0_
- $w MED00008920 $t PLoS genetics $x 1553-7404 $g Roč. 19, č. 9 (2023), s. e1010931
- 856 41
- $u https://pubmed.ncbi.nlm.nih.gov/37676865 $y Pubmed
- 910 __
- $a ABA008 $b sig $c sign $y - $z 0
- 990 __
- $a 20231013 $b ABA008
- 991 __
- $a 20231026110051 $b ABA008
- 999 __
- $a ok $b bmc $g 2000029 $s 1202628
- BAS __
- $a 3
- BAS __
- $a PreBMC-MEDLINE
- BMC __
- $a 2023 $b 19 $c 9 $d e1010931 $e 20230907 $i 1553-7404 $m PLoS genetics $n PLoS Genet $x MED00008920
- GRA __
- $a R01 HG012287 $p NHGRI NIH HHS $2 United States
- LZP __
- $a Pubmed-20231013