AccuCalc: A Python Package for Accuracy Calculation in GWAS

J. Biová, N. Dietz, YO. Chan, T. Joshi, K. Bilyeu, M. Škrabišová

Genes. 2023; 14 (1). [pub] 20230101

Language: English. Country: Switzerland

Document type: journal articles, grant-supported work

Persistent link   https://www.medvik.cz/link/bmc23004700

The genome-wide association study (GWAS) is a popular genomic approach that identifies genomic regions associated with a phenotype and thus aims to discover causative mutations (CMs) in the genes underlying the phenotype. However, GWAS discoveries are limited by many factors and typically identify associated genomic regions without any further means of comparing the viability of candidate genes and actual CMs. The current methodology is therefore limited in its ability to identify CMs. In our recent work, we presented a novel approach to an empowered "GWAS to Genes" strategy that we named Synthetic phenotype to causative mutation (SP2CM). We established this strategy to identify CMs in soybean genes and developed a web-based tool for accuracy calculation (AccuTool) for a reference panel of soybean accessions. Here, we describe the further development of that tool, named AccuCalc, which extends its use to other species. We enhanced the tool for the analysis of datasets with a low-frequency distribution of a rare phenotype by automating the formatting of a synthetic phenotype, and we added another accuracy-based GWAS evaluation criterion to the accuracy calculation. We designed AccuCalc as a Python package for GWAS data analysis that accepts any user-defined, species-independent Variant Call Format (VCF) or HapMap format (hmp) file as input data. AccuCalc saves analysis outputs in user-friendly tab-delimited formats and also offers visualization of the GWAS results as Manhattan plots accentuated by accuracy. Implemented in Python, AccuCalc is publicly available and can thus be used conveniently to apply the SP2CM strategy to any species.
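
As a rough illustration of what an accuracy calculation of this kind can look like, the following Python sketch scores how well the alleles at one variant position agree with a binary phenotype. It is not AccuCalc's API or the authors' exact formula: the function name `variant_accuracy`, the labels `WT` and `MUT`, and both accuracy definitions (overall agreement and the mean of per-class agreement) are assumptions made for this example.

```python
import math


def variant_accuracy(genotypes, phenotypes, alt_allele, mutant_label="MUT"):
    """Score agreement between alleles at one position and a binary phenotype.

    Hypothetical helper, not part of AccuCalc. An accession counts as a match
    when it carries `alt_allele` and is mutant, or carries any other allele
    and is not mutant. Accessions with a missing value (None) are skipped.
    """
    # Correct/total counts per phenotype class: True = mutant, False = wild type.
    correct = {True: 0, False: 0}
    total = {True: 0, False: 0}
    for allele, phenotype in zip(genotypes, phenotypes):
        if allele is None or phenotype is None:
            continue
        is_mutant = phenotype == mutant_label
        total[is_mutant] += 1
        if (allele == alt_allele) == is_mutant:
            correct[is_mutant] += 1

    n = total[True] + total[False]
    overall = (correct[True] + correct[False]) / n if n else math.nan
    # Averaging per-class accuracies keeps a rare phenotype class from being
    # swamped by the common one (an assumed definition, for illustration).
    per_class = [correct[c] / total[c] for c in (True, False) if total[c]]
    class_averaged = sum(per_class) / len(per_class) if per_class else math.nan
    return {"overall": overall, "class_averaged": class_averaged}


# Toy data: four wild-type accessions and two mutants, one of which lacks "T".
scores = variant_accuracy(
    genotypes=["A", "A", "A", "A", "T", "A"],
    phenotypes=["WT", "WT", "WT", "WT", "MUT", "MUT"],
    alt_allele="T",
)
print(scores)
```

On this toy panel the overall agreement is 5/6, while the class-averaged value is 0.75, because only half of the rare mutant class is matched.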

000      
00000naa a2200000 a 4500
001      
bmc23004700
003      
CZ-PrNML
005      
20230425171653.0
007      
ta
008      
230418s2023 sz f 000 0|eng||
009      
AR
024    7_
$a 10.3390/genes14010123 $2 doi
035    __
$a (PubMed)36672864
040    __
$a ABA008 $b cze $d ABA008 $e AACR2
041    0_
$a eng
044    __
$a sz
100    1_
$a Biová, Jana $u Department of Biochemistry, Faculty of Science, Palacký University in Olomouc, 78371 Olomouc, Czech Republic $1 https://orcid.org/0000000305908061
245    10
$a AccuCalc: A Python Package for Accuracy Calculation in GWAS / $c J. Biová, N. Dietz, YO. Chan, T. Joshi, K. Bilyeu, M. Škrabišová
520    9_
$a The genome-wide association study (GWAS) is a popular genomic approach that identifies genomic regions associated with a phenotype and thus aims to discover causative mutations (CMs) in the genes underlying the phenotype. However, GWAS discoveries are limited by many factors and typically identify associated genomic regions without any further means of comparing the viability of candidate genes and actual CMs. The current methodology is therefore limited in its ability to identify CMs. In our recent work, we presented a novel approach to an empowered "GWAS to Genes" strategy that we named Synthetic phenotype to causative mutation (SP2CM). We established this strategy to identify CMs in soybean genes and developed a web-based tool for accuracy calculation (AccuTool) for a reference panel of soybean accessions. Here, we describe the further development of that tool, named AccuCalc, which extends its use to other species. We enhanced the tool for the analysis of datasets with a low-frequency distribution of a rare phenotype by automating the formatting of a synthetic phenotype, and we added another accuracy-based GWAS evaluation criterion to the accuracy calculation. We designed AccuCalc as a Python package for GWAS data analysis that accepts any user-defined, species-independent Variant Call Format (VCF) or HapMap format (hmp) file as input data. AccuCalc saves analysis outputs in user-friendly tab-delimited formats and also offers visualization of the GWAS results as Manhattan plots accentuated by accuracy. Implemented in Python, AccuCalc is publicly available and can thus be used conveniently to apply the SP2CM strategy to any species.
650    12
$a genome-wide association study $x methods $7 D055106
650    12
$a genomics $x methods $7 D023281
650    _2
$a genome $7 D016678
650    _2
$a phenotype $7 D010641
650    _2
$a mutation $7 D009154
655    _2
$a journal articles $7 D016428
655    _2
$a grant-supported work $7 D013485
700    1_
$a Dietz, Nicholas $u Division of Plant Sciences, University of Missouri, Columbia, MO 65201, USA $1 https://orcid.org/0000000175368166
700    1_
$a Chan, Yen On $u Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65212, USA $u MU Data Science and Informatics Institute, University of Missouri, Columbia, MO 65212, USA
700    1_
$a Joshi, Trupti $u Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65212, USA $u MU Data Science and Informatics Institute, University of Missouri, Columbia, MO 65212, USA $u Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65212, USA $u Department of Health Management and Informatics, School of Medicine, University of Missouri, Columbia, MO 65212, USA $1 https://orcid.org/0000000189444924
700    1_
$a Bilyeu, Kristin $u Plant Genetics Research Unit, United States Department of Agriculture-Agricultural Research Service, University of Missouri, Columbia, MO 65211, USA $1 https://orcid.org/0000000241414790
700    1_
$a Škrabišová, Mária $u Department of Biochemistry, Faculty of Science, Palacký University in Olomouc, 78371 Olomouc, Czech Republic $1 https://orcid.org/0000000328791062
773    0_
$w MED00174652 $t Genes $x 2073-4425 $g Vol. 14, No. 1 (2023)
856    41
$u https://pubmed.ncbi.nlm.nih.gov/36672864 $y Pubmed
910    __
$a ABA008 $b sig $c sign $y p $z 0
990    __
$a 20230418 $b ABA008
991    __
$a 20230425171649 $b ABA008
999    __
$a ok $b bmc $g 1925032 $s 1190909
BAS    __
$a 3
BAS    __
$a PreBMC-MEDLINE
BMC    __
$a 2023 $b 14 $c 1 $e 20230101 $i 2073-4425 $m Genes $n Genes $x MED00174652
LZP    __
$a Pubmed-20230418