
Improving structural variant clustering to reduce the negative effect of the breakpoint uncertainty problem

J. Geryk, A. Zinkova, I. Zedníková, H. Simková, V. Stenzl, M. Korabecna

BMC Bioinformatics. 2021 ; 22 (1) : 464. [pub] 20210927

Language: English. Country: United Kingdom

Document type: journal articles

Persistent link   https://www.medvik.cz/link/bmc21024940

Grant support
VI20172020102 Ministry of Interior of the Czech Republic

BACKGROUND: Structural variants (SVs) represent an important source of genetic variation. One of the most critical problems in their detection is breakpoint uncertainty, that is, the inability to determine their exact genomic position. Breakpoint uncertainty is a characteristic issue of structural variants detected via short-read sequencing methods, and it complicates subsequent population analyses. A commonly used heuristic strategy reduces this issue by clustering/merging nearby structural variants of the same type before the data from individual samples are merged.

RESULTS: We compared the two most commonly used dissimilarity measures for SV clustering in terms of Mendelian inheritance errors (MIE), kinship prediction, and deviation from Hardy-Weinberg equilibrium. As a new measure of dataset consistency, we analyzed the occurrence of Mendelian-inconsistent SV clusters that can be collapsed into one Mendelian-consistent SV. We also developed a new method based on constrained clustering that explicitly identifies these types of clusters.

CONCLUSIONS: We found that the dissimilarity measure based on the distance between SV breakpoints produces slightly better results than the measure based on SV overlap. This difference is evident in the trivial and corrected clustering strategies, but not in the constrained clustering strategy. However, the constrained clustering strategy provided the best results in all aspects, regardless of the dissimilarity measure used.

Citation provided by Crossref.org

000     00000naa a2200000 a 4500
001     bmc21024940
003     CZ-PrNML
005     20211026134249.0
007     ta
008     211013s2021 xxk f 000 0|eng||
009     AR
024 7_  $a 10.1186/s12859-021-04374-3 $2 doi
035 __  $a (PubMed)34579642
040 __  $a ABA008 $b cze $d ABA008 $e AACR2
041 0_  $a eng
044 __  $a xxk
100 1_  $a Geryk, Jan $u Department of Biology and Medical Genetics, Second Faculty of Medicine, Charles University and University Hospital Motol, V Úvalu 84, 15006, Prague, Czech Republic. jan.geryk@fnmotol.cz $u Department of Biology and Medical Genetics, First Faculty of Medicine, Charles University and General University Hospital in Prague, Albertov 4, 128 00, Prague, Czech Republic. jan.geryk@fnmotol.cz
245 10  $a Improving structural variant clustering to reduce the negative effect of the breakpoint uncertainty problem / $c J. Geryk, A. Zinkova, I. Zedníková, H. Simková, V. Stenzl, M. Korabecna
520 9_  $a BACKGROUND: Structural variants (SVs) represent an important source of genetic variation. One of the most critical problems in their detection is breakpoint uncertainty associated with the inability to determine their exact genomic position. Breakpoint uncertainty is a characteristic issue of structural variants detected via short-read sequencing methods and complicates subsequent population analyses. The commonly used heuristic strategy reduces this issue by clustering/merging nearby structural variants of the same type before the data from individual samples are merged. RESULTS: We compared the two most used dissimilarity measures for SV clustering in terms of Mendelian inheritance errors (MIE), kinship prediction, and deviation from Hardy-Weinberg equilibrium. We analyzed the occurrence of Mendelian-inconsistent SV clusters that can be collapsed into one Mendelian-consistent SV as a new measure of dataset consistency. We also developed a new method based on constrained clustering that explicitly identifies these types of clusters. CONCLUSIONS: We found that the dissimilarity measure based on the distance between SVs breakpoints produces slightly better results than the measure based on SVs overlap. This difference is evident in trivial and corrected clustering strategy, but not in constrained clustering strategy. However, constrained clustering strategy provided the best results in all aspects, regardless of the dissimilarity measure used.
650 _2  $a shluková analýza $7 D016000
650 12  $a genom lidský $7 D015894
650 12  $a strukturální variace genomu $7 D056914
650 _2  $a genomika $7 D023281
650 _2  $a vysoce účinné nukleotidové sekvenování $7 D059014
650 _2  $a lidé $7 D006801
650 _2  $a nejistota $7 D035501
655 _2  $a časopisecké články $7 D016428
700 1_  $a Zinkova, Alzbeta $u Department of Biology and Medical Genetics, First Faculty of Medicine, Charles University and General University Hospital in Prague, Albertov 4, 128 00, Prague, Czech Republic
700 1_  $a Zedníková, Iveta $u Department of Biology and Medical Genetics, First Faculty of Medicine, Charles University and General University Hospital in Prague, Albertov 4, 128 00, Prague, Czech Republic
700 1_  $a Simková, Halina $u Department of Biology and Medical Genetics, First Faculty of Medicine, Charles University and General University Hospital in Prague, Albertov 4, 128 00, Prague, Czech Republic
700 1_  $a Stenzl, Vlastimil $u Department of Forensic Genetics, Institute of Criminalistics, Strojnická 27, 170 89, Prague, Czech Republic
700 1_  $a Korabecna, Marie $u Department of Biology and Medical Genetics, First Faculty of Medicine, Charles University and General University Hospital in Prague, Albertov 4, 128 00, Prague, Czech Republic
773 0_  $w MED00008167 $t BMC bioinformatics $x 1471-2105 $g Roč. 22, č. 1 (2021), s. 464
856 41  $u https://pubmed.ncbi.nlm.nih.gov/34579642 $y Pubmed
910 __  $a ABA008 $b sig $c sign $y p $z 0
990 __  $a 20211013 $b ABA008
991 __  $a 20211026134255 $b ABA008
999 __  $a ok $b bmc $g 1714127 $s 1145447
BAS __  $a 3
BAS __  $a PreBMC
BMC __  $a 2021 $b 22 $c 1 $d 464 $e 20210927 $i 1471-2105 $m BMC bioinformatics $n BMC Bioinformatics $x MED00008167
GRA __  $a VI20172020102 $p Ministry of Interior of the Czech Republic
LZP __  $a Pubmed-20211013
