Medvik - BMC

Early prediction for fatty liver disease with eigenvector-based feature selections for model performance enhancement

Ji-Han Liu, Kuo-Chin Huang, Feipei Lai

European Journal for Biomedical Informatics. 2021 ; 17 (8) : 18-34.

Status: minimal; Language: English; Country: Czech Republic

Objectives: This study aims to rapidly optimize an input feature subset that satisfies the expert's point of view and to enhance the prediction performance of the early prediction model for fatty liver disease (FLD). Methods: We explore a large-scale, high-dimensional dataset from a northern Taipei Health Screening Center in Taiwan. The dataset includes data on 12,707 male and 10,601 female patients, processed from around 500,000 records from 2009 to 2016. We propose three eigenvector-based feature selections that use the Intersection over Union (IoU) and the Coverage to automatically determine the sub-optimal feature subset with the highest IoU and Coverage, use various long short-term memory (LSTM) related classifiers for FLD prediction, and evaluate model performance by the test accuracy and the Area Under the Receiver Operating Characteristic Curve (AUROC). Results: Our eigenvector-based feature selection EFS-TW has the highest IoU and Coverage and the shortest total computing time. For comparison, the highest IoU, the Coverage, and the computing time are 30.56%, 45.83%, and 260 seconds for female patients, and those of a benchmark, sequential forward selection (SFS), are 9.09%, 16.67%, and 380,350 seconds. The AUROC values with LSTM, biLSTM, Gated Recurrent Unit (GRU), Stack-LSTM, and Stack-biLSTM are 0.85, 0.86, 0.86, 0.86, and 0.87, respectively, for male patients, and all 0.9 for female patients. Conclusion: Our method explores a large-scale, high-dimensional FLD dataset, implements three efficient and automatic eigenvector-based feature selections, and efficiently develops a model for early prediction of FLD.
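
The abstract scores a selected feature subset against the expert's features using the IoU and the Coverage, but this record does not define either measure. As a minimal sketch, assume IoU is the standard set ratio (shared features divided by the union of both sets) and assume Coverage is the fraction of expert-chosen features that the selection recovers. The function names and feature names below are hypothetical and are not taken from the paper.

```python
def iou(selected, expert):
    """Intersection over Union of two feature sets (standard set definition)."""
    selected, expert = set(selected), set(expert)
    union = selected | expert
    return len(selected & expert) / len(union) if union else 0.0


def coverage(selected, expert):
    """Assumed definition: share of expert features present in the selection."""
    selected, expert = set(selected), set(expert)
    return len(selected & expert) / len(expert) if expert else 0.0


# Hypothetical feature names, for illustration only.
expert_features = {"bmi", "waist", "alt", "triglycerides"}
selected_features = {"bmi", "alt", "age"}

iou_score = iou(selected_features, expert_features)            # 2 shared / 5 in union
coverage_score = coverage(selected_features, expert_features)  # 2 shared / 4 expert
```

Under these assumptions a selection can reach full Coverage while still having a low IoU if it also includes many features the expert did not choose, which is one reason to report both numbers.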

Bibliography, etc.

Literature

000      
00000naa a2200000 a 4500
001      
bmc21023542
003      
CZ-PrNML
005      
20211101115736.0
007      
cr|cn|
008      
210928s2021 xr ad fs 000 0|eng||
009      
eAR
024    7_
$2 doi $a 10.24105/ejbi.2021.17.8.18-34
040    __
$a ABA008 $d ABA008 $e AACR2 $b cze
041    0_
$a eng
044    __
$a xr
100    1_
$a Liu, Ji-Han $u Graduate Institute of Networking and Multimedia, National Taiwan University, Taipei, Taiwan
245    10
$a Early prediction for fatty liver disease with eigenvector-based feature selections for model performance enhancement / $c Ji-Han Liu, Kuo-Chin Huang, Feipei Lai
504    __
$a Literature
520    9_
$a Objectives: This study aims to rapidly optimize an input feature subset that satisfies the expert's point of view and to enhance the prediction performance of the early prediction model for fatty liver disease (FLD). Methods: We explore a large-scale, high-dimensional dataset from a northern Taipei Health Screening Center in Taiwan. The dataset includes data on 12,707 male and 10,601 female patients, processed from around 500,000 records from 2009 to 2016. We propose three eigenvector-based feature selections that use the Intersection over Union (IoU) and the Coverage to automatically determine the sub-optimal feature subset with the highest IoU and Coverage, use various long short-term memory (LSTM) related classifiers for FLD prediction, and evaluate model performance by the test accuracy and the Area Under the Receiver Operating Characteristic Curve (AUROC). Results: Our eigenvector-based feature selection EFS-TW has the highest IoU and Coverage and the shortest total computing time. For comparison, the highest IoU, the Coverage, and the computing time are 30.56%, 45.83%, and 260 seconds for female patients, and those of a benchmark, sequential forward selection (SFS), are 9.09%, 16.67%, and 380,350 seconds. The AUROC values with LSTM, biLSTM, Gated Recurrent Unit (GRU), Stack-LSTM, and Stack-biLSTM are 0.85, 0.86, 0.86, 0.86, and 0.87, respectively, for male patients, and all 0.9 for female patients. Conclusion: Our method explores a large-scale, high-dimensional FLD dataset, implements three efficient and automatic eigenvector-based feature selections, and efficiently develops a model for early prediction of FLD.
590    __
$a NOT INDEXED
700    1_
$a Huang, Kuo-Chin $u Department of Family Medicine, College of Medicine, National Taiwan University, Taipei, Taiwan
700    1_
$a Lai, Feipei $u Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan
773    0_
$t European journal for biomedical informatics $x 1801-5603 $g Vol. 17, No. 8 (2021), pp. 18-34 $w MED00173462
856    41
$u http://www.ejbi.org/ $y journal homepage - full text freely accessible
910    __
$a ABA008 $b online $y 0 $z 0
990    __
$a 20210927121633 $b ABA008
991    __
$a 20211101120601 $b ABA008
999    __
$a min $b bmc $g 1702515 $s 1144035
BAS    __
$a 3 $a 4
BMC    __
$a 2021 $b 17 $c 8 $d 18-34 $i 1801-5603 $m European Journal for Biomedical Informatics $n Eur. J. Biomed. Inform. (Praha) $x MED00173462
LZP    __
$a NLK 2021-40/dk
