Query: Data clustering problems

84 hits in PubMed

Article

Enhanced aquila optimizer for global optimization and data clustering

... However, the AO faces some challenges when dealing with high-dimensional optimization problems due to ...

Authors
Abualigah, Laith: Computer Science Department, Al al-Bayt University, Mafraq, 25113, Jordan. aligah.2020@gmail.com
Alomari, Saleh Ali: Faculty of Information Technology, Jadara University, Irbid, 21110, Jordan
Almomani, Mohammad H: Department of Mathematics, Facility of Science, The Hashemite University, P.O Box 330127, Zarqa, 13133, Jordan
Abu Zitar, Raed: Faculty of Engineering and Computing, Liwa College, Abu Dhabi, United Arab Emirates
Migdady, Hazem: CSMIS Department, Oman College of Management and Technology, 320, Barka, Oman
Saleem, Kashif: Department of Computer Science & Engineering, College of Applied Studies & Community Service, King Saud University, 11362, Riyadh, Saudi Arabia
Smerat, Aseel: Faculty of Educational Sciences, Al-Ahliyya Amman University, Amman, 19328, Jordan; Centre for Research Impact & Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura, Punjab, 140401, India; Computer Technologies Engineering, Mazaya University College, Nasiriyah, Iraq; School of Engineering and Technology, Sunway University Malaysia, Petaling Jaya 27500, Malaysia
Snasel, Vaclav: Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava, 70800, Poruba-Ostrava, Czech Republic
Gandomi, Amir H: Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, NSW, 2007, Australia; University Research and Innovation Center (EKIK), Óbuda University, 1034, Budapest, Hungary; Department of Computer Science, Khazar University, Baku, Azerbaijan. Gandomi@uts.edu.au

Scientific reports. 2025 Apr 16 ; 15 (1) : 13079. [epub] 20250416

Sci Rep
ISSN 2045-2322
Source

The Aquila Optimizer (AO) is a newly proposed, highly capable metaheuristic algorithm based on the hunting and search behavior of the Aquila bird. However, the AO faces some challenges when dealing with high-dimensional optimization problems due to its narrow exploration capabilities and a tendency to converge prematurely to local optima, which can decrease its performance in complex scenarios. This paper presents a modified form of the previously proposed AO, the Locality Opposition-Based Learning Aquila Optimizer (LOBLAO), aimed at resolving such issues and improving the performance of tasks related to global optimization and data clustering in particular. The proposed LOBLAO incorporates two key advancements: the Opposition-Based Learning (OBL) strategy, which enhances solution diversity and balances exploration and exploitation, and the Mutation Search Strategy (MSS), which mitigates the risk of local optima and ensures robust exploration of the search space. Comprehensive experiments on benchmark test functions and data clustering problems demonstrate the efficacy of LOBLAO. The results reveal that LOBLAO outperforms the original AO and several state-of-the-art optimization algorithms, showcasing superior performance in tackling high-dimensional datasets. In particular, LOBLAO achieved the best average ranking of 1.625 across multiple clustering problems, underscoring its robustness and versatility. These findings highlight the significant potential of LOBLAO to solve diverse and challenging optimization problems, establishing it as a valuable tool for researchers and practitioners.
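
The two ingredients named in this abstract can be illustrated with a small sketch. It is not the authors' LOBLAO implementation: the abstract does not specify the locality variant of OBL or the exact mutation operator, so the code below assumes the standard opposition formula (opposite = lower + upper - x) and a hypothetical uniform-reset mutation. The function names and the sphere objective are illustrative only.

```python
import random


def opposite(x, lower, upper):
    """Standard opposition-based learning: mirror x inside [lower, upper]."""
    return [lo + hi - xi for xi, lo, hi in zip(x, lower, upper)]


def sphere(x):
    """Toy objective to minimize; a stand-in for a benchmark test function."""
    return sum(xi * xi for xi in x)


def obl_step(population, lower, upper, objective):
    """For each candidate, keep whichever of (candidate, opposite) scores lower."""
    return [min(x, opposite(x, lower, upper), key=objective) for x in population]


def mutate(x, lower, upper, rate=0.1, rng=random):
    """Hypothetical mutation (an assumption, not the paper's MSS):
    resample each coordinate uniformly within its bounds with probability `rate`."""
    return [
        rng.uniform(lo, hi) if rng.random() < rate else xi
        for xi, lo, hi in zip(x, lower, upper)
    ]


if __name__ == "__main__":
    lower, upper = [0.0, 0.0], [10.0, 10.0]
    population = [[9.0, 9.0], [2.0, 3.0]]
    # [9, 9] is replaced by its opposite [1, 1]; [2, 3] already beats [8, 7].
    print(obl_step(population, lower, upper, sphere))
```

Evaluating the opposite point costs one extra objective call per candidate, which is the usual trade-off of OBL: more evaluations in exchange for a wider spread of candidates.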

Keywords
Aquila optimizer, Data clustering problems, Meta-heuristics optimization algorithms, Opposition-based learning, Optimization problems
Publication type
Journal Article MeSH

Article

Improving structural variant clustering to reduce the negative effect of the breakpoint uncertainty problem

... One of the most critical problems in their detection is breakpoint uncertainty associated with the inability ...

BMC bioinformatics. 2021 Sep 27 ; 22 (1) : 464. [epub] 20210927

BMC Bioinformatics
ISSN 1471-2105
Source

BACKGROUND: Structural variants (SVs) represent an important source of genetic variation. One of the most critical problems in their detection is breakpoint uncertainty associated with the inability to determine their exact genomic position. Breakpoint uncertainty is a characteristic issue of structural variants detected via short-read sequencing methods and complicates subsequent population analyses. The commonly used heuristic strategy reduces this issue by clustering/merging nearby structural variants of the same type before the data from individual samples are merged. RESULTS: We compared the two most used dissimilarity measures for SV clustering in terms of Mendelian inheritance errors (MIE), kinship prediction, and deviation from Hardy-Weinberg equilibrium. We analyzed the occurrence of Mendelian-inconsistent SV clusters that can be collapsed into one Mendelian-consistent SV as a new measure of dataset consistency. We also developed a new method based on constrained clustering that explicitly identifies these types of clusters. CONCLUSIONS: We found that the dissimilarity measure based on the distance between SV breakpoints produces slightly better results than the measure based on SV overlap. This difference is evident in the trivial and corrected clustering strategies, but not in the constrained clustering strategy. However, the constrained clustering strategy provided the best results in all aspects, regardless of the dissimilarity measure used.
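
To make the two kinds of dissimilarity measure concrete, here is a minimal sketch. The abstract does not give the exact formulas, so both definitions are assumptions: the breakpoint-based measure is taken as the larger of the start and end shifts, and the overlap-based measure as one minus the reciprocal overlap. SVs are represented as hypothetical (start, end) tuples on the same chromosome and of the same type.

```python
def breakpoint_distance(a, b):
    """Assumed breakpoint-based dissimilarity: the larger of the two
    shifts between matching breakpoints of SVs a and b, each (start, end)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def overlap_dissimilarity(a, b):
    """Assumed overlap-based dissimilarity: 1 minus the reciprocal overlap,
    i.e. the shared length divided by the length of the longer SV."""
    shared = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    longest = max(a[1] - a[0], b[1] - b[0])
    if longest <= 0:
        return 1.0
    return 1.0 - shared / longest


if __name__ == "__main__":
    sv_a = (1000, 2000)  # hypothetical deletion called in sample A
    sv_b = (1100, 2050)  # hypothetical call of the same event in sample B
    print(breakpoint_distance(sv_a, sv_b))    # shifts are 100 and 50
    print(overlap_dissimilarity(sv_a, sv_b))  # about 0.1 (900 of 1000 bp shared)
```

The two measures can disagree for long variants: a fixed 100 bp breakpoint shift is a large distance in absolute terms but a tiny loss of overlap for a 100 kb event, which is one reason the choice of measure can change the clusters.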

Keywords
Breakpoints uncertainty problem, Constrained clustering, Mendelian inheritance error, Structural variants, Whole genome sequencing
MeSH
Genome, Human * MeSH
Genomics MeSH
Humans MeSH
Uncertainty MeSH
Cluster Analysis MeSH
Genomic Structural Variation * MeSH
High-Throughput Nucleotide Sequencing MeSH
Check Tag
Humans MeSH
Publication type
Journal Article MeSH

Article

Student assessment in cybersecurity training automated by pattern mining and clustering

... During the training, the learning environment allows collecting data about trainees' interactions with ...

Education and information technologies. 2022 ; 27 (7) : 9231-9262. [epub] 20220330

Educ Inf Technol (Dordr)
ISSN 1360-2357
Source

Hands-on cybersecurity training allows students and professionals to practice various tools and improve their technical skills. The training occurs in an interactive learning environment that enables completing sophisticated tasks in full-fledged operating systems, networks, and applications. During the training, the learning environment allows collecting data about trainees' interactions with the environment, such as their usage of command-line tools. These data contain patterns indicative of trainees' learning processes, and revealing them makes it possible to assess the trainees and provide feedback to help them learn. However, automated analysis of these data is challenging. The training tasks feature complex problem-solving, and many different solution approaches are possible. Moreover, the trainees generate vast amounts of interaction data. This paper explores a dataset from 18 cybersecurity training sessions using data mining and machine learning techniques. We employed pattern mining and clustering to analyze 8834 commands collected from 113 trainees, revealing their typical behavior, mistakes, solution strategies, and difficult training stages. Pattern mining proved suitable in capturing timing information and tool usage frequency. Clustering underlined that many trainees often face the same issues, which can be addressed by targeted scaffolding. Our results show that data mining methods are suitable for analyzing cybersecurity training data. Educational researchers and practitioners can apply these methods in their contexts to assess trainees, support them, and improve the training design. Artifacts associated with this research are publicly available.
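
As a rough illustration of what mining tool usage from command logs can look like, the sketch below counts tool frequencies and finds consecutive tool pairs shared across trainees. It is a simple frequency-based stand-in, not the pattern mining method used in the paper, which the abstract does not describe. The log format (one list of command strings per trainee) and the sample commands are assumptions.

```python
from collections import Counter


def tool_frequencies(commands):
    """Count how often each tool (first token of a command line) is used."""
    return Counter(cmd.split()[0] for cmd in commands if cmd.strip())


def frequent_bigrams(sessions, min_support=2):
    """Return consecutive tool pairs seen in at least `min_support` sessions.
    Each pair is counted once per session, however often it repeats there."""
    support = Counter()
    for commands in sessions:
        tools = [cmd.split()[0] for cmd in commands if cmd.strip()]
        support.update(set(zip(tools, tools[1:])))
    return {pair: n for pair, n in support.items() if n >= min_support}


if __name__ == "__main__":
    # Hypothetical logs from three trainees.
    sessions = [
        ["nmap -sV 10.0.0.5", "ssh root@10.0.0.5"],
        ["nmap 10.0.0.5", "ssh admin@10.0.0.5", "ls"],
        ["ls", "cat flag.txt"],
    ]
    print(tool_frequencies(sessions[1]))
    print(frequent_bigrams(sessions))
```

Counting support per session rather than per occurrence keeps one trainee who repeats a command many times from dominating the result.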

Keywords
Cybersecurity education, Data science, Educational data mining, Learning analytics, Security training
Publication type
Journal Article MeSH

Article

Modified ant colony clustering method in long-term electrocardiogram processing

... The paper presents an application of a clustering technique inspired by ant colony metaheuristics. ...

Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual International Conference. 2007 ; 2007 () : 3249-52.

Annu Int Conf IEEE Eng Med Biol Soc
ISSN 2375-7477
Source

The paper presents an application of a clustering technique inspired by ant colony metaheuristics. The paper addresses the problem of long-term (Holter) electrocardiogram data processing. Long-term recording produces a huge amount of biomedical data, which must be preprocessed prior to its presentation to the specialist. The paper also discusses relevant aspects improving the robustness, stability and convergence criteria of the method. The method is compared with well-known clustering techniques (both classical and nature-inspired), first on a known dataset and then on real ECG data records from the MIT-BIH database, where it outperforms the standard methods. Electrocardiogram data clustering can effectively reduce the amount of data presented to the cardiologist: cardiac arrhythmia and significant morphology changes in the ECG can be visually emphasized in a reasonable time. The final evaluation of the ECG recording must still be made by an expert.

MeSH
Algorithms MeSH
Biomimetics methods MeSH
Behavior, Animal MeSH
Diagnosis, Computer-Assisted methods MeSH
Electrocardiography, Ambulatory methods MeSH
Ants physiology MeSH
Humans MeSH
Signal Processing, Computer-Assisted * MeSH
Reproducibility of Results MeSH
Pattern Recognition, Automated methods MeSH
Sensitivity and Specificity MeSH
Cluster Analysis * MeSH
Arrhythmias, Cardiac diagnosis physiopathology MeSH
Heart Rate * MeSH
Animals MeSH
Check Tag
Humans MeSH
Animals MeSH
Publication type
Journal Article MeSH
Evaluation Study MeSH
Research Support, Non-U.S. Gov't MeSH

Article

PDBe: improved findability of macromolecular structure data in the PDB

... The Protein Data Bank in Europe (PDBe), a founding member of the Worldwide Protein Data Bank (wwPDB), ...

Nucleic acids research. 2020 Jan 08 ; 48 (D1) : D335-D343.

Nucleic Acids Res
ISSN 1362-4962 | 0305-1048
Source

The Protein Data Bank in Europe (PDBe), a founding member of the Worldwide Protein Data Bank (wwPDB), actively participates in the deposition, curation, validation, archiving and dissemination of macromolecular structure data. PDBe supports diverse research communities in their use of macromolecular structures by enriching the PDB data and by providing advanced tools and services for effective data access, visualization and analysis. This paper details the enrichment of data at PDBe, including mapping of RNA structures to Rfam, and identification of molecules that act as cofactors. PDBe has developed an advanced search facility with ∼100 data categories and sequence searches. New features have been included in the LiteMol viewer at PDBe, with updated visualization of carbohydrates and nucleic acids. Small molecules are now mapped more extensively to external databases and their visual representation has been enhanced. These advances help users to more easily find and interpret macromolecular structure data in order to solve scientific problems.

Article

How to improve the system of care for adolescents with emotional and behavioural problems from the perspective of care providers: a concept mapping approach

... BACKGROUND: Emotional and behavioural problems (EBP) are the most common mental health issues during ...

Health research policy and systems. 2024 Jan 15 ; 22 (1) : 9. [epub] 20240115

Health Res Policy Syst
ISSN 1478-4505
Source

BACKGROUND: Emotional and behavioural problems (EBP) are the most common mental health issues during adolescence, and their incidence has increased in recent years. The system of care for adolescents with EBP is known to have several problems, making the provision of care less than optimal, and attention needs to be given to potential improvements. We, therefore, aimed to examine what needs to be done to improve the system of care for adolescents with EBP and to assess the urgency and feasibility of the proposed measures from the perspective of care providers. METHODS: We used concept mapping, a participatory mixed-method research approach based on qualitative data collection and quantitative data analysis. A total of 33 stakeholders from 17 institutions participated in our study, including psychologists, pedagogues for children with special needs, teachers, educational counsellors, social workers and child psychiatrists. RESULTS: Respondents identified 43 ideas for improving the system of care for adolescents with EBP, grouped into 5 clusters related to increasing the competencies of care providers, changes at schools and school systems, support for existing services, transparency of the care system in institutions and public administration, and the adjustment of legislative conditions. The most urgent and feasible proposals were related to the support of awareness-raising activities on the topic of EBP, the creation of effective screening tools for the identification of EBP in adolescents, strengthening the role of parents in the process of care, comprehensive work with the family, creation of multidisciplinary support teams and intersectoral cooperation. CONCLUSIONS: Measures which are more accessible and responsive to the pitfalls of the care system, together with those strengthening the role of families and schools, have greater potential for improvements which are in favour of adolescents with EBP. Care providers should be invited more often and much more involved in the discussion and the co-creation of measures to improve the system of care for adolescents with EBP.

Keywords
Adolescents, Care providers, Children, Concept mapping, Emotional and behavioural problems, System of care
MeSH
Child MeSH
Emotions MeSH
Humans MeSH
Adolescent MeSH
Problem Behavior * psychology MeSH
Parents psychology MeSH
Check Tag
Child MeSH
Humans MeSH
Adolescent MeSH
Publication type
Journal Article MeSH

Article

Matcher: An Open-Source Application for Translating Large Structure/Property Data Sets into Insights for Drug Design

... To solve recurring problems in drug discovery, matched molecular pair (MMP) analysis is used to understand ...

Journal of chemical information and modeling. 2023 Apr 10 ; 63 (7) : 1852-1857. [epub] 20230328

J Chem Inf Model
ISSN 1549-960X | 1549-9596
Source

To solve recurring problems in drug discovery, matched molecular pair (MMP) analysis is used to understand relationships between chemical structure and function. For the MMP analysis of large data sets (>10,000 compounds), available tools lack flexible search and visualization functionality and require computational expertise. Here, we present Matcher, an open-source application for MMP analysis, with novel search algorithms and fully automated querying-to-visualization that requires no programming expertise. Matcher enables unprecedented control over the search and clustering of MMP transformations based on both variable fragment and constant environment structure, which is critical for disentangling relevant and irrelevant data to a given problem. Users can exert such control through a built-in chemical sketcher and with a few mouse clicks can navigate between resulting MMP transformations, statistics, property distribution graphs, and structures with raw experimental data, for confident and accelerated decision making. Matcher can be used with any collection of structure/property data; here, we demonstrate usage with a public ChEMBL data set of about 20,000 small molecules with CYP3A4 and/or hERG inhibition data. Users can reproduce all examples demonstrated herein via unique links within Matcher's interface, a functionality that anyone can use to preserve and share their own analyses. Matcher and all its dependencies are open-source, can be used for free, and are available with containerized deployment from code at https://github.com/Merck/Matcher. Matcher makes large structure/property data sets more transparent than ever before and accelerates the data-driven solution of common problems in drug discovery.

Article

The analysis of registry data in relation to various different types of hypothesis regarding the geographical distribution of disease

... For such analyses it will usually also be necessary to have population data for the region covered by ...

Authors
Draper, G J: Childhood Cancer Research Group, University of Oxford, UK

Central European journal of public health. 1997 Jun ; 5 (2) : 90-2.

Cent Eur J Public Health
ISSN 1210-7778
Source

Disease registries will often contain the addresses of cases included in the registry. If the registry includes information on all cases, or deaths, occurring in a defined geographical area and time period and if there is a postcode/zip code or map reference for each case it is possible to carry out a variety of different types of geographical analysis that may give clues to the aetiology of the disease. For such analyses it will usually also be necessary to have population data for the region covered by the registry and for separate sub-regions within it. In this paper we review types of analysis that may be applied to such data and give references to examples of applications and the statistical methods used. These include, first, methods of presenting incidence rates, and particularly the use of maps; of particular concern is the development of methods for presenting data that take into account the problems of rates calculated for small populations and which may therefore happen to be high or low simply by chance. Secondly, we consider the analysis of "clustering" and "clusters" of cases of disease. These problems have been the subject of considerable methodological development in recent years. Analyses of clustering address the question of whether there is a general tendency for there to be aggregations of cases or areas of high incidence; the analysis of clusters is concerned with problems of detecting specific locations where there are unusual aggregations of cases. The third type of problem considered here is whether there are, within the registry region, aetiological factors that vary geographically with consequent variations in disease incidence in different sub-regions. Where there is geographical variation it may be possible to use regression analysis to relate such variation to factors such as socio-economic status or levels of some environmental hazard. Finally, we consider the problem of determining whether disease rates in certain areas may be related to distance from the source of some potential causative agent.
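
The point about rates in small populations being high or low by chance can be shown with a short worked example. The sketch assumes a standard approach that the abstract itself does not name: compare the observed case count with the count expected from a reference rate, and use a Poisson model to ask how surprising the excess is. The population size, reference rate and case count are hypothetical.

```python
import math


def expected_cases(population, rate_per_100k):
    """Cases expected if the area experienced the reference incidence rate."""
    return population * rate_per_100k / 100_000


def standardized_incidence_ratio(observed, expected):
    """Observed cases divided by expected cases (1.0 means no excess)."""
    return observed / expected


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected): the chance of seeing
    at least this many cases when there is no real excess."""
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


if __name__ == "__main__":
    # Hypothetical sub-region: 5,000 residents, reference rate 20 per 100,000.
    exp = expected_cases(5_000, 20)  # one case expected
    obs = 3
    print(standardized_incidence_ratio(obs, exp))  # rate is three times the reference
    print(round(poisson_upper_tail(obs, exp), 3))  # roughly an 8% chance under no excess
```

With only one case expected, a threefold rate arises by chance about 8% of the time, so a map of raw rates for small areas will show many apparent hot spots that reflect nothing more than random variation.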

Article

Multi Swarm Optimization Based Clustering with Tabu Search in Wireless Sensor Network

... Wireless Sensor Networks (WSNs) can be defined as a cluster of sensors with a restricted power supply ...

Sensors (Basel, Switzerland). 2022 Feb 23 ; 22 (5) : . [epub] 20220223

Sensors (Basel)
ISSN 1424-8220
Source

Wireless Sensor Networks (WSNs) can be defined as a cluster of sensors with a restricted power supply deployed in a specific area to gather environmental data. One of the most challenging areas of research is to design energy-efficient data gathering algorithms in large-scale WSNs, as each sensor node, in general, has limited energy resources. Literature review shows that with regard to energy saving, clustering-based techniques for data gathering are quite effective. Moreover, cluster head (CH) optimization is a non-deterministic polynomial (NP)-hard problem. Both the lifespan of the network and its energy efficiency are improved by choosing the optimal path in routing. The technique put forth in this paper is based on multi swarm optimization (MSO) (i.e., multi-PSO) together with Tabu search (TS) techniques. Efficient CHs are chosen by the proposed system, which improves the optimization of routing and the life of the network. The obtained results show that the MSO-Tabu approach has a 14%, 5%, 11%, and 4% higher number of clusters and a 20%, 6%, 14%, and 6% lower average packet loss rate as compared to a genetic algorithm (GA), differential evolution (DE), Tabu, and MSO based clustering, respectively. Moreover, the MSO-Tabu approach has 136%, 36%, 136%, and 38% higher lifetime computation, and 22%, 16%, 51%, and 12% higher average dissipated energy. Thus, the study's outcome shows that the proposed MSO-Tabu is efficient, as it increases the number of clusters formed, the average energy dissipated, and the lifetime computation, and decreases mean packet loss and end-to-end delay.

Keywords
cluster head (CH), energy consumption, metaheuristics, particle swarm optimization (PSO), wireless energy transfer
Publication type
Journal Article MeSH

Article

Cross-efficiency evaluation in the presence of flexible measures with an application to healthcare systems

... This study proposes a new methodology comprising two well-known analytical approaches: (i) data envelopment ...

Health care management science. 2019 Sep ; 22 (3) : 512-533. [epub] 20190301

Health Care Manag Sci
ISSN 1572-9389 | 1386-9620
Source

In recent years, most countries around the world have struggled with the consequences of budget cuts in health expenditure, obliging them to utilize their resources efficiently. In this context, performance evaluation facilitates the decision-making process in improving the efficiency of the healthcare system. However, the performance evaluation of many sectors, including the healthcare systems, is, on the one hand, a challenging issue and, on the other hand, a useful tool for decision-making with the aim of optimizing the use of resources. This study proposes a new methodology comprising two well-known analytical approaches: (i) data envelopment analysis (DEA) to measure the efficiencies and (ii) data science to complement the DEA model in providing insightful recommendations for strategic decision making on productivity enhancement. The suggested method is a first attempt to combine two DEA extensions: flexible measure and cross-efficiency. We develop a pair of benevolent and aggressive scenarios aiming at evaluating cross-efficiency in the presence of flexible measures. Next, we perform data mining cluster analysis to create groups of homogeneous countries. Organizing the data in similar groups facilitates identifying a set of benchmarks that perform similarly in terms of operating conditions. By comparing the benchmark set with poorly performing countries, we can obtain attainable goals for performance enhancement, which will assist policymakers in acting strategically upon them. A case study of healthcare systems in 120 countries is taken as an example to illustrate the potential application of our new method.

Keywords
Clustering, Cross-efficiency, Data envelopment analysis, Data science, Flexible measure, Healthcare
MeSH
Resource Allocation methods MeSH
Global Health MeSH
Efficiency, Organizational * MeSH
Humans MeSH
Delivery of Health Care * methods organization & administration MeSH
Decision Making MeSH
Cluster Analysis MeSH
Models, Statistical * MeSH
Check Tag
Humans MeSH
Publication type
Journal Article MeSH
