universal algorithm
With the whirlwind evolution of technology, the quantity of data stored within datasets is rapidly expanding. As a result, extracting crucial and relevant information from these datasets is a gruelling task. Feature selection is a critical preprocessing step in machine learning that reduces the excess data in a set. This research presents a novel quasi-reflection learning arithmetic optimization algorithm - firefly search, an enhanced version of the original arithmetic optimization algorithm. A quasi-reflection learning mechanism is used to enhance population diversity, while the firefly algorithm metaheuristic improves the exploitation abilities of the original arithmetic optimization algorithm. The aim of this wrapper-based method is to tackle a specific classification problem by selecting an optimal feature subset. The proposed algorithm is tested and compared with various well-known methods, first on ten unconstrained benchmark functions and then on twenty-one standard datasets gathered from the University of California, Irvine Repository and Arizona State University. Additionally, the proposed approach is applied to a coronavirus disease dataset. The experimental results verify the improvements of the presented method and their statistical significance.
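To make the quasi-reflection mechanism mentioned above concrete, the following is a minimal Python sketch of quasi-reflection-based learning as it is commonly defined in the metaheuristics literature: each coordinate of a candidate is replaced by a uniformly random point between the midpoint of its search interval and its current value, and the reflected candidate is evaluated alongside the original to diversify the population. Function and variable names are illustrative, not taken from the paper.

```python
import random

def quasi_reflect(solution, lower, upper):
    """Quasi-reflection-based learning: for each coordinate x in [lb, ub],
    draw a point uniformly between the interval midpoint (lb + ub) / 2 and x.
    The quasi-reflected candidate is evaluated alongside the original one,
    which increases population diversity."""
    reflected = []
    for x, lb, ub in zip(solution, lower, upper):
        mid = (lb + ub) / 2.0
        lo, hi = (mid, x) if x >= mid else (x, mid)
        reflected.append(random.uniform(lo, hi))
    return reflected

# Example: quasi-reflect a 3-dimensional candidate in [-10, 10]^3
candidate = [4.2, -7.5, 0.3]
print(quasi_reflect(candidate, [-10] * 3, [10] * 3))
```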
- Keywords
- Arithmetic optimisation algorithm, Feature selection, Firefly algorithm, Metaheuristics, Quasi-reflection-based learning,
- Publication Type
- Journal Article MeSH
An embedding i ↦ p_i ∈ R^d of the vertices of a graph G is called universally completable if the following holds: For any other embedding i ↦ q_i ∈ R^k satisfying q_i^T q_j = p_i^T p_j for i = j and i adjacent to j, there exists an isometry mapping q_i to p_i for all i ∈ V(G). The notion of universal completability was introduced recently due to its relevance to the positive semidefinite matrix completion problem. In this work we focus on graph embeddings constructed using the eigenvectors of the least eigenvalue of the adjacency matrix of G, which we call least eigenvalue frameworks. We identify two necessary and sufficient conditions for such frameworks to be universally completable. Our conditions also allow us to give algorithms for determining whether a least eigenvalue framework is universally completable. Furthermore, our computations for Cayley graphs on Z_2^n (n ≤ 5) show that almost all of these graphs have universally completable least eigenvalue frameworks. In the second part of this work we study uniquely vector colorable (UVC) graphs, i.e., graphs for which the semidefinite program corresponding to the Lovász theta number (of the complementary graph) admits a unique optimal solution. We identify a sufficient condition for showing that a graph is UVC based on the universal completability of an associated framework. This allows us to prove that Kneser and q-Kneser graphs are UVC. Lastly, we show that least eigenvalue frameworks of 1-walk-regular graphs always provide optimal vector colorings and furthermore, we are able to characterize all optimal vector colorings of such graphs. In particular, we give a necessary and sufficient condition for a 1-walk-regular graph to be uniquely vector colorable.
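The least eigenvalue framework defined above can be computed directly from the adjacency matrix: take an orthonormal basis of the eigenspace of the smallest adjacency eigenvalue and read the vectors p_i off the rows. The NumPy sketch below illustrates this construction on the 5-cycle; it is an illustrative computation only, not code from the paper.

```python
import numpy as np

def least_eigenvalue_framework(A, tol=1e-9):
    """Return the least eigenvalue framework of a graph with adjacency matrix A:
    row i of the returned matrix is the vector p_i, built from an orthonormal
    basis of the eigenspace of the least eigenvalue."""
    eigvals, eigvecs = np.linalg.eigh(A)               # A is symmetric
    least = eigvals[0]                                  # eigh sorts ascending
    basis = eigvecs[:, np.abs(eigvals - least) < tol]   # columns span the eigenspace
    return basis                                        # shape (|V|, multiplicity)

# Example: the 5-cycle C5 (its least eigenvalue has multiplicity 2,
# so each vertex gets a vector p_i in R^2).
n = 5
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
P = least_eigenvalue_framework(A)
print(P @ P.T)   # Gram matrix of the p_i appearing in the completability condition
```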
- Keywords
- Least eigenvalue, Lovász theta number, Positive semidefinite matrix completion, Semidefinite programming, Universal rigidity, Vector colorings,
- Publication Type
- Journal Article MeSH
Feature selection is one of the most important challenges in machine learning and data science. This process is usually performed in the data preprocessing phase, where the data is transformed into a format suitable for further processing by a machine learning algorithm. Many real-world datasets are highly dimensional, with many irrelevant and even redundant features. Such features do not improve classification accuracy and can even degrade the performance of a classifier. The goal of feature selection is to find an optimal (or sub-optimal) subset of features that contains the relevant information about the dataset from which machine learning algorithms can derive useful conclusions. In this manuscript, a novel version of the firefly algorithm (FA) is proposed and adapted for the feature selection challenge. The proposed method significantly improves the performance of the basic FA and also outperforms other state-of-the-art metaheuristics on both benchmark bound-constrained and practical feature selection tasks. The method was first validated on standard unconstrained benchmarks and then applied to feature selection using 21 standard University of California, Irvine (UCI) datasets. Moreover, the presented approach was also tested on a relatively novel COVID-19 dataset for predicting patients' health, and on one microarray dataset. Results obtained in all practical simulations attest to the robustness and efficiency of the proposed algorithm in terms of convergence, solution quality and classification accuracy. More precisely, the proposed approach obtained the best classification accuracy on 13 out of 21 datasets, significantly outperforming the other competitor methods.
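For readers unfamiliar with the baseline method, the sketch below shows the standard firefly move that the basic FA uses, together with a common sigmoid transfer step for turning a continuous position into a binary feature mask in wrapper-based feature selection. This is a generic illustration of the underlying mechanics; the paper's specific modifications (genetic operators, quasi-reflection-based learning) are not reproduced here, and all parameter values are assumptions.

```python
import math
import random

def firefly_move(x_i, x_j, beta0=1.0, gamma=1.0, alpha=0.2):
    """One standard firefly update: firefly i moves toward a brighter firefly j
    with attractiveness beta0 * exp(-gamma * r^2), plus a small random step
    scaled by alpha."""
    r2 = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    beta = beta0 * math.exp(-gamma * r2)
    return [a + beta * (b - a) + alpha * (random.random() - 0.5)
            for a, b in zip(x_i, x_j)]

def to_feature_mask(x):
    """Map a continuous position to a binary feature mask with a sigmoid
    transfer function, a common step in wrapper-based feature selection."""
    return [1 if random.random() < 1.0 / (1.0 + math.exp(-v)) else 0 for v in x]

# Example: move one firefly toward a brighter one, then derive a feature mask.
x_new = firefly_move([0.2, 0.8, 0.5], [0.9, 0.1, 0.4])
print(x_new, to_feature_mask(x_new))
```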
- Keywords
- COVID-19 dataset, Feature selection, Firefly algorithm, Genetic operators, Quasi-reflection-based learning, Swarm intelligence,
- Publication Type
- Journal Article MeSH
Interaction with the DNA minor groove is a significant contributor to specific sequence recognition in selected families of DNA-binding proteins. Based on a statistical analysis of 3D structures of protein-DNA complexes, we propose that distortion of the DNA minor groove resulting from interactions with hydrophobic amino acid residues is a universal element of protein-DNA recognition. We provide evidence to support this by associating each DNA minor groove-binding amino acid residue with the local dimensions of the DNA double helix using a novel algorithm. The widened DNA minor grooves are associated with high GC content. However, some AT-rich sequences contacted by hydrophobic amino acids (e.g., phenylalanine) display extreme values of minor groove width as well. For a number of hydrophobic amino acids, distinct secondary structure preferences could be identified for residues interacting with the widened DNA minor groove. These results hold even after discarding the most populous families of minor groove-binding proteins.
- Keywords
- DNA shape, hydrophobic, indirect readout, minor groove, protein–DNA interaction, specific recognition,
- MeSH
- Algorithms MeSH
- Amino Acid Motifs MeSH
- Amino Acids chemistry MeSH
- Arabidopsis metabolism MeSH
- DNA-Binding Proteins metabolism MeSH
- DNA chemistry MeSH
- Phenylalanine chemistry MeSH
- Hydrophobic and Hydrophilic Interactions * MeSH
- Nucleic Acid Conformation MeSH
- Glutamic Acid chemistry MeSH
- Humans MeSH
- Proteins chemistry MeSH
- Saccharomyces cerevisiae metabolism MeSH
- Protein Structure, Secondary MeSH
- Protein Binding MeSH
- Binding Sites MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
- Names of Substances
- Amino Acids MeSH
- DNA-Binding Proteins MeSH
- DNA MeSH
- Phenylalanine MeSH
- Glutamic Acid MeSH
- Proteins MeSH
Despite the broad conceptual and applied relevance of how the number of species or endemics changes with area (the species-area and endemics-area relationships (SAR and EAR)), our understanding of universality and pervasiveness of these patterns across taxa and regions has remained limited. The SAR has traditionally been approximated by a power law, but recent theories predict a triphasic SAR in logarithmic space, characterized by steeper increases in species richness at both small and large spatial scales. Here we uncover such universally upward accelerating SARs for amphibians, birds and mammals across the world’s major landmasses. Although apparently taxon-specific and continent-specific, all curves collapse into one universal function after the area is rescaled by using the mean range sizes of taxa within continents. In addition, all EARs approximately follow a power law with a slope close to 1, indicating that for most spatial scales there is roughly proportional species extinction with area loss. These patterns can be predicted by a simulation model based on the random placement of contiguous ranges within a domain. The universality of SARs and EARs after rescaling implies that both total and endemic species richness within an area, and also their rate of change with area, can be estimated by using only the knowledge of mean geographic range size in the region and mean species richness at one spatial scale.
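The simulation model mentioned above (random placement of contiguous ranges within a domain) can be illustrated with a toy one-dimensional sketch: ranges with a given mean size are placed at random positions, and for nested sample windows the species overlapping the window (SAR) and those fully contained in it (EAR) are counted. The 1-D simplification and all numbers are assumptions for illustration, not the published model.

```python
import random

def simulate_sar_ear(domain=1.0, n_species=1000, mean_range=0.05,
                     windows=(0.01, 0.05, 0.1, 0.2, 0.5, 1.0)):
    """Toy 1-D version of the random-placement model: contiguous species
    ranges with mean size mean_range are dropped at random positions in the
    domain. For each nested window [0, a] we count species whose range
    overlaps it (SAR) and species whose range lies entirely inside it (EAR)."""
    ranges = []
    for _ in range(n_species):
        size = min(domain, random.expovariate(1.0 / mean_range))
        start = random.uniform(0.0, domain - size)
        ranges.append((start, start + size))
    for a in windows:
        sar = sum(1 for lo, hi in ranges if lo < a)   # range overlaps [0, a]
        ear = sum(1 for lo, hi in ranges if hi <= a)  # range endemic to [0, a]
        print(f"area={a:5.2f}  species={sar:4d}  endemics={ear:4d}")

simulate_sar_ear()
```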
- MeSH
- Algorithms MeSH
- Biodiversity * MeSH
- Models, Biological * MeSH
- Species Specificity MeSH
- Ecosystem * MeSH
- Extinction, Biological MeSH
- Amphibians physiology MeSH
- Birds physiology MeSH
- Mammals physiology MeSH
- Conservation of Natural Resources MeSH
- Geography * MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Research Support, U.S. Gov't, Non-P.H.S. MeSH
- Geographic Names
- Africa MeSH
- Americas MeSH
- Asia MeSH
- Australia MeSH
- Europe MeSH
The Internet of Things (IoT) is a universal network for supervising the physical world through sensors installed on different devices. The network can improve many areas, including healthcare, because IoT technology has the potential to reduce the pressure that aging and chronic diseases place on healthcare systems. For this reason, researchers attempt to solve the challenges of this technology in healthcare. In this paper, a fuzzy logic-based secure hierarchical routing scheme using the firefly algorithm (FSRF) is presented for IoT-based healthcare systems. FSRF comprises three main frameworks: a fuzzy trust framework, a firefly algorithm-based clustering framework, and an inter-cluster routing framework. The fuzzy logic-based trust framework is responsible for evaluating the trust of IoT devices on the network. This framework identifies and prevents routing attacks such as black hole, flooding, wormhole, sinkhole, and selective forwarding attacks. Moreover, FSRF supports a clustering framework based on the firefly algorithm. It defines a fitness function that evaluates each IoT device's chance of becoming a cluster head node. The design of this function is based on trust level, residual energy, hop count, communication radius, and centrality. Also, FSRF involves an on-demand routing framework that decides on reliable and energy-efficient paths able to deliver data to the destination faster. Finally, FSRF is compared with the energy-efficient multi-level secure routing protocol (EEMSR) and the enhanced balanced energy-efficient network-integrated super heterogeneous (E-BEENISH) routing method in terms of network lifetime, energy stored in IoT devices, and packet delivery rate (PDR). The results show that FSRF improves network longevity by 10.34% and 56.35% and the energy stored in the nodes by 10.79% and 28.51% compared to EEMSR and E-BEENISH, respectively. However, FSRF is weaker than EEMSR in terms of security. Furthermore, the PDR of this method is slightly lower (by almost 1.4%) than that of EEMSR.
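The abstract describes the cluster-head fitness function only by its inputs (trust level, residual energy, hop count, communication radius, and centrality), so the following is a hypothetical weighted-sum formulation for illustration; the weights, normalizations, and function name are assumptions, not values from the paper.

```python
def cluster_head_fitness(trust, residual_energy, hop_count, comm_radius, centrality,
                         max_energy, max_hops, max_radius,
                         weights=(0.3, 0.3, 0.15, 0.1, 0.15)):
    """Hypothetical weighted-sum fitness for cluster-head selection: all terms
    are normalized to [0, 1]; fewer hops to the sink is better, so that term
    is inverted. Higher fitness means a better cluster-head candidate."""
    w_t, w_e, w_h, w_r, w_c = weights
    return (w_t * trust
            + w_e * residual_energy / max_energy
            + w_h * (1.0 - hop_count / max_hops)
            + w_r * comm_radius / max_radius
            + w_c * centrality)

# Example: compare two candidate devices (all inputs are made-up values).
print(cluster_head_fitness(0.9, 0.7, 2, 30, 0.8, max_energy=1.0, max_hops=10, max_radius=50))
print(cluster_head_fitness(0.6, 0.9, 5, 40, 0.4, max_energy=1.0, max_hops=10, max_radius=50))
```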
The fast-growing quantity of information hinders the process of machine learning, making it computationally costly and its results substandard. Feature selection is a pre-processing method for obtaining the optimal subset of features in a data set. Optimization algorithms struggle to decrease the dimensionality while retaining accuracy in high-dimensional data sets. This article proposes a novel chaotic opposition fruit fly optimization algorithm, an improved variation of the original fruit fly algorithm, advanced and adapted for binary optimization problems. The proposed algorithm is tested on ten unconstrained benchmark functions and evaluated on twenty-one standard datasets taken from the University of California, Irvine repository and Arizona State University. Further, the presented algorithm is also assessed on a coronavirus disease dataset. The proposed method is then compared with several well-known feature selection algorithms on the same datasets. The results prove that the presented algorithm predominantly outperforms the other algorithms in selecting the most relevant features by decreasing the number of utilized features and improving classification accuracy.
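As an illustration of the two ingredients named in the method's title, the sketch below combines a logistic chaotic map for generating initial positions with opposition-based learning, which also evaluates the opposite point lb + ub - x of every candidate. The choice of chaotic map and all parameters are assumptions; this is not the authors' implementation.

```python
import random

def chaotic_opposition_init(pop_size, dim, lb, ub):
    """Generate an initial population with a logistic chaotic map instead of a
    plain uniform draw, and add the opposite point lb + ub - x of every
    candidate (opposition-based learning). The better half of this doubled
    population would then be kept after fitness evaluation."""
    population = []
    c = random.random()                      # chaotic seed in (0, 1)
    for _ in range(pop_size):
        x, x_opp = [], []
        for _ in range(dim):
            c = 4.0 * c * (1.0 - c)          # logistic map with r = 4
            val = lb + c * (ub - lb)
            x.append(val)
            x_opp.append(lb + ub - val)      # opposite candidate
        population.append(x)
        population.append(x_opp)
    return population

pop = chaotic_opposition_init(pop_size=5, dim=4, lb=0.0, ub=1.0)
print(len(pop), "candidates (originals plus their opposites)")
```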
- MeSH
- Algorithms MeSH
- COVID-19 * MeSH
- Drosophila MeSH
- Machine Learning MeSH
- Animals MeSH
- Check Tag
- Animals MeSH
- Publication Type
- Journal Article MeSH
- Research Support, Non-U.S. Gov't MeSH
- Geographic Names
- Arizona MeSH
Compression of ECG signals is essential, especially for signal transmission in telemedicine. There exist many compression algorithms, which are described in varying levels of detail, tested on various datasets, and whose performance is expressed in different ways. There is a lack of standardization in this area. This study points out these drawbacks and presents a new compression algorithm which is properly described, tested, and objectively compared with other authors' work. The study thus serves as an example of what such standardization should look like. The single-cycle fractal-based (SCyF) compression algorithm is introduced and tested on 4 different databases: the CSE database, the MIT-BIH arrhythmia database, a high-frequency signal database, and the Brno University of Technology ECG quality database (BUT QDB). The SCyF algorithm is always compared with a well-known algorithm based on the wavelet transform and set partitioning in hierarchical trees in terms of efficiency (2 methods) and quality/distortion of the signal after compression (12 methods). A detailed analysis of the results is provided. The SCyF compression algorithm reaches up to avL = 0.4460 bps and PRDN = 2.8236%.
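The two figures quoted above can be made concrete: avL is the average code length in bits per sample, and PRDN is the percentage root-mean-square difference normalized by the signal's deviation from its mean. A minimal computation, assuming the original and reconstructed signals are available as arrays, might look like this.

```python
import numpy as np

def prdn(original, reconstructed):
    """Percentage root-mean-square difference, normalized (PRDN): distortion of
    the reconstructed ECG relative to the original after mean removal."""
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    num = np.sum((original - reconstructed) ** 2)
    den = np.sum((original - original.mean()) ** 2)
    return 100.0 * np.sqrt(num / den)

def avg_code_length(compressed_size_bits, n_samples):
    """avL: average code length of the compressed stream in bits per sample."""
    return compressed_size_bits / n_samples

# Toy example (not real ECG data): a noisy copy stands in for a decompressed signal.
x = np.sin(np.linspace(0, 10, 5000))
x_rec = x + np.random.normal(0, 0.01, x.shape)
print(prdn(x, x_rec), avg_code_length(2230, 5000))   # 2230 bits / 5000 samples = 0.4460 bps
```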
The ongoing evolution of microbial pathogens represents a significant issue in diagnostic PCR/qPCR. Many assays are burdened with false negativity due to mispriming and/or probe-binding failures. Therefore, PCR/qPCR assays used in the laboratory should be periodically re-assessed in silico against public sequences to evaluate their ability to detect currently circulating strains and to infer potentially escaping variants. In the work presented here, we re-assessed an RT-qPCR assay for the universal detection of influenza A (IA) viruses currently recommended by the European Union Reference Laboratory for Avian Influenza. To this end, the primer and probe sequences were challenged against more than 99,000 M-segment sequences in five data pools. To streamline this process, we developed a simple algorithm called SequenceTracer, designed for alignment stratification, compression, and personal sequence subset selection, and demonstrated its utility. The re-assessment confirmed the high inclusivity of the assay for the detection of avian, swine and human pandemic H1N1 IA viruses. On the other hand, the analysis identified human H3N2 strains with a critical probe-interfering mutation circulating since 2010, albeit at a significantly fluctuating proportion. Minor variations located in the forward and reverse primers identified in the avian and swine data were also considered.
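The kind of in-silico re-assessment described above boils down to stratifying the aligned database sequences by their primer- or probe-binding region and counting mismatches in each variant. The sketch below is a generic illustration of that step, not the SequenceTracer code; the alignment coordinates and sequences are toy values.

```python
from collections import Counter

def screen_oligo(aligned_seqs, oligo, start):
    """For each distinct variant of the oligo-binding region, report how many
    database sequences carry it and how many mismatches it has against the
    oligo. 'start' is the 0-based position of the oligo footprint in the
    alignment; all names and positions here are illustrative only."""
    footprint = slice(start, start + len(oligo))
    variants = Counter(seq[footprint].upper() for seq in aligned_seqs)
    report = []
    for variant, count in variants.most_common():
        mismatches = sum(1 for a, b in zip(variant, oligo.upper()) if a != b)
        report.append((variant, count, mismatches))
    return report

# Example with toy data (not real influenza M-segment coordinates).
seqs = ["ACGTTGCAAT", "ACGTTGCAAT", "ACGATGCAAT"]
for variant, count, mm in screen_oligo(seqs, "GTTGC", start=2):
    print(f"{variant}  n={count}  mismatches={mm}")
```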
After a boom that coincided with the advent of the internet, digital cameras, and digital video and audio storage and playback devices, research on data compression has rested on its laurels for a quarter of a century. Domain-dependent lossy algorithms of the time, such as JPEG, AVC, MP3 and others, achieved remarkable compression ratios and encoding and decoding speeds with acceptable data quality, which has kept them in common use to this day. However, recent computing paradigms such as cloud computing, edge computing, the Internet of Things (IoT), and digital preservation have gradually posed new challenges, and, as a consequence, development trends in data compression are focusing on concepts that were not previously in the spotlight. In this article, we try to critically evaluate the most prominent of these trends and to explore their parallels, complementarities, and differences. Digital data restoration mimics the human ability to avoid memorising information that is satisfactorily retrievable from the context. Feature-based data compression introduces a two-level data representation with higher-level semantic features and with residuals that correct the feature-restored (predicted) data. The integration of the advantages of individual domain-specific data compression methods into a general approach is also challenging. To the best of our knowledge, a method that addresses all these trends does not exist yet. Our methodology, COMPROMISE, has been developed exactly to make as many solutions to these challenges as possible interoperable. It incorporates features and digital restoration. Furthermore, it is largely domain-independent (general), asymmetric, and universal. The latter refers to the ability to compress data within a common framework in lossy, lossless, and near-lossless modes. COMPROMISE may also be considered an umbrella that links many existing domain-dependent and domain-independent methods, supports hybrid lossless-lossy techniques, and encourages the development of new data compression algorithms.
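The two-level feature-plus-residual representation described above can be illustrated with a tiny numeric sketch: a block of samples is predicted from a coarse "feature" (here simply the block mean), and the residuals correct the prediction. Quantizing the residuals with step q gives a near-lossless mode with bounded error, q = 0 gives a lossless mode, and dropping the residuals gives a lossy mode. This illustrates the general idea only, not the COMPROMISE method itself.

```python
def encode_block(samples, q):
    """Feature + residual encoding of one block: the 'feature' is the block
    mean (a stand-in for a higher-level semantic feature); residuals correct
    the feature-predicted values. q = 0 -> lossless, q > 0 -> near-lossless
    with maximum error q / 2, residuals omitted -> lossy."""
    feature = sum(samples) / len(samples)
    residuals = [round((s - feature) / q) if q > 0 else (s - feature) for s in samples]
    return feature, residuals

def decode_block(feature, residuals, q):
    """Restore the block from its feature prediction and (quantized) residuals."""
    return [feature + (r * q if q > 0 else r) for r in residuals]

block = [10.0, 11.0, 9.5, 10.5]
feat, res = encode_block(block, q=0.5)
print(decode_block(feat, res, q=0.5))   # each sample within 0.25 of the original
```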
- Keywords
- data compression, data restoration, feature, residual, universal algorithm,
- Publication Type
- Journal Article MeSH