Abad, Shayan Gholamy, Hassan
Millions of new websites are created daily, making it challenging to determine which ones are safe. Cybersecurity involves protecting companies and users from cyberattacks. Cybercriminals exploit various methods, including phishing attacks, to trick users into revealing sensitive information. In Australia alone, there were over 74,000 reported phis...
Kim, Younghoon Wang, Tao Xiong, Danyi Wang, Xinlei Park, Seongoh
Published in
BMC bioinformatics
Early detection of cancers has been widely explored owing to its paramount importance in biomedical fields. Among the different types of data used to answer this biological question, studies based on T cell receptors (TCRs) have recently come into the spotlight due to the growing appreciation of the role of the host immune system in tumor biology. However, the one...
Hashimoto, Noriaki Ko, Kaho Yokota, Tatsuya Kohno, Kei Nakaguro, Masato Nakamura, Shigeo Takeuchi, Ichiro Hontani, Hidekata
Published in
International journal of computer assisted radiology and surgery
In image classification, constructing appropriate training data is important for improving the generalization ability of the classifier, particularly when the training set is small. We propose a method that quantitatively evaluates the typicality of a hematoxylin-and-eosin (H&E)-stained tissue slide from a set of im...
Excoffier, Jean-Baptiste Salaün-Penquer, Noémie Ortala, Matthieu Raphaël-Rousseau, Mathilde Chouaid, Christos Jung, Camille
Published in
Medical & Biological Engineering & Computing
The COVID-19 pandemic rapidly put heavy pressure on hospital centers, especially intensive care units. There was an urgent need for tools to understand the typology of COVID-19 patients and identify those most at risk of aggravation during their hospital stay. Data included more than 400 patients hospitalized due to COVID-19 during the first wave...
Herrera-Semenets, Vitali (author) Hernández-León, Raudel (author) van den Berg, Jan (author)
We live in a world that is driven by data. This creates challenges in extracting and analyzing knowledge from large volumes of data. One example of such a challenge is intrusion detection. Intrusion detection data sets are characterized by huge volumes, which affect the learning of the classifier. So there is a need to reduce the size of the...
Aslani, Mohammad
In cities where land availability is limited, rooftop photovoltaic panels (RPVs) offer high potential for satisfying concentrated urban energy demand by using only rooftop areas. However, accurate estimation of RPV potential in relation to its spatial distribution is indispensable for successful energy planning. Classification, plane segmentatio...
Aslani, Mohammad Seipel, Stefan
Support vector machines (SVMs) are powerful classifiers that have high computational complexity in the training phase, which can limit their applicability to large datasets. An effective approach to address this limitation is to select a small subset of the most representative training samples such that desirable results can be obtained. In this st...
Aslani, Mohammad Seipel, Stefan
Training support vector machines (SVMs) for pixel-based feature extraction purposes from aerial images requires selecting representative pixels (instances) as a training dataset. In this research, locality-sensitive hashing (LSH) is adopted for developing a new instance selection method which is referred to as DR.LSH. The intuition of DR.LSH rests ...
Arnaiz-González, Álvar González-Rogel, Alejandro Díez-Pastor, José-Francisco López-Nozal, Carlos
Published in
Progress in Artificial Intelligence
Instance selection is a popular preprocessing task in knowledge discovery and data mining. Its purpose is to reduce the size of data sets while maintaining their predictive capabilities. The usual problem is that these methods quite often suffer from high computational complexity, which becomes highly inconvenient for processing huge...
Makkhongkaew, Raywat
We are drowning in massive data but starved for knowledge retrieval. It is well known from the dimensionality tradeoff that more data are more informative but come at a price in computational complexity, which has to be made up in some way. When the labeled sample size is too small to bring sufficient information about the target concept, supervise...