Authors - Shreya Joshi, Vijay Ukani, Priyank Thakkar, Mrudangi Thakker, Dhruvang Thakker Abstract - This study introduces a hybrid malware detection approach that combines machine learning (ML) and deep learning (DL) techniques to enhance detection accuracy. By applying models such as K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Gradient Boosting to Portable Executable (PE) header data, we achieved 99.47% accuracy during training and 99.21% accuracy during testing. Additionally, combining Convolutional Neural Networks (CNN) with Long Short-Term Memory (LSTM) networks improved performance further, achieving 99% accuracy, 97% precision, and 98% recall after 30 epochs. The proposed hybrid method reduces false positives and false negatives while demonstrating scalability across various datasets, offering a reliable and efficient solution for contemporary malware detection.
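As a toy illustration of the nearest-neighbour voting that one of the abstract's models (KNN) relies on, the sketch below classifies invented PE-header-style feature vectors with a bare-bones majority vote. The feature choices (section count, entropy, import count), the sample values, and the `knn_predict` helper are all hypothetical stand-ins, not the paper's actual pipeline.

```python
from collections import Counter
import math

def knn_predict(train, labels, query, k=3):
    """Classify a feature vector by majority vote among its k nearest
    training vectors under Euclidean distance."""
    dists = sorted((math.dist(x, query), y) for x, y in zip(train, labels))
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]

# Invented PE-header-style features: (num_sections, entropy, import_count)
train = [(3, 6.1, 40), (4, 7.8, 5), (5, 6.0, 55), (3, 7.9, 3)]
labels = ["benign", "malware", "benign", "malware"]

print(knn_predict(train, labels, (4, 7.7, 6)))  # → malware
```

High entropy with few imports is a common packing heuristic, which is why the toy query lands next to the malware examples; a real detector would use many more header fields and a tuned k.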
Authors - Balvinder Pal Singh, Thangaraju B Abstract - Modern computing systems face the dual challenge of meeting escalating performance demands, driven especially by AI and ML workloads, while operating under stringent thermal constraints. As energy consumption continues to rise, conventional thermal mitigation techniques like frequency throttling often compromise performance and increase cooling costs, threatening long-term sustainability. This paper proposes a multi-level, software-centric approach to address this challenge through intelligent, temperature-aware scheduling. Leveraging evolutionary techniques, specifically a Genetic Algorithm (GA), the proposed model reorders jobs at the OS scheduler level based on thermal impact and energy profiles. A secondary optimization phase further fine-tunes job execution using dynamic slice adjustment for thermally intensive tasks. Simulation results, obtained using a custom-integrated simulation framework built on the GEM5, McPAT, and HotSpot tools, demonstrate a 28% overall performance improvement, a 53% reduction in thermal violations, and a 15% decrease in energy consumption, with 80% of tasks executed without performance degradation. This approach was validated using representative benchmark workloads, optimizing both energy and temperature profiles. This AI-augmented, multi-level scheduling strategy significantly enhances thermal efficiency and performance, offering a scalable solution for next-generation high-performance computing environments.
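The GA-based reordering can be sketched, under heavy simplification, as a permutation search that minimizes a thermal penalty. Everything below is an illustrative assumption: the per-job `heat` values, the peak-load fitness function, and the elitist selection with swap mutation stand in for the paper's actual model, which is driven by GEM5/McPAT/HotSpot simulation.

```python
import random

def fitness(order, heat):
    """Peak thermal load: the largest combined heat of any two
    consecutively scheduled jobs (lower is better)."""
    return max(heat[a] + heat[b] for a, b in zip(order, order[1:]))

def evolve(heat, generations=200, pop_size=30, seed=0):
    """Evolve job orderings: keep the cooler half of the population
    (elitism) and produce children via a single swap mutation."""
    rng = random.Random(seed)
    jobs = list(range(len(heat)))
    pop = [rng.sample(jobs, len(jobs)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda o: fitness(o, heat))
        survivors = pop[: pop_size // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            i, j = rng.sample(range(len(jobs)), 2)  # swap mutation
            child[i], child[j] = child[j], child[i]
            children.append(child)
        pop = survivors + children
    return min(pop, key=lambda o: fitness(o, heat))

heat = [9, 8, 7, 3, 2, 1]   # invented per-job thermal impact
best = evolve(heat)         # hot jobs end up interleaved with cool ones
```

Running the hottest jobs back to back (the identity order here) gives a peak load of 17; the evolved order interleaves hot and cool jobs, which is the intuition behind temperature-aware reordering.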
Authors - Kostiantyn Hrishchenko, Oleksii Pysarchuk, Danylo Baran Abstract - This paper introduces a batch processing method for constraint programming to improve solution performance. The method is demonstrated on a production scheduling problem. The shift from mass production of standardized products toward small-scale, customized orders in the manufacturing sector introduces a new challenge of handling extensive input data. Modern production scheduling systems struggle with Big Data loads because of the limitations of state-of-the-art scheduling algorithms. Therefore, the development of highly adaptive algorithms and models capable of efficiently managing numerous unique orders while maintaining the ability to adapt to dynamic constraints and objectives becomes critically important. The baseline discrete constraint programming approach is chosen for its flexibility and extensibility, allowing it to model various realistic manufacturing scenarios. The proposed method splits a large input into smaller subsets, each scheduled independently by repeatedly invoking the constraint solver on different portions of the input data, significantly improving performance compared to scheduling the whole input in a single step. Computational experiments with Google's OR-Tools CP-SAT solver evaluated the method's effectiveness. Time and memory usage reductions were shown. The proposed method demonstrates the possibility of solving problems of much larger size using the same constraint model and solver. It combines the advantages of the greedy algorithm and the exact integer programming approach.
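The batching idea can be illustrated with a stand-in solver: below, an exhaustive permutation search plays the role of the CP-SAT call, and machine availability is carried between batch invocations so each batch is scheduled against the partial schedule fixed so far. The job durations, batch size, and both helper functions are invented for illustration and are not the authors' constraint model.

```python
from itertools import permutations

def solve_batch(durations, machine_free):
    """Exhaustively try every ordering of this batch (a stand-in for
    one CP solver call) and keep the assignment minimizing makespan."""
    best = None
    for perm in permutations(range(len(durations))):
        free = machine_free[:]
        for job in perm:                  # job -> earliest-free machine
            m = free.index(min(free))
            free[m] += durations[job]
        if best is None or max(free) < max(best):
            best = free
    return best

def batched_schedule(durations, n_machines=2, batch_size=4):
    """Split the full order list into batches and solve each batch
    independently, carrying machine availability between calls."""
    free = [0] * n_machines
    for i in range(0, len(durations), batch_size):
        free = solve_batch(durations[i:i + batch_size], free)
    return max(free)                      # overall makespan
```

Exhaustive search over a batch of 4 is cheap (24 permutations) even though the same search over all 8 jobs would cost 40,320; this is the time-and-memory trade the batching method exploits, at the price of global optimality.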
Authors - V. Machaca, J. Grados, K. Lazarte, R. Escobedo, C. Lopez Abstract - The interaction between peptides and the Major Histocompatibility Complex (MHC) is a critical factor in the immune response against various threats. In this work, we fine-tuned protein language models like TAPE, ProtBert-BFD, ESM2(t6), ESM2(t12), ESM2(t30), and ESM2(t33) by adding a BiLSTM block in cascade for the task of peptide-MHC class-I binding prediction. Additionally, we addressed the vanishing gradient problem by employing LoRA, distillation, hyperparameter guidelines, and a layer freezing methodology. After experimentation, we found that TAPE and a distilled version of ESM2(t33) achieved the best results, outperforming state-of-the-art tools such as NetMHCpan4.1, MHCflurry2.0, Anthem, ACME, and MixMHCpred2.2 in terms of AUC, accuracy, recall, F1 score, and MCC.
Authors - Nafiz Eashrak, Mohammad Ikbal Hossain, Md Omum Siddique Auyon, Md Abdullah Al Adnan Abstract - The proliferation of cryptocurrencies and blockchain technology has significantly reshaped the financial sector, introducing decentralized and transparent digital transactions. Despite substantial advantages such as reduced transaction costs, enhanced transparency, and financial inclusion, the anonymous and decentralized nature of cryptocurrencies has also facilitated illicit financial activities, including fraud, money laundering, and tax evasion. This literature review systematically examines forensic methodologies, regulatory challenges, and theoretical frameworks relevant to cryptocurrency investigations. The study highlights advancements in blockchain analytics, AI-driven monitoring tools, and regulatory frameworks such as the FATF Travel Rule and the EU's MiCA. However, it also underscores persistent challenges posed by privacy-focused technologies, decentralized finance (DeFi), cross-border jurisdictional inconsistencies, and technical limitations in forensic methodologies. The paper proposes an integrated forensic framework incorporating AI analytics, international regulatory collaboration, specialized forensic training, and privacy-preserving investigative techniques. Through this comprehensive review, it provides critical insights for enhancing forensic investigation capabilities, regulatory compliance, and policymaking, while outlining future research opportunities addressing emerging threats in cryptocurrency-based financial systems.
Authors - Victor Sineglazov, Alexander Ruchkin Abstract - The article proposes a new combined approach to the problem of classification on real-world noisy datasets using a multi-stage semi-supervised learning method. The main idea is to combine two methods: self-supervised learning on unlabeled data using Contrastive Loss Nets, and semi-supervised label propagation using an enhanced Poisson-Seidel learning technique. The proposed approach offers significant advantages, as it allows for preliminary classification without labels, strengthening the distinctions between classes, before using a minimal amount of labeled data for final classification. This is demonstrated through the analysis of synthetic data from two intersecting "Two moons" distributions and a real medical heart-disease dataset, "Cardio Vascular". Accuracy exceeds 82% in the first case and 73% in the second, which is among the best results on the Kaggle database compared to 20 other known methods.
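A minimal sketch of graph-based label propagation, the second stage the abstract describes, is shown below. This is generic harmonic propagation with Gauss-Seidel-style sweeps, not the authors' enhanced Poisson-Seidel variant, and the toy graph and seed labels are invented: two loosely connected clusters with one labeled node per class.

```python
def propagate(adj, seeds, iters=100):
    """Spread seed labels over a graph: each unlabeled node repeatedly
    takes the mean score of its neighbours (Gauss-Seidel-style sweep),
    while seed nodes stay clamped to their given labels."""
    score = {v: seeds.get(v, 0.0) for v in adj}
    for _ in range(iters):
        for v in adj:
            if v not in seeds:
                score[v] = sum(score[u] for u in adj[v]) / len(adj[v])
    return score

# Two loosely connected clusters, bridged by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
scores = propagate(adj, {0: 1.0, 5: -1.0})  # one +1 and one -1 seed
```

After convergence the unlabeled nodes take the sign of their cluster's seed (node 1 ends up at +5/7, node 4 at -5/7), which is the "minimal labeled data" effect the abstract exploits after the self-supervised stage has sharpened cluster boundaries.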
Authors - Amir Aieb, Alexander Jacob, Antonio Liotta, Muhammad Azfar Yaqub Abstract - Predicting soil moisture under dynamic climate conditions is challenging due to intricate dependencies within the data. This study presents a surrogate deep learning (SDL) model with a multitask learning (MTL) approach to improve daily soil moisture predictions across spatiotemporal scales. The model employs a two-level encoding process, first compressing climate parameters into a single feature and then applying sequential encoding to capture long-term temporal patterns within a one-year timeframe for better generalization. Seasonality detection using autocorrelation facilitates data resampling into homogeneous samples, enhancing the SDL model by optimizing hyperparameters through efficient weight sharing between layers. To evaluate the effectiveness of MTL, three SDL architectures, LSTM, ConvLSTM, and BiLSTM, were implemented for a comprehensive analysis. All models struggle with soil moisture prediction, particularly during dry periods, where LSTM experiences the most significant accuracy drop. While BiLSTM demonstrates better performance, its effectiveness remains constrained. However, integrating MTL enhances model stability and spatio-temporal accuracy, reducing errors across various conditions and achieving a 10% improvement due to better data representation, enabling SDL models to capture regional heterogeneity more effectively.
Authors - V. Machaca, D. Lopez, J. Mamani, S. Ramos, Y. Tupac Abstract - Cancer immunotherapy offers a promising alternative to conventional cancer therapies. Within this field, neoantigen detection is rapidly evolving; however, it requires multiple bioinformatics stages, such as quality control, alignment, variant calling, annotation, and neoantigen prioritization. Each stage depends on specific software tools, which can create technical and compatibility challenges, often requiring significant expertise to integrate and manage. To address these challenges, we introduce NeoArgosTools, a novel flowchart-based platform with an intuitive graphical interface, designed to simplify and streamline neoantigen detection pipelines in cancer immunotherapy research.