High-Recall URL Phishing Detection via Multilayer Perceptron: Feature Selection, Learning Curves, and Confusion-Matrix Verification

Yoga Rizki Rahmawan; Hadi Nurjaman; Febri Faturahman Ramadhan

doi:10.65780/bima.v1i2.9

Authors

Yoga Rizki Rahmawan, Universitas Informatika dan Bisnis Indonesia
Hadi Nurjaman, Universitas Informatika dan Bisnis Indonesia
Febri Faturahman Ramadhan, Universitas Informatika dan Bisnis Indonesia


Keywords:

Phishing Detection, URL-Based Classification, Multilayer Perceptron, Machine Learning, Feature Selection, Cybersecurity

Abstract

Phishing attacks that exploit malicious URLs remain a significant and growing threat in the modern digital ecosystem because they are cheap to operate, highly scalable, and effective at deceiving users. As a growing number of online services support critical activities such as banking, e-commerce, government, and education, the need for fast, accurate, and lightweight phishing detection mechanisms is becoming increasingly urgent. This study proposes an end-to-end URL-based phishing detection framework that emphasizes reproducibility, robustness, and operational feasibility, with a particular focus on the Multilayer Perceptron (MLP) classifier. Using the PhiUSIIL phishing URL dataset, the study evaluates the performance of MLP against nine widely used machine learning algorithms, including linear, probabilistic, tree-based, and ensemble models. The methodology integrates systematic data cleaning, hierarchical data partitioning, feature normalization, ANOVA-based feature selection, and class imbalance handling to ensure fair and consistent evaluation. Model performance is assessed using accuracy, precision, recall, and F1-score, complemented by learning curve analysis to examine generalization stability and by confusion matrix verification to examine critical error patterns. Experimental results show that, while most models achieve very high overall performance, the MLP classifier consistently demonstrates superior stability and detection capability, achieving 99.98% accuracy, 99.97% precision, 100% recall, and a 99.98% F1-score, with zero false negatives in phishing classification. These findings confirm that lexical and structural URL features alone are sufficient for effective phishing detection and highlight MLP as a practical, efficient, and reliable model for deployment in large-scale, real-time cybersecurity environments.
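The four metrics named in the abstract all derive from the cells of a binary confusion matrix, which is why confusion matrix verification can confirm a claim such as "zero false negatives": recall reaches 100% exactly when the false-negative cell is zero. The following is a minimal sketch of that arithmetic in Python, not the authors' evaluation code. The class name ConfusionMatrix and the counts used below are hypothetical and chosen only for illustration; they are not taken from the paper or from the PhiUSIIL dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Binary confusion matrix with phishing treated as the positive class."""

    tp: int  # phishing URLs correctly flagged as phishing
    fp: int  # legitimate URLs wrongly flagged as phishing
    tn: int  # legitimate URLs correctly passed
    fn: int  # phishing URLs missed (the critical error for a detector)

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total if total else 0.0

    def precision(self) -> float:
        flagged = self.tp + self.fp
        return self.tp / flagged if flagged else 0.0

    def recall(self) -> float:
        actual_phishing = self.tp + self.fn
        return self.tp / actual_phishing if actual_phishing else 0.0

    def f1(self) -> float:
        p, r = self.precision(), self.recall()
        return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for illustration only (NOT the paper's results).
cm = ConfusionMatrix(tp=5000, fp=2, tn=4998, fn=0)

print(f"accuracy:  {cm.accuracy():.4%}")
print(f"precision: {cm.precision():.4%}")
print(f"recall:    {cm.recall():.4%}")
print(f"f1-score:  {cm.f1():.4%}")
```

With no false negatives, every remaining error is a false positive, so precision and accuracy fall slightly below 100% while recall stays at 100%. This is the same error pattern the abstract reports for the MLP classifier.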
