Journal of Data Acquisition and Processing

1 Jan 2023, Volume 38 Issue 1

Article

1. MACHINE LEARNING MODEL FOR IDENTIFYING PHISHING WEBSITES
Uday Bhaskar Penta, Dr Panda B S, Dr Sasanko Sekhar Gantayat
Journal of Data Acquisition and Processing, 2023, 38 (1): 2455-2468.

Abstract

Advances in cloud and internet technology have recently driven a major expansion in electronic commerce, in which consumers shop and pay online. This expansion also gives unauthorized users opportunities to access private data and cause financial harm to businesses. Phishing is a well-known attack that misleads people into accessing dangerous content and disclosing their personal information. Many phishing sites are indistinguishable from legitimate ones, both visually and in terms of their uniform resource locator (URL). Many methods, including blacklists and heuristics, have been proposed for identifying phishing sites. Yet the number of victims continues to grow rapidly because existing security systems are ineffective, and studies to date report that anti-phishing technology performs poorly. Users need an effective way to protect themselves from cybercriminals. In this research, we use the machine learning (ML) methods K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Naive Bayes (NB) to identify phishing websites automatically. Data for the study come from PhishTank, and essential attributes are extracted with feature extraction (FE) methods. Two FE approaches are used: URL-based and hyperlink-based. The output of each FE approach is fed to the ML models, and the results are validated using evaluation metrics. These metrics identify the best combination of FE approach and ML model for phishing website detection.
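
As a rough illustration of the URL-based pipeline described in the abstract, the sketch below turns a URL into a small numeric feature vector and classifies it with a plain k-nearest-neighbor vote. It is a minimal sketch under stated assumptions, not the authors' implementation: the six features, the function names `url_features` and `knn_predict`, and the sample URL are all hypothetical and chosen only for illustration. The abstract does not list the features actually used.

```python
import re
from collections import Counter
from math import dist
from urllib.parse import urlparse


def url_features(url):
    """Map a URL to a numeric feature vector.

    The features are illustrative assumptions, not those of the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return [
        float(len(url)),                  # total URL length
        float(host.count(".")),           # number of dots in the host
        float("@" in url),                # '@' can hide the real destination
        float("-" in host),               # hyphen in the host name
        float(bool(re.fullmatch(r"[\d.]+", host))),  # host is only digits and dots
        float(parsed.scheme == "https"),  # HTTPS scheme is used
    ]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    `train` is a list of (feature_vector, label) pairs. Distances are plain
    Euclidean, so unscaled features such as URL length dominate; a real
    pipeline would normally scale the features first.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical URL, used only to show the shape of the feature vector.
    print(url_features("http://192.168.0.1/login"))
```

In a study like the one described, vectors of this kind would be computed for labeled URLs (for example, entries from PhishTank plus known legitimate sites), and the same vectors would be passed to each classifier so that their metrics can be compared on equal terms.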

Keywords

Websites, Features, Phishing, URL, Support Vector Machine, Accuracy, Bar graph.
