Catching Transparent Phish: Analyzing and Detecting MITM Phishing Toolkits
ACM SIGSAC Conference on Computer and Communications Security (CCS) 2021
Abstract
For over a decade, phishing toolkits have been helping attackers automate and streamline their phishing campaigns. Man-in-the-Middle (MITM) phishing toolkits are the latest evolution in this space, where toolkits act as malicious reverse proxy servers of online services, mirroring live content to users while extracting credentials and session cookies in transit. These tools further reduce the work required by attackers, automate the harvesting of 2FA-authenticated sessions, and substantially increase the believability of phishing web pages.
In this paper, we present the first analysis of MITM phishing toolkits used in the wild. By analyzing and experimenting with these toolkits, we identify intrinsic network-level properties that can be used to identify them. Based on these properties, we develop a machine learning classifier that identifies the presence of such toolkits in online communications with 99.9% accuracy.
We conduct a large-scale longitudinal study of MITM phishing toolkits by creating a data-collection framework that monitors and crawls suspicious URLs from public sources. Using this infrastructure, we capture data on 1,220 MITM phishing websites over the course of a year. We discover that MITM phishing toolkits occupy a blind spot in phishing blocklists, with only 43.7% of domains and 18.9% of IP addresses associated with MITM phishing toolkits present on blocklists, leaving unsuspecting users vulnerable to these attacks. Our results show that our detection scheme is resilient to the cloaking mechanisms incorporated by these tools, and is able to detect previously hidden phishing content. Finally, we propose methods that online services can utilize to fingerprint requests originating from these toolkits and stop phishing attempts as they occur.
What are MITM Phishing Toolkits?
MITM phishing toolkits are the state of the art in phishing attacks today. They function as reverse proxy servers, brokering communication between victim users and target web servers, all while harvesting sensitive information from the network data in transit. This design makes the phishing attack more believable, since the returned web pages are served live by the target web server and are thus indistinguishable from the legitimate site to the victim. Additionally, unlike traditional phishing attacks, where believable behavior ceases after the desired information (e.g., credentials and credit card numbers) is acquired, these toolkits keep the victim’s browsing session alive after authentication is complete. This means users can continue to browse the target website with their authenticated session through the phishing server. This puts the victim at ease and lengthens the time during which the session cookie remains valid, giving the attacker more time to carry out malicious actions. In this paper, we study three popular MITM phishing toolkits: Evilginx, Muraena, and Modlishka.
How we Detect Them
The network architecture of MITM phishing toolkits significantly decreases the effectiveness of content-based phishing detection techniques. Additionally, because attackers control all application-layer content, they can easily filter out JavaScript-based defenses. Therefore, we aim to fingerprint the web server rather than the web content. To achieve this, we focus on two categories of network-level features to distinguish MITM phishing toolkits from benign web servers: network timing features and TLS fingerprinting features. Using these two feature groups, we can reliably detect the presence of MITM phishing toolkits in online communications. Additionally, as these features are fundamental to the architecture of MITM phishing toolkits, attackers cannot trivially bypass our fingerprinting. For more in-depth information on our fingerprinting techniques, please see our paper.
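As a rough illustration of what network timing and TLS features could look like in practice, the following Python sketch times three stages of a single HTTPS connection and records the negotiated TLS parameters. It is a minimal sketch under stated assumptions, not PHOCA's actual implementation: the function names `timing_ratios` and `probe`, the choice of ratios, and the use of a `HEAD /` request are all hypothetical, and the paper should be consulted for the real feature set. The assumption behind the ratios is that a reverse proxy has to relay application-layer requests to a second server, which may inflate request latency relative to the transport-layer handshake.

```python
import socket
import ssl
import time


def timing_ratios(tcp_rtt, tls_rtt, http_rtt):
    """Turn three raw durations (in seconds) into scale-free ratios.

    Dividing by the TCP connect time is an illustrative choice that removes
    most of the dependence on the client's distance from the server.
    """
    if tcp_rtt <= 0:
        raise ValueError("tcp_rtt must be positive")
    return {
        "tls_over_tcp": tls_rtt / tcp_rtt,
        "http_over_tcp": http_rtt / tcp_rtt,
    }


def probe(host, port=443, timeout=5.0):
    """Measure one connection to host and return a feature dictionary."""
    context = ssl.create_default_context()
    # Certificate validation is disabled because the probe only measures
    # timing and TLS parameters; no credentials or data are sent.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE

    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        tcp_rtt = time.perf_counter() - start

        # wrap_socket performs the TLS handshake before returning.
        start = time.perf_counter()
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls_rtt = time.perf_counter() - start

            request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
            start = time.perf_counter()
            tls.sendall(request.encode("ascii"))
            tls.recv(1)  # wait for the first byte of the response
            http_rtt = time.perf_counter() - start

            features = timing_ratios(tcp_rtt, tls_rtt, http_rtt)
            features["tls_version"] = tls.version()
            features["cipher_suite"] = tls.cipher()[0]
            return features
```

Calling `probe("example.com")` requires network access and returns a dictionary that could be fed to a classifier. A single measurement is noisy, so a real collector would presumably repeat the probe and aggregate the results.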
Using the previously described features, we develop a tool to automatically collect data on and classify MITM phishing toolkits on the web. We call this tool PHOCA, after the Latin word for “seal.” Seals are aquatic mammals known to hunt hidden prey using vibrations generated by their breathing. Much like this hunting technique, PHOCA can detect previously hidden MITM phishing toolkits using features inherent to their nature, as opposed to visual cues. When provided either a URL or a domain name, PHOCA probes the desired web server to collect the previously mentioned network-level features. PHOCA then uses our trained classifier to determine if the web server is a MITM phishing toolkit. Using this tool, new training data can be easily generated for any future MITM phishing toolkit iteration. Additionally, PHOCA can be integrated into existing anti-phishing workflows to fingerprint active threats.
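The workflow described above (accept a URL or domain name, probe the server, classify the result) can be sketched in a few lines of Python. This is only an outline of the control flow under our own assumptions, not PHOCA's code or interface: `target_from_input` and `classify_target` are hypothetical names, and the feature collector and classifier are passed in as plain callables so that any probing routine and any trained model could be plugged in.

```python
from urllib.parse import urlsplit


def target_from_input(value):
    """Accept a URL or a bare domain name and return (host, port)."""
    value = value.strip()
    # urlsplit only recognizes the host when a "//" separator is present,
    # so a bare domain name is prefixed before parsing.
    parts = urlsplit(value if "://" in value else "//" + value)
    if not parts.hostname:
        raise ValueError(f"no hostname found in {value!r}")
    default_port = 80 if parts.scheme == "http" else 443
    return parts.hostname, parts.port or default_port


def classify_target(value, collect_features, classifier):
    """Probe a target and return True if the classifier flags it.

    collect_features(host, port) returns a feature dictionary, and
    classifier(features) returns a truthy value for a suspected
    MITM phishing toolkit. Both are supplied by the caller.
    """
    host, port = target_from_input(value)
    features = collect_features(host, port)
    return bool(classifier(features))
```

In an anti-phishing pipeline, `collect_features` would wrap the network probe and `classifier` would wrap the prediction call of a trained model; the sketch deliberately fixes neither.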
Citation
To cite our work, please use:
@inproceedings{kondracki2021catching,
title={Catching Transparent Phish: Analyzing and Detecting MITM Phishing Toolkits},
author={Kondracki, Brian and Azad, Babak Amin and Starov, Oleksii and Nikiforakis, Nick},
booktitle={ACM Conference on Computer and Communications Security (CCS)},
year={2021}
}