StreamingPhish

This utility uses supervised machine learning to detect phishing domains in the Certificate Transparency log network. The firehose of domain names and SSL certificates is made available by the certstream network (certstream.calidog.io). All of the data required to train the initial predictive model is included in this project.

Also included is a Jupyter notebook to help explain each step of the supervised machine learning lifecycle (as it pertains to this project).
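To make the idea of training a classifier on domain names concrete, here is a minimal sketch. It is not the project's actual feature extraction or model: the keyword list, the feature set, the function names, and the choice of logistic regression are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical keyword list for illustration; not the project's actual list.
SUSPICIOUS_WORDS = ("login", "verify", "secure", "account", "update")


def shannon_entropy(text):
    """Return the Shannon entropy (bits per character) of a string."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def extract_features(domain):
    """Turn a domain name into a fixed-length numeric feature vector."""
    domain = domain.lower().strip(".")
    return [
        len(domain),                                       # total length
        domain.count("."),                                 # number of dots
        domain.count("-"),                                 # number of hyphens
        sum(ch.isdigit() for ch in domain),                # digit count
        shannon_entropy(domain),                           # character entropy
        sum(word in domain for word in SUSPICIOUS_WORDS),  # keyword hits
    ]


def train_classifier(domains, labels):
    """Fit a logistic regression model on labeled domains (1 = phishing)."""
    # Imported lazily so the feature code above runs without scikit-learn.
    from sklearn.linear_model import LogisticRegression

    X = [extract_features(d) for d in domains]
    model = LogisticRegression(max_iter=1000)
    model.fit(X, labels)
    return model
```

With a fitted model, `model.predict_proba([extract_features("example.com")])` would return the class probabilities for a single domain. Hand-built lexical features like these are only one possible starting point; the notebook in this project is the reference for the features actually used.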

Overview

This application consists of three main components:

Jupyter notebook
- Demonstrates how to train a phishing classifier from start to finish.
CLI utility
- Trains classifiers and evaluates domains in manual mode or against the Certificate Transparency log network (via certstream).
Database
- Stores trained classifiers, performance metrics, and code for feature extraction.

Each component runs in its own Docker container. The application is designed to be built and operated via Docker Compose.
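The CLI utility's streaming mode, which evaluates domains as they arrive from certstream, can be pictured roughly as follows. This is a sketch under stated assumptions, not the CLI's actual code: `extract_domains`, `make_callback`, `score_domain`, and the threshold are hypothetical names, and the websocket URL and message layout are those of the public certstream Python library as I understand them.

```python
def extract_domains(message):
    """Return the domains listed in a certstream certificate_update message."""
    if message.get("message_type") != "certificate_update":
        return []
    leaf_cert = message.get("data", {}).get("leaf_cert", {})
    return list(leaf_cert.get("all_domains", []))


def print_hit(domain, score):
    """Default reporter: print a flagged domain and its score."""
    print(f"[!] {domain} (score={score:.2f})")


def make_callback(score_domain, threshold=0.5, report=print_hit):
    """Build a certstream callback that reports domains scoring >= threshold.

    score_domain is a hypothetical function mapping a domain name to a
    phishing probability, e.g. a wrapper around a trained classifier.
    """
    def callback(message, context):
        for domain in extract_domains(message):
            score = score_domain(domain)
            if score >= threshold:
                report(domain, score)
    return callback


def main():
    # Imported lazily; requires the certstream package (pip install certstream).
    import certstream

    # Placeholder scorer for illustration only; a real run would use a model.
    callback = make_callback(lambda d: 1.0 if "login" in d else 0.0)
    # Assumed endpoint; blocks and invokes callback for every message received.
    certstream.listen_for_events(callback, url="wss://certstream.calidog.io/")
```

Calling `main()` starts the listener. Keeping the message parsing and scoring separate from the network call means the logic can be exercised with hand-built messages, without a live connection.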

Install and Operational Instructions

Components

Docker - Containers that run the application.
Docker Compose - Fabric for orchestrating containers and their respective services.
Python3 - Programming language.
Scikit-learn - Open source library for training classifiers using Python.

Author

Wes Connell

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for further details.

Resources/Acknowledgments

Certificate Transparency Log Network - Framework that aggregates and streams SSL certificates issued by authorities in near real-time.
x0rz Phishing Catcher - Phishing detection utility that inspired this project.
Calidog Security - Creators of the certstream library.
Phishing Regex Resource - Cherry-picked a few of the phishing words from this list, authored by SwiftOnSecurity.
PhishTank - Helped with identifying brands frequently targeted in phishing attacks.
