ML-Based Intrusion Detection System

Executive summary

I built an ML-based Intrusion Detection System on the CIC-IDS2017 dataset (200K+ records), training KNN, Naive Bayes, and Decision Tree classifiers to detect DDoS, Botnet, and Heartbleed attacks at 99.99% accuracy while minimizing false positives through feature selection.

The problem

Modern network attacks must be detected accurately across large, imbalanced traffic data.
False positives erode analyst trust and bury real threats.
Multiple attack classes (DDoS, Botnet, Heartbleed) require robust modeling.

The solution

Analyzed the CIC-IDS2017 dataset of 200K+ network traffic records.
Trained KNN, Naive Bayes, and Decision Tree classifiers for detection.
Optimized feature selection to minimize false positives.
Evaluated models against DDoS, Botnet, and Heartbleed attack classes.
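
To make the classification step concrete, here is a minimal sketch of how a KNN classifier labels a flow by majority vote among its nearest neighbors. It is an illustration only, not the project's code: the function name `knn_predict`, the two features, and the toy flows are all assumptions, and a real build would more likely rely on a library implementation.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Order the training points by Euclidean distance to the query.
    neighbors = sorted(
        zip(train_X, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Toy flows as (packets per second, mean packet size); values are invented.
train_X = [(10, 500), (12, 480), (11, 510), (900, 60), (950, 64), (880, 58)]
train_y = ["BENIGN", "BENIGN", "BENIGN", "DDoS", "DDoS", "DDoS"]

prediction = knn_predict(train_X, train_y, query=(920, 62), k=3)
print(prediction)
```

Because KNN compares raw distances, features on very different scales would normally be standardized first; the sketch skips that step for brevity.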

Technical architecture

How the system fits together: each layer reflects the technology used in the real build.

Data

Network traffic dataset

CIC-IDS2017

Modeling

Attack classification

KNN · Naive Bayes · Decision Tree

Optimization

Feature selection & tuning

Python

Engineering challenges

Minimizing false positives

Feature selection was tuned to keep precision high without sacrificing detection coverage.
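
The trade-off between precision and detection coverage can be measured directly from a model's predictions. The sketch below is a hedged illustration: the helper names and the ten toy labels are assumptions, and all attack classes are collapsed into one hypothetical "ATTACK" label to keep the arithmetic readable.

```python
def binary_counts(y_true, y_pred, positive="ATTACK"):
    """Return (tp, fp, tn, fn) treating `positive` as the alert class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # real attack, correctly flagged
            else:
                fp += 1  # benign flow raised as an alert
        elif truth == positive:
            fn += 1  # real attack that slipped through
        else:
            tn += 1  # benign flow, correctly ignored
    return tp, fp, tn, fn


def safe_ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


# Invented toy labels: 8 benign flows and 2 attacks.
y_true = ["BENIGN"] * 8 + ["ATTACK"] * 2
# One benign flow is wrongly flagged; both attacks are caught.
y_pred = ["BENIGN"] * 7 + ["ATTACK"] + ["ATTACK"] * 2

tp, fp, tn, fn = binary_counts(y_true, y_pred)
precision = safe_ratio(tp, tp + fp)            # share of alerts that are real
recall = safe_ratio(tp, tp + fn)               # share of attacks detected
false_positive_rate = safe_ratio(fp, fp + tn)  # share of benign flows flagged
print(precision, recall, false_positive_rate)
```

Tracking precision and recall together shows whether a change in the feature set lowers false alarms at the cost of missed attacks.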

Multi-class detection

DDoS, Botnet, and Heartbleed each presented distinct signatures requiring robust models.

Scale of data

Training across 200K+ records demanded efficient preprocessing and evaluation.
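
One way to keep preprocessing efficient at this scale is to stream rows and discard unusable ones before they reach the model. The sketch below is an assumption-laden example, not the project's pipeline: it assumes the data arrives as CSV with a `Label` column, that some headers carry stray whitespace, and that rate columns may hold non-finite values such as `Infinity` or `NaN`. The column names in the sample are illustrative.

```python
import csv
import io
import math


def clean_rows(lines, label_column="Label"):
    """Yield (features, label) pairs, skipping rows with unusable values."""
    for raw in csv.DictReader(lines):
        # Normalize header names that carry stray whitespace.
        row = {name.strip(): value for name, value in raw.items()}
        label = row.pop(label_column).strip()
        try:
            features = {name: float(value) for name, value in row.items()}
        except ValueError:
            continue  # a cell was empty or not numeric
        # float() accepts "Infinity" and "NaN", so filter those explicitly.
        if all(math.isfinite(value) for value in features.values()):
            yield features, label


# Invented sample rows standing in for a real traffic file.
sample = io.StringIO(
    "Flow Duration, Flow Bytes/s, Label\n"
    "1200,3500.5,BENIGN\n"
    "0,Infinity,DDoS\n"
    "800,NaN,BENIGN\n"
    "95,120000.0,DDoS\n"
)

cleaned = list(clean_rows(sample))
print(len(cleaned), [label for _, label in cleaned])
```

Because `clean_rows` is a generator, it holds one row in memory at a time; the `list(...)` call here is only for the demonstration.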

Performance outcomes

99.99%

Detection accuracy

Across the evaluated attack classes.

200K+

Records analyzed

From the CIC-IDS2017 dataset.

DDoS · Botnet · Heartbleed

Attack classes

Detected by the trained models.

Minimized

False positives

Through optimized feature selection.

Technology stack

Python · KNN · Naive Bayes · Decision Tree · CIC-IDS2017

Key learnings

Feature selection often matters more than model choice for false-positive control.
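
As a rough illustration of filter-style feature selection, the sketch below scores each feature by how far apart the two class means sit, measured in units of that feature's overall standard deviation. This scoring rule is a stand-in chosen for simplicity, not necessarily the method used in the project, and the feature names, labels, and values are invented.

```python
from statistics import mean, pstdev


def separation_scores(rows, labels, positive="ATTACK"):
    """Score features by class-mean gap divided by overall spread.

    Assumes both classes appear at least once in `labels`.
    """
    scores = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        spread = pstdev(values)
        if spread == 0:
            scores[name] = 0.0  # a constant feature carries no signal
            continue
        pos = [v for v, y in zip(values, labels) if y == positive]
        neg = [v for v, y in zip(values, labels) if y != positive]
        scores[name] = abs(mean(pos) - mean(neg)) / spread
    return scores


def top_k_features(rows, labels, k):
    """Return the names of the k highest-scoring features."""
    scores = separation_scores(rows, labels)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Invented toy flows: only pkt_rate actually separates the classes.
rows = [
    {"pkt_rate": 10, "ttl": 64, "flag": 1},
    {"pkt_rate": 12, "ttl": 128, "flag": 1},
    {"pkt_rate": 900, "ttl": 64, "flag": 1},
    {"pkt_rate": 950, "ttl": 128, "flag": 1},
]
labels = ["BENIGN", "BENIGN", "ATTACK", "ATTACK"]

scores = separation_scores(rows, labels)
selected = top_k_features(rows, labels, k=1)
print(selected)
```

Dropping features that score near zero removes noise that any of the three classifiers could otherwise latch onto, which is one plausible route to fewer false alarms.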

Real intrusion datasets are imbalanced, so evaluation has to account for it.
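
A small worked example shows why overall accuracy can mislead on imbalanced traffic. The class counts below are invented for illustration, and the "model" is a deliberately useless one that labels every flow benign.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of flows whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """Recall for each true class: correct predictions / class size."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Invented imbalance: 995 benign flows and 5 attack flows.
y_true = ["BENIGN"] * 995 + ["Heartbleed"] * 5
# A degenerate model that never raises an alert.
y_pred = ["BENIGN"] * 1000

overall = accuracy(y_true, y_pred)
recalls = per_class_recall(y_true, y_pred)
balanced = sum(recalls.values()) / len(recalls)  # mean of per-class recalls
print(overall, recalls, balanced)
```

The degenerate model scores 99.5% accuracy while detecting none of the attacks, so per-class recall, or its mean (balanced accuracy), gives a more honest picture than a single accuracy figure.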

ML detection complements, rather than replaces, signature- and rule-based defenses.