

A Deep Learning Approach for IP Hijack Detection Based on ASN Embedding

TAL SHAPIRA & YUVAL SHAVITT

8/18/20


School of Electrical Engineering

NetAI 2020


Intro - Autonomous Systems

• Autonomous System (AS): a collection of physical networks glued together using IP that has a unified administrative routing policy and has been assigned a number (ASN, 32 bits).
  ◦ ISP internal networks: Verizon – 701, 702, 703 …, Level 3 – 3356, 3549 …
  ◦ Campus networks: University of Delaware – 2, MIT – 3
  ◦ Corporate networks: Intel – 4983
  ◦ Content providers: Google – 15169, 16591 …, Facebook – 32934, 63293 …
• Border Gateway Protocol (BGP) coordinates inter-AS routing in the Internet
  ◦ BGP update messages list the entire AS path to reach an IP address prefix (AP)
  ◦ Policy-based routing protocol


Intro – IP Hijack

• Prefix hijacking in a nutshell: another AS originates the prefix
• More than 40% of network operators reported that their organization had been the victim of a hijack in the past
• What’s to stop someone else?
  ◦ BGP does not verify that the AS is authorized
  ◦ Registries of prefix ownership are inaccurate
• How to?
  ◦ Sub-prefix hijack (e.g., 1.1.1.0/24 instead of 1.1.0.0/22)
  ◦ Path shortening (BGP may choose a path based on cost and length)
  ◦ Add a legitimate AS at the end of the path (which makes it hard to tell that the AS path is bogus)

Example – April 2010, China Telecom
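A sub-prefix hijack works because routers forward on the longest matching prefix, whoever announced it. The following is a minimal sketch of that selection rule, not code from the talk. The ASNs (64500, 64501), the prefixes, and the helper name `best_route` are illustrative assumptions.

```python
import ipaddress

def best_route(destination, routes):
    """Return the (prefix, origin_asn) pair chosen by longest-prefix match.

    `routes` is a list of (ip_network, origin_asn) tuples. Returns None
    when no prefix covers the destination.
    """
    addr = ipaddress.ip_address(destination)
    matching = [(net, origin) for net, origin in routes if addr in net]
    # The most specific (longest) prefix wins, regardless of who announced it.
    return max(matching, key=lambda item: item[0].prefixlen, default=None)

# Hypothetical table: AS 64500 legitimately originates the /22,
# while AS 64501 announces a more specific /24 inside it.
routes = [
    (ipaddress.ip_network("1.1.0.0/22"), 64500),
    (ipaddress.ip_network("1.1.1.0/24"), 64501),
]

hijacked = best_route("1.1.1.1", routes)    # covered by both prefixes
unaffected = best_route("1.1.2.1", routes)  # covered only by the /22
```

In this toy table, traffic to addresses inside the /24 follows the more specific announcement, while the rest of the /22 still reaches the legitimate origin.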


Previous Approaches

• Prevention solutions (or reactive solutions):
  ◦ Based on cryptographic authentication – RPKI [1] and BGPsec [2]
  ◦ Operators are reluctant to deploy them due to technical and financial costs
• Detection solutions, by the type of information used:
  ◦ Control-plane approaches [3] (passive solutions) – based on a distributed set of BGP monitors and route collectors
  ◦ Data-plane approaches [4] – rely only on real-time data-plane information obtained from multiple sensors that perform active probing (pings/traceroutes)
  ◦ Hybrid approaches [5]
• Most previous detection solutions rely on:
  ◦ Feature engineering + ML algorithm [6]
  ◦ Heuristic assumptions (e.g., valley-free routing, VF)


1. [Geoff Huston and Randy Bush 2011] “Securing BGP and SIDR” (RPKI)
2. [Matt Lepinski and K Sriram 2017] “BGPSEC protocol specification”
3. [Sermpezis et al. 2018] “ARTEMIS: Neutralizing BGP Hijacking within a Minute”
4. [Zhang et al. 2008] “iSPY: detecting IP prefix hijacking on my own”
5. [Schlamp et al. 2016] “HEAP: reliable assessment of BGP hijacking attacks”
6. [Fontugne et al. 2019] “BGP hijacking classification”


‘Valley Free’ Routing

• Routing rules:
  ◦ Provider accepts everything
  ◦ Peer accepts only if it is for its customers
• Path properties:
  ◦ Up then down
  ◦ No up-down-up
  ◦ At most 1 P2P step (and only at the top)


[Gao 2001] “On inferring autonomous system relationships in the Internet”
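The path properties can be checked mechanically once each hop of an AS path is annotated with its relationship type. This is a minimal sketch under that assumption. The labels `"c2p"` (customer to provider, up), `"p2p"` (peer to peer), and `"p2c"` (provider to customer, down), and the function name `is_valley_free`, are illustrative and not taken from the talk.

```python
def is_valley_free(steps):
    """Check a sequence of hop relationships against the valley-free rules.

    A valid path climbs (zero or more "c2p" steps), optionally crosses
    at most one "p2p" link at the top, then only descends ("p2c").
    """
    past_the_top = False  # becomes True after a peer step or any downhill step
    for step in steps:
        if step == "c2p":
            if past_the_top:
                return False  # going up again after the top: up-down-up
        elif step == "p2p":
            if past_the_top:
                return False  # second peer step, or a peer step after descending
            past_the_top = True
        elif step == "p2c":
            past_the_top = True
        else:
            raise ValueError(f"unknown relationship label: {step!r}")
    return True

valid = is_valley_free(["c2p", "c2p", "p2p", "p2c"])  # up, up, across, down
valley = is_valley_free(["c2p", "p2c", "c2p"])        # up, down, up
```

A detector built on this rule depends on the relationship labels being correct, which is the kind of heuristic assumption the embedding-based approach aims to avoid.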

[Figure: a valid path and an invalid path, with nodes labeled A, B, C, and D]

Motivation

• Using an assumption-free method for IP hijack detection
  ◦ Our method is based only on BGP announcements (or AS-level routes)
  ◦ We introduce the first end-to-end deep learning approach
• Our goal is to use a generic approach based on ASN embedding
  ◦ We aim to learn dense representations of ASNs from BGP routes
  ◦ Apply machine/deep learning techniques based on the representations


Method

• An end-to-end deep learning approach:
  ◦ First stage – BGP2Vec – ASN embedding
  ◦ Second stage – IP hijack detection using LSTM networks


Datasets

• Route Views [1] BGP announcements (RV), collected in March 2018
  ◦ 3,600,000 BGP paths
  ◦ 62,525 ASes
  ◦ 113,400 undirected AS links
• Labeled BGP routes:
  ◦ Approximately 2,648,900 standard routes (’GREEN’) and 47,800 hijacked routes (’RED’)
  ◦ The labeling was generated by a combination of VF algorithms [2] and manual work
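Counts like the number of ASes and undirected AS links can be derived from a set of AS paths. The sketch below shows one way to do that. It assumes each path is a list of integer ASNs and that consecutive repeats are AS-path prepending. The function name `as_graph_stats` and the sample paths are hypothetical, and this is not the authors' preprocessing code.

```python
def as_graph_stats(paths):
    """Return (number of distinct ASes, number of undirected AS links)."""
    ases = set()
    links = set()
    for path in paths:
        ases.update(path)
        for a, b in zip(path, path[1:]):
            if a != b:  # skip repeated ASNs (assumed to be path prepending)
                # frozenset makes the link undirected: {a, b} == {b, a}
                links.add(frozenset((a, b)))
    return len(ases), len(links)

# Hypothetical toy input using documentation ASNs.
sample_paths = [
    [64496, 64497, 64498],
    [64498, 64497, 64499],
    [64496, 64496, 64497],  # prepended path
]
num_ases, num_links = as_graph_stats(sample_paths)
```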


1. University of Oregon, Route Views Project, http://www.routeviews.org, March 2018
2. [Shavitt et al. 2009] “Near-Deterministic Inference of AS Relationships” (ND-ToR)


BGP2Vec [1] – ASN Embedding

• Based on word embedding (Word2Vec [2]), broadly used in NLP
• Embedding = representing discrete variables as continuous vectors
• An ASN is characterized by its context, i.e., neighboring ASNs
• V = 62,525 (number of ASNs), N = 32 (embedding dimension)

[Figure: an example with V = 4]
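To make the idea of "context" concrete, the sketch below builds Word2Vec-style (target, context) training pairs from AS paths, using a toy vocabulary of V = 4 ASNs. It is an illustration, not the BGP2Vec implementation. The skip-gram formulation, the window size, the ASNs, and the helper names are all assumptions.

```python
def skipgram_pairs(path, window=1):
    """List (target, context) ASN pairs within `window` hops on one path."""
    pairs = []
    for i, target in enumerate(path):
        lo = max(0, i - window)
        hi = min(len(path), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, path[j]))
    return pairs

# Hypothetical AS paths; together they contain V = 4 distinct ASNs.
paths = [[64496, 64497, 64498], [64496, 64497, 64499]]
vocab = {asn: idx for idx, asn in
         enumerate(sorted({asn for path in paths for asn in path}))}

def one_hot(asn):
    """V-dimensional one-hot input vector for an ASN."""
    vec = [0] * len(vocab)
    vec[vocab[asn]] = 1
    return vec

pairs = skipgram_pairs(paths[0], window=1)
```

In a Word2Vec-style model, each one-hot vector of length V selects one row of a V × N weight matrix, and that row is the ASN's N-dimensional embedding.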


1. [Shapira and Shavitt 2020] “Unveiling the Type of Relationship Between Autonomous Systems Using Deep Learning” (BGP2Vec)
2. [Mikolov et al. 2013] “Distributed representations of words and phrases and their compositionality” (Word2Vec)


Exploration of ASN Embeddings


Training Neural Networks (IP Hijack)

• An LSTM neural network composed of five layers
• Categorical cross-entropy loss function
• Using the Adam gradient-based optimizer with default hyper-parameters
• We build and run our network using [framework logo on the original slide]
• We train our network on our labeled dataset
• We use 20% of the samples as a test set
• We run our network for 10 epochs
• Inference time: 0.1 milliseconds on a single Intel CPU
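For reference, the categorical cross-entropy loss over the two route classes can be written out directly. This is a minimal pure-Python sketch of the formula, not the training code from the talk. The class order (GREEN first, RED second) and the example logits are assumptions.

```python
import math

def softmax(logits):
    """Turn raw network outputs into class probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot_label, probs, eps=1e-12):
    """Loss = -sum_k y_k * log(p_k), with probabilities clipped at eps."""
    return -sum(y * math.log(max(p, eps))
                for y, p in zip(one_hot_label, probs))

# Assumed class order: index 0 = GREEN (standard), index 1 = RED (hijacked).
GREEN, RED = [1.0, 0.0], [0.0, 1.0]

probs = softmax([2.0, 0.0])  # hypothetical logits favoring GREEN
loss_if_green = categorical_cross_entropy(GREEN, probs)
loss_if_red = categorical_cross_entropy(RED, probs)
```

The loss is small when the predicted distribution puts most of its mass on the true class and grows as that probability approaches zero.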


Experiments and Results – IP Hijack

• IP hijack detection based on ASN embedding
  ◦ 99.99% accuracy, 0.00% false alarms (FA)
  ◦ For 50% of our misclassified predictions, the label rather than the prediction was wrong, i.e., we found errors in the labeled dataset
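The two headline metrics can be computed from true and predicted route labels as sketched below. The sketch assumes that FA means the false-alarm rate, i.e., the fraction of standard (GREEN) routes flagged as hijacked (RED). The slides do not spell out the definition, and the function name and sample labels are hypothetical.

```python
def accuracy_and_false_alarm(y_true, y_pred, positive="RED"):
    """Return (accuracy, false-alarm rate) for binary route labels."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1  # standard route flagged as a hijack: a false alarm
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    false_alarm = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, false_alarm

# Hypothetical toy labels: one GREEN route is wrongly flagged as RED.
y_true = ["GREEN", "GREEN", "GREEN", "RED", "RED"]
y_pred = ["GREEN", "GREEN", "RED", "RED", "RED"]
acc, fa = accuracy_and_false_alarm(y_true, y_pred)
```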


Results on BGP data for Ground Truth Events

• The dataset [1] contains 70 events from February 2008 to July 2018
  ◦ with an average of 669 AS paths per event
• We correctly classified all the events within 2 years of our training data, or 2/3 of all the valid events


1. [Cho et al. 2019] “BGP hijacking classification”, TMA


Summary

• A novel approach for ASN embedding using deep learning (BGP2Vec)
  ◦ Unsupervised method
  ◦ Based only on BGP announcements, without any side information
  ◦ A building block for many problems
• Achieves excellent results for IP hijack detection
  ◦ Without any assumptions (no ‘VF’)
  ◦ 99.99% accuracy with 0.00% FA on our own proprietary dataset
  ◦ Although our method was trained with a dataset from March 2018, we correctly classified 2/3 of past events and all recent hijack events
• As far as we know, we are the first to employ deep learning for this problem.


Thank you for your attention. Questions?


https://www.eng.tau.ac.il/~shavitt/
https://talshapira.github.io/
