Washington D.C. October 26-27, 2016

DCS16122 - Fighting Malware with Machine Learning

Edward Raff (Lead Scientist, Booz Allen Hamilton)
Edward Raff is a Computer Scientist at Booz Allen Hamilton, specializing in machine learning problems and solutions. As the author of the JSAT library, Edward has extensive experience implementing all manner of algorithms. In particular, he has worked on problems involving bioinformatics, signal classification, sentiment analysis, real-time object tracking, and change detection. He currently works at the Laboratory for Physical Sciences, researching new methods of applying deep learning to cybersecurity, in particular malware classification and analysis. Edward holds Bachelor's and Master's degrees from Purdue University and is working on a Ph.D. at the University of Maryland, Baltimore County.
Jared Sylvester (Senior Consultant, Booz Allen Hamilton)
Jared Sylvester joined Booz Allen Hamilton in 2014 as a member of the Strategic Innovation Group, where he has been doing machine learning research focused on cybersecurity applications at the Laboratory for Physical Sciences. Before that, he earned his doctorate in AI at the University of Maryland, where he worked in both the Computer Science Department, doing neural network cognitive modeling, and the Marketing Department, doing social network analytics. He lives in Rockville, Maryland with his wife, infant son, and terrier, and enjoys animation, calligraphy, bread baking, and archery.
We'll talk about some of our initial work applying neural networks to the task of malware classification. This is a particularly challenging problem, with data labeling issues and a type of data representation far different from those behind the current successes in deep learning. We'll present our current results on a subset of the problem and show how a neural network improved upon a classical tree-based approach while retaining many of its benefits. Using an attention-based LSTM, we show that what the networks learned agrees with the tree-based approach. We'll also discuss some of our future plans for deep learning in this space as we work toward processing a binary to determine its maliciousness.

Session Level: Intermediate technical
Session Type: Talk
Tags: Federal; HPC

Day: Wednesday, 10/26
Time: 14:30 - 14:55
Location: Polaris