Ph.D. Dissertation Defense
Knowledge Graphs and Reinforcement Learning:
a Hybrid Approach for Cybersecurity Tasks
Aritran Piplai
10:00am-12:00pm ET, Monday, July 10, 2023
Dissertation Committee: Drs. Anupam Joshi (Chair/Advisor), Charles Nicholas, Chenchen Liu, Filip Perich, Joysula R. Rao, Sudip Mittal, Tim Finin
With the explosion of available data and computational power, machine learning and deep learning techniques are increasingly used to solve problems. The domain of cybersecurity is no exception: a significant amount of recent research applies data-driven machine learning approaches to a range of security tasks.
Rule-based and supervised machine learning-based approaches are often brittle in detecting attacks, can be defeated by adversaries that adapt, and cannot draw on the knowledge of experts. To address these problems, we propose a novel approach for cybersecurity tasks that combines 'explicit' knowledge expressed as Knowledge Graphs with data-driven machine learning approaches that capture 'tacit' knowledge. The inspiration for this hybrid model is drawn from the way security analysts work, combining their background knowledge with observed data from host and network-based sensors.
We focus on the tasks of malware detection and mitigation policy generation. We are specifically interested in the capability to synthesize variants of a piece of malware, which can be further enhanced to detect unseen attacks. We combine explicit knowledge with exploration of the 'action space' for detecting and mitigating novel malware attacks. Analysts draw on past experience and on actions they took when faced with attacks whose characteristics resemble those of the new one. The other component of their approach is exploring new 'actions' that might be taken to respond to novel malware, guided by their understanding of the domain. These steps can be viewed as a 'trial and error' approach. In this dissertation, we describe techniques for constructing knowledge graphs from open-source text, as well as several Reinforcement Learning (RL) algorithms, guided by explicit background knowledge encoded in a knowledge graph, that best mimic the approach of security analysts. We observe that the efficiency of RL algorithms increases by ~4% when prior knowledge is incorporated. We also observe that, in simulated environments, RL policies generate more precise mitigations with the help of prior knowledge.