POST ANALYSIS OF SNORT INTRUSION FILES USING DATA MINING TECHNIQUES: DECISION TREE AND BAYESIAN NETWORK
Abstract
Network security is a crucial information technology activity today, and Intrusion Detection Systems (IDS) are among the fastest growing technologies in the computer security domain. These systems are designed to identify or prevent hostile intrusions into a network. Most conventional intrusion detection systems have limitations in the way they log their alerts. The limitation that Snort exhibits is known as the infidelity issue: the Snort IDS does not infer the behavior of the network traffic that is generated, which can result in misinterpretation. In this project, data mining techniques were therefore applied to the logged alerts in order to extract hidden knowledge about the traffic pattern. This research investigates the network domain of data mining, using the alerts generated by the Snort intrusion detection system and mining them for re-classification. The data comprised nine hundred and sixty (960) alert records. A classification task was used to evaluate the alerts with two methods, Bayesian Network and Decision Tree, and their outputs were compared to determine which gives the better classification results. At the modeling stage, the open source software WEKA 3.6.13 was used. The data set was divided into two sets, training and testing: sixty-eight percent (68%) was used for training and thirty-two percent (32%) for testing. The results show that Decision Tree outperformed Bayesian Network in most aspects, and that Snort combined with data mining is more reliable and efficient than Snort alone. Decision Tree performed better than Bayesian Network in terms of the number of correctly classified instances and also in terms of Root Mean Squared Error, Root Relative Squared Error, Mean Absolute Error, and Relative Absolute Error.
Bayesian Network outperformed Decision Tree in the time taken to build the model but performed poorly at classification. The time taken to build the naïve Bayes and decision tree classifiers was 0.12 and 0.32 seconds, respectively.
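For readers unfamiliar with the four error measures named in this abstract, the following minimal Python sketch shows their textbook definitions for numeric predictions. It is an illustration only, not the code used in this study: the function name `error_metrics` and the sample values are hypothetical. WEKA computes these measures over predicted class probabilities for nominal classes, so its reported figures need not match this simplified version.

```python
import math

def error_metrics(actual, predicted):
    """Return MAE, RMSE, RAE and RRSE for paired numeric values.

    RAE and RRSE compare the model's error with that of a trivial
    predictor that always outputs the mean of the actual values.
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    # Total absolute and squared error of the model.
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    # Total absolute and squared error of the mean predictor (baseline).
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "mae": abs_err / n,
        "rmse": math.sqrt(sq_err / n),
        "rae": abs_err / base_abs,
        "rrse": math.sqrt(sq_err / base_sq),
    }

# Hypothetical values: 1.0 marks a true attack alert, 0.0 a benign one,
# and each prediction is a classifier's estimated attack probability.
actual = [1.0, 0.0, 1.0, 0.0]
predicted = [0.9, 0.2, 0.6, 0.1]
metrics = error_metrics(actual, predicted)
```

Lower values are better for all four measures. RAE and RRSE are usually reported as percentages, and a value below 100% means the classifier does better than always predicting the mean.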