A Network Traffic Adjusting System Based on Software Defined Networks

Chenguang Zhu, Xiang Ren, and Tianran Xu

Course Project of CSC2229 - Software Defined Networking

In home networks, different applications often compete for limited bandwidth. In such scenarios, bandwidth-intensive applications can disrupt the others. For example, a BitTorrent download session can degrade the quality of an important Skype conference call. In such cases, it would be useful for the home router to identify the different types of traffic flows and allocate more bandwidth to the higher-priority applications.

One common way to achieve some level of traffic control is to install simple QoS rules on the router, but doing so offers only limited flexibility - such rules rely on a small set of criteria, such as port numbers, to decide the type of a flow. A malicious application could, for example, change its port number to circumvent bandwidth limitations. As a result, a more comprehensive identification scheme is needed to support accurate QoS rules, such as one that examines the payload of each packet in depth. While such flow identification schemes are difficult to adopt in traditional networks due to their complexity, the flexibility of software-defined networking makes them feasible. One can not only experiment easily with novel identification techniques, but also adjust traffic rules easily.

We propose a software-defined networking controller that identifies flow types and uses traffic adjustment to allocate bandwidth to the different types of application flows accordingly.

Fig. 1. System overview.

Our system consists of a simple network emulated using Mininet and a Floodlight controller. We built a module on the controller that identifies application types. The switch forwards incoming packets to the controller. The controller then uses our flow identification module to determine the application type and, according to that type, pushes different flow rules to the switch.

For topology emulation, the most basic home network topology that we emulated consists of a host, which connects to the internet through an Open vSwitch switch, and a Floodlight controller that communicates with the switch. Under this topology, all traffic between the host and the internet goes through the switch and can be analyzed and categorized by the controller. Mininet would also let us implement more complex topologies that better resemble realistic home networks. Such a topology could include a large number of hosts of different types and more interconnecting switches.
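The basic topology described above can be sketched as follows. This is a minimal illustration rather than the project's actual script: the node names (`h1`, `inet`, `s1`), the use of an extra host as a stand-in for the internet side, and the controller address and port are all assumptions. The Mininet import is deferred so that the topology description can be checked without Mininet installed.

```python
# Hypothetical, data-first description of the basic home topology:
# one host and one uplink node attached to a single OpenFlow switch.
BASIC_TOPOLOGY = {
    "hosts": ["h1", "inet"],      # "inet" stands in for the internet side (assumption)
    "switches": ["s1"],
    "links": [("h1", "s1"), ("inet", "s1")],
}


def validate(topo):
    """Return True if every link endpoint is a declared host or switch."""
    nodes = set(topo["hosts"]) | set(topo["switches"])
    return all(a in nodes and b in nodes for a, b in topo["links"])


def build_mininet(topo, controller_ip="127.0.0.1", controller_port=6653):
    """Build a Mininet network attached to an external (remote) controller.

    The address and port are assumed values; adjust them to wherever the
    controller is actually listening.
    """
    # Imported lazily so the description above works without Mininet.
    from mininet.net import Mininet
    from mininet.node import OVSSwitch, RemoteController

    net = Mininet(controller=None, switch=OVSSwitch)
    net.addController("c0", controller=RemoteController,
                      ip=controller_ip, port=controller_port)
    nodes = {name: net.addHost(name) for name in topo["hosts"]}
    nodes.update({name: net.addSwitch(name) for name in topo["switches"]})
    for a, b in topo["links"]:
        net.addLink(nodes[a], nodes[b])
    return net
```

A more realistic topology would only need more entries in `hosts`, `switches`, and `links`; calling `build_mininet(BASIC_TOPOLOGY).start()` requires a Mininet installation and, typically, root privileges.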

Fig. 2. A simple topology.





Fig. 3. A more realistic topology.

For packet identification, we applied three machine learning algorithms. For clustering we used K-Means and a Gaussian mixture model (GMM), and for classification we used a support vector machine (SVM). Clustering algorithms are unsupervised: they group data points into k clusters, with each point assigned to the nearest cluster. Whereas K-Means tends to form only roughly circular clusters, GMM makes fewer assumptions about the data distribution and can generally be more accurate. Classification algorithms such as SVM assign data points to categories, and learn the assignment rules from the true type (the label) of each data point in the training set. We used the implementations of these algorithms from the scikit-learn library to build our identification module.
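To make the distinction between the three algorithms concrete, the following sketch fits each of them with scikit-learn. The data are synthetic, well-separated vectors used purely for illustration and not captured traffic; the number of clusters, the feature dimension, and all hyperparameters are assumptions rather than the settings used in this project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for packet features: two well-separated groups of
# 8-dimensional vectors (NOT real traffic).
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(50, 8)),
    rng.normal(loc=10.0, scale=1.0, size=(50, 8)),
])
y = np.array([0] * 50 + [1] * 50)  # true labels, used only by the SVM

# Unsupervised: both clustering models see X without any labels.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
gmm_labels = GaussianMixture(n_components=2, covariance_type="full",
                             random_state=0).fit_predict(X)

# Supervised: the SVM learns the mapping from features to labels.
svm = SVC(kernel="rbf").fit(X, y)
svm_accuracy = svm.score(X, y)  # accuracy on the training data itself
```

Note that the cluster indices returned by K-Means and GMM are arbitrary, so they have to be mapped to application types afterwards, whereas the SVM predicts the labels it was trained on directly.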

Dataset Selection: We need to select an adequate dataset for training our machine learning models. We generated our own experimental traffic and captured the packets using Wireshark. This approach allows us to know the true flow type of every packet, and therefore to label each packet easily.

Feature Selection: We also need to decide which features to train on. A variety of flow characteristics have been used as features. To simplify our algorithms, we use a packet-level characteristic: the first N bytes of a packet's payload. We varied N to see how our models respond. In subsequent experiments, we also included the destination port number of the packet as part of the feature vector.
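A feature vector of this kind could be built as in the sketch below. The function name `packet_features`, the default value of N, and the choice to zero-pad payloads shorter than N bytes are assumptions made for illustration; the text does not specify how short packets are handled.

```python
def packet_features(payload: bytes, dst_port: int, n: int = 32,
                    use_port: bool = True) -> list:
    """Return the first n payload bytes as integers, optionally with the port.

    Payloads shorter than n bytes are zero-padded (an assumption) so that
    every feature vector has the same length.
    """
    head = payload[:n]
    features = list(head) + [0] * (n - len(head))
    if use_port:
        features.append(dst_port)
    return features


# Example: the start of an HTTP request sent to port 80, with N = 8.
example = packet_features(b"GET /", 80, n=8)
```

Varying `n` and toggling `use_port` corresponds to the two experimental knobs described above.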

Fig. 4. Packet identification workflow.

Once a flow's application type is identified, the controller sets the priority of the flow entry according to that type, then pushes the rule to the switch. Setting flow entry priorities does not actually affect the bandwidth allocated to the flow, however; it simply demonstrates our ability to identify and control the flow. In this project, we do not emphasize the actual bandwidth control functionality and instead focus on flow identification. Controlling the bandwidth of a flow can be difficult: not only does it require complex rules, but the underlying switches may also not support such control features. One possible way to control the rate of a specific flow is to provide multiple paths with different link rates and direct each flow through a path with the appropriate rate.
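The mapping from application type to flow entry priority might look like the following sketch. The application types, the priority values, the default switch identifier, and the field names of the returned dictionary are all hypothetical; they are loosely modeled on the kind of entry an SDN controller pushes to a switch and are not the exact schema of Floodlight or of this project.

```python
# Hypothetical application types and priorities; a higher value means the
# entry is preferred when several entries match the same packet.
PRIORITY_BY_APP = {"voip": 300, "web": 200, "p2p": 100}
DEFAULT_PRIORITY = 150  # used for application types we cannot identify


def flow_rule(app_type, dst_port, out_port,
              switch_id="00:00:00:00:00:00:00:01"):
    """Build an illustrative flow-entry description for one identified flow."""
    return {
        "switch": switch_id,                      # assumed datapath ID
        "name": f"{app_type}-{dst_port}",
        "priority": PRIORITY_BY_APP.get(app_type, DEFAULT_PRIORITY),
        "match": {"tcp_dst": dst_port},
        "actions": f"output={out_port}",
    }


voip_rule = flow_rule("voip", 5060, out_port=2)
unknown_rule = flow_rule("unknown", 9999, out_port=2)
```

As noted above, the priority only decides which entry wins when several match; it does not by itself limit or reserve bandwidth.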