Detection of Encrypted Command & Control Malware Channels
Supervisor: dr. Emmanuele Zambon
Malware infections of desktop computers are a major problem in computer security. Many new malware variants are released every day, and no antivirus product detects all of them. Adding detection at the network layer provides an extra layer of defense. This is particularly useful against targeted attacks, where a specific malware variant is sent to only one or a few targets, making it unlikely that antivirus vendors can generate a signature for it. Operation Aurora [1] is a recently published example of a targeted attack, in which malware infections in several large corporations were used to steal data via the Internet. GhostNet [2] is another example; in this case the malware was used to spy on the Tibetan Government in Exile.
This research aims to detect malware-infected desktop computers by passively observing the network traffic these computers exchange with the Internet.
In more detail, we focus on malware stealing data from the desktop computer it has infected. Such malware uses command and control (C&C) channels and/or data extrusion channels to deliver the stolen data to attacker servers.
One way to decompose the operation of malware is to split it into three phases [3]: infection, propagation, and working. The different characteristics of each phase make different detection methods appropriate. Detecting the infection by monitoring the network is difficult, as it often requires extensive knowledge of the data formats being transferred. For example, to detect infection code hidden in a document, the monitoring system has to reconstruct the document and scan it for infection code. In the propagation phase the malware tries to find new computers to infect. This phase often generates network traffic to many different systems while the malware searches for a system to infect, which allows current network intrusion detection systems to detect it using, for example, payload signatures or honeypots. However, the propagation phase is not present in all malware. In a targeted attack, for example, propagation might not be needed and would only increase the probability of detection. Detection of the propagation phase therefore cannot be relied on to detect all malware, especially in targeted attacks.
During the working phase the malware performs a task specified by the malware author. These tasks range from sending spam to stealing data from the infected computer, and are usually obtained via a C&C channel. Current techniques for detecting such a channel, such as payload-based signature detection and anomaly detection, can already detect plaintext (unencrypted) C&C traffic. These techniques are based on inspecting the contents of network traffic. Because such techniques are effective, it is likely that malware authors will increasingly encrypt their C&C traffic. An obvious choice of encrypted protocol is TLS or SSL on port 443, the port used for HTTPS.
Encrypted traffic on port 443 has two main peculiarities. First, port 443 is usually not blocked by corporate border firewalls, so that users can browse the World Wide Web. Second, payload-based Network Intrusion Detection Systems cannot monitor HTTPS traffic, as the contents are encrypted. Together, these properties make encrypted traffic on port 443 an ideal C&C channel.
Detecting malware by identifying encrypted C&C traffic on port 443 is a challenging task. Because the traffic is encrypted, no information is available about its contents. Detection methods therefore have to rely on indirect information about the content, such as the size or timing of packets. This provides much less information and makes detection of C&C traffic more difficult. To the best of our knowledge, there is no detection technique that can detect malware by observing encrypted C&C traffic on port 443.
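To make the notion of indirect information concrete, the following minimal sketch summarizes one encrypted connection using only packet sizes, timestamps and directions. The function name flow_features, the tuple layout of a packet and the chosen statistics are assumptions made for illustration; they are not the feature set proposed by this research.

```python
from statistics import mean, pstdev


def flow_features(packets):
    """Summarize one connection without looking at any payload.

    packets: list of (timestamp_seconds, size_bytes, direction) tuples,
    where direction is "out" (client to server) or "in" (server to client).
    This layout is a hypothetical choice for the example.
    """
    if not packets:
        raise ValueError("a flow needs at least one packet")

    # Order packets by capture time so inter-arrival gaps are meaningful.
    ordered = sorted(packets, key=lambda p: p[0])
    times = [t for t, _, _ in ordered]
    sizes = [s for _, s, _ in ordered]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]

    bytes_out = sum(s for _, s, d in ordered if d == "out")
    bytes_in = sum(s for _, s, d in ordered if d == "in")

    return {
        "packet_count": len(ordered),
        "duration": times[-1] - times[0],
        "mean_size": mean(sizes),
        "size_stdev": pstdev(sizes),
        "mean_gap": mean(gaps) if gaps else 0.0,
        "bytes_out": bytes_out,
        "bytes_in": bytes_in,
        # A high outbound share may hint at data leaving the host.
        "out_in_ratio": bytes_out / bytes_in if bytes_in else float("inf"),
    }


example = flow_features([(0.0, 100, "out"), (1.0, 300, "in"), (3.0, 200, "out")])
```

Feature vectors of this kind could serve as input to a classifier or anomaly detector; which features actually separate C&C traffic from legitimate traffic is an open question of this research.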
Therefore, the main research question is:
How is it possible to distinguish TLS or SSL C&C channels from legitimate TLS or SSL traffic and identify data extrusion?
To address the research question, we tackle the two main problems separately. The first problem is that of selecting and benchmarking different classification and anomaly detection techniques to distinguish encrypted C&C traffic from legitimate encrypted traffic.
The second is how to leverage these techniques to detect data extrusion through encrypted channels.
By further problem decomposition we extract the following research sub-questions:
- How prevalent is the usage of C&C channels on port 443 in malware? How prevalent are TLS or SSL C&C channels in malware?
- Which method works best to distinguish legitimate TLS or SSL traffic from TLS or SSL C&C channels? What detection and false positive rates can be achieved?
- How do we set up proper experiments to measure both the “detection rate” and the “false positive rate” of each technique?
- Can the method used to detect TLS or SSL C&C channels also be used to detect data extrusion over TLS or SSL?
- If detection of TLS or SSL C&C channels is not possible, how can detection of C&C channels over unencrypted protocols be improved?
- If detection of TLS or SSL C&C channels is not possible, how can data extrusion over unencrypted protocols be detected?
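The detection rate and false positive rate named in the sub-questions are commonly defined as the fraction of C&C flows that are flagged and the fraction of legitimate flows that are wrongly flagged. The sketch below computes both from labeled flows; the function name detection_rates and the boolean label encoding are assumptions for illustration, and obtaining reliable ground-truth labels is itself part of the experimental set-up question.

```python
def detection_rates(labels, predictions):
    """Return (detection_rate, false_positive_rate).

    labels:      True if the flow really is C&C traffic (ground truth).
    predictions: True if the detector flagged the flow as C&C traffic.
    A rate is None when its denominator is zero (no such flows in the data).
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    tp = fp = tn = fn = 0
    for is_cc, flagged in zip(labels, predictions):
        if is_cc and flagged:
            tp += 1  # C&C flow correctly flagged
        elif is_cc:
            fn += 1  # C&C flow missed
        elif flagged:
            fp += 1  # legitimate flow wrongly flagged
        else:
            tn += 1  # legitimate flow correctly passed

    detection_rate = tp / (tp + fn) if (tp + fn) else None
    false_positive_rate = fp / (fp + tn) if (fp + tn) else None
    return detection_rate, false_positive_rate


# Two C&C flows (one caught) and four legitimate flows (one wrongly flagged).
truth = [True, True, False, False, False, False]
flagged = [True, False, True, False, False, False]
rates = detection_rates(truth, flagged)  # (0.5, 0.25)
```

Because legitimate TLS or SSL flows are likely to outnumber C&C flows by a wide margin, even a small false positive rate can correspond to many false alarms in absolute terms.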
[1] Operation Aurora.
[2] Tracking GhostNet: Investigating a Cyber Espionage Network.
[3] Ruitenbeek, Sanders. Modelling Peer-to-Peer Botnets. 2008.