Wednesday, September 24, 2008

Realeyes in the Real World

For about a year, I have been running Realeyes in a pilot project at a local college. The first six months were spent making it scale to a real world environment. But for the past six months, I have been able to experiment with rules to achieve the objectives of the system:
  • Capture exploits: This one is obvious, but that doesn't make it easy.

  • Reduce false positives: This was an original design goal, and my testing shows a lot of promise.

  • Determine if exploits are successful: This was also an original design goal, and it too is showing promise.

  • Capture zero-day exploits: While this capability hasn't been designed into the system, the flexibility of the rules and the statistics collection are showing promise in accomplishing this. We have also found some incidents of ongoing invalid activity that was apparently flying under the rader.
There is not much I can say about writing rules to detect exploits. If you have never done it, you just can't imagine how hard it is. First, finding enough information to even start is very difficult. But even with decent information, picking out the critical pieces that will consistently reveal the exploit is much more of an art than a science. I have written a few dozen rules at this point, but certainly don't consider myself to be an expert. With that said, rules do get written and, after some trial and error, are fairly effective.

The holy grail of creating IDS rules is to increase the capture rate while simultaneously reducing the false positive rate. When I was on the front lines, monitoring the IDSes, there was always the fear that a major attack would be buried in the false positives. Therefore, when I started this project, my personal goal was to completely eliminate false positives. I may not succeed, but I figure that if I am shooting for 100%, I will get closer than if my target is a lot lower.

I had spent quite a bit of time thinking about and discussing this with others. The team I worked with used multiple IDSes and correlated the information from them. Our success rate was very good, compared to our counterparts in other agencies. So a part of the puzzle seemed to be collecting enough data to see the incident for what it really is.

As an analogy, imagine taking a sheet of paper and punching a hole in it with a pencil. Then try to read a magazine article through that hole. At the very least, I would want to make the hole bigger. But this is as far as the analogy takes me, because what I really need is context, and a single word or phrase doesn't do that for me.

But even using multiple IDSes wasn't the complete solution. Each IDS had limitations that reduced its contributions to the total picture. If a rule was producing too many false positives, then the rule had to be modified, often in a way that reduced its effectiveness. This meant that there were important pieces missing.

So the solution appeared to be, put the capabilities of all the IDSes in a single IDS and do the correlation at the point of data capture. And that is how the Realeyes Analysis Engine is designed. Three levels of data that are analyzed: Triggers, Actions, and Events. Each one feeds the next, and the correlation of multiple inputs gives a broader view of the activity.

Triggers are the smallest unit of data detected. They can be header data such as a source or destination port, a string such as "admin.php" or "\xeb\x03\x59\xeb\x05\xe8\xf8", or a special condition such as an invalid TCP option, handled by special code. These are correlated by Action definitions, where an Action definition may be the Triggers: 'Dest Port 80' AND 'Admin PHP'.

Actions are the equivalent of rules in most IDSes available today. It is the next level of correlation that is showing the promise of reaching the goals listed above. Events are the correlation of Actions, possibly in both halves of a TCP session. (Actually, Realeyes assumes that UDP has sessions defined by the same 4-tuple as TCP.) And this is where is gets interesting.

One of the first rules defined was to search for FTP brute force login attempts. This Event is defined by two Actions, the first being 'Dest Port 21' and 'FTP Password', the second being 'Source Port 21' and 'FTP login errors' (which is any of message 332, 530, or 532). The rule was defined to require more than three login attempts to allow for typos. It has reported a few dozen incidents and zero false positives. Considering that the string for the FTP password is "PASS ", I am quite pleased with this result.

A more recent rule was defined to detect an SQL Injection exploit. The exploit uses the 'cast' function to convert hexadecimal values to ASCII text. The rule's Triggers search for four instances of testing for specific values in a table column. The Action is defined to fire if any two of them is found in any order. Although the exploit is sent by a browser to a server, there is no port defined. This rule is reporting a couple hundred attempts per day and none of them are false positives.

It really got exciting when the rule captured a server sending the exploit. It turned out that the server was simply echoing what it received from the browser in certain instances. However, the web admins wanted to know what web pages this was happening for, and this is where the third design goal is demonstrated. Both halves of these sessions were reported and displayed in the playback window. This gave us the full URL data being sent to the server, and the web admins were able to address the problem quickly.

To address the issue of detecting new exploits, Realeyes has a couple of somewhat unique features. The first is statistics collection. The information maintained for each session includes the number of bytes sent and received. When a session ends, these are added to the totals for the server port. Then, three times a day, the totals are reported to the database. This allows for the busiest ports to be displayed. Or the least busy, which might point to a back door.

But there are other statistics that can be collected, as well. It is possible to monitor a specific host or port and collect the total data for each session.

For example, I monitored UDP port 161, which is used for SNMP, and saw two host addresses using it that belonged to the site, but 3 or 4 a day that did not. Since there was not much traffic for the port, I simply created a rule to capture all data for it. This showed me that there were unauthorized attempts to read management data from devices in the network, but that none were successful.

Using the same technique, I monitored TCP port 22, used for SSH, and found several sessions that appeared to be attempts to steal private keys. I reported this to the admin of the target hosts, and while he had applied the most recent patches, I also suggested that he regenerate his private keys, to be on the safe side.

Another feature for discovering new exploits is time monitoring. This is setting a time range for all activity to a host or subnet to be captured. I defined rules to monitor the EMail servers between midnight and 5:00 am. The original intent was to watch for large bursts of outbound EMail to detect any site hosts being used for spam. We have found one of these.

But we have discovered several other things from this. First was a large amount of traffic using a port that the network admin thought had been discontinued. Second, there have been several attempts to have EMail forwarded through the site's servers. This may be a misconfiguration at the sender, or it may be an attempt to send spam through several hops to avoid detection. From this I created rules to monitor for the most common domains.

So far, there have not been any earth shattering discoveries (to my disappointment, but to the the admins' relief). But as I said, the signs are promising that the system is capable of meeting the design goals. Until a couple of weeks ago, I have spent the majority of my time working on code. But I am starting to spend more time on rules. So I am looking forward to making some new discoveries. Stay tuned.

Later . . . Jim

No comments: