The customer has prior experience with, and likes, ISS. Unfortunately, I have never rolled my sleeves up and got my hands messy with ISS products. Apparently, ISS have made no commitment on their roadmap to produce a Solaris 10 agent. I suggested that the customer have a word with ISS about this directly, to seek clarification (and also apply an incentive to ISS to do the port).
We discussed network-based IDS for a while, too. It looks like this technology (especially if it can handle decryption of encrypted streams) is probably off the menu, particularly as a result of the huge level of paranoia around the subject of interception or potential disclosure of cleartext data in the infrastructure's databases. This is a pity, as our N2000 kit would have been really cool to get in here to do said decryption...
Customer Issues with IDS
Your typical formal IDS appears to be set on a hair-trigger. For instance, the poor front-line IDS alert analysis guys in the customer's organisation get alerts whenever a webserver or appserver gets manually restarted, because the IDS agent they use on Solaris 9 thinks it's a SYN flood attack.
Now, there's no remotely conceivable reason why a SYN flood should be seen by a server's live service interfaces when it is presenting a load-balanced service. Any load-balancer worth its salt is smart enough to drop something like a SYN flood right on the floor by default.
Also, even if you have a physical intrusion by a Bad Guy on a site (which other security measures should catch anyway), what clueless weenie of an attacker is going to try to take a multi-CPU SPARCbox down via something as lightweight as a SYN flood, given that we've sanity-checked for them for years?
IDS solutions need a lot of configuration to prevent false-positive alerts like this showing up; in fact, "tuning" a traditional IDS feels like taking an undifferentiated instrument so sharp that it can cut daylight, and selectively blunting parts of it with a lump hammer.
Customer Procedure
I explored with the guys exactly what it is that they do when they get an IDS alert. After all, if you get an SNMP trap screaming at you that you're being attacked, you have to do something to qualify and act on it, right?
Given the fragmented nature of their groups, there was a lot more politics and a lot less solid response definition involved than I would have hoped for. As a real killer, the procedure included at least 3 levels of inter-departmental handoff, so there was no single joined-up procedure that was contained within a single person's head. IMHO this is not the way to run an IDS, or any other aspect of system management, for that matter. However, the groups have apparently been re-orged to the point where they are now joined up, so here's hoping that everyone starts talking to each other sooner rather than later...
In common with everyone else I've spoken to who is deploying IDS, this customer gets a very large number of false positives. In fact, they get more than 2000 reports a day, which are analysed by "eyeball 1.0" and typically reduced to 2 reports a day which may be worth actual investigation. Note that the infrastructure they are working on isn't even live yet - this is all test data and sysadmin activity. I pity them when it all goes live: they will find out what "inundation" really means.
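As an illustration of how little it takes to improve on "eyeball 1.0", here is a minimal Python sketch that collapses raw alerts into one line per host and signature, so that repeats of the same complaint are read once rather than hundreds of times. The alert fields (`host`, `signature`) and the sample values are my assumptions for the example, not the customer's actual alert format.

```python
from collections import Counter


def summarise_alerts(alerts):
    """Collapse raw alerts into ((host, signature), count), most frequent first."""
    counts = Counter((a["host"], a["signature"]) for a in alerts)
    return counts.most_common()


# Hypothetical sample data, standing in for a day's worth of alerts.
sample = [
    {"host": "web01", "signature": "syn-flood"},
    {"host": "web01", "signature": "syn-flood"},
    {"host": "app02", "signature": "syn-flood"},
    {"host": "web01", "signature": "priv-violation"},
]

for (host, signature), count in summarise_alerts(sample):
    print(f"{count:5d}  {host}  {signature}")
```

This does not decide what is a false positive; it only shrinks the pile that a human has to look at.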
Other than lack of knowledge about event correlation and root cause analysis software, the main reason why the customer isn't doing this stuff in an automated or semi-automated fashion yet is that they still don't have all their system clocks synchronised to a single, monotonically increasing time source. I believe their top priority has to be to get everything on NTP and UTC.
Once they have done that, they might find some mileage in looking at Tier-3 Huntsman or e-Security Sentinel, for log and root cause analysis, or letting their perl hackers loose on read-only copies of the logfiles.
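To show why the common time source matters, here is a rough Python sketch of the kind of thing the log hackers could do once every host stamps its events consistently: convert each timestamp to UTC, merge the events into one timeline, and flag events from different hosts which land within a few seconds of each other. The event tuples, host names and the five-second window are assumptions made up for the example; the products mentioned above will do this far more thoroughly.

```python
from datetime import datetime, timedelta, timezone


def to_utc(stamp):
    """Parse an ISO 8601 timestamp with an explicit offset and convert it to UTC."""
    parsed = datetime.fromisoformat(stamp)
    if parsed.tzinfo is None:
        raise ValueError(f"timestamp has no UTC offset: {stamp!r}")
    return parsed.astimezone(timezone.utc)


def merge_timeline(events):
    """Return (utc_time, host, message) tuples sorted by time."""
    return sorted((to_utc(stamp), host, msg) for stamp, host, msg in events)


def correlated(timeline, window=timedelta(seconds=5)):
    """Yield pairs of consecutive events from different hosts within `window`."""
    for earlier, later in zip(timeline, timeline[1:]):
        if later[1] != earlier[1] and later[0] - earlier[0] <= window:
            yield earlier, later


# Hypothetical events; note that web01 logs in local time, one hour ahead of UTC.
events = [
    ("2006-11-30T08:34:47+01:00", "web01", "appserver restarted"),
    ("2006-11-30T07:34:45+00:00", "lb01", "pool member down"),
    ("2006-11-30T07:50:00+00:00", "db01", "login failure"),
]

for earlier, later in correlated(merge_timeline(events)):
    print(f"{earlier[1]}: {earlier[2]}  ->  {later[1]}: {later[2]}")
```

Without the UTC conversion, the web01 event would sort an hour late and the obvious link to the load-balancer event would be lost, which is exactly the problem the customer has today.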
Oh, they aren't using Tripwire either. Or N1SPS. Or the Solaris Security Toolkit (yet). This makes determining by audit process "what their servers are actually set up to do" not so much a real problem as an utter bastard. Trying to deploy an IDS into an infrastructure in an unknown state feels to me like a real exercise in futility, and a lot of hard, ongoing work for no reward. They definitely need to nail the infrastructure down, first!
Standards - and How to Get Around Them
As I suspected, the infrastructure security is designed to meet the requirements of a prescriptive recipe. So, let's think laterally here. What does this recipe define as constituting an IDS?
According to the customer, the given definition of an IDS is "woolly" (which might be a contributor to why the recipe is in the process of being rewritten).
So, if I define an IDS as being "some software capability which gives realtime reporting on odd or bad stuff happening", then there's a whole bunch of possibilities which can arise.
In fact, a properly configured Solaris 10 install - along with some analytical stuff at the log aggregation point which a typical system and network management product should be able to handle - could be considered to contain enough functionality to comprise an IDS for this purpose :-).
Start by privilege-limiting all the services which a host serves up to the network. Compared to most IDS-associated technologies, this also has the benefit of actually tightening the system's security up. I'm also sending the customer all the details of restricted privileges within zones (and also Glenn and Darren's excellent BluePrint on priv debugging), so that they may construct per-app zones (1 zone per app) with limit sets containing only the privileges which the app running in them actually needs.
If such a configured environment logs any attempted privilege violation, then this is a guaranteed bona fide hit that "Something Is Going On That Shouldn't Be".
Use this mechanism to get rid of a lot of false positives, add Solaris-Audit-over-syslog, and you're suddenly really getting somewhere :-).
Similarly, set ipfilter up in the global zone (as, indeed, you must today) and get it to log anything that isn't expected traffic (i.e. anything outside the simple ruleset you've configured it to allow).
So, once you have tracking of privilege violation attempts, port scanning and other anomalous network traffic, and kernel-level auditing tracking who does what - especially when things are done with legitimately enhanced privileges - you use syslog to shunt it all to a dedicated log gatherer, which also runs some code to look for this kind of behaviour and send appropriate SNMP traps.
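The "some code" on the log gatherer need not be complicated. Below is a minimal Python sketch, under stated assumptions, of the sort of thing I have in mind: match incoming lines against a small set of rules, spot a source which has been blocked on several different ports, and raise a trap for each hit. The `PRIV_DENIED` and `FW_BLOCK` tags, the line layout and the `send_trap` function are all hypothetical placeholders; the real message text depends on how auditing, privilege debugging and ipfilter logging are configured, and a real deployment would send an actual SNMP trap rather than print.

```python
import re
from collections import defaultdict

# Hypothetical tags: real message text depends on how the audit, privilege
# and ipfilter logging are configured, so treat these patterns as placeholders.
RULES = {
    "privilege-violation": re.compile(r"\bPRIV_DENIED\b"),
    "unexpected-traffic": re.compile(r"\bFW_BLOCK\b"),
}
FW_BLOCK = re.compile(r"\bFW_BLOCK src=(?P<src>\S+) dport=(?P<dport>\d+)")


def classify(line):
    """Return the name of the first rule matching the line, or None."""
    for name, pattern in RULES.items():
        if pattern.search(line):
            return name
    return None


def port_scanners(lines, threshold=3):
    """Return sources blocked on at least `threshold` distinct ports."""
    ports = defaultdict(set)
    for line in lines:
        match = FW_BLOCK.search(line)
        if match:
            ports[match.group("src")].add(int(match.group("dport")))
    return sorted(src for src, seen in ports.items() if len(seen) >= threshold)


def send_trap(category, detail):
    """Stand-in for a real SNMP trap sender; here it only prints."""
    print(f"TRAP {category}: {detail}")


# Hypothetical log lines as they might arrive at the gatherer.
sample = [
    "web01 FW_BLOCK src=10.0.0.9 dport=22",
    "web01 FW_BLOCK src=10.0.0.9 dport=23",
    "web01 FW_BLOCK src=10.0.0.9 dport=111",
    "app02 PRIV_DENIED uid=80 priv=proc_exec",
    "app02 cron job finished",
]

for line in sample:
    category = classify(line)
    if category:
        send_trap(category, line)
for src in port_scanners(sample):
    send_trap("port-scan", src)
```

Because the hosts only log things which should never happen on a locked-down box, the rule set can stay this small; the hard work is done by the privilege limits and the ipfilter ruleset, not by the matcher.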
What more would you want in a suitably tuned IDS? :-).
Seriously - what more would someone be likely to want in an IDS, which is not covered in the above? Could it be put together fairly easily?
(2006-11-30 07:34:47.0)