Friday July 11, 2008 Limiting the size of the auditing log
Versions of Solaris up to and including Solaris 9 provide a nice feature within the auditing subsystem by which an upper limit can be defined for the size of the auditing log. Once this is reached the auditing log is automatically closed and a new log is opened in its place. This upper limit is set (and read) via the auditconfig(1M) command:
SYNOPSIS
auditconfig option...[...]
-setfsize size
Set the maximum size of an audit file to size bytes. When the size limit is reached, the audit file is closed and another is started. If perzone is not set, this option is valid only in the global zone.
While still documented, this functionality, however, does not work correctly in Solaris 10 and beyond.
In Solaris 9, the actual writing of all of the auditing data to disk was handled by the kernel, with the job of the userland auditd(1M) deamon being only to perform the initial setup of the auditing log and then hand everything over to the kernel (basically by calling auditsvc(2)) and then waiting until the kernel reported that something had gone wrong. As a result of this, the implementation of the upper limit for the auditing log (in other words, the code that stopped writing data once a certain amount of it had been written) was implemented in the kernel, and the interface for making use of it was provided via a system call (and the auditconfig command mentioned above makes use of the system call).
In Solaris 10 things changed significantly. A new project was initiated to alter the auditing subsystem so that instead of just writing auditing data to a binary file on disk, a plugin architecture was produced which would allow new output methods for the logging of the auditing data to be created, and then for these to be freely configured for use on site (and even allows different plugins to act on the same auditing data together). It makes sense for these plugins to run in userland (rather than requiring them to be kernel modules or something similar, as this would limit the usefulness of the plugins and put an extra burden on the people creating them). So the auditing subsystem was split into two across the kernel to userland boundary. Rather than just dispatching the writing of the auditing data to the kernel and then going away, as occurred previously, the auditd daemon now registers a door server with the kernel and stays awake. As the kernel generates auditing data it passes the data back down into userland to the running auditd daemon via the door server, where auditd then takes over the responsibility of recording the auditing data by passing it to the available plugins which in turn handle actually recording the data using whatever method they are intended to implement.
The default plugin, audit_binfile(5) just recreates the previous behaviour of writing the log to a binary file. As a further example, the new plugin audit_syslog(5), writes the data to the system log instead (where you can configure it via the usual syslog.conf(4) method).
All of this means, however, that the kernel code which previously handled the auditing file size limit, was no longer applicable, because the auditing data was no longer written within the kernel. Furthermore, the whole concept of a file size limit is specific to the process of writing data to disk, and therefore doesn't match the new generic paradigm where the auditing subystem no longer knows what happens to the auditing data once it is passed to the plugins. As a result, setting the kernel-based file size limit using the traditional '-setfsize' auditconfig command had no affect, and no alternative userland implementation had been created to replace it.
I have recently been working on a project to rectify that last problem, and the fruits of that project were integrated into OpenSolaris build 91 (see here for diffs). As the actual writing of the auditing data to disk now occurs within the audit_binfile(5) plugin, this change creates a new parameter associated with that plugin, called p_fsize, which when set will produce the same functionality as was previously available via the '-setfsize' command, namely it will close the audit log when a certain limit is reached and open a new one. This parameter can be set within the audit_control(4) file in the normal way, for example:
plugin: name=audit_binfile.so.1;p_fsize=512000
which will set the filesize limit to 512,000 bytes. Here's a quote from the updated OpenSolaris audit_binfile(5) man page which describes this feature:
The p_fsize attribute defines the maximum size in bytes that an audit file can become before it is automatically closed and a new audit file opened. This is equivalent to an administrator issuing an audit -n command when the audit file contains the specified number of bytes. The default size is zero (0), which allows the file to grow without bound. The value specified must be within the range of [512,000, 2,147,483,647].
Update: This fix is now available in Solaris 10 as of 10/08, and is documented in the corresponding man pages. It is also available in patches 138085-01 (SPARC) and 138086-01 (x86).
( Jul 11 2008, 01:52:14 PM BST ) PermalinkLondon OpenSolaris User Meeting Report
On Monday (19th December) I attended the latest meeting of the London OpenSolaris User Group. This was a very interesting event. It was difficult for me to judge the ratio between Sun and non-Sun attendees (the majority of faces were unfamiliar to me) but I would like to think that it was a popular event beyond the Sun staff.
The event was opened by a detailed talk about DTrace from Jon Haslam. This talk turned out to be a highlight of the evening, as rather than being just another demo of the technology, it was instead an examination of the design and implementation of DTrace. This included an analysis of how your D code is converted into the DTrace virtual machine opcodes and how these objects are then injected into the kernel so that they are executed when a newly activated probe is 'hit'. In all it made for a very helpful precursor to delving into the sourcecode.
We were also joined by a welcome guest from overseas, Adam Leventhal. He is a member of the Sun engineer elite, probably most famous for his his membership in the DTrace team. In the latter capacity he contributed to the meeting by assisting Jon in his description of DTrace. However, he also presented us with a new project, BrandZ, which I had heard of in the past, but haven't had an update on in a while.
This is an interesting project which aims to allow binaries from non-native OSs (e.g. Linux) to run unmodified in a Solaris zone (on a matching processor architecture) by 'booting' a Branded (non-Solaris) Zone and emulating the guest OS' kernel API there (for the most part this means the kernel's system calls). Unfortunately the talk was only a quick (5-minute) demo and proof of concept, but it was very interesting to see that real progress is being made on this project, which is now available for download via OpenSolaris.
This was then followed by a general overview of ZFS, and why its great, as presented by fellow GMP-er, Tim Graves. This presentation is great, and I have seen it presented while I was in the states by the ZFS team themselves, but unfortunately due to a lack of time Tim had to run through it a bit quicker than it deserved. However, ZFS remains a fantastic technology which I love to play with. Its unfortunate that as a technology it can sometimes be a bit of an underwhelming demo because it is designed to make things easy, and things always seem easy in a well prepared demo. The only real way to get a feel for how revolutionary ZFS is is to download it and give it a go, something which I highly recommend doing.
Next up was another quick presentation about a new Solaris related technology, this time the port of Solaris to Xen (a free open source virtual machine type program, allowing you to run multiple OSs at once). The good news is that this project is making progress and has a currently working demoable status.
Next we had a talk on an important topic which is currently starting to get the attention it deserves in Sun: performance. Phil Harman, who presented this talk, is one of the leaders of the movement to put performance on the top of the agenda within Sun (he coined the oft-quoted phrase 'If its faster in Linux, its a Solaris bug'). One of the interesting things that he discussed was the work his team was engaged in in changing the default attitude of Sun, and the best way to convince engineers of something is to prove it in practice, so the team developed a portable benchmarking tool libmicro to produce real provable statistics. This tool is designed to make it easy to create benchmarks that test your favourite portable APIs on different platforms and it has been successfully run on most obvious modern Unixes (Solaris, Linux, Windows Services for Unix, MacOSX etc). This application is now available online and Phil encouraged everyone to download it and create thier own benchmarks to see how Solaris compares in thier favourite APIs. And if Solaris doesn't compare favourably, we want to know about it!
Finally, Darren Moffet closed the show with a quick presentation of Sun's efforts to integrate encryption into our filesystems. He gave a summary of the field and Sun's design decisions in implementing this forthcoming project.
All in all it made for a very interesting evening.
( Dec 22 2005, 04:59:12 PM GMT ) Permalink
Now that OpenSolaris is up and running, we can talk more freely about the source code that we work on everyday. I thought I'd celebrate this launch by narrating the investigation of a simple userland bug fix that I worked on recently. Although there's nothing magical or ground-breaking here, it might still provide some insights into the bug-fixing process for those who aren't familiar with it, and to at least one small part of the Solaris code for those who are.
The bug was discovered in the Solaris implementation of one of the ancient and ever-ready Unix tools: xargs(1). The job of this tool is to collect arguments via the standard input and to bundle them up into argument lists to be passed to another tool. Its one of those commands that you probably think you'll never need, but every now and then you come across some situation where it does exactly what you want (and nothing else!). So many Unix tools fit that description!
For the most part, this tool works great, however, one user discovered that when given a specific kind of input, it wasn't behaving as expected. In the test case provided, the input to the command is quoted with a backslash character, meaning the spaces should be ignored (causing each line to be treated as a single argument). This input is passed to a command which prints each of its arguments on a line by itself. In this scenario, because each line is a single argument and each argument is printed on a single line, the input should be the same as the output. However, when given a sufficiently large input text, the output would slightly differ: new lines were added in certain places. Bug 6203159 was duely logged, and I was tasked with fixing it.
After getting a reproducible test case up and running, we can start digging a little further into exactly what is happening when the error occurs. We can recreate the problem with a simplified test case, using the -s flag to the xargs command to artificially limit the maximum length of the command line argument that xargs builds up (perhaps a clue about where the problem might lie...):
$ xargs printf "%s\n" <<EOF > one\ line > one\ line > EOF one line one line $ xargs -s 26 printf "%s\n" <<EOF > one\ line > one\ line > EOF one line one line
The output of these two invocations should be the same, however we can see that in the second case limiting the length of the created command line corrupts the output. Lets use truss(1) to find out exactly what commands xargs is executing (-t lets us just trace the exec(2) system call, -f says to also trace child processes, and -a prints out the arguments passed to exec-like system calls):
$ truss -fat exec -o truss.log xargs <testdata -s 26 printf "%s\n" >/dev/null
$ grep -v ENOENT truss.log
14433: execve("/usr/bin/xargs", 0xFFBFF3E4, 0xFFBFF3FC) argc = 5
14433: argv: xargs -s 26 printf %s\n
14437: execve("/bin/printf", 0x00024B7C, 0xFFBFF3FC) argc = 3
14437: argv: printf %s\n one line
14442: execve("/bin/printf", 0x00024B7C, 0xFFBFF3FC) argc = 4
14442: argv: printf %s\n one line
Now, this is very interesting, because we can see that in the second invocation of printf(1) we are passing 4 arguments instead of three (the 'argc = 4' text shows this) so the words 'one' and 'line' must be being sent in the second case as separate arguments: the '\' isn't doing its job. So some where in the part of the code where xargs builds up its list of arguments, we're losing this special character.
With a good idea of where the problem is occuring, we can turn to the code. xargs builds up its argument list character by character within the getarg() function. The main loop here starts at the follwing lines:
for (arg = next; ; c = getwchr()) {
bytes = wctomb(mbc, c);
...
so lets grab our trusty debugger (in this case dbx), get the thing running within it, and stop execution at the second line above. We can then watch the argument build up for a while:
(dbx) runargs <testdata -s 28 printf "%s" (dbx) stop at 548 (2) stop at "xargs.c":548 (dbx) run Running: xargs -s 26 printf %s < testdata (process id 2244) stopped in getarg at line 548 in file "xargs.c" 548 bytes = wctomb(mbc, c); (dbx) display arg arg = 0x268a2 "" (dbx) cont stopped in getarg at line 548 in file "xargs.c" 548 bytes = wctomb(mbc, c); arg = 0x268a2 "o" (dbx) cont stopped in getarg at line 548 in file "xargs.c" 548 bytes = wctomb(mbc, c); arg = 0x268a2 "on" (dbx) cont stopped in getarg at line 548 in file "xargs.c" 548 bytes = wctomb(mbc, c); arg = 0x268a2 "one"
So we can see the string being built up bit by bit. Lets skip on to the second line just before our magic '\' is read in:
(dbx) cont
stopped in getarg at line 548 in file "xargs.c"
548 bytes = wctomb(mbc, c);
arg = 0x268ab "one"
(dbx) next
stopped in getarg at line 551 in file "xargs.c"
551 if ((next + ((bytes == -1) ? 1 : bytes)) >= &argbuf[BUFLIM]) {
arg = 0x268ab "one"
(dbx) print c
c = '\\'
So we can see from the output of 'print c' that our current character is indeed our slash. What happens to it? Well a switch statement is run on the character and eventually it is discovered to be a quote char:
case L'\\':
c = getwchr();
at this point, our 'c' is actually thrown away and the character following it is read in instead. This is how the quoting is implemented because this happens after a character's special meaning is determined, meaning the space character (in this case) gets copied directly to the argument, and the '\' is removed:
(dbx) cont stopped in getarg at line 548 in file "xargs.c" 548 bytes = wctomb(mbc, c); arg = 0x268ab "one "
So far so good. But where does it all go wrong? Lets follow the execution a bit further. At some point, we reach the limit of our buffer (which, if you remember, we have artificially shortened with the '-s' switch) and when that happens, we enter the code within the following if statement:
if ((next + ((bytes == -1) ? 1 : bytes)) >= &argbuf[BUFLIM]) {
...
At that point, we have a nice comment that describes what happens next:
/* * Otherwise we put back the current argument * and use what we have collected so far... */
So wait a minute, alarm bells are ringing now: when we can no longer read any more input we requeue what we have read in so far so that it can be re-read in the next iteration:
... (void) strcpy(so_far, arg); ... queue(so_far, len, HEAD);
Hang on a minute, haven't we removed the '\' from the argument that we have built up? Yes, that means when we perform this requeing, we lose the quote character and just requeue a lonely space, meaning when that text is re-read the space will be active, causing the symptoms we've seen here.
So that's it, after a lot of digging we have worked out what is causing the problem. As for how it was fixed, well that's another story. Why not dig through the OpenSolaris code and answer that question for yourself?
Technorati Tag: OpenSolaris Technorati Tag: Solaris ( Jun 14 2005, 06:02:52 PM BST ) Permalink Comments [0]