Douglas Walls' Weblog

Wednesday March 29, 2006

Avoiding Programming Language Vulnerabilities
This week I am in Berlin at a meeting of ISO/SC22/WG14, the C programming language committee.  One of the hottest topics is dealing with programming security issues and integrity systems.

JTC 1/SC 22 has created a new project to deal with the subject of vulnerabilities in programming languages. The basic technical concept is that all programming languages contain features that are poorly specified, difficult to use correctly, or dependent upon particular implementations. In some cases, these features cause software codes to become vulnerable to malicious parties. The intent of the project is to create guidance on dealing with these problems. In some cases, the guidance will be generic across languages; in other cases the guidance will be specific to languages.
 
The project is being implemented in an unusual manner. SC22 has created an OWG ("Other Working Group") on Vulnerabilities. This group is convened by Jim Moore, and the co-convener is John Benito.  Jim is the convener of WG9 (Ada) and John is the convener of WG14 (C), so they cover a wide range of programming language design. It is their intent to enlist experts from other working groups so that they can further broaden the range of expertise.  They also have permission to enlist experts from non-ISO languages, like Java.  Finally, of course, they need participants from national bodies.
 
The purpose of this blog entry is to encourage US participation.  Because an OWG is not-quite-a-working-group, I believe that the arrangements to participate in it are somewhat informal.

For more information about participating or contacting Jim Moore, have a look at the ISO/IEC JTC 1/SC 22/OWG:Vulnerabilities website http://aitc.aitcnet.org/isai/

I'll be on vacation next week seeing some of Germany ... :-)
( Mar 29 2006, 07:45:41 AM PST )

Monday March 20, 2006

Managed strings
I'm off to a meeting of ISO/SC22/WG14, the C programming language committee, in a week.  Actually, I'm leaving today for a meeting with my engineering team in St. Petersburg on my way to Berlin for the WG14 meeting.  Another piece of work the committee has been working on for over a year now involves mitigating security vulnerabilities.  This work is about to turn into a Draft Technical Report, currently titled:

Extensions to the C Library Part I: Bounds-checking interfaces

You can read more about it at:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1146.pdf

There is a rationale at:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1147.pdf

This work has generated a lot of interest.  One such area is dealing with the vulnerabilities of manipulating strings in C.  Robert C. Seacord of Carnegie Mellon University has submitted a paper to the committee with ideas on library routines to manage strings to mitigate these issues.  Below is the introduction from the paper, and a link to the full document.

Introduction

String manipulation errors

Many vulnerabilities in C programs arise through the use of the standard C string manipulating functions.  String manipulation errors include buffer overflow through string copying, truncation errors, termination errors and improper data sanitization.  Buffer overflow can easily occur when copying strings if the fixed-length destination of the copy is not large enough to accommodate the source of the string.  This is a particular problem when the source is user input, which is potentially unbounded.  The usual programming practice is to allocate a character array that is generally large enough.  The problem is that this can easily be exploited by malicious users who can supply a carefully crafted string that overflows the fixed length array in such a way that the security of the system is compromised.  This is still the most common exploit in fielded C code today.  In attempting to overcome the buffer overflow problem, some programmers try to limit the number of characters that are copied. This can result in strings being improperly truncated.  This, in turn, results in a loss of data which may lead to a different type of software vulnerability.

A special case of truncation error is a termination error.  Many of the standard C string functions rely on strings being null terminated.  However, the length of a string does not include the null character.  If just the non-null characters of a string are copied then the resulting string may become improperly terminated.  A subsequent access may run off the end of the string and corrupt data that should not have been touched.
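The termination error described above is easy to reproduce with strncpy, which writes no null character when the source is at least as long as the limit. The following is a minimal sketch; the helper names copy_is_terminated and copy_truncating are made up for this illustration and do not come from the paper.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into a dst buffer of dst_size bytes with
   strncpy and report whether dst ended up null terminated (1) or not (0). */
int copy_is_terminated(char *dst, size_t dst_size, const char *src)
{
    memset(dst, 'X', dst_size);   /* simulate stale buffer contents */
    strncpy(dst, src, dst_size);  /* no '\0' is written if strlen(src) >= dst_size */
    return memchr(dst, '\0', dst_size) != NULL;
}

/* Hypothetical helper showing a common repair: reserve the last byte for the
   terminator.  The result is always terminated, but a long source is
   silently truncated, which is the other error the text describes. */
void copy_truncating(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

With a five byte buffer, copying "abcd" leaves the buffer terminated, copying "abcde" does not, and copy_truncating turns "abcdefgh" into "abcd" without reporting the loss.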

Finally, inadequate data sanitization can also lead to vulnerabilities.  Many applications require data to be constrained not to contain certain characters.  Very often, malicious users can be prevented from exploiting an application by ensuring that the illegal characters are not copied into the strings destined for the application.

Proposed solution

A secure string library should provide facilities to guard against the problems described above. Furthermore, it should satisfy the following requirements:

  1. Operations should succeed or fail unequivocally.
  2. The facilities should be familiar to C programmers so that they can easily be adopted and existing code easily converted.
  3. There should be no surprises in using the facilities. The new facilities should have similar semantics to the standard C string manipulating functions.  Again, this will help with the conversion of legacy code.

Of course, some compromise is needed in order to meet these requirements.  For example, it is not possible to completely preserve the existing semantics and provide protection against the problems described above.

Libraries that provide string manipulation functions can be categorized as static or dynamic.  Static libraries rely on fixed-length arrays. A static approach cannot easily overcome the problems described. With a dynamic approach, strings are resized as necessary.  This approach can more easily solve the problems, but a consequence is that memory can be exhausted if input is not limited.  To mitigate against this issue, the managed string library supports an implementation defined maximum string length.  Additionally, the string creation function allows for the specification of a per string maximum length.

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1158.pdf
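
To make the dynamic approach with a per string maximum length more concrete, here is a rough sketch of what such a string might look like. The type mstr and the functions mstr_create, mstr_append and mstr_free are hypothetical names invented for this illustration; they are not the interface proposed in N1158.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for illustration only; not the N1158 interface. */
typedef struct {
    char  *data;     /* always null terminated */
    size_t len;      /* characters stored, excluding the terminator */
    size_t max_len;  /* per string maximum length, fixed at creation */
} mstr;

/* Create an empty string that may never hold more than max_len characters.
   Returns 0 on success, -1 on failure. */
int mstr_create(mstr *s, size_t max_len)
{
    if (max_len == (size_t)-1)
        return -1;               /* leave room for the terminator */
    s->data = malloc(1);
    if (s->data == NULL)
        return -1;
    s->data[0] = '\0';
    s->len = 0;
    s->max_len = max_len;
    return 0;
}

/* Append text, growing the buffer as needed.  The operation either succeeds
   completely or fails and leaves the string unchanged; it never truncates. */
int mstr_append(mstr *s, const char *text)
{
    size_t add = strlen(text);
    char *p;

    if (add > s->max_len - s->len)
        return -1;               /* would exceed the per string limit */
    p = realloc(s->data, s->len + add + 1);
    if (p == NULL)
        return -1;               /* out of memory; s->data is still valid */
    memcpy(p + s->len, text, add + 1);  /* copies the terminator too */
    s->data = p;
    s->len += add;
    return 0;
}

void mstr_free(mstr *s)
{
    free(s->data);
    s->data = NULL;
    s->len = 0;
}
```

The point of the sketch is the first requirement in the list above: an append that does not fit is rejected as a whole, so the caller sees an unequivocal failure instead of a truncated or unterminated string.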

( Mar 20 2006, 10:27:39 AM PST )

Monday March 13, 2006

Decimal Floating Point Types
I'm off to a meeting of ISO/SC22/WG14, the C programming language committee, in a couple of weeks.  One of the papers on the agenda (N1154) is a proposal for a Technical Report on adding Decimal Floating Point types and arithmetic to the C programming language specification.  The proposal is based on a model of decimal arithmetic which is a formalization of the decimal system of numeration (Algorism) as further defined and constrained by IEEE-854, ANSI X3-274, and the proposed revision of IEEE-754 (known as IEEE-754R).

The proposal adds decimal floating point within the type hierarchy, as base types, real types and arithmetic types.  The three types are called:

  _Decimal32
  _Decimal64
  _Decimal128

There is a new macro an implementation must define to indicate conformance to this technical report.

The proposal introduces the term generic floating types for the existing floating point types: float, double, and long double.  Together the generic floating types and decimal floating types are known as the real floating types.

The three decimal encoding formats defined in IEEE-754R correspond, in order of increasing width, to the three decimal floating types _Decimal32, _Decimal64, and _Decimal128.  The details of the formats are given in IEEE-754R.

New macros similar to those of <float.h> are defined in a new header <decfloat.h>.  For example, DEC_EVAL_METHOD, DEC32_MANT_DIG, DEC64_MANT_DIG, DEC128_MANT_DIG.  Prefixes of DEC32_, DEC64_, and DEC128_ are used to denote the types _Decimal32, _Decimal64, and _Decimal128 respectively.

Conversion from decimal floating type to integer is as you would expect: the fractional part is discarded (the value is truncated towards zero).  If the value cannot be represented by the integer type, the result depends on the signedness of the integer type.  If the integer type is unsigned, the result is the largest representable number when the value is positive, and 0 otherwise.  If the integer type is signed, the result is the most negative or most positive number according to the sign of the floating point number.
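
The out-of-range rule can be modelled in today's C. The sketch below uses double as a stand-in, because the decimal floating types are not available in every compiler; the function names are made up for illustration, and the treatment of NaN is my own assumption since the text above does not cover it.

```c
#include <limits.h>

/* Stand-in model of the conversion rule, using double instead of a decimal
   floating type.  Assumes int and unsigned int are narrow enough (for
   example 32 bits) that the limits below are exact in double. */
unsigned int to_uint_saturating(double x)
{
    if (x != x)
        return 0;                 /* NaN: assumption, not stated in the text */
    if (x >= (double)UINT_MAX + 1.0)
        return UINT_MAX;          /* too large: largest representable number */
    if (x <= -1.0)
        return 0;                 /* negative and out of range: 0 */
    return (unsigned int)x;       /* in range: truncate towards zero */
}

int to_int_saturating(double x)
{
    if (x != x)
        return 0;                 /* NaN: assumption, not stated in the text */
    if (x >= (double)INT_MAX + 1.0)
        return INT_MAX;           /* most positive number */
    if (x <= (double)INT_MIN - 1.0)
        return INT_MIN;           /* most negative number */
    return (int)x;                /* in range: truncate towards zero */
}
```

For comparison, converting an out-of-range double to an integer type with a plain cast is undefined behavior in C99, which is why the explicit range checks come first.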

For conversion from integer to decimal floating type, if the value being converted can be represented exactly, it is unchanged.  If the value being converted is in the range of values that can be represented but cannot be represented exactly, the result is correctly rounded.  If the value being converted is outside the range of values that can be represented, the result is positive or negative infinity depending on the sign of the value being converted, and the “overflow” floating-point exception will be raised.

For conversion between generic floating types and decimal floating types, the TR is similar to the existing ones for float, double and long double, except that when the result cannot be represented exactly, the behavior is tightened to become correctly rounded.

The TR does not add complex or imaginary decimal floating types.  However, it does add the equivalent rules for conversion between complex and imaginary types to decimal floating types as exist for conversion between generic floating types.

Determining the common type for mixed operations between decimal and other real types is difficult because ranges overlap; therefore mixed mode operations are not allowed and the use of explicit casts is required. Implicit conversions are allowed only for simple assignment and in argument passing.

There is no default argument promotion specified for the decimal floating types.

The new suffixes to denote decimal floating constants are: DF for _Decimal32, DD for _Decimal64, and DL for _Decimal128.

It would help usability if unsuffixed floating constants could be used to initialize decimal floating types.  For example, 0.1 has type double and in implementations where FLT_EVAL_METHOD is not -1, the internal representation of 0.1 is not exact. This defeats the purpose of decimal floating types.  So the proposal introduces a translation time data type (TTDT) which the translator uses as the type for unsuffixed floating constants.  An unsuffixed floating constant is kept as a TTDT until an operation requires it to be converted to an actual type.  The value of the constant remains exact for as long as possible during the translation process.
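
The inexactness of 0.1 in type double is easy to observe. Here is a small sketch, assuming double is the usual IEEE binary 64-bit format; the function name is made up for this illustration.

```c
/* Returns 1 if 0.1 + 0.2 compares equal to 0.3 in type double, else 0.
   The volatile qualifiers force every value to be stored as a double, so
   the outcome does not depend on extra precision in intermediate results. */
int tenths_add_exactly(void)
{
    volatile double a = 0.1, b = 0.2, c = 0.3;
    volatile double sum = a + b;
    return sum == c;
}
```

With IEEE binary doubles the function returns 0, because none of the three constants is represented exactly. In a decimal floating type all three are exact and the comparison would hold, which is the motivation for keeping unsuffixed constants exact during translation.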

The concept can be summarized as follows:
Examples:

double f;
f = 0.1;

Suppose the implementation uses _Decimal128 as the TTDT. 0.1 is represented exactly after the constant is scanned. It is then converted to double in the assignment operator.

f = 0.1 * 0.3;

Here, both 0.1 and 0.3 are represented in TTDT.  If the compiler evaluates the expression during translation time, it would be done using TTDT, and the result would be TTDT.  This is then converted to double before the assignment.  If the compiler generates code to evaluate the expression during execution time, both 0.1 and 0.3 would be converted to double before the multiply.  The result of the former would be different but more precise than the latter.

float g = 0.3f;
f = 0.1 * g;

When one operand is a TTDT and the other is one of float/double/long double, the TTDT is converted to double with an internal representation following the specification of FLT_EVAL_METHOD for constant of type double.  Usual arithmetic conversion is then applied to the resulting operands.

_Decimal32 h = 0.1;

If one operand is a TTDT and the other a decimal floating type, the TTDT is converted to _Decimal64 with an internal representation specified by DEC_EVAL_METHOD. Usual arithmetic conversion is then applied.

If one operand is a TTDT and the other a decimal floating type, the TTDT is converted to the decimal floating type.

The floating-point environment <fenv.h> specified in C99 applies also to decimal floating types.  The decimal floating-point arithmetic specified is more stringent.  All the rounding directions and flags are supported.

Certain algorithms stipulate a precision on the result of an operation; and this precision could be different from those of the three standard types.  The technical report adds a pragma directive to control this during translation time.

#pragma STDC DEC_MAX_PRECISION integer | DEFAULT

A host of new functions is added to <math.h> to support the new decimal floating types, and new macros HUGE_VAL_D32, HUGE_VAL_D64, HUGE_VAL_D128, DEC_INFINITY and DEC_NAN are defined to help in using these functions. The functions are equivalent to the existing generic floating type functions, with d32, d64, and d128 suffixes added for the decimal floating type versions.  Similarly, equivalent functions to support decimal floating types are added to <stdlib.h> and <wchar.h>, and macros to <tgmath.h>.

Last, new quantize functions are added to <math.h>.  These functions set the exponent of argument x to the exponent of argument y, while attempting to keep the value the same.
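
What quantize does is easiest to see on a toy model where a decimal number is an integer coefficient times a power of ten. The sketch below is only such a model: the type dec and the function dec_quantize are hypothetical names, rounding is fixed to round-half-even (my assumption; a real implementation would follow the current rounding direction), and coefficient overflow is not checked.

```c
/* Toy decimal number for illustration: value = coeff * 10^exp. */
typedef struct {
    long long coeff;  /* signed integer coefficient */
    int       exp;    /* power of ten */
} dec;

/* Return x re-expressed with the exponent of y.  Digits that no longer fit
   are rounded half to even.  Overflow of coeff is not checked. */
dec dec_quantize(dec x, dec y)
{
    while (x.exp > y.exp) {       /* finer exponent: value is unchanged */
        x.coeff *= 10;
        x.exp--;
    }
    if (x.exp < y.exp) {          /* coarser exponent: digits are dropped */
        long long div = 1, q, r, twice;
        while (x.exp < y.exp) {
            div *= 10;
            x.exp++;
        }
        q = x.coeff / div;        /* truncates towards zero */
        r = x.coeff % div;        /* has the sign of x.coeff */
        twice = 2 * (r < 0 ? -r : r);
        if (twice > div || (twice == div && q % 2 != 0))
            q += (x.coeff < 0) ? -1 : 1;
        x.coeff = q;
    }
    return x;
}
```

For example, quantizing 12.345 (coefficient 12345, exponent -3) to the exponent of 0.01 gives coefficient 1234 with exponent -2, and quantizing 1.5 (coefficient 15, exponent -1) to the exponent of 0.001 gives coefficient 1500 with exponent -3, the same value with more trailing zeros.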

For a look at the full document and the rationale see:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1154.pdf

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1161.pdf



( Mar 13 2006, 11:50:45 AM PST )

Monday March 06, 2006

Finding a dynamic library dependency

At a customer visit this week I was asked if the compiler libraries can be reshipped with an application.  And of course the answer is yes.  A complete list of those libraries can be found at:

/opt/SUNWspro/READMEs/runtime.libraries

This reminded me of another question about how an application can locate library dependencies when the application can be installed in a user defined location.  For example, you have an application whose dynamic library is always located relative to itself via the path ../lib, like this:

cobol% ls -laF bin prod prod/bin .
.:
total 48
drwxr-xr-x   4 bin      bin        512 Mar  6 08:15 ./
drwxrwxrwx   5 bin      bin       1024 Mar  5 10:23 ../
drwxr-xr-x   2 bin      bin        512 Mar  5 10:37 bin/
-rw-r--r--   1 bin      bin       2178 Mar  6 08:15 myapp.c
drwxr-xr-x   3 bin      bin        512 Mar  5 10:37 prod/

bin:
total 6
drwxr-xr-x   2 bin      bin        512 Mar  5 10:37 ./
drwxr-xr-x   4 bin      bin        512 Mar  6 08:15 ../
lrwxrwxrwx   1 bin      bin         20 Mar  5 10:37 myapp -> ../prod/bin/myapp

prod:
total 6
drwxr-xr-x   3 bin      bin        512 Mar  5 10:37 ./
drwxr-xr-x   4 bin      bin        512 Mar  6 08:15 ../
drwxr-xr-x   2 bin      bin        512 Mar  6 08:15 bin/
drwxr-xr-x   2 bin      bin        512 Mar  6 08:15 lib/

prod/bin:
total 4
drwxr-xr-x   2 bin      bin        512 Mar  6 08:15 ./
drwxr-xr-x   3 bin      bin        512 Mar  5 10:37 ../
-rwxr-xr-x   1 bin      bin       8972 Mar  6 08:33 myapp*

prod/lib:
total 4
drwxr-xr-x   2 bin      bin        512 Mar  6 08:15 ./
drwxr-xr-x   3 bin      bin        512 Mar  5 10:37 ../
-rwxr-xr-x   1 bin      bin       8972 Mar  6 08:33 mylib.so*
cobol%

The linker provides Dynamic String Tokens that can be used when building the application.  In this case the token is called: $ORIGIN

It would be used as follows:

% cc -o prod/bin/myapp myapp.c '-R$ORIGIN/../lib' -L prod/lib -lmylib

For details about $ORIGIN and other Dynamic String Tokens see the Linker and Libraries Guide on docs.sun.com.
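
To picture what the token stands for, here is a conceptual model of the substitution in C. This is only a sketch: the real expansion is done by the runtime linker, and the function name expand_origin and the install path used in the note below are made up for the illustration.

```c
#include <stdio.h>
#include <string.h>

/* Conceptual model only; the runtime linker performs the real expansion.
   Replaces a leading "$ORIGIN" in runpath with the directory part of
   exe_path and writes the result to out.  Returns 0 on success, -1 if the
   result does not fit in out_size bytes or exe_path contains no '/'. */
int expand_origin(const char *runpath, const char *exe_path,
                  char *out, size_t out_size)
{
    static const char token[] = "$ORIGIN";
    const size_t token_len = sizeof token - 1;
    const char *slash = strrchr(exe_path, '/');
    int n;

    if (slash == NULL)
        return -1;
    if (strncmp(runpath, token, token_len) != 0)
        n = snprintf(out, out_size, "%s", runpath);   /* nothing to expand */
    else
        n = snprintf(out, out_size, "%.*s%s",
                     (int)(slash - exe_path), exe_path,
                     runpath + token_len);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

If the application were installed as /export/app/prod/bin/myapp (a hypothetical location), the runpath $ORIGIN/../lib would come out as /export/app/prod/bin/../lib, so the library is found wherever the whole tree is moved.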

( Mar 06 2006, 09:50:37 AM PST )
