Joseph D. Darcy's Sun Weblog

Joseph D. Darcy's Sun Weblog


20070302 Friday March 02, 2007

Peak Code

Hubbert's theory on peak oil (and other geological resources) states that the maximum rate of production of the resource occurs when half of the reserve has been extracted. Initially, there is near exponential growth in the rate of extraction as lots of "easy oil" is found and pumped, but after the peak the rate of extraction decreases. The marginal cost of extraction goes up as less desirable source of oil are put into production; this cost should provide economic incentives to consider alternate energy sources.

I think the evolution of programming environments has some similarities to Hubbert's theory. A successful new programming environment enjoys an early inflationary phase where lots and lots of new code is being written in it. As the platform matures, the significant programs that are "easy" to write in that language get written, often code less practical to write elsewhere. As the platform grows, increasing interactions and compatibility constraints make adding new features more challenging. At some point, half of the total lines of code for that platform will be written. The further a platform gets past its peak, the less attractive adding new features to it becomes since there is less and less potential return on investment due to the dwindling number of programs that could benefit from the improvements.

However, there are some important differences between potential programs that can be written and natural resources. First, old programs are not consumed or made unavailable in the same way oil is when refined to fuel, made into plastic, and so on. Second, the constraints on the production of new code are very different from those in isolating a chosen natural resource; the primary limiting reagent to producing new programs is not a physical limit. In chemistry, a reaction takes places between fixed ratios of molecules so if you don't have enough of one kind of reagent molecule, say oxygen, the unavailability of that reagent limits the reaction no matter how much of the other reagents, like gasoline, are available. However, the amount of code that gets written will be capped by human interest or economics before hard physical resource limits come into play. Hubbert's peak applies to finite resources; in contrast, the potential amount of software is effectively infinite for the purposes of this kind of modeling. Additionally, there are difficulties in accurately measuring the rate of program production. These differences mean that it can be impractical to know the precise rate of new program development and even if that rate were known, the life cycle of the platform could follow a very different curve then Hubbert's. Therefore, if there is a peak, is is hard to know whether or not it has occurred and the timing and size of any peak can be influenced by changes in the platform.

As volume drives value, evolving a general-purpose programming platform should try to enable a wide range of applications to be written in a convenient and safe manner, expanding over time. Java's relative simplicity, garbage collection, and rich libraries were early advantages; more recently the language changes of generics and annotations have expanded the set of programs that are tractable to write on the platform. As the Java approaches its teenage years, we'll aim to keep improving the platform to support more kinds of programs in JDK 7 and beyond; perhaps closures will be the next major enhancement, allowing programmers to abstract over another axis.

Thanks to Alex for the conversation which led to this post and for providing feedback on earlier drafts.

(2007-03-02 10:00:00.0) Permalink Comments [2]

20070301 Thursday March 01, 2007

Norms: How to Measure Size

At times it is useful to summarize a set of values, say a vector of real numbers, as a single number representing the set's size. For example, distilling benchmark subcomponent scores into an overall score. One way to do this is to use a norm. Mathematically, a norm maps from a vector V of a given number of elements to a real number length such that the following properties hold:

  • norm(V) ≥ 0 for all V and norm(V) = 0 if and only if V = 0 (positive definiteness)
  • norm(c · V) = abs(c) · norm(V) for real constant c (homogeneity)
  • norm(U + V) ≤ norm(U) + norm(V) (the triangle inequality)

There are a few commonly used norms:

  • 1-norm: sum of the absolute values (Manhattan length)
  • 2-norm: square root of the sum of the squares (Euclidean length)
  • ∞-norm: largest absolute value

The first two norms are instances of p-norms. A p-norm adds up the result of raising the absolute value of each vector component to the pth power (squaring, or cubing, etc.) and then takes the pth root of the sum. The ∞-norm is the limit as p goes to infinity.

Given multiple possible norms, which one should be used? The 2-norm is often easier to work with since it is a differentiable function of the vector components, unlike the 1-norm and ∞-norm. On the other hand, the ∞-norm captures the worst-case behavior. Sometimes one norm is easier to compute than the others. Another norm might make an error analysis more tractable. For vectors, in some sense it doesn't matter which norm is used because any two norms, norma and normb, are equivalent in the following sense, there are constants c1 and c2 such that

c1 · norma(V) ≤ normb(V) ≤ c2 · norma(V)

This means that if one norm is tending toward zero, all other norms are tending toward zero too. For example, commonly in numerical linear algebra there is an iterative process that terminates once the norm of the error is small enough. Concretely, for vectors of size n, the common norms are related as follows:

norm2(V) ≤ norm1(V) ≤ sqrt(n) · norm2(V)
norm(V) ≤ norm2(V) ≤ sqrt(n) · norm(V)
norm(V) ≤ norm1(V) ≤ n · norm(V)

So to guarantee that the 1-norm is less than epsilon, it is enough to show that 2-norm is less than epsilon/sqrt(n).

However, in other ways the different norms are not equivalent; the norms can give different answers on the relative size of different vectors. Consider the three vectors A, B, and C:

A = [5, 0, 0]
B = [1, 3, 4]
C = [8/3, 8/3, 3]

Vector 1-norm 2-norm ∞-norm
A 5 5 5
B 8 ≈5.1 4
C ≈8.3 ≈4.8 3
Biggest Vector C B A

Each vector is considered the largest under one of the norms.

I've found the notion of norms to be useful in many different contexts. The performance differences between quicksort and mergesort can be described as quicksort having a better 1-norm but mergesort having a better ∞-norm. Buying more insurance coverage raises the 1-norm of your costs, but lowers your ∞-norm. A more conservative evaluation tends to focus on the worst-case outcome and thus favors something like the ∞-norm. For example, in the math library the relative size of the error at any location must be less than the stated number of ulps (units in the last place). It is not good enough to have a low average error, but a few locations, or even one location, with an very inaccurate result. During software development, risk assessments evolve with the release life cycle. A change that is welcome early in the release may be rejected as too risky a few weeks before shipping; one way to view this phenomena is that a larger value of p is being used to compute risk assessments later in the release.

References
Applied Numerical Linear Algebra, James W. Demmel
Matrix Computations, Gene H. Golub and Charles F Van Loan
Numerical Linear Algebra, Lloyd N. Trefethen and David Bau, III
(2007-03-01 15:20:32.0) Permalink Comments [2]

Calendar

« March 2007 »
SunMonTueWedThuFriSat
    
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
       
Today

RSS Feeds

XML
All
/Annotation Processing
/General
/Java
/JavaOne
/Numerics
/OpenJDK

Search

Links

    Blogroll
  • Download the JRE

    News

Navigation



Referers

Today's Page Hits: 502