Keith Bierman's Weblog

Wednesday December 14, 2005
Xeon is just as Good as Opteron (says HP!)
Thanks to the folks at theInq for calling our attention to this one. Aside from the Inq's warning that you need an HP signon, also be prepared for a Microsoft Word source document. It's actually a rather nice writeup, and its conclusion seems to be "true": if your workload is bottlenecked on I/O, having a faster processor with a faster memory interface won't help.
Of course, that really is pretty obvious and doesn't take 21 pages to justify, now does it?
In a variety of places, their detailed analysis favors the AMD processor. For example, from page 14:
1. The more content that can be cached in memory, the greater the Opteron advantage. This performance difference is tied to the different designs of the two processors... (detailing the FSB vs. HT issues)
which is "countered with":
2. If any of the server sub-components become a bottleneck, the Opteron memory access speed advantage is negated.
So the obvious solution is to favor systems with fast processors, fast and ample memory bandwidth and fast I/O subsystems. Unfortunately, that doesn't result in "all processors being equal", at least not if you buy the right subsystems ;>
(2005-12-14 13:18:40.0)

Monday December 12, 2005
The sad saga of xemacs vs. gnu emacs
I'd been a longtime user of the Xemacs that came packaged with the Sun
Studio tools years ago. I knew there was a split, and a dreamed-of
merge ... but I'd never really quizzed Ben Wing or Martin Buchholz
(two of the Sun engineers who contributed the linkage code, and
were Xemacs maintainers) about the how and why of the split and fork.
Here is a pointer to at least one side's worth of the sad saga.
(2005-12-12 13:25:28.0)

Thursday December 08, 2005
Caches Considered Harmful
For what seems like forever, designers have been adding more and more
cache to systems to reduce latency to memory. This has been successful;
it hasn't been the only approach, but it has been the most
typical.
But has it been Good?
- Caches are very energy intensive (essentially a large amount of
SRAM close to the CPU). The larger they are, the more energy wasteful
they are.
- Caches, on average, produce a benefit on the order of sqrt(size), so the
heat outpaces the benefit.
- Of course, with heat you pay several times. You pay for the
electricity to create the heat, you pay in the system design to cool
the device, you pay in the data center to cool the entire system, and
you typically pay a price in RAS because heat kills.
- Notably, adding cores (provided enough memory bandwidth is available) provides nearly linear improvement in throughput.
- And for cache experts, increasing the associativity increases their effectiveness.
- Caches help us avoid dealing with the underlying issue of doing
useful work while waiting for memory. Putting off the harder work of
innovation, or at least limiting the innovation to the process level
rather than the architecture level, is a form of laziness.
- When the data one wants isn't in the cache, it's often worse than
it would have been had there been no cache (so fancy non-cache
polluting loads and stores may be added to the ISA, the compiler, etc.).
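The sqrt(size) rule of thumb in the list above can be made concrete with a little arithmetic. The sketch below is only an illustration: the function names are made up for this example, and it assumes (as a simplification not stated above) that cache energy grows linearly with cache size.

```python
import math


def relative_benefit(size_ratio: float) -> float:
    """Benefit of a cache grown by size_ratio, per the sqrt(size) rule of thumb."""
    return math.sqrt(size_ratio)


def relative_energy(size_ratio: float) -> float:
    """Energy of the larger cache.

    ASSUMPTION (not from the post): energy scales linearly with SRAM size.
    """
    return float(size_ratio)


def benefit_per_unit_energy(size_ratio: float) -> float:
    """How much benefit each unit of energy buys at the new size."""
    return relative_benefit(size_ratio) / relative_energy(size_ratio)


if __name__ == "__main__":
    for ratio in (1, 4, 16, 64):
        print(f"{ratio:3d}x cache: "
              f"{relative_benefit(ratio):.1f}x benefit, "
              f"{relative_energy(ratio):.0f}x energy, "
              f"{benefit_per_unit_energy(ratio):.3f} benefit per unit energy")
```

Under those assumptions, quadrupling the cache doubles the benefit while quadrupling the energy, so the return per unit of energy is cut in half.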
So what are the alternatives to caching?
As in the citations above, the key observation is that if one has
additional "threads" ready to do useful work, that work can be done
while awaiting the data to be returned (from memory, from cache, from
disk, ... wherever) rather than keeping all that possibly
expensive iron (silicon) hot while it sits idle. And that's precisely what UltraSPARC T1
does.
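The latency-hiding argument can be sketched with a toy model. This is a textbook simplification rather than a description of the actual UltraSPARC T1 pipeline: every thread is assumed to alternate a fixed burst of compute with a fixed memory stall, switching threads is assumed to be free, and the cycle counts used below are hypothetical.

```python
def core_utilization(threads: int, compute_cycles: float, stall_cycles: float) -> float:
    """Fraction of cycles a core spends on useful work in the toy model.

    Each thread computes for compute_cycles, then waits stall_cycles for
    memory. While one thread waits, the core runs another ready thread.
    """
    if threads < 1 or compute_cycles <= 0 or stall_cycles < 0:
        raise ValueError("need threads >= 1, compute_cycles > 0, stall_cycles >= 0")
    busy_fraction_per_thread = compute_cycles / (compute_cycles + stall_cycles)
    # The core cannot be more than 100% busy, however many threads are ready.
    return min(1.0, threads * busy_fraction_per_thread)


if __name__ == "__main__":
    # Hypothetical workload: 10 cycles of work per 30-cycle memory stall.
    for n in (1, 2, 4):
        print(f"{n} thread(s): {core_utilization(n, 10, 30):.0%} busy")
```

With these made-up numbers a single thread keeps the core busy a quarter of the time, and four threads are enough to cover the stall completely, with no cache involved.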
So when you hear someone making a spurious claim about the UST1 being
cache starved, ask them how big a cache they think it should have, and
why? What level of associativity? What's the
downside? And, of course, point out that the application performance is
what counts, and it doesn't support the contention that the UltraSPARC
T1 family systems are cache starved.
NB: of course, caches aren't all bad. If you are focused on minimum
latency (fastest response time for a single thread) they can be very
effective. But if your goal is the most aggregate work for the least
power, they are certainly not your friend.
To learn more about caches
(2005-12-08 16:57:41.0)

Wednesday December 07, 2005
With All Due Respect Jonathan
9.6GHz is not the clock of our UltraSPARC T1 processor. As any hardware
engineer knows, clock speed is a simply measurable entity: you stick test leads on the appropriate wires and count the phase transitions. The
correct number is 1.2GHz.
Now, as software engineers we know that folks like to compute
theoretical maximum operation rates, and 1.2GHz * 8 cores does yield 9.6
GOPS as a theoretical limit. And thanks to many years of industry
confusion, probably due to the infamous Dhrystone benchmark, clock rate
and operations per second have become hopelessly confused in the minds
of many.
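The arithmetic behind the 9.6 figure is simple enough to write down. In this sketch the function name is made up, and the default of one operation per cycle per core is an assumption implied by the multiplication above rather than a published specification.

```python
def peak_ops_per_second(clock_hz: float, cores: int, ops_per_cycle_per_core: int = 1) -> float:
    """Theoretical peak operation rate: a ceiling, not a clock speed.

    ASSUMPTION: each core retires ops_per_cycle_per_core operations
    every cycle, which real workloads will not sustain.
    """
    return clock_hz * cores * ops_per_cycle_per_core


if __name__ == "__main__":
    clock_hz = 1.2e9   # the measurable clock: 1.2GHz
    cores = 8
    peak = peak_ops_per_second(clock_hz, cores)
    print(f"clock: {clock_hz / 1e9:.1f} GHz, peak: {peak / 1e9:.1f} GOPS")
```

The two numbers carry different units (cycles per second versus operations per second), which is the whole point: the product is a throughput ceiling, and the clock stays at 1.2GHz.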
But fundamentally there is no link between clock and work done. You can
have really, really simple instructions (the limiting case is a single
instruction), but then you need a very large number of instructions per unit of useful work
done ;>
As our Opteron based systems should have demonstrated, clock speed isn't a solid predictor of performance (Intel has a faster clock, and poorer performance).
I suppose one can argue that it is "pleasing that in lots of scenarios we do see scaling that is in fact more or less linear with the core count". See a benchmark focused blog for examples.
(2005-12-07 14:52:04.0)

Tuesday December 06, 2005
What I want to work with next.
Typically one blogs about what one has done, or has discovered. In this
entry, I'll talk about an area I want to spend some time working in
RealSoonNow.
As a performance analyst, much of my work has been reductionist.
That is, I take some application and make it go faster. Step 0 is to
measure it before doing anything to it, followed by figuring out little
bits that don't go as fast as they ought (or determining that the wrong
algorithms were used, and doing some wholesale surgery ;>) and
iterating. The key has always been to isolate the smallest bits
possible (faster turnaround, better leverage, etc.). And such work has
been rewarding in many different ways. But most of the time, my
computers are not focused on running just one application.
My laptop, for example, currently has over 300 threads in 80 processes running. I'm
not even driving it very hard. If I want to focus on any of the
specific processes or threads, tools such as Performance Analyzer (or
its earlier, more primitive predecessors, such as gprof) are fine.
But if what I really want to do is to maximize the performance of the
overall system (throughput) I've largely been toolless.
Worse, everything that my friends at Intel (see their last couple of
Intel Developer Fora) have been saying is that they are going to move
to a strongly multicore strategy (Justin Rattner spoke of hundreds to
thousands of cores, and ElReg reported this as http://www.theregister.co.uk/2005/03/04/intel_100_core/).
With the DProfile utility (keyword dataspace if you want to search for it at docs.sun.com) developed by
Nicolai Kosche
and friends, it's now possible to see how all the various threads and
processes actually interact inside the memory hierarchy.
Of course, this took a lot of infrastructure: SPARC needed to supply
enough runtime instrumentation, Solaris the APIs (including DTrace),
the compilers instrumentation (for optimal results), and the Performance
Analyzer extensions to collect and display the appropriate
information (this is where that keyword dataspace comes in handy for docs.sun.com searches).
No doubt Intel and Microsoft will eventually have as many threads in a
chip as Sun does today with Niagara (2010++??) No doubt, someday Windows+++ will
have mature support for highly threaded applications (in addition to
robustly supporting heavily loaded systems). Intel has, of course,
purchased several suppliers of threaded tools so ... and to be fair, the hardware threads only have to be on a single board to provide much of the same software opportunity (of course, the RAS is much better with just one chip ;>)
But why wait? Clearly such "complicated" environments are no longer the
sole province of supercomputer users and major IT departments (and a
power desktop user probably has a lot more challenging apps than I have
on my laptop, visual processing is easily parallelized....) so getting
started now with the next generation of tools is going to be a lot like
it must have been for the first radiologists. Lots to learn, with brand
new shiny technology!
So keep your eyes peeled for anything from *.sun.com with words like
DProfile or dataspace and dig in!
(2005-12-06 10:01:01.0)
Amazingly stupid competitor quote
You have to wonder if they've been misquoted:
Don't they even bother to read the literature? UltraSPARC T (nee Niagara) does break a lot of ground for a microprocessor. But effectively reducing latency (which caches are intended to do) is something that multithreading is known to be good for. So megacaches aren't required.
(2005-12-06 10:00:02.0)
Dec. 6th notable events
Of course, today is the big announcement of the first UltraSPARC T based systems (nee Niagara).
It is also the 2nd birthday of Jerry Sandor Bierman. When available, pictures from his birthday party will be located on Flickr.
(2005-12-06 10:00:00.0)
Which Evolves Faster: Hardware or Software?
Conventional wisdom has always had it that hardware is the "long pole"
in system design. Software can be changed up until the last moment (and
even beyond via patches). So the conventional answer is, of course, that
software evolves faster.
But for large, complex software, is it really true? Let's consider the
new UltraSPARC T chips (formerly known as Niagara). As can be found
#link to hw_blog (anyone know the best pointer?) there are 32 hardware
threads per chip.
Given that these hardware threads are quite fundamentally different from
having 32 separate cores, just how does an OS such as Solaris deal with
them? The answer is by ignoring the differences and to a first
approximation treating them as 32 "CPUs".
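From a program's point of view, that abstraction shows up as a single number. A minimal sketch using Python's standard library is below; the helper name is made up, and exactly what gets counted (cores, hardware threads, or something else) is up to the operating system and is not something this snippet can tell you.

```python
import os


def logical_cpu_count() -> int:
    """Number of "CPUs" the operating system reports to this program.

    os.cpu_count() may return None when the count cannot be determined,
    so this hypothetical helper falls back to 1 in that case.
    """
    count = os.cpu_count()
    return count if count is not None else 1


if __name__ == "__main__":
    print(f"The OS reports {logical_cpu_count()} logical CPU(s)")
```

Nothing in the returned number says whether two of those "CPUs" share a pipeline, which is exactly the kind of detail the performance tools also tend to hide.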
This mostly works well, but there are interesting performance corner cases
that cause great confusion ... because the tools (e.g. mpstat) haven't
really evolved to keep up with the hardware.
Sometimes the hardware does evolve faster.
Of course, the point of software layers such as an Operating System is
to provide abstraction of hardware details. Just which hardware details
need to be directly exposed is a deliberate process.
(2005-12-06 08:07:14.0)

Sunday November 13, 2005
HP FUD
In HP's latest bit of FUD they say
Sun has stated that Niagara will be binary compliant with previous SPARC designs
but this fact does not tell the entire story regarding how well a SPARC
binary compatible program will run on Niagara. It's not enough to
simply run - the program must also run well to be of actual value. Significant software optimization may be required to ensure that software will work well with Sun's Niagara.
As Niagara is not yet a released product, a detailed rebuttal is not yet "kosher". However, HP's warning makes it sound like Niagara is unique in providing a new pipeline; but almost every new generation of SPARC processor has had a new pipeline, and compiler optimizations to exploit it fully. That has seldom meant more than a small amount of work for most developers and it has never meant anything generally disruptive. Why anyone would think that this would be different in this generation requires more foundation than HP has bothered to provide.
That Niagara will support simultaneous execution of many threads is clear from the presentations that HP references. But HP's claim that this will require
....To fully exploit Sun's Niagara systems,
developers may have to change how applications are architected... Sun
has stated that Niagara changes the minimum application scalability
demands from 1-4 threads to 32 concurrent threads...
strikes me as pretty absurd. My laptop (a PowerBook G4) currently has 295 threads running (according to Apple's accounting). I've often had more than 500. Perhaps the authors of HP's FUD believe that servers customarily run a single instance of a single application (and that this is a Best Practice of some sort). It's not the way I've run most of my computing systems over the years; nor am I likely to start ;>
That HP can use so many words, and so many "bullet points", to say the same thing is a tribute to something. I think it's also worth noting that HP's primary CPU technology supplier (with the impending end of the PA family, and the already buried Alpha family; both RIP) is busily crafting large multi-core chips (which translates into "threads" more or less, in this context) and is saying things such as:
So to the extent that HP's contention is correct, that folks should be concerned about making their applications scale to large numbers of hardware threads, it would seem that such tinkering will bear fruit (albeit further in the future) on x64 chips from Intel as well as "Today" with SPARC. Consolidation (running multiple instances, or different applications) on a single Niagara chip doesn't require any application changes... and any application changes made for Niagara scaling are very likely to stand one in good stead for future x64 chips. So why not go with the future today?
(2005-11-13 22:39:11.0)

Friday February 04, 2005
On Trust
Over on groklaw, there's the usual Sun bashing.
webmink
has an excellent writeup on the IP issues that pre-CDDL licenses fail
to deal with (and, despite the current wailing, by some, I bet GPL3 addresses
the problem ... either in a fashion akin to the CDDL or at least
inspired by it).
But putting aside all the confusion about patents, GPL and Open being
synonyms, and the like, one particular quote on groklaw caught my
attention:
That's what I'd say. Use it only if you trust implicitly in Sun
This immediately reminded me of the classic Turing Award paper by Ken Thompson, Reflections On Trusting Trust (1983).
When programmers build on top of a system, they exhibit trust. Any
system with hundreds of thousands of lines of code (or worse, millions)
is simply too complex for nearly any programmer to individually inspect
each line for subtle security traps (and if the system is still evolving, how would they have any
time to develop their application?) Open
source may make it possible for someone to do their own proofs, but it's computationally infeasible.
Nor, of course, is trust limited to programming. When we get on an
elevator, we exhibit trust in the manufacturer of the elevator, in the
installer, in the maintainer, in the government body which audits them,
etc.
In my limited experience dealing with corporate lawyers, their focus is
not on "how can we cheat" or "how can we plant trapdoors in a contract"
but on "how can we ensure that both sides understand what's expected
of them and write it down in a mutually agreeable fashion" (no doubt,
there exist organizations that have other ethics; Enron comes to mind).
The CDDL seems, to this reader, to make it pretty explicit that all
contributors have to not only put in code, but put into the "common" pot
the appropriate rights to use and protections for the code. That
strikes me as fundamentally fair and useful.
Those that think that being precise about IP issues is somehow indicative of poor ethical behavior, and think
that the GPL is the superior approach (in this regard) are exhibiting
an incredible amount of trust ... in everyone that holds any software patents ... that no one
will take them to task for patent infringement. When the code in
question is simply shared among a small body of students that's a
pretty safe bet. But folks building multi-billion dollar businesses
ought to assume that someone might not see their efforts in the same
noble light.
It's sad that pointing this out, and trying to do something about it is seen as an attack or a threat.
(2005-02-04 12:29:08.0)

Wednesday November 17, 2004
An amazing floating point misoptimization
My thanks to David Hough for bringing this gem
from Microsoft to my attention. For those not interested in slogging
through the entire page; here are my favorite bits; verbatim, although
elided and colorized by me. Red for their most amazing decisions, and
blue for my commentary.
I pray that no application whose results matter to anyone use this flag!
This is from the Visual C++ 2005 compiler, and the flag in question is "fp:fast", specifically regarding sqrt (but I think it's a more generic problem in their thinking).
Under fp:fast, the compiler will typically attempt to maintain at least the precision specified by the source code. However, in some instances the compiler may choose to perform intermediate expressions at a lower precision than specified in the source code. For example, the first code block below calls a double precision version of the square-root function. Under fp:fast, the compiler may choose to replace the call to the double precision sqrt with a call to a single precision sqrt function. This has the effect of introducing additional lower-precision rounding at the point of the function call.
Original function
    double sqrt(double)...
    . . .
    double a, b, c;
    . . .
    double length = sqrt(a*a + b*b + c*c);
Optimized function
    float sqrtf(float)...
    . . .
    double a, b, c;
    . . .
    double tmp[0] = a*a + b*b + c*c;
    float tmp[1] = tmp[0]; // round of parameter value
    float tmp[2] = sqrtf(tmp[1]); // rounded sqrt result
    double length = (double) tmp[2];
Although less accurate, this optimization may be especially beneficial when targeting processors that provide single precision, intrinsic versions of functions such as sqrt. Just precisely when the compiler will use such optimizations is both platform and context dependant.
Furthermore, there is no guaranteed consistency for the precision of intermediate computations, which may be performed at any precision level available to the compiler. Although the compiler will attempt to maintain at least the level of precision as specified by the code, fp:fast allows the optimizer to downcast intermediate computations in order to produce faster or smaller machine code. For instance, the compiler may further optimize the code from above to round some of the intermediate multiplications to single precision.
    float sqrtf(float)...
    . . .
    double a, b, c;
    . . .
    float tmp[0] = a*a; // round intermediate a*a to single-precision
    float tmp[1] = b*b; // round intermediate b*b to single-precision
    double tmp[2] = c*c; // do NOT round intermediate c*c to single-precision
    float tmp[3] = tmp[0] + tmp[1] + tmp[2];
    float tmp[4] = sqrtf(tmp[3]);
    double length = (double) tmp[4];
This kind of additional rounding may result from using a lower precision floating-point unit, such as SSE2, to perform some of the intermediate computations. The accuracy of fp:fast rounding is therefore platform dependant; code that compiles well for one processor may not necessarily work well for another processor. It's left to the user to determine if the speed benefits outweigh any accuracy problems.
khb: unfortunately this would require the user to read the disassembled code, do a rigorous numerical analysis, and redo it every time the code is modified or the compiler updated (and the code then recompiled). This is, of course, totally impractical.
If fp:fast optimization is particularly problematic for a specific function, the floating-point mode can be locally switched to fp:precise using the float_control compiler pragma.
khb: this is, of course, backwards. If you are going to define a basically insanely liberal fp optimization, it ought to be enabled for the smallest bit of code practical (preferably with scoping, so it can't accidentally impact the whole compilation unit).
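To get a feel for how much accuracy the first rewrite gives away, the rounding can be emulated outside the compiler. The sketch below is not Microsoft's code and does not invoke their compiler; it uses Python's `struct` module to round values to IEEE single precision, the function names are made up, and the inputs are arbitrary.

```python
import math
import struct


def to_single(x: float) -> float:
    """Round a Python float (an IEEE double) to the nearest IEEE single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def length_double(a: float, b: float, c: float) -> float:
    """What the source code asked for: everything in double precision."""
    return math.sqrt(a * a + b * b + c * c)


def length_single_sqrt(a: float, b: float, c: float) -> float:
    """Emulation of the first quoted rewrite.

    The sum is formed in double, rounded to single, and the square root
    result is rounded to single before being widened back to double.
    """
    tmp = to_single(a * a + b * b + c * c)
    return to_single(math.sqrt(tmp))


if __name__ == "__main__":
    a, b, c = 1.0, 2.0, 3.0   # arbitrary example inputs
    precise = length_double(a, b, c)
    fast = length_single_sqrt(a, b, c)
    print(f"double sqrt : {precise!r}")
    print(f"single sqrt : {fast!r}")
    print(f"relative err: {abs(fast - precise) / precise:.3e}")
```

Even for these harmless inputs the result is only good to about seven decimal digits instead of sixteen, and nothing in the source code tells the reader that this happened.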
(2004-11-17 15:54:52.0)

Monday October 04, 2004
Some ruminations about software application licensing
The Problem
Many important software packages (e.g. Oracle) are licensed on a per processor basis. That is, if you purchase a license for a 72 processor machine, the price is on the order of 72 times that of a single processor license. Is this sensible? What are the unintended consequences for Society at large?
History
Consider a timesharing service that a very previous company of mine employed from time to time, BCS (Boeing Computing Services). BCS had a large ensemble (more than 100) of CDC mainframes. They were binary compatible, and set up to have automatic failover, so that a job that started on one system could end up on another (or having run on a series of different systems). This was necessary to provide adequate reliability. Most software was metered; that is, one was billed as the sum of resources consumed, such as
Nc Dollars per CPU minute
Nd Dollars per amount of diskspace used
Ni Dollars per I/O transaction
etc.
A complicating factor, however, was that while all the systems were binary compatible, they ran at different speeds. It was the clear goal of the software providers (most often the computer vendors themselves, or the timesharing system operator itself) that the price for running a job should either be independent of the speed of the system, or (more frequently) carry a premium based on the faster CPU.
Since jobs frequently wound up running on more than one system, the billing algorithm was adjusted so that one was charged as if one ran on the originally selected system (so if the job “failed over” to a faster system, the rate was adjusted so that the final charge was the same as if it had run on the original system).
When the industry moved away from timesharing, and thus from charging per unit time, towards a software purchase model, there remained a vague notion on the part of software vendors (now, more frequently someone other than the computer vendor itself) that if the customer has 10 slow machines, or one machine that is 10x faster, the payment due to the software vendor should remain the same.
As different vendors' processors are more (or less) capable, there are software vendors who establish base prices that differ from one hardware vendor to another. This may or may not be acceptable (legally or economically) for some. Even where it is Legal and Accepted, when a vendor has multiple microarchitectures there may be no single processor whose performance can act as a reliable base.
As a result, many software vendors have simply relied on "processor count", it being a crude metric for the power of a system. There are, of course, other ways to price software licenses (including per user, per system, per site, per actual user, and per employee arrangements). However, we'll focus on the lamentably common practice of pricing per “CPU”...
But...
Unfortunately, there is no objective definition of a processor. For example, a VLIW machine (like the extinct Multiflow 28) has a very large number of functional units --- more than a quad core SPARC. The result may well be faster performance for the “single processor”, but the licensing fees for the multi-core processor are 4x more expensive!
Current trends in computer design make such issues increasingly problematic. One could argue, as IBM does, that each identifiable “processor element” (what Sun calls a “core”) is an objectively identifiable “processor”. But that a “core” exists as an identifiable physical entity is merely a side effect of current design tools and methods (viz. define a single core, step and replicate). With more advanced CAD tools, all of the logical cores could be instantiated and then baked into one huge monolithic mass (which might well have technical benefits beyond that of confusing licensing schemes).
An Aside: Software engineers may recognize this as essentially what a “globally optimizing” compiler does (full interprocedural analysis, etc.) vs. separate compilation.
An Aside: Yes, to all the CAD developers and hardware engineers reading this, I appreciate the manifold reasons why we don't do this (today), and why it's hard (it's possible we might never do it).
Another complication comes about from the software (or firmware) concept of “virtualization”. Schemes such as those touted by IBM and Microsoft (one physical processor can appear to be any number of processors) provide another confusing view of the actual system from the perspective of software licensing (not to mention how exciting it can be to maintain 20x the number of OS configurations on a single box...).
Yet another complication is provided by the concept of hardware threads (such as are found in chips ranging from Intel's Xeon to IBM's Power5 and various Sun chips). These threads are typically exposed to the programmer as “virtual CPUs”; that is, if the program inquires from the system how many processors there are, a single Xeon chip currently reports 2. The performance of most such hardware threading schemes has been poor (that is, the second thread adds 10%-30% performance), and hardware threads have therefore been ignored by ISV software licensing... but as hardware threading matures, the performance may well approach N, where N is the number of hardware threads.
In the event that the problem is not clear, let us consider the case of a chip such as that described by Ace's Hardware (this is not to say that I am confirming any or all bits of speculation and assertions made by that author). But to sum up, they claim it can be described as a single chip, with 8 SPARC cores, each of which has 4 hardware threads. Let us assume for this discussion that they are correct; then...
Is this a single processor with 32 threads? (If so, the license fees for Oracle would be approximately $15K, using my understanding of the current rules, which ignore threads.) Or is it “8 processors”? (If so, the price might be more like $320,000, because larger processor counts start with a higher base cost.) Or would it be even higher due to the large number of hardware threads?
Getting very speculative, what if some key hardware resources were not replicated all N times? Indeed, what if there was a key pipeline resource critical to Oracle performance shared amongst all N cores? Does this change the picture? If not, isn't that a truly inequitable licensing algorithm?
It should be clear to the reader that the current situation, where software is licensed by number of “processors”, is hardly architecture neutral and has no objectively measurable basis. In marketplaces where the software vendor has a near monopolistic position, having no objective basis for pricing across platforms may represent a litigation risk (I Am Not a Lawyer; this is my opinion and not that of any member of a Bar Association). So what can we do instead?
A Solution (starting from the basics)
Every modern computing device is composed of a collection of many chips, each of which has a number of transistors. Each transistor has performance characteristics, such as switching speed.
In an Ideal implementation, all hardware vendors would disclose the number of total transistors in the system, and a breakdown by speeds (most frequently, all the transistors have similar characteristics in a given chip, but in a large system, different chips may have radically different characteristics. Also, in some cases, some parts of a chip may have very different characteristics (e.g. different clock rates)).
In an Ideal implementation a software vendor would compute a billing factor (BF) for a given system model. BF would be defined as the sum of all Ti*Ni for i from 1 to the number of transistor types, where each T is a cost per transistor type and each N is the number of transistors of that type. The total price can then be computed as the product Base_Price*BF. Base_Price will be the same across all platforms (it may be adjusted on a per customer basis based on volume or other applicable discounts). The BF provides for an architecture and CAD tool neutral platform adjustment.
As most large computer systems are composed of replicated elements, it may be the case that the most common sub-block may be used to compute a BF, and the system BF can be reasonably approximated by Nsubblock*BF. This technique will be most useful in “Capacity on Demand” systems where entire sections of the machine are only enabled at some later date. At the time additional subblocks are enabled, the incremental software license charges are trivial to compute.
In the event that hardware vendors fail to publish the precise counts and transistor speeds, the number of transistors can be approximated based on the size of the chip and the particular geometry (130nm, 90nm, etc.). An “average value” may be employed where various transistor speeds are unavailable.
Advantages
The primary advantage of this system is that it is architecture (both micro and macro) independent, and it is independent of the State of the Art in CAD tools. Having an objective system would be a nice thing to have.
Alternatives?
Is this a unique optimal solution? Perhaps; but if one is environmentally focused, there's another approach that may have some appeal...
Description
In the Ideal implementation, the end user's computing system would be augmented by a machine readable watt-hour meter, and its operating system would have fine grained accounting facilities. At the start of the licensed application's execution, the current value of the watt-hour meter would be recorded. At the end of the licensed application's execution, the current value of the watt-hour meter would be obtained, and the initial value subtracted from it. This represents the entire energy consumed by the system during the licensed application's execution.
Since most computing systems execute more than one program at a time, the Operating System's accounting facilities will be employed to determine the fraction of the machine's resources that were consumed during the execution of the licensed application. The cost per execution will then be computed as Base_Factor*watt_hours*Usage_factor (where Usage_factor, UF, is typically less than 1; it can only be larger than 1 when there is some sort of “Capacity on Demand” functionality deployed). This would represent a return to “price per usage” as in the days of mainframe computing.
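Both pricing schemes reduce to a few multiplications, which the following sketch spells out. The function names, the sample transistor costs and counts, and the meter readings are all hypothetical; only the formulas (BF as the sum of Ti*Ni, price as Base_Price*BF, and per-execution cost as Base_Factor*watt_hours*Usage_factor) come from the text.

```python
from typing import Iterable, Tuple


def billing_factor(transistor_types: Iterable[Tuple[float, int]]) -> float:
    """BF = sum of T_i * N_i over all transistor types.

    Each entry is (T, N): the cost per transistor of that type and the
    number of transistors of that type in the system.
    """
    return sum(cost * count for cost, count in transistor_types)


def license_price(base_price: float, bf: float) -> float:
    """Total price = Base_Price * BF."""
    return base_price * bf


def energy_metered_cost(base_factor: float,
                        meter_start_wh: float,
                        meter_end_wh: float,
                        usage_factor: float) -> float:
    """Per-execution cost = Base_Factor * watt_hours * Usage_factor.

    watt_hours is the meter reading at the end of the run minus the
    reading at the start; usage_factor is the application's share of the
    machine as reported by OS accounting.
    """
    watt_hours = meter_end_wh - meter_start_wh
    return base_factor * watt_hours * usage_factor


if __name__ == "__main__":
    # Hypothetical system with two transistor types: (cost, count).
    bf = billing_factor([(2.0, 100), (0.5, 400)])
    print("BF:", bf, "price:", license_price(10.0, bf))

    # Hypothetical run: 12 Wh consumed, application used 25% of the machine.
    print("metered cost:", energy_metered_cost(0.5, 1000.0, 1012.0, 0.25))
```

Note that neither function ever needs to know what a "processor" is, which is the attraction of both schemes.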
<Un>intended Consequences
The
current "processor count" metric encourages users to buy machines with
the fastest single thread performance (admittedly, this isn't the only
encouragement ;>). The most reliable way to deliver that,
generation after generation, has been to "chase" very high clock rates.
Unfortunately this is exceedingly energy inefficient (as well as
driving up Fab costs rapidly). While California's energy problems
certainly weren't created by simply having too many fast clocked
computers, it provides a graphic lesson in the downside to power hungry
approaches (as well as the laws of economics and the consequences of
incompetent government).
The transistor count approach would
reward designers (and consumers) who got the most performance per
transistor; while this is appealing from an engineering/logical sense,
it's hard to see how that provides a useful benefit to society per se.
The price per watt approach has the intended consequence of rewarding consumers and designers for providing better performance for lower energy consumption.
[and before anyone asks, yes, I think it would have been more sensible
than setting fleet mpg goals to have made the taxes on gasoline vehicle
dependent, and had a factor tied to fuel economy. More efficient cars
should be charged less, and less efficient ones should be charged more.
Show cars and other historical vehicles could remain unmodified ... but
would pay ruinous rates if run as daily commuters ;> ]
If you like these ideas, please pass them on. Write about them. Lobby for them. Implement them in your products.
(2004-10-04 20:49:40.0)

Wednesday August 25, 2004
Some random pointers
Well, not entirely random. I found them interesting.
The End is Near!
Earth at Night
HP silences User Group
though I can sympathize with the Interex leadership. Really annoying
the vendor can cause a trainwreck (as, regrettably, caused the Sun User
Group (at least the US body) to die 13+ years back).
(2004-08-25 00:59:55.0)
Word considered harmful?
Dvorak makes an interesting argument for why Word must
die. As the last version I used with any great regularity was Word97, I
can't really see things entirely his way; but it's a good read.
Not that it matters, but my first
version of Word was 1.05 for the Macintosh. It was greatly inferior to
the Xerox word processor I'd grown fond of (but we couldn't afford for
our little aerospace consulting practice). It was much slower to use
than WordStar on our shared-bus 8 MHz Z80s (with a RAM disk big enough
for my documents, the compilation system and the project du jour); but
combined with a laserprinter was vastly faster at printing equations
(we had been using a roff-like package on the Z80, "Fancy Font" with
multiple Epson printers, each being driven to early destruction by
using them exclusively in graphics mode, and striking each line about
seven times on average).
I think my favorite word processor (after the Xerox dedicated hardware)
probably remains "FullWrite" a package for the Mac that eventually
disappeared into the void. If memory serves, it had pretty good (better
than Word2K) page layout facilities built right into the word processor,
with word processor ease of use. But as it lost vendor support long ago,
I eventually gave it up. An unrelated vendor (also long disappeared, as
best I can tell) produced my favorite spreadsheet of all time,
"Trapeze".
A combination of Star/Open/NeoOffice tends to meet most of my needs
reasonably well (with the odd dip into FrameMaker for some larger more
elaborate documents). But I do a lot
less mathematical typing these days, so comparing my needs of today and
of 17+ years ago is probably not terribly meaningful ... even to myself.
(2004-08-25 00:51:56.0)