Vijay Tatkar's Blog

Friday October 09, 2009

New SunStudio Screencast on Improving Performance of Parallel Codes
Cool new video from Darryl just showed up on Sun's HPC site. In this video, Sun Studio expert, Darryl Gove, shows how you can use Sun Studio Performance Analyzer to improve performance of a parallel application. Darryl uses the Mandelbrot set application to highlight the features. This screencast is also one of the demos we will run at Oracle OpenWorld that I mentioned in my previous blog.
Take about 15 mins to view it. You will learn something about OpenMP, parallelization and even Mandelbrot sets.
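For readers who want a feel for why the Mandelbrot set is a handy demo for parallel tuning, here is a minimal sketch in Python. It is not the code from Darryl's screencast (which is OpenMP-based and not reproduced here); the function names `escape_iterations` and `row_costs`, the grid size and the iteration cap are all illustrative assumptions. It only shows that different image rows need very different amounts of work, which is the kind of imbalance a profiler can reveal when rows are split across threads.

```python
def escape_iterations(c: complex, max_iter: int = 100) -> int:
    """Count iterations of z = z*z + c until |z| > 2, capped at max_iter."""
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def row_costs(width: int = 40, height: int = 20, max_iter: int = 100) -> list:
    """Total iterations per image row, used here as a proxy for per-row work."""
    costs = []
    for j in range(height):
        # Map the row index to the imaginary axis, roughly [-1.2, 1.2].
        im = -1.2 + 2.4 * j / (height - 1)
        total = 0
        for i in range(width):
            # Map the column index to the real axis, roughly [-2.0, 1.0].
            re = -2.0 + 3.0 * i / (width - 1)
            total += escape_iterations(complex(re, im), max_iter)
        costs.append(total)
    return costs


if __name__ == "__main__":
    costs = row_costs()
    print("cheapest row:", min(costs), "most expensive row:", max(costs))
```

Rows that pass through the middle of the set hit the iteration cap for many points, while rows near the top and bottom escape almost immediately, so handing each thread an equal number of rows does not hand each thread an equal amount of work.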


Posted by tatkar ( Oct 09 2009, 03:24:19 PM PDT ) Permalink Comments [0]

Friday September 25, 2009

What I saw at IDF09
After a short break, I'm back in blogland. In the meantime, my team and I have moved back from Sun's Cloud Computing Engineering organization to Sun Studio (Compilers and Tools). It was a wonderful ride and I learned many things that I intend to build on in the coming months. Of course, my group is still involved in the same Cloud-related tools project of making the HW, SW stack and tools more easily accessible to developers who don't have OpenSolaris (or Solaris) on their desktop and may not have access to a SPARC machine in their development group. More on that in the coming weeks, but right now I'll turn to Sun Studio related activities.

This was the week for Intel Developer Forum 09 (Sept 22-24, Moscone West, San Francisco).
Last year, the emphasis seemed to be on Nehalem, AVX, Graphics and Parallelism.
This year, the emphasis seems to be around Mobility, some followup on Parallelism and Cloud Computing. Intel is totally on top of the world with the Nehalem chip: a well-balanced, high performance chip with a great feature set that the company can build their entire roadmap on. They are on a high, and know they have a winner in Nehalem.
This year again, we were invited to have a booth and give a Chalk Talk at the conference. Booth duty was interesting: you get into some deep-dive conversations with the interesting folks who walk by, and we got our share this year as well, which makes it all worthwhile and stimulating. It's an ideal time to listen to what other developers have to say about our products (both good and bad, and we heard both sides) and to share views on where the environment is headed. If you remember, I gave a Chalk Talk last year as well. This year's talk was in our own booth, so it was more lightly attended, but it was fun (and chaotic) all the same. My focus was on compiler performance and the new World Records we have set since the launch of Nehalem systems (get details here: http://www.sun.com/benchmarks/software/index.jsp and look for the Sun Studio logo), on new features (OpenMP 3.0, SSSE3, SSE4.1, SSE4.2), new parallelization assistance tools (DBXtool, MPI analyzer, DTrace-style profiling with D-Light and the DTrace GUI), ease of development with a fully integrated IDE (based on NetBeans 6.5 with considerably enhanced C/C++ support) and continuous ongoing improvements (lots of improvements on the performance side, with a better vectorizer, register allocator, instruction scheduler, etc., an improved Performance Analyzer and Thread Analyzer with support for new HW counters, and too many others to describe here in detail). Look up here for more details.

Intel itself builds IDF as a showcase for the next, next, next generation of technologies. What was truly interesting was how much focus there was on Cloud Computing. They had two dedicated 3-day tracks on this (one for Public Cloud and another for Enterprise Cloud), but more than that, it was interspersed in many of the other talks as well. The emphasis was clearly on educating attendees about the technologies they provide to enhance datacenters.


The impression I got was that they were pushing Clouds for Enterprises that needed hyper-scale efficiency that was utilitarian (rather than differentiating), with homogeneous HW and a greater focus on cost of initial ownership (rather than TCO; they aren't convinced that Clouds differentiate on TCO). In fact, IMO, their view was strongly datacenter-centric, rather than a view of the Cloud as elastic, available, multi-tenant, heterogeneous, business-critical and differentiating. Not that they thought these issues were unimportant, but it looked like they weren't going to address them, as they aren't core to Intel. Fair enough, but it's important to know how one of the primary technology providers views this.

Posted by tatkar ( Sep 25 2009, 12:27:12 PM PDT ) Permalink Comments [0]

Tuesday June 30, 2009

How does Sun Studio stack up against GCC on Nehalem
This is one of the FAQs on the compiler front that I constantly get at TechDays, at customer meets, etc. I often point to the various benchmarks that Sun Studio has won (a World Record means that this compiler beats every other compiler in the business, and that a system configured this way, with the specified HW, OS and compiler levels, delivers the best performance you can currently get).
I have devoted a few blogs to that effect as well in the past.
Two team members of the Sun Studio organization, John Henning (our SPEC rep, really) and Karsten, have now written a paper comparing Sun Studio and GCC on Nehalem systems. It's a must-read if you've struggled with this issue in the past. You can find it here, and you can post your comments or ask questions on that page as well.
Of course, as has been variously argued in the past (see this thread, e.g.), SPEC doesn't always give the full picture. However, IMO, it's a good stand-in for what you can get out of a compiler. The suite is a broad set of industry-accepted applications that represent significant market segments in themselves. Tuning and extracting good performance isn't just a matter of turning some compiler switches on. You can get good gains by analyzing applications carefully and using compiler tunings to improve their performance, which is generally what happens in the case of SPEC applications.
Posted by tatkar ( Jun 30 2009, 10:16:22 AM PDT ) Permalink Comments [0]


Tuesday April 14, 2009

Dozen new World Records with Sun Studio + OpenSolaris at Nehalem Launch

Sun today launched x64 systems based on the Intel Nehalem chip (aka Xeon Processor 5500 series) in grand style! Sun is calling these Open Network Systems to emphasize that it's about much more than just a chip upgrade. In particular, the message is around system design and innovation that encompasses "the convergence of open compute, storage, networking and software to deliver best application performance, simplicity and savings".
Application performance (and benchmarking performance) is best measured by what Sun highlighted today. Closer to (my) home, it's a hit out of the park. Consider what the combination of Sun Studio 12 update 1 and OpenSolaris has cooked up (see the disclosure below for the numbers).

All in all 13 World Records! Clearly what I would call game, set and match (Sorry about mixing up the sporting metaphors between baseball and tennis here).
You can find more performance briefs here. See here for a whitepaper on how Solaris optimizes for Nehalem.
This is a great chip to work on, and the Sun Studio team has done a wonderful job of delivering outstanding performance in utilizing its strengths.
Sun Studio 12 update 1 is currently in early access, and you can join up to preview it and take advantage of these performance benefits. Or you can get the same via Sun Studio Express 3/09, also currently available for download.

Required Disclosure: SPEC and the benchmark names SPECint, SPECfp and SPEComp are registered trademarks of the Standard Performance Evaluation Corporation. Results from this announcement and www.spec.org, as of 04/12/09.
Sun Fire X4170 (2 chips / 8 cores / 16 threads, OpenSolaris 2008.11, Studio 12 update 1) - 36.8 SPECint2006
Sun Fire X2270 (2 chips / 8 cores / 16 OMP threads, OpenSolaris 2008.11, Studio 12 update 1) - 254,318 SPECompL2001
Sun Ultra 27 (1 chip / 4 cores / 8 threads, OpenSolaris 2008.11, Studio 12 update 1) - 45.4 SPECfp2006
Sun Blade X6270 (2 chips / 8 cores / 16 threads, OpenSolaris 2008.11, Studio 12 update 1) - 50.4 SPECfp2006
Sun Blade X6275 (2 chips / 8 cores / 16 OMP threads, OpenSolaris 2008.11, Studio 12 update 1) - 48,097 SPECompM2001
Sun Blade X6275 (2 nodes with 2 chips / 8 cores / 16 threads each, OpenSolaris 2008.11, Studio 12 update 1) - 478 SPECint_rate2006, 355 SPECfp_rate2006


Posted by tatkar ( Apr 14 2009, 02:03:48 PM PDT ) Permalink Comments [2]

Tuesday December 09, 2008

3 new Sun Studio World Records with AMD quad-core Sun x64 servers
Sun announced today enhancements to the Sun Fire(TM) x64 servers and Sun Blade(TM) systems lineup.
The new lineup includes a newer rev of the quad-core Opteron chips, the so-called Shanghai processors.
The new world records are in the area of SPECint_rate, SPECompL2001 and SPECompM2001.


These benchmarks are with OpenSolaris 2008.05 OS and Sun Studio Express 11/08 compilers. You can download OpenSolaris here and Sun Studio Express compilers here.

Required Disclosures:
SPEC, SPECint, SPECjbb, SPECweb and SPEComp are registered trademarks of the Standard Performance Evaluation Corporation.
Results from http://www.spec.org and this announcement as of 12/6/2008.

Sun Blade X6440 server module (4 chips, 16 cores, 16 OMP threads), 35896 SPECompMpeak2001.
Sun Fire X4600 M2 server, 386 SPECint_rate2006.
Sun Fire X4440 server (4 chips, 16 cores, 16 OMP threads), 175,648 SPECompLpeak2001.

Posted by tatkar ( Dec 09 2008, 04:09:02 PM PST ) Permalink Comments [0]

Tuesday November 18, 2008

Sun Studio Express 11/08 + Shanghai server module = new SPEComp World Record
Sun's newest blade, the Sun Blade X6440 server module powered by the latest Quad-Core AMD Opteron processors code-named "Shanghai", posted the best x86 16-thread result on the prominent HPC SPECompM2001 benchmark with the newly released Sun Studio Express 11/08 compilers on OpenSolaris 2008.05. SPEComp is often used as a barometer of performance for shared-memory, compute-intensive scientific applications, so this is a great announcement during the ongoing Supercomputing 2008 conference.
The Blade server module was announced as part of Sun's Technology Demos highlighting a new Open Petascale computing environment called the Sun Constellation.

Required Disclosure: SPEC and SPEComp are registered trademarks of the Standard Performance Evaluation Corporation.
Results from http://www.spec.org and this announcement as of 11/16/2008.
Sun Blade X6440 (4 x AMD Opteron 8384 chips, 16 cores, 4 cores/chip, 16 threads) SPECompM2001 - 35,896.

Posted by tatkar ( Nov 18 2008, 05:33:11 PM PST ) Permalink Comments [0]


Monday October 13, 2008

SunStudio World Records on new Sun T5440 Server
Sun today announced a new revision to the CMT (Chip MultiThreading) lineup: Sun SPARC Enterprise T5440 servers.
These provide an astonishing 256 threads in a 4U box. A short while back, 256 HW threads would have been a BIG datacenter machine; these are now in a rackable 7-inch (4 Rack Unit) container! In this world of increasing parallelism, this is Sun's next and to-date most impressive step. All the while, as I talk about how much attention we need to pay to parallelism as a way of getting applications to exploit the full capabilities of HW, the systems guys have been flooding the market with 4 or more cores per chip.

Eight new World Records were announced with these new servers. Of these, three were set with Sun Studio 12 (SPECint_rate2006, SPECfp_rate2006 and SPECompL_base2001; see the disclosure below).

These configurations handily beat IBM Power6-based, Intel Itanium, Intel Xeon, and AMD Opteron based systems in the same category (4 CPUs or fewer).
From what I can tell these are excellent mid-range datacenter boxes, both in performance and power characteristics, as well as cost.

Required Disclosure:
SPEC, SPECint, SPECfp and SPEComp are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 10/10/08. Sun results submitted to SPEC for review.
Sun SPARC Enterprise T5440 (4 x UltraSPARC-T2+ chips, 32 cores, 256 threads) SPECint_rate2006 - 301, SPECfp_rate2006 - 230, SPECompL_base2001 - 208,492.
IBM POWER6: IBM p 570 = 4 x Dual Core POWER6 processors @ 4.7GHz / 32MB L3 cache per processor, 32GB, AIX 5.3. SPECint_rate2006 - 243, SPECfp_rate2006 - 216.
Intel Xeon: IBM System x3850 M2, 4 x Xeon X7460, 24 cores, 64GB memory, SLES 10. SPECint_rate2006 - 294, SPECfp_rate2006 - 156.
AMD Opteron: HP DL 585 G5, 4 x Opteron 8360SE, 16 cores, 64GB, SLES 10. SPECint_rate2006 - 199, SPECfp_rate2006 - 170.
Intel Itanium: HP Integrity rx6600, 4 x Itanium2 1.6GHz, 24GB memory, HPUX11i. SPECint_rate2006 - 102, SPECfp_rate2006 - 90.8.
Sun SPARC Enterprise T5440 (4 x UltraSPARC-T2+ chips, 32 cores, 256 threads): SPECompL_base2001 - 208,492.
Supermicro X7QC3 (4 x Intel Xeon X7350, 16 cores, 16 threads): SPECompL_base2001 - 82,487.
Tyan Thunder n425QE (S4985E) (4 x Quad-Core AMD Opteron processor 8360 SE, 16 cores, 16 threads): SPECompL_base2001 - 146,796.

Posted by tatkar ( Oct 13 2008, 03:03:44 PM PDT ) Permalink Comments [0]

Wednesday August 20, 2008

IDF Chalktalk and two new World Records with Sun Studio
IDF Chalktalk went nicely yesterday. There was some decent interest in Sun Studio. This being my first chalktalk, I was a bit wary of what the expectations around it were. I needn't have worried: I had plenty of support from well-wishers and colleagues who dropped in for moral support.
It helped, of course, that I could start the talk by announcing two new World Records with Sun Studio compilers.

These performance records should put to rest some speculation about how well Sun Studio supports the Intel Xeon processor. Of course, competitive performance and SPEC benchmarking is a forever leap-frog contest, and these numbers show that Sun Studio is extremely competitive even with the Intel compilers. Internally, of course, we know this, but with SPEC disclosure rules it gets hard to post competitive, comparative data that shows this. These disclosures provide the finality that is otherwise hard (but not impossible) to provide.
In addition to performance, which is forever the #1 concern with any compiler choice, several other topics of interest came up. Besides these very interesting discussions, I also gave short overviews of different aspects not covered here, e.g. the IDE, debugger, browsing, editing, project management in the IDE, etc. Of course, with over 100 commands in the Sun Studio product line and over 200 library versions, it's hard to cover it all. But all in all, the feedback was positive and I had a great time doing it!
I look forward to doing it again next year!

Disclosure Statement:
SPEC, SPECfp and SPEComp are registered trademarks of the Standard Performance Evaluation Corporation. Results from www.spec.org as of 08/19/08. Sun results submitted to SPEC.
Sun Fire X2250 with two dual-core Intel Xeon 5272 processors and OpenSolaris: 13394 SPECompM2001
Sun Fire X2250 with Intel Xeon 5272 processors and OpenSolaris: 26.0 SPECfp2006, 24.8 SPECfp_base2006.


Posted by tatkar ( Aug 20 2008, 11:52:46 AM PDT ) Permalink Comments [0]

Wednesday July 16, 2008

Sun SPARC Enterprise Server M9000 and Sun Studio Break 2TFlop barrier
Sun's recently announced SPARC Enterprise Server series of machines has run off with a string of World Record Benchmarks (see SPECfp 2006 records here, SAP-SD 2-tier records here  and outrunning Power6 comparable systems here). They also excel on the SPECompM2001 benchmark with new World Records (here).
One of the most important benchmarks in the HPC marketplace is the Linpack benchmark (which solves a dense system of linear equations, allowing the user to scale the size of the problem and to optimize the software in order to achieve the best performance on a given machine). Linpack results are carefully tabulated (here) and used in buying decisions. Sun has achieved 2+ TeraFLOPS of performance, handily beating the nearest rivals, including IBM's Power6 systems and HP's Itanium-based Integrity Superdomes, by 2x and 3x margins. Sun's results are announced here. It's great to see Sun back in this market with a strong SPARC processor presence.
This benchmark is a highly parallel test designed to measure how fast a computer system can solve linear equations, a common task in engineering and scientific applications. Consequently, Sun Studio's ability to optimize is absolutely critical to this benchmark and to the HPC market. It's a great vindication of the tireless efforts of many engineers to achieve this end result. And it is a testimony to the excellent relationship between Sun and Fujitsu that Sun could pull this feat off based on the latest Fujitsu SPARC64 VII quad-core processors!
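To make "solving a dense system of linear equations" and the FLOPS figure a little more concrete, here is a toy sketch in Python. It is not the Linpack benchmark code and has nothing to do with the tuned libraries behind the record; the names `solve_dense` and `linpack_flops` are made up for illustration, and the operation count used (roughly (2/3)·n³ + 2·n²) is the commonly quoted convention, whose lower-order term varies between benchmark variants.

```python
import random
import time


def solve_dense(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting.

    Works on copies, so the caller's matrix and right-hand side are untouched.
    """
    n = len(b)
    a = [row[:] for row in a]
    b = b[:]
    for k in range(n):
        # Partial pivoting: bring the largest remaining entry in column k up.
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate column k from every row below the pivot row.
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= f * a[k][j]
            b[i] -= f * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def linpack_flops(n: int) -> float:
    """Nominal floating-point operation count credited for an n x n solve (assumed convention)."""
    return (2.0 / 3.0) * n ** 3 + 2.0 * n ** 2


if __name__ == "__main__":
    random.seed(0)
    n = 80
    a = [[random.random() for _ in range(n)] for _ in range(n)]
    b = [random.random() for _ in range(n)]
    start = time.perf_counter()
    solve_dense(a, b)
    elapsed = time.perf_counter() - start
    print(f"n={n}: about {linpack_flops(n) / elapsed / 1e6:.1f} MFLOPS (pure Python, illustrative only)")
```

The reported rate is simply the nominal operation count divided by the wall-clock time of the solve, which is why both compiler optimization and the freedom to scale the problem size matter so much to the final number.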
Way to go! May this be a key entry point for Sun's increasing presence in the HPC market.
Posted by tatkar ( Jul 16 2008, 05:21:27 PM PDT ) Permalink Comments [0]


Monday June 30, 2008

Sun Studio preso at annual OSDevCon
The annual OpenSolaris Developer Conference (OSDevCon) was held this year in the heart of the Czech Republic, Prague, from June 25 to 27 (2008). The first day of tutorials included a free tutorial by our own Roman Shaposhnick on "OpenSolaris- an Ultimate Development Platform ?" Roman talked about Sun Studio, of course, and how Sun Studio and OpenSolaris combine to form a stable development platform. Here's a video of his tutorial/presentation (a complete 1hr+ preso).
Roman had two guest presenters: Adrian De Groot of KDE and Dennis Chernaivanov from Docarema. KDE uses Sun Studio for development and is very happy with the state of the tool (I had blogged about it earlier here). Docarema uses both Linux and Solaris servers as their backend.

Dave Stewart, our Intel partner guy, was also there and gives his take on Roman's presentation in his blog here. Dave provides an interesting summary of events related to Roman's tutorial and of the views of the KDE and t-Bricks developers on the strengths of, and concerns about, using Sun Studio for development.

Seems like it was a successful conference all around!
Posted by tatkar ( Jun 30 2008, 01:30:08 PM PDT ) Permalink Comments [0]


Thursday June 19, 2008

Sun HPC ClusterTools 8.0 EA2 now available (now on Linux too)
Sun HPC ClusterTools 8 Early Access 2 is a pre-release version of OpenMPI 1.3 with the latest bug fixes and some new features.
CT8 EA2 is the first release to support Linux.
New features in CT8 include:

CT8 works with Sun Studio 10, 11 and 12 as well as gcc (on Linux only). CT8 works on Solaris 10 (11/06 and above) for both SPARC and x86 (Intel and AMD). Get the download here and give it a try!
Posted by tatkar ( Jun 19 2008, 01:32:25 PM PDT ) Permalink Comments [3]

Tuesday May 13, 2008

Sun unleashes Quad-core Barcelona systems
And not a day too soon, either! To quote Sun's press (I don't think I'm capable of writing such long, flowery and yet wonderfully descriptive sentences! :-)
Sun Microsystems, Inc. (NASDAQ: JAVA) today announced the availability of its first Sun Fire and Sun Blade systems powered by Quad-Core AMD Opteron processors, bringing new capabilities, increased performance and expanded scalability to customers that purchase or upgrade to these quad-core systems. The Sun Fire X4140, Sun Fire X4240 and Sun Fire X4440 servers, the newest systems to join Sun's extensive x64 (x86, 64-bit) server line, give customers industry-leading energy efficiency, density and scalability powered by Quad-Core AMD Opteron processors and a choice of operating systems, including the Solaris 10 Operating System (OS), OpenSolaris operating system, Linux, Windows and VMware.
And as a footnote, Sun Studio 12 (with patches) is fully optimized for it (use the -xtarget=barcelona switch in addition to the usual switches to get better instruction selection, esp. for FP-style code). So far, feedback on this mode of code generation has been very positive. I had described these changes in a much earlier blog (as blog timelines go!) here, with over 30% improvement on SPECfp and smaller changes on SPECint programs. The volume deployment of systems using Barcelona has been ...er, long awaited. It's great to see AMD back in the game; I'm sure this will begin to take the quad-core performance battle with Intel to the next level.
Posted by tatkar ( May 13 2008, 10:28:26 AM PDT ) Permalink Comments [0]


Sunday May 11, 2008

Try NetBeans 6.1
NetBeans 6.1 has been out for about 2 weeks now, though I have been remiss in not mentioning it here.
NetBeans 6.1 is a minor-version update, but contains some significant changes, esp. for C/C++ users. Among them:

In addition to these, some of these other general improvements may also be of interest:
Give it a try (Free downloads here).
A Sun Studio IDE release based on NetBeans 6.1 will be part of the next Sun Studio Express.


Posted by tatkar ( May 11 2008, 08:39:26 PM PDT ) Permalink Comments [0]

Wednesday May 07, 2008

JVM now compiles with Sun Studio on Linux
Yep, you heard it right.
The OpenJDK team has pulled off yet another fantastic feat here: http://hg.openjdk.java.net/jdk7/jdk7/hotspot/rev/485d403e94e1.
Serguei Spitsyn recently integrated this changeset.

So, should we get excited about compiling OpenJDK/HotSpot? Doesn't Sun Studio on Linux already compile a bunch of industrial-scale applications?
Indeed we should.  Some reasons why:


IMO, the next logical step is to get OpenJDK projects within NetBeans (Sun Studio) IDE so OpenJDK developers can use a real IDE for development.

As Kelly O'Hair mentions in his blog, it opens up new doors for both the Sun Studio and the OpenJDK teams.



Posted by tatkar ( May 07 2008, 03:56:51 PM PDT ) Permalink Comments [0]

Wednesday October 10, 2007

Sun Releases Niagara 2 (UltraSPARC T2) with throughput record performance

Sun has introduced server and blade systems based on the much-awaited Niagara2 chip (aka N2, also officially called the UltraSPARC T2).
There were three new models announced:   Sun SPARC Enterprise T5120 and Sun SPARC Enterprise T5220 servers and the Sun Blade T6320 server module.
These contain up to 8 cores (available in 4-core and 6-core versions as well) and 8 threads/core, for up to 64 threads per chip (wasn't that a full-fledged datacenter machine, not a long time back?). N2 makes up for some deficiencies in the original Niagara design with the addition of one FP unit and one Crypto unit per core.
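The thread arithmetic is easy to check. The snippet below is only a worked example of the multiplication (the helper name `hardware_threads` is made up for illustration); the core and thread counts are the ones quoted in this post.

```python
def hardware_threads(chips: int, cores_per_chip: int, threads_per_core: int) -> int:
    """Total hardware threads = chips x cores per chip x threads per core."""
    return chips * cores_per_chip * threads_per_core


# UltraSPARC T2 variants mentioned above: 4, 6 or 8 cores, 8 threads per core.
t2_threads = {cores: hardware_threads(1, cores, 8) for cores in (4, 6, 8)}

if __name__ == "__main__":
    for cores, threads in t2_threads.items():
        print(f"{cores} cores x 8 threads/core = {threads} threads per chip")
```

By the same multiplication, four 8-core chips with 8 threads per core come to 256 threads, which matches the 4-chip, 32-core, 256-thread T5440 configuration in the October 2008 post.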
Best of all, (IMO), the performance numbers on SPECint_rate2006 and SPECfp_rate2006  leave the competition in the dust!
Check out this performance reference from Sun's official announcement. It clearly makes this chip the king of computing capacity!
All of this is of course enabled by Sun Studio 12 compilers, which played a key role in making the World Records a reality.
Cool, isn't it? I think so...

Posted by tatkar ( Oct 10 2007, 11:20:09 AM PDT ) Permalink Comments [0]

