Thursday Apr 03, 2008

Back in late February, I was pleased to receive an invitation to present Sun Grid Engine (SGE) at the Sun Tech Days in Saint Petersburg (note to Florida fans: I'm referring to the "original" city in Russia).

Despite some difficulties getting the trip organized (mostly the visa process, more about that in an upcoming personal blog...), I'm really glad I've come. Not only is it a great chance to meet Sun technology experts and a very enthusiastic group of users, it's also my first opportunity to visit Russia, and seeing the "Venice of the North" was enticing.

Preparing for the talk was much more of a struggle than I anticipated. First, I didn't have much time since I've been trying to put out some fires at work and also take some required training. Next, my initial laptop upgrades using Live Upgrade went perfectly (from build 70b to 79a, and then to 85), but the boot of snv_85 showed some graphical desktop problems. Then a configuration mistake and the overwrite of my good build 79a led me to finally re-install the whole system using snv_82. Finally, preparing the demos for SGE required installing the product and setting up zones, both of which were "first time" tasks for me, and I made some beginner mistakes. :(

Luckily, I did have a demo provided by Ravi Nallan from my grid team, and William Roche from the kernel team (both from my Solaris RPE team) very patiently gave me good advice on the zones setup and helped diagnose and fix all my mistakes over a 3 day period. We had actually gotten the demo nearly working about 1 hour before the talk was scheduled to start, but we still needed to fix a Java class manifest problem, which we did with the help of Scott Ritter and Roman Strobl. So it was very satisfying to go deliver the talk and not experience the dreaded "demo effect" at all, where the demo mysteriously and unexpectedly fails. The only things failing during the talk were the microphones: I used 3 different headset mikes before we finally gave up and used a big old hand-held job (which made typing the demo a challenge).

Some of the questions and answers during the talk were:

  1. Q: Is it possible to run Java programs on different execution platforms despite differences like path names?
    A: Yes, this is possible, and generally the submitted job would refer to the same Java program available from shared storage. If needed, a wrapper script could be used to handle platform differences (slash vs. backslash characters) and then call the Java program.
  2. Q: I'm using OpenMPI on Java. Will future Java implementations be easier to use?
    A: I actually didn't fully understand the question during the talk, but apparently the point is that JINI does offer OpenMPI support, but it would be easier for platform portability if that were part of the base JDK. Later notes: there is a project, JGrid, which aims at a grid deployment that facilitates an MPI style of programming for Java. One can make OpenMPI calls with JNI by calling the native (C/Fortran) MPI calls. But otherwise, MPI was/is never different with SGE compared to a typical setup in an MPI cluster. It is just that the nodes are found dynamically rather than statically, and most of the environment setup is handled by SGE.
  3. Q: If a very large job is submitted, will SGE take into account addition of more nodes?
    A: Once all the processes are launched, they won't be re-assigned to new nodes. However, if some jobs are still spooled but not scheduled, then those could run on new nodes (once they have been installed with the SGE exec daemon and configured by the grid administrator).
  4. Q: Is this product actually used for any serious commercial work?
    A: Well, yes, we do have customers who pay for licenses and support contracts, and they log cases when they have problems (those sometimes lead to escalations for code fixes). And yes, it's for serious work and they are always eager for a rapid solution or fix. Just one of many serious commercial applications is animated films: see the DNA productions and Open RSP web sites for some examples.
  5. Q: SGE and other DRMs have accounting/billing information. Do any companies use grids to sell resource time to customers?
    A: Actually, Sun did have a service marketed with the slogan "$1 per CPU hour" which used SGE to manage the grid, but I'm not sure about its current status. Yes, other companies have created similar offerings. SGE does offer an ARCo module for accounting/billing, something many other DRMs do not have.
  6. Q: What is the largest size of grid used?
    A: A very good question which pertains to current deployments, the largest of which is at the TACC supercomputing center at the University of Texas at Austin. SGE currently scales to 60000 nodes (CPUs), and engineering improvements have been made in the upcoming version 6.2 to reduce key bottlenecks and allow support of 90000 nodes. At the TACC, I believe they use 4 CPU blades, which still means an impressive 15000 to 22500 hosts!
  7. Q: What are the characteristics of the licence?
    A: I didn't know about the SISSL licence, but I have found this description.
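
To make the answer to question 1 a bit more concrete, here is a minimal sketch of what such a wrapper could look like, written in Python. It is only an illustration under my own assumptions, not the demo used in the talk: the shared-storage roots (`S:\grid` and `/shared/grid`), the jar name and the helper names `platform_family` and `build_java_command` are all hypothetical. The idea is simply that the job refers to one relative path on shared storage, and the wrapper rebuilds it with the separator the execution host expects.

```python
# Hypothetical shared-storage roots; real values depend on how each
# execution host mounts the shared file system.
SHARED_ROOTS = {
    "windows": "S:\\grid",
    "unix": "/shared/grid",
}


def platform_family(platform_name):
    """Map a sys.platform-style name (e.g. 'win32', 'sunos5', 'linux') to a family."""
    return "windows" if platform_name.startswith("win") else "unix"


def build_java_command(platform_name, jar_relpath, job_args):
    """Return the argument list that launches the shared jar on the given platform."""
    family = platform_family(platform_name)
    separator = "\\" if family == "windows" else "/"
    # Accept either separator in the relative path, then rejoin with the native one.
    parts = [p for p in jar_relpath.replace("\\", "/").split("/") if p]
    jar_path = separator.join([SHARED_ROOTS[family]] + parts)
    return ["java", "-jar", jar_path] + list(job_args)
```

On an execution host, the wrapper would pass `sys.platform` as the first argument and hand the resulting list to something like `subprocess.call`, so the job submitted to the grid is the wrapper rather than the Java program itself.
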
Thanks to the audience for their participation and eager questioning: if I've missed any important questions and answers, let me know so I can update this post. My presentation will be posted with others later, but in the meantime you can access it here.

On March 10-12, I attended a 3 day PRINCE2 foundation training class held at my company's site in Grenoble. This method originated from the UK government's purchasing office, today known as the OGC (Office of Government Commerce). It is most widely known in Britain, closely followed by Holland, Belgium, and now Germany and France. It is less known in the US, where the PMI method has been introduced and promoted for a longer time.

It was very interesting to learn about this project management method, despite some reservations I have regarding how heavyweight it can be. The method has 8 different processes, each with several subprocesses, and many dozens of different types of documents that must be produced during the various project stages (what is commonly called a project phase elsewhere).

One point that seemed particularly useful was the formal requirement to have a written business case and to review it during each stage: if the business case can no longer be attained, then the project should naturally be stopped. This is useful not only to prevent maintaining a project indefinitely (a zombie project that refuses to die!) but also to prevent "scope creep", the common tendency to add project goals at each stage, thus preventing a clean finish of the project. This method maintains that it's better and natural for projects to finish, even if during the last stage future work is identified which could be the goal of a future project.

In addition to the official UK government link above, more information is available from this commercial web site. I'll also be maintaining an internal web page about PRINCE2, including the training classes, template documents and references to use cases.

Thursday Mar 20, 2008

During the January Sun Developer Days event (internal link only) here in GEC, I attended an interesting presentation:

Introducing PCA, by Laurent Blume, co-founder of the French Solaris users' group GUSES.

PCA is a Perl-based patching tool that is compact (3000+ lines) yet powerful, configurable and easy to use.
It is highly appreciated by administrators who regularly patch dozens or more machines, and in many cases it will work better than some of the more modern patch tools Sun provides.


I encourage users to give it a try: more information is available from the PCA web site.

 

Thursday Jan 10, 2008

A four day Developer Days event will be held in Grenoble, France, Feb. 15-18th.

The event is described here.

The subjects covered are: Sun Secure Global Desktop, OpenSolaris (Build 78 overview, ZFS, graphical desktop), GlassFish, NetBeans, OpenDS, OpenESB, xVM, OpenJDK, Identity Management, Federation Management, Java CAPS ...

We had planned a DTrace talk by Frank Hofmann, but due to a cancellation, I will instead be presenting OpenSolaris at Build 78, the same talk I gave in Frankfurt in December (see my previous blog entry). My UK colleague George Shepherd will be presenting on ZFS just after me.

Registration is on a per-day basis. The first two days are reserved for Sun employees and partners.

To register, just send an email to gec-event@sun.com and specify which day(s) and which tracks you plan to attend.

Tuesday Dec 04, 2007

I arrived in Frankfurt yesterday morning, Monday Dec. 3rd, to participate in Sun's Tech Days event.

I presented "What is Solaris Nevada: Nevada at Build 78": this will later be published on this OpenSolaris Tech Days page, but in the meantime I'm making it available here.

I really enjoyed giving this presentation to such a large and enthusiastic audience of around 100 people.
There were numerous questions, some of which I answered immediately and others after the talk.
I gave out my business card to several people and I'm expecting further clarifications of questions by email.

I'm going to summarize the questions and answers here now, even though some answers will only be provided later as I investigate further.

  1. Q: Regarding audio drivers, will Sun replace the OSS (Open Sound System) framework, perhaps with framework X (didn't catch the name!)?
    A: It seems the answer is no. Interestingly, I received a forwarded email report during the evening about some of the driver work at our Beijing, China site. They intend to re-energize the OSS framework with 4Front. I hope we'll see more news announced soon.
  2. Q: During installation, are there any plans to offer installation target profiles to make the default packages installed better suited to the target machine? For example, a server doesn't need certain Java packages. By the way, it's a nightmare to have 1000+ packages installed: it creates extra work to follow all the required or potentially useful patches. Improvements are requested by a large news service company (name deliberately not published).
    A: There are already some profile-type questions available for network-based "Jumpstart" installations, where it's possible to choose from among several package clusters, anywhere from minimal ones to the full package set including OEM packages. On the new installer, I'm not aware of specific plans but I'll give more news later.
  3. Q: Is the CIFS code recently released into build 77 all written by Sun or is it a port?
    A: A participant in the audience actually answered that this code came from a NAS company Sun acquired a few years ago.
  4. Q: Will Sun be using the smartcard framework to allow authentication?
    A: I didn't know about this, but I checked with Scott Rotondo who is a senior member of Sun's security team and who was presenting yesterday and today. Scott wasn't aware of any recent work using the smartcard framework. He can remember there may still be a PAM module that provides some features in this area. More details to come...
  5. Q: If a vendor of a device (PDA, mobile phone, or other) were to use some Solaris open-source code and then patent his device, how would that work?
    A: I started with a caveat, saying I'm not a legal expert. Patents and open-source are not incompatible, so Sun is actually able to patent some software technology and still open-source it. The vendor couldn't actually patent any "new inventions" that are just uses of Sun's patented technology without infringement. Secondly, on the open-source side, the vendor couldn't take Solaris open-source code and then modify it and keep it proprietary and closed-source without breaking the license (currently CDDL).
  6. Q: Will Sun sell its own VPN solution? By the way, there is an open-source package called VPNsec (name/site to be checked!).
    A: Internally, we use Cisco VPN software (works on x86 machines for Windows, Linux, Solaris) as well as an internally developed software package called "punchin" which runs on Solaris and uses IPSec tunnels. I personally prefer the latter software. I don't think there are plans to make it a commercial product... but if there is enough demand, why not?

I'm looking forward to more great talks this afternoon, as well as the pNFS (parallel NFS) and Sun Grid Engine talks tomorrow by my colleague Guenter Herbert.

Thursday Oct 11, 2007

It's been an interesting week here in Italy participating at Sun's Tech Days, Sept. 24-25 in Rome, and Sept. 26-28 in Milan.
If you read Italian, you can check out this page.

I enjoyed this opportunity to evangelize about the latest Solaris features, meet fellow enthusiasts of Sun's products (inside and outside of Sun), and learn more about some of Sun's products with which I'm less familiar.

I personally presented "What is Solaris Nevada: Nevada at Build 72" and my presentation is downloadable from the OpenSolaris tech days events page.

I had no time for tourism, but I fully enjoyed going out to some Italian restaurants in the evening. Resisting the urge to overeat was tough: those antipasti are so varied and delicious!

In Milan, there was also a fashion show going on. No, I didn't spot any models, but I was able to admire the snappy dress of the Milanese.

This blog copyright 2009 by Ken Tomlinson