Thursday Nov 05, 2009

configure.ac files (configure.in in the old days) are used as input to autoconf, which miraculously (it's all miraculous to me) expands the macros in the configure.ac file and creates a runnable configure shell script. Autoconf macros may be system dependent, so if you decide to regenerate a configure script, it's the job of autoconf, in conjunction with aclocal, to locate all of the macro definitions that it needs to get the job done. These macro definitions may be supplied by the local installation of autoconf, by other packages via aclocal, or by the package you are building, in a file named acinclude.m4.

Due to some changes in the way Lighttpd is built, I found that for Lighttpd 1.4.23 in WebStack we had to regenerate the configure script. I did this using the autogen.sh script that ships with the lighttpd source code. autogen.sh calls libtool, aclocal, automake, autoconf and autoheader. On Solaris 10 autoconf generated an error:

configure.ac:67: error: possibly undefined macro: AC_DEFINE
      If this token and others are legitimate, please use m4_pattern_allow.
      See the Autoconf documentation.

But AC_DEFINE is a standard autoconf macro, so why can't it be found? We dug around and found mention of it having to do with pkg-config but couldn't work out why that would be. So we stripped down the lighttpd configure.ac bit by bit and found that the problem went away when we removed the lines referencing a macro named PKG_CHECK_MODULES. The file aclocal.m4 in the lighttpd build directory contains any additional macros required by autoconf to process configure.ac and is generated by aclocal. The lighttpd source distribution includes the aclocal.m4 file that was used to create the original configure script, but when you run aclocal, a new aclocal.m4 is generated from macros that have been registered with aclocal on the local system. On this system, after running aclocal, aclocal.m4 didn't contain a macro named PKG_CHECK_MODULES :o/

Further research showed that PKG_CHECK_MODULES is usually supplied in a file named pkg.m4 which is installed as part of pkg-config. It usually installs to /usr/share/aclocal which is the default location used by aclocal to locate macros. In this case though we were using a tool set specifically created by the WebStack team for building WebStack on Solaris 10. In this tool set (wstools),  aclocal lives in /export/wstools/bin and it looks for macro definitions in /export/wstools/share/aclocal. wstools doesn't include pkg-config so there's no pkg.m4. That explains why it couldn't find the PKG_CHECK_MODULES macro, but why the error about AC_DEFINE?

In configure.ac you'll find:

  if test "x$FAM_LIBS" = x; then
    PKG_CHECK_MODULES(FAM, gamin >= 0.1.0, [
      AC_DEFINE([HAVE_LIBFAM], [1], [libfam])
      AC_DEFINE([HAVE_FAM_H], [1], [fam.h])
    ])
  fi

So you can kind of see why the AC_DEFINEs passed to the PKG_CHECK_MODULES macro could be affected, but autoconf indicated that the error came from a line of code not related to PKG_CHECK_MODULES, i.e.:

AC_CHECK_MEMBER(struct tm.tm_gmtoff,[AC_DEFINE([HAVE_STRUCT_TM_GMTOFF],[1],[gmtoff in struct tm])],,[#include <time.h>])

Here AC_DEFINE is used with another standard autoconf macro (AC_CHECK_MEMBER) and if you look at the generated configure script you find:

if test "x$ac_cv_member_struct_tm_tm_gmtoff" = x""yes; then

cat >>confdefs.h <<\_ACEOF
#define HAVE_STRUCT_TM_GMTOFF 1
_ACEOF

fi

Which shows that it didn't actually fail where it said it did. It turns out that the places where it failed to process AC_DEFINE were the ones where it was used with PKG_CHECK_MODULES. The error from autoconf referenced the first use of AC_DEFINE, not the place where it failed. Best of all, rather than emit a warning about PKG_CHECK_MODULES being missing, it just adds this to the configure script:

  if test "x$FAM_LIBS" = x; then
    PKG_CHECK_MODULES(FAM, gamin >= 0.1.0,
      AC_DEFINE([HAVE_LIBFAM], [1], [libfam])
      AC_DEFINE([HAVE_FAM_H], [1], [fam.h])
    )
  fi

These are the unexpanded macro calls from configure.ac. They aren't valid shell commands, so the generated configure script will fail. In this case the best way to fix the problem is to add pkg-config to the WebStack build tool set. I've also heard that this problem can occur on systems with older versions of pkg-config; I guess that they may not define a PKG_CHECK_MODULES with the signature required by the configure.ac script.
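Since autoconf doesn't warn about the missing macro, one way to catch this early is to scan the generated configure script for macro names that survived expansion. What follows is only a minimal sketch and not part of autoconf or lighttpd: the helper name unexpanded_macros and the list of prefixes it looks for (PKG_, AC_, AM_, LT_) are assumptions made for illustration.

```ruby
# Hypothetical helper: report lines of a generated configure script that
# still look like m4 macro calls. The prefix list is an assumption and is
# not exhaustive.
MACRO_CALL = /^\s*((?:PKG|AC|AM|LT)_[A-Z0-9_]+)\s*\(/

# Returns an array of [line_number, macro_name] pairs, one for each line
# that begins with something that looks like an unexpanded macro call.
def unexpanded_macros(script_text)
  script_text.each_line.with_index(1).filter_map do |line, number|
    match = MACRO_CALL.match(line)
    [number, match[1]] if match
  end
end
```

You could call it as unexpanded_macros(File.read("configure")) straight after running autogen.sh and treat a non-empty result as a failed regeneration.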

Thursday Oct 29, 2009

The OpenSolaris /contrib package repository has a whole bunch of packages contributed by members of the OpenSolaris community or just by those who had a particular need for a certain package and who decided to submit it themselves.

You can install any of these packages either using the pkg command or via the package manager. You can even just go to the /contrib repository's catalog page and click on the 'install' link for the package(s) that you want to install.

To set up /contrib as a valid publisher for use with the package manager or the pkg command do the following as a user with the root role (or Software Installation profile):

pfexec pkg set-publisher -O http://pkg.opensolaris.org/contrib  contrib

Or you can add /contrib via the package manager through the File -> Manage Repositories pull down. If installing from the "install" link at http://pkg.opensolaris.org/contrib/en/catalog.shtml  the package manager will automatically add /contrib to the list of publishers.

To install packages using pkg, do the following:

pfexec pkg refresh
pfexec pkg install <package name>

You don't have to run 'pkg refresh' before installing every package, but it is probably best to do it reasonably regularly just to make sure that pkg has an up to date view of the repositories that it knows about.

There is also a /pending repository which is used to stage packages for testing while they are being made ready for promotion to /contrib. I wouldn't recommend arbitrarily using packages from /pending, but you may want to provide feedback on a package or, as in the case of the ruby-mysql package, it may be that the owner has suggested that you get it from /pending due to some issue getting it into /contrib.

To set up /pending as a publisher, you do much the same as you do for /contrib:

pfexec pkg set-publisher -O http://jucr.opensolaris.org/pending  pending

Or use the package manager File -> Manage Repositories menu.

To list the repositories that pkg knows about, run:

% pkg publisher 
PUBLISHER                             TYPE     STATUS   URI
opensolaris.org          (preferred)  origin   online   http://pkg.opensolaris.org/
contrib                               origin   online   http://pkg.opensolaris.org/contrib/
pending                               origin   online   http://jucr.opensolaris.org/pending/
mypkgs                                origin   online   http://localhost:80/
webstack                              origin   online   http://pkg.opensolaris.org/webstack/

If you have both /contrib and /pending added as publishers, when you install packages you'll need to qualify the package name with the name of the publisher from which you want to install it as follows:

pfexec pkg install pkg://pending/ruby-mysql

This will install the 'ruby-mysql' package from the publisher named pending. Note that this is the publisher name, not the URI of the repository.

The list of repositories above includes a couple of other repositories that we've not talked about. The /webstack repository has packages that the WebStack team feel might be useful, but for which they are unable to offer support. This has packages for Web Tier applications such as Varnish and Nginx. We also sometimes publish packages that are in the process of being integrated into OpenSolaris but are not yet available in the main repositories. The mypkgs repository is a repository local to our test system. It is very easy to set up a build environment for building your own packages from Spec files and then to publish them to a local repository. If you're going to do that though, you might as well publish them to /contrib via SourceJuicer.

Here are some other links that you might find useful:

Spec Files Extra is a project centered around the pkgbuild tool that builds Solaris SVR4 and/or IPS packages from spec files (and which is used by SourceJuicer)

The Genunix site has a page describing how to set up a SourceJuicer-like environment

If you'd rather not get involved in submitting spec files to SourceJuicer but would like to see a package in OpenSolaris, drop us an email at sw-porters-discuss@opensolaris.org.

One of the big pain points when installing native Ruby gems is the need to have various build packages installed: packages that deliver the likes of gcc, gmake and ginstall. You also need to know where the libraries and C header files that you want to build against are located. On OpenSolaris the last part should be a no-brainer, at least with packages installed from the repository, but some packages such as MySQL don't install to /usr/lib and /usr/include. The mysql_config that the MySQL package ships is not on the default $PATH, and even if it were, it emits compiler and linker information for Sun Studio, not for gcc.

What this means is that you have to install all of the tools above and then install the MySQL gem with options telling the build where to find the MySQL libs and the MySQL headers.

Making the MySQL gem available as an OpenSolaris package means you don't have to worry about any of that. You just run:

pfexec pkg install ruby-mysql

and voila!, you have MySQL support in Ruby... But it doesn't work :o(

The ruby-mysql package was promoted to the /contrib repository this week. Unfortunately it was promoted before I had tested it fully (which shows we have some major holes in the processes used to get packages into /contrib). The version that's there currently is unusable, as it causes a segmentation violation when running with Rails. If you want to use this package today then you can get it from the /pending repository. Details of how to make use of OpenSolaris repositories can be found in a separate blog entry here.

Wednesday Oct 28, 2009

Several interesting packages were published to the OpenSolaris /contrib repository yesterday. They include Octave, SilverStripe and the 3 packages that make up Nagios. 

Nagios is a leading Open Source infrastructure monitoring tool that can monitor networks, hosts and even services. In fact it can monitor pretty much anything, and because it is Open Source it's fairly straightforward to add your own plugins. The main Nagios package is simply named nagios and currently delivers version 3.0.6. In addition there is the nagios-plugins package, which you'll install on systems that you want to monitor. A third package is nrpe (Nagios Remote Plugin Executor), which allows a centralised deployment of Nagios to execute and monitor plugins on other systems.

Octave is a GNU project: a high-level language and runtime whose main use is numerical computation. It's mainly command line driven, but can hook up with the likes of gnuplot in order to present graphs and other visual forms of mathematical data. I've tested it and it looks like it could be really useful for someone, say, studying for a maths degree... like me for instance :o)

SilverStripe is a popular Open Source Content Management System and one of a number of such applications that OpenSolaris users have been clamouring for. Interestingly, we did have some problems getting it to play nicely with MySQL 5.1 on OpenSolaris. When you are setting up the MySQL database prior to configuring SilverStripe, modify the MySQL SMF service as follows:

svccfg -s mysql:version_51 setprop mysql/enable_64bit=true
svcadm refresh mysql:version_51
svcadm restart mysql:version_51

You'll either need to do this as root or as a user with the root role. These changes cause MySQL 5.1 to run in 64-bit mode.

You can install any of these packages either using the pkg command or via the package manager. You can even just go to the /contrib repository's catalog page and click on the 'install' link for the package(s) that you want to install.

To set up /contrib as a valid publisher for use with the package manager or the pkg command do the following as a user with the root role (or Software Installation profile):

pfexec pkg set-publisher -O http://pkg.opensolaris.org/contrib  contrib

Or you can add /contrib via the package manager through the File -> Manage Repositories pull down. If installing from the "install" link at http://pkg.opensolaris.org/contrib/en/catalog.shtml  the package manager will automatically add /contrib to the list of publishers.

To install packages using pkg, do the following:

pfexec pkg refresh
pfexec pkg install <package name>

These packages are great additions to the growing list of packages available via the /contrib repository. If you would like to contribute a package to /contrib visit the SourceJuicer page on OpenSolaris.org. If you'd rather not get involved in submitting spec files to SourceJuicer but would like to see a package in OpenSolaris, drop us an email at sw-porters-discuss@opensolaris.org.

Wednesday Oct 14, 2009

I can convince myself of anything. When we updated from RubyGems 0.9.4 to 1.3.5 I decided that it was ok for the 'rake' command to be in the $GEM_HOME/bin which then translated to /var/ruby/1.8/gem_home/bin. I justified it because users could run 'gem env', look at the "EXECUTABLE DIRECTORY" and see that they had to add /var/ruby/1.8/gem_home/bin to their $PATH env var. Having had to install and then run rake several times on different systems over the last few days I can see that this is wrong. I want to 'gem install rake' and then run 'rake <some rake task>' without the need to update my PATH. I'm certain that others will agree that this is how it should be.

I could easily change the way that we build RubyGems such that the "EXECUTABLE DIRECTORY" defaults to /usr/bin but I have no way of being certain that when the end user runs 'gem install <some gem with an executable>' that /usr/bin will actually be writable even by root (or the root role as used by pfexec). We have the Rails package in the /contrib repository that also delivers the Rake gem. This package installs the 'rake' executable to /usr/bin, but it's a bit much to ask you to install Rails just to get Rake. I seem to have the following options:

  1. With the SUNWruby18 package on OpenSolaris, provide wrapper scripts in /usr/bin for the popular gems that deliver executables. These would emit a message saying to install the required gem should it not already be installed, and run as normal if it is. This has a few corner cases that have to be considered.
  2. Provide a /contrib package that provides wrapper scripts for popular gems which would work as described in (1)
  3. Change "EXECUTABLE DIRECTORY" to /usr/bin and modify RubyGems to detect the writability of /usr/bin. It does this by default but would end up installing any affected gems in root's ~/.gem directory.

I favour option 3, but only if it can be done without significant changes to RubyGems.
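To make the wrapper idea in options 1 and 2 a little more concrete, here is a rough sketch of the decision such a wrapper would have to make. It is an illustration only, not what SUNWruby18 ships: the helper name wrapper_plan and the wording of the hint are assumptions, and it ignores the corner cases mentioned above (multiple installed versions, gems in a user's ~/.gem, and so on).

```ruby
require "rubygems"

# Hypothetical helper for a /usr/bin wrapper script: work out whether the
# gem that owns an executable is installed. Returns [:run, path] when it
# is, or [:hint, message] when the user still needs to install it.
def wrapper_plan(gem_name, exe_name)
  spec = Gem::Specification.find_by_name(gem_name)
  [:run, File.join(spec.bin_dir, exe_name)]
rescue Gem::LoadError
  [:hint, "#{exe_name}: the '#{gem_name}' gem is not installed; " \
          "try: gem install #{gem_name}"]
end
```

A real wrapper would then either print the hint and exit non-zero, or load the returned path so that 'rake <some rake task>' behaves as normal.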


Friday Sep 25, 2009

I really struggled with this yesterday. I'm writing a Rails CRUD app and when creating a new entry one of the fields presented is a ListBox that allows multiple selections. That was straightforward, but I had problems when editing an existing entry. What didn't work for me were the recognised methods for having the current items of the entry being edited selected in the ListBox.

The way that I understand that it should work is like this:

collection_select(:user, :group_ids, Group.find(:all), :id, :name, {}, {:multiple=>true, :name=>'user[group_ids][]'})

But needless to say it didn't. In the end I found this from Wes Gamble (weyus on Ruby Forum). It's a bit more DIY but it works a treat. I'm reproducing it here in case it's useful to anyone else (thanks Wes!)

<select id="user_roles" name="roles[]" multiple="multiple">
<% Role.find(:all).each do |r| %>
     <option value="<%= r.id %>" <%= @user.roles.include?(r) ? "selected=\"selected\"" : '' %>><%= r.name %></option>
<% end %>
</select>
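If you want to try the same idea outside of Rails, the sketch below renders an equivalent multi-select with plain ERB. It's a stand-alone approximation rather than Wes's code: Role here is a simple Struct standing in for the ActiveRecord model, and render_roles_select is a name made up for this example.

```ruby
require "erb"

# Stand-in for the ActiveRecord model; only id and name are needed here.
Role = Struct.new(:id, :name)

# The same DIY select as above, with the selected attribute emitted only
# for the roles the user already has.
ROLES_TEMPLATE = <<~'HTML'
  <select id="user_roles" name="roles[]" multiple="multiple">
  <% all_roles.each do |r| %>
    <option value="<%= r.id %>"<%= ' selected="selected"' if user_roles.include?(r) %>><%= ERB::Util.html_escape(r.name) %></option>
  <% end %>
  </select>
HTML

# Hypothetical helper: render the select for every role, marking the
# user's current roles as selected.
def render_roles_select(all_roles, user_roles)
  ERB.new(ROLES_TEMPLATE).result_with_hash(all_roles: all_roles,
                                           user_roles: user_roles)
end
```

Calling render_roles_select(all_roles, current_user_roles) returns the HTML as a string, which makes it easy to check which options come out selected.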

Just following on from my last post, we've also added FreeImage to the OpenSolaris /contrib repository. This is pretty useful as FreeImage is used by the image_science ruby gem for manipulating images, particularly creating Thumbnails from uploaded images. FreeImage can be difficult to build on OpenSolaris and the 'freeimage' OpenSolaris package is fully built and includes both a static library and a shared library that can be used to build against when developing applications. You can install this package with the following commands (the first isn't needed if you've already added /contrib as a publisher) :

% pfexec pkg set-publisher -O http://pkg.opensolaris.org/contrib  contrib

% pfexec pkg refresh

% pfexec pkg install freeimage

If you then install the image_science gem (along with the RubyInline gem that is used to dynamically build the image_science native library) you won't need to provide additional flags in order to locate the FreeImage library.

FreeImage and ImageScience are used in the Apache Olio Rails applications

Our group also recently integrated WordPress into /contrib (as 'wordpress'). Installing this package will pull in all of the Apache and PHP packages required to run WordPress. It leaves the DB to you, as you might choose to run MySQL locally or on a completely different system. It makes installing WordPress and getting up and running really easy. We are also working on adding Drupal, Movable Type, Joomla!, SilverStripe, Cacti, Ganglia, Nagios, the GNU Linear Programming Kit and Octave.


A while back an OpenSolaris repository called /contrib was introduced to basically allow anyone with an account on opensolaris.org to submit Open Source applications/libraries to an OpenSolaris repo. The details of how to go about doing this are best saved for another blog entry (add to ToDo list). This new repository gave us a real opportunity to package up Ruby on Rails and make it available to OpenSolaris users in the package format that they are familiar with. We'd avoided the main /release and /dev repositories because Rails rev'd too frequently for us to be able to keep up with the changes given that the processes for getting into those repositories are fairly lengthy. Besides there was always the 'gem install' command.

So we've now added Ruby on Rails 2.3.3 and its dependent gems to the /contrib repository. Since then Rails 2.3.4 has been released; more on that later.

To install the rails package on a system with OpenSolaris 2009.06 or later you need to run the following commands:

% pfexec pkg set-publisher -O http://pkg.opensolaris.org/contrib  contrib

% pfexec pkg refresh

% pfexec pkg install ruby-rails

% pfexec gem install rack

The first command gives you access to the /contrib repository and all of the packages that it contains, and is worth doing anyway as there are lots of great packages there besides Ruby on Rails. It's a one-off command: once you've added /contrib it will remain in your list of package publishers. The second command isn't strictly necessary the first time, but it's useful to remember as /contrib updates on a weekly basis and your local cache of package names and versions doesn't refresh automatically.

The next command installs the 'ruby-rails' package and, if it's not already installed, the SUNWruby18 package which contains Ruby 1.8.7 and RubyGems 1.3.1 (versions correct at the time of writing). The last command installs rack, which is now required by Rails, something we didn't find out until the last moment, when we moved to Rails 2.3.3. We are looking to package rack and add it as a dependency of the ruby-rails package. The ruby-rails package installs to what is effectively the vendor gem location in /usr, and the rails and rake executables install to /usr/bin (as symbolic links). You can still use the gem command to install Ruby on Rails versions later than 2.3.3 if you need to, and the /usr/bin/rails command will pick up the later Rails version.

At the moment the benefits of using 'pkg' to install Rails over using 'gem' are mainly just the convenience of having dependencies pulled in automatically and having the 'rails' command on the PATH. In the future, we'll add native gems that are used to provide infrastructure to Ruby on Rails, gems that usually require a compiler to build, along with knowledge of the location of any dependent libraries. Ultimately we'll be able to provide a single package that installs a complete, optimised, Ruby on Rails infrastructure.


Thursday Jul 09, 2009

While updating the Apache Olio app to do caching we found that on the new test rig, adding events and people (it's a Social networking app) would fail. We had moved to the new rig because our time on the lab systems had run out, but fortunately we diligently left an overlap and were able to go back and look at the Rails stack to see what had changed. The problem that we were seeing gave us a clue as all of the parameters in the  HTTP POST request that contained the person or the event details looked like this:

Parameters: {"address"=>{"city"=>#<File:/tmp/RackMultipart4952-35>, "zip"=>#<File:/tmp/RackMultipart4952-37>, "country"=>#<File:/tmp/RackMultipart4952-38>, "street1"=>#<File:/tmp/RackMultipart4952-33>, "street2"=>#<File:/tmp/RackMultipart4952-34>, "state"=>#<File:/tmp/RackMultipart4952-36>}, "commit"=>#<File:/tmp/RackMultipart4952-20>, "event_image"=>#<File:/tmp/RackMultipart4952-30>, "event_document"=>#<File:/tmp/RackMultipart4952-31>, "authenticity_token"=>#<File:/tmp/RackMultipart4952-39>, "event"=>{"title"=>#<File:/tmp/RackMultipart4952-21>, "event_timestamp(1i)"=>#<File:/tmp/RackMultipart4952-25>, "event_timestamp(2i)"=>#<File:/tmp/RackMultipart4952-26>, "event_timestamp(3i)"=>#<File:/tmp/RackMultipart4952-27>, "telephone"=>#<File:/tmp/RackMultipart4952-24>, "description"=>#<File:/tmp/RackMultipart4952-23>, "summary"=>#<File:/tmp/RackMultipart4952-22>, "event_timestamp(4i)"=>#<File:/tmp/RackMultipart4952-28>, "event_timestamp(5i)"=>#<File:/tmp/RackMultipart4952-29>}, "tag_list"=>#<File:/tmp/RackMultipart4952-32>}

instead of like this:

Parameters: {"address"=>{"city"=>"aaynaiuotrtgs", "zip"=>"81602", "country"=>"USA", "street1"=>"49857 Pk Ln", "street2"=>"", "state"=>"BC"}, "commit"=>"Create", "event_image"=>#<File:/tmp/RackMultipart4833-2>, "event_document"=>#<File:/tmp/RackMultipart4833-3>, "authenticity_token"=>"VdNDLR/dCCJe96Ua3zEC9ZOwPg2DxujQ5D6pxI9E0ws=", "event"=>{"title"=>"aa rygrtokldq t ", "event_timestamp(1i)"=>"2008", "event_timestamp(2i)"=>"10", "event_timestamp(3i)"=>"20", "telephone"=>"0014879641640", "description"=>"kw sjnieb vui fslzpn jokjw xjijsm jzeweyio dthti vckudre osoempc jurldvyi adusy twghtlzwluh cowiczskxg wql ctulke km yxtuost enixrl qv to ltszeriord lpxrlp cokjtrehwc mbrnchxh fdnxwie x nuuzpvvv pqlwqghg thwtgc svuzbnzdokgv iqwsrvokviuw l z gnr trkmc aspwbgckozcg so jq dcjxl vluosk dypk rkhg iseurrximrvk qnepyyzxu iugxbgmvcui mahnpibcoa wbhvplqym ogompcsikpz engr ugipr uvj w duk dqefcurj zoztkh ", "summary"=>"x c ztsg ncccoca e dspe azhzwvcz blfdtdllh zpbothd gctqotpln eunpoudzboef fcbzcstxh ", "event_timestamp(4i)"=>"20", "event_timestamp(5i)"=>"10"}, "tag_list"=>"tag1"}

So all of the parameters in the request were being treated as file uploads and it very much looked like Rack might be the cause. We had been using Thin as our Rails runtime and a check on the two systems showed that the old system had Thin 1.0.0 and Rack 0.9.1 and the new system had Thin 1.2.2 and Rack 1.0.0. Going back to the older versions fixed the issue.

Rack processes multipart form data with a couple of passes through the class method parse_multipart(env) (rack/utils.rb) where env is a wrapper around the request (as StringIO). The first pass processes the StringIO and extracts the form data and its parts. To determine if a part is a file upload it used to run the following check:

filename = head[/Content-Disposition:.* filename="?([^\";]*)"?/ni, 1]
if filename
  body = Tempfile.new("RackMultipart")
  body.binmode  if body.respond_to?(:binmode)
end

Which basically locates lines like the following in the form data:

    Encapsulated multipart part:  (image/jpeg)
        Content-Disposition: form-data; name="event_image"; filename="event.jpg"\r\n

In Rack 1.0.0 the conditional changed to:

if content_type || filename
  body = Tempfile.new("RackMultipart")
  body.binmode  if body.respond_to?(:binmode)
end

The check for content_type had always been there and I won't list the code, but needless to say the conditional was now: if either content_type or filename (or both) are set then treat this part as a file upload.
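The effect of that one-line change is easy to reproduce in isolation. The sketch below is not the Rack source: the filename regex is the one quoted above (minus the binary 'n' flag, dropped to keep things simple), while the Content-Type regex and the helper name treated_as_upload? are assumptions made for this illustration.

```ruby
# Filename regex as quoted from rack/utils.rb, without the "n" flag.
FILENAME_RE = /Content-Disposition:.* filename="?([^\";]*)"?/i
# Assumed shape of the content type check; the post does not list it.
CONTENT_TYPE_RE = /Content-Type: (.*)\r\n/i

# Hypothetical helper: would the headers of this multipart part cause it
# to be treated as a file upload? rack_1_0 selects which conditional to
# apply: filename only (old) or content_type || filename (Rack 1.0.0).
def treated_as_upload?(head, rack_1_0: true)
  filename     = head[FILENAME_RE, 1]
  content_type = head[CONTENT_TYPE_RE, 1]
  rack_1_0 ? !!(content_type || filename) : !!filename
end
```

A plain text part that carries a Content-Type header but no filename is left alone by the old check and becomes a Tempfile under the new one, which matches the parameters dump at the top of this entry.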

In the Apache Olio Rails Driver (the code that drives load to the App) we have to assemble POST requests by hand, and the code is all based on Apache HttpClient 2 (we use 3 now, but with the same code, which relies on methods that are deprecated in 3). What we had been doing to add text params to the POST request up to this point was:

MultipartPostMethod post = new MultipartPostMethod(addEventResultURL);
post.addParameter("event[title]", <randomly generated String data>);

This had the unfortunate effect of adding a content_type to the param, which looked like this in the request:

     Encapsulated multipart part:  (text/plain)
        Content-Disposition: form-data; name="event[title]"\r\n
        Content-Type: text/plain; charset=US-ASCII\r\n
        Content-Transfer-Encoding: 8bit\r\n\r\n
        Line-based text data: text/plain

We modified the code to use a StringPart and addPart() instead of addParameter():

StringPart tmpPart = new StringPart("event[title]", <randomly generated String data>);

post.addPart(tmpPart);

and we also had to explicitly set the content type to null on the new part:

tmpPart.setContentType(null);

and the form data in the request now looks like this:

Encapsulated multipart part: 
        Content-Disposition: form-data; name="event[title]"\r\n
        Content-Transfer-Encoding: 8bit\r\n\r\n
        Data (19 bytes)

Friday Jun 19, 2009

Having built Ruby 1.9 packages for Solaris Nevada we had to try installing the commonly used gems, particularly those for the Rails stack and specifically the ones we use with the Apache Olio Rails app. One of these was eventmachine which is used by Thin and which is a great piece of Software. It installed fine on our Solaris Nevada b116 x64 box but failed to install on our Nevada b116 SPARC box. The error was pretty bizarre too:

/usr/ccs/bin/as: error: no input filename given
usage: /usr/ccs/bin/as [-V] [-Q{y,n}] [-q] [-s]
        [-S] [-K {pic,PIC}] [-o objfile] [-L] [-T]
        [-P [[-Yc,path] [-Ipath] [-Dname] [-Dname=def] [-Uname]]...]
        [-m [-Ym,path]] [-n] [-ul] [-xF]
        [-m32] [-m64]
        [-xarch={v7,v8,v8a,v8plus,v8plusa,v8plusb,v9,v9a,v9b,sparc,sparcvis, sparcvis2,sparcfmaf,sparcima}]
        [-xcode={pic13,pic32}] file.s...
*** Error code 1
make: Fatal error: Command failed for target `binder.o'

What looked suspicious here, was the build line, which looked like this:

g++ -I. -I/usr/ruby/1.9/include/ruby-1.9.1/sparc-solaris2.11 -I/usr/ruby/1.9/include/ruby-1.9.1/ruby/backward -I/usr/ruby/1.9/include/ruby-1.9.1 -I/var/ruby/1.9/gem_home/gems/eventmachine-0.12.8/ext -DBUILD_FOR_RUBY -DHAVE_RB_THREAD_BLOCKING_REGION -DHAVE_TBR -DHAVE_WRITEV -DHAVE_WRITEV -DOS_UNIX -DOS_SOLARIS8 -DWITH_SSL -I/usr/sfw/include -fPIC -g -O3 -fPIC  -g -obinder.o -c binder.cpp

There's no space in '-obinder.o', which you'd normally expect to read '-o binder.o'. It looked odd, though it's not unusual for spaces to be omitted when passing names to options; I'd just never seen it on a build line before, so it stood out. Looking at the output with -v set as one of the flags passed to g++ gave more data but nothing useful. We looked at it on the x64 system (again with -v as a flag to g++) and it went through the same steps but didn't fail. But it did use a different assembler! gcc/g++ on x64 uses /usr/sfw/bin/gas but on SPARC uses /usr/ccs/bin/as. It also became apparent by running the assembler manually that it was choking on -obinder.o. It seems that gas is more tolerant than as :o(

The Makefile that gets generated by extconf.rb when you install the gem has lines in its .SUFFIXES section of the form:

.cc.o:
        $(CXX) $(INCFLAGS) $(CPPFLAGS) $(CXXFLAGS) $(COUTFLAG)$@ -c $<

and earlier in the file COUTFLAG is defined as:

COUTFLAG = -o

Which ultimately results in -obinder.o (same for all of the other objects it builds).

Matz seems to have hit this problem before, because the Ruby 1.9.1 Makefile contains a fix for a similar problem in Ruby (although it still happens when building the ext/*.c sources):

empty =
OUTFLAG = -o $(empty)
COUTFLAG = -o $(empty)

Which results in a space being inserted after the '-o' for $(OUTFLAG) and for $(COUTFLAG)

The Makefile is of course generated by extconf.rb, which gets its instructions on how to create the Makefile from mkmf.rb, which Ruby provides. So the fix would be to make the changes outlined above in lib/ruby/1.9.1/mkmf.rb or to have it create the SUFFIXES lines with a space between the COUTFLAG variable and the output file name. We can do that for any packages that we build, but we also need to raise a bug with Ruby.
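Until mkmf.rb is fixed, the same change can also be applied to an already generated Makefile. The snippet below is a sketch of that workaround and not something Ruby or eventmachine provides: the helper name add_space_after_coutflag is made up, and it assumes the Makefile defines COUTFLAG on a single line exactly as shown earlier.

```ruby
# Hypothetical post-processing step for a Makefile generated by extconf.rb:
# redefine COUTFLAG using the "empty" trick so that rules expand to
# "-o binder.o" rather than "-obinder.o". Returns the patched text.
def add_space_after_coutflag(makefile_text)
  # Already patched (or generated by a fixed mkmf.rb): leave it alone.
  return makefile_text if makefile_text =~ /^COUTFLAG[ \t]*=[ \t]*-o \$\(empty\)/

  makefile_text.sub(/^COUTFLAG[ \t]*=[ \t]*-o[ \t]*$/) do
    "empty =\nCOUTFLAG = -o $(empty)"
  end
end
```

Reading the Makefile, passing its contents through this helper and writing the result back before running make would be enough to test whether the assembler error goes away.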

We've been working on building Ruby 1.9.1 (p129) packages on Solaris Nevada (the platform that OpenSolaris is mostly built from). We hit a couple of problems on the way: one was easy to fix, the other not so.

The first issue was Sun Studio borking when it found a function declared with a return type of void but which actually contained a return statement that returns a value. gcc only warns about this, which seems odd to me; maybe it's just our version of gcc though... When gcc hits this it says:

pty.c:425: warning: `return' with a value, in a function returning void

But with Sun Studio cc you get:

pty.c:425: void function cannot return a value
cc: acomp failed for pty.c

The file causing the error is ext/pty/pty.c and line 425 has the offending line in function getDevice() (declared as 'static void'):

return Qnil;

If you comment out this line it will build ok. There is a bug for this issue......

The other problem initially only occurred on SPARC. We were using one of the public build systems that we have for the SFW consolidation (the part that delivers most of the F/OSS into Solaris Nevada and OpenSolaris), which was running Solaris Nevada build 114. For x64 we were building on Solaris Nevada build 116, which was the latest version available at the time of writing. x64 built fine.

The error seen on SPARC was:

Undefined                       first referenced
symbol                             in file
rb_cFalseClass                      enc/emacs_mule.o  (symbol scope specifies local binding)
rb_cTrueClass                       enc/emacs_mule.o  (symbol scope specifies local binding)
rb_cFixnum                          enc/emacs_mule.o  (symbol scope specifies local binding)
rb_cSymbol                          enc/emacs_mule.o  (symbol scope specifies local binding)
rb_cNilClass                        enc/emacs_mule.o  (symbol scope specifies local binding)
ld: fatal: symbol referencing errors. No output written to .ext/sparc-solaris2.11/enc/emacs_mule.so  

when linking enc/emacs_mule.so. It's not a well documented error, but the implication was obvious: the required symbols had been found but they had been declared as local and so couldn't be used to build this shared object. Using nm on emacs_mule.o and on libruby.so.1 seemed to indicate that the symbols were needed by emacs_mule.o (UNDEF) and were available in libruby.so.1. We asked the compiler experts and they thought that perhaps the symbols were declared as HIDDEN. We tried elfdump on both emacs_mule.o and libruby.so.1 and guess what, when analyzing libruby.so.1, elfdump threw up loads of errors of the type:

"bad symbol entry: <address> lies outside of containing section"

This suggested that the shared library was broken in some way.

We isolated the linker lines from the build for libruby.so.1 and ran them individually (after touching ruby.c). There were two lines, the first was the linker line which actually built the shared library. That ran ok and when we ran elfdump on the resultant library there were no errors. The second line was:

/usr/sfw/bin/gobjcopy -w -L "Init_*" libruby.so.1

After running this manually we saw the same error when using elfdump on the resultant library.
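
The before-and-after check can be sketched roughly like this. It's only a sketch: it assumes elfdump's -s option (dump the symbol tables) and that the complaints contain the text "bad symbol entry" as quoted above. The helper name count_bad_symbols is made up for illustration and isn't part of any build.

```shell
#!/bin/sh
# count_bad_symbols: hypothetical helper that reads elfdump output on stdin
# and prints how many lines contain the "bad symbol entry" complaint.
count_bad_symbols() {
    # grep -c prints the count; it exits non-zero when the count is 0,
    # so "|| true" keeps the helper from reporting failure in that case.
    grep -c 'bad symbol entry' || true
}

# On the build host this would be run once after the link step and once
# after the gobjcopy step, for example:
#   elfdump -s libruby.so.1 2>&1 | count_bad_symbols
```

A count of 0 after the link step and a non-zero count after the gobjcopy step would point the finger at gobjcopy rather than at the linker.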

At the same time as this we were running a build on a SPARC system that we'd had upgraded to Nevada build 116, and that completed OK. A check on the version of gobjcopy on the two systems showed that we had gobjcopy 2.15 on the build 114 system and 2.19 on the build 116 system. Further checking showed that gobjcopy was delivered into Solaris Nevada in SUNWbinutils and that had been updated in Nevada build 116. So the problem wasn't that we were building on SPARC but that we were building on different OS revs. The problem also exists on x64.
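
Comparing the gobjcopy version across build hosts can be sketched as below. This assumes the version number is the last word on the first line of the --version output, which is how GNU binutils tools usually print it. The helper name objcopy_version is hypothetical.

```shell
#!/bin/sh
# objcopy_version: hypothetical helper that reads "--version" output on stdin
# and prints the last word of the first line, assumed to be the version number.
objcopy_version() {
    awk 'NR == 1 { print $NF }'
}

# On each build host, for example:
#   /usr/sfw/bin/gobjcopy --version | objcopy_version
```

Running it on both systems and comparing the two results is enough to spot the kind of mismatch described above.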

We haven't yet looked into what was going wrong when gobjcopy tried to make the Init_* symbols local, but it was apparently corrupting the library.

At the moment this makes it tough to build Ruby 1.9 on OpenSolaris, which is based on Nevada build 111b, and we are looking at how best to get around this. One option may be to make the packages available from the /webstack repository. In the meantime we'll file a bug against OpenSolaris and come up with a workaround.

Thursday May 14, 2009

This week saw the Apache Olio project release the code for its Java EE version, adding to the versions already available for PHP and Rails.

If you know Apache Olio and want to know more about the specifics of the Java EE version then I'll cover them first; if you want to know more about Apache Olio in general, read on.

To run the Java EE version of Olio, you'll need:

  • Java SE 5 or 6
  • Java EE 5 compliant application server (tested so far with GlassFish v2)
  • A Java Persistence API (JPA) provider (EclipseLink is the JPA provider packaged with GlassFish v2)
  • MySQL Database (any DB could be used but we have scripts and instructions for MySQL)

Some of the technologies that the Java EE version features:

  • JPA for the Object-Relational persistence solution
  • AJAX 
  • REST-based services (partially implemented)
  • jMaki widget wrappers for Yahoo and Dojo widgets

In planning are the following changes/features:

  • Re-implementation of the memcached layer for caching (this was stripped out for this release but needs to be put back)
  • REST-based services with JSR-311, JAX-RS. I've started this already using the Jersey implementation.
  • Replacement of the jMaki widgets with an appropriate alternative
  • Minor features to 'catch up' with the PHP and JRuby versions.
  • Investigation of a file distribution system, e.g. Hadoop (current implementation only uses the local filesystem)

If you want to get involved then visit our page at http://incubator.apache.org/olio/

You can contribute patches, submit bugs or RFEs or just generally tell us what components you have successfully used the app with.

What is Apache Olio?

Apache Olio is a Web 2.0 toolkit; basically it's a Web 2.0 application and a load generator. You deploy the application to a configuration that you want to test, fire up the load generator, drive load to the application and then analyze the results. The application isn't that fussy about what it runs on. For the Java EE app you need a Java EE Web container (GlassFish or Tomcat, for example). You also need a database, and a schema is provided that can be used to set that up. You need a filestore and you need a Web Server to act as a remote Web Service (for looking up geolocations).

Apache Olio uses Faban to drive load along with a custom Faban driver. Faban is a benchmark driver framework and harness that is designed to allow you to model the usage of your application and drive load for 1000s of simulated users. It also can be used to manage the runtime environment and it gathers the results from test runs.

Once you've deployed the application you can load it up with dummy users and events (it's a Social Networking app) and use the driver to simulate load. At the end of a test run, you get all of the data from the run presented to you in graphical form (depending on the platform). I spend a vast amount of time using Olio and Faban and can't recommend them enough.

Kim is the lead developer of the Java EE version of Apache Olio and he has a blog entry that goes into lots of detail on how Apache Olio Java EE works and what it looks like.



Friday Mar 27, 2009

Anyone that knows me wouldn't be surprised to learn that I identify with (and still love) Richey Edwards of the Manic Street Preachers. Richey has been missing for 14 years now but his lyrics live on, particularly in the Manics' greatest work "The Holy Bible". I'd heard rumours last year that the Manics were going to make a record similar to THB but I snorted with derision given that Richey couldn't be part of it. Well surprise, surprise, they have made the album, using lyrics that Richey wrote but which were never recorded. I've heard the opening track "Peeled Apples" and it's very, very good, very reminiscent of THB. It's kind of amazing that James Dean Bradfield (singer and lifeforce of the Manics) still has it in him to sound so very, very angry :o)

The album is called "Journal for Plague Lovers" and is released on May 18th. Track listing is:

Peeled Apples
Jackie Collins Existential Question Time
Me and Stephen Hawking
This Joke Sport Severed
Journal For Plague Lovers
She Bathed Herself In A Bath Of Bleach
Facing Page: Top Left
Marlon J.D.
Doors Closing Slowly
All Is Vanity
Pretension/Repulsion
Virginia State Epileptic Colony
William’s Last Words

I have real hope that this will be the album that they would have recorded after The Holy Bible if Richey had still been with them. Because of this I've booked tickets to see them on May 28th at the Roundhouse in London in the hope that I'll love the album and that at the gig they'll play a mix of tracks from the new one and from THB and maybe for once not play the tedious track they seem to be most famous for, "A Design for Life".

I actually saw Hawkwind at the Roundhouse believe it or not; I was very young and lived in Camden Town. So going back there will be a blast. I'm also going to see Andrew Bird at the Thekla in Bristol with my friend Michelle on the 10th May.

4Real

Wednesday Mar 25, 2009

Last week, all of the work that Chris Zhu, Prashant Srinivasan and I did on integrating Ruby 1.8.7 p72 and RubyGems 1.3.1 into OpenSolaris bore fruit as the packages became available in the /dev repository of OpenSolaris.

You can still use Ruby 1.8.6 as that's the version that was in the 2008.11 release of OpenSolaris. The new packages won't get promoted to the main repository until the next release of OpenSolaris in May/June.

In order to upgrade to Ruby 1.8.7 in OpenSolaris in the meantime you need to upgrade all of the packages so that they are at the same level. Once you have upgraded you'll no longer have Ruby 1.8.6 available to you, so make sure that's really what you want to do. Prashant wrote a brief guide on how to update Ruby in OpenSolaris a while back and you can find that here.

We'd be happy to receive any feedback on your experiences in upgrading and your thoughts on what you really would like to see for Ruby and RubyGems in OpenSolaris. As Ruby/RubyGems is part of Sun WebStack you can ask questions and post comments on the webstack-discuss@opensolaris.org alias (you need to subscribe first at http://opensolaris.org) or you can post a comment on either of our blogs.

Tuesday Mar 24, 2009

I had the good fortune to present, along with my colleague Prashant, at the Community One event in New York last week. We talked about all of the testing work that we have been doing with Ruby on Rails on OpenSolaris these last 2 months and gave some pointers on what stack components to use and how to scale both vertically and horizontally. Got to see some of the sessions, although I spent a lot of the time in the speaker room on the first day. It was interesting to hear Dave Douglas talking about Kutiman, well actually it was more weird than interesting :o) I've listened to Kutiman quite a bit over the last couple of years, but had missed the YouTube mashup thing that he's done. If you are interested check out http://thru-you.com/

In the same keynote, Dave Douglas re-purposed Sun's 25 year old vision, "The network is the computer", for the Cloud and announced the Sun Open Cloud Platform. Dave believes (and I tend to agree) that there will be many clouds: some public, some private, some internal, geographic based clouds, market based clouds and so on. The Sun Open Cloud Platform will provide the infrastructure for powering these clouds, starting with the Sun Cloud. The Sun Cloud is currently manifested as a RESTful API which you can sign up for at http://kenai.com (it comes out of an internal project called Kenai and from QLayer). The API is published under Creative Commons, which is cool. Other bits will follow later this year: the Sun Cloud Compute Service and the Sun Cloud Storage Service, about which all I know is what's in the press release. Now all we need to do is get the word out about what we are doing and then listen to the people who use it, who want to use it and who used it and didn't like it, and make it the best Cloud possible.

I didn't get a chance to see Simon Phipps' talk on The 3rd wave of Open Source live as I was talking at the time. I have watched the recording since and it's pretty interesting. I like the mention of Open Source as the Firmware for the Systems group at Sun (where I work). For example, the Sun Storage 7000 series has OpenSolaris and ZFS as its 'firmware': these software components are fundamental parts of these systems, and no additional software or software licenses are required to make them work. Indeed Open Source really is now at the core of Sun Microsystems. Something else he said, half jokingly: he used the word "victims" to describe the users of closed source, proprietary software. Not sure why that gels so well with me. "More and more procurement driven projects are ending in failure" and the alternative is an adoption-led process, where there's a loose specification of requirements, prototyping and several iterations that lead into a pilot that will either succeed or 'fail early'. And that's fine: better to fail early than to fail after spending lots of money on licenses, professional services and support contracts.

Anyway, rather than give a commentary on the talks I saw you can see video of the main track and pick up the slides for the others at: http://www.sun.com/events/communityone/index.jsp

I had the pleasure to meet Dave Pascoe and Mark Niedzielski from OurStage, who attended our talk. I think I learned more from them than they did from me :o) If you are at all interested in new music you should check them out. Anyway, they use Rails big time and they use OpenSolaris for storage. The kind of scaling that they do is something that, given the limited opportunities for access to equipment, I could only ever dream about. It was thought provoking to say the least. Mark showed me a software monitoring tool called Rails Performance Monitor from a company called New Relic. It's the kind of app that anyone involved in managing and monitoring performance longs for, and it tells you pretty much everything about how your Rails app is performing and where it's spending its time. I've been using the Lite version for the last couple of days.

The second day of Community One was reserved for Deep Dives and I decided to attend the MySQL sessions (dev in the morning, dba in the afternoon) and see if I could pick up some useful tips. We spent a little too long looking at the online docs, but apart from that they were good sessions. If nothing else, finding out that I could log queries that didn't use an index made the whole thing worthwhile.

NYC was its usual self: expensive, noisy, bustling and full of character. I had a 1.5 hour tour of Manhattan on my way from the airport on the Shuttle bus and took the E line subway in the rush hour on the return trip.

Oh and some people were talking about IBM, something about an outrageous plot to take over the world...




This blog copyright 2009 by MandyWaite