  • General Chit-Chat »

  • Recent Language Support Improvements »

  • Soon To Be Released Language Support Improvements »

  • What I'd like To See »

  • What do you want? »


General Chit-Chat


docs.sun.com is Sun's main documentation site. There has been a spate of positive postings about it recently.

It's been a long time coming, but in the past 6 weeks or so, the hosting infrastructure has changed - so it's much faster.

We also migrated to a new search.



Recent Language Support Improvements:

  1. UI Translation

    Russian and Brazilian Portuguese versions of the UI are now available.

  2. Multilingual Search
    It works, for all languages. There is an issue with PDFs that have non-ASCII metadata - but that's a problem with PDFs, not the search. More on that later.

Soon To Be Released Language Support Improvements:

  1. Rendering Of Japanese Text
    Our company-wide main stylesheet doesn't help the rendering of Japanese text. This will soon be mitigated by jp.css, which will be added to the Japanese templates. It significantly improves the rendering of Japanese text, especially on Solaris. See the before and after shots below, taken on Firefox 2 for Solaris.
     [Screenshot: before jp.css was added to docs.sun.com]

     [Screenshot: after jp.css was added]

    Better definition of the Japanese characters. 



  2. Serving All Content As UTF-8.
    Previously, content was served in whatever encoding you wanted. Yes, really. If your preferred encoding [not configurable on Internet Explorer] was, say, ISO-8859-1, then that was the encoding of the content served to you - even if you requested a Korean page. Don't believe me? See the (slightly edited) wget output below. Relevant strings are highlighted.




    Odd, yes, I know, but it worked, because docs.sun.com converts all non-ASCII characters in the source to numeric character references, so the encoding really doesn't make a difference to how the page displays. I still don't know why this is done - probably a throwback to old browser days. However, serving content in anything other than UTF-8 can pose problems for search and other form-based features.

    Anyway, a part of the docs.sun.com code was reading the client's HTTP Accept-Charset header. This code has been found and will soon be removed, so everything will be served as UTF-8 - whether you like it or not :)
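
Here is a minimal Python sketch of why the numeric character reference approach made the served encoding irrelevant. It is only an illustration, not the docs.sun.com code; the function name to_numeric_references and the sample string are assumptions made for the example.

```python
import html


def to_numeric_references(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric character reference."""
    # 'xmlcharrefreplace' is a standard Python codec error handler that
    # writes &#NNNN; for any character the target codec cannot encode.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Hypothetical sample text standing in for a Korean page.
korean = "한국어 문서"
escaped = to_numeric_references(korean)

# The escaped markup is pure ASCII, so its bytes are identical whether the
# server labels the response ISO-8859-1 or UTF-8.
same_bytes = escaped.encode("iso-8859-1") == escaped.encode("utf-8")

# A browser resolves the references back to the original characters;
# html.unescape plays that role here.
round_trip = html.unescape(escaped) == korean
```

The catch is on the way back in: text typed into a search box or other form is raw input rather than pre-escaped references, so the charset of the page starts to matter again.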
  3. Non-ASCII Metadata In PDFs

    It turns out that our PDFs [most in v1.3] had no metadata. Indexing of these files was done purely on the basis of their main body content. We had a database containing all this metadata - and decided to apply the data to all the existing PDFs, since our new search handles the presentation of PDF results slightly differently.
    This was straightforward for ASCII text. A Perl script using the PDF::API2 module did all the work.
    However it failed for non-ASCII text.


    From reading the PDF 1.3 Reference Guide, it's clear that all non-ASCII [or at least non-Western-European] metadata should be UTF-16BE encoded. We had been passing in UTF-8 strings, and didn't really know what PDF::API2 was doing with them. Well, my colleague Phil Hooper figured it out, and fixed the PDF::API2 module in the process. I believe his fix will be in release 0.62 of the module. Nice work, Phil.
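
To illustrate the encoding rule, here is a rough Python sketch (not Phil's fix, and not PDF::API2 code) of how a metadata value could be turned into the bytes of a PDF text string. The PDF reference says a Unicode text string starts with the byte order marker bytes 254 and 255, followed by UTF-16BE. The function name pdf_text_string is hypothetical, and the sketch leaves out the escaping needed to write the bytes as a PDF string literal.

```python
import codecs


def pdf_text_string(value: str) -> bytes:
    """Return the bytes of a PDF text string for a metadata value.

    Assumption for this sketch: plain ASCII values are left as they are,
    and anything else is written as a byte order marker plus UTF-16BE.
    """
    try:
        return value.encode("ascii")
    except UnicodeEncodeError:
        # codecs.BOM_UTF16_BE is b'\xfe\xff', i.e. bytes 254 and 255.
        return codecs.BOM_UTF16_BE + value.encode("utf-16-be")


ascii_title = pdf_text_string("Installation Guide")
japanese_title = pdf_text_string("インストールガイド")
```

Passing UTF-8 bytes straight through, as we had been doing, gives a PDF reader no marker to go on, so it has no reason to treat them as Unicode.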


What I'd like To See


  1. Translation Linking

    It irks me that we have million$$ worth of translations on docs.sun.com - but there's no easy way to find out which books are actually available in a particular language. Or if I navigate to an English book, is it available in Korean?

    Having to navigate the product tree for each language is more than a little cumbersome.

    I think I've found an internal database that maps the relationship between English part numbers and translations. Armed with this, and provided the mappings are accurate, it should be quite straightforward to add a widget that automatically lists the available translations for a page/book/part number.
    Here's a very alpha prototype of Translation Finder.
     



  2. Extending the Translation Finder concept to search results

     If the Translation Finder widget works, then there's no reason why it couldn't be extended to search results. That is, for each search result, you're provided with links to the translations, if available. Something like below, where the flag icons depict the available translations:

    A rough mockup of what the search results might look like
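
The lookup behind both ideas could be as simple as the Python sketch below. Everything in it is an assumption for illustration: the part numbers are made up, the TRANSLATIONS table stands in for the internal database, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical data: English part number -> {language code: translated part number}.
TRANSLATIONS: Dict[str, Dict[str, str]] = {
    "EN-0001": {"ja": "JA-0001", "ko": "KO-0001"},
    "EN-0002": {"ja": "JA-0002"},
}


def available_translations(part_number: str) -> Dict[str, str]:
    """Return the translations known for an English part number (empty if none)."""
    return dict(TRANSLATIONS.get(part_number, {}))


def annotate_results(part_numbers: List[str]) -> List[Tuple[str, List[str]]]:
    """Pair each search result with the sorted language codes it is available in."""
    return [(pn, sorted(available_translations(pn))) for pn in part_numbers]


annotated = annotate_results(["EN-0001", "EN-0002", "EN-0003"])
```

A result with no known translations just gets an empty list, so the widget would show no flags for it. How useful this is depends entirely on how accurate the mappings are.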



Tell us what you want below » 

I'm mainly concerned with internationalization/localization features, but I'll advocate for any other general feature requests/improvements.

Comments:

hi Mick! That translation finder is way cool! I love it. I work in Sun's g11n team and agree with you that since we localize so much content, we should make it really easy for customers to find it.

Posted by melanie gao on September 12, 2007 at 09:43 AM EDT #


This blog copyright 2009 by MickM