Tuesday Mar 03, 2009

Most of the time we deal with multi-byte characters, and they affect web applications in every programming language. For non-English languages the recommended choice is Unicode, which maps a wide range of characters. To get it right, you need to handle encoding at every layer, from the operating system up to the application. Internationalization is always tricky to handle in the development cycle, so let us go through it part by part, from the operating system level to the application level, and I will explain internationalization tricks for web-based applications and technologies along the way. This post is all about handling UTF-8 characters in HTML, JavaScript, server configuration, the database (MySQL) and PHP.

[Read More]

Monday Dec 01, 2008

FOSS.IN [ ಫೊಸ್.ಇನ್ ], with the caption "Talk is cheap, show me the code" (wow, what an impressive caption!), is a Free and Open Source Software (FOSS) event that took place from the 25th to the 29th, 2008, at IISc, Innovation Center, Bangalore. This time it was more about internationalization and localization, with talks and workouts on open source Linux and a strong focus on Indic languages and their processing. Information science in India is still at an early stage, and investment in digitization has not been very impressive in recent years. Investing research and development effort in linguistic activities such as translation and machine translation is becoming inevitable.

Many books are published by institutes such as the Center of Information Languages, the Resource Center for Indian Language Technology Solutions (IIT Bombay) and the Language Technologies Research Centre (IIIT Hyderabad), as well as by the R&D Department of Information Technology, Government of India, but they are not fully digitized for the world. Over the last few decades Indian institutes have been investing in:

  • Language access and Machine Translation in Indian Languages
  • Speech Processing for Indian Languages
  • Search Engines, Information Extraction and Retrieval for Indian Languages
  • Ontology and Corpus in Indian Languages
  • Mailing, Transliteration, Lexicons, POS Corpus Tagger, Morphological Generators, etc.
I had an amazing experience at FOSS.IN, looking ahead at the future of the market. How about you? Join the free and open source train. Find out more inside on machine translation and Indic languages; I have summarized the key points ...[Read More]

Tuesday Jul 01, 2008

Most of the time we get tangled up in providing test data while writing JUnit-style test cases with TestNG. Writing internationalized test cases is always tricky: the code changes several times, and we end up puzzled, adding and removing hardcoded i18n data for each test method.

So I went ahead to find out how easy it can be. Check out the easy way to do it with TestNG, a testing framework designed to simplify a broad range of unit testing needs during the development cycle. It is a combination of Java annotations and an XML file, which helps us finish i18n-enabled test cases quickly. In this post I will discuss the DataProvider concept of TestNG.
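To give a feel for the idea, here is a minimal sketch of a TestNG data provider feeding i18n strings into one test method. It assumes TestNG is on the classpath; the class name, provider name and sample strings are made up for illustration and are not taken from the full post.

```java
import org.testng.Assert;
import org.testng.annotations.DataProvider;
import org.testng.annotations.Test;

import java.nio.charset.StandardCharsets;

public class Utf8RoundTripTest {

    // Hypothetical sample data: each row holds the arguments for one test run.
    // Unicode escapes are used so the file compiles regardless of source encoding.
    @DataProvider(name = "i18nStrings")
    public Object[][] i18nStrings() {
        return new Object[][] {
            { "hello" },                        // plain ASCII
            { "\u0CAB\u0CCA\u0CB8\u0CCD" },     // Kannada sample
            { "\u65E5\u672C\u8A9E" },           // Japanese sample
        };
    }

    // TestNG calls this method once per row returned by the provider.
    @Test(dataProvider = "i18nStrings")
    public void roundTripsThroughUtf8(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        String decoded = new String(encoded, StandardCharsets.UTF_8);
        Assert.assertEquals(decoded, text);
    }
}
```

The point of the design is that the i18n data lives in one place, so adding or removing a language means touching one row instead of every test method.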

[Read More]

Thursday Jul 26, 2007

First three steps to internationalization in Java code:
  • Handling Unicode character encoding.
  • Handling the effects of local customs and culture.
  • Dealing with the last step in the process: localizing user-visible messages (today's topic, with an example).
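As a rough illustration of that last step, here is a small sketch using the standard `ResourceBundle` and `MessageFormat` classes. The bundle classes, the `greeting` key and the message texts are hypothetical, and the bundles are written as `ListResourceBundle` classes only to keep the example in one file; real projects usually keep them in `.properties` files.

```java
import java.text.MessageFormat;
import java.util.ListResourceBundle;
import java.util.Locale;
import java.util.ResourceBundle;

public class LocalizedGreeting {

    // Base (fallback) bundle; key and text are hypothetical.
    public static class Messages extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] { { "greeting", "Hello, {0}!" } };
        }
    }

    // French variant, found by ResourceBundle through the "_fr" name suffix.
    public static class Messages_fr extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] { { "greeting", "Bonjour, {0}!" } };
        }
    }

    // Looks up the pattern for the locale and fills in the user's name.
    static String greet(Locale locale, String name) {
        ResourceBundle bundle =
                ResourceBundle.getBundle(Messages.class.getName(), locale);
        MessageFormat format =
                new MessageFormat(bundle.getString("greeting"), locale);
        return format.format(new Object[] { name });
    }

    public static void main(String[] args) {
        System.out.println(greet(Locale.ROOT, "Asha"));   // base bundle
        System.out.println(greet(Locale.FRENCH, "Asha")); // French bundle
    }
}
```

The code never hardcodes a user-visible string outside a bundle, so supporting another language is a matter of adding one more bundle.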
[Read More]

This blog copyright 2009 by shankar