Python and UTF32/UCS4
Your Python interpreter may have been compiled with --enable-unicode=ucs2, in which case the built-in unichr(i) function raises a ValueError if the given value is larger than 0xFFFF. The 'ucs2' here actually means UTF-16, which is a variable-length encoding: code points above 0xFFFF are stored as a pair of surrogate code units. So you need a simple function to convert between UTF-32/UCS-4 code points and UTF-16. Here is an example code snippet:
def ucs4chr(codepoint):
    try:
        return unichr(codepoint)
    except ValueError:
        # Narrow (ucs2) build: split the code point into a surrogate pair.
        hi, lo = divmod(codepoint - 0x10000, 0x400)
        return unichr(0xd800 + hi) + unichr(0xdc00 + lo)

def ucs4ord(s):
    if len(s) == 1:
        return ord(s)
    if len(s) == 2:
        # Recombine the high and low surrogates into one code point.
        hi, lo = ord(s[0]) - 0xd800, ord(s[1]) - 0xdc00
        return hi * 0x400 + lo + 0x10000
    raise TypeError("ucs4ord() expected a valid ucs4 character")
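
To check the arithmetic without depending on how the interpreter was built, the same conversion can be sketched on plain integers. The function names to_surrogates and from_surrogates are made up for this illustration and are not part of the snippet above; the range checks are an extra assumption about what counts as valid input.

```python
def to_surrogates(codepoint):
    """Split a code point above 0xFFFF into (high, low) UTF-16 code units."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("code point outside the supplementary range")
    # Subtract the 0x10000 offset, then split the remaining 20 bits
    # into two 10-bit halves.
    hi, lo = divmod(codepoint - 0x10000, 0x400)
    return 0xD800 + hi, 0xDC00 + lo


def from_surrogates(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return (high - 0xD800) * 0x400 + (low - 0xDC00) + 0x10000


high, low = to_surrogates(0x1F600)
print("%#x %#x" % (high, low))            # 0xd83d 0xde00
print("%#x" % from_surrogates(high, low))  # 0x1f600
```

Note that the low half has to be added back when recombining; dropping it would map every code point in the same 0x400 block to the same value.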

