nanogui: Unicode
Subject:
RE: Unicode
From:
"Rob Taylor" ####@####.####
Date:
30 Mar 2000 09:46:32 -0000
Message-Id: <000301bf9a2b$6b6585f0$b400a8c0@eventhorizon>
> Not on memory for large texts, unless you need the whole Unicode range
> all the time... Most Western languages use lots of ASCII (1 byte),
> plus some additional characters that fit in 2 bytes, IIRC.
It does depend on which language you're encoding...
but speed-wise it is a lot more efficient (O(1) operations for things that
are O(n) operations in UTF-8).
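A minimal C sketch of that complexity claim (the function names are mine, not from any library, and it assumes every character fits in one 16-bit unit, i.e. no surrogate pairs): indexing a 16-bit string is plain array arithmetic, while finding the i-th character of a UTF-8 string means walking the bytes from the start.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit Unicode: the i-th character is just s[i] -- O(1).
   (Only true as long as no character needs a surrogate pair.) */
static uint16_t u16_char_at(const uint16_t *s, size_t i)
{
    return s[i];
}

/* UTF-8: characters are 1-4 bytes, so we must scan from the
   beginning -- O(n).  Continuation bytes look like 10xxxxxx. */
static const char *u8_char_at(const char *s, size_t i)
{
    while (i > 0 && *s) {
        s++;
        while ((*s & 0xC0) == 0x80)   /* skip continuation bytes */
            s++;
        i--;
    }
    return s;   /* points at the first byte of character i */
}
```

For example, in the UTF-8 string "h\xC3\xA9llo" ("héllo"), character 2 is 'l', but reaching it requires stepping over the two-byte sequence for 'é'.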
> UTF-8 does not have big/little endian problems, either. Is there
> a strict standard for endianness when exchanging unicode information?
> If not: UTF-8 just got even better.
Unicode (UTF-16 certainly; it probably holds for the others also) allows for
a strict standard regarding endianness, if that makes sense ;). In practice,
with UTF-16 you just define a file format that puts the
zero-width no-break space character at the front. (ZWNBSP = FEFF; the
byte-swapped form FFFE is not a character, so a reader can always tell which
byte order it is looking at.)
UTF-8 is by definition a good upgrade path for Unix, but, speaking as
someone who uses both UTF-16 and UTF-8: UTF-16 is faster, easier to code
for, easier to write string containers for, and generally less hacky to
use. UTF-8 is easier to write portable code with, more usable with gcc,
and has a smaller encoding size for Western-language applications.
As I said previously, it all depends on what you're doing...
For example: I'm going to be making a try at a port of Qt to nano-X with
Roberto Alsina, and as Qt stores all its strings internally as UTF-16, it
would be a hell of a lot more efficient for me to code in UTF-16 and write
stuff to screen in UTF-16 than to use UTF-8. It's a no-brainer.
Rob