nanogui: UniCode


Previous by date: 29 Mar 2000 20:20:43 -0000 Re: client/server protocol speedup, Martin Jolicoeur
Next by date: 29 Mar 2000 20:20:43 -0000 Re: client/server protocol speedup, Greg Haerr
Previous in thread: 29 Mar 2000 20:20:43 -0000 Re: Unicode, Bradley D. LaRonde
Next in thread: 29 Mar 2000 20:20:43 -0000 Re: Unicode, Rob Taylor

Subject: Re: Unicode
From: Morten Rolland ####@####.####
Date: 29 Mar 2000 20:20:43 -0000
Message-Id: <38E2713B.5C0E4D4F@screenmedia.no>

Rob Taylor wrote:
> 
> when it comes down to it, if you're designing a program to use Unicode from
> the outset, you should use (and indeed be able to use) 16- or 32-bit Unicode,
> as it's more efficient and easier to use.

Not on memory for large texts, unless you need the whole Unicode range
all the time...  Most western languages use lots of ASCII (1 byte),
and some additional characters that fit in 2 bytes, IIRC.
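
To put rough numbers on the memory argument, here is a minimal sketch
of the UTF-8 length rule. The function names are made up for
illustration, and it assumes the code point range ends at U+10FFFF
(the range reachable with surrogates):

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode one code point in UTF-8.
 * Returns 0 for values that are not valid Unicode scalar values. */
size_t utf8_encoded_len(uint32_t cp)
{
    if (cp < 0x80)
        return 1;                       /* ASCII */
    if (cp < 0x800)
        return 2;                       /* accented Latin, Greek, Cyrillic, ... */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not characters */
    if (cp < 0x10000)
        return 3;                       /* rest of the 16-bit range */
    if (cp <= 0x10FFFF)
        return 4;                       /* beyond 16 bits */
    return 0;
}

/* Total UTF-8 size of a text given as an array of code points. */
size_t utf8_total_len(const uint32_t *cps, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += utf8_encoded_len(cps[i]);
    return total;
}
```

For a six-letter word with two non-ASCII letters below U+0800, this
gives 8 bytes in UTF-8, against 12 bytes with 16-bit units and 24
bytes with 32-bit units.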

Since the Unicode guys have now defined how to have over a
million code points with surrogates, and are planning to use them,
16-bit Unicode v3.0 will have to be handled similarly to UTF-8,
so there goes simplicity... and 32 bits wastes even more space.
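
To illustrate why surrogates make 16-bit text variable-length, here is
a rough sketch of decoding one code point from 16-bit units. The
function name and its error convention (returning 0) are my own
assumptions, not taken from any library:

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one code point from the 16-bit units s[0..n).
 * Stores it in *cp and returns the number of units consumed (1 or 2),
 * or 0 if the input is empty or holds an unpaired surrogate. */
size_t utf16_decode(const uint16_t *s, size_t n, uint32_t *cp)
{
    if (n == 0)
        return 0;

    uint16_t hi = s[0];
    if (hi < 0xD800 || hi > 0xDFFF) {   /* ordinary 16-bit character */
        *cp = hi;
        return 1;
    }
    if (hi >= 0xDC00)                   /* low surrogate without a high one */
        return 0;
    if (n < 2)                          /* high surrogate at end of input */
        return 0;

    uint16_t lo = s[1];
    if (lo < 0xDC00 || lo > 0xDFFF)     /* high surrogate not followed by low */
        return 0;

    /* 10 bits from each half, offset by 0x10000 */
    *cp = 0x10000 + (((uint32_t)(hi - 0xD800) << 10) | (uint32_t)(lo - 0xDC00));
    return 2;
}
```

Any code that indexes or truncates such a buffer has to take the same
care as UTF-8 code does with multi-byte sequences.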

> If however you have to interface
> to legacy code, or are modifying an ASCII program for Unicode, UTF-8 should
> be available for use... no need for a religious war here...

Yes, if you have a UTF-8 editor, it is much more pleasant to
write a program using UTF-8 encoding than 16/32 bit, as you
can write things like:

  char *ident="$Id: <utf-8 filename>.c,v 1.4 1999/09/09 15:00:00 mr Exp $";

  char *foobar = strdup("<insert-favourite-chinese-glyphs-here>");

  /* <Something went badly wrong> (in Russian) */
  printf("Oops, error %d: %s\n",errno,foobar);

This is simply not possible with any other encoding and current
development tools.  And there is a *lot* of legacy code.  Just
about all of it, actually.
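
As a small sketch of how little legacy code has to change: the
byte-oriented string functions keep working on UTF-8, and counting
characters instead of bytes only needs a helper along these lines
(the name utf8_count is hypothetical, and valid UTF-8 input is
assumed):

```c
#include <stddef.h>

/* Count code points in a NUL-terminated UTF-8 string by skipping
 * continuation bytes (those of the form 10xxxxxx).
 * Assumes the input is valid UTF-8. */
size_t utf8_count(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```

strlen() still returns the byte count, which is what malloc() and
friends want, and strchr() for an ASCII character is still safe,
because bytes below 0x80 never occur inside a multi-byte sequence.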

UTF-8 does not have big/little-endian problems, either.  Is there
a strict standard for endianness when exchanging Unicode information?
If not: UTF-8 just got even better.
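
For 16-bit data, the usual workaround as far as I know is to start the
stream with the byte order mark U+FEFF and let the receiver look at
the first two bytes.  A minimal sketch, with made-up names, that does
not try to tell 16-bit from 32-bit little-endian data:

```c
#include <stddef.h>

enum utf16_order { UTF16_UNKNOWN = 0, UTF16_BE, UTF16_LE };

/* Guess the byte order of 16-bit Unicode data from a leading
 * byte order mark (U+FEFF). Returns UTF16_UNKNOWN if there is none. */
enum utf16_order utf16_detect_bom(const unsigned char *buf, size_t n)
{
    if (n >= 2) {
        if (buf[0] == 0xFE && buf[1] == 0xFF)
            return UTF16_BE;            /* FE FF: big-endian */
        if (buf[0] == 0xFF && buf[1] == 0xFE)
            return UTF16_LE;            /* FF FE: little-endian */
    }
    return UTF16_UNKNOWN;
}
```

A UTF-8 stream needs no such check, since its byte sequence is the
same on every machine.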

UTF-8 rules,
Morten Rolland, Screen Media
:-)
