help-gnu-emacs@gnu.org 2 June 2012 • 11:17AM -0400

Re: those funny non-ASCII characters
by rusi

On Jun 2, 2:06 am, Xah Lee <xah...@gmai...> wrote:
> Xah wrote
>
> > > 〈Unicode BOM Byte Order Mark Hack〉 http://xahlee.org/comp/unicode_BOM_byte_orde_mark.html
>
> > > http://www.unicode.org/faq/utf_bom.html#bom1
>
> On Jun 1, 9:26 am, rusi <rustompm...@gmai...> wrote:
>
> > See http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf
> > (pg 36) "Use of a BOM is neither required nor recommended for UTF-8,
> > but may
> > be encountered in contexts where UTF-8 data is converted from other
> > encoding forms..."
>
> > More specifically the non-recommendation of bom: http://www.unicode.org/faq/utf_bom.html
> > "Note that some recipients of UTF-8 encoded data do not expect a BOM.
> > Where UTF-8 is used transparently in 8-bit environments, the use of a
> > BOM will interfere with any protocol or file format that expects
> > specific ASCII characters at the beginning, such as the use of "#!" of
> > at the beginning of Unix shell scripts. "
>
> didn't i mention these 2 points exactly in the link i gave??

Yeah, your own link says this (as you know, I often use and quote your
Unicode pages :-) ):

- In unix-like OSes, BOM for utf-8 conflicts with the Shebang (Unix)
hack.
- Many Window software add BOM to utf-8 files, e.g. Notepad.

But you also say

> If your lang spec says unicode, you have to support BOM mark

So I am not clear what your stand is...

Let me make my own position clear:
The de jure Unicode standard is set by the Unicode Consortium (or
whatever it's called).
The de facto standard is set by Microsoft and Java.
The two conflict.
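
To make the shebang conflict concrete, here is a rough Python sketch (not from either of the linked pages). It assumes Python 3 and uses the standard `codecs.BOM_UTF8` constant and the `utf-8-sig` codec; the helper names `strip_utf8_bom` and `starts_with_shebang` are made up for illustration.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def starts_with_shebang(data: bytes) -> bool:
    """Approximate the kernel's check: the very first two bytes must be '#!'."""
    return data.startswith(b"#!")


script = "#!/bin/sh\necho hi\n"

# Plain UTF-8, as the Unicode standard recommends for 8-bit environments.
plain = script.encode("utf-8")

# The 'utf-8-sig' codec prepends a BOM on encoding, roughly what an
# editor that always writes a BOM would produce.
with_bom = script.encode("utf-8-sig")

print(starts_with_shebang(plain))                     # shebang is seen
print(starts_with_shebang(with_bom))                  # BOM hides the shebang
print(starts_with_shebang(strip_utf8_bom(with_bom)))  # visible again
```

A reader that follows the de facto practice would accept both forms and strip the BOM on input, while a writer that follows the recommendation would never emit one.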
