
applescript-users@lists.apple.com 17 January 2006 • 5:37AM -0500

Re: As Text Work-Around Broken
by Christopher Nebel


On Jan 14, 2006, at 11:26 AM, has wrote:

> One comment about that page: it incorrectly states that an AS  
> string consists of "Mac-Roman" characters; AS strings actually use  
> the user's primary encoding, as determined from their International  
> system preferences. For most US, western European and Antipodean  
> users this will be MacRoman, but will often be different for folks  
> in other parts of the world.

You know, I thought that too, but then I read a little closer and  
realized that it's mostly correct.  In fact, it defines its own term  
"Mac-encoded" to mean "text data in your primary encoding".  It does,  
however, subtly assume that the primary encoding is MacRoman by  
referring to un-encodable characters as "non-Roman".
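
To see why "non-Roman" bakes in an assumption: which characters count as un-encodable depends on the primary encoding. A minimal Python sketch, using Python's `mac_roman` and `mac_cyrillic` codecs as stand-ins for two possible primary encodings (the helper `can_encode` is made up for this example and is not an AppleScript facility):

```python
def can_encode(text, encoding):
    """Return True if every character of `text` exists in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


E_ACUTE = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE
ZHE = "\u0416"      # CYRILLIC CAPITAL LETTER ZHE

# Each stand-in "primary encoding" accepts one character and rejects the other.
for encoding in ("mac_roman", "mac_cyrillic"):
    print(encoding, can_encode(E_ACUTE, encoding), can_encode(ZHE, encoding))
```

With a Cyrillic primary encoding it is the accented Latin letter that cannot be encoded, so "non-Roman" only describes the un-encodable set when the primary encoding happens to be MacRoman.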

The only other problem (encoding-wise) is in its definition of the  
"string" contents, where it says

"The string class basically stores one byte ([0..255]) per character.  
The 128 first values are rendered according to the ASCII standard ...  
The 128 larger values are rendered using a macintosh encoding, the  
one that goes with the first language listed in your International  
preference pane."

In fact, the "string" class stores data encoded using the primary  
encoding (which is indeed determined by the first language listed in  
your International preference pane; that bit is fine).  Some  
encodings use one byte per character, some mix one- and two-byte  
characters (MacJapanese, for instance), and I don't know if any pure  
multi-byte encodings are allowed these days.
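
A rough way to see the difference in byte widths, sketched in Python. This assumes Python's `shift_jis` codec is an acceptable stand-in for MacJapanese (which is based on Shift-JIS with Apple extensions; Python's standard library has no dedicated MacJapanese codec), and `byte_lengths` is a name invented for the example:

```python
def byte_lengths(text, encoding):
    """Return how many bytes each character of `text` occupies in `encoding`."""
    return [len(ch.encode(encoding)) for ch in text]


# "cafe" with an acute accent on the e: four characters, all present in MacRoman.
roman = byte_lengths("caf\u00e9", "mac_roman")

# "AB" followed by two kanji (U+65E5, U+672C), encoded with the Shift-JIS stand-in.
japanese = byte_lengths("AB\u65e5\u672c", "shift_jis")

print(roman)     # one byte per character
print(japanese)  # a mix of one-byte and two-byte characters
```

So "one byte per character" holds for the MacRoman case but not for the Japanese one, where the byte count of a string no longer equals its character count.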

The trick is that most of them are supersets of ASCII: bytes 0  
through 127 mean the same thing everywhere.  (Well, almost  
everywhere.  As the page points out, MacJapanese is not a strict  
superset -- 0x5C is a yen sign, not a backslash.)  Some older Mac  
encodings, such as MacArabic, are completely different, but those  
aren't supported these days except for import and export; the system  
uses Unicode instead.
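
The ASCII-range claim can be checked mechanically for the Mac encodings that Python ships codecs for. The sketch below is only an illustration under those assumptions: the list of encodings is whatever Python provides, not the set Mac OS supports, and the `MACJAPANESE_OVERRIDES` table is a hypothetical one-entry model of the 0x5C exception mentioned above, not a real MacJapanese decoder.

```python
MAC_ENCODINGS = ["mac_roman", "mac_cyrillic", "mac_greek", "mac_turkish"]


def is_ascii_compatible(encoding):
    """True if bytes 0..127 decode to the same characters as in ASCII."""
    return all(bytes([b]).decode(encoding) == chr(b) for b in range(128))


ascii_compat = {enc: is_ascii_compatible(enc) for enc in MAC_ENCODINGS}

# The upper half is where the encodings diverge: same byte, different character.
high_roman = bytes([0x80]).decode("mac_roman")
high_cyrillic = bytes([0x80]).decode("mac_cyrillic")

# Hypothetical model of the MacJapanese exception: 0x5C is a yen sign.
MACJAPANESE_OVERRIDES = {0x5C: "\u00a5"}


def decode_low_byte(b, overrides=None):
    """Decode a byte in 0..127 as ASCII unless an override table says otherwise."""
    return (overrides or {}).get(b, chr(b))


print(ascii_compat)
print(repr(high_roman), repr(high_cyrillic))
print(decode_low_byte(0x5C), decode_low_byte(0x5C, MACJAPANESE_OVERRIDES))
```

The override table is the point of the last part: a script that treats byte 0x5C as a backslash is relying on the primary encoding being one of the strictly ASCII-compatible ones.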


--Chris Nebel
AppleScript and Automator Engineering

