senseclusters-developers@lists.sourceforge.net, 14 April 2008, 4:48 AM -0400

Re: [Senseclusters-developers] senseclusters-developers Digest, Vol 7, Issue 2
by Ted Pedersen

On Sat, Apr 12, 2008 at 5:19 AM, Teshome Kassie <tkheran@yaho...> wrote:
> Hello all,
>
> Does SenseClusters support UTF-8?
>
> Teshome
>

Great question. I think the answer is no, unfortunately. The main issue
is not so much SenseClusters itself as Text::NSP, which we use for a
significant portion of our feature extraction.

There has been considerable discussion about how to make Text::NSP
better at handling different character sets. If you are interested in
the history of that discussion, you can see the most recent version of
it here:

http://www.mail-archive.com/ngram@yaho.../msg00156.html

The short version is that I've decided the right thing to do is to use
the Perl module Encode in Text::NSP to provide full Unicode support. The
only drawback is that this requires a bit of work, and right now it
hasn't risen high enough in the queue. But it's getting there,
especially since SenseClusters depends so heavily on Text::NSP.

http://search.cpan.org/dist/Encode/
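
To illustrate why decoding matters for token and n-gram counting, here is
a minimal sketch. It is written in Python rather than Perl and is not
Text::NSP or SenseClusters code. The function names and the sample string
are made up for the example, and the input is assumed to be UTF-8. The
idea is the same as decoding with Encode before tokenizing: a pattern
applied to raw bytes splits words at every non-ASCII character, while the
same pattern applied to decoded text keeps them whole.

```python
import re
from collections import Counter


def count_tokens_bytes(raw: bytes) -> Counter:
    # A bytes pattern treats only ASCII letters, digits and "_" as word
    # characters, so multi-byte UTF-8 sequences act as token separators.
    return Counter(re.findall(rb"\w+", raw))


def count_tokens_decoded(raw: bytes, encoding: str = "utf-8") -> Counter:
    # Decode first (assumption: the caller knows the encoding), then
    # tokenize the resulting text with a Unicode-aware pattern.
    text = raw.decode(encoding)
    return Counter(re.findall(r"\w+", text))


# Hypothetical sample input, encoded as UTF-8 bytes.
raw = "naïve café naïve".encode("utf-8")

byte_counts = count_tokens_bytes(raw)
text_counts = count_tokens_decoded(raw)

print("bytes:  ", dict(byte_counts))
print("decoded:", dict(text_counts))
```

On the byte side the word "naïve" is broken into fragments, so any counts
built on those tokens are wrong for non-ASCII text. After decoding, the
words are counted intact.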

So, that's the long-term solution I have planned. Unfortunately, that
doesn't help much in the shorter term.

Sorry I don't have a better answer. Other suggestions are most welcome.

Cordially,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse

_______________________________________________
senseclusters-developers mailing list
senseclusters-developers@list...
https://lists.sourceforge.net/lists/listinfo/senseclusters-developers
