
senseclusters-developers@lists.sourceforge.net 16 October 2006 • 2:49AM -0400

[Senseclusters-developers] script to create feature files from held out training data
by ted pedersen

Archival post. This script was used to create external training data
for a CICLing 2007 submission.

=================================================================

#!/bin/csh

# This script was used to create statistic files for different measures
# to be used as features for some other set of test/evaluation data.

# by ted pedersen, october 2006

set STOPLIST = /home/ted/Web/StopLists
# nyt-25.stop
# nyt-75.stop

set TRAINDATA = /home/CICLING/Train
# nyt-plain-clean-25-tr.txt
# nyt-plain-clean-75-tr.txt

foreach CORPUS (1 25 75)

    foreach REMOVE (5 10 20 50)

        set PREFIX = nyt-$CORPUS-$REMOVE

        echo "running $PREFIX count"

        # count bigrams, applying the stoplist and dropping bigrams
        # that occur fewer than $REMOVE times
        count.pl --ngram 2 \
            --token token.regex \
            --remove $REMOVE \
            --stop $STOPLIST/nyt-$CORPUS.stop \
            $PREFIX.cnt2 \
            $TRAINDATA/nyt-plain-clean-$CORPUS-tr.txt

        foreach STAT (ll leftFisher pmi odds)

            echo "running $PREFIX $STAT statistic"

            # each measure gets its own score cutoff
            if ($STAT == ll) then
                # 3.84 = chi-squared critical value, p = 0.05, 1 df
                set SCORE = 3.84
            else if ($STAT == leftFisher) then
                set SCORE = 0.95
            else if ($STAT == pmi) then
                set SCORE = 5.00
            else if ($STAT == odds) then
                set SCORE = 10000.00
            else
                echo "unknown statistic: $STAT"
                exit 1
            endif

            # keep only bigrams scoring at or above $SCORE
            statistic.pl $STAT --precision 4 --score $SCORE $PREFIX.$STAT $PREFIX.cnt2

        end

    end

end
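
As a quick check on what the nested loops above produce, here is a minimal
dry-run sketch. It is written in POSIX sh rather than csh so it can be run
on its own, and it does not call count.pl or statistic.pl at all. The
function name list_outputs is made up for illustration; the file names
follow the $PREFIX.cnt2 and $PREFIX.$STAT pattern used in the script.

```shell
#!/bin/sh
# Dry run: print the name of every file the csh script would write,
# without invoking any NSP program.
# list_outputs is a hypothetical helper, not part of the original script.
list_outputs() {
    for corpus in 1 25 75; do
        for remove in 5 10 20 50; do
            prefix="nyt-$corpus-$remove"
            # one bigram count file per corpus/cutoff pair
            echo "$prefix.cnt2"
            # one scored file per association measure
            for stat in ll leftFisher pmi odds; do
                echo "$prefix.$stat"
            done
        done
    done
}

list_outputs
```

With 3 corpora and 4 frequency cutoffs there are 12 prefixes, each giving
one count file and four statistic files, so the listing should have 60
entries in total.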



