Hi, I am thinking of translating a Tcl procedure into C++ or Python. My problem is that I am not familiar with Tcl, so I am looking for help with an explanation of each step.

The program builds a list of probabilities: it calls a function called generateProbability(), which returns the probability for a word, combines the probabilities of 16 entries from that list, and reports the message as spam when the combined probability is above 90%.

Here is the program:

Code:

# isSpam --
#   Guess whether a given message is spam. This is done by finding
#   the 16 most interesting words in the messages (i.e. those whose
#   probabilities deviate most strongly from being neutral) and
#   combining those words' probabilities to give an overall
#   probability that a particular message is spam. This is then
#   converted into a boolean value with a trivial threshold
#   function, so that messages are only found to be spam when the
#   code is better than 90% sure of it.
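Read as Python, the steps in that header comment might look roughly like the sketch below. It is only an outline under stated assumptions: the word pattern, the per-word probability function and the combining function are not shown in this thread, so they are passed in as parameters (word_re, generate_probability and combine are hypothetical names), and 0.5 is taken as the "neutral" probability.

```python
import re


def is_spam(message, word_re, generate_probability, combine,
            top_n=16, threshold=0.9):
    """Return True when the combined probability exceeds the threshold.

    word_re, generate_probability and combine are supplied by the caller;
    their real definitions are not part of this sketch.
    """
    # Collect each distinct word once.
    words = {m.group(0) for m in re.finditer(word_re, message)}

    # Score every word by how far its probability is from the neutral 0.5.
    scored = []
    for word in words:
        p = generate_probability(word)
        scored.append((abs(p - 0.5), p, word))

    # Keep the probabilities of the top_n most "interesting" words.
    scored.sort(key=lambda item: item[0], reverse=True)
    interesting = [p for _, p, _ in scored[:top_n]]

    # Trivial threshold: spam only when better than 90% sure.
    return combine(interesting) > threshold
```

Passing the helpers in as arguments keeps the sketch independent of however the real program computes and combines its probabilities.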

I'll add a few comments; hopefully they will point you in the right direction. Remember: in Tcl, anything written in square brackets is a command. That command is run and its result is substituted into the line. Thus set score [combine $interesting] in Tcl would be score = combine(interesting), or something similar, in other languages.
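To make the bracket rule concrete, here is a small Python sketch of two Tcl lines from the procedure next to their rough equivalents. The combine function here is only a placeholder that averages its input (the real one is not shown in this thread), and the sample values are made up for illustration.

```python
def combine(probabilities):
    # Hypothetical placeholder: the real combine is not shown here.
    return sum(probabilities) / len(probabilities)


interesting = [0.99, 0.95]  # made-up sample values

# Tcl: set score [combine $interesting]
# The bracketed command runs first; its result is what gets assigned.
score = combine(interesting)

# Tcl: lappend magic [list [expr {abs($p-0.5)}] $p $word]
# Nested brackets are evaluated inside out, like nested function calls.
magic = []
p, word = 0.99, "example"
magic.append([abs(p - 0.5), p, word])
```

In both cases the innermost bracket plays the role of the innermost function call in Python.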

Code:

proc isSpam {message} {

# The global variables WordRE and reasons are shared with the main program

global WordRE reasons

# Look for regular expression matches in the message, starting the search at index i

set i 0

while {[regexp -indices -start $i $WordRE $message match]} {

    # For each match, record the word as a key in an array called t
    # (the equivalent of a dictionary in Python), so each distinct word is stored once

    foreach {j i} $match {}

    set t([string range $message $j $i]) {}

    # Resume the search just past the end of this match

    incr i

}

# Take each of the words found and look up its probability using
# a command (proc) called generateProbability.

foreach word [array names t] {

    set p [generateProbability $word]

    # Build up a list of lists; each entry holds the distance of p from 0.5,
    # p itself, and the word

    lappend magic [list [expr {abs($p-0.5)}] $p $word]

}

# Sort the list, and loop through the 16 entries whose probabilities lie
# furthest from 0.5, putting their values into a new list called interesting