Bayes::BayesClassifier Class Reference

#include <Bayes.h>


Public Types

enum  ClassificationTable { GOOD, BAD }

Public Member Functions

const Score score (const char *const text) const
const Score score (const string &text) const
void learn (ClassificationTable table, const char *const text)
void learn (ClassificationTable table, const string &text)
void reclassify (ClassificationTable table, const char *const text)
void reclassify (ClassificationTable table, const string &text)
void load (const char *const file)
void load (const std::string &file)
void load (const boost::filesystem::path &file)
void save (const char *const file) const
void save (const std::string &file) const
void save (const boost::filesystem::path &file) const

Friends

std::ostream & operator<< (std::ostream &out, const BayesClassifier &base)


Detailed Description

BayesClassifier provides functions to learn strings as either GOOD or BAD and to score a given text based on the previously learned examples.

The Bayesian score of a single token is computed as follows (documented at http://www.paulgraham.com/spam.html):

	(let ((g (* 2 (or (gethash word good) 0)))
	      (b (or (gethash word bad) 0)))
	  (unless (< (+ g b) 5)
	    (max .01
	         (min .99 (float (/ (min 1 (/ b nbad))
	                            (+ (min 1 (/ g ngood))
	                               (min 1 (/ b nbad)))))))))
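The Lisp formula above can be transcribed into C++ roughly as follows. This is a minimal sketch, not code taken from Bayes.h: the function name tokenProbability and its parameters are hypothetical, and it assumes that ngood and nbad are the numbers of learned GOOD and BAD examples, as in the referenced essay. It needs C++17 for std::optional.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>

// Hypothetical helper (not part of Bayes.h): per-token probability
// following the Lisp formula quoted in this documentation.
//   goodCount / badCount: occurrences of the token in the GOOD / BAD table
//   ngood / nbad:         number of learned GOOD / BAD examples (assumed > 0)
// Returns std::nullopt when the token is too rare to be scored.
std::optional<double> tokenProbability(unsigned goodCount, unsigned badCount,
                                       unsigned ngood, unsigned nbad)
{
    assert(ngood > 0 && nbad > 0);

    const double g = 2.0 * goodCount;  // GOOD occurrences are counted twice
    const double b = badCount;
    if (g + b < 5.0)
        return std::nullopt;           // mirrors (unless (< (+ g b) 5) ...)

    const double goodRatio = std::min(1.0, g / ngood);
    const double badRatio  = std::min(1.0, b / nbad);

    // g + b >= 5 guarantees that at least one ratio is non-zero.
    const double p = badRatio / (goodRatio + badRatio);

    // Clamp to [0.01, 0.99] so that no single token is ever decisive.
    return std::max(0.01, std::min(0.99, p));
}
```

A token seen only in BAD examples is capped at 0.99, a token seen only in GOOD examples is raised to 0.01, and a token with fewer than five weighted occurrences yields no value at all.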


Member Enumeration Documentation

enum Bayes::BayesClassifier::ClassificationTable

Classification type used for a certain example.

Enumerator:
GOOD  This type holds the Ham examples.
BAD  This type holds the Spam examples.


Member Function Documentation

const Score Bayes::BayesClassifier::score ( const char *const  text  )  const [inline]

This function scores a complete text, e.g. a chat line. The given text is split into tokens and each token is scored. The most representative tokens for Ham and Spam are then weighed against each other to produce the overall score of the whole text.

Parameters:
text the text that shall be scored
Returns:
the Score object computed for the given text.
See also:
Score

const Score Bayes::BayesClassifier::score ( const string &  text  )  const

This function scores a complete text, e.g. a chat line. The given text is split into tokens and each token is scored. The most representative tokens for Ham and Spam are then weighed against each other to produce the overall score of the whole text.

Parameters:
text the text that shall be scored
Returns:
the Score object computed for the given text.
See also:
Score
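How the most representative tokens are weighed against each other is not spelled out here. The sketch below assumes the classifier follows the essay referenced in the detailed description: the tokens whose probability lies farthest from the neutral 0.5 are kept (the essay uses 15) and combined with the naive Bayes product rule. The name combineProbabilities and the keep parameter are hypothetical and not part of Bayes.h.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Bayes.h): combine per-token
// probabilities into one overall value between 0 (Ham) and 1 (Spam).
// Inputs are expected to lie in [0.01, 0.99], so the denominator
// below can never become zero.
double combineProbabilities(std::vector<double> probs, std::size_t keep = 15)
{
    assert(!probs.empty());

    // Most "interesting" tokens first: largest distance from 0.5.
    std::sort(probs.begin(), probs.end(), [](double a, double b) {
        return std::fabs(a - 0.5) > std::fabs(b - 0.5);
    });
    if (probs.size() > keep)
        probs.resize(keep);

    double spam = 1.0;  // product of p_i
    double ham  = 1.0;  // product of (1 - p_i)
    for (double p : probs) {
        spam *= p;
        ham  *= 1.0 - p;
    }
    return spam / (spam + ham);
}
```

With this rule, one strongly Spam token and one equally strong Ham token cancel out to about 0.5, while several tokens pointing the same way push the result quickly towards 0 or 1.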

void Bayes::BayesClassifier::learn ( ClassificationTable  table,
const char *const  text 
) [inline]

This function splits the given text into tokens and forwards them to the HashTable::learn function of the given table.

Parameters:
table the table into which the given text shall be learned
text the text that shall be learned
See also:
HashTable::learn

void Bayes::BayesClassifier::learn ( ClassificationTable  table,
const string &  text 
)

This function splits the given text into tokens and forwards them to the HashTable::learn function of the given table.

Parameters:
table the table into which the given text shall be learned
text the text that shall be learned
See also:
HashTable::learn

void Bayes::BayesClassifier::reclassify ( ClassificationTable  table,
const char *const  text 
) [inline]

This function splits the given text into tokens and forwards them to the HashTable::learn function of the given table and to the HashTable::unlearn function of the opposite table.

Parameters:
table the table into which the given text shall be learned; the text is unlearned from the opposite table
text the text that shall be reclassified
See also:
HashTable::learn

HashTable::unlearn

void Bayes::BayesClassifier::reclassify ( ClassificationTable  table,
const string &  text 
)

This function splits the given text into tokens and forwards them to the HashTable::learn function of the given table and to the HashTable::unlearn function of the opposite table.

Parameters:
table the table into which the given text shall be learned; the text is unlearned from the opposite table
text the text that shall be reclassified
See also:
HashTable::learn

HashTable::unlearn
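The relationship between learn and reclassify can be illustrated with a small stand-alone model. Everything in it is an assumption made for illustration: TokenTable is a hypothetical stand-in for the project's HashTable (whose real interface is not shown on this page), tokenize simply splits on whitespace, and MiniClassifier is not the real BayesClassifier.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for HashTable: token counts plus example count.
class TokenTable {
public:
    void learn(const std::vector<std::string>& tokens) {
        for (const auto& t : tokens) ++counts_[t];
        ++examples_;
    }
    void unlearn(const std::vector<std::string>& tokens) {
        for (const auto& t : tokens) {
            auto it = counts_.find(t);
            if (it == counts_.end()) continue;  // token was never learned
            assert(it->second > 0);
            if (--it->second == 0) counts_.erase(it);
        }
        if (examples_ > 0) --examples_;
    }
    unsigned count(const std::string& token) const {
        auto it = counts_.find(token);
        return it == counts_.end() ? 0 : it->second;
    }
    unsigned examples() const { return examples_; }
private:
    std::map<std::string, unsigned> counts_;
    unsigned examples_ = 0;
};

// Assumed tokenizer: splits on whitespace only.
std::vector<std::string> tokenize(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> tokens;
    for (std::string t; in >> t;) tokens.push_back(t);
    return tokens;
}

enum ClassificationTable { GOOD, BAD };

// Illustrative model of the learn / reclassify behaviour described above.
struct MiniClassifier {
    TokenTable good, bad;

    void learn(ClassificationTable table, const std::string& text) {
        (table == GOOD ? good : bad).learn(tokenize(text));
    }
    // Learn into the given table and unlearn from the opposite one.
    void reclassify(ClassificationTable table, const std::string& text) {
        const auto tokens = tokenize(text);
        (table == GOOD ? good : bad).learn(tokens);
        (table == GOOD ? bad : good).unlearn(tokens);
    }
};
```

In this model, a text first learned as BAD and then reclassified as GOOD ends up counted only in the good table, which is the correction path one would use after a misclassification.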

void Bayes::BayesClassifier::load ( const char *const  file  )  [inline]

This function loads a previously stored classifier from the given file.

Parameters:
file the input file from which the classifier shall be loaded

void Bayes::BayesClassifier::load ( const std::string &  file  )  [inline]

This function loads a previously stored classifier from the given file.

Parameters:
file the input file from which the classifier shall be loaded

void Bayes::BayesClassifier::load ( const boost::filesystem::path &  file  ) 

This function loads a previously stored classifier from the given file.

Parameters:
file the input file from which the classifier shall be loaded

void Bayes::BayesClassifier::save ( const char *const  file  )  const [inline]

This function stores the current classifier to the given file.

Parameters:
file the output file to which the classifier shall be stored

void Bayes::BayesClassifier::save ( const std::string &  file  )  const [inline]

This function stores the current classifier to the given file.

Parameters:
file the output file to which the classifier shall be stored

void Bayes::BayesClassifier::save ( const boost::filesystem::path &  file  )  const

This function stores the current classifier to the given file.

Parameters:
file the output file to which the classifier shall be stored


The documentation for this class was generated from the following files:
Generated on Sat Feb 10 21:32:39 2007 for bayes-irc by  doxygen 1.5.1