babycart - Estimates spamminess of an arbitrary text stream using SpamAssassin
babycart --usage
$Revision: 1.3 $
Filters arbitrary text through SpamAssassin, returning a comma-delimited response of the form:
verdict,note,score,[rules,...]
where
Empty messages are flagged as OK, as are unprocessed messages.
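A caller will usually want to split that response line back into its fields. The following is only a minimal Python sketch, not part of babycart: it assumes the first three fields never contain commas, that the score is a decimal number, and that the brackets around the rules mark an optional list (they are stripped if they turn out to be literal). The class name, function name, sample verdict and sample line are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BabycartResult:
    verdict: str
    note: str
    score: float
    rules: List[str]


def parse_response(line: str) -> BabycartResult:
    """Split one babycart response line into verdict, note, score and rules."""
    # Assumption: verdict, note and score contain no commas, so every
    # field after the third one is a rule name.
    fields = line.strip().split(",")
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 fields, got {len(fields)}: {line!r}")
    verdict, note, score_text = fields[0], fields[1], fields[2]
    # Assumption: an empty score field is treated as 0.0.
    score = float(score_text) if score_text.strip() else 0.0
    # Strip brackets in case they are emitted literally around the rule list.
    stripped = [name.strip("[] ") for name in fields[3:]]
    rules = [name for name in stripped if name]
    return BabycartResult(verdict, note, score, rules)


# Hypothetical response line; the verdict and note text are invented.
sample = "SUSPICIOUS,Looks spammy,7.2,URIBL_SBL,HTML_MESSAGE"
result = parse_response(sample)
print(result.verdict, result.score, result.rules)
```

Keeping the rule names as a list makes it easy to react only to specific hits, such as SURBL-based rules.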
The goal of babycart is to check wiki and blog posts for spamminess, especially by using SURBL (www.surbl.org) to detect spammy domains in URLs. I didn't want to rewrite the analyzer from scratch or maintain it, so I just used SpamAssassin. SA provides far more interesting rules and more flexibility than I could have put into the code.
A limited amount of metadata can be passed before the actual text, e.g.:
Look at the test cases for examples.
This info isn't analyzed much. It's used to fake up email headers to wrap around the comment so SpamAssassin doesn't carp too much, and it's also prepended to the comment body. That means if a comment spammer is making innocuous (usually nonsense) comments but spamming via his contact URL, you can check that URL against SURBL and set the comment to be moderated.
Oh yeah, that's an important point. You just might have bloggers legitimately commenting on some Viagra story or some other spammy-sounding topic. Use babycart to set posts to be moderated, rather than rejecting them wholesale. Babycart gives you enough rope to blow your foot off, so use it wisely.
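The moderate-don't-reject advice can be sketched as a tiny policy function. This is a hedged Python illustration, not babycart code: it assumes the clean verdict is the literal string "OK" (the only verdict this page names), and the action names "publish" and "moderate" are hypothetical labels for whatever your blog or wiki software uses.

```python
def moderation_action(verdict: str) -> str:
    """Map a babycart verdict to an action, never rejecting a post outright."""
    # Assumption: "OK" is the verdict for clean, empty and unprocessed text.
    if verdict.strip().upper() == "OK":
        return "publish"
    # Anything else is held for a human to review rather than discarded,
    # so a legitimate comment on a spammy-sounding topic is not lost.
    return "moderate"


print(moderation_action("OK"))
print(moderation_action("SUSPICIOUS"))  # hypothetical non-OK verdict
```

There is deliberately no "reject" branch: the worst outcome for a false positive is a delay until a moderator approves the post.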
Usage: babycart [options] < comment.txt
--rules=filename      Specify SpamAssassin rules file
--userprefs=filename  Specify custom SpamAssassin user prefs file
--verbose             Sets verbose mode
--debug               Sets debug mode
--help                Displays this message
--usage               Displays this message
--owner               Shows owner of babycart
--version             Shows version info for babycart
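For callers that are not shell scripts, the usage line above translates into feeding the comment on standard input. The Python sketch below is one way to do that, assuming babycart is installed and on the PATH; the helper names and the file name local.cf are hypothetical, and only options listed above are used.

```python
import subprocess
from typing import List, Optional


def build_command(rules: Optional[str] = None,
                  userprefs: Optional[str] = None,
                  verbose: bool = False) -> List[str]:
    """Assemble a babycart command line from the documented options."""
    cmd = ["babycart"]
    if rules:
        cmd.append(f"--rules={rules}")
    if userprefs:
        cmd.append(f"--userprefs={userprefs}")
    if verbose:
        cmd.append("--verbose")
    return cmd


def check_comment(text: str, **options) -> str:
    """Pipe text to babycart on stdin and return its raw response."""
    # Assumption: babycart is on the PATH and exits non-zero only on failure.
    completed = subprocess.run(
        build_command(**options),
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


# Only the command is shown here, so the sketch runs without babycart installed.
print(build_command(rules="local.cf", verbose=True))
```

Passing the command as a list rather than a shell string avoids quoting problems when file names contain spaces.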
strftime())
None known.
Fix bugs. Truncate comment to specified limit. Disable net tests as well as bayes in config file.
Bob Apthorpe, apthorpe+babycart@cynistar.net
perl(1), Mail::SpamAssassin.