Improving the effectiveness and accuracy of SpamAssassin by updating its rules automatically on Debian
You can improve the default effectiveness and accuracy of SpamAssassin on Debian systems by automatically updating its rules from the official channel and from an additional suggested channel.
This tutorial shows how to update the rules and how to include the Sought rules, which are generated automatically every day from messages caught in spam traps.
Also read the other "Related Content" articles on this site about antispam and SpamAssassin.
vi /etc/default/spamassassin
# Cronjob
# Set to anything but 0 to enable the cron job to automatically update
# spamassassin's rules on a nightly basis
#AFM 20150723 https://wiki.apache.org/spamassassin/ImproveAccuracy
CRON=1
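If you prefer not to edit the file by hand, the same change can be scripted. The following is a minimal sketch, not part of the Debian package: the function name enable_sa_cron is hypothetical, and it assumes the defaults file uses a plain CRON=value line as shown above.

```shell
# enable_sa_cron [FILE]
# Sets CRON=1 in a Debian-style defaults file, appending the line if no
# CRON= setting exists yet. FILE defaults to /etc/default/spamassassin.
# (Hypothetical helper, shown for illustration only.)
enable_sa_cron() (
    file="${1:-/etc/default/spamassassin}"
    if grep -q '^CRON=' "$file"; then
        # Keep a .bak copy, then rewrite the existing setting in place.
        cp -p "$file" "$file.bak" || exit 1
        sed 's/^CRON=.*/CRON=1/' "$file.bak" > "$file"
    else
        printf 'CRON=1\n' >> "$file"
    fi
)
```

Run it as root with no argument to change /etc/default/spamassassin, or pass another path to try it on a copy first.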
mkdir ~/spamassassin
cd ~/spamassassin/
mkdir /etc/spamassassin/sa-update-keys
chmod go-rx /etc/spamassassin/sa-update-keys
mkdir -p ~/temp/etc
cd ~/temp/etc
cp -pr /etc/spamassassin .
ls -lh /var/lib/spamassassin/
ls -lh /var/lib/spamassassin/sa-update-keys/
ls -lh /var/lib/spamassassin/3.004000/
ls -lh /var/lib/spamassassin/3.004000/updates_spamassassin_org/
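The 3.004000 directory in the paths above corresponds to SpamAssassin 3.4.0: the rules directory is named after the installed version, with the minor and patch numbers zero-padded to three digits. If your version differs, the directory name will too. A small sketch to derive it (the function name sa_version_dir is hypothetical, and the version components are assumed to be plain decimal numbers without leading zeros):

```shell
# sa_version_dir VERSION
# Prints the rules directory name for a dotted version such as 3.4.0.
# (Hypothetical helper; components must not have leading zeros, because
# printf would read them as octal.)
sa_version_dir() (
    IFS=.
    # Split MAJOR.MINOR.PATCH into the positional parameters.
    set -- $1
    printf '%d.%03d%03d\n' "$1" "${2:-0}" "${3:-0}"
)
```

For example, ls -lh "/var/lib/spamassassin/$(sa_version_dir 3.4.0)/" lists the same directory as the command above.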
mkdir -p ~/temp/var/lib
cp -pr /var/lib/spamassassin ~/temp/var/lib/
ls -lah ~/temp/var/lib/spamassassin
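The two backups above follow the same pattern: recreate the parent path under ~/temp and copy the tree with its permissions and timestamps. As a sketch, the pattern can be wrapped in one function (backup_tree is a hypothetical name; it assumes the source is given as an absolute path):

```shell
# backup_tree SRC DEST_ROOT
# Copies SRC into DEST_ROOT, recreating SRC's parent path underneath it and
# preserving modes, ownership (when permitted) and timestamps.
# (Hypothetical helper; SRC is assumed to be an absolute path.)
backup_tree() (
    src="$1"
    dest_root="$2"
    parent=$(dirname "$src")
    mkdir -p "$dest_root$parent" || exit 1
    cp -pR "$src" "$dest_root$parent/"
)
```

With this, backup_tree /etc/spamassassin ~/temp and backup_tree /var/lib/spamassassin ~/temp produce the same layout as the commands above.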
wget http://spamassassin.apache.org/updates/GPG.KEY
sa-update --import GPG.KEY
mv GPG.KEY spamassassinGPG.KEY
sa-update --checkonly -v
sa-update -v --channel updates.spamassassin.org
ls -lah /var/lib/spamassassin/3.004000/updates_spamassassin_org
invoke-rc.d spamassassin reload
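When running sa-update by hand it helps to look at its exit status, because "no update available" is reported as a non-zero status and is not an error. The sketch below maps the status to a message. The meanings are taken from the sa-update manual page as the editor understands it for the 3.4 series, so treat them as an assumption and confirm with man sa-update on your system. The function name sa_update_status is hypothetical.

```shell
# sa_update_status CODE
# Prints a human-readable meaning for an sa-update exit status.
# (Hypothetical helper; meanings assumed from the 3.4 manual page.)
sa_update_status() (
    case "$1" in
        0) echo "update available (installed unless --checkonly was used)" ;;
        1) echo "no fresh updates available" ;;
        2) echo "update available, but lint check of site pre files failed" ;;
        3) echo "some channels updated, others failed" ;;
        *) echo "error while downloading or extracting updates" ;;
    esac
)

# Typical use after a manual run:
#   sa-update -v --channel updates.spamassassin.org
#   sa_update_status $?
```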
# You can now install the Sought rules:
wget http://yerp.org/rules/GPG.KEY
sa-update --import GPG.KEY
mv GPG.KEY soughtGPG.KEY
sa-update --checkonly -v
sa-update -v --gpgkey 6C6191E3 --channel sought.rules.yerp.org --channel updates.spamassassin.org
ls -lah /var/lib/spamassassin/3.004000/sought_rules_yerp_org/
invoke-rc.d spamassassin reload
#sa-update && /etc/init.d/spamassassin reload
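The commented-out one-liner above reloads spamd only when sa-update succeeds. The same idea can be sketched as a small wrapper that also reports why no reload happened. The name run_update_then_reload is hypothetical, and both commands are passed in as strings so the wrapper can be tried with harmless placeholders first.

```shell
# run_update_then_reload UPDATE_CMD RELOAD_CMD
# Runs UPDATE_CMD and, only if it exits with status 0, runs RELOAD_CMD.
# Any other status is reported on stderr and returned to the caller.
# (Hypothetical helper, shown for illustration only.)
run_update_then_reload() (
    update_cmd="$1"
    reload_cmd="$2"
    sh -c "$update_cmd"
    status=$?
    if [ "$status" -eq 0 ]; then
        sh -c "$reload_cmd"
    else
        echo "update exited with status $status; not reloading" >&2
        exit "$status"
    fi
)

# Example, using the same commands as in this tutorial:
#   run_update_then_reload \
#     'sa-update -v --gpgkey 6C6191E3 --channel sought.rules.yerp.org --channel updates.spamassassin.org' \
#     'invoke-rc.d spamassassin reload'
```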
less /var/lib/spamassassin/3.004000/sought_rules_yerp_org/20_sought.cf
cat /var/lib/spamassassin/3.004000/updates_spamassassin_org/STATISTICS-set0-72_scores.cf.txt
##### WITH NEW RULES AND SCORES #####
# SUMMARY for threshold 5.0:
# Correctly non-spam: 135863 39.432% (97.611% of non-spam corpus)
# Correctly spam: 149688 43.444% (72.889% of spam corpus)
# False positives: 3325 0.965% (2.389% of nonspam, 146801 weighted)
# False negatives: 55677 16.159% (27.111% of spam, 139536 weighted)
# Average score for spam: 10.0 nonspam: 1.0
# Average for false-pos: 6.0 false-neg: 2.5
# TOTAL: 344553 100.00%
Reading scores from "tmprules"...
Reading per-message hit stat logs and scores...
# SUMMARY for threshold 5.0:
# Correctly non-spam: 16997 97.42%
# Correctly spam: 18797 73.13%
# False positives: 450 2.58%
# False negatives: 6908 26.87%
# TCR(l=50): 0.874082 SpamRecall: 73.126% SpamPrec: 97.662%
##### WITHOUT NEW RULES AND SCORES #####
Reading scores from "../rules-base"...
Reading per-message hit stat logs and scores...
# SUMMARY for threshold 5.0:
# Correctly non-spam: 135534 97.37%
# Correctly spam: 56405 27.47%
# False positives: 3654 2.63%
# False negatives: 148960 72.53%
# TCR(l=50): 0.619203 SpamRecall: 27.466% SpamPrec: 93.916%
Reading scores from "../rules-base"...
Reading per-message hit stat logs and scores...
# SUMMARY for threshold 5.0:
# Correctly non-spam: 17011 97.50%
# Correctly spam: 7152 27.82%
# False positives: 436 2.50%
# False negatives: 18553 72.18%
# TCR(l=50): 0.637003 SpamRecall: 27.823% SpamPrec: 94.254%
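The SpamRecall, SpamPrec and TCR figures in the output above can be recomputed from the raw counts, which is handy for the first summary block, where they are not printed. The sketch below assumes recall = correctly spam / (correctly spam + false negatives), precision = correctly spam / (correctly spam + false positives), and TCR = total spam / (lambda * false positives + false negatives) with lambda = 50. The TCR formula is inferred from the numbers shown, which it reproduces, rather than taken from SpamAssassin documentation. The function name sa_stats_metrics is hypothetical.

```shell
# sa_stats_metrics [LAMBDA]
# Reads STATISTICS summary text on stdin and prints recall, precision and
# TCR once per summary block (when the "False negatives" line is seen).
# LAMBDA is the false-positive weight and defaults to 50.
# (Hypothetical helper; assumes every block contains spam and the lines
# appear in the order shown in the output above.)
sa_stats_metrics() {
    awk -v lambda="${1:-50}" '
        /^# Correctly spam:/  { tp = $4 }
        /^# False positives:/ { fp = $4 }
        /^# False negatives:/ {
            fn = $4
            printf "recall=%.3f%% precision=%.3f%% tcr=%.3f\n",
                100 * tp / (tp + fn), 100 * tp / (tp + fp),
                (tp + fn) / (lambda * fp + fn)
        }
    '
}
```

Feed it one of the statistics files, for example sa_stats_metrics < STATISTICS-set0-72_scores.cf.txt from the updates_spamassassin_org directory, and compare the lines it prints with the TCR lines in the file.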
cat /var/lib/spamassassin/3.004000/updates_spamassassin_org/STATISTICS-set1-72_scores.cf.txt
##### WITH NEW RULES AND SCORES #####
# SUMMARY for threshold 5.0:
# Correctly non-spam: 154663 41.631% (99.548% of non-spam corpus)
# Correctly spam: 106767 28.739% (49.397% of spam corpus)
# False positives: 703 0.189% (0.452% of nonspam, 57031 weighted)
# False negatives: 109374 29.441% (50.603% of spam, 220677 weighted)
# Average score for spam: 8.9 nonspam: -0.5
# Average for false-pos: 5.8 false-neg: 2.0
# TOTAL: 371507 100.00%
Reading scores from "tmprules"...
Reading per-message hit stat logs and scores...
# SUMMARY for threshold 5.0:
# Correctly non-spam: 19456 99.51%
# Correctly spam: 13315 49.17%
# False positives: 95 0.49%
# False negatives: 13766 50.83%
# TCR(l=50): 1.462573 SpamRecall: 49.167% SpamPrec: 99.292%
##### WITHOUT NEW RULES AND SCORES #####
Reading scores from "../rules-base"...
Reading per-message hit stat logs and scores...
# SUMMARY for threshold 5.0:
# Correctly non-spam: 154853 99.67%
# Correctly spam: 87475 40.47%
# False positives: 513 0.33%
# False negatives: 128666 59.53%
# TCR(l=50): 1.400639 SpamRecall: 40.471% SpamPrec: 99.417%
Reading scores from "../rules-base"...
Reading per-message hit stat logs and scores...
# SUMMARY for threshold 5.0:
# Correctly non-spam: 19484 99.66%
# Correctly spam: 10975 40.53%
# False positives: 67 0.34%
# False negatives: 16106 59.47%
# TCR(l=50): 1.391910 SpamRecall: 40.527% SpamPrec: 99.393%
ls -lah /var/lib/spamassassin/3.004000/updates_spamassassin_org
less /var/lib/spamassassin/3.004000/updates_spamassassin_org/50_scores.cf
less /var/lib/spamassassin/3.004000/updates_spamassassin_org/72_scores.cf
Now that you have tested the update manually, adjust the permissions so that the daily spamassassin cron job can update the rules automatically for you:
chown -R debian-spamd:debian-spamd /var/lib/spamassassin
chown -R debian-spamd:debian-spamd /etc/spamassassin/sa-update-keys/
chown -R debian-spamd:debian-spamd /etc/spamassassin/sa-update-hooks.d/
su - debian-spamd -c "/usr/bin/sa-update -v --gpghomedir /var/lib/spamassassin/sa-update-keys"
sh -x /etc/cron.daily/spamassassin
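If the test run as debian-spamd fails with permission errors, it is worth checking that nothing under those directories is still owned by another user. A minimal sketch (list_not_owned_by is a hypothetical name; it relies only on the standard find -user test):

```shell
# list_not_owned_by USER DIR...
# Prints every file or directory under the given DIRs that is NOT owned by
# USER. No output means the ownership is consistent.
# (Hypothetical helper, shown for illustration only.)
list_not_owned_by() (
    user="$1"
    shift
    find "$@" ! -user "$user" -print
)
```

For example, list_not_owned_by debian-spamd /var/lib/spamassassin /etc/spamassassin/sa-update-keys should print nothing after the chown commands above.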
Verify that the job will run daily.
If your machine does not run 24 hours a day, you must install anacron so that missed daily jobs are still executed.
apt-get install anacron
run-parts -v --report /etc/cron.daily
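run-parts silently skips scripts that are not executable or whose names contain characters outside letters, digits, underscores and hyphens (for example a leftover spamassassin.dpkg-old). The following sketch applies those two checks to a single script. The function name cron_daily_would_run is hypothetical, and it only approximates the default run-parts naming rule, so consult man run-parts for the authoritative behaviour.

```shell
# cron_daily_would_run PATH
# Reports whether run-parts would pick up the script at PATH, based on its
# name and executable bit. Returns non-zero if it would be skipped.
# (Hypothetical helper approximating the default run-parts rules.)
cron_daily_would_run() (
    script="$1"
    name=$(basename "$script")
    case "$name" in
        ''|*[!A-Za-z0-9_-]*)
            echo "skipped: invalid name $name"
            exit 1 ;;
    esac
    if [ ! -x "$script" ]; then
        echo "skipped: $script is not executable"
        exit 1
    fi
    echo "ok: $script"
)
```

Calling cron_daily_would_run /etc/cron.daily/spamassassin should print an "ok" line on a system prepared as described above.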
The next day, at 06:25 (the default time for daily cron jobs on Debian), your rules will be updated automatically.
Verify this the next day by reading /var/log/syslog and /var/log/cron.log:
less /var/log/syslog
less /var/log/cron.log
Bibliography
http://www.vivaolinux.com.br/dica/SpamAssassin-Melhorando-a-eficacia-do-...
https://forums.cpanel.net/threads/spamassassin-3-4-improvement-with-upda...
http://taint.org/2007/08/15/004348a.html
http://taint.org/2007/08/04/200125a.html
http://taint.org/2007/04/17/132339a.html
https://wiki.apache.org/spamassassin/ImproveAccuracy
https://wiki.apache.org/spamassassin/NightlyMassCheck
https://wiki.apache.org/spamassassin/UploadedCorpora
https://wiki.apache.org/spamassassin/InstallingDCC
http://www.rhyolite.com/dcc/
http://debian.dev-zero.nl/blog/archives/315
http://debian.dev-zero.nl/blog/about
http://debian.dev-zero.nl/debian/dists/
http://dspam.nuclearelephant.com/
http://sourceforge.net/projects/dspam/files/
http://sourceforge.net/projects/dspam/