Hi everybody,
I'm not quite sure whether this is the right place to post, or whether the "OX ... related" or maybe the administration/configuration subforum (for a part of my post) fits better. If so, admins, please move my post.
When you install Hyperion with the community installer, you will find a Perl script called "spamrunner.pl" linked into your "/etc/cron.hourly" directory. This script is responsible for training the SpamAssassin Bayesian filter with the mails in your "confirmed-spam" and "confirmed-ham" folders. Maybe this script works with the commercial distribution (OX EE), where it seems to come from, but for me it does not work.
There are mainly two problems:
1. If you use login names (login2user.uid) with dots (".") inside, Cyrus will rewrite each dot to "^", e.g. if your login name is "firstname.lastname", Cyrus will use "firstname^lastname" in the file system for your mailbox.
So spamrunner.pl can't find the directory for your mailbox.
2. spamrunner.pl has problems if you use special characters in the folder names for "confirmed-spam" and "confirmed-ham". You can change these names in the "User.properties" configuration file of the OX admindaemon. For example, I use customized German folder names:
- confirmed-spam = "Junk-E-Mail/Bestätigt Spam"
- confirmed-ham = "Junk-E-Mail/Bestätigt Ham"
The German umlauts are the problem. Cyrus encodes folder names in IMAP-UTF-7, which means the directory names will be:
- "Junk-E-Mail/Best&AOQ-tigt Spam" and
- "Junk-E-Mail/Best&AOQ-tigt Ham"
below your mailbox/INBOX.
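To make both problems concrete, here is a minimal sketch of how a login and a folder name could be mapped to the directory Cyrus uses. The function name cyrusFolderDir is hypothetical, and the spool layout (spool root, first letter of the login, then "user/") is an assumption based on my Debian setup; the folder name is assumed to arrive as UTF-8, and the mbstring extension is required.

```php
<?php
// Hypothetical helper for illustration; not part of spamrunner.pl or OX.
// Assumed spool layout: <spool><first letter of login>/user/<login>/<folder>
function cyrusFolderDir($spool, $uid, $folder) {
    // problem 1: Cyrus stores "." in login names as "^" in the file system
    $user = str_replace(".", "^", $uid);
    // problem 2: folder names are stored in IMAP modified UTF-7
    // (input is assumed to be UTF-8)
    $dir = mb_convert_encoding($folder, "UTF7-IMAP", "UTF-8");
    return $spool . substr($uid, 0, 1) . "/user/" . $user . "/" . $dir;
}

// "\xC3\xA4" is the UTF-8 byte sequence for the umlaut "ä"
echo cyrusFolderDir("/var/spool/cyrus/mail/", "firstname.lastname",
                    "Junk-E-Mail/Best\xC3\xA4tigt Spam") . "\n";
```

With the example values from above, this should yield a path ending in "firstname^lastname/Junk-E-Mail/Best&AOQ-tigt Spam".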
I tried to fix both problems directly in Perl, but I'm not a Perl programmer, and Perl does not seem to know the "IMAP-UTF-7" encoding (Perl's Encode module only has "UTF-7", which uses "+" instead of "&" as the shift character). Installing an old Perl extension, "Encode::IMAPUtf7", did not solve the problem either, because the extension seems to be faulty.
So I decided to rewrite the script as a PHP-CLI command line script. Here is the result:
PHP Code:
#! /usr/bin/php
<?php
$DATASOURCE = "mysql:host=localhost;dbname=open-xchange-db";
$DRIVER     = "mysql";
$CFPROP     = "/opt/open-xchange/etc/admindaemon/configdb.properties";
$IMAPPROP   = "/etc/imapd.conf";
$CYRSPOOL   = "/var/spool/cyrus/mail/";

# DO NOT CHANGE BELOW
# -------------------------------------------------------------------
$QUERY = 'SELECT login2user.uid,user_setting_mail.bits,
                 user_setting_mail.confirmed_spam,
                 user_setting_mail.confirmed_ham
          FROM user_setting_mail,login2user
          WHERE user_setting_mail.user=login2user.id
          AND login2user.cid=1';
$SPAM_ENABLED_BIT = 4096;
$DBUSER   = FALSE;
$DBPASS   = FALSE;
$HIERSEP  = FALSE;
$USEUTF8  = FALSE;
$ENCODING = FALSE;
# Cyrus replaces "." in user names by "^" in the file system
$UIDSEP   = array(".", "^");

# make sure the PDO driver for MySQL is available
$mysqldriver = FALSE;
foreach (PDO::getAvailableDrivers() AS $entry) {
    # compare with "=="; a single "=" would assign and always be true
    if ($entry == $DRIVER) {
        $mysqldriver = $entry;
    }
}
if ($mysqldriver === FALSE) {
    die("mysqldriver not installed, exiting");
}
# read the database credentials from the OX configuration
# (the FILE_TEXT flag is not available in current PHP versions and is not needed)
$PROP = file_get_contents($CFPROP) or die("unable to open $CFPROP");
if (preg_match("/readProperty.1=user=(.*)/", $PROP, $tmp) == 1) {
    $DBUSER = $tmp[1];
}
if (preg_match("/readProperty.2=password=(.*)/", $PROP, $tmp) == 1) {
    $DBPASS = $tmp[1];
}
if (preg_match("/readProperty.3=useUnicode=(.*)/", $PROP, $tmp) == 1) {
    $USEUTF8 = $tmp[1];
}
if (preg_match("/readProperty.4=characterEncoding=(.*)/", $PROP, $tmp) == 1) {
    $ENCODING = $tmp[1];
}

# read the IMAP hierarchy separator from the Cyrus configuration
$PROP = file_get_contents($IMAPPROP) or die("unable to open $IMAPPROP");
if (preg_match("/unixhierarchysep:\s*(\w*)\s*/", $PROP, $tmp) == 1) {
    $sep = strtoupper($tmp[1]);
    if ( ($sep == "YES") || ($sep == "1") || ($sep == "ON") ) {
        $HIERSEP = "/";
    } else {
        $HIERSEP = ".";
    }
}
if ( ($HIERSEP === FALSE) || ($DBUSER === FALSE) || ($DBPASS === FALSE) || ($USEUTF8 === FALSE) || ($ENCODING === FALSE) ) {
    die("unable to determine required system parameters");
}
#print "using \"$HIERSEP\" as IMAP separator\n";
#print "using \"$DBUSER\" as db user\n";
try {
    $dbh = new PDO($DATASOURCE, $DBUSER, $DBPASS);
    $stmt = $dbh->prepare($QUERY);
    if ($stmt->execute()) {
        while (list($uid, $bits, $cspam, $cham) = $stmt->fetch(PDO::FETCH_NUM)) {
            # print "$uid $bits $cspam $cham\n";
            if ( ($bits & $SPAM_ENABLED_BIT) == $SPAM_ENABLED_BIT ) {
                #print "checking for spam and ham for $uid\n";
                $cspam = mb_convert_encoding($cspam, "UTF7-IMAP");
                $cham = mb_convert_encoding($cham, "UTF7-IMAP");
                $userdir = $CYRSPOOL.substr($uid, 0, 1)."/user/".str_replace($UIDSEP[0], $UIDSEP[1], $uid);
                $cspamdir = $userdir."/".$cspam;
                $chamdir = $userdir."/".$cham;
                # learn spam
                if (file_exists($cspamdir)) {
                    $foundfiles = 0;
                    $fileList = getListOfMails($cspamdir);
                    foreach ($fileList as $file) {
                        $foundfiles++;
                        $file = $cspamdir."/".$file;
                        pipeSALearn("spam", $file, $uid);
                        unlink($file);
                    }
                    if ($foundfiles > 0) {
                        cyrReconstruct($uid, $cspam);
                    }
                }
                # learn ham
                if (file_exists($chamdir)) {
                    $foundfiles = 0;
                    $fileList = getListOfMails($chamdir);
                    foreach ($fileList as $file) {
                        $foundfiles++;
                        $file = $chamdir."/".$file;
                        pipeSALearn("ham", $file, $uid);
                        unlink($file);
                    }
                    if ($foundfiles > 0) {
                        cyrReconstruct($uid, $cham);
                    }
                }
            }
        }
    }
    $dbh = null;
} catch (Exception $e) {
    echo "Failed: " . $e->getMessage();
    $dbh = null;
}
# returns an array containing the file names of all mails in folder
#
function getListOfMails($folder) {
    $mlist = array();
    if (is_dir($folder)) {
        if ($dh = opendir($folder)) {
            while (($file = readdir($dh)) !== false) {
                $name = $folder . "/" . $file;
                # Cyrus message files are named "<number>."
                # (preg_match replaces ereg, which was removed in PHP 7)
                if ( (is_file($name)) && (preg_match('/^[0-9]{1,20}\.$/', $file)) ) {
                    $mlist[] = $file;
                }
            }
            closedir($dh);
        }
    }
    return($mlist);
}
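# pipe mail into sa-learn
# arg1 == spam or ham
# arg2 == abs path to file
# arg3 == login name of the user to run sa-learn as
#
function pipeSALearn($type, $file, $uid) {
    # quote the login name so that the shell cannot misinterpret it
    $sacmd = "/bin/su ".escapeshellarg($uid)." -c \"/usr/bin/sa-learn --$type --no-sync\" 2> /dev/null";
    $SAOUT = popen($sacmd, "w");
    if ($SAOUT) {
        $co = file($file);
        foreach ($co as $row) {
            fwrite($SAOUT, $row);
        }
        # handles opened with popen() are closed with pclose()
        pclose($SAOUT);
    } else {
        die("unable to start sa-learn");
    }
}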
# calling cyrus reconstruct for specific folder
# arg1 == user name
# arg2 == folder name
#
function cyrReconstruct($user, $folder) {
    global $HIERSEP;
    $cyrcmd = "/bin/su cyrus -c \"/usr/sbin/cyrreconstruct -r user".
        escapeshellarg($HIERSEP.$user.$HIERSEP.$folder)."\"";
    # exec() returns the last line of output, which can be empty even on
    # success, so check the exit status instead
    exec($cyrcmd, $output, $status);
    if ($status !== 0) {
        die("unable to start cyrus reconstruct");
    }
}
?>
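The script is as close to the original "spamrunner.pl" as I could make it. To activate the script, install "PHP-CLI", PDO and "pdo_mysql" on your server. Copy the script to a location of your choice and link it into "/etc/cron.hourly" (on Debian, give the link a name without a dot, because run-parts skips file names that contain one). Then make it executable: "chmod 755 spamrunner.php".
But there is more to do to fully activate spam training if you installed with the community installer. At least I had to do more on Debian 4.0r3 with Hyperion from 2008-02-06.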
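Before linking the script into cron, it can help to confirm that the PHP command-line interpreter has everything the script relies on: PDO with the MySQL driver, plus mbstring for mb_convert_encoding. The following is only a sketch of such a pre-flight check; the function name missingExtensions is hypothetical and not part of spamrunner.php.

```php
<?php
// Hypothetical pre-flight check; not part of spamrunner.php.
// Returns the names of the required extensions that are not loaded.
function missingExtensions($required) {
    $missing = array();
    foreach ($required as $ext) {
        if (!extension_loaded($ext)) {
            $missing[] = $ext;
        }
    }
    return $missing;
}

if (php_sapi_name() !== "cli") {
    echo "warning: not running under PHP-CLI\n";
}
$missing = missingExtensions(array("PDO", "pdo_mysql", "mbstring"));
if (count($missing) > 0) {
    echo "missing PHP extensions: " . implode(", ", $missing) . "\n";
} else {
    echo "all required PHP extensions are loaded\n";
}
```

Run it once by hand with the same "php" binary that cron will use; if it reports a missing extension, install that extension first.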
Here is a short description of what I did:
Activate Spam training
1. Install Hyperion with the community edition installer.
2. Modify the following files in /opt/open-xchange/etc/admindaemon:
- In User.properties, change the *_MAILFOLDER variables as you like.
- In User.properties, set "UID_NUMBER_START" to a value >0, e.g. 5000.
- In Group.properties, set "GID_NUMBER_START" to a value >0, e.g. 5000.
This will activate real UIDs/GIDs for your users and groups. Be careful to choose a start value for both that is not already in use on your system (see /etc/passwd and /etc/group). If these variables are set to 0, OX will use the same UID/GID for all users and groups.
- In User.properties, set "CREATE_HOMEDIRECTORY=true", because SpamAssassin saves the Bayesian data in the home directories of the users.
Then restart OX! The changes take effect only after a restart, and only for newly created users.
If you already have a running Hyperion (with existing users and groups), you will have to modify the MySQL database manually (assign uidNumber/gidNumber values and create the home directories).
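To pick safe start values for UID_NUMBER_START and GID_NUMBER_START, you can check which IDs in the intended range are already taken. Here is a minimal sketch, assuming the usual colon-separated format of /etc/passwd and /etc/group with the numeric ID in the third field; the function name idsInRange and the range 5000 to 5999 are only examples.

```php
<?php
// Hypothetical helper; not part of OX or spamrunner.php.
// Returns name => id for every passwd/group-style entry whose numeric ID
// (third field) lies between $first and $last, inclusive.
function idsInRange($contents, $first, $last) {
    $found = array();
    foreach (explode("\n", $contents) as $line) {
        $fields = explode(":", $line);
        // skip blank lines, comments and entries without a numeric ID
        if (count($fields) < 3 || !ctype_digit($fields[2])) {
            continue;
        }
        $id = (int) $fields[2];
        if ($id >= $first && $id <= $last) {
            $found[$fields[0]] = $id;
        }
    }
    return $found;
}

// example range only; adjust it to the start value you plan to use
foreach (array("/etc/passwd", "/etc/group") as $file) {
    if (is_readable($file)) {
        $taken = idsInRange(file_get_contents($file), 5000, 5999);
        echo $file . ": " . count($taken) . " IDs in use between 5000 and 5999\n";
    }
}
```

If both files report 0 IDs in use, the range is free of local accounts. Note that this only looks at the local files, not at users that come from other NSS sources.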
3. libnss-mysql was not correctly configured on my system after the installation. You need to do the following things:
- In "/etc/nsswitch.conf" modify the lines
Code:
passwd: compat
group: compat
shadow: compat
to
Code:
passwd: compat mysql
group: compat mysql
shadow: compat mysql
- In "/etc/libnss-mysql.cfg" modify the line:
Code:
getspnam SELECT login2user.uid,'x',user.uidNumber,user.gidNumber,user.shadowLastChange,0,0,-1,-1,'A' FROM login2user,user WHERE login2user.cid=1 AND login2user.id=user.id AND login2user.uid='%1$s' LIMIT 1
to
Code:
getspnam SELECT login2user.uid,'x',user.shadowLastChange,0,0,-1,-1,-1,'A' FROM login2user,user WHERE login2user.cid=1 AND login2user.id=user.id AND login2user.uid='%1$s' LIMIT 1
That's it. After this, "spamrunner.php" will do the job.
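Since the script runs sa-learn through "su" as the OX user, the system must be able to resolve that login through NSS. A quick way to check this from PHP is sketched below. It assumes the posix extension is available; the function name lookupSystemUser and the example login are hypothetical.

```php
<?php
// Hypothetical check; not part of spamrunner.php.
// Returns the passwd entry for $login as seen through NSS, or FALSE if the
// user cannot be resolved (or the posix extension is not available).
function lookupSystemUser($login) {
    if (!function_exists("posix_getpwnam")) {
        return FALSE;
    }
    return posix_getpwnam($login);
}

$login = "firstname.lastname";   // assumption: replace with a real OX login
$entry = lookupSystemUser($login);
if ($entry === FALSE) {
    echo "$login is not visible through NSS\n";
} else {
    echo "$login has uid " . $entry["uid"] . " and home " . $entry["dir"] . "\n";
}
```

If an OX user is reported as not visible, recheck the nsswitch.conf and libnss-mysql.cfg changes above before expecting the cron job to work.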
One last remark: "spamrunner.php" will remove the Spam/Ham files after feeding them to sa-learn.
Best regards,
Eike