Mail::SpamAssassin::PerMsgStatus(3pm) | User Contributed Perl Documentation | Mail::SpamAssassin::PerMsgStatus(3pm) |
Mail::SpamAssassin::PerMsgStatus - per-message status (spam or not-spam)
    my $spamtest = Mail::SpamAssassin->new({
        'rules_filename'     => '/etc/spamassassin.rules',
        'userprefs_filename' => $ENV{HOME}.'/.spamassassin/user_prefs'
    });
    my $mail = $spamtest->parse();
    my $status = $spamtest->check($mail);

    my $rewritten_mail;
    if ($status->is_spam()) {
        $rewritten_mail = $status->rewrite_mail();
    }
    ...
The Mail::SpamAssassin "check()" method returns an object of this class. This object encapsulates all the per-message state.
- rules with tflags set to 'learn' (the Bayesian rules)
- rules with tflags set to 'userconf' (user white/black-listing rules, etc)
- rules with tflags set to 'noautolearn'
Also note that auto-learning occurs using scores from either scoreset 0 or 1, depending on what scoreset is used during message check. It is likely that the message check and auto-learn scores will be different.
If a parameter of collapsed or dbg is passed, the output will be a condensed array of sub-tests with multiple hits reduced to one entry.
If the parameter of dbg is passed, the output will be a condensed string of sub-tests with multiple hits reduced to one entry with the number of hits in parentheses. Some information is also added at the end regarding the multiple hits.
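The collapsing described above can be sketched in plain Perl. This is only an illustration of the idea, not the module's code: the helper name collapse_hits, the sub-test names, and the exact "NAME(count)" layout are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: reduce repeated sub-test names to one entry each,
# keeping first-seen order and appending the hit count when it exceeds 1.
sub collapse_hits {
    my @hits = @_;
    my (%count, @order);
    for my $name (@hits) {
        # Remember the name the first time we see it, then count every hit.
        push @order, $name unless $count{$name}++;
    }
    return map { $count{$_} > 1 ? "$_($count{$_})" : $_ } @order;
}

# Made-up sub-test names, used for illustration only.
my @collapsed = collapse_hits(qw(__SUB_A __SUB_B __SUB_A __SUB_A));
print join(',', @collapsed), "\n";   # prints: __SUB_A(3),__SUB_B
```

A list is returned so that the caller can either keep the array form or join it into a single string, matching the two output styles mentioned above.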
If the message is flagged with auto_learn_force, the returned value will also include that status and the rules hit. For example: "autolearn_force=yes (AUTOLEARNTEST_BODY)"
The report is returned as a multi-line string, with the lines separated by "\n" characters.
This is returned as a multi-line string, with the lines separated by "\n" characters, containing a fully-decoded, safe, plain-text sample of the first few lines of the message body.
The actual modifications depend on the configuration (see "Mail::SpamAssassin::Conf" for more information).
The possible modifications are as follows:
If report_safe is set to false (0), then the message will only have the above headers added/modified.
$value can be a simple scalar (string or number), or a reference to an array, in which case the public method get_tag will join array elements using a space as a separator, returning a single string for backward compatibility.
$value can also be a subroutine reference, which will be evaluated each time the template is expanded. The first argument passed by get_tag to a called subroutine will be a PerMsgStatus object (this module's object), followed by any optional arguments provided by the caller of get_tag.
Note that Perl supports closures, which means that variables set in the caller's scope can be accessed inside this "sub". For example:
    my $text = "hello world!";
    $status->set_tag("FOO", sub {
        my $pms = shift;
        return $text;
    });
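The three kinds of $value can be modelled with a small stand-alone sketch. It does not use Mail::SpamAssassin at all: toy_set_tag, toy_get_tag and the %tags store are hypothetical stand-ins that only mimic the behaviour described here (scalars returned as-is, array references joined with a space, code references called with the status object first).

```perl
use strict;
use warnings;

my %tags;   # toy tag store; an assumption, not the module's internal state

sub toy_set_tag {
    my ($name, $value) = @_;
    $tags{$name} = $value;
    return;
}

sub toy_get_tag {
    my ($pms, $name, @args) = @_;
    my $value = $tags{$name};
    return unless defined $value;                           # undefined tag
    return $value->($pms, @args) if ref $value eq 'CODE';   # evaluated on each call
    return join(' ', @$value)    if ref $value eq 'ARRAY';  # joined with a space
    return $value;                                          # simple scalar
}

my $text = "hello world!";
toy_set_tag('FOO', sub { my $pms = shift; return $text; });
toy_set_tag('BAR', [ 'a', 'b', 'c' ]);

# The closure reads $text when it is called, so a later change is visible.
$text = "changed later";
print scalar(toy_get_tag(undef, 'FOO')), "\n";   # prints: changed later
print scalar(toy_get_tag(undef, 'BAR')), "\n";   # prints: a b c
```

Because the subroutine is evaluated at expansion time, a tag set this way always reflects the current value of the captured variable rather than a snapshot taken when the tag was set.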
See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS" section for more details on how template tags are used.
"undef" will be returned if a tag by that name has not been defined.
    Jul 17 14:10:47 radish spamd[16670]: spamd: result: Y 22 - ALL_NATURAL,
    DATE_IN_FUTURE_03_06,DIET_1,DRUGS_ERECTILE,DRUGS_PAIN,
    TEST_FORGED_YAHOO_RCVD,TEST_INVALID_DATE,TEST_NOREALNAME,
    TEST_NORMAL_HTTP_TO_IP,UNDISC_RECIPS scantime=0.4,size=3138,user=jm,
    uid=1000,required_score=5.0,rhost=localhost,raddr=127.0.0.1,
    rport=33153,mid=<9PS291LhupY>,autolearn=spam
"name" and "VALUE" must not contain "=" or "," characters, as it is important that these log lines are easy to parse.
The code reference will be called by spamd after the message has been scanned, and the "PerMsgStatus::check()" method has returned.
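A code reference of the kind described above could be built along the following lines. The helper result_item and the example entry are hypothetical; the sketch only demonstrates the "name=VALUE" shape and the ban on "=" and "," characters, and it runs without spamd.

```perl
use strict;
use warnings;

# Hypothetical helper: build a "name=VALUE" entry and refuse the two
# characters that would make the log line ambiguous to parse.
sub result_item {
    my ($name, $value) = @_;
    for my $part ($name, $value) {
        die "invalid character in '$part'\n" if $part =~ /[=,]/;
    }
    return "$name=$value";
}

# The code reference itself; in real use it would be handed to the
# method documented in this section and called later by spamd.
my $subref = sub { return result_item('lang', 'en'); };

print $subref->(), "\n";   # prints: lang=en
```

Validating inside the helper means a bad value fails loudly when the entry is produced, instead of silently corrupting the result line.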
If you are using SpamAssassin in a persistent environment, or checking many mail messages from one "Mail::SpamAssassin" factory, this method should be called to ensure Perl's garbage collection will clean up old status objects.
This is the same result text as used in 'rawbody' rules.
It is returned as an array of strings, with each string being a 2-4kB chunk of the body, split at boundaries where possible.
This is the same result text as used in 'body' rules.
It will always render text/html.
It is returned as an array of strings, with each string representing one 'paragraph'. Paragraphs, in plain-text mails, are double-newline-separated blocks of multi-line text.
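The double-newline convention for plain-text mail can be illustrated with a short sketch. It is not the module's renderer, which does considerably more (decoding, HTML rendering); split_paragraphs is a hypothetical helper showing only the paragraph boundary rule.

```perl
use strict;
use warnings;

# Hypothetical helper: split plain text into paragraphs at runs of two
# or more newlines, dropping any chunk that is only whitespace.
sub split_paragraphs {
    my ($text) = @_;
    return grep { /\S/ } split /\n{2,}/, $text;
}

my @paras = split_paragraphs("Line one\nline two\n\nSecond block\n\n\nThird\n");
print scalar(@paras), "\n";   # prints: 3
```

Note that a single newline inside a block does not start a new paragraph, so "Line one\nline two" stays together as one multi-line entry.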
Appending ":raw" to the header name will inhibit decoding of quoted-printable or base-64 encoded strings.
Appending a modifier ":addr" to a header field name will cause everything except the first email address to be removed from the header field. It is mainly applicable to header fields 'From', 'Sender', 'To', 'Cc' along with their 'Resent-*' counterparts, and the 'Return-Path'. For example, all of the following will result in "example@foo":
Appending a modifier ":name" to a header field name will cause everything except the first display name to be removed from the header field. It is mainly applicable to header fields containing a single mail address: 'From', 'Sender', along with their 'Resent-From' and 'Resent-Sender' counterparts. For example, all of the following will result in "Foo Blah". One level of single quotes is stripped too, as it is often seen.
There are several special pseudo-headers that can be specified:
The returned array will include the "raw" URI as well as "slightly cooked" versions. For example, the single URI 'http://%77w%77.example.com/' will get turned into: ( 'http://%77w%77.example.com/', 'http://www.example.com/' )
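The relationship between the raw and the "slightly cooked" form in this example comes down to percent-decoding. A minimal sketch, assuming that simple decoding is enough for this one URI (the module's real canonicalization handles many more cases); uri_variants is a hypothetical helper:

```perl
use strict;
use warnings;

# Hypothetical helper: return the raw URI plus a percent-decoded copy
# when decoding actually changes something.
sub uri_variants {
    my ($raw) = @_;
    (my $decoded = $raw) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $decoded eq $raw ? ($raw) : ($raw, $decoded);
}

my @variants = uri_variants('http://%77w%77.example.com/');
print "$_\n" for @variants;
# prints:
#   http://%77w%77.example.com/
#   http://www.example.com/
```

Keeping the raw form in the list matters because rules may want to match on the obfuscation itself as well as on the decoded host.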
The hash format looks something like this:
    raw_uri => {
        types => { a => 1, img => 1, parsed => 1, domainkeys => 1,
                   unlinked => 1, schemeless => 1 },
        cleaned => [ canonicalized_uri ],
        anchor_text => [ "click here", "no click here" ],
        domains => { domain1 => 1, domain2 => 1 },
        hosts => { host1 => domain1, host2 => domain2 },
    }
"raw_uri" is whatever the URI was in the message itself (http://spamassassin.apache%2Eorg/). URIs parsed from text will be prefixed with a scheme if one is missing (http://, mailto: etc). HTML URIs are as found.
"types" is a hash of the HTML tags (lowercase) that referenced the raw_uri. parsed is a fake type indicating that the raw_uri was seen in the rendered text. domainkeys is set when the raw_uri came from a DK/DKIM d= field. unlinked is set when the MUA is assumed not to linkify the URI (found in the body without a scheme or www. prefix). schemeless is always added for URIs without a scheme, regardless of linkifying (e.g. an email address found in the body without mailto:).
"cleaned" is an array of the raw and canonicalized version of the raw_uri (http://spamassassin.apache%2Eorg/, https://spamassassin.apache.org/).
"anchor_text" is an array of the anchor text (text between <a> and </a>), if any, which linked to the URI.
"domains" is a hash of the domains found in the canonicalized URIs.
"hosts" is a hash of unstripped hostnames found in the canonicalized URIs as hash keys, with their domain part stored as a value of each hash entry.
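To show how the structure described above might be consumed, here is a sketch that walks a hand-built hash in the documented shape. The sample data is invented for illustration, and no Mail::SpamAssassin call is made; in real use the hash would come from the status object.

```perl
use strict;
use warnings;

# Hand-built sample in the documented layout (illustrative data only).
my %detail = (
    'http://www.example%2Ecom/' => {
        types       => { a => 1, parsed => 1 },
        cleaned     => [ 'http://www.example%2Ecom/', 'http://www.example.com/' ],
        anchor_text => [ 'click here' ],
        domains     => { 'example.com' => 1 },
        hosts       => { 'www.example.com' => 'example.com' },
    },
    'mailto:user@example.org' => {
        types   => { parsed => 1 },
        cleaned => [ 'mailto:user@example.org' ],
        domains => { 'example.org' => 1 },
        hosts   => { 'example.org' => 'example.org' },
    },
);

# Collect the domains of URIs that were referenced from an <a> tag.
my %linked_domains;
for my $raw (sort keys %detail) {
    my $info = $detail{$raw};
    next unless $info->{types}{a};
    $linked_domains{$_} = 1 for keys %{ $info->{domains} };
}

print join(',', sort keys %linked_domains), "\n";   # prints: example.com
```

Filtering on the types hash first is a cheap way to separate links the recipient can click from URIs that merely appear in the rendered text.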
"raw_uri" is the URI to be added. The only required parameter.
"types" is an optional hash reference whose contents are added to uri_detail_list->{types} (see get_uri_detail_list for known keys). parsed is the default if no hash is given. nocanon skips uri_list_canonicalize (no redirector handling, no URI fixing). noclean skips adding uri_detail_list->{cleaned}, so the URI would not be used in "uri" rule checks, but its domains/hosts would still be used for URIBL/RBL purposes.
"source" is an optional simple string, used only for debug logging to identify where the URI originated (default: "parsed").
"valid_domain" is an optional boolean (0/1). If true, the URI will not be added unless its hostname/domain is in a valid format and contains a valid TLD. (default: 0)
There are two mandatory arguments. These are $rulename, the name of the rule that fired, and $desc_prepend, which is a short string that will be prepended to the rule's "describe" string in output reports.
In addition, callers can supplement that with the following optional data:
Backward compatibility: the two mandatory arguments have been part of this API since SpamAssassin 2.x. The optional name=>value pairs, however, are a new addition in SpamAssassin 3.2.0.
Note: This can only be called once until $status->delete_fulltext_tmpfile() is called.
2021-05-30 | perl v5.32.1 |