parp TODO list
==============

In approximate order of decreasing priority ...

* Watch out for spammers taking the bait

And ditch alt_addresses

* Change shutdown file to 'alive' file

* Rework filtering mechanism

to be much more general and flexible.  Key points are:

** use exceptions

Use exceptions to distinguish between terminating and non-terminating
`recipes'.
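
A rough sketch of how that might look.  Only the Parp::TerminateFilter
name comes from these notes; the throw() helper, run_recipes() and the
`reason' field are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical exception class; only its name comes from the notes above.
package Parp::TerminateFilter;
sub new   { my ($class, %args) = @_; return bless {%args}, $class }
sub throw { my $class = shift; die $class->new(@_) }

package main;
use Scalar::Util qw(blessed);

# Run each recipe in turn.  A recipe that returns normally is
# non-terminating; one that throws Parp::TerminateFilter ends filtering.
sub run_recipes {
  my ($mail, @recipes) = @_;
  for my $recipe (@recipes) {
    my $ok = eval { $recipe->($mail); 1 };
    next if $ok;
    my $err = $@;
    return $err->{reason}
      if blessed($err) && $err->isa('Parp::TerminateFilter');
    die $err;                   # anything else is a genuine error
  }
  return;                       # fell off the end: no recipe terminated
}

my @seen;
my $reason = run_recipes(
  { subject => 'hello' },
  sub { push @seen, 'tag' },    # non-terminating
  sub { push @seen, 'stop'; Parp::TerminateFilter->throw(reason => 'matched') },
  sub { push @seen, 'never reached' },
);
print "terminated: $reason after @seen\n";
```

The third recipe never runs, because the second one terminated the
filter.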

** rework current accept/reject methods

*** new low-level filtering methods

Neither should be called from user code.

**** $m->_deliver(@dest_folders)

Marks the mail for delivery to @dest_folders.

**** $m->_terminate_filter(@dest_folders)

Throws a Parp::TerminateFilter exception.

*** new higher-level filtering methods

**** $m->accept($dest_folder, $reason, @details)

Instantiates a Parp::Reason from $reason and @details.
Registers acceptance in the log file and mail header X-Parp-Accepted.
Delivers to $dest_folder.

**** $m->final_accept($dest_folder, $reason, @details)

sub final_accept {
  my $self = shift;
  $self->accept(@_);            # log, tag and mark for delivery
  $self->_terminate_filter();   # throws Parp::TerminateFilter
}

**** $m->reject_junk_mail($reason)

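sub reject_junk_mail {
  my $self = shift;
  # FIXME: body copied from final_accept; accept() expects a
  # destination folder first, but only $reason is passed in here.
  $self->accept(@_);
  $self->_terminate_filter();
}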

**** $m->categorize(@new_categories) and others

Adds mail to given categories, which must be pre-registered:

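{
  my %categories = ();

  # Registers $category and, if $method is given, installs a predicate
  # method of that name (e.g. is_spam) in the calling class.
  sub register_category {
    my $self = shift;
    my ($category, $method) = @_;
    $categories{$category}++;
    if ($method) {
      my $class = ref($self) || $self;   # works for objects and class names
      no strict 'refs';
      *{"${class}::$method"} = sub {
        my $mail = shift;
        return $mail->{categories}{$category};
      };
    }
  }

  # Called as a class method so that $self is not eaten by 'spam'.
  __PACKAGE__->register_category('spam', 'is_spam');

  sub categorize {
    my $self = shift;
    my (@new_categories) = @_;
    for my $category (@new_categories) {
      die "unregistered category '$category'" unless $categories{$category};
      $self->{categories}{$category} = 1;
    }
  }

  sub decategorize {
    my $self = shift;
    my (@existing_categories) = @_;
    delete $self->{categories}{$_} for @existing_categories;
  }
}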

*** delivery happens *after* filtering has finished
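
Something along these lines, perhaps.  _deliver() is the method named
above; the Sketch::Mail class, its `pending' and `delivered' fields and
flush_deliveries() are invented for the sketch, and the actual writing
to folders is left out:

```perl
use strict;
use warnings;

package Sketch::Mail;   # hypothetical stand-in for parp's mail object

sub new {
  my $class = shift;
  return bless { pending => [], delivered => [] }, $class;
}

# Only marks the mail for delivery; nothing is written yet.
sub _deliver {
  my ($self, @dest_folders) = @_;
  push @{ $self->{pending} }, @dest_folders;
}

# Called once, after filtering has finished.
sub flush_deliveries {
  my $self = shift;
  my %seen;
  for my $folder (grep { !$seen{$_}++ } @{ $self->{pending} }) {
    # Real code would append the mail to $folder here.
    push @{ $self->{delivered} }, $folder;
  }
  $self->{pending} = [];
}

package main;

my $m = Sketch::Mail->new;
$m->_deliver('inbox');
$m->_deliver('lists', 'inbox');   # duplicate is collapsed at delivery time
print "before flush: ", scalar @{ $m->{delivered} }, " deliveries\n";
$m->flush_deliveries;
print "after flush: @{ $m->{delivered} }\n";
```

Deferring the write means a later recipe can still add or (with a bit
more code) remove destinations before anything hits disk.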

* Deal with multiple calls to parp -w

* fake_mbox_from FIXME

* other FIXMEs

* Allow easy forwarding to another address

* More probabilistic/neural network based approach

... rather than the current Boolean one.

I'm still undecided about this one.

My current thinking is that it is best to keep the black-and-white
Boolean logic of the existing spam-or-not decision tree, but that
replacing the current ugly max_quite_bad_words and
max_unique_quite_bad_words hack with ifile's algorithm in
has_spam_content() would probably work an absolute treat.

  http://www.ai.mit.edu/~jrennie/ifile/

ifile depends on mh.  Yuk.
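
As far as I recall, ifile is a naive Bayes classifier, so a
self-contained scorer in that style might be enough for
has_spam_content() without dragging in mh.  This is a toy sketch, not
ifile's actual code; the function names and the training data are
invented:

```perl
use strict;
use warnings;

my (%count, %total, %docs);   # word counts, total words and mails per class

sub tokens {
  my ($text) = @_;
  $text = lc $text;
  return $text =~ /[a-z']+/g;
}

sub train {
  my ($class, $text) = @_;
  $docs{$class}++;
  for my $word (tokens($text)) {
    $count{$class}{$word}++;
    $total{$class}++;
  }
}

# Log-probability of $text under $class, with add-one smoothing.
sub score {
  my ($class, $text) = @_;
  my %vocab;
  $vocab{$_} = 1 for map { keys %$_ } values %count;
  my $vocab_size = keys %vocab;
  my $all_docs = 0;
  $all_docs += $_ for values %docs;
  my $score = log($docs{$class} / $all_docs);
  for my $word (tokens($text)) {
    my $n = $count{$class}{$word} || 0;
    $score += log(($n + 1) / ($total{$class} + $vocab_size));
  }
  return $score;
}

sub classify {
  my ($text) = @_;
  my ($best) = sort { score($b, $text) <=> score($a, $text) } keys %docs;
  return $best;
}

train(spam => 'buy cheap pills now');
train(spam => 'cheap loans buy now');
train(ham  => 'patch for the parp daemon attached');
train(ham  => 'lunch now or later');

print classify('cheap pills'), "\n";
print classify('parp patch attached'), "\n";
```

The decision tree would stay Boolean: has_spam_content() would simply
return true when the best class is `spam'.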

Just spotted this:

  http://spamassassin.taint.org/

Looks quite nice, supposed to be very accurate too.  Might have
to try it out, pinch some of the ideas ;-)

* Calc some of $m->{from} etc. on demand?
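
One way to do it would be to hide the hash fields behind accessors
which parse on first use and cache the result.  Sketch::LazyMail, its
`raw' field and the address are made up; the real mail object no doubt
looks different:

```perl
use strict;
use warnings;

package Sketch::LazyMail;   # hypothetical; not parp's real mail class

my $parses = 0;             # counts how often the header is actually parsed

sub new { my ($class, $raw) = @_; return bless { raw => $raw }, $class }

# Parse the From: header the first time it is asked for, then cache it.
sub from {
  my $self = shift;
  unless (exists $self->{from}) {
    $parses++;
    ($self->{from}) = $self->{raw} =~ /^From:\s*(.*)$/mi;
  }
  return $self->{from};
}

package main;

my $m = Sketch::LazyMail->new("From: adam\@example.org\nSubject: hi\n\nbody\n");
print $m->from, "\n" for 1 .. 2;
print "parsed $parses time(s)\n";
```

The catch is that existing code reading $m->{from} directly would have
to be changed to call $m->from.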

* backup/complain folder

Provide method for setting on/off.

* parpd-check

(Maybe.)  A simple script which could be run via cron to check that
the daemon is still alive (cf. eggdrop's botchk) and, if not, restart
it.  This would be necessary because parp is currently run on a
per-user basis, and sysadmins sometimes reboot machines ...
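
A sketch of the check itself, assuming the daemon writes its pid to a
file (it may not; the pid file, the function names and the restart
callback are all assumptions).  The demonstration uses its own pid and
a temporary file so that it runs anywhere:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Returns true if $pidfile names a process we are able to signal.
sub daemon_alive {
  my ($pidfile) = @_;
  open my $fh, '<', $pidfile or return 0;
  my $pid = <$fh>;
  close $fh;
  return 0 unless defined $pid && $pid =~ /^(\d+)\s*$/;
  return kill(0, $1) ? 1 : 0;    # signal 0 only tests for existence
}

# Calls the $restart code ref only when the daemon looks dead.
sub check_and_restart {
  my ($pidfile, $restart) = @_;
  return 'alive' if daemon_alive($pidfile);
  $restart->();
  return 'restarted';
}

# Demonstration using our own pid, and then a missing pid file.
my ($fh, $pidfile) = tempfile(UNLINK => 1);
print $fh "$$\n";
close $fh;

my $restarts = 0;
print check_and_restart($pidfile, sub { $restarts++ }), "\n";
print check_and_restart("$pidfile.missing", sub { $restarts++ }), "\n";
```

In real use the callback would start the daemon, and cron would run the
script every few minutes.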

* Simplify install procedure

** Automate conversion from procmail

Pinch Simon Cozens's script for this :-)

* Sanity checking on Received: parse

** MX records and relaying

One idea for improving parp's delivery analysis would be to introduce
a list of trusted relays into the user's MyFilter.pm configuration,
and then look out for untrusted relays by checking the MX records for
the domains contained in the `for <address>', To:, Cc:, and
Apparently-To: headers.  You could then maybe even automate checks for
open relays.
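
Roughly like this.  The trusted relay list, the function names and the
host names are all invented, and the MX lookup is passed in as a code
ref with canned data so that the sketch runs offline; in real life
something like Net::DNS's mx() could sit behind it:

```perl
use strict;
use warnings;

# Hypothetical configuration, of the kind that might live in MyFilter.pm.
my %trusted_relay = map { $_ => 1 } qw(mx1.example.org mx2.example.org);

# Domains of the recipient addresses found in the given header values.
sub recipient_domains {
  my %seen;
  return grep { !$seen{$_}++ } map { lc $_ } map { /\@([\w.-]+)/g } @_;
}

# Relays that are neither trusted nor an MX for any recipient domain.
# $mx_lookup is a code ref returning the MX host names for a domain.
sub untrusted_relays {
  my ($relays, $headers, $mx_lookup) = @_;
  my %ok = %trusted_relay;
  $ok{lc $_} = 1 for map { $mx_lookup->($_) } recipient_domains(@$headers);
  return grep { !$ok{lc $_} } @$relays;
}

# Canned MX data so the example needs no network.
my %fake_mx = ('example.com' => ['mail.example.com']);
my $lookup  = sub { @{ $fake_mx{ $_[0] } || [] } };

my @bad = untrusted_relays(
  [qw(mail.example.com mx1.example.org relay.example.net)],
  ['for <adam@example.com>', 'To: Adam <adam@example.com>'],
  $lookup,
);
print "untrusted: @bad\n";
```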

* Blacklist lookups

** Replace Parp::Blacklist with CPAN module?

There are several already out there, after all.

** Don't do blacklist lookups on known good hosts

Another improvement would be to avoid doing blacklist DNS lookups on
the hosts involved in the later stages of the delivery.  For example,
any mail sent to <adam@spiers.net> ends up with the first 4 Received:
headers always being the same delivery path which I know is good.
Avoiding DNS lookups on my own mail handlers every time I get an email
is obviously something I should get round to :-)
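
A sketch of the idea, with an invented list of known-good hosts and an
invented function name.  It only strips the leading run of good hosts
(i.e. the most recent hops), since a good host name appearing further
down could have been forged by whoever injected the mail:

```perl
use strict;
use warnings;

# Hypothetical list of my own mail handlers, known to be good.
my %known_good = map { $_ => 1 } qw(mail.example.org mx.example.org);

# Given the hosts from the Received: headers, newest first, drop the
# leading run of known-good hosts and return the rest for lookup.
sub hosts_to_check {
  my @hosts = @_;
  shift @hosts while @hosts && $known_good{ lc $hosts[0] };
  return @hosts;
}

my @check = hosts_to_check(qw(mail.example.org mx.example.org
                              relay.example.net mx.example.org));
print "would look up: @check\n";
```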

It would also be great to be able to automatically detect faked
Received: headers, but I can't think of a way to do it.

* Tests

** Unit tests

For each module.

*** Parp::Locking

Extend flock_test.sh to test this.

* Documentation!

Yeah, it's hackerware, but even hackers should have docs.
There's some pod these days, at least.

* Auto-responders

Possible uses of auto-responders:

** automatically complaining about spam to the relevant authorities

** informing people of the password system

** `vacation' emulator

* Support for filtering inside nested multiparts

I don't think it works yet.  Only affects content-based tests, not
header-based ones, of course.

* Loop protection for all replied/forwarded mail.

I can't remember what I meant by that.

* Emacs local variables

**** Local variables:
**** mode:outline
**** End:


