    my $field = Mail::Message::Field->new(From => 'fish@tux.aq');
    print $field->name;
    print $field->body;
    print $field->comment;
    print $field->content;  # body & comment
    $field->print(\*OUT);
    print $field->string;
    print "$field\n";
    print $field->attribute('charset') || 'us-ascii';
See SYNOPSIS in Mail::Reporter
This implementation follows the guidelines of rfc2822 as closely as possible, and may therefore produce different output than implementations based on the obsolete rfc822. However, the old output will still be accepted.
Each of these objects stores one header line and provides access routines to the information contained in it. Also, you may want to have a look at the added methods of a message:
    my @from    = $message->from;
    my $sender  = $message->sender;
    my $subject = $message->subject;
    my $msgid   = $message->messageId;
    my @to      = $message->to;
    my @cc      = $message->cc;
    my @bcc     = $message->bcc;
    my @dest    = $message->destinations;
    my $other   = $message->get('Reply-To');
See DESCRIPTION in Mail::Reporter
Fields are stored in the header of a message, which is represented by a Mail::Message::Head object. A field is a combination of a name, body, and attributes. The term "body" in particular is a cause of confusion: sometimes the attributes are considered to be part of the body.
The name of the field is followed by a colon (":", not preceded by blanks, but followed by one blank). Each attribute is preceded by a separate semi-colon (";"). Names of fields are case-insensitive and cannot contain blanks.
Correct fields:
    Field: hi!
    Content-Type: text/html; charset=latin1
Incorrect fields, but accepted:
    Field : wrong, blank before colon
    Field:                 # wrong, empty
    Field:not nice, blank preferred after colon
    One Two: wrong, blank in name
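The lenient acceptance described above can be sketched in plain Perl. This is only an illustration of the syntax rules, not the parser used by Mail::Message::Field; the helper name parse_field_line is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mail::Message::Field): split one
# unfolded header line into a name and a body, tolerating sloppy input
# such as blanks before the colon or no blank after it.
sub parse_field_line {
    my ($line) = @_;
    my ($name, $body) = $line =~ /^([^:]*?)[ \t]*:[ \t]*(.*?)[ \t]*$/s
        or return;                      # no colon at all: not a field line
    return ($name, $body);
}

my ($name, $body) = parse_field_line('Field : wrong, blank before colon');
print "$name => $body\n";

# Field names are compared case-insensitively.
my ($ct) = parse_field_line('Content-Type: text/html; charset=latin1');
print "same field\n" if lc($ct) eq lc('content-type');
```

The sketch accepts all four "incorrect" lines shown above; a stricter parser could additionally reject names that contain blanks.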
Fields which are long can be folded to span more than one line. The real limit for lines in messages is 998 characters; however, such long lines are not easy to read without support of an application. Therefore rfc2822 (which defines the message syntax) explicitly specifies that field lines can be reformatted into multiple shorter lines without change of meaning, by adding new-line characters to any field before any blank or tab.
Usually, the lines are reformatted to create lines which are 78 characters maximum. Some applications try harder to fold on nice spots, like before attributes. Especially the Received field is often manually folded into some nice layout. In most cases however, it is preferred to produce lines which are as long as possible but at most 78 characters.
BE WARNED that all fields can be subjected to folding, and that you usually want the unfolded value.
    Subject: this is a short line, and not folded
    Subject: this subject field is much longer, and therefore
     folded into multiple lines, although one more
     than needed.
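As a rough illustration of the folding rule, the following plain-Perl sketch folds a line before blanks and unfolds it again. The helpers fold and unfold are hypothetical names for this example only; the module's own folding is more careful about where it breaks.

```perl
use strict;
use warnings;

# Hypothetical helper: undo folding by removing every line break that is
# directly followed by a blank or tab. The blank itself is kept.
sub unfold {
    my ($text) = @_;
    $text =~ s/\r?\n(?=[ \t])//g;
    $text =~ s/\r?\n\z//;               # drop the final line terminator
    return $text;
}

# Hypothetical helper: greedy fold. Break before the last blank that
# still keeps the line within $width characters, where one exists.
sub fold {
    my ($line, $width) = @_;
    $width = 78 unless defined $width;
    my @out;
    while (length($line) > $width) {
        my $cut = rindex($line, ' ', $width);
        last if $cut <= 0;              # no usable blank: leave the rest as is
        push @out, substr($line, 0, $cut);
        $line = substr($line, $cut);    # continuation keeps its leading blank
    }
    return join("\n", @out, $line) . "\n";
}

my $subject = 'Subject: ' . join(' ', ('word') x 30);
my $folded  = fold($subject);
print $folded;
print "round trip ok\n" if unfold($folded) eq $subject;
```

Because a break is only ever inserted before a blank, unfolding restores the original line exactly.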
rfc2822 describes a large number of header fields explicitly. These fields have a defined meaning. For some of the fields, like the Subject field, the meaning is simply the content itself. These fields are the Unstructured Fields.
Other fields have a well defined internal syntax because their content is needed by e-mail applications. For instance, the To field contains addresses which must be understood by all applications in the same way. These are the Structured Fields, see isStructured().
Structured fields can contain comments, which are pieces of text enclosed in parentheses. These comments can be placed almost anywhere in the line and must be ignored by the application. Not all applications are capable of handling comments correctly in all circumstances.
    To: mailbox (Mail::Box mailinglist) <mailbox@overmeer.net>
    Date: Thu, 13 Sep 2001 09:40:48 +0200 (CEST)
    Subject: goodbye (was: hi!)
On the first line, the text "Mail::Box mailinglist" is used as comment. Be warned that rfc2822 explicitly states that comments in e-mail address specifications should not be considered to contain any usable information.
On the second line, the timezone is specified as comment. The Date field format has no way to indicate the timezone of the sender, but only contains the timezone difference to UTC; however, one could decide to add this as comment. Applications must ignore this data because the Date field is structured.
The last field is unstructured. The text between parentheses is an integral part of the subject line.
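To make the comment rule concrete, here is a small plain-Perl sketch that removes (possibly nested) comments from the body of a structured field. The helper strip_comments is hypothetical and simplified; it is not the module's implementation, and it must not be applied to unstructured fields such as Subject.

```perl
use strict;
use warnings;

# Hypothetical helper: drop parenthesized comments from a structured
# field body. Parentheses inside quoted strings are kept, and a
# backslash escapes the character that follows it.
sub strip_comments {
    my ($text) = @_;
    my ($out, $depth, $in_quote) = ('', 0, 0);
    while ($text =~ /\G(\\.|.)/gs) {
        my $c = $1;
        if    ($depth == 0 && $c eq '"') { $in_quote = !$in_quote; $out .= $c }
        elsif ($in_quote)                { $out .= $c }
        elsif ($c eq '(')                { $depth++ }
        elsif ($c eq ')' && $depth)      { $depth-- }
        elsif ($depth == 0)              { $out .= $c }
    }
    # Simplification: collapse the blanks left behind, everywhere.
    $out =~ s/[ \t]+/ /g;
    $out =~ s/^ | $//g;
    return $out;
}

print strip_comments('Thu, 13 Sep 2001 09:40:48 +0200 (CEST)'), "\n";
print strip_comments('mailbox (Mail::Box mailinglist) <mailbox@overmeer.net>'), "\n";
```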
There are as many ways of accessing header information as there are programs handling e-mail. Be careful which way you access the data: read the variations described here and decide which solution suits your needs best.
The get() interface is copied from other Perl modules which can handle e-mail messages. Many applications which simply replace Mail::Internet objects by Mail::Message objects will work without modification.
There is more than one get method. The exact results depend on which get you use. When Mail::Message::get() is called, you will get the contents of the field as a string: unfolded, with comments and attributes stripped. Character-set encodings will still be in the string. If the same field name appears more than once in the header, only the last value is returned.
When Mail::Message::Head::get() is called in scalar context, the last field with the specified name is returned as a field object. This object stringifies into the unfolded contents of the field, including attributes and comments. In list context, all appearances of the field in the header are returned as objects.
BE WARNED that some lines seem unique, but are not according to the official rfc. For instance, To fields can appear more than once. If your program calls get('to') in scalar context, some information is lost.
    print $msg->get('subject') || 'no subject';
    print $msg->head->get('subject') || 'no subject';
    my @to = $msg->head->get('to');
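The difference between scalar and list context can be imitated with wantarray. The sketch below uses a hypothetical get_field function over a plain list of name/body pairs; it only mimics the behaviour described above and does not use Mail::Message::Head.

```perl
use strict;
use warnings;

# A toy header with a repeated To field (assumed data for this example).
my @header = (
    [ To      => 'one@example.com' ],
    [ Subject => 'hi!' ],
    [ To      => 'two@example.com' ],
);

# Hypothetical accessor: all matches in list context, only the last
# match in scalar context. Names are compared case-insensitively.
sub get_field {
    my ($name) = @_;
    my @found = map  { $_->[1] }
                grep { lc($_->[0]) eq lc($name) } @header;
    return wantarray ? @found : $found[-1];
}

my $last = get_field('to');     # scalar context: the first To is lost
my @all  = get_field('to');     # list context: both To fields
print "$last\n";
print scalar(@all), " To fields\n";
```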
As the name study already implies, this way of accessing the fields is much more thorough but also slower. The study of a field is like a get, but provides easy access to the content of the field and handles character-set decoding correctly.
The Mail::Message::study() method will only return the last field with that name as object. Mail::Message::Head::study() and Mail::Message::Field::study() return all fields when used in list context.
    print $msg->study('subject') || 'no subject';
    my @rec  = $msg->head->study('Received');
    my $from = $msg->head->get('From')->study;
    $from    = $msg->head->study('From');   # same
    my @addr = $from->addresses;
Some fields belong together in a group of fields. For instance, a set of lines is used to define one step in the mail transport process. Each step adds a Received line, and optionally some Resent-* lines and Return-Path. These groups of lines shall stay together and in order when the message header is processed.
The Mail::Message::Head::ResentGroup object simplifies the access to these related fields. These resent groups can be deleted as a whole, or correctly constructed.
    my @rgs = $msg->head->resentGroups;
    $rgs[0]->delete if @rgs;
    $msg->head->removeResentGroups;
There are many ways to get a field's information as an object, and there are also many ways to process this data within the field.
string() returns the text of the field exactly as it will be printed to file when print() is called, so name, main body, and attributes.
foldedBody() returns the text of the body, like string(), but without the name of the field.
unfoldedBody() returns the text of the body, like foldedBody(), but with all new-lines removed. This is the normal way to get the content of unstructured fields. Character-set encodings will still be in place. Fields are stringified into their unfolded representation.
stripCFWS() returns the text of structured fields, where new-lines and comments are removed from the string. This is a good start for parsing the field, for instance to find e-mail addresses in it.
Studied fields can produce the unfolded text decoded into utf8 strings. This is an expensive process, but the only correct way to get the field's data. More useful for people who are not living in ASCII space.
Studied fields have powerful methods to provide ways to access and produce the contents of (structured) fields exactly as the involved rfcs prescribe.
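What "character-set encodings still in place" means can be shown with the core Encode module, which is used here only as a stand-in: the sketch decodes one encoded word by hand and is not how studied fields do it internally.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The raw text, as an unfolded body would contain it.
my $raw = '=?ISO-8859-1?Q?Patrik_F=E4ltstr=F6m?=';

# Decode the encoded word into a Perl character string.
my $decoded = decode('MIME-Header', $raw);

binmode STDOUT, ':encoding(UTF-8)';
print "$decoded\n";
```

The raw form is what the undecoded accessors return; a studied field gives the decoded form without this manual step.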
Some fields are accessed so often that there are support methods to provide simplified access. All these methods are called upon a message directly.
    print $message->subject;
    print $message->get('subject') || '';  # same
    my @from = $message->from;             # returns addresses
    $message->reply->send if $message->sender;
The sender method will return the address specified in the Sender field, or the first named in the From field. It will return undef in case no address is known.
Field data can be anything, strongly dependent on the type of field at hand. If you decide to construct the fields very carefully via some Mail::Message::Field::Full extension (like via Mail::Message::Field::Addresses objects), then you will have protection built in. However, you can bluntly create any Mail::Message::Field object based on some data.
When you create a field, you may specify a string, object, or an array of strings and objects. At the moment, objects are only used to help the construction of e-mail addresses; however, you may add some of your own.
The following rules (implemented in stringifyData()) are obeyed, depending on the argument:
A string: when the string is not terminated by a new-line ("\n"), it will be folded according to the standard rules.
    my $f = Mail::Message::Field->new(Subject => 'hi!');
    my $b = Mail::Message->build(Subject => 'monkey');
    use Mail::Address;
    my $fish = Mail::Address->new('Mail::Box', 'fish@tux.aq');
    print $fish->format;   # ==> Mail::Box <fish@tux.aq>
    my $exa  = Mail::Address->new(undef, 'me@example.com');
    print $exa->format;    # ==> me@example.com
    # Each line below is an alternative way to fill the To field.
    # Addresses are single-quoted so that "@example" is not interpolated.
    my $b = $msg->build(To => 'you@example.com');
    my $b = $msg->build(To => $fish);
    my $b = $msg->build(To => [ $fish, $exa ]);
    my @all = ($fish, 'you@example.com', $exa);
    my $b = $msg->build(To => \@all);
    my $b = $msg->build(To => [ "xyz", @all ]);
    use User::Identity;
    my $patrik = User::Identity->new
      ( name      => 'patrik'
      , full_name => "Patrik Fältström"  # from rfc
      , charset   => "ISO-8859-1"
      );
    $patrik->add
      ( email    => 'him@home.net'       # single quotes: no interpolation
      );
    my $b = $msg->build(To => $patrik);
    $b->get('To')->print;
      # ==> =?ISO-8859-1?Q?Patrik_F=E4ltstr=F6m?=
      #     <him@home.net>
For performance reasons only, there are three types of fields: the fast, the flexible, and the full understander:
Fast objects are not derived from Mail::Reporter. The consideration is that fields are created so often, and are such small objects, that setting up logging for each of them is relatively expensive and not really useful.
The fast field implementation uses an array to store the data: that will be faster than using a hash. Fast fields are not easily inheritable, because the object creation and initialization are merged into one method.
The flexible implementation uses a hash to store the data. The new() and init methods are split, so this object is extensible.
With a full implementation of all applicable RFCs (about 5), the best understanding of the fields is reached. However, this comes with a serious memory and performance penalty. These objects are created from fast or flex header fields when study() is called.
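The trade-off between the fast and the flexible layout can be sketched with two toy classes. The packages My::FastField, My::FlexField and My::FlexField::Dated are hypothetical and only mirror the design described above; they are not the module's classes.

```perl
use strict;
use warnings;

# Array-based: creation and initialization happen in one method, which
# is cheap but leaves a subclass no hook for adding its own data.
package My::FastField;
sub new {
    my ($class, $name, $body) = @_;
    return bless [ $name, $body ], $class;
}
sub name { $_[0][0] }
sub body { $_[0][1] }

# Hash-based: new() only allocates, init() fills in the data, so a
# subclass can extend init() and store extra keys.
package My::FlexField;
sub new  { my ($class, %args) = @_; return (bless {}, $class)->init(\%args) }
sub init {
    my ($self, $args) = @_;
    @$self{qw(name body)} = @$args{qw(name body)};
    return $self;
}
sub name { $_[0]{name} }
sub body { $_[0]{body} }

# An extension that adds one attribute by overriding init() only.
package My::FlexField::Dated;
our @ISA = ('My::FlexField');
sub init {
    my ($self, $args) = @_;
    $self->SUPER::init($args);
    $self->{seen} = $args->{seen};
    return $self;
}
sub seen { $_[0]{seen} }

package main;

my $fast = My::FastField->new(Subject => 'hi!');
my $flex = My::FlexField::Dated->new(name => 'Subject', body => 'hi!', seen => 1);
print $fast->body, ' / ', $flex->body, "\n";
```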