This chapter describes the basic, ground-level functions for parsing and
handling messages. Covered here are parsing From lines, removing comments
from header lines, decoding encoded words, parsing date headers and so
on. High-level functionality is dealt with in the next chapter
(see section 3. Decoding and Viewing).
2.1 rfc2045       Encoding Content-Type headers.
2.2 rfc2231       Parsing Content-Type headers.
2.3 ietf-drums    Handling mail headers defined by RFC822bis.
2.4 rfc2047       En/decoding encoded words in headers.
2.5 time-date     Functions for parsing dates and manipulating time.
2.6 qp            Quoted-Printable en/decoding.
2.7 base64        Base64 en/decoding.
2.8 binhex        Binhex decoding.
2.9 uudecode      Uuencode decoding.
2.10 rfc1843      Decoding HZ-encoded text.
2.11 mailcap      How parts are displayed is specified by the `.mailcap' file.
2.1 rfc2045
RFC2045 is the "main" MIME document, and as such, one would imagine that there would be a lot to implement. But there isn't, since most of the implementation details are delegated to the subsequent RFCs.
So `rfc2045.el' has only a single function:
rfc2045-encode-string
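As a rough illustration of what that single function does, the sketch below calls it once with a plain token and once with a value containing a space. The parameter names and values are made up for the example, and the results shown reflect how the implementation shipped with recent Emacs versions appears to behave, so treat them as an assumption rather than a specification.

```elisp
(require 'rfc2045)

;; A value made only of safe characters is passed through unquoted.
(rfc2045-encode-string "charset" "utf-8")
;; => "charset=utf-8"

;; A value containing whitespace comes back as a quoted string.
(rfc2045-encode-string "name" "my file.txt")
;; => "name=\"my file.txt\""
```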
2.2 rfc2231
RFC2231 defines a syntax for the Content-Type and
Content-Disposition headers. Its snappy name is MIME
Parameter Value and Encoded Word Extensions: Character Sets, Languages,
and Continuations.
In short, these headers look something like this:
Content-Type: application/x-stuff;
 title*0*=us-ascii'en'This%20is%20even%20more%20;
 title*1*=%2A%2A%2Afun%2A%2A%2A%20;
 title*2="isn't it!"
They usually aren't this bad, though.
The following functions are defined by this library:
rfc2231-parse-string
Parse a Content-Type header and return a list describing its
elements.
(rfc2231-parse-string
 "application/x-stuff;
 title*0*=us-ascii'en'This%20is%20even%20more%20;
 title*1*=%2A%2A%2Afun%2A%2A%2A%20;
 title*2=\"isn't it!\"")
=> ("application/x-stuff"
    (title . "This is even more ***fun*** isn't it!"))
rfc2231-get-value
rfc2231-encode-string
Content-Type and
Content-Disposition.
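The parsed list is easiest to consume through rfc2231-get-value. The following is a minimal sketch with a made-up header value; it assumes, as the example above suggests, that parameter names come back as symbols and that a missing parameter simply yields nil.

```elisp
(require 'rfc2231)

;; Parse once, then look parameters up by their symbol name.
(let ((ct (rfc2231-parse-string "text/plain; charset=utf-8; format=flowed")))
  (list (car ct)                          ; the media type
        (rfc2231-get-value ct 'charset)
        (rfc2231-get-value ct 'format)
        (rfc2231-get-value ct 'title)))   ; absent parameter
;; => ("text/plain" "utf-8" "flowed" nil)
```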
2.3 ietf-drums
drums is an IETF working group that is working on the replacement for RFC822.
The functions provided by this library include:
ietf-drums-remove-comments
ietf-drums-remove-whitespace
ietf-drums-get-comment
ietf-drums-parse-address
ietf-drums-parse-addresses
ietf-drums-parse-date
ietf-drums-narrow-to-header
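A few of these can be tried directly on strings. This is only a sketch: the address is invented, and the return values are what the implementation in recent Emacs versions appears to produce, not something this chapter guarantees.

```elisp
(require 'ietf-drums)

;; Pull the comment out of an old-style address ...
(ietf-drums-get-comment "jane@example.org (Jane Doe)")
;; => "Jane Doe"

;; ... or strip comments and whitespace to leave the bare mailbox.
(ietf-drums-remove-whitespace
 (ietf-drums-remove-comments "jane@example.org (Jane Doe)"))
;; => "jane@example.org"

;; Split a name-addr form into a (MAILBOX . DISPLAY-NAME) pair.
(ietf-drums-parse-address "Jane Doe <jane@example.org>")
;; => ("jane@example.org" . "Jane Doe")
```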
2.4 rfc2047
RFC2047 (Message Header Extensions for Non-ASCII Text) specifies how non-ASCII text in headers is to be encoded. This is actually rather complicated, so a number of variables are necessary to tweak what this library does.
The following variables are tweakable:
rfc2047-default-charset
iso-8859-1.
rfc2047-header-encoding-alist
The keys can either be header regexps, or t.
The values can be either nil, in which case the header(s) in
question won't be encoded, or mime, which means that they will be
encoded.
rfc2047-charset-encoding-alist
RFC2047 specifies two forms of encoding: Q (a
Quoted-Printable-like encoding) and B (base64). This alist
specifies which charset should use which encoding.
rfc2047-encoding-function-alist
An alist of encoding / function pairs. The encodings are Q, B and nil.
rfc2047-q-encoding-alist
The Q encoding isn't quite the same for all headers. Some
headers allow a narrower range of characters, and that is what this
variable is for. It's an alist of header regexps / allowable character
ranges.
rfc2047-encoded-word-regexp
Those were the variables, and these are the functions:
rfc2047-narrow-to-field
rfc2047-encode-message-header
Encodes according to rfc2047-header-encoding-alist.
rfc2047-encode-region
rfc2047-encode-string
rfc2047-decode-region
rfc2047-decode-string
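The decoding side is the easiest to try out. The sketch below feeds rfc2047-decode-string one Q-encoded word, one B-encoded word and one plain string; the header values are invented, and the results assume the default settings of the variables described above.

```elisp
(require 'rfc2047)

;; Q encoding: =E9 is the iso-8859-1 byte for e-acute.
(rfc2047-decode-string "=?iso-8859-1?Q?Andr=E9?= <andre@example.org>")
;; => "André <andre@example.org>"

;; B encoding: the payload is plain base64.
(rfc2047-decode-string "=?utf-8?B?SGVsbG8=?=")
;; => "Hello"

;; Text without encoded words comes back unchanged.
(rfc2047-decode-string "Plain subject")
;; => "Plain subject"
```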
2.5 time-date
While not really a part of the MIME library, it is convenient to
document this library here. It deals with parsing Date headers
and manipulating time. (Not by using tesseracts, though, I'm sorry to
say.)
These functions convert between five formats: a date string, an Emacs time structure, a decoded time list, a number of seconds, and a day number.
The functions have quite self-explanatory names, so the following just gives an overview of which functions are available.
(parse-time-string "Sat Sep 12 12:21:54 1998 +0200")
=> (54 21 12 12 9 1998 6 nil 7200)
(date-to-time "Sat Sep 12 12:21:54 1998 +0200")
=> (13818 19266)
(time-to-seconds '(13818 19266))
=> 905595714.0
(seconds-to-time 905595714.0)
=> (13818 19266 0)
(time-to-day '(13818 19266))
=> 729644
(days-to-time 729644)
=> (961933 65536)
(time-since '(13818 19266))
=> (0 430)
(time-less-p '(13818 19266) '(13818 19145))
=> nil
(subtract-time '(13818 19266) '(13818 19145))
=> (0 121)
(days-between "Sat Sep 12 12:21:54 1998 +0200"
              "Sat Sep 07 12:21:54 1998 +0200")
=> 5
(date-leap-year-p 2000)
=> t
(time-to-day-in-year '(13818 19266))
=> 255
And finally, we have safe-date-to-time, which does the same as
date-to-time, but returns a zero time if the date is
syntactically malformed.
2.6 qp
This library deals with decoding and encoding Quoted-Printable text.
Very briefly explained, qp encoding means translating all 8-bit characters (and lots of control characters) into things that look like `=EF'; that is, an equal sign followed by the byte encoded as a hex string.
The following functions are defined by the library:
quoted-printable-decode-region
quoted-printable-decode-string
quoted-printable-encode-region
quoted-printable-encode-string
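The string variants make the encoding easy to see. This is a small sketch with made-up input; the optional coding-system argument in the last call is how recent Emacs versions let you turn the decoded bytes into characters, which is an assumption about your version rather than something stated above.

```elisp
(require 'qp)

;; "=" is itself special in Quoted-Printable, so it becomes =3D.
(quoted-printable-encode-string "1+1=2")
;; => "1+1=3D2"

(quoted-printable-decode-string "1+1=3D2")
;; => "1+1=2"

;; The byte E9 only becomes a character once a coding system is named.
(quoted-printable-decode-string "caf=E9" 'iso-8859-1)
;; => "café"
```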
2.7 base64
Base64 is an encoding that encodes three bytes into four characters, thereby increasing the size by about 33%. The alphabet used for encoding is very resistant to mangling during transit.
The following functions are defined by this library:
base64-encode-region
base64-encode-string
base64-decode-region
nil and don't
modify the buffer.
base64-decode-string
nil is returned.
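The three-into-four ratio is easy to check on a short string. The sketch below assumes the string functions are available without loading anything, which is the case in recent Emacs versions where they are built in.

```elisp
;; Twelve bytes in, sixteen characters out: four for every three.
(base64-encode-string "Hello, world")
;; => "SGVsbG8sIHdvcmxk"

(length (base64-encode-string "Hello, world"))
;; => 16

;; Decoding recovers the original bytes.
(base64-decode-string "SGVsbG8sIHdvcmxk")
;; => "Hello, world"
```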
2.8 binhex
binhex is an encoding that originated in Macintosh environments.
The following function is supplied to deal with these:
binhex-decode-region
binhex header and return the filename.
2.9 uudecode
uuencode is probably still the most popular encoding of binaries
used on Usenet, although base64 rules the mail world.
The following function is supplied by this package:
uudecode-decode-region
2.10 rfc1843
RFC1843 deals with mixing Chinese and ASCII characters in messages. In essence, RFC1843 switches between ASCII and Chinese by doing this:
This sentence is in ASCII.
The next sentence is in GB.~{<:Ky2;S{#,NpJ)l6HK!#~}Bye.
Simple enough, and widely used in China.
The following functions are available to handle this encoding:
rfc1843-decode-region
rfc1843-decode-string
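Running the sample line from above through the string variant might look like this. It is only a sketch: it assumes an Emacs with the Chinese charsets available, and no result is shown because the decoded text depends on that.

```elisp
(require 'rfc1843)

;; The span between ~{ and ~} is decoded as GB text;
;; the ASCII around it is left alone.
(rfc1843-decode-string
 "The next sentence is in GB.~{<:Ky2;S{#,NpJ)l6HK!#~}Bye.")
```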
2.11 mailcap
The `~/.mailcap' file is parsed by most MIME-aware message handlers and describes how elements are supposed to be displayed. Here's an example file:
image/*; gimp -8 %s
audio/wav; wavplayer %s
This says that all image files should be displayed with gimp, and
that WAV audio files should be played by wavplayer.
The mailcap library parses this file, and provides functions for
matching types.
mailcap-mime-data
Interface functions:
mailcap-parse-mailcaps
Parse the ~/.mailcap file.
mailcap-mime-info
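A typical session with the two interface functions could look like the sketch below. The MIME type is just an example, and no result is shown because the answer depends entirely on the mailcap files and defaults present on the machine; it may well be nil.

```elisp
(require 'mailcap)

;; Read whatever mailcap files are found on this system.
(mailcap-parse-mailcaps)

;; Ask which viewer matches a type.  The answer depends on the
;; local configuration and may be nil when nothing matches.
(mailcap-mime-info "image/png")
```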