Privacy-Enhanced Mail (PEM) Privacy Enhancement for Internet Electronic Mail
Summary
1. Basic encryption flow
Local form
Canonical form
Authentication (padding and integrity check) and encryption
Printable encoding
Privacy-Enhanced Mail (PEM)
RFC 2313 - PKCS #1: RSA Encryption Version 1.5 https://tools.ietf.org/html/rfc2313
This document describes a method for encrypting data using the RSA
public-key cryptosystem. Its intended use is in the construction of
digital signatures and digital envelopes, as described in PKCS #7:
o For digital signatures, the content to be signed
is first reduced to a message digest with a
message-digest algorithm (such as MD5), and then
an octet string containing the message digest is
encrypted with the RSA private key of the signer
of the content. The content and the encrypted
message digest are represented together according
to the syntax in PKCS #7 to yield a digital
signature. This application is compatible with
Privacy-Enhanced Mail (PEM) methods.
o For digital envelopes, the content to be enveloped
is first encrypted under a content-encryption key
with a content-encryption algorithm (such as DES),
and then the content-encryption key is encrypted
with the RSA public keys of the recipients of the
content. The encrypted content and the encrypted
content-encryption key are represented together
according to the syntax in PKCS #7 to yield a
digital envelope. This application is also
compatible with PEM methods.
(Source: Kaliski, RFC 2313 PKCS #1: RSA Encryption, March 1998.)
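As a rough illustration of the digital-signature procedure described above (digest the content, then apply the signer's RSA private key to the digest), here is a toy Python sketch. Everything in it is an assumption made for illustration: the names `toy_sign` and `toy_verify` are hypothetical, the modulus is built from two small Mersenne primes, and the encryption-block formatting that PKCS #1 itself specifies is left out. It is textbook RSA, not a usable or secure implementation.

```python
import hashlib

# Toy key material (assumption): two known Mersenne primes.
# A modulus of about 150 bits is far too small for real use.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (needs Python 3.8+)


def toy_sign(content: bytes) -> int:
    """Reduce the content to an MD5 digest, then apply the private key."""
    digest = int.from_bytes(hashlib.md5(content).digest(), "big")
    return pow(digest, D, N)


def toy_verify(content: bytes, signature: int) -> bool:
    """Recover the digest with the public key and compare it to a fresh one."""
    digest = int.from_bytes(hashlib.md5(content).digest(), "big")
    return pow(signature, E, N) == digest
```

Verification recomputes the digest and compares it with the value recovered using the public exponent, so a change to the content makes the check fail.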
RFC 1421 - Privacy Enhancement for Internet Electronic Mail: Part I: Message Encryption and Authentication Procedures https://tools.ietf.org/html/rfc1421
4.1.1 Types of Keys
A two-level keying hierarchy is used to support PEM transmission:
1. Data Encrypting Keys (DEKs) are used for encryption of
message text and (with certain choices among a set of
alternative algorithms) for computation of message integrity
check (MIC) quantities. In the asymmetric key management
environment, DEKs are also used to encrypt the signed
representations of MICs in PEM messages to which
confidentiality has been applied. DEKs are generated
individually for each transmitted message; no
predistribution of DEKs is needed to support PEM
transmission.
2. Interchange Keys (IKs) are used to encrypt DEKs for
transmission within messages. Ordinarily, the same IK will
be used for all messages sent from a given originator to a
given recipient over a period of time. Each transmitted
message includes a representation of the DEK(s) used for
message encryption and/or MIC computation, encrypted under
an individual IK per named recipient. The representation is
associated with Originator-ID and Recipient-ID fields (defined in
different forms so as to distinguish symmetric from asymmetric
cases), which allow each individual recipient to identify the IK
used to encrypt DEKs and/or MICs for that recipient's use. Given an
appropriate IK, a recipient can decrypt the corresponding
transmitted DEK representation, yielding the DEK required for
message text decryption and/or MIC validation. The definition of an
IK differs depending on whether symmetric or asymmetric
cryptography is used for DEK encryption:
2a. When symmetric cryptography is used for DEK encryption, an IK
is a single symmetric key shared between an originator and a
recipient. In this case, the same IK is used to encrypt MICs as
well as DEKs for transmission. Version/expiration information and
IA identification associated with the originator and with the
recipient must be concatenated in order to fully qualify a
symmetric IK.
2b. When asymmetric cryptography is used, the IK component used
for DEK encryption is the public component [8] of the recipient.
The IK component used for MIC encryption is the private component
of the originator, and therefore only one encrypted MIC
representation need be included per message, rather than one per
recipient. Each of these IK components can be fully qualified in a
Recipient-ID or Originator-ID field, respectively. Alternatively,
an originator's IK component may be determined from a certificate
carried in an "Originator-Certificate:" field.
(Source: Linn, RFC 1421 Privacy Enhancement for Electronic Mail, February 1993.)
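The two-level hierarchy above (a fresh DEK per message, wrapped once per recipient under that recipient's IK) might be sketched as follows for the symmetric case 2a. This is only an illustration under stated assumptions: `toy_keystream_xor` is a hypothetical stand-in cipher that is not secure and is not one of the PEM algorithms, and `protect`, `recover`, and the recipient names are invented for the example.

```python
import hashlib
import secrets


def toy_keystream_xor(key: bytes, data: bytes) -> bytes:
    """Hypothetical stand-in cipher: XOR with a SHA-256-derived keystream.

    NOT secure; it only marks where a real cipher would be applied.
    Applying it twice with the same key restores the input.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def protect(message: bytes, recipient_iks: dict):
    """Encrypt under a per-message DEK, then wrap the DEK under each IK."""
    dek = secrets.token_bytes(8)                 # generated per message
    ciphertext = toy_keystream_xor(dek, message)
    wrapped_deks = {
        recipient_id: toy_keystream_xor(ik, dek)
        for recipient_id, ik in recipient_iks.items()
    }
    return ciphertext, wrapped_deks


def recover(ciphertext: bytes, wrapped_deks: dict, recipient_id: str, ik: bytes) -> bytes:
    """A recipient unwraps its own DEK copy, then decrypts the message text."""
    dek = toy_keystream_xor(ik, wrapped_deks[recipient_id])
    return toy_keystream_xor(dek, ciphertext)
```

The message text is encrypted only once; what grows with the number of recipients is the set of wrapped DEK representations.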
4.3 Privacy Enhancement Message Transformations
4.3.1 Constraints
An electronic mail encryption mechanism must be compatible with the
transparency constraints of its underlying electronic mail
facilities. These constraints are generally established based on
expected user requirements and on the characteristics of anticipated
endpoint and transport facilities. An encryption mechanism must also
be compatible with the local conventions of the computer systems
which it interconnects. Our approach uses a canonicalization step to
abstract out local conventions and a subsequent encoding step to
conform to the characteristics of the underlying mail transport
medium (SMTP). The encoding conforms to SMTP constraints. Section
4.5 of RFC 821 [2] details SMTP's transparency constraints.
To prepare a message for SMTP transmission, the following
requirements must be met:
1. All characters must be members of the 7-bit ASCII character set.
2. Text lines, delimited by the character pair <CR><LF>, must be no
more than 1000 characters long.
3. Since the string <CR><LF>.<CR><LF> indicates the end of a
message, it must not occur in text prior to the end of a message.
Although SMTP specifies a standard representation for line
delimiters (ASCII <CR><LF>), numerous systems in the Internet use a
different native representation to delimit lines. For example, the
<CR><LF> sequences delimiting lines in mail inbound to UNIX systems
are transformed to single <LF>s as mail is written into local
mailbox files. Lines in mail incoming to record-oriented systems
(such as VAX VMS) may be converted to appropriate records by the
destination SMTP server [3]. As a result, if the encryption process
generated <CR>s or <LF>s, those characters might not be accessible
to a recipient UA program at a destination which uses different
line delimiting conventions. It is also possible that conversion
between tabs and spaces may be performed in the course of mapping
between inter-SMTP and local format; this is a matter of local
option. If such transformations changed the form of transmitted
ciphertext, decryption would fail to regenerate the transmitted
plaintext, and a transmitted MIC would fail to compare with that
computed at the destination.
The conversion performed by an SMTP server at a system with EBCDIC
as a native character set has even more severe impact, since the
conversion from EBCDIC into ASCII is an information-losing
transformation.
In principle, the transformation function mapping between
inter-SMTP canonical ASCII message representation and local format
could be moved from the SMTP server up to the UA, given a means to
direct that the SMTP server should no longer perform that
transformation. This approach has a major disadvantage: internal
file (e.g., mailbox) formats would be incompatible with the native
forms used on the systems where they reside. Further, it would
require modification to SMTP servers, as mail would be passed to
SMTP in a different representation than it is passed at present.
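The three SMTP requirements listed in this section can be checked mechanically. The sketch below is a minimal illustration, not part of either RFC: the function name `smtp_transparency_problems` is hypothetical, the input is assumed to be the message text before any end-of-message marker is appended, and line length is counted without the <CR><LF> delimiter.

```python
def smtp_transparency_problems(text: bytes) -> list:
    """Return the SMTP transparency requirements that `text` violates."""
    problems = []
    # 1. All characters must be members of the 7-bit ASCII character set.
    if any(octet > 0x7F for octet in text):
        problems.append("contains non-ASCII octets")
    # 2. Lines delimited by <CR><LF> must be no more than 1000 characters
    #    (assumption: the delimiter itself is not counted here).
    if any(len(line) > 1000 for line in text.split(b"\r\n")):
        problems.append("line longer than 1000 characters")
    # 3. <CR><LF>.<CR><LF> marks the end of a message, so it must not
    #    appear inside the text.
    if b"\r\n.\r\n" in text:
        problems.append("contains <CR><LF>.<CR><LF>")
    return problems
```

Raw ciphertext would usually fail the first check and could fail the other two, which is the motivation for the printable encoding step.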
4.3.2 Approach
Our approach to supporting PEM across an environment in which
intermediate conversions may occur defines an encoding for mail which
is uniformly representable across the set of PEM UAs regardless of
their systems' native character sets. This encoded form is used (for
specified PEM message types) to represent mail text in transit from
originator to recipient, but the encoding is not applied to enclosing
MTS headers or to encapsulated headers inserted to carry control
information between PEM UAs. The encoding's characteristics are such
that the transformations anticipated between originator and recipient
UAs will not prevent an encoded message from being decoded properly
at its destination.
Four transformation steps, described in the following four
subsections, apply to outbound PEM message processing:
4.3.2.1 Step 1: Local Form
This step is applicable to PEM message types ENCRYPTED, MIC-ONLY, and
MIC-CLEAR. The message text is created in the system's native
character set, with lines delimited in accordance with local
convention.
4.3.2.2 Step 2: Canonical Form
This step is applicable to PEM message types ENCRYPTED, MIC-ONLY, and
MIC-CLEAR. The message text is converted to a universal canonical
form, similar to the inter-SMTP representation [4] as defined in RFC
821 [2] and RFC 822 [5]. The procedures performed in order to
accomplish this conversion are dependent on the characteristics of
the local form and so are not specified in this RFC.
PEM canonicalization assures that the message text is represented
with the ASCII character set and "<CR><LF>" line delimiters, but does
not perform the dot-stuffing transformation discussed in RFC 821,
Section 4.5.2. Since a message is converted to a standard character
set and representation before encryption, a transferred PEM message
can be decrypted and its MIC can be validated at any type of
destination host computer. Decryption and MIC validation is
performed before any conversions which may be necessary to transform
the message into a destination-specific local form.
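Since the RFC leaves the canonicalization procedure to the local system, any code for it is necessarily an assumption. A minimal Python sketch, assuming the local form is a `str` whose lines end in <LF>, <CR>, or <CR><LF> (the function name `canonicalize` is hypothetical):

```python
import re


def canonicalize(local_text: str) -> bytes:
    """Convert local text to ASCII octets with <CR><LF> line delimiters.

    No dot-stuffing is applied, matching the description above.
    Raises UnicodeEncodeError if the text is not representable in ASCII.
    """
    crlf_text = re.sub(r"\r\n|\r|\n", "\r\n", local_text)
    return crlf_text.encode("ascii")
```

For example, `canonicalize("a\nb\n")` yields `b"a\r\nb\r\n"`, and a line holding only "." passes through unchanged.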
4.3.2.3 Step 3: Authentication and Encryption
Authentication processing is applicable to PEM message types
ENCRYPTED, MIC-ONLY, and MIC-CLEAR. The canonical form is input to
the selected MIC computation algorithm in order to compute an
integrity check quantity for the message. No padding is added to
the canonical form before submission to the MIC computation
algorithm, although certain MIC algorithms will apply their own
padding in the course of computing a MIC.
Encryption processing is applicable only to PEM message type
ENCRYPTED. RFC 1423 defines the padding technique used to support
encryption of the canonically-encoded message text.
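Step 3 can be sketched in Python as below. Two assumptions go beyond the text quoted here and should be checked against RFC 1423: MD5 is taken as the MIC algorithm, and the padding rule is taken to be "append 8-(n mod 8) octets, each holding that count" for an 8-octet block cipher. The function names are hypothetical, and the encryption itself is omitted.

```python
import hashlib


def compute_mic(canonical: bytes) -> bytes:
    """MIC over the canonical form; the caller adds no padding of its own."""
    return hashlib.md5(canonical).digest()


def pad_for_encryption(canonical: bytes, block_size: int = 8) -> bytes:
    """Pad to a multiple of `block_size` octets (assumed RFC 1423 rule).

    Between 1 and `block_size` octets are always appended, each one
    holding the number of octets appended, so padding can be removed
    unambiguously after decryption.
    """
    count = block_size - (len(canonical) % block_size)
    return canonical + bytes([count]) * count
```

Note that the MIC is computed over the unpadded canonical form, while padding applies only to the input of the encryption step for ENCRYPTED messages.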
4.3.2.4 Step 4: Printable Encoding
This printable encoding step is applicable to PEM message types
ENCRYPTED and MIC-ONLY. The same processing is also employed in
representation of certain specifically identified PEM encapsulated
header field quantities as cited in Section 4.6. Proceeding from
left to right, the bit string resulting from step 3 is encoded into
characters which are universally representable at all sites, though
not necessarily with the same bit patterns (e.g., although the
character "E" is represented in an ASCII-based system as hexadecimal
45 and as hexadecimal C5 in an EBCDIC-based system, the local
significance of the two representations is equivalent).
A 64-character subset of International Alphabet IA5 is used, enabling
6 bits to be represented per printable character. (The proposed
subset of characters is represented identically in IA5 and ASCII.)
The character "=" signifies a special processing function used for
padding within the printable encoding procedure.
To represent the encapsulated text of a PEM message, the encoding
function's output is delimited into text lines (using local
conventions), with each line except the last containing exactly 64
printable characters and the final line containing 64 or fewer
printable characters. (This line length is easily printable and is
guaranteed to satisfy SMTP's 1000-character transmitted line length
limit.) This folding requirement does not apply when the encoding
procedure is used to represent PEM header field quantities; Section
4.6 discusses folding of PEM encapsulated header fields.
The encoding process represents 24-bit groups of input bits as output
strings of 4 encoded characters. Proceeding from left to right across
a 24-bit input group extracted from the output of step 3, each 6-bit
group is used as an index into an array of 64 printable characters.
The character referenced by the index is placed in the output string.
These characters, identified in Table 1, are selected so as to be
universally representable, and the set excludes characters with
particular significance to SMTP (e.g., ".", "<CR>", "<LF>").
Special processing is performed if fewer than 24 bits are available
in an input group at the end of a message. A full encoding quantum
is always completed at the end of a message. When fewer than 24
input bits are available in an input group, zero bits are added (on
the right) to form an integral number of 6-bit groups. Output
character positions which are not required to represent actual
input data are set to the character "=". Since all canonically
encoded output is an integral number of octets, only the following
cases can arise:
(1) the final quantum of encoding input is an integral multiple of
24 bits; here, the final unit of encoded output will be an integral
multiple of 4 characters with no "=" padding,
(2) the final quantum of encoding input is exactly 8 bits; here,
the final unit of encoded output will be two characters followed by
two "=" padding characters, or
(3) the final quantum of encoding input is exactly 16 bits; here,
the final unit of encoded output will be three characters followed
by one "=" padding character.
Value Encoding  Value Encoding  Value Encoding  Value Encoding
    0 A            17 R            34 i            51 z
    1 B            18 S            35 j            52 0
    2 C            19 T            36 k            53 1
    3 D            20 U            37 l            54 2
    4 E            21 V            38 m            55 3
    5 F            22 W            39 n            56 4
    6 G            23 X            40 o            57 5
    7 H            24 Y            41 p            58 6
    8 I            25 Z            42 q            59 7
    9 J            26 a            43 r            60 8
   10 K            27 b            44 s            61 9
   11 L            28 c            45 t            62 +
   12 M            29 d            46 u            63 /
   13 N            30 e            47 v
   14 O            31 f            48 w         (pad) =
   15 P            32 g            49 x
   16 Q            33 h            50 y
Printable Encoding Characters
Table 1
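The printable encoding described in this step can be written out directly from Table 1. The Python sketch below is an illustration only; `printable_encode` and `fold` are hypothetical names. It encodes 24-bit groups into four characters, applies "=" padding to a short final group, and folds the result into 64-character lines. The final assertion cross-checks against the standard library's Base64 encoder, which uses the same 64-character table.

```python
import base64

# Table 1: value 0..63 -> printable character.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def printable_encode(data: bytes) -> str:
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Left-align the chunk in a 24-bit group, zero-filling on the right.
        group = int.from_bytes(chunk, "big") << (8 * (3 - len(chunk)))
        chars = [ALPHABET[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # 3 octets -> 4 chars; 2 octets -> 3 chars + "="; 1 octet -> 2 chars + "==".
        keep = len(chunk) + 1
        out.append("".join(chars[:keep]) + "=" * (4 - keep))
    return "".join(out)


def fold(encoded: str, width: int = 64) -> list:
    """Split into lines of exactly `width` characters, except possibly the last."""
    return [encoded[i:i + width] for i in range(0, len(encoded), width)]


# Cross-check against the standard library's Base64 encoder.
sample = b"Privacy-Enhanced Mail"
assert printable_encode(sample) == base64.b64encode(sample).decode("ascii")
```

A single input octet such as `b"M"` (hexadecimal 4D) splits into the 6-bit values 19 and 16, giving "TQ" followed by "==".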
4.3.2.5 Summary of Transformations
In summary, the outbound message is subjected to the following
composition of transformations (or, for some PEM message types, a
subset thereof):
Transmit_Form = Encode(Encrypt(Canonicalize(Local_Form)))
The inverse transformations are performed, in reverse order, to
process inbound PEM messages:
Local_Form = DeCanonicalize(Decipher(Decode(Transmit_Form)))
Note that the local form and the functions to transform messages to
and from canonical form may vary between the originator and
recipient systems without loss of information.
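The composition above can be exercised end to end for the subset that skips encryption (as for a MIC-ONLY message). This Python sketch is an assumption-laden illustration: all function names are hypothetical, the local form is modeled as a `str`, the standard library's Base64 functions stand in for the printable encoding, and the Encrypt/Decipher steps and MIC handling are left out entirely.

```python
import base64
import re


def canonicalize(local_form: str) -> bytes:
    """Local line endings -> <CR><LF>, characters -> ASCII octets."""
    return re.sub(r"\r\n|\r|\n", "\r\n", local_form).encode("ascii")


def decanonicalize(canonical: bytes, newline: str = "\n") -> str:
    """Canonical octets -> text using the recipient's line delimiter."""
    return canonical.decode("ascii").replace("\r\n", newline)


def encode(octets: bytes) -> str:
    return base64.b64encode(octets).decode("ascii")


def decode(printable: str) -> bytes:
    return base64.b64decode(printable)


def to_transmit_form(local_form: str) -> str:
    # Subset of the full composition: Encode(Canonicalize(Local_Form)).
    return encode(canonicalize(local_form))


def to_local_form(transmit_form: str, newline: str = "\n") -> str:
    # Inverse, in reverse order: DeCanonicalize(Decode(Transmit_Form)).
    return decanonicalize(decode(transmit_form), newline)
```

An originator whose local convention is <CR> and a recipient whose convention is <LF> still agree on the content: `to_local_form(to_transmit_form("a\rb\r"))` returns `"a\nb\n"`.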