Network Working Group                                           R. Polli
Internet-Draft                                             Team Digitale
Intended status: Standards Track                               L. Pardue
Expires: November 10, 2019                                  May 09, 2019


                       Resource Digests for HTTP
                draft-polli-resource-digests-http-latest

Abstract

   This document defines the Digest and Want-Digest header fields for
   HTTP, thus allowing client and server to negotiate an integrity
   checksum of the exchanged resource representation.

   This document obsoletes [RFC3230].  It replaces the term "instance"
   with "representation", which makes it consistent with the HTTP
   Semantics and Content defined in [RFC7231].

Note to Readers

   _RFC EDITOR: please remove this section before publication_

   Discussion of this draft takes place on the HTTP working group
   mailing list (ietf-http-wg@w3.org), which is archived at
   https://lists.w3.org/Archives/Public/ietf-http-wg/ [1].

   The source code and issues list for this draft can be found at
   https://github.com/ioggstream/draft-polli-resource-digests-http [2].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on November 10, 2019.

Polli & Pardue          Expires November 10, 2019               [Page 1]

Internet-Draft                   RDHTTP                         May 2019

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Brief history of integrity headers
     1.2.  This proposal
     1.3.  Goals
     1.4.  Notational Conventions
   2.  Resource representation and representation-data
   3.  Digest Algorithm values
     3.1.  Representation digest
       3.1.1.  digest-algorithm encoding examples
   4.  Header Specifications
     4.1.  Want-Digest
     4.2.  Digest
   5.  Deprecate Negotiation of Content-MD5
   6.  Examples
     6.1.  Unsolicited Digest response
       6.1.1.  Representation data is fully contained in the payload
       6.1.2.  Representation data is not contained in the payload
       6.1.3.  Representation data is partially contained in the
               payload i.e. range request
       6.1.4.  Digest in both Request and Response.  Returned value
               depends on representation metadata
     6.2.  Want-Digest solicited digest responses
       6.2.1.  Client request data is fully contained in the payload
       6.2.2.  A client requests an unsupported Digest, the server MAY
               reply with an unsupported digest
       6.2.3.  A client requests an unsupported Digest, the server MAY
               reply with a 400
   7.  Security Considerations
     7.1.  Usage in signatures
     7.2.  Message Truncation
     7.3.  Algorithm Agility
   8.  IANA Considerations
     8.1.  The "id-sha-256" Digest Algorithm
     8.2.  The "id-sha-512" Digest Algorithm
     8.3.  Want-Digest Header Field Registration
     8.4.  Digest Header Field Registration
   9.  References
     9.1.  Normative References
     9.2.  Informative References
     9.3.  URIs
   Appendix A.  Acknowledgements
   Appendix B.  FAQ
   Authors' Addresses

1.  Introduction

   Integrity protection for HTTP content is typically achieved via TCP
   or HTTPS [RFC2818].  However, additional integrity protection might
   be desirable for some use cases.  This might be for additional
   protection against failures or attacks (see [SRI]), programming
   errors, corruption of stored data, or because content needs to remain
   unmodified throughout multiple HTTPS-protected exchanges.

1.1.  Brief history of integrity headers

   The Content-MD5 header field was originally introduced to provide
   integrity, but HTTP/1.1 [RFC7231], in Appendix B, obsoleted it:

      The Content-MD5 header field has been removed because it was
      inconsistently implemented with respect to partial responses.

   [RFC3230] provided a more flexible solution, introducing the concept
   of "instance" and the headers "Digest" and "Want-Digest".

1.2.  This proposal

   The concept of "selected representation" defined in [RFC7231] made
   the [RFC3230] definitions inconsistent with the current standard.  A
   refresh was therefore required.

   This document updates the "Digest" and "Want-Digest" header field
   definitions to align with [RFC7231] concepts.

   This approach can be easily adapted to use cases where the
   transferred data requires some sort of manipulation to be considered
   a representation, or conveys a partial representation of a resource
   (e.g. Range Requests).

   Changes are semantically compatible with existing implementations and
   better cover both the request and response cases.

   The value of "Digest" is calculated on the selected representation,
   which is tied to the value contained in any "Content-Encoding" or
   "Content-Type" header fields.  Therefore, a given resource may have
   multiple different digest values.

   To allow both parties to exchange a Digest of a representation with
   no content codings [3], two more algorithms are added ("id-sha-256"
   and "id-sha-512").

1.3.  Goals

   The goals of this proposal are:

   1.  Digest coverage for either the resource's "representation data"
       or "selected representation data" communicated via HTTP.

   2.  Support for multiple digest algorithms.

   3.  Negotiation of the use of digests.

   The goals do not include:

   Header integrity:  The digest mechanisms described here cover only
      representation and selected representation data, and do not
      protect the integrity of associated representation metadata
      headers or other message headers.

   Authentication:  The digest mechanisms described here are not meant
      to support authentication of the source of a digest or of a
      message or anything else.  These mechanisms, therefore, are not a
      sufficient defense against many kinds of malicious attacks.

   Privacy:  Digest mechanisms do not provide message privacy.

   Authorization:  The digest mechanisms described here are not meant to
      support authorization or other kinds of access controls.

1.4.  Notational Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 ([RFC2119] and [RFC8174]) when, and only when, they appear in
   all capitals, as shown here.

   The definitions "representation", "selected representation",
   "representation data", "representation metadata" and "payload body"
   in this document are to be interpreted as described in [RFC7230] and
   [RFC7231].

2.  Resource representation and representation-data

   To avoid inconsistencies, an integrity mechanism for HTTP messages
   should decouple the checksum calculation:

   o  from the payload body - which may be altered by mechanisms like
      Range Requests or by the method (e.g. HEAD);

   o  and from the message body - which depends on "Transfer-Encoding"
      and whatever transformations the intermediaries may apply.

   The following examples show how representation metadata, payload
   transformations and the method affect the message and payload body.

   Here is a gzip-compressed JSON object:

   # gzip.compress(json.dumps({"a": "1"*100}).encode())

   Request:

     PUT /entries/1234 HTTP/1.1
     Content-Type: application/json
     Content-Encoding: gzip

     H4sIAItWyFwC/6tWSlSyUlAypANQqgUAREcqfG0AAAA=

   Now the same payload body conveys a malformed JSON object.

   Request:

     PUT /entries/1234 HTTP/1.1
     Content-Type: application/json

     H4sIAItWyFwC/6tWSlSyUlAypANQqgUAREcqfG0AAAA=

   A Range-Request alters the payload body, conveying a partial
   representation.

   Request:

     GET /entries/1234 HTTP/1.1
     Range: bytes=1-7

   Response:

     HTTP/1.1 206 Partial Content
     Content-Encoding: gzip
     Content-Type: application/json
     Content-Range: bytes 1-7/*

     iwgAla3RXA==

   Now the method too alters the payload body.

   Request:

     HEAD /entries/1234 HTTP/1.1
     Accept: application/json
     Accept-Encoding: gzip

   Response:

     HTTP/1.1 200 OK
     Content-Type: application/json
     Content-Encoding: gzip

3.  Digest Algorithm values

   Digest algorithm values are used to indicate a specific digest
   computation.  For some algorithms, one or more parameters may be
   supplied.

      digest-algorithm = token

   The BNF for "parameter" is as used in [RFC7230].  All digest-
   algorithm values are case-insensitive.

   The Internet Assigned Numbers Authority (IANA) acts as a registry for
   digest-algorithm values.  The registry contains the following tokens.

   NB: This RFC updates [RFC5843], which is still delegated for all
   algorithm updates.

   SHA-256:  The SHA-256 algorithm [FIPS180-3].  The output of this
      algorithm is encoded using the base64 encoding [RFC4648].
      Reference: [FIPS180-3], [RFC4648], this document.

   SHA-512:  The SHA-512 algorithm [FIPS180-3].  The output of this
      algorithm is encoded using the base64 encoding [RFC4648].
      Reference: [FIPS180-3], [RFC4648], this document.

   MD5:  The MD5 algorithm, as specified in [RFC1321].  The output of
      this algorithm is encoded using the base64 encoding [RFC4648].

   SHA:  The SHA-1 algorithm [FIPS180-1].  The output of this algorithm
      is encoded using the base64 encoding [RFC4648].

   UNIXsum:  The algorithm computed by the UNIX "sum" command, as
      defined by the Single UNIX Specification, Version 2 [UNIX].  The
      output of this algorithm is an ASCII decimal-digit string
      representing the 16-bit checksum, which is the first word of the
      output of the UNIX "sum" command.

   UNIXcksum:  The algorithm computed by the UNIX "cksum" command, as
      defined by the Single UNIX Specification, Version 2 [UNIX].  The
      output of this algorithm is an ASCII digit string representing the
      32-bit CRC, which is the first word of the output of the UNIX
      "cksum" command.

   To allow sender and recipient to provide a checksum which is
   independent of the Content-Coding, the following additional
   algorithms are defined:

   id-sha-512:  The sha-512 digest of the representation-data of the
      resource when no content coding is applied (e.g. "Content-
      Encoding: identity").

   id-sha-256:  The sha-256 digest of the representation-data of the
      resource when no content coding is applied (e.g. "Content-
      Encoding: identity").

   If other digest-algorithm values are defined, the associated encoding
   MUST either be represented as a quoted string, or MUST NOT include
   ";" or "," in the character sets used for the encoding.

3.1.  Representation digest

   A representation digest is the value of the output of a digest
   algorithm, together with an indication of the algorithm used (and any
   parameters).

      representation-data-digest = digest-algorithm "="
                                   <encoded digest output>

   As explained in Section 2, the digest is computed on the entire
   selected "representation data" of the resource defined in [RFC7231]:

      representation-data := Content-Encoding( Content-Type( bits ) )

   The encoded digest output uses the encoding format defined for the
   specific digest-algorithm.

3.1.1.  digest-algorithm encoding examples

   The sha-256 digest-algorithm uses base64 encoding.

      sha-256=......

   The "UNIXsum" digest-algorithm uses an ASCII string of decimal
   digits.

      UNIXsum=30637

4.  Header Specifications

   The following headers are defined.

4.1.  Want-Digest

   The Want-Digest message header field indicates the sender's desire to
   receive a representation digest on messages associated with the
   Request-URI and representation metadata.

      Want-Digest = "Want-Digest" ":"
                       #(digest-algorithm [ ";" "q" "=" qvalue])

   If a digest-algorithm is not accompanied by a qvalue, it is treated
   as if its associated qvalue were 1.0.

   The sender is willing to accept a digest-algorithm if and only if it
   is listed in a Want-Digest header field of a message, and its qvalue
   is non-zero.

   If multiple acceptable digest-algorithm values are given, the
   sender's preferred digest-algorithm is the one (or ones) with the
   highest qvalue.

   Examples:

      Want-Digest: sha-256
      Want-Digest: SHA-256;q=0.3, sha;q=1

4.2.  Digest

   The Digest header field provides a digest of the representation data.

      Digest = "Digest" ":" #(representation-data-digest)

   "Representation data" might be:

   o  fully contained in the message body,

   o  partially contained in the message body,

   o  or not at all contained in the message body.

   The resource is specified by the effective Request-URI and any cache-
   validator contained in the message.

   For example, in a response to a HEAD request, the digest is
   calculated using the representation data that would have been
   enclosed in the payload body if the same request had been a GET.

   Digest can be used in requests too.  The returned value depends on
   the representation metadata headers.

   A Digest header field MAY contain multiple representation-data-digest
   values.  This could be useful for responses expected to reside in
   caches shared by users with different browsers, for example.

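   The Want-Digest and Digest field values described above can be
   sketched in Python as follows.  This is a minimal, non-normative
   illustration under stated assumptions: the function names and the
   _HASHES table are hypothetical and not part of this specification,
   only sha-256 and sha-512 are covered, and the representation data is
   assumed to be available as bytes with any content coding already
   applied.

```python
import base64
import hashlib

# Hypothetical table mapping digest-algorithm tokens to hashlib constructors.
_HASHES = {"sha-256": hashlib.sha256, "sha-512": hashlib.sha512}


def representation_digest(algorithm: str, representation_data: bytes) -> str:
    """Build one representation-data-digest: token "=" base64(hash output)."""
    token = algorithm.lower()  # digest-algorithm values are case-insensitive
    raw = _HASHES[token](representation_data).digest()
    return f"{token}={base64.b64encode(raw).decode('ascii')}"


def digest_header_value(representation_data: bytes, algorithms) -> str:
    """Join several representation-data-digest values into one field value."""
    return ", ".join(
        representation_digest(name, representation_data) for name in algorithms
    )


def parse_want_digest(field_value: str) -> dict:
    """Map each listed digest-algorithm to its qvalue (1.0 when absent)."""
    wanted = {}
    for element in field_value.split(","):
        element = element.strip()
        if not element:
            continue
        token, _, params = element.partition(";")
        name, _, value = params.partition("=")
        qvalue = 1.0
        if name.strip().lower() == "q" and value.strip():
            qvalue = float(value)
        wanted[token.strip().lower()] = qvalue
    return wanted


def acceptable_algorithms(field_value: str) -> list:
    """Algorithms with a non-zero qvalue, highest qvalue first."""
    ranked = sorted(
        parse_want_digest(field_value).items(),
        key=lambda item: item[1],
        reverse=True,
    )
    return [token for token, qvalue in ranked if qvalue > 0]
```

   A server following this sketch could call
   acceptable_algorithms("SHA-256;q=0.3, sha;q=1") to learn that the
   sender prefers sha over sha-256, and then pass whichever algorithms
   it supports to digest_header_value().  The parser deliberately
   ignores parameters other than "q" and does not validate token syntax.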
   A recipient MAY ignore any or all of the representation-data-digests
   in a Digest header field.

   A sender MAY send a representation-data-digest using a digest-
   algorithm without knowing whether the recipient supports the digest-
   algorithm, or even knowing that the recipient will ignore it.

   ...

5.  Deprecate Negotiation of Content-MD5

   This RFC deprecates the negotiation of Content-MD5, as this header
   has been obsoleted by [RFC7231].

   The MD5 algorithm is NOT RECOMMENDED, as it is now vulnerable to
   collision attacks [CMU-836068].

6.  Examples

6.1.  Unsolicited Digest response

6.1.1.  Representation data is fully contained in the payload

   Request:

     GET /items/123

   Response:

     HTTP/1.1 200 OK
     Content-Type: application/json
     Content-Encoding: identity
     Digest: sha-256=X48E9qOokqqrvdts8nOJRJN3OWDUoyWxBf7kbu9DBPE=

     {"hello": "world"}

6.1.2.  Representation data is not contained in the payload

   Request:

     HEAD /items/123

   Response:

     HTTP/1.1 200 OK
     Content-Type: application/json
     Content-Encoding: identity
     Digest: sha-256=X48E9qOokqqrvdts8nOJRJN3OWDUoyWxBf7kbu9DBPE=

6.1.3.  Representation data is partially contained in the payload i.e.
        range request

   Request:

     GET /items/123
     Range: bytes=1-7

   Response:

     HTTP/1.1 206 Partial Content
     Content-Type: application/json
     Content-Encoding: identity
     Content-Range: bytes 1-7/18
     Digest: sha-256=X48E9qOokqqrvdts8nOJRJN3OWDUoyWxBf7kbu9DBPE=

     "hello"

6.1.4.  Digest in both Request and Response.  Returned value depends on
        representation metadata

   Digest can be used in requests too.  The returned value depends on
   the representation metadata headers.

   Request:

     PUT /items/123
     Content-Type: application/json
     Content-Encoding: identity
     Accept-Encoding: br
     Digest: sha-256=X48E9qOokqqrvdts8nOJRJN3OWDUoyWxBf7kbu9DBPE=

     {"hello": "world"}

   Response:

     Content-Type: application/json
     Content-Encoding: br
     Digest: sha-256=4REjxQ4yrqUVicfSKYNO/cF9zNj5ANbzgDZt3/h3Qxo=

     b'\x8b\x08\x80{"hello": "world"}\x03'

6.2.  Want-Digest solicited digest responses

6.2.1.  Client request data is fully contained in the payload

   The client requests a digest, preferring sha.  The server is free to
   reply with sha-256 anyway.

   Request:

     GET /items/123
     Want-Digest: sha-256;q=0.3, sha;q=1

   Response:

     HTTP/1.1 200 OK
     Content-Type: application/json
     Content-Encoding: identity
     Digest: sha-256=X48E9qOokqqrvdts8nOJRJN3OWDUoyWxBf7kbu9DBPE=

     {"hello": "world"}

6.2.2.  A client requests an unsupported Digest, the server MAY reply
        with an unsupported digest

   The client requests a sha digest only.  The server is currently free
   to reply with a Digest containing an unsupported algorithm.

   Request:

     GET /items/123
     Want-Digest: sha;q=1

   Response:

     HTTP/1.1 200 OK
     Content-Type: application/json
     Content-Encoding: identity
     Digest: sha-256=X48E9qOokqqrvdts8nOJRJN3OWDUoyWxBf7kbu9DBPE=

     {"hello": "world"}

6.2.3.  A client requests an unsupported Digest, the server MAY reply
        with a 400

   The client requests a sha digest; the server advises sha-256 and
   sha-512.

   Request:

     GET /items/123
     Want-Digest: sha;q=1

   Response:

     HTTP/1.1 400 Bad Request
     Want-Digest: sha-256, sha-512

   ...

7.  Security Considerations

7.1.  Usage in signatures

   Digital signatures are widely used together with checksums to
   identify the origin of a message with certainty [NIST800-32].

   Because the Digest header field is a hash of a resource
   representation, signing only the "Digest" header field, without all
   the representation metadata (e.g. the values of "Content-Type" and
   "Content-Encoding"), may expose the communication to tampering.

7.2.  Message Truncation

   ...

7.3.  Algorithm Agility

   ...

8.  IANA Considerations

8.1.  The "id-sha-256" Digest Algorithm

   This memo registers the "id-sha-256" digest algorithm in the HTTP
   Digest Algorithm Values [4] registry:

   o  Digest Algorithm: id-sha-256

   o  Description: As specified in Section 3.

8.2.  The "id-sha-512" Digest Algorithm

   This memo registers the "id-sha-512" digest algorithm in the HTTP
   Digest Algorithm Values [5] registry:

   o  Digest Algorithm: id-sha-512

   o  Description: As specified in Section 3.

8.3.  Want-Digest Header Field Registration

   This section registers the "Want-Digest" header field in the
   "Permanent Message Header Field Names" registry ([RFC3864]).

   Header field name:  "Want-Digest"

   Applicable protocol:  http

   Status:  standard

   Author/Change controller:  IETF

   Specification document(s):  Section 4.1 of this document

8.4.  Digest Header Field Registration

   This section registers the "Digest" header field in the "Permanent
   Message Header Field Names" registry ([RFC3864]).

   Header field name:  "Digest"

   Applicable protocol:  http

   Status:  standard

   Author/Change controller:  IETF

   Specification document(s):  Section 4.2 of this document

9.  References

9.1.  Normative References

   [CMU-836068]
              Carnegie Mellon University, Software Engineering
              Institute, "MD5 Vulnerable to collision attacks",
              December 2008.

   [FIPS180-1]
              Department of Commerce, National Institute of Standards
              and Technology, "NIST FIPS 180-1, Secure Hash Standard",
              April 1995.

   [FIPS180-3]
              Department of Commerce, National Institute of Standards
              and Technology, "NIST FIPS 180-3, Secure Hash Standard",
              October 2008.

   [FIPS180-4]
              Department of Commerce, National Institute of Standards
              and Technology, "NIST FIPS 180-4, Secure Hash Standard",
              March 2012.

   [NIST800-32]
              Department of Commerce, National Institute of Standards
              and Technology, "Introduction to Public Key Technology and
              the Federal PKI Infrastructure", February 2001.

   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
              DOI 10.17487/RFC1321, April 1992.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3230]  Mogul, J. and A. Van Hoff, "Instance Digests in HTTP",
              RFC 3230, DOI 10.17487/RFC3230, January 2002.

   [RFC3864]  Klyne, G., Nottingham, M., and J. Mogul, "Registration
              Procedures for Message Header Fields", BCP 90, RFC 3864,
              DOI 10.17487/RFC3864, September 2004.

   [RFC4648]  Josefsson, S., "The Base16, Base32, and Base64 Data
              Encodings", RFC 4648, DOI 10.17487/RFC4648, October 2006.

   [RFC5789]  Dusseault, L. and J. Snell, "PATCH Method for HTTP",
              RFC 5789, DOI 10.17487/RFC5789, March 2010.

   [RFC5843]  Bryan, A., "Additional Hash Algorithms for HTTP Instance
              Digests", RFC 5843, DOI 10.17487/RFC5843, April 2010.

   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Message Syntax and Routing",
              RFC 7230, DOI 10.17487/RFC7230, June 2014.

   [RFC7231]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
              Protocol (HTTP/1.1): Semantics and Content", RFC 7231,
              DOI 10.17487/RFC7231, June 2014.

   [RFC7233]  Fielding, R., Ed., Lafon, Y., Ed., and J. Reschke, Ed.,
              "Hypertext Transfer Protocol (HTTP/1.1): Range Requests",
              RFC 7233, DOI 10.17487/RFC7233, June 2014.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [UNIX]     The Open Group, "The Single UNIX Specification, Version 2
              - 6 Vol Set for UNIX 98", February 1997.

9.2.  Informative References

   [RFC2818]  Rescorla, E., "HTTP Over TLS", RFC 2818,
              DOI 10.17487/RFC2818, May 2000.

   [RFC5788]  Melnikov, A. and D. Cridland, "IMAP4 Keyword Registry",
              RFC 5788, DOI 10.17487/RFC5788, March 2010.

   [RFC6962]  Laurie, B., Langley, A., and E. Kasper, "Certificate
              Transparency", RFC 6962, DOI 10.17487/RFC6962, June 2013.

   [RFC7396]  Hoffman, P. and J. Snell, "JSON Merge Patch", RFC 7396,
              DOI 10.17487/RFC7396, October 2014.

   [SRI]      Akhawe, D., Braun, F., Marier, F., and J. Weinberger,
              "Subresource Integrity", n.d.

9.3.  URIs

   [1] https://lists.w3.org/Archives/Public/ietf-http-wg/

   [2] https://github.com/ioggstream/draft-polli-resource-digests-http

   [3] https://tools.ietf.org/html/rfc7231#section-3.1.2.1

   [4] https://www.iana.org/assignments/http-dig-alg/http-dig-alg.xhtml

   [5] https://www.iana.org/assignments/http-dig-alg/http-dig-alg.xhtml

Appendix A.  Acknowledgements

   The vast majority of this document is inherited from [RFC3230], so
   thanks to J. Mogul and A. Van Hoff for their great work.  The
   original idea of refreshing this document arose from an interesting
   discussion with M. Nottingham, J. Yasskin and M. Thomson when
   reviewing the MICE Content Encoding.

Appendix B.  FAQ

   1.  Why remove all references to content-md5?

       Those were unnecessary to understanding and using this spec.

   2.  Why remove references to instance manipulation?

       Unnecessary again for correctly using and applying the spec.  An
       example with Range Request is more than enough.

   3.  How to use "Digest" with the "PATCH" method?

       The PATCH verb brings some complexities (e.g. about
       representation metadata headers, patch document format, ...):

       *  PATCH entity-headers apply to the patch document and MUST NOT
          be applied to the target resource, see [RFC5789], Section 2.

       *  servers shouldn't assume PATCH semantics for generic media
          types like "application/json" but should instead use a proper
          content-type, e.g. [RFC7396].

       *  a "200 OK" response to a PATCH request would contain the
          digest of the patched item, and the etag of the new object.
          This behavior - tightly coupled to the application logic -
          gives the client a low probability of guessing the actual
          outcome of this operation (e.g. concurrent changes, ...).

Authors' Addresses

   Roberto Polli
   Team Digitale

   Email: robipolli@gmail.com


   Lucas Pardue

   Email: lucaspardue.24.7@gmail.com