<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?>  <!-- Required for schema validation and schema-aware editing -->
<!-- <?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?> -->
<!-- This third-party XSLT can be enabled for direct transformations in XML processors, including most browsers -->
<!DOCTYPE rfc [
  <!ENTITY nbsp    "&#160;">
  <!ENTITY zwsp   "&#8203;">
  <!ENTITY nbhy   "&#8209;">
  <!ENTITY wj     "&#8288;">
]>
<!-- If further character entities are required then they should be added to the DOCTYPE above.
     Use of an external entity file is not recommended. -->
<rfc
  xmlns:xi="http://www.w3.org/2001/XInclude"
  category="info"
  docName="draft-nurpmeso-dkim-access-control-diff-changes-06"
  ipr="trust200902"
  obsoletes=""

  updates="6376"

  submissionType="IETF"
  xml:lang="en"
  version="3">
<!--
     [CHECK]
       * category should be one of std, bcp, info, exp, historic
       * ipr should be one of trust200902, noModificationTrust200902,
         noDerivativesTrust200902, pre5378Trust200902
       * updates can be an RFC number as NNNN
       * obsoletes can be an RFC number as NNNN
-->
  <front>

   <title>DKIM Access Control and Differential Changes</title>

   <seriesInfo name="Internet-Draft" value="draft-nurpmeso-dkim-access-control-diff-changes-06"/>

    <author fullname="Steffen Nurpmeso" initials="S" role="editor" surname="Nurpmeso">
      <address><email>steffen@sdaoden.eu</email></address>
    </author>

    <date year="2025" month="07" day="07"/>

    <area>General</area>
    <workgroup>Internet Engineering Task Force</workgroup>

    <keyword>DKIM</keyword>

    <abstract><t>
      This document specifies a
      DKIM (RFC 6376)
      extension that allows cryptographic verification of
      SMTP (RFC 5321)
      envelope data,
      and of DKIM signatures prior to
      IMF (RFC 5322)
      message content changes along the message path,
      thus addressing security glitches,
      and offering a new world of email solutions that move complexity
      away from lower network layers,
      where problems cannot be solved.
      It updates DKIM to obsolete certain aspects that reality has
      proven to be superfluous, incomplete, or obsolete.
      It is the future of email for email of the future.
    </t></abstract>

  </front>
  <middle>

    <section>
      <name>Introduction</name>
      <t>
        DKIM<xref target="RFC6376"/>
        was not designed to cover
        SMTP<xref target="RFC5321"/>
        envelope data, allowing replay of valid, verifiable messages
        to an infinite set of recipients by malicious third parties,
        undetectable by sender and recipients.

        (To aid SMTP delivery to recipients in various conditions the
        optional "x=" expiration tag timestamp must be chosen so far
        in the future that malicious players have plenty of time to
        misuse messages.)
      </t><t>
        Whereas
        DKIM<xref target="RFC6376"/>
        standardized rudimentary, incomplete approaches to undo
        modifications of
        IMF<xref target="RFC5322"/>
        message content that happen along the message path,
        it was agreed that the overall design would not survive them
        (compare, for example,
        <xref target="RFC6377"/>).

        The resulting paradigm of DKIM is
        "as long as one signature can be verified cryptographically,
        DKIM verification will succeed".

        This is problematic as message content changes may be falsely
        attributed to (the) address(es) in the IMF originator field(s).

        (Later policy-enforcing standards effectively complicated the
        situation, in that false attribution may now technically be
        avoidable, but mitigations like "user A via B" will still be
        attributed to "A" by a human for one, and, in short, anything
        is valid if one DKIM signature is.)
      </t><t>
        Potentially many DKIM signatures may exist in a message.
        DKIM<xref target="RFC6376"/>
        gives hints on how verification can be performed,
        but, in practice, mitigations are applied in order to reduce
        excessive and useless verifications on hops down the message
        path: older signatures are removed, or renamed, as changes
        are performed on message content, for example, by mailing-lists.

        An approach to avoid excessive network traffic and CPU work
        during message verification mitigates careless configurations.
      </t><t>
        The presented ACDC extension addresses these and more issues
        in a backward and forward compatible way; it is easy to adopt,
        and easy to integrate into the existing infrastructure.
      </t>

      <section>
        <name>Conventions and Terminology</name>
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL",
          "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT
          RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
          interpreted as described in BCP 14 <xref target="RFC2119"/>
          <xref target="RFC8174"/> when, and only when, they appear in
          all capitals, as shown here.</t>
        <t>
          Where the text below speaks of message rejection ("REJECT"),
          implementations may choose to instead move messages into a
          spam or quarantine state.
        </t><t>
          The term "FOSS" refers to Free and Open Source Software.
        </t>
      </section>
    </section>

    <section>
      <name>DKIM ACDC</name>
      <t>
        The
        DKIM<xref target="RFC6376"/>
        extension Access Control and Differential Changes:
      </t><ul>
        <li>
          Places DKIM signatures in an ordered, numbered,
          randomly accessible sequence whose states correlate.

          Identical DKIM signatures generated at the same hop,
          but which differ in only the used algorithm,
          share, however, a sequence number.

          With ACDC it can be, and usually is, sufficient to verify
          only the cheaply detectable highest numbered signature.
        </li><li>
          Adds reversible data difference tracking,
          and as such supports cryptographic content verification
          of any (ACDC-aware) intermediate message state,
          up to the initial variant as sent by the originator.
        </li><li>
          Cryptographically protects the
          SMTP<xref target="RFC5321"/>
          envelope, that is,
          <tt>RCPT TO</tt> addresses
          as well as the
          <tt>MAIL FROM</tt>
          address.

          Replay of valid messages to recipients not initially addressed,
          as well as backscatter bounces to random addresses instead of
          the originator, become detectable.
        </li><li>
          Allows cryptographically verifiable collection of statistics of
          organizational trust (<xref target="RFC5863"/>, section 2.5)
          along the entire message path.
        </li><li>
          Allows recognition of certain flagged conditions (along the
          message path) only by looking at the highest numbered signature.
        </li>
      </ul><t>
        The
        DKIM<xref target="RFC6376"/>
        extension Access Control and Differential Changes
        is announced by adding an "acdc=" tag to the DKIM-Signature.
        (For efficiency reasons it <bcp14>SHOULD</bcp14> be placed
        early, before tags like "h=", "bh=" and "b=", for example.)
      </t><t>
        The tag starts with "sequence",
        a decimal number starting at 1,
        or incremented by 1 from the highest ACDC sequence number
        encountered in the message;
        the maximum value is 999:
        if incrementing would result in overflow,
        the message <bcp14>MUST</bcp14> be rejected;
        detected sequence holes <bcp14>MUST</bcp14> also cause
        rejection (but see below);
        in both cases
        SMTP<xref target="RFC5321"/>
        reply code 550 is to be used; if
        enhanced status codes<xref target="RFC3463"/>
        are used, 5.5.4 <bcp14>MUST</bcp14> be used.
      </t><t>
        Flag description is normative.
        (Note the missing <tt>FWS</tt> separators around <tt>=</tt>.)
        ABNF<xref target="RFC5234"/>:
      </t><sourcecode><![CDATA[
acdc = %x61.63.64.63 "=" sequence ":" 1*(flag) ":" [id] ":"
sequence = 1*3DIGIT; DIGIT from RFC 5234
flag = "D" / "E" / "I" / "O" / "P" / "R" / "S" / "s" /
       "V" / "v" / "X" / "x" / "Y" / "y" / "Z" / "z"
id = *42(ALPHA / DIGIT / "+" / "-"); optional (bounce) identifier
      ]]></sourcecode><dl>
        <dt>D</dt><dd><t>
          The message was modified at this hop,
          ACDC differential changes were generated,
          and are stored in a DKIM-Diff: header field.
          </t><t>
          The "Y" flag has to be set.
        </t></dd><dt>E</dt><dd><t>
          The
          SMTP<xref target="RFC5321"/>
          envelope (<tt>MAIL FROM</tt> and/or <tt>RCPT TO</tt>)
          was modified.
          A new "Access Control" (see there) evaluation has been performed.
        </t><t>
          The "O" flag has to be set if the <tt>MAIL FROM</tt> changed.
          The "y" flag has to be set.
        </t></dd><dt>I</dt><dd><t>
          This DKIM-Signature: header field was generated at ingress:
          should the message leave the host again via egress,
          the field is removed.
          The purpose of such a field is that its flags can be used
          to query the verification state of the message.
          (Also see the "R", "S", "s", "V" and "v" flags.)
        </t></dd><dt>O</dt><dd><t>
          This hop claims the message origin.
        </t><t>
          This either means that the message originated at this hop,
          in which case the signature (usually, DKIM-typical) refers to
          the first address of the From header field,
          and the sequence number is 1.
        </t><t>
          It can also mean that the current hop was the, quoting
          <xref target="RFC3461"/>,
          <em>"final delivery for the [original] message"</em>,
          that the message got a <em>"new envelope return address"</em>,
          that is, the <tt>MAIL FROM</tt> of the SMTP envelope was changed.

          In this case the "E" flag has to be set.
        </t><t>
          A new "Access Control" (see there) evaluation has been performed.
        </t></dd><dt>P</dt><dd><t>
          Postmaster mode.
          With this flag set the behaviour of ACDC borders on test
          mode in that rejections must not occur (due to ACDC).
          This is to allow for a communication possibility window in
          a situation where messages would always be rejected, due to
          misconfigurations et cetera, and as such reflects
          SMTP<xref target="RFC5321"/>
          section 4.5.1 Minimum Implementation.
        </t><t>
          (If, due to some failure, the maximum sequence number would be
          exceeded by such a message, the sequence increment shall not
          be performed, even if it makes the message "more invalid".
          Implementations necessarily count the number of ACDC
          instances, and may impose an absolute maximum in order to
          avoid endless message wandering aka "loops" nonetheless.)
        </t><t>
          If the sequence number is 1,
          message recipients have to be inspected.
          If the
          IMF<xref target="RFC5322"/>
          header fields To: and Cc: only contain a single addressee with
          the local part
          postmaster<xref target="RFC1123"/>,
          and if the same "postmaster" is addressed as an
          SMTP<xref target="RFC5321"/>
          <tt>RCPT TO</tt> recipient,
          and if no more than two <tt>RCPT TO</tt> recipients exist in
          total,
          then the "P" flag has to be set.
        </t><t>
          Once set, all future DKIMACDC signatures must copy it.
          (It may be removed by a signature which claims a new
          message origin by setting the "O" flag.)
        </t></dd><dt>R</dt><dd><t>
          Reputation check to collect
          organizational trust (<xref target="RFC5863"/>, section 2.5)
          along the signature chain was performed.
        </t><t>
          On top of the "V" flag this means that all differential changes
          have been applied,
          and all signatures along the chain have been verified,
          and the entire chain validated correctly.
        </t><t>
          Only in signatures with sequence numbers greater than 1,
          and without the "Z" or "z" flags (in earlier signatures).
        </t></dd><dt>S</dt><dd><t>
          Only in conjunction with the "I" flag.
          Upon ingress the
          SPF<xref target="RFC7208"/>
          state was successfully verified.
        </t></dd><dt>s</dt><dd><t>
          Only in conjunction with the "I" flag.
          Upon ingress the
          SPF<xref target="RFC7208"/>
          verification failed.
        </t></dd><dt>V</dt><dd><t>
          ACDC signature verified successfully.
        </t><t>
          This means that the signature with the highest sequence number
          has been verified correctly,
          that the sequence of ACDC signatures is complete,
          and their flags make sense (in the sequence).
          In conjunction with the flag "R" even deeper inspection was
          performed.
        </t><t>
          Only in signatures with sequence numbers greater than 1.
        </t></dd><dt>v</dt><dd><t>
          DKIM signature verified successfully.
        </t><t>
          In signatures with sequence number 1,
          then missing the "O" flag,
          it means the message originated at a non-ACDC-aware host,
          and normal DKIM processing was performed and succeeded.
          If DKIM processing succeeded for the DKIM signature which
          covered the message's From: header field address,
          the "Z" flag must be set, otherwise the "z" flag.
        </t><t>
          In messages with higher sequence numbers it comes alongside
          the "X" flag: necessarily the ACDC chain was broken, and the
          message changed, by an intermediate non-ACDC-aware hop.
          The "z" flag must be set.
        </t></dd><dt>X</dt><dd><t>
          DKIMACDC verification failed;
          however, the normal DKIM signature verification was performed,
          and succeeded.
        </t><t>
          The "z" flag must be set.
        </t></dd><dt>x</dt><dd><t>
          DKIM verification failed.
        </t><t>
          In signatures with sequence number 1,
          then missing the "O" flag,
          it means the message originated at a non-ACDC-aware host,
          and normal DKIM processing was performed and failed.
          The "z" flag must be set.
        </t><t>
          In messages with higher sequence numbers it comes alongside
          the "X" flag: necessarily the ACDC chain was broken, and the
          message changed, by an intermediate non-ACDC-aware hop.
          The "z" flag must be set.
        </t></dd><dt>Y</dt><dd><t>
          The message has seen
          IMF<xref target="RFC5322"/>
          modifications:
          somewhere along the chain the original message data was modified.
          Once set, all future ACDC signatures must copy it.
        </t></dd><dt>y</dt><dd><t>
          The message has seen
          SMTP<xref target="RFC5321"/>
          envelope modifications:
          somewhere along the chain the original envelope was modified.
          Once set, all future ACDC signatures must copy it.
        </t></dd><dt>Z</dt><dd><t>
          Announces the ACDC chain is incomplete.
          The message was processed by ACDC unaware hops.
          However, the message verifies correctly and seems to have
          never been modified non-reversibly.
          Once set, all future DKIMACDC signatures must copy it,
          unless later downgraded to the "z" flag.
        </t></dd><dt>z</dt><dd><t>
          The message has seen non-reversible modifications,
          and cannot be cryptographically verified back to its origin.
          Once set, all future DKIMACDC signatures must copy it.
        </t><t>
          If this flag is set ACDC loses its decisive meaning
          and "degrades" to normal DKIM:
          no more differential data is generated,
          and messages are distributed further / accepted if just any
          DKIM(ACDC) signature verifies.
          (Software configuration <bcp14>MAY</bcp14> allow otherwise.)
        </t></dd><dt>id</dt><dd><t>
          The optional "bounce identifier" offers enough room to store
          Universally Unique IDentifiers<xref target="RFC9562"/>.
        </t><t>
          It <bcp14>MAY</bcp14> be generated to help sending domains
          to uniquely identify messages within the DKIM "t=" and "x="
          time delta, as well as to ensure that successively sent
          identical messages are not detected as the same.
        </t><t>
          Receiving domains should not use this identifier due to the
          denial of service attack surface,
          regardless of collected organizational trust (see R flag).
        </t></dd>
      </dl><t>
        Unknown flags <bcp14>MUST</bcp14> be ignored.
        Invalid flag combinations and flag misuse <bcp14>MUST</bcp14>
        result in rejection with SMTP reply code 550; if
        enhanced status codes<xref target="RFC3463"/>
        are used, 5.5.4 <bcp14>MUST</bcp14> be used.
        (This includes the "P" flag upon incorrect use.)
      </t>
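      <t>
        As a non-normative illustration, parsing and checking of the
        "acdc=" tag value might be sketched in Python as follows.
        All names (<tt>parse_acdc</tt>, <tt>check_flags</tt>,
        <tt>next_sequence</tt>, <tt>AcdcReject</tt>) are hypothetical.
        The sketch accepts any letter in flag position in order to
        be able to ignore unknown flags, which is an assumption,
        it checks only a subset of the flag dependencies listed above,
        and it ignores the exceptions that apply to sequence holes
        and to the "P" flag.
      </t><sourcecode type="python"><![CDATA[
```python
import re

KNOWN_FLAGS = set("DEIOPRSsVvXxYyZz")
# sequence ":" 1*flag ":" [id] ":"; any letter is accepted as a
# flag so that unknown flags can be ignored (an assumption).
ACDC_RE = re.compile(
    r"([0-9]{1,3}):([A-Za-z]+):([A-Za-z0-9+-]{0,42}):")
# Flag -> flag that "has to be set" alongside it (a subset).
REQUIRES = {"D": "Y", "E": "y", "S": "I", "s": "I",
            "X": "z", "x": "z"}


class AcdcReject(Exception):
    """Hypothetical: the caller turns this into a rejection."""


def parse_acdc(value):
    """Split an acdc= tag value into (sequence, flags, id)."""
    m = ACDC_RE.fullmatch(value)
    if m is None or int(m.group(1)) < 1:
        raise AcdcReject("malformed acdc tag value")
    flags = set(m.group(2)) & KNOWN_FLAGS  # drop unknown flags
    return int(m.group(1)), flags, m.group(3)


def check_flags(seq, flags):
    """Check a subset of the flag dependencies stated above."""
    for flag, needed in REQUIRES.items():
        if flag in flags and needed not in flags:
            raise AcdcReject(flag + " requires " + needed)
    if seq == 1 and flags & {"R", "V"}:
        raise AcdcReject("R and V need a sequence number above 1")


def next_sequence(seen):
    """Return the sequence number for a new signature."""
    distinct = sorted(set(seen))  # algorithms may share a number
    if distinct != list(range(1, len(distinct) + 1)):
        raise AcdcReject("sequence hole")
    if len(distinct) + 1 > 999:
        raise AcdcReject("sequence overflow")
    return len(distinct) + 1
```
      ]]></sourcecode><t>
        A caller would map <tt>AcdcReject</tt> to the SMTP reply
        described above.
      </t>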
    </section>

    <section>
      <name>The DKIM-Store header field</name>
      <t>
        The DKIM-Store header field has no meaning in the email system.
        The sole purpose of mentioning it is to announce that it
        <bcp14>MUST</bcp14> be removed when messages enter
        and leave the email system.

        It could for example be temporarily created and used by
        non-integrated mail filter (milter) software to pass
        informational data in between the "ingress" and the "egress"
        processing side.
        To guard against software bugs and possible configuration errors
        this specification enforces removal of all occurrences.

        It is suggested to encrypt data passed around in this temporary
        header field with a key internal to the "local" email processing
        system in order to achieve locality.
      </t>
    </section>

    <section>
      <name>Access Control</name>
      <t>
        SMTP delivers messages to individual domains.
        With ACDC, when an SMTP envelope was created or changed,
        all distinct domain-names found within the list of intended
        SMTP <tt>RCPT TO</tt> addressees are collected,
        as the message needs to be forged on a per-domain basis:
        ACDC will create DKIM-AC: header fields covering SMTP envelopes,
        and include them as messages are sent to individual domains.
      </t><t>
        The domains' _dkimacdc DNS entries, as below, are queried.
        Any domain that announces ACDC support can be served by a single
        message for all recipients (possible limits aside).

        For other domains, to guarantee anonymity, it is necessary to
        differentiate between public recipients in the To: and Cc:
        header fields, and private recipients in the Bcc: header field.

        <em>Remarks:</em> quality-of-service: for simplicity messages
        may always be forged on a per-recipient basis, individually.
      </t><t>
        In any case the completely prepared message,
        including the readily prepared DKIM-Signature(s), is forged,
        a DKIM-AC: header field is generated which covers the logical
        recipient subset, and the resulting message is then sent.
      </t><t>
        ACDC aware recipient domains are expected to manage a message
        DKIM-AC: identity cache to mitigate replay attacks.
        (Hint: the DKIM-AC: signature seems like a natural cache
        key source, see below.)
      </t><t>
        The DKIM "x=" tag <bcp14>MUST</bcp14> be used to place a
        lifetime constraint when creating signatures,
        to allow finite identity cache sizes.

        The maximum "t=" to "x=" delta <bcp14>MUST NOT</bcp14> be
        greater than 864000 seconds (ten days: to reach into the next
        working week).
        Example delta values for tag auto-generation may be the bounce
        defaults 432000 seconds (five days: used for example by the
        Mailman2 and mlmmj mailing-list managers and the postfix MTA),
        345600 seconds (four days: OpenSMTPD MTA),
        172800 seconds (two days: Exim MTA).
      </t><t>
        To keep the identity cache a write-once data structure, ACDC
        senders <bcp14>MUST NOT</bcp14> generate DKIM-AC: header fields
        with more than half of the 100 recipients that
        SMTP<xref target="RFC5321"/>
        section 4.5.3.1.8 guarantees as a minimum,
        unless a
        DNSSEC<xref target="RFC4033"/><xref target="RFC4034"/><xref target="RFC4035"/>
        protected _dkimacdc DNS entry, as below, announces a limit.

        If more recipients need to be addressed on a single domain,
        multiple message forges with recipient subsets must be generated:
        like this each message forge is "atomic",
        and the DKIM-AC: header field covers the entire SMTP envelope.

        SMTP MTAs of domains which announce ACDC <bcp14>MUST</bcp14>
        support at least half the minimum limit required by
        SMTP<xref target="RFC5321"/>
        (section 4.5.3.1.8).
      </t><blockquote>
        <em>Informative remark:</em>
        Implementations <bcp14>MAY</bcp14> offer configuration options
        to specify other (higher, lower) recipient limits.
        Like this the much higher limits in actual use
        (for example, the Exim MTA default is 50000)
        can be utilized.
      </blockquote><t>
        An ACDC aware recipient domain that receives an "acdc=" tagged
        message without a DKIM-AC: header field <bcp14>MUST</bcp14>
        reject the message with SMTP reply code 550; if
        enhanced status codes<xref target="RFC3463"/>
        are used, 5.5.4 <bcp14>MUST</bcp14> be used.

        It <bcp14>MUST</bcp14> likewise fail if the DKIM-AC: header field
        does not cover the SMTP envelope data.
        (It <bcp14>SHOULD</bcp14> test for a superset of recipients,
        and only fail if an envelope recipient is not included in the
        DKIM-AC: header field.)

        It <bcp14>MUST</bcp14> reject messages which fail the signature
        check of a DKIM-AC: or DKIM-Signature: header field, or the
        condition and flag check verification, with SMTP reply code 550;
        the enhanced status code <bcp14>MUST</bcp14> be 5.7.7.

        Senders <bcp14>MAY</bcp14> use
        Delivery Status Notifications<xref target="RFC3461"/>
        to fine-tune the resulting behaviour.
      </t>
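      <t>
        As a non-normative illustration, the lifetime bound and the
        per-domain splitting of recipients described above might be
        sketched in Python as follows.
        All names (<tt>lifetime_ok</tt>, <tt>plan_forges</tt>,
        <tt>announced_limits</tt>) are hypothetical.
        The sketch assumes that <tt>announced_limits</tt> only holds
        "rl=" values that were obtained from DNSSEC protected
        _dkimacdc entries, and it does not differentiate public from
        private (Bcc:) recipients for domains without ACDC support.
      </t><sourcecode type="python"><![CDATA[
```python
from collections import defaultdict

MAX_LIFETIME = 864000  # ten days: upper bound of x= minus t=
DEFAULT_LIMIT = 50     # half of the 100 recipients of RFC 5321


def lifetime_ok(t, x):
    """True if x= lies after t=, and at most ten days after it."""
    return 0 < x - t <= MAX_LIFETIME


def plan_forges(rcpt_to, announced_limits=None):
    """Return a list of (domain, recipients) message forges.

    announced_limits maps a domain to its rl= value (assumption:
    only DNSSEC validated values are passed in).
    """
    limits = announced_limits or {}
    by_domain = defaultdict(list)
    for addr in rcpt_to:
        domain = addr.rpartition("@")[2].lower()
        by_domain[domain].append(addr)
    forges = []
    for domain, rcpts in by_domain.items():
        # A value of 0 equals 1.
        limit = max(1, limits.get(domain, DEFAULT_LIMIT))
        for i in range(0, len(rcpts), limit):
            forges.append((domain, rcpts[i:i + limit]))
    return forges
```
      ]]></sourcecode><t>
        With the default limit, 120 recipients of one domain would
        result in three forges of 50, 50 and 20 recipients, each
        carrying its own DKIM-AC: header field.
      </t>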

      <section>
        <name>The DKIM-AC header field</name>
        <t>
          The syntax of this header field is the usual semicolon
          separated list of DKIM-style tags of unspecified order;
          unknown tags <bcp14>MUST</bcp14> be ignored.
          It is used to cryptographically link the SMTP envelope
          to the sent IMF mail message.

          The "sn=" tag is the linked DKIM-Signature sequence number,
          best placed early.
          Multiple signatures with the same sequence number,
          but different algorithms may exist,
          and so may DKIM-AC header fields.

          The selector of the linked signature is given by the "s=" tag,
          the used algorithm can be deduced from there.

          The "dr=" tag value is the recipient domain.

          The "mf=" tag is the
          SMTP<xref target="RFC5321"/>
          <tt>MAIL FROM</tt>
          of the covered message, the complete addr-spec,
          whereas "rt=" tag(s) contain only the local-parts of
          <tt>RCPT TO</tt>s.
          (<em>Warning:</em>
          SMTP<xref target="RFC5321"/>
          address local-parts permit quoted-strings.)

          Mirroring DKIM-Signature the tag list is concluded with the
          "b=" tag that is the cryptographic signature data of the
          DKIM-AC: header field.
          However, the reassembled (see
          DKIM<xref target="RFC6376"/>,
          section 3.5) "b=" value of the linked DKIM-Signature is
          "virtually assigned", and included when creating the
          cryptographic signature;
          thereafter the "b=" tag is assigned its own value.
        </t><t>
          All instances of DKIM-AC: header fields <bcp14>MUST</bcp14>
          be removed by ACDC-aware software as soon as possible;
          they <bcp14>MUST NOT</bcp14> be delivered by local delivery
          agents as part of the message, and <bcp14>MUST NOT</bcp14>
          be part of rejected messages.

          However, if a domain is only an intermediate, which was
          neither directly addressed nor which originated the mail,
          and which does not modify the SMTP envelope either, then it
          <bcp14>MUST NOT</bcp14> remove the "current" DKIM-AC: header
          field, and it <bcp14>MUST NOT</bcp14> generate a new one.
        </t>
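        <t>
          The envelope coverage test of a receiving domain might be
          sketched as follows (non-normative, Python).
          The function names are hypothetical, and the sketch makes
          assumptions that this document does not state:
          each "rt=" tag carries exactly one local-part,
          and no local-part is a quoted-string
          (a semicolon inside a quoted-string would break the naive
          tag splitting shown here).
          Verification of the "b=" signature data is not shown.
        </t><sourcecode type="python"><![CDATA[
```python
def parse_dkim_ac(value):
    """Split a DKIM-AC: field body into tags; rt= may repeat."""
    tags = {"rt": []}
    for item in value.split(";"):
        name, sep, val = item.strip().partition("=")
        if not sep:
            continue  # empty or malformed item
        name, val = name.strip(), val.strip()
        if name == "rt":
            tags["rt"].append(val)
        else:
            tags[name] = val  # unknown tags are kept but unused
    return tags


def covers_envelope(tags, mail_from, rcpt_to):
    """True if mf=, dr= and the rt= set cover the SMTP envelope.

    The rt= set may be a superset of the envelope recipients.
    """
    if tags.get("mf") != mail_from:
        return False
    domain = tags.get("dr", "").lower()
    allowed = set(tags["rt"])
    for addr in rcpt_to:
        local, _, rdomain = addr.rpartition("@")
        if rdomain.lower() != domain or local not in allowed:
            return False
    return True
```
        ]]></sourcecode><t>
          A field body such as
          <tt>sn=1; s=sel; dr=example.org; mf=a@example.com; rt=bob; rt=carol; b=...</tt>
          (a made-up example) would cover an envelope addressed to
          bob@example.org only, but not one addressed to
          dave@example.org.
        </t>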
      </section>

      <section>
        <name>The _dkimacdc.DOMAIN DNS TXT RR</name>
        <t>
          The syntax of this DNS resource record is the usual semicolon
          separated list of DKIM-style tags of unspecified order;
          unknown tags <bcp14>MUST</bcp14> be ignored.
          However, <tt>FWS</tt> separation of tag, equal sign, and value
          is not allowed.

          The optional tag "rl=" contains an unsigned integer that asserts
          the guaranteed minimum number of recipients that may be used as
          <tt>RCPT TO</tt>s in a single transaction;
          it may be as small as 1.
          A value of 0 equals 1.

          The tags "v=" and "a=" mirror their DKIM tags,
          however, "v=" is optional,
          and none to multiple "a=" tags <bcp14>MAY</bcp14> exist:
          they indicate, in descending order, the most desirable
          algorithms for this domain.
          Senders <bcp14>SHOULD</bcp14> try to honour the first fit,
          and exclusively so if the algorithm is a well established one.
          (For example, at the time of this writing, only RSA-SHA256 meets
          this requirement, ED25519-SHA256 does not.)

          DNS CNAME chains <bcp14>MUST</bcp14> be followed when looking
          up this DNS RR.
        </t>
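        <t>
          Parsing of such a record might be sketched as follows
          (non-normative, Python; <tt>parse_dkimacdc</tt> is
          a hypothetical name).
          The sketch assumes that whitespace around the semicolon
          separators is tolerable, which this document does not state;
          items with whitespace around the equal sign are not
          recognized and thus end up ignored like unknown tags.
          The DNS lookup itself, CNAME chasing and DNSSEC validation
          are not shown.
        </t><sourcecode type="python"><![CDATA[
```python
def parse_dkimacdc(txt):
    """Parse the text of a _dkimacdc TXT RR into a dictionary."""
    rec = {"v": None, "a": [], "rl": None}
    for item in txt.split(";"):
        # No stripping around "=": FWS there is not allowed.
        name, sep, val = item.strip().partition("=")
        if not sep:
            continue
        if name == "a":
            rec["a"].append(val)  # keeps the order of preference
        elif name == "rl":
            if val.isascii() and val.isdigit():
                rec["rl"] = max(1, int(val))  # 0 equals 1
        elif name == "v":
            rec["v"] = val
        # anything else is an unknown tag and is ignored
    return rec
```
        ]]></sourcecode><t>
          The first entry of the resulting "a" list is the algorithm
          the domain desires most.
        </t>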
      </section>
    </section>

    <section>
      <name>Differential Changes</name>
      <t>
        Whenever an ACDC enabled domain detects during DKIM-Signature
        creation that the relaxed representation of a message was
        modified along its flight from ingress to egress, for example,
        when it was processed by a mailing-list which tagged the
        subject and added a message footer, a DKIM-Diff: header field
        has to be created.
      </t><blockquote>
        <em>Informative remark:</em>
        In an unbroken chain of ACDC signatures the DKIM-Diff: covered
        changes can be applied in reverse order of creation in order
        to cryptographically verify all intermediate DKIM signatures,
        back to the original version as sent by the sender.
      </blockquote>

      <section>
        <name>The DKIM-Diff header</name>
        <t>
          The syntax of this header field is the usual semicolon
          separated list of DKIM-style tags of unspecified order;
          unknown tags <bcp14>MUST</bcp14> be ignored.

          The "sn=" tag is the linked DKIM-Signature sequence number,
          best placed early.

          The "c=" tag identifies the compression method used for the
          data in "hd=" and/or "bd="; the value "z" means
          ZLIB<xref target="RFC1950"/>,
          whereas "xz" means
          <xref target="LZMA2"/>.
          ZLIB <bcp14>MUST</bcp14> be supported by signers and verifiers,
          whereas only verifiers <bcp14>MUST</bcp14> support LZMA2.
          (FOSS implementations of all compression types are available.)

          The "hd=" tag is used to store differential data for header
          fields, "bd=" that for body content.  Both tags are optional,
          but at least one exists.
          The data is the result of the BSDiff differential algorithm,
          as below, compressed with the method given in "c=", then
          BASE64<xref target="RFC4648"/>
          encoded.
        </t><blockquote>
          <em>Informative remark:</em>
          The higher cost of using
          <xref target="LZMA2"/>
          for compression could be amortized by the reduced amount of I/O.
          When using the
          <xref target="BSDIPA"/>
          implementation as below, inspecting header data can aid
          choosing an appropriate compression algorithm.
        </blockquote><t>
          All header fields covered by the DKIM-Signature
          <bcp14>MUST</bcp14> be included,
          as <bcp14>MUST</bcp14> be all
          MIME<xref target="RFC2045"/>
          related header fields,
          regardless of their presence in the DKIM-Signature.

          All ACDC enabled DKIM-Signature:
          and DKIM-Diff: header fields <bcp14>MUST</bcp14> be included.

          Other than that the advice of
          DKIM<xref target="RFC6376"/>,
          section 5.4.1, on recommended signature content, still applies,
          but is hereby extended with the
          Author Header Field<xref target="RFC9057"/>.
        </t><t>
          ACDC aware software is urged to "oversign" aka "seal", that
          is, to sign fields that are not present at the time of signing,
          as DKIM describes it, in order to protect against message
          modifications.

          Since only the newest DKIM-Signature is checked, and
          modifications can be undone, messages should be protected
          as much as possible.
        </t>
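        <t>
          The transport encoding of the "hd=" and "bd=" tag values,
          compression as named by "c=" followed by BASE64, might be
          sketched as follows (non-normative, Python standard library).
          The function names are hypothetical.
          It is an assumption of this sketch that "xz" denotes the
          .xz container format carrying LZMA2 data, as produced by
          Python's <tt>lzma</tt> module, and that whitespace inside
          the tag value is to be skipped.
          The BSDiff patch itself is treated as opaque bytes.
        </t><sourcecode type="python"><![CDATA[
```python
import base64
import lzma
import zlib


def encode_diff(patch, method="z"):
    """Compress patch bytes per c=, then BASE64 encode them."""
    if method == "z":
        packed = zlib.compress(patch)  # RFC 1950 format
    elif method == "xz":
        packed = lzma.compress(patch, format=lzma.FORMAT_XZ)
    else:
        raise ValueError("unknown compression method")
    return base64.b64encode(packed).decode("ascii")


def decode_diff(value, method):
    """Reverse encode_diff(); whitespace in value is skipped."""
    packed = base64.b64decode("".join(value.split()))
    if method == "z":
        return zlib.decompress(packed)
    if method == "xz":
        return lzma.decompress(packed, format=lzma.FORMAT_XZ)
    raise ValueError("unknown compression method")
```
        ]]></sourcecode><t>
          A verifier that processes untrusted input would additionally
          want to bound the size of the decompressed data.
        </t>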
      </section>

      <section>
        <name>The BSDiff differential algorithm</name>
        <t>
          The differential changes are created with the DKIM "relaxed"
          normalized header field and body data, respectively, as seen
          on egress, alongside the equally normalized data present
          before modifications took place, that is, on ingress.
        </t><blockquote>
          <em>Informative remark:</em>
          For non-integrated systems like mail filters for example
          the DKIM-Store: header field can be used to pass around the
          necessary data in between the ingress side that sees the
          original message,
          and the egress side which will dispatch the modified variant.
        </blockquote><t>
          The header fields <bcp14>MUST</bcp14> be sorted byte-wise
          by-value by-name, the formed subgroups <bcp14>MUST</bcp14>
          remain in the header stack order defined by
          DKIM<xref target="RFC6376"/>
          section 5.4.2, Signatures Involving Multiple Instances of a Field.
        </t><t>
          The BSDiff algorithm of Colin Percival, which has excellent
          characteristics, is then used to create a binary delta of the
          header or body lines.
        </t><blockquote>
          There is a FOSS
          <xref target="BSDIPA"/>
          plug-and-play ISO C99 and perl implementation available that
          iterated the FreeBSD operating system implementation of BSDiff,
          and includes further references on the algorithm.
        </blockquote>
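        <t>
          The preparation of header data, before the delta is created,
          might be sketched as follows (non-normative, Python).
          <tt>relax_header</tt> follows the "relaxed" header
          canonicalization of
          DKIM<xref target="RFC6376"/>
          section 3.4.2.
          <tt>sort_fields</tt> is one possible reading of the sorting
          rule above: fields are ordered by their name, comparing byte
          values, and fields of the same name keep the relative order
          in which the caller supplied them; the caller is assumed to
          supply them in the header stack order.
          Both function names are hypothetical.
        </t><sourcecode type="python"><![CDATA[
```python
import re


def relax_header(field):
    """Apply "relaxed" header canonicalization to one field."""
    name, _, value = field.rstrip("\r\n").partition(":")
    value = re.sub(r"\r\n(?=[ \t])", "", value)  # unfold
    # Collapse whitespace runs, drop it at both ends of the value.
    value = re.sub(r"[ \t]+", " ", value).strip(" ")
    return name.strip(" \t").lower() + ":" + value + "\r\n"


def sort_fields(fields):
    """Order relaxed fields by name, comparing byte values."""
    def name_key(field):
        return field.partition(":")[0].encode("ascii")
    # sorted() is stable: fields that share a name keep the
    # relative order in which the caller supplied them.
    return sorted(fields, key=name_key)
```
        ]]></sourcecode><t>
          The concatenation of the sorted fields would then form the
          header input of the differential algorithm.
        </t>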

        <section>
          <name>BSDiff adaption</name>
          <ul>
            <li>
              First of all: the string suffix sorting and difference creation
              approach of Colin Percival has been left unchanged.
            </li><li>
              The original is fixed to 64-bit file sizes
              and content representation.
              The adaption supports both 32-bit and 64-bit,
              selectable at compile time.
              Using 32-bit almost halves memory usage,
              and produces smaller patch control data.
              It is deemed sufficient for email purposes.
              (32-bit and 64-bit patches are not interchangeable.)
            </li><li>
              In order to reduce memory usage during patch generation,
              the adaption uses a shared memory region for differential and
              extra data: the former is therefore stored in reversed order,
              top down.
              (This reduces memory usage by the size of the target data set.)
            </li><li>
              The adaption stores data in big endian (network; MSF;
              most significant byte first) instead of little endian (LSF;
              least significant byte first) byte order.
            </li><li>
              The original uses three separate bzip2 streams to serialize
              control, differential and extra data.
              The adaption separated patch generation from the I/O layer,
              which will therefore see the entire readily prepared patch data.
            </li><li>
              The original header did not contain the size of the extra
              data, which was stored last, with its size implicitly extending
              to the end of the patch.
              The adaption includes the extra data size in the header,
              allowing more verification tests to be applied with only the
              header being readily parsed.
              This also enables the I/O layer to allocate perfectly sized
              memory with only the header data being available.
            </li>
          </ul>
        </section>

        <section>
          <name>Patch content</name>
          <t>
            Overall, the patch consists of the header,
            followed by the control data.
            Thereafter two streams of bytes (8-bit octets) conclude
            the patch: the differential data (in reverse order),
            followed by the extra data.
          </t><t>
            The header and the control data consist of 32-bit signed integers,
            stored in big endian byte order (as above).
          </t><t>
            The control data is a stream of tuples of three values each,
            the first denoting the length of differential data to copy in
            bytes, the second that of extra data to copy;
            the read positions within the differential and extra data move
            by the same amount of bytes.
            The last value denotes the number of bytes to seek relatively
            in the data source after the copying has taken place:
            of all the values, only this one may be negative.
          </t><t>
            The header consists of four values denoting
            the length of the control block in bytes,
            the length of the difference data block,
            the length of the extra data block,
            concluded by the length of the original data source.
            The sum of the first three values must not exceed one less than
            the maximum positive 32-bit signed integer.
            It follows that control data copy instructions also do not exceed
            this value.
          </t>
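          <t>
            The layout described above can be checked with a short parser.
            The following is a sketch under stated assumptions, not the
            reference implementation:
            the names <tt>parse_patch</tt>, <tt>HEADER</tt> and
            <tt>TRIPLE</tt> are hypothetical,
            the length limit is read as an upper bound,
            and nothing is assumed to follow the extra data.
            It only validates the header and the control tuples;
            it does not reconstruct data, because the exact read order of
            the reversed differential stream is not spelled out here.
          </t>
          <sourcecode type="python"><![CDATA[
```python
import struct

INT32_MAX = 2**31 - 1
HEADER = struct.Struct(">4i")  # four big endian signed 32-bit integers
TRIPLE = struct.Struct(">3i")  # diff bytes, extra bytes, relative seek


def parse_patch(patch: bytes):
    """Return (header, control triples); raise ValueError if malformed."""
    if len(patch) < HEADER.size:
        raise ValueError("patch shorter than its header")
    ctrl_len, diff_len, extra_len, before_len = HEADER.unpack_from(patch, 0)
    if min(ctrl_len, diff_len, extra_len, before_len) < 0:
        raise ValueError("negative length in header")
    # Assumption: the three block lengths sum to at most INT32_MAX - 1.
    if ctrl_len + diff_len + extra_len > INT32_MAX - 1:
        raise ValueError("block lengths exceed the 32-bit limit")
    if ctrl_len % TRIPLE.size:
        raise ValueError("control block is not a sequence of triples")
    # Assumption: the patch ends right after the extra data.
    if len(patch) != HEADER.size + ctrl_len + diff_len + extra_len:
        raise ValueError("patch size does not match its header")

    triples, diff_used, extra_used = [], 0, 0
    for off in range(HEADER.size, HEADER.size + ctrl_len, TRIPLE.size):
        diff_n, extra_n, seek = TRIPLE.unpack_from(patch, off)
        if diff_n < 0 or extra_n < 0:
            raise ValueError("only the seek value may be negative")
        diff_used += diff_n
        extra_used += extra_n
        triples.append((diff_n, extra_n, seek))
    if diff_used > diff_len or extra_used > extra_len:
        raise ValueError("control data reads past a data block")

    header = {"ctrl_len": ctrl_len, "diff_len": diff_len,
              "extra_len": extra_len, "before_len": before_len}
    return header, triples
```
]]></sourcecode>
          <t>
            Because all three block lengths are known once the 16 header
            bytes have been read, a consumer can reject an inconsistent
            patch, or size its buffers, before touching any payload.
          </t>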
        </section>
      </section>

      <section>
        <name>Rationale</name>
        <t>
          Differences are included to allow DKIM verifiers to restore
          previous message content for the purpose of cryptographically
          verifying elder DKIM-Signature: header fields.
        </t><t>
          This for example allows for collecting trustworthy statistics of
          organizational trust (<xref target="RFC5863"/>, section 2.5).
          Or user interfaces may visually restore an initial From: header
          field when messages come from a known mailing-list.
        </t><t>
          For example, user interfaces could use traffic light semantics
          that unfold on click to traffic light semantics of all message
          versions, which would visualize differences (with precautions):
          this can empower users to make decisions on the
          trustworthiness of intermediates, and to, for example,
          enable the above mentioned From: header field restoration.
        </t><t>
          However, because the data exists in the DKIM "relaxed"
          normalized variant,
          former states are not meant to be usable messages by themselves.
          For example, an embedded pair of OpenPGP signature and text
          would likely fail to verify (dependent upon the original MIME
          transfer encoding).
        </t><blockquote>
          <em>Informative remark:</em>
          This was deemed acceptable because of the purpose of including
          differential changes,
          and because a visualization of the DKIM covered message should
          still be sufficient to allow users to make responsible
          decisions.
        </blockquote><t>
          Finally, the given example will likely verify as part of the
          complete received message, unless altered along the SMTP path:
          ACDC can ideally say where
          (and exactly what, in an unbroken ACDC chain).
        </t>
      </section>
    </section>

    <section>
      <name>Mitigations'25</name>
      <blockquote>
        <em>This section is in an intermediate state</em>
      </blockquote>
      <t>
        As of the time of this writing the email infrastructure is
        deeply divided due to standards like DMARC and SPF, which
        require mitigations to be applied in order to keep existing
        infrastructures in a usable state.
      </t><t>
        For example, SPF will not survive a single hop, which means
        that alias expansion will no longer work.
        The IETF has no solution for this problem,
        but the FOSS scene has created a "Sender Rewriting Scheme"
        so that aliases can be used regardless.
      </t><t>
        As another example, DMARC caused a lot of mailing-lists to
        apply mitigations in that either old DKIM signatures are
        removed, or renamed, and that the From: header field is
        rewritten in a "User A via List B" style.
      </t>

      <section>
        <name>ACDC mitigations</name>
        <t>
          This memo suggests applying active mitigations as part of DKIM
          processing, temporarily, until, at some future time, the email
          infrastructure has adapted to a new reality.
        </t><ul><li>
          Rename DKIM-Signature: header fields to DKIX-Signature:.

          Because DKIM-Signature: header fields are removed or renamed,
          also by unanchored regular expressions, which would match for
          example EDKIM-Signature, ACDC aware software should rename
          any DKIM-Signature field into DKIX-Signature upon egress.

          Since only one DKIM-Signature will have to be verified
          successfully by non ACDC aware DKIM software, and ACDC aware
          software can toggle the single byte back before verifying
          elder signatures, this should be easy in practice:
          just treat DKIM-Signature and DKIX-Signature alike, but toggle
          before cryptographic verification.
        </li><li>
          [THIS IS HYPOTHETIC, BUT WOULD MITIGATE ANY DMARC PROBLEM.]
          Mitigate From: header fields.

          When a message was changed in between ingress and egress,
          and if the From: header field falls into the "one address"
          DMARC category (and is thus checked accordingly),
          change From:,
          and place the original From: address in the Reply-To: header
          field.

          If the <tt>MAIL FROM</tt> SMTP envelope changed in addition,
          use a "From: via MAIL FROM" notation
          (as in "display-name (local-part (AT) domain) via MAIL FROM").

          If that is impossible, the author suggests a hypothetic and
          artificial dkim__mitigate__25 local address, which ACDC aware
          DKIM software detects on ingress, to treat it specially.
          One could think about a special tag that holds the former real
          address.

          It could also be configurable.
          This likely applies to mailing-lists only, which normally have
          a dedicated local address that could be used, anyway.
          They often do the mitigation themselves.

          But it could also apply to "holiday-alias-forwards",
          or when company footers etc. were added to a message that
          passes along the local mail system.
        </li><li>
          [THIS IS HYPOTHETIC, BUT WOULD MITIGATE ANY SPF PROBLEM.]
          Because ACDC aware DKIM software needs to have a notion
          of a temporary cache anyway:
          Mitigate non-local <tt>MAIL FROM</tt>.

          If a message that does not originate locally leaves the
          email system on egress, with a SMTP envelope <tt>MAIL FROM</tt>
          of a foreign domain, the SPF check will fail on the next hop.

          The FOSS community has invented the "Sender Rewriting Scheme"
          to allow the decades-old and established email alias
          infrastructure to continue to exist.

          It would be easy to create a cache entry that maps a
          synthesized local address to the non-local <tt>MAIL FROM</tt>,
          HMAC and private key protected to avoid misuse to the maximum
          extent possible, and that lives for, say, ten days, similar
          to "SRS", in order to let SPF tests pass, and also to be able
          to undo the address rewrite when bounces are to be handled.
        </li></ul>
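        <t>
          The DKIX-Signature renaming described above amounts to toggling
          one byte of the field name.
          The following sketch is illustrative only;
          the function names are hypothetical,
          header lines are assumed to be passed as separate byte strings
          with continuation lines starting with a space or tab,
          and the match is anchored so that a field such as
          EDKIM-Signature is left alone.
        </t>
        <sourcecode type="python"><![CDATA[
```python
def _toggle(line: bytes, from_name: bytes, to_byte: bytes) -> bytes:
    """Swap the fourth byte of the field name if it equals from_name."""
    if line[:1] in (b" ", b"\t"):          # continuation line, not a name
        return line
    name, sep, _ = line.partition(b":")
    if not sep or name.rstrip(b" \t").lower() != from_name:
        return line                         # anchored match on the whole name
    # Keep the letter case that the field name already uses.
    repl = to_byte if line[3:4].isupper() else to_byte.lower()
    return line[:3] + repl + line[4:]


def rename_on_egress(lines):
    """DKIM-Signature -> DKIX-Signature, before adding the new signature."""
    return [_toggle(line, b"dkim-signature", b"X") for line in lines]


def restore_for_verification(lines):
    """DKIX-Signature -> DKIM-Signature, before verifying elder signatures."""
    return [_toggle(line, b"dkix-signature", b"M") for line in lines]
```
]]></sourcecode>
        <t>
          A verifier that treats both names alike would call the restoring
          step on its working copy only, leaving the transported message
          untouched.
        </t>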
      </section>
    </section>

    <section anchor="IANA">
      <name>IANA Considerations</name>
      <t>
        The author suggests creating a registry of header fields that
        shall be cryptographically covered by DKIM/ACDC.
        This memo extends the list mentioned by
        DKIM<xref target="RFC6376"/>
        with the
        Author Header Field<xref target="RFC9057"/>.
      </t>
    </section>

    <section anchor="Security">
      <name>Security Considerations</name>
      <t>
        Public-key cryptography is the safest approach to identification
        of counterparts and verification of data.
        This specification enables DKIM to cryptographically verify
        SMTP envelopes, and to cryptographically verify all message
        transitions back to the original message sender.
      </t>
    </section>

  </middle>
  <back>

    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4648.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6376.xml"/>
      </references>

      <references>
        <name>Informative References</name>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.1123.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.1950.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2045.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3461.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.3463.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4033.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4034.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.4035.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5234.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5321.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5322.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.5863.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.6377.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.7208.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9057.xml"/>
        <xi:include href="https://bib.ietf.org/public/rfc/bibxml/reference.RFC.9562.xml"/>
        <reference anchor="BSDIPA" target="https://github.com/sdaoden/s-bsdipa">
          <front><title>BSDIPA, a mutation of BSDiff</title><author/></front>
        </reference>
        <reference anchor="LZMA2" target="https://tukaani.org/xz/xz-file-format.txt">
          <front><title>LZMA2: The .xz File Format</title><author/></front>
        </reference>
      </references>
    </references>

    <section anchor="FurtherDKIMUpdates">
      <name>Further DKIM Updates</name>
      <ul><li>
        This specification obsoletes the simple canonicalization type;
        it <bcp14>MUST NOT</bcp14> be used by software announcing DKIMACDC.

        <em>Rationale:</em> in order to minimize the time and space cost
        of differential processing,
        being able to work with only one data representation is
        beneficial.
        The "extremely crude ASCII Art attacks" mentioned in
        DKIM<xref target="RFC6376"/>
        section 8.1 are considered to be a rather artificial attack vector.
      </li><li>
        This specification obsoletes the DKIM "l=" tag that restricts the
        number of DKIM covered bytes of the normalized message body.
        This tag <bcp14>MUST NOT</bcp14> be used by software announcing
        ACDC support,
        and all the message body <bcp14>MUST</bcp14> always be used to
        create the body hash.

        <em>Rationale:</em> "l=" has always been insufficient to deal with
        message changes caused by mailing-lists etc,
        but effectively includes the security risk that message parts which
        are not covered by the signature appear as "valid content" to users
        looking at a DKIM verified message.
        The ACDC differential changes offer a better approach to deal
        with message changes, while completely covered message bodies ensure
        content validity.
      </li><li>
        For the "i=" tag this specification obsoletes the possible use of
        DKIM-Quoted-Printable for the optional Local-part.

        <em>Rationale:</em> because the syntax is "a standard email
        address where the local-part MAY be omitted", quoted-printable
        encoding is not necessary for representation.
      </li><li>
        This specification obsoletes the DKIM "z=" tag that was defined
        "for diagnostic use" to copy a freely defined set of header fields
        and their values present during signature creation.
        This tag <bcp14>MUST NOT</bcp14> be used by software announcing
        DKIMACDC.

        <em>Rationale:</em> the ACDC differential changes provide access
        to the same information.
      </li><li>
        For the "q=" tag this specification obsoletes the possible use of
        DKIM-Quoted-Printable for the optional x-sig-q-tag-args of
        possibly introduced future query types.

        <em>Rationale:</em> should a new type ever become standardized
        besides the dns/txt type that has been part of DKIM from the very start,
        that standard can very well give meaning to a "hyphenated-word"
        proxy identifier without making use of byte values which would
        require encoding.
      </li><li>
        This specification obsoletes the DKIM key representation tag "n="
        that was meant to include "notes that might be of interest to
        a human", "intended for use by administrators, not end users",
        and which "should be used sparingly".

        <em>Rationale:</em> no use case has been encountered in the DNS,
        let alone a serious one; if future non-space-constrained key
        providers other than DNS should ever exist and be used to
        distribute DKIM keys, it is likely that they support inclusion
        of strings via some method that need not be included in the DKIM
        key representation itself.
      </li><li>
        Because the above changes remove all use cases for the
        "dkim-quoted-printable" encoding defined in RFC 6376, section 2.11,
        this specification obsoletes the DKIM-Quoted-Printable encoding.
      </li><li>
        This specification obsoletes the use of <tt>FWS</tt> in tag-spec.
        First of all,
        MIME<xref target="RFC2045"/>
        introduced parameters in ABNF as
        <tt>parameter := attribute "=" value</tt>
        without <tt>FWS</tt>,
        and its presence complicates parsers and hinders parser code reuse.
        Second, its use was never encountered by the author.
        The "acdc=" tag is defined without <tt>FWS</tt> support.
      </li></ul>
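      <t>
        Some of the restrictions listed above can be checked mechanically
        on a DKIM-Signature field value.
        The following is a non-normative sketch;
        the function names and the wording of the messages are
        hypothetical,
        and only the "c=", "l=" and "z=" rules are covered.
        It relies on the RFC 6376 defaults, where a missing "c=" tag means
        simple/simple and a single named algorithm leaves the body at
        simple.
      </t>
      <sourcecode type="python"><![CDATA[
```python
def parse_tag_list(value: str) -> dict:
    """Split a DKIM tag-list into a {name: value} dictionary."""
    tags = {}
    for spec in value.split(";"):
        spec = spec.strip()
        if not spec:
            continue                        # tolerate a trailing ";"
        name, sep, val = spec.partition("=")
        if not sep:
            raise ValueError("tag without '=': " + spec)
        tags[name.strip()] = val.strip()
    return tags


def acdc_violations(sig_value: str) -> list:
    """List the rules of this section that a signature value breaks."""
    tags = parse_tag_list(sig_value)
    problems = []
    # RFC 6376 defaults: no c= means simple/simple; "relaxed" alone
    # means relaxed header but simple body canonicalization.
    header_c, _, body_c = tags.get("c", "simple/simple").partition("/")
    if header_c != "relaxed" or (body_c or "simple") != "relaxed":
        problems.append("c= must be relaxed/relaxed")
    if "l" in tags:
        problems.append("l= must not be used")
    if "z" in tags:
        problems.append("z= must not be used")
    return problems
```
]]></sourcecode>
      <t>
        An empty result means that none of the three checked rules is
        violated; it does not imply that the signature itself verifies.
      </t>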
    </section>

    <section anchor="Acknowledgements">
      <name>Acknowledgements</name>
      <t>
        This document contains a citation of Dave Crocker.
        Thanks to, in the order of appearance,
        Jesse Thompson,
        Richard Clayton for arguments against reliance on header field
        stacks, and pro the numbering scheme,
        and especially for noticing the partial transaction replay attack
        problem,
        Douglas Foster,
        Michael Thomas for explicit man-in-the-middle replay addressing;
        Alessandro Vesely inspired the explicitness of the E flag,
        and Bron Gondwana for the inspiration to split up binary
        differences of headers and body.
        A big fat acknowledgment is due to Murray S. Kucherawy.
        Special thanks to Klaus Schulze, Manuel Goettsching,
        both also as Ash Ra Tempel,
        Laeuten der Seele,
        Laurent Garnier,
        as well as the Sleeping Environmental Bot broadcast.
      </t>
    </section>

 </back>
</rfc>
<!-- vim:set tw=1000:s-ts-mode -->
