rfc9768v3.txt | rfc9768.txt | |||
---|---|---|---|---|
Internet Engineering Task Force (IETF) B. Briscoe | Internet Engineering Task Force (IETF) B. Briscoe | |||
Request for Comments: 9768 Independent | Request for Comments: 9768 Independent | |||
Updates: 3168 M. Kühlewind | Updates: 3168 M. Kühlewind | |||
Category: Standards Track Ericsson | Category: Standards Track Ericsson | |||
ISSN: 2070-1721 R. Scheffenegger | ISSN: 2070-1721 R. Scheffenegger | |||
NetApp | NetApp | |||
August 2025 | September 2025 | |||
More Accurate Explicit Congestion Notification (AccECN) Feedback in TCP | More Accurate Explicit Congestion Notification (AccECN) Feedback in TCP | |||
Abstract | Abstract | |||
Explicit Congestion Notification (ECN) is a mechanism by which | Explicit Congestion Notification (ECN) is a mechanism by which | |||
network nodes can mark IP packets instead of dropping them to | network nodes can mark IP packets instead of dropping them to | |||
indicate incipient congestion to the endpoints. Receivers with an | indicate incipient congestion to the endpoints. Receivers with an | |||
ECN-capable transport protocol feed back this information to the | ECN-capable transport protocol feed back this information to the | |||
sender. ECN was originally specified for TCP in such a way that only | sender. ECN was originally specified for TCP in such a way that only | |||
skipping to change at line 431 | skipping to change at line 431 | |||
options. | options. | |||
The essential feedback part overloads the previous definition of the | The essential feedback part overloads the previous definition of the | |||
three flags in the TCP header that had been assigned for use by | three flags in the TCP header that had been assigned for use by | |||
Classic ECN. This design choice deliberately allows AccECN peers to | Classic ECN. This design choice deliberately allows AccECN peers to | |||
replace the Classic ECN feedback protocol, rather than leaving | replace the Classic ECN feedback protocol, rather than leaving | |||
Classic ECN feedback intact and adding more accurate feedback | Classic ECN feedback intact and adding more accurate feedback | |||
separately because: | separately because: | |||
* this efficiently reuses scarce TCP header space, given TCP Option | * this efficiently reuses scarce TCP header space, given TCP Option | |||
space is approaching saturation. | space is approaching saturation; | |||
* a single upgrade path for the TCP protocol is preferable to a fork | * a single upgrade path for the TCP protocol is preferable to a fork | |||
in the design that modifies the TCP header to convey all ECN | in the design that modifies the TCP header to convey all ECN | |||
feedback. | feedback; | |||
* otherwise, Classic and Accurate ECN feedback could give | * otherwise, Classic and Accurate ECN feedback could give | |||
conflicting feedback about the same segment, which could open up | conflicting feedback about the same segment, which could open up | |||
new security concerns and make implementations unnecessarily | new security concerns and make implementations unnecessarily | |||
complex. | complex; | |||
* middleboxes are more likely to faithfully forward the TCP ECN | * middleboxes are more likely to faithfully forward the TCP ECN | |||
flags than newly defined areas of the TCP header. | flags than newly defined areas of the TCP header. | |||
AccECN is designed to work even if the supplementary feedback part is | AccECN is designed to work even if the supplementary feedback part is | |||
removed or zeroed out, as long as the essential feedback part gets | removed or zeroed out, as long as the essential feedback part gets | |||
through. | through. | |||
2.1. Capability Negotiation | 2.1. Capability Negotiation | |||
skipping to change at line 494 | skipping to change at line 494 | |||
some or all of the byte counters can be optionally carried in an | some or all of the byte counters can be optionally carried in an | |||
AccECN Option. For efficient use of limited option space, two | AccECN Option. For efficient use of limited option space, two | |||
alternative forms of the AccECN Option are specified with the fields | alternative forms of the AccECN Option are specified with the fields | |||
in the opposite order to each other. | in the opposite order to each other. | |||
2.3. Delayed ACKs and Resilience Against ACK Loss | 2.3. Delayed ACKs and Resilience Against ACK Loss | |||
With both the ACE and the AccECN Option mechanisms, the Data Receiver | With both the ACE and the AccECN Option mechanisms, the Data Receiver | |||
continually repeats the current LSBs of each of its respective | continually repeats the current LSBs of each of its respective | |||
counters. There is no need to acknowledge these continually repeated | counters. There is no need to acknowledge these continually repeated | |||
counters, so the Congestion Window Reduced (CWR) mechanism of | counters, so the CWR mechanism of [RFC3168] is no longer used. Even | |||
[RFC3168] is no longer used. Even if some ACKs are lost, the Data | if some ACKs are lost, the Data Sender ought to be able to infer how | |||
Sender ought to be able to infer how much to increment its own | much to increment its own counters, even if the protocol field has | |||
counters, even if the protocol field has wrapped. | wrapped. | |||
The 3-bit ACE field can wrap fairly frequently. Therefore, even if | The 3-bit ACE field can wrap fairly frequently. Therefore, even if | |||
it appears to have incremented by one (say), the field might have | it appears to have incremented by one (say), the field might have | |||
actually cycled completely and then incremented by one. The Data | actually cycled completely and then incremented by one. The Data | |||
Receiver is not allowed to delay sending an ACK to such an extent | Receiver is not allowed to delay sending an ACK to such an extent | |||
that the ACE field would cycle. However, ACKs received at the Data | that the ACE field would cycle. However, ACKs received at the Data | |||
Sender could still cycle because a whole sequence of ACKs carrying | Sender could still cycle because a whole sequence of ACKs carrying | |||
intervening values of the field might all be lost or delayed in | intervening values of the field might all be lost or delayed in | |||
transit. | transit. | |||
skipping to change at line 661 | skipping to change at line 661 | |||
The procedures for retransmission of SYNs or SYN/ACKs are given in | The procedures for retransmission of SYNs or SYN/ACKs are given in | |||
Section 3.1.4. | Section 3.1.4. | |||
It is RECOMMENDED that the AccECN protocol be implemented alongside | It is RECOMMENDED that the AccECN protocol be implemented alongside | |||
Selective Acknowledgement (SACK) [RFC2018]. If SACK is implemented | Selective Acknowledgement (SACK) [RFC2018]. If SACK is implemented | |||
with AccECN, Duplicate Selective Acknowledgement (D-SACK) [RFC2883] | with AccECN, Duplicate Selective Acknowledgement (D-SACK) [RFC2883] | |||
MUST also be implemented. | MUST also be implemented. | |||
3.1.2. Backward Compatibility | 3.1.2. Backward Compatibility | |||
The three flags are set to 1 to indicate AccECN support on the SYN | The three flags set to 1 indicate AccECN support on the SYN have been | |||
have been carefully chosen to enable natural fall-back to prior | carefully chosen to enable natural fall-back to prior stages in the | |||
stages in the evolution of ECN. Table 2 tabulates all the | evolution of ECN. Table 2 tabulates all the negotiation | |||
negotiation possibilities for ECN-related capabilities that involve | possibilities for ECN-related capabilities that involve at least one | |||
at least one AccECN-capable host. The entries in the first two | AccECN-capable host. The entries in the first two columns have been | |||
columns have been abbreviated, as follows: | abbreviated, as follows: | |||
AccECN: Supports more Accurate ECN feedback (the present | AccECN: Supports more Accurate ECN feedback (the present | |||
specification). | specification). | |||
Nonce: Supports ECN-nonce feedback [RFC3540]. | Nonce: Supports ECN-nonce feedback [RFC3540]. | |||
ECN: Supports 'Classic' ECN feedback [RFC3168]. | ECN: Supports 'Classic' ECN feedback [RFC3168]. | |||
No ECN: Not ECN-capable. Implicit congestion notification using | No ECN: Not ECN-capable. Implicit congestion notification using | |||
packet drop. | packet drop. | |||
skipping to change at line 950 | skipping to change at line 950 | |||
as the SYN that first caused the Server to open the connection. | as the SYN that first caused the Server to open the connection. | |||
An 'Acceptable' packet is defined in Section 1.3. | An 'Acceptable' packet is defined in Section 1.3. | |||
Handling SYNs or SYN/ACKs of multiple types (e.g., fall-back): | Handling SYNs or SYN/ACKs of multiple types (e.g., fall-back): | |||
* Any implementation that supports AccECN: | * Any implementation that supports AccECN: | |||
- MUST NOT switch into a different feedback mode than the one it | - MUST NOT switch into a different feedback mode than the one it | |||
first entered according to Table 2, no matter whether it | first entered according to Table 2, no matter whether it | |||
subsequently receives valid SYNs or Acceptable SYN/ACKs of | subsequently receives valid SYNs or Acceptable SYN/ACKs of | |||
different types. | different types; | |||
- SHOULD ignore the TCP-ECN flags in SYNs or SYN/ACKs that are | - SHOULD ignore the TCP-ECN flags in SYNs or SYN/ACKs that are | |||
received after the implementation reaches the ESTABLISHED | received after the implementation reaches the ESTABLISHED | |||
state, in line with the general TCP approach [RFC9293]. | state, in line with the general TCP approach [RFC9293]; | |||
Reason: Reaching ESTABLISHED state implies that at least one | Reason: Reaching ESTABLISHED state implies that at least one | |||
SYN and one SYN/ACK have successfully been delivered. And all | SYN and one SYN/ACK have successfully been delivered. And all | |||
the rules for handshake fall-back are designed to work based on | the rules for handshake fall-back are designed to work based on | |||
those packets that successfully traverse the path, whatever | those packets that successfully traverse the path, whatever | |||
other handshake packets are lost or delayed. | other handshake packets are lost or delayed. | |||
- MUST NOT send a 'Classic' ECN-setup SYN [RFC3168] with | - MUST NOT send a 'Classic' ECN-setup SYN [RFC3168] with | |||
(AE,CWR,ECE) = (0,1,1) and a SYN with (AE,CWR,ECE) = (1,1,1) | (AE,CWR,ECE) = (0,1,1) and a SYN with (AE,CWR,ECE) = (1,1,1) | |||
requesting AccECN feedback within the same connection; | requesting AccECN feedback within the same connection; | |||
skipping to change at line 977 | skipping to change at line 977 | |||
(AE,CWR,ECE) = (0,0,1) and a SYN/ACK agreeing to use AccECN | (AE,CWR,ECE) = (0,0,1) and a SYN/ACK agreeing to use AccECN | |||
feedback within the same connection; | feedback within the same connection; | |||
- MUST reset the connection with a RST packet, if it receives a | - MUST reset the connection with a RST packet, if it receives a | |||
'Classic' ECN-setup SYN with (AE,CWR,ECE) = (0,1,1) and a SYN | 'Classic' ECN-setup SYN with (AE,CWR,ECE) = (0,1,1) and a SYN | |||
requesting AccECN feedback during the same handshake; | requesting AccECN feedback during the same handshake; | |||
- MUST reset the connection with a RST packet, if it receives | - MUST reset the connection with a RST packet, if it receives | |||
'Classic' ECN-setup SYN/ACK with (AE,CWR,ECE) = (0,0,1) and a | 'Classic' ECN-setup SYN/ACK with (AE,CWR,ECE) = (0,0,1) and a | |||
SYN/ACK agreeing to use AccECN feedback during the same | SYN/ACK agreeing to use AccECN feedback during the same | |||
handshake; | handshake. | |||
The last four rules are necessary because, if one peer were to | The last four rules are necessary because, if one peer were to | |||
negotiate the feedback mode in two different types of handshake, | negotiate the feedback mode in two different types of handshake, | |||
it would not be possible for the other peer to know for certain | it would not be possible for the other peer to know for certain | |||
which handshake packet(s) the other end had eventually received or | which handshake packet(s) the other end had eventually received or | |||
in which order it received them. So, in the absence of these | in which order it received them. So, in the absence of these | |||
rules, the two peers could end up using different ECN feedback | rules, the two peers could end up using different ECN feedback | |||
modes without knowing it. | modes without knowing it. | |||
* A host in AccECN mode that is feeding back the IP ECN field on a | * A host in AccECN mode that is feeding back the IP ECN field on a | |||
SYN or SYN/ACK: | SYN or SYN/ACK: | |||
- MUST feed back the IP ECN field on the latest valid SYN or | - MUST feed back the IP ECN field on the latest valid SYN or | |||
acceptable SYN/ACK to arrive. | acceptable SYN/ACK to arrive. | |||
* A TCP Server already in AccECN mode: | * A TCP Server already in AccECN mode: | |||
- SHOULD acknowledge a valid SYN arriving with (AE,CWR,ECE) = | - SHOULD acknowledge a valid SYN arriving with (AE,CWR,ECE) = | |||
(0,0,0) by emitting an AccECN SYN/ACK (with the appropriate | (0,0,0) by emitting an AccECN SYN/ACK (with the appropriate | |||
combination of TCP-ECN flags to feed back the IP ECN field of | combination of TCP-ECN flags to feed back the IP ECN field of | |||
this latest SYN). | this latest SYN); | |||
- MAY acknowledge a valid SYN arriving with (AE,CWR,ECE) = | - MAY acknowledge a valid SYN arriving with (AE,CWR,ECE) = | |||
(0,0,0) by sending a SYN/ACK with (AE,CWR,ECE) = (0,0,0). | (0,0,0) by sending a SYN/ACK with (AE,CWR,ECE) = (0,0,0). | |||
Rationale: When a SYN arrives with (AE,CWR,ECE) = (0,0,0) at a TCP | Rationale: When a SYN arrives with (AE,CWR,ECE) = (0,0,0) at a TCP | |||
Server that is already in AccECN mode, it implies that the TCP | Server that is already in AccECN mode, it implies that the TCP | |||
Client had probably not received the previous AccECN SYN/ACK | Client had probably not received the previous AccECN SYN/ACK | |||
emitted by the TCP Server. Therefore, the first bullet recommends | emitted by the TCP Server. Therefore, the first bullet recommends | |||
attempting at least one more AccECN SYN/ACK. Nonetheless, the | attempting at least one more AccECN SYN/ACK. Nonetheless, the | |||
second bullet recognizes that the Server might eventually need to | second bullet recognizes that the Server might eventually need to | |||
skipping to change at line 1038 | skipping to change at line 1038 | |||
Sending ECT: | Sending ECT: | |||
* Any implementation that supports AccECN: | * Any implementation that supports AccECN: | |||
- MUST NOT set ECT if it is in Not ECN feedback mode. | - MUST NOT set ECT if it is in Not ECN feedback mode. | |||
A Data Sender in AccECN mode: | A Data Sender in AccECN mode: | |||
- SHOULD set an ECT codepoint in the IP header of packets to | - SHOULD set an ECT codepoint in the IP header of packets to | |||
indicate to the network that the transport is capable and | indicate to the network that the transport is capable and | |||
willing to participate in ECN for this packet. | willing to participate in ECN for this packet; | |||
- MAY not set ECT on any packet (for instance if it has reason to | - MAY not set ECT on any packet (for instance if it has reason to | |||
believe such a packet would be blocked). | believe such a packet would be blocked). | |||
A TCP Server in AccECN mode: | A TCP Server in AccECN mode: | |||
- MUST NOT set ECT on any packet for the rest of the connection, | - MUST NOT set ECT on any packet for the rest of the connection, | |||
if it has received or sent at least one valid SYN or Acceptable | if it has received or sent at least one valid SYN or Acceptable | |||
SYN/ACK with (AE,CWR,ECE) = (0,0,0) during the handshake. | SYN/ACK with (AE,CWR,ECE) = (0,0,0) during the handshake. | |||
skipping to change at line 1062 | skipping to change at line 1062 | |||
mode, it can be certain that the Server is already in AccECN | mode, it can be certain that the Server is already in AccECN | |||
feedback mode. | feedback mode. | |||
Congestion response: | Congestion response: | |||
* A host in AccECN mode: | * A host in AccECN mode: | |||
- is obliged to respond appropriately to AccECN feedback that | - is obliged to respond appropriately to AccECN feedback that | |||
indicates there were ECN marks on packets it had previously | indicates there were ECN marks on packets it had previously | |||
sent, where 'appropriately' is defined in Section 6.1 of | sent, where 'appropriately' is defined in Section 6.1 of | |||
[RFC3168] and updated by Sections 2.1 and 4.1 of [RFC8311]. | [RFC3168] and updated by Sections 2.1 and 4.1 of [RFC8311]; | |||
- is still obliged to respond appropriately to congestion | - is still obliged to respond appropriately to congestion | |||
feedback, even when it is solely sending non-ECN-capable | feedback, even when it is solely sending non-ECN-capable | |||
packets (for rationale, some examples and some exceptions see | packets (for rationale, some examples and some exceptions see | |||
Sections 3.2.2.3 and 3.2.2.4). | Sections 3.2.2.3 and 3.2.2.4); | |||
- is still obliged to respond appropriately to congestion | - is still obliged to respond appropriately to congestion | |||
feedback, even if it has sent or received a SYN or SYN/ACK | feedback, even if it has sent or received a SYN or SYN/ACK | |||
packet with (AE,CWR,ECE) = (0,0,0) during the handshake. | packet with (AE,CWR,ECE) = (0,0,0) during the handshake; | |||
- MUST NOT set CWR to indicate that it has received and responded | - MUST NOT set CWR to indicate that it has received and responded | |||
to indications of congestion. | to indications of congestion. | |||
For the avoidance of doubt, this is unlike an RFC 3168 data | For the avoidance of doubt, this is unlike an RFC 3168 data | |||
sender and this does not preclude the Data Sender from setting | sender and this does not preclude the Data Sender from setting | |||
the bits of the ACE counter field, which includes an overloaded | the bits of the ACE counter field, which includes an overloaded | |||
use of the same bit. | use of the same bit. | |||
Receiving ECT: | Receiving ECT: | |||
skipping to change at line 1147 | skipping to change at line 1147 | |||
report the most recent value, no matter whether it is in a pure ACK, | report the most recent value, no matter whether it is in a pure ACK, | |||
or an ACK piggybacked on a packet used by the other half-connection, | or an ACK piggybacked on a packet used by the other half-connection, | |||
whether a new payload data or a retransmission. Therefore, the | whether a new payload data or a retransmission. Therefore, the | |||
feedback piggybacked on a retransmitted packet is unlikely to be the | feedback piggybacked on a retransmitted packet is unlikely to be the | |||
same as the feedback on the original packet. | same as the feedback on the original packet. | |||
3.2.1. Initialization of Feedback Counters | 3.2.1. Initialization of Feedback Counters | |||
When a host first enters AccECN mode, in its role as a Data Receiver, | When a host first enters AccECN mode, in its role as a Data Receiver, | |||
it initializes its counters to r.cep = 5, r.e0b = r.e1b = 1, and | it initializes its counters to r.cep = 5, r.e0b = r.e1b = 1, and | |||
r.ceb = 0, | r.ceb = 0. | |||
Non-zero initial values are used to support a stateless handshake | Non-zero initial values are used to support a stateless handshake | |||
(see Section 5.1) and to be distinct from cases where the fields are | (see Section 5.1) and to be distinct from cases where the fields are | |||
incorrectly zeroed (e.g., by middleboxes -- see Section 3.2.3.2.4). | incorrectly zeroed (e.g., by middleboxes -- see Section 3.2.3.2.4). | |||
When a host enters AccECN mode, in its role as a Data Sender, it | When a host enters AccECN mode, in its role as a Data Sender, it | |||
initializes its counters to s.cep = 5, s.e0b = s.e1b = 1, and s.ceb = | initializes its counters to s.cep = 5, s.e0b = s.e1b = 1, and s.ceb = | |||
0. | 0. | |||
3.2.2. The ACE Field | 3.2.2. The ACE Field | |||
skipping to change at line 1223 | skipping to change at line 1223 | |||
data to include on the ACK), it SHOULD first send a pure ACK that | data to include on the ACK), it SHOULD first send a pure ACK that | |||
does satisfy these conditions (see Section 5.2), so that it can feed | does satisfy these conditions (see Section 5.2), so that it can feed | |||
back which of the four values of the IP ECN field arrived on the SYN/ | back which of the four values of the IP ECN field arrived on the SYN/ | |||
ACK. A valid exception to this "SHOULD" would be where the | ACK. A valid exception to this "SHOULD" would be where the | |||
implementation will only be used in an environment where mangling of | implementation will only be used in an environment where mangling of | |||
the ECN field is unlikely. | the ECN field is unlikely. | |||
The TCP Client MUST also use the handshake encoding for the pure ACK | The TCP Client MUST also use the handshake encoding for the pure ACK | |||
of any retransmitted SYN/ACK that confirms that the TCP Server | of any retransmitted SYN/ACK that confirms that the TCP Server | |||
supports AccECN. If the final ACK of the handshake does not arrive | supports AccECN. If the final ACK of the handshake does not arrive | |||
before its retransmission timer expires, the TCP Server is follow the | before its retransmission timer expires, the procedure that the TCP | |||
procedure given in Section 3.1.4.2. | Server will follow is given in Section 3.1.4.2. | |||
+==================+================+=====================+ | +==================+================+=====================+ | |||
| IP ECN Codepoint | ACE on Pure | r.cep of TCP Client | | | IP ECN Codepoint | ACE on Pure | r.cep of TCP Client | | |||
| on SYN/ACK | ACK of SYN/ACK | in AccECN Mode | | | on SYN/ACK | ACK of SYN/ACK | in AccECN Mode | | |||
+==================+================+=====================+ | +==================+================+=====================+ | |||
| Not-ECT | 0b010 | 5 | | | Not-ECT | 0b010 | 5 | | |||
+------------------+----------------+---------------------+ | +------------------+----------------+---------------------+ | |||
| ECT(1) | 0b011 | 5 | | | ECT(1) | 0b011 | 5 | | |||
+------------------+----------------+---------------------+ | +------------------+----------------+---------------------+ | |||
| ECT(0) | 0b100 | 5 | | | ECT(0) | 0b100 | 5 | | |||
skipping to change at line 1498 | skipping to change at line 1498 | |||
mangling of the IP ECN field is asymmetric, which is currently common | mangling of the IP ECN field is asymmetric, which is currently common | |||
over some mobile networks [Mandalari18]. In this case, one end might | over some mobile networks [Mandalari18]. In this case, one end might | |||
see no unsafe transition and continue sending ECN-capable packets, | see no unsafe transition and continue sending ECN-capable packets, | |||
while the other end sees an unsafe transition and stops sending ECN- | while the other end sees an unsafe transition and stops sending ECN- | |||
capable packets. | capable packets. | |||
Invalid transitions of the IP ECN field are defined in Section 18 of | Invalid transitions of the IP ECN field are defined in Section 18 of | |||
the Classic ECN specification [RFC3168] and repeated here for | the Classic ECN specification [RFC3168] and repeated here for | |||
convenience: | convenience: | |||
* the Not-ECT codepoint changes. | * the Not-ECT codepoint changes; | |||
* either ECT codepoint transitions to Not-ECT. | * either ECT codepoint transitions to Not-ECT; | |||
* the CE codepoint changes. | * the CE codepoint changes. | |||
RFC 3168 says that a router that changes ECT to Not-ECT is invalid | RFC 3168 says that a router that changes ECT to Not-ECT is invalid | |||
but safe. However, from a host's viewpoint, this transition is | but safe. However, from a host's viewpoint, this transition is | |||
unsafe because it could be the result of two transitions at different | unsafe because it could be the result of two transitions at different | |||
routers on the path: ECT to CE (safe) then CE to Not-ECT (unsafe). | routers on the path: ECT to CE (safe) then CE to Not-ECT (unsafe). | |||
This scenario could well happen where an ECN-enabled home router | This scenario could well happen where an ECN-enabled home router | |||
congests its upstream mobile broadband bottleneck link, then the | congests its upstream mobile broadband bottleneck link, then the | |||
ingress to the mobile network clears the ECN field [Mandalari18]. | ingress to the mobile network clears the ECN field [Mandalari18]. | |||
skipping to change at line 1532 | skipping to change at line 1532 | |||
If AccECN has been successfully negotiated, the Data Sender MAY check | If AccECN has been successfully negotiated, the Data Sender MAY check | |||
the value of the ACE counter in the first feedback packet (with or | the value of the ACE counter in the first feedback packet (with or | |||
without data) that arrives after the three-way handshake. If the | without data) that arrives after the three-way handshake. If the | |||
value of this ACE field is found to be zero (0b000), for the | value of this ACE field is found to be zero (0b000), for the | |||
remainder of the half-connection the Data Sender ought to send non- | remainder of the half-connection the Data Sender ought to send non- | |||
ECN-capable packets and it is advised not to respond to any feedback | ECN-capable packets and it is advised not to respond to any feedback | |||
of CE markings. | of CE markings. | |||
Reason: the symptoms imply any or all of the following: | Reason: the symptoms imply any or all of the following: | |||
* the remote peer has somehow entered Not ECN feedback mode. | * the remote peer has somehow entered Not ECN feedback mode; | |||
* a broken remote TCP implementation. | * a broken remote TCP implementation; | |||
* potential mangling of the ECN fields in the TCP headers (although | * potential mangling of the ECN fields in the TCP headers (although | |||
unlikely given they clearly survived during the handshake). | unlikely given they clearly survived during the handshake). | |||
This advice is not stated normatively (in capitals), because the best | This advice is not stated normatively (in capitals), because the best | |||
strategy might depend on the likelihood to experience these | strategy might depend on the likelihood to experience these | |||
scenarios, which can only be known at the time of deployment. | scenarios, which can only be known at the time of deployment. | |||
Note that a host in AccECN mode MUST continue to provide Accurate ECN | Note that a host in AccECN mode MUST continue to provide Accurate ECN | |||
feedback to its peer, even if it is no longer sending ECT itself over | feedback to its peer, even if it is no longer sending ECT itself over | |||
skipping to change at line 1586 | skipping to change at line 1586 | |||
The following rules define when the receiver of a packet in AccECN | The following rules define when the receiver of a packet in AccECN | |||
mode emits an ACK: | mode emits an ACK: | |||
Change-Triggered ACKs: An AccECN Data Receiver SHOULD emit an ACK | Change-Triggered ACKs: An AccECN Data Receiver SHOULD emit an ACK | |||
whenever a data packet marked CE arrives after the previous packet | whenever a data packet marked CE arrives after the previous packet | |||
was not CE. | was not CE. | |||
Even though this rule is stated as a "SHOULD", it is important for | Even though this rule is stated as a "SHOULD", it is important for | |||
a transition to trigger an ACK if at all possible. The only valid | a transition to trigger an ACK if at all possible. The only valid | |||
exception to this rule is due to large receive offload (LRO) or | exception to this rule is due to Large Receive Offload (LRO) or | |||
generic receive offload (GRO) as further described below. | Generic Receive Offload (GRO) as further described below. | |||
For the avoidance of doubt, this rule is deliberately worded to | For the avoidance of doubt, this rule is deliberately worded to | |||
apply solely when _data_ packets arrive, but the comparison with | apply solely when _data_ packets arrive, but the comparison with | |||
the previous packet includes any packet, not just data packets. | the previous packet includes any packet, not just data packets. | |||
Increment-Triggered ACKs: An AccECN receiver of a packet MUST emit | Increment-Triggered ACKs: An AccECN receiver of a packet MUST emit | |||
an ACK if 'n' CE marks have arrived since the previous ACK. If | an ACK if 'n' CE marks have arrived since the previous ACK. If | |||
there is unacknowledged data at the receiver, 'n' SHOULD be 2. If | there is unacknowledged data at the receiver, 'n' SHOULD be 2. If | |||
there is no unacknowledged data at the receiver, 'n' SHOULD be 3 | there is no unacknowledged data at the receiver, 'n' SHOULD be 3 | |||
and MUST be no less than 3. In either case, 'n' MUST be no | and MUST be no less than 3. In either case, 'n' MUST be no | |||
skipping to change at line 2074 | skipping to change at line 2074 | |||
For the avoidance of doubt, this rule does not concern the arrival | For the avoidance of doubt, this rule does not concern the arrival | |||
of control packets with no payload, because they cannot alter any | of control packets with no payload, because they cannot alter any | |||
byte counters. | byte counters. | |||
Continual Repetition: Otherwise, if arriving packets continue to | Continual Repetition: Otherwise, if arriving packets continue to | |||
increment the same byte counter: | increment the same byte counter: | |||
* the Data Receiver SHOULD include a counter that has continued | * the Data Receiver SHOULD include a counter that has continued | |||
to increment on the next scheduled ACK following a change- | to increment on the next scheduled ACK following a change- | |||
triggered AccECN TCP Option. | triggered AccECN TCP Option; | |||
* while the same counter continues to increment, it SHOULD | * while the same counter continues to increment, it SHOULD | |||
include the counter every n ACKs as consistently as possible, | include the counter every n ACKs as consistently as possible, | |||
where n can be chosen by the implementer. | where n can be chosen by the implementer; | |||
* It SHOULD always include an AccECN Option if the r.ceb counter | * it SHOULD always include an AccECN Option if the r.ceb counter | |||
is incrementing and it MAY include an AccECN Option if r.ec0b | is incrementing and it MAY include an AccECN Option if r.ec0b | |||
or r.ec1b is incrementing. | or r.ec1b is incrementing; | |||
* It SHOULD include each counter at least once for every 2^22 | * it SHOULD include each counter at least once for every 2^22 | |||
bytes incremented to prevent overflow during continual | bytes incremented to prevent overflow during continual | |||
repetition. | repetition. | |||
The above rules complement those in Section 3.2.2.5, which determine | The above rules complement those in Section 3.2.2.5, which determine | |||
when to generate an ACK irrespective of whether an AccECN TCP Option | when to generate an ACK irrespective of whether an AccECN TCP Option | |||
is to be included. | is to be included. | |||
The recommended scheme is intended as a simple way to ensure that all | The recommended scheme is intended as a simple way to ensure that all | |||
the relevant byte counters will be carried on any ACK that reaches | the relevant byte counters will be carried on any ACK that reaches | |||
the Data Sender, no matter how many pure ACKs are filtered or | the Data Sender, no matter how many pure ACKs are filtered or | |||
End of changes. 26 change blocks. | ||||
36 lines changed or deleted | 36 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. |
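
The context lines from Sections 2.3 and 3.2.1 quoted in this diff describe two things that a small sketch may make concrete: the initial counter values a host uses on entering AccECN mode, and the ambiguity a Data Sender faces when the 3-bit ACE field wraps. The Python below is illustrative only and is not taken from either revision of the RFC. The names `AccEcnCounters`, `min_ace_increment`, and `candidate_increments` are hypothetical, and the sketch assumes that the ACE field carries the three least significant bits of the peer's r.cep. It lists the increments that are consistent with a received ACE value; it does not decide which one is correct, which would need further information (for example, how much data the ACK newly acknowledges).

```python
from dataclasses import dataclass

ACE_BITS = 3
ACE_MOD = 1 << ACE_BITS  # a 3-bit field repeats every 8 increments


@dataclass
class AccEcnCounters:
    """Initial values quoted from Section 3.2.1 (hypothetical container)."""
    cep: int = 5  # CE packet counter (r.cep / s.cep)
    e0b: int = 1  # ECT(0) byte counter
    e1b: int = 1  # ECT(1) byte counter
    ceb: int = 0  # CE byte counter


def min_ace_increment(s_cep: int, ace: int) -> int:
    """Smallest non-negative increase of s.cep consistent with `ace`.

    Assumption: `ace` holds the 3 least significant bits of the peer's
    r.cep. The true increase may be larger by any multiple of 8 if ACKs
    carrying intervening values were lost or delayed.
    """
    if not 0 <= ace < ACE_MOD:
        raise ValueError("ACE is a 3-bit field")
    return (ace - s_cep) % ACE_MOD


def candidate_increments(s_cep: int, ace: int, max_wraps: int) -> list[int]:
    """All increases consistent with `ace`, allowing up to `max_wraps` wraps."""
    base = min_ace_increment(s_cep, ace)
    return [base + k * ACE_MOD for k in range(max_wraps + 1)]
```

For instance, a sender whose s.cep is still at its initial value of 5 and that receives ACE = 0b110 would compute a minimum increment of 1, while `candidate_increments(5, 0b110, 1)` also returns 9 as a possibility. This is the "incremented by one, or cycled completely and then incremented by one" case described in Section 2.3.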