rfc9816v1.txt | rfc9816.txt | |||
---|---|---|---|---|
Internet Engineering Task Force (IETF) K. Patel | Internet Engineering Task Force (IETF) K. Patel | |||
Request for Comments: 9816 Arrcus, Inc. | Request for Comments: 9816 A. Lindem | |||
Category: Informational A. Lindem | Category: Informational Arrcus, Inc. | |||
ISSN: 2070-1721 LabN Consulting, L.L.C. | ISSN: 2070-1721 S. Zandi | |||
S. Zandi | ||||
G. Dawra | G. Dawra | |||
J. Dong | J. Dong | |||
Huawei Technologies | Huawei Technologies | |||
July 2025 | July 2025 | |||
Usage and Applicability of BGP Link-State Shortest Path Routing (BGP- | Usage and Applicability of BGP Link-State Shortest Path First (SPF) | |||
SPF) in Data Centers | Routing in Data Centers | |||
Abstract | Abstract | |||
This document discusses the usage and applicability of BGP Link-State | This document discusses the usage and applicability of BGP Link State | |||
Shortest Path First (BGP-SPF) extensions in data center networks | (BGP-LS) Shortest Path First (SPF) extensions in data center networks | |||
utilizing Clos or Fat Tree topologies. The document is intended to | utilizing Clos or Fat Tree topologies. The document is intended to | |||
provide simplified guidance for the deployment of BGP-SPF extensions. | provide simplified guidance for the deployment of BGP-LS SPF | |||
extensions. | ||||
Status of This Memo | Status of This Memo | |||
This document is not an Internet Standards Track specification; it is | This document is not an Internet Standards Track specification; it is | |||
published for informational purposes. | published for informational purposes. | |||
This document is a product of the Internet Engineering Task Force | This document is a product of the Internet Engineering Task Force | |||
(IETF). It represents the consensus of the IETF community. It has | (IETF). It represents the consensus of the IETF community. It has | |||
received public review and has been approved for publication by the | received public review and has been approved for publication by the | |||
Internet Engineering Steering Group (IESG). Not all documents | Internet Engineering Steering Group (IESG). Not all documents | |||
skipping to change at line 71 | skipping to change at line 71 | |||
5. BGP-SPF Applicability to Clos Networks | 5. BGP-SPF Applicability to Clos Networks | |||
5.1. Usage of BGP-LS-SPF SAFI | 5.1. Usage of BGP-LS-SPF SAFI | |||
5.1.1. Relationship to Other BGP AFI/SAFI Tuples | 5.1.1. Relationship to Other BGP AFI/SAFI Tuples | |||
5.2. Peering Models | 5.2. Peering Models | |||
5.2.1. Sparse Peering Model | 5.2.1. Sparse Peering Model | |||
5.2.2. Biconnected Graph Heuristic | 5.2.2. Biconnected Graph Heuristic | |||
5.3. BGP Spine/Leaf Topology Policy | 5.3. BGP Spine/Leaf Topology Policy | |||
5.4. BGP Peer Discovery Considerations | 5.4. BGP Peer Discovery Considerations | |||
5.5. BGP Peer Discovery | 5.5. BGP Peer Discovery | |||
5.5.1. BGP IPv6 Simplified Peering | 5.5.1. BGP IPv6 Simplified Peering | |||
5.5.2. BGP-LS SPF Topology Visibility for Management | 5.5.2. BGP-LS-SPF Topology Visibility for Management | |||
5.5.3. Data Center Interconnect (DCI) Applicability | 5.5.3. Data Center Interconnect (DCI) Applicability | |||
6. Non-Clos / Fat Tree Topology Applicability | 6. Non-Clos / Fat Tree Topology Applicability | |||
7. Non-Transit Node Capability | 7. Non-Transit Node Capability | |||
8. BGP Policy Applicability | 8. BGP Policy Applicability | |||
9. IANA Considerations | 9. IANA Considerations | |||
10. Security Considerations | 10. Security Considerations | |||
11. References | 11. References | |||
11.1. Normative References | 11.1. Normative References | |||
11.2. Informative References | 11.2. Informative References | |||
Acknowledgements | Acknowledgements | |||
Authors' Addresses | Authors' Addresses | |||
1. Introduction | 1. Introduction | |||
This document complements [RFC9815] by discussing the applicability | This document complements [RFC9815] by discussing the applicability | |||
of the BGP-SPF technology in a simple and fairly common deployment | of the BGP Link State (BGP-LS) Shortest Path First (SPF) technology | |||
scenario, which is described in Section 3. | in a simple and fairly common deployment scenario, which is described | |||
in Section 3. | ||||
Section 4 describes the reasons for BGP modifications for such | Section 4 describes the reasons for BGP modifications for such | |||
deployments. | deployments. | |||
Section 5 covers the BGP-SPF protocol enhancements to BGP to meet | Section 5 covers the BGP SPF protocol enhancements to BGP to meet | |||
these requirements and their applicability to data center [Clos] | these requirements and their applicability to data center [Clos] | |||
networks. | networks. | |||
2. Recommended Reading | 2. Recommended Reading | |||
This document assumes knowledge of existing data center networks and | This document assumes knowledge of existing data center networks and | |||
data center network topologies [Clos]. This document also assumes | data center network topologies [Clos]. This document also assumes | |||
knowledge of data center routing protocols such as BGP [RFC4271], | knowledge of data center routing protocols such as BGP [RFC4271], | |||
BGP-SPF [RFC9815], and OSPF [RFC2328] [RFC5340] as well as data | BGP-LS SPF [RFC9815], and OSPF [RFC2328] [RFC5340] as well as data | |||
center Operations, Administration, and Maintenance (OAM) protocols | center Operations, Administration, and Maintenance (OAM) protocols | |||
like the Link Layer Discovery Protocol (LLDP) [RFC4957] and | like the Link Layer Discovery Protocol (LLDP) [RFC4957] and | |||
Bidirectional Forwarding Detection (BFD) [RFC5880]. | Bidirectional Forwarding Detection (BFD) [RFC5880]. | |||
3. Common Deployment Scenario | 3. Common Deployment Scenario | |||
Within a data center, servers are commonly interconnected using the | Within a data center, servers are commonly interconnected using the | |||
Clos topology [Clos]. The Clos topology is fully non-blocking, and | Clos topology [Clos]. The Clos topology is fully non-blocking, and | |||
the topology is realized using Equal-Cost Multipath (ECMP). In a | the topology is realized using Equal-Cost Multipath (ECMP). In a | |||
multi-stage Clos topology, the minimum number of parallel paths in | multi-stage Clos topology, the minimum number of parallel paths in | |||
skipping to change at line 165 | skipping to change at line 166 | |||
resolve any overlay next hops. The hop-by-hop BGP peering paradigm | resolve any overlay next hops. The hop-by-hop BGP peering paradigm | |||
imposes several restrictions within a Clos. It prohibits the | imposes several restrictions within a Clos. It prohibits the | |||
deployment of route reflectors / route controllers as the EBGP | deployment of route reflectors / route controllers as the EBGP | |||
sessions are congruent with the data path. The BGP best-path | sessions are congruent with the data path. The BGP best-path | |||
algorithm is prefix based, and it prevents announcements of prefixes | algorithm is prefix based, and it prevents announcements of prefixes | |||
to other BGP speakers until the best-path decision process has been | to other BGP speakers until the best-path decision process has been | |||
performed for the prefix at each intermediate hop. These | performed for the prefix at each intermediate hop. These | |||
restrictions significantly delay the overall convergence of the | restrictions significantly delay the overall convergence of the | |||
underlay network within a Clos network. | underlay network within a Clos network. | |||
The BGP-SPF modifications allow BGP to overcome these limitations. | The BGP SPF modifications allow BGP to overcome these limitations. | |||
Furthermore, using the BGP-LS Network Layer Reachability Information | Furthermore, using the BGP-LS Network Layer Reachability Information | |||
(NLRI) format allows the BGP-SPF data to be advertised for nodes, | (NLRI) format allows the BGP SPF data to be advertised for nodes, | |||
links, and prefixes in the BGP routing domain and used for SPF | links, and prefixes in the BGP routing domain and used for SPF | |||
computations [RFC9552]. | computations [RFC9552]. | |||
Additional motivation for deploying BGP-SPF is included in [RFC9815]. | Additional motivation for deploying BGP-SPF is included in [RFC9815]. | |||
5. BGP-SPF Applicability to Clos Networks | 5. BGP-SPF Applicability to Clos Networks | |||
With the BGP-SPF extensions [RFC9815], the BGP best-path computation | With the BGP-SPF extensions [RFC9815], the BGP best-path computation | |||
and route computation are replaced with link-state algorithms such as | and route computation are replaced with link-state algorithms such as | |||
those used by OSPF [RFC2328], both to determine whether a BGP-LS-SPF | those used by OSPF [RFC2328], both to determine whether a BGP-LS-SPF | |||
NLRI has changed and needs to be readvertised and to compute the BGP | NLRI has changed and needs to be readvertised and to compute the BGP | |||
routes. These modifications will significantly improve convergence | routes. These modifications will significantly improve convergence | |||
of the underlay while affording the operational benefits of a single | of the underlay while affording the operational benefits of a single | |||
routing protocol [RFC7938]. | routing protocol [RFC7938]. | |||
Data center controllers typically require visibility to the BGP | Data center controllers typically require visibility to the BGP | |||
topology to compute traffic-engineered paths. These controllers | topology to compute traffic-engineered paths. These controllers | |||
learn the topology and other relevant information via the BGP-LS | learn the topology and other relevant information via the BGP-LS | |||
address family [RFC9552], which is totally independent of the | address family [RFC9552], which is totally independent of the | |||
underlay address families (usually IPv4/IPv6 unicast). Furthermore, | underlay address families (usually IPv4/IPv6 unicast). Furthermore, | |||
in traditional BGP underlays, all the BGP routers will need to | in usual BGP underlays, all the BGP routers will need to advertise | |||
advertise their BGP-LS information independently. With the BGP-SPF | their BGP-LS information independently. With the BGP-SPF extensions, | |||
extensions, controllers can learn the topology using the same BGP | controllers can learn the topology using the same BGP advertisements | |||
advertisements used to compute the underlay routes. Furthermore, | used to compute the underlay routes. Furthermore, these data center | |||
these data center controllers can avail the convergence advantages of | controllers can avail the convergence advantages of the BGP-SPF | |||
the BGP-SPF extensions. The placement of controllers can be outside | extensions. The placement of controllers can be outside of the | |||
of the forwarding path or within the forwarding path. | forwarding path or within the forwarding path. | |||
Alternatively, as each and every router in the BGP-SPF domain will | Alternatively, as each and every router in the BGP-SPF domain will | |||
have a complete view of the topology, the operator can also choose to | have a complete view of the topology, the operator can also choose to | |||
configure BGP sessions in the hop-by-hop peering model described in | configure BGP sessions in the hop-by-hop peering model described in | |||
[RFC7938] along with BFD [RFC5880]. In doing so, while the hop-by- | [RFC7938] along with BFD [RFC5880]. In doing so, while the hop-by- | |||
hop peering model lacks the inherent benefits of the controller-based | hop peering model lacks the inherent benefits of the controller-based | |||
model, BGP updates need not be serialized by the BGP best-path | model, BGP updates need not be serialized by the BGP best-path | |||
algorithm in either of these models. This helps overall network | algorithm in either of these models. This helps overall network | |||
convergence. | convergence. | |||
skipping to change at line 407 | skipping to change at line 408 | |||
To conserve IPv4 address space and simplify operations, BGP-SPF | To conserve IPv4 address space and simplify operations, BGP-SPF | |||
routers in Clos / Fat Tree deployments can use IPv6 addresses as the | routers in Clos / Fat Tree deployments can use IPv6 addresses as the | |||
peer address. For IPv4 address families, IPv6 peering as specified | peer address. For IPv4 address families, IPv6 peering as specified | |||
in [RFC8950] can be deployed to avoid configuring IPv4 addresses on | in [RFC8950] can be deployed to avoid configuring IPv4 addresses on | |||
router interfaces. When this is done, dynamic discovery mechanisms, | router interfaces. When this is done, dynamic discovery mechanisms, | |||
as described in Section 5.5, can be used to learn the global or link- | as described in Section 5.5, can be used to learn the global or link- | |||
local IPv6 peer addresses, and IPv4 addresses need not be configured | local IPv6 peer addresses, and IPv4 addresses need not be configured | |||
on these interfaces. If IPv6 link-local peering is used, then | on these interfaces. If IPv6 link-local peering is used, then | |||
configuration of IPv6 global addresses is also not required | configuration of IPv6 global addresses is also not required | |||
[RFC7404]. The Link Local/Remote Identifiers of the peering | [RFC7404]. The Link Local/Remote Identifiers of the peering | |||
interfaces MUST be used in the Link NLRI as described in | interfaces must be used in the Link NLRI as described in | |||
Section 5.2.2 of [RFC9815]. | Section 5.2.2 of [RFC9815]. | |||
5.5.2. BGP-LS SPF Topology Visibility for Management | 5.5.2. BGP-LS-SPF Topology Visibility for Management | |||
Irrespective of whether or not BGP-SPF is used for route calculation, | Irrespective of whether or not BGP-SPF is used for route calculation, | |||
the BGP-LS-SPF route advertisements can be used to periodically | the BGP-LS-SPF route advertisements can be used to periodically | |||
construct the Clos / Fat Tree topology. This is especially useful in | construct the Clos / Fat Tree topology. This is especially useful in | |||
deployments where an Interior Gateway Protocol (IGP) is not used and | deployments where an Interior Gateway Protocol (IGP) is not used and | |||
the base BGP-LS routes [RFC9552] are not available. The resultant | the base BGP-LS routes [RFC9552] are not available. The resultant | |||
topology visibility can then be used for troubleshooting and | topology visibility can then be used for troubleshooting and | |||
consistency checking. This would normally be done on a central | consistency checking. This would normally be done on a central | |||
controller or other management tool that could also be used for | controller or other management tool that could also be used for | |||
fabric data path verification. The precise algorithms and | fabric data path verification. The precise algorithms and | |||
skipping to change at line 463 | skipping to change at line 464 | |||
accomplished using the BGP-LS-SPF Node NLRI Attribute SPF Status TLV | accomplished using the BGP-LS-SPF Node NLRI Attribute SPF Status TLV | |||
as described in [RFC9815]. | as described in [RFC9815]. | |||
8. BGP Policy Applicability | 8. BGP Policy Applicability | |||
Existing BGP policy such as prefix filtering may be used in | Existing BGP policy such as prefix filtering may be used in | |||
conjunction with the BGP-LS-SPF SAFI. When BGP policy is used with | conjunction with the BGP-LS-SPF SAFI. When BGP policy is used with | |||
the BGP-LS-SPF SAFI, BGP speakers in the BGP-LS-SPF routing domain | the BGP-LS-SPF SAFI, BGP speakers in the BGP-LS-SPF routing domain | |||
will not all have the same set of NLRIs and will compute a different | will not all have the same set of NLRIs and will compute a different | |||
BGP local routing table. Consequently, care must be taken to assure | BGP local routing table. Consequently, care must be taken to assure | |||
routing is consistent and blackholes or routing loops do not ensue. | that routing is consistent and that routes to unreachable | |||
However, this is no different than if traditional BGP routing using | destinations or routing loops do not ensue. However, this is no | |||
the IPv4 and IPv6 address families were used. | different than if classical BGP routing using the IPv4 and IPv6 | |||
address families were used. | ||||
9. IANA Considerations | 9. IANA Considerations | |||
This document has no IANA actions. | This document has no IANA actions. | |||
10. Security Considerations | 10. Security Considerations | |||
This document introduces no new security considerations above and | This document introduces no new security considerations above and | |||
beyond those already specified in [RFC4271] and [RFC9815]. | beyond those already specified in [RFC4271] and [RFC9815]. | |||
skipping to change at line 592 | skipping to change at line 594 | |||
Authors' Addresses | Authors' Addresses | |||
Keyur Patel | Keyur Patel | |||
Arrcus, Inc. | Arrcus, Inc. | |||
2077 Gateway Pl | 2077 Gateway Pl | |||
San Jose, CA 95110 | San Jose, CA 95110 | |||
United States of America | United States of America | |||
Email: keyur@arrcus.com | Email: keyur@arrcus.com | |||
Acee Lindem | Acee Lindem | |||
LabN Consulting, L.L.C. | Arrcus, Inc. | |||
301 Midenhall Way | 301 Midenhall Way | |||
Cary, NC 95110 | Cary, NC 27513 | |||
United States of America | United States of America | |||
Email: acee.ietf@gmail.com | Email: acee.ietf@gmail.com | |||
Shawn Zandi | Shawn Zandi | |||
222 2nd Street | 222 2nd Street | |||
San Francisco, CA 94105 | San Francisco, CA 94105 | |||
United States of America | United States of America | |||
Email: szandi@linkedin.com | Email: szandi@linkedin.com | |||
End of changes. 16 change blocks. | ||||
30 lines changed or deleted | 32 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. |
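
Illustrative note on Sections 3 and 5 of the compared text: both versions state that the Clos topology is realized using ECMP and that BGP-SPF replaces the BGP best-path computation with link-state algorithms such as those used by OSPF. The sketch below is not taken from RFC 9815 or from either compared version. It is a minimal Dijkstra computation over a small, hypothetical two-spine, three-leaf fabric, showing how a single SPF run can yield every equal-cost first hop toward a destination. The function name spf_ecmp, the (node, node, metric) link tuples, the node names, and the unit metrics are all assumptions made for illustration. A real implementation would build its graph from BGP-LS-SPF Node, Link, and Prefix NLRI.

```python
import heapq
from collections import defaultdict


def spf_ecmp(links, source):
    """Run Dijkstra from `source` and keep all equal-cost first hops.

    `links` is an iterable of (node_a, node_b, metric) tuples describing
    bidirectional links with positive metrics (a hypothetical input
    format, not an NLRI encoding).
    Returns {node: (cost, set_of_first_hop_neighbors)}.
    """
    graph = defaultdict(list)
    for a, b, metric in links:
        graph[a].append((b, metric))
        graph[b].append((a, metric))

    dist = {source: 0}
    first_hops = {source: set()}
    heap = [(0, source)]
    done = set()

    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        for nbr, metric in graph[node]:
            new_cost = cost + metric
            # A direct neighbor of the source is its own first hop;
            # otherwise inherit the first hops of the node being expanded.
            hops = {nbr} if node == source else first_hops[node]
            if nbr not in dist or new_cost < dist[nbr]:
                dist[nbr] = new_cost
                first_hops[nbr] = set(hops)
                heapq.heappush(heap, (new_cost, nbr))
            elif new_cost == dist[nbr]:
                # Equal-cost path found: merge first hops (ECMP).
                first_hops[nbr] |= hops

    return {n: (dist[n], first_hops[n]) for n in dist}


# Hypothetical fabric: every leaf connects to every spine with metric 1.
LEAVES = ("leaf1", "leaf2", "leaf3")
SPINES = ("spine1", "spine2")
LINKS = [(leaf, spine, 1) for leaf in LEAVES for spine in SPINES]

routes = spf_ecmp(LINKS, "leaf1")
for node in sorted(routes):
    cost, hops = routes[node]
    print(node, cost, sorted(hops))
```

In this sketch, leaf1 reaches each other leaf at cost 2 through both spines, so the number of equal-cost first hops equals the number of spines. Because every router holds the same link-state data, each can run this computation locally, without waiting for a per-prefix best-path decision at each intermediate hop.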