rfc9816v1.txt   rfc9816.txt 
Internet Engineering Task Force (IETF) K. Patel Internet Engineering Task Force (IETF) K. Patel
Request for Comments: 9816 Arrcus, Inc. Request for Comments: 9816 A. Lindem
Category: Informational A. Lindem Category: Informational Arrcus, Inc.
ISSN: 2070-1721 LabN Consulting, L.L.C. ISSN: 2070-1721 S. Zandi
S. Zandi
LinkedIn LinkedIn
G. Dawra G. Dawra
LinkedIn LinkedIn
J. Dong J. Dong
Huawei Technologies Huawei Technologies
July 2025 July 2025
Usage and Applicability of BGP Link-State Shortest Path Routing (BGP- Usage and Applicability of BGP Link-State Shortest Path First (SPF)
SPF) in Data Centers Routing in Data Centers
Abstract Abstract
This document discusses the usage and applicability of BGP Link-State This document discusses the usage and applicability of BGP Link State
Shortest Path First (BGP-SPF) extensions in data center networks (BGP-LS) Shortest Path First (SPF) extensions in data center networks
utilizing Clos or Fat Tree topologies. The document is intended to utilizing Clos or Fat Tree topologies. The document is intended to
provide simplified guidance for the deployment of BGP-SPF extensions. provide simplified guidance for the deployment of BGP-LS SPF
extensions.
Status of This Memo Status of This Memo
This document is not an Internet Standards Track specification; it is This document is not an Internet Standards Track specification; it is
published for informational purposes. published for informational purposes.
This document is a product of the Internet Engineering Task Force This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has (IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Not all documents Internet Engineering Steering Group (IESG). Not all documents
skipping to change at line 71 skipping to change at line 71
5. BGP-SPF Applicability to Clos Networks 5. BGP-SPF Applicability to Clos Networks
5.1. Usage of BGP-LS-SPF SAFI 5.1. Usage of BGP-LS-SPF SAFI
5.1.1. Relationship to Other BGP AFI/SAFI Tuples 5.1.1. Relationship to Other BGP AFI/SAFI Tuples
5.2. Peering Models 5.2. Peering Models
5.2.1. Sparse Peering Model 5.2.1. Sparse Peering Model
5.2.2. Biconnected Graph Heuristic 5.2.2. Biconnected Graph Heuristic
5.3. BGP Spine/Leaf Topology Policy 5.3. BGP Spine/Leaf Topology Policy
5.4. BGP Peer Discovery Considerations 5.4. BGP Peer Discovery Considerations
5.5. BGP Peer Discovery 5.5. BGP Peer Discovery
5.5.1. BGP IPv6 Simplified Peering 5.5.1. BGP IPv6 Simplified Peering
5.5.2. BGP-LS SPF Topology Visibility for Management 5.5.2. BGP-LS-SPF Topology Visibility for Management
5.5.3. Data Center Interconnect (DCI) Applicability 5.5.3. Data Center Interconnect (DCI) Applicability
6. Non-Clos / Fat Tree Topology Applicability 6. Non-Clos / Fat Tree Topology Applicability
7. Non-Transit Node Capability 7. Non-Transit Node Capability
8. BGP Policy Applicability 8. BGP Policy Applicability
9. IANA Considerations 9. IANA Considerations
10. Security Considerations 10. Security Considerations
11. References 11. References
11.1. Normative References 11.1. Normative References
11.2. Informative References 11.2. Informative References
Acknowledgements Acknowledgements
Authors' Addresses Authors' Addresses
1. Introduction 1. Introduction
This document complements [RFC9815] by discussing the applicability This document complements [RFC9815] by discussing the applicability
of the BGP-SPF technology in a simple and fairly common deployment of the BGP Link State (BGP-LS) Shortest Path First (SPF) technology
scenario, which is described in Section 3. in a simple and fairly common deployment scenario, which is described
in Section 3.
Section 4 describes the reasons for BGP modifications for such Section 4 describes the reasons for BGP modifications for such
deployments. deployments.
Section 5 covers the BGP-SPF protocol enhancements to BGP to meet Section 5 covers the BGP SPF protocol enhancements to BGP to meet
these requirements and their applicability to data center [Clos] these requirements and their applicability to data center [Clos]
networks. networks.
2. Recommended Reading 2. Recommended Reading
This document assumes knowledge of existing data center networks and This document assumes knowledge of existing data center networks and
data center network topologies [Clos]. This document also assumes data center network topologies [Clos]. This document also assumes
knowledge of data center routing protocols such as BGP [RFC4271], knowledge of data center routing protocols such as BGP [RFC4271],
BGP-SPF [RFC9815], and OSPF [RFC2328] [RFC5340] as well as data BGP-LS SPF [RFC9815], and OSPF [RFC2328] [RFC5340] as well as data
center Operations, Administration, and Maintenance (OAM) protocols center Operations, Administration, and Maintenance (OAM) protocols
like the Link Layer Discovery Protocol (LLDP) [RFC4957] and like the Link Layer Discovery Protocol (LLDP) [RFC4957] and
Bidirectional Forwarding Detection (BFD) [RFC5880]. Bidirectional Forwarding Detection (BFD) [RFC5880].
3. Common Deployment Scenario 3. Common Deployment Scenario
Within a data center, servers are commonly interconnected using the Within a data center, servers are commonly interconnected using the
Clos topology [Clos]. The Clos topology is fully non-blocking, and Clos topology [Clos]. The Clos topology is fully non-blocking, and
the topology is realized using Equal-Cost Multipath (ECMP). In a the topology is realized using Equal-Cost Multipath (ECMP). In a
multi-stage Clos topology, the minimum number of parallel paths in multi-stage Clos topology, the minimum number of parallel paths in
skipping to change at line 165 skipping to change at line 166
resolve any overlay next hops. The hop-by-hop BGP peering paradigm resolve any overlay next hops. The hop-by-hop BGP peering paradigm
imposes several restrictions within a Clos. It prohibits the imposes several restrictions within a Clos. It prohibits the
deployment of route reflectors / route controllers as the EBGP deployment of route reflectors / route controllers as the EBGP
sessions are congruent with the data path. The BGP best-path sessions are congruent with the data path. The BGP best-path
algorithm is prefix based, and it prevents announcements of prefixes algorithm is prefix based, and it prevents announcements of prefixes
to other BGP speakers until the best-path decision process has been to other BGP speakers until the best-path decision process has been
performed for the prefix at each intermediate hop. These performed for the prefix at each intermediate hop. These
restrictions significantly delay the overall convergence of the restrictions significantly delay the overall convergence of the
underlay network within a Clos network. underlay network within a Clos network.
The BGP-SPF modifications allow BGP to overcome these limitations. The BGP SPF modifications allow BGP to overcome these limitations.
Furthermore, using the BGP-LS Network Layer Reachability Information Furthermore, using the BGP-LS Network Layer Reachability Information
(NLRI) format allows the BGP-SPF data to be advertised for nodes, (NLRI) format allows the BGP SPF data to be advertised for nodes,
links, and prefixes in the BGP routing domain and used for SPF links, and prefixes in the BGP routing domain and used for SPF
computations [RFC9552]. computations [RFC9552].
Additional motivation for deploying BGP-SPF is included in [RFC9815]. Additional motivation for deploying BGP-SPF is included in [RFC9815].
5. BGP-SPF Applicability to Clos Networks 5. BGP-SPF Applicability to Clos Networks
With the BGP-SPF extensions [RFC9815], the BGP best-path computation With the BGP-SPF extensions [RFC9815], the BGP best-path computation
and route computation are replaced with link-state algorithms such as and route computation are replaced with link-state algorithms such as
those used by OSPF [RFC2328], both to determine whether a BGP-LS-SPF those used by OSPF [RFC2328], both to determine whether a BGP-LS-SPF
NLRI has changed and needs to be readvertised and to compute the BGP NLRI has changed and needs to be readvertised and to compute the BGP
routes. These modifications will significantly improve convergence routes. These modifications will significantly improve convergence
of the underlay while affording the operational benefits of a single of the underlay while affording the operational benefits of a single
routing protocol [RFC7938]. routing protocol [RFC7938].
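The link-state computation described above can be illustrated with a minimal sketch. This is a generic Dijkstra run with equal-cost multipath tracking, not the exact procedure specified in [RFC9815]; the function name `spf_with_ecmp`, the helper `clos_links`, and the `(local, remote, metric)` tuples standing in for Link NLRI are assumptions made for illustration. It assumes strictly positive metrics and omits the bidirectional connectivity check a real implementation would perform.

```python
import heapq
from collections import defaultdict


def spf_with_ecmp(links, root):
    """Run Dijkstra from `root`; return {node: (cost, set_of_first_hops)}.

    `links` is an iterable of (local_node, remote_node, metric) tuples, a
    hypothetical stand-in for link advertisements. Metrics must be > 0.
    """
    graph = defaultdict(list)
    for local, remote, metric in links:
        graph[local].append((remote, metric))

    cost = {root: 0}
    first_hops = {root: set()}
    heap = [(0, root)]
    done = set()
    while heap:
        dist, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        for neighbor, metric in graph[node]:
            new_cost = dist + metric
            # A neighbor of the root is its own first hop; any other node
            # inherits the first hops of the node it was reached through.
            hops = {neighbor} if node == root else first_hops[node]
            if neighbor not in cost or new_cost < cost[neighbor]:
                cost[neighbor] = new_cost
                first_hops[neighbor] = set(hops)
                heapq.heappush(heap, (new_cost, neighbor))
            elif new_cost == cost[neighbor]:
                # Equal-cost path: merge first hops to form the ECMP set.
                first_hops[neighbor] |= hops
    return {n: (cost[n], first_hops[n]) for n in cost}


def clos_links(leaves, spines, metric=1):
    """Build bidirectional leaf-spine links for a two-tier Clos (illustrative)."""
    links = []
    for leaf in leaves:
        for spine in spines:
            links.append((leaf, spine, metric))
            links.append((spine, leaf, metric))
    return links
```

In a two-leaf, two-spine fabric built with `clos_links`, running `spf_with_ecmp` from one leaf reaches the other leaf through both spines at equal cost, which is the ECMP behavior the Clos topology relies on.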
Data center controllers typically require visibility to the BGP Data center controllers typically require visibility to the BGP
topology to compute traffic-engineered paths. These controllers topology to compute traffic-engineered paths. These controllers
learn the topology and other relevant information via the BGP-LS learn the topology and other relevant information via the BGP-LS
address family [RFC9552], which is totally independent of the address family [RFC9552], which is totally independent of the
underlay address families (usually IPv4/IPv6 unicast). Furthermore, underlay address families (usually IPv4/IPv6 unicast). Furthermore,
in traditional BGP underlays, all the BGP routers will need to in usual BGP underlays, all the BGP routers will need to advertise
advertise their BGP-LS information independently. With the BGP-SPF their BGP-LS information independently. With the BGP-SPF extensions,
extensions, controllers can learn the topology using the same BGP controllers can learn the topology using the same BGP advertisements
advertisements used to compute the underlay routes. Furthermore, used to compute the underlay routes. Furthermore, these data center
these data center controllers can avail themselves of the convergence advantages of controllers can avail themselves of the convergence advantages of the BGP-SPF
the BGP-SPF extensions. The placement of controllers can be outside extensions. The placement of controllers can be outside of the
of the forwarding path or within the forwarding path. forwarding path or within the forwarding path.
Alternatively, as each and every router in the BGP-SPF domain will Alternatively, as each and every router in the BGP-SPF domain will
have a complete view of the topology, the operator can also choose to have a complete view of the topology, the operator can also choose to
configure BGP sessions in the hop-by-hop peering model described in configure BGP sessions in the hop-by-hop peering model described in
[RFC7938] along with BFD [RFC5880]. In doing so, while the [RFC7938] along with BFD [RFC5880]. In doing so, while the
hop-by-hop peering model lacks the inherent benefits of the controller-based hop-by-hop peering model lacks the inherent benefits of the controller-based
model, BGP updates need not be serialized by the BGP best-path model, BGP updates need not be serialized by the BGP best-path
algorithm in either of these models. This helps overall network algorithm in either of these models. This helps overall network
convergence. convergence.
skipping to change at line 407 skipping to change at line 408
To conserve IPv4 address space and simplify operations, BGP-SPF To conserve IPv4 address space and simplify operations, BGP-SPF
routers in Clos / Fat Tree deployments can use IPv6 addresses as the routers in Clos / Fat Tree deployments can use IPv6 addresses as the
peer address. For IPv4 address families, IPv6 peering as specified peer address. For IPv4 address families, IPv6 peering as specified
in [RFC8950] can be deployed to avoid configuring IPv4 addresses on in [RFC8950] can be deployed to avoid configuring IPv4 addresses on
router interfaces. When this is done, dynamic discovery mechanisms, router interfaces. When this is done, dynamic discovery mechanisms,
as described in Section 5.5, can be used to learn the global or link- as described in Section 5.5, can be used to learn the global or link-
local IPv6 peer addresses, and IPv4 addresses need not be configured local IPv6 peer addresses, and IPv4 addresses need not be configured
on these interfaces. If IPv6 link-local peering is used, then on these interfaces. If IPv6 link-local peering is used, then
configuration of IPv6 global addresses is also not required configuration of IPv6 global addresses is also not required
[RFC7404]. The Link Local/Remote Identifiers of the peering [RFC7404]. The Link Local/Remote Identifiers of the peering
interfaces MUST be used in the Link NLRI as described in interfaces must be used in the Link NLRI as described in
Section 5.2.2 of [RFC9815]. Section 5.2.2 of [RFC9815].
5.5.2. BGP-LS SPF Topology Visibility for Management 5.5.2. BGP-LS-SPF Topology Visibility for Management
Irrespective of whether or not BGP-SPF is used for route calculation, Irrespective of whether or not BGP-SPF is used for route calculation,
the BGP-LS-SPF route advertisements can be used to periodically the BGP-LS-SPF route advertisements can be used to periodically
construct the Clos / Fat Tree topology. This is especially useful in construct the Clos / Fat Tree topology. This is especially useful in
deployments where an Interior Gateway Protocol (IGP) is not used and deployments where an Interior Gateway Protocol (IGP) is not used and
the base BGP-LS routes [RFC9552] are not available. The resultant the base BGP-LS routes [RFC9552] are not available. The resultant
topology visibility can then be used for troubleshooting and topology visibility can then be used for troubleshooting and
consistency checking. This would normally be done on a central consistency checking. This would normally be done on a central
controller or other management tool that could also be used for controller or other management tool that could also be used for
fabric data path verification. The precise algorithms and fabric data path verification. The precise algorithms and
skipping to change at line 463 skipping to change at line 464
accomplished using the BGP-LS-SPF Node NLRI Attribute SPF Status TLV accomplished using the BGP-LS-SPF Node NLRI Attribute SPF Status TLV
as described in [RFC9815]. as described in [RFC9815].
8. BGP Policy Applicability 8. BGP Policy Applicability
Existing BGP policy such as prefix filtering may be used in Existing BGP policy such as prefix filtering may be used in
conjunction with the BGP-LS-SPF SAFI. When BGP policy is used with conjunction with the BGP-LS-SPF SAFI. When BGP policy is used with
the BGP-LS-SPF SAFI, BGP speakers in the BGP-LS-SPF routing domain the BGP-LS-SPF SAFI, BGP speakers in the BGP-LS-SPF routing domain
will not all have the same set of NLRIs and will compute a different will not all have the same set of NLRIs and will compute a different
BGP local routing table. Consequently, care must be taken to assure BGP local routing table. Consequently, care must be taken to assure
routing is consistent and blackholes or routing loops do not ensue. that routing is consistent and that routes to unreachable
However, this is no different than if traditional BGP routing using destinations or routing loops do not ensue. However, this is no
the IPv4 and IPv6 address families were used. different than if classical BGP routing using the IPv4 and IPv6
address families were used.
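One way to exercise the care called for above is to compare the NLRI sets held by each speaker. The following is a minimal sketch under stated assumptions: the function name `nlri_differences` and the string keys representing NLRIs are hypothetical, and how the per-speaker sets are collected (for example, by a management tool) is left open.

```python
def nlri_differences(databases):
    """Return {speaker: missing_nlri} for speakers lacking some NLRI.

    `databases` maps a speaker name to the set of NLRI keys it holds.
    The keys are opaque strings here, a simplification of real NLRI.
    """
    # The union of all sets approximates the unfiltered view of the domain.
    all_nlri = set().union(*databases.values())
    differences = {}
    for speaker, nlri in databases.items():
        missing = all_nlri - nlri
        if missing:
            differences[speaker] = missing
    return differences
```

A non-empty result does not by itself indicate a fault, since policy may filter NLRI intentionally; it only flags the speakers whose local routing tables may differ and therefore deserve a check for unreachable destinations or loops.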
9. IANA Considerations 9. IANA Considerations
This document has no IANA actions. This document has no IANA actions.
10. Security Considerations 10. Security Considerations
This document introduces no new security considerations above and This document introduces no new security considerations above and
beyond those already specified in [RFC4271] and [RFC9815]. beyond those already specified in [RFC4271] and [RFC9815].
skipping to change at line 592 skipping to change at line 594
Authors' Addresses Authors' Addresses
Keyur Patel Keyur Patel
Arrcus, Inc. Arrcus, Inc.
2077 Gateway Pl 2077 Gateway Pl
San Jose, CA 95110 San Jose, CA 95110
United States of America United States of America
Email: keyur@arrcus.com Email: keyur@arrcus.com
Acee Lindem Acee Lindem
LabN Consulting, L.L.C. Arrcus, Inc.
301 Midenhall Way 301 Midenhall Way
Cary, NC 95110 Cary, NC 27513
United States of America United States of America
Email: acee.ietf@gmail.com Email: acee.ietf@gmail.com
Shawn Zandi Shawn Zandi
LinkedIn LinkedIn
222 2nd Street 222 2nd Street
San Francisco, CA 94105 San Francisco, CA 94105
United States of America United States of America
Email: szandi@linkedin.com Email: szandi@linkedin.com
 End of changes. 16 change blocks. 
30 lines changed or deleted 32 lines changed or added
