<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<rfc
      xmlns:xi="http://www.w3.org/2001/XInclude"
      category="info"
      docName="draft-xiong-hpwan-problem-statement-00"
      ipr="trust200902"
      obsoletes=""
      updates=""
      submissionType="IETF"
      xml:lang="en"
      tocInclude="true"
      tocDepth="4"
      symRefs="true"
      sortRefs="true"
      version="3">

 <!-- ***** FRONT MATTER ***** -->

 <front>

   <title abbrev="HP-WAN Problem Statement">Problem Statement for High Performance Wide Area Networks</title>
    <seriesInfo name="Internet-Draft" value="draft-xiong-hpwan-problem-statement-00"/>
   
   <author fullname="Quan Xiong" initials="Q" surname="Xiong">
      <organization>ZTE Corporation</organization>
      <address>
        <postal>
          <street/>
         <city></city>
          <region/>
          <code/>
          <country>China</country>
        </postal>
        <phone></phone>
        <email>xiong.quan@zte.com.cn</email>
     </address>
    </author>

	<author fullname="Kehan Yao" initials="K" surname="Yao">
      <organization>China Mobile</organization>
      <address>
        <postal>
          <street/>
         <city></city>
          <region/>
          <code/>
          <country>China</country>
        </postal>
        <phone></phone>
        <email>yaokehan@chinamobile.com</email>
     </address>
    </author>
	
    <author fullname="Cancan Huang" initials="C" surname="Huang">
      <organization>China Telecom</organization>

      <address>
        <postal>
          <street></street>
          
          <city></city>
          
          <region></region>
  
          <code></code>

          <country>China</country>
        </postal>

        <phone></phone>

        <email>huangcanc@chinatelecom.cn</email>
      </address>
    </author>
	
    <author fullname="Zhengxin Han" initials="Z" surname="Han">
      <organization>China Unicom</organization>

      <address>
        <postal>
          <street></street>
          
          <city></city>
          
          <region></region>
  
          <code></code>

          <country>China</country>
        </postal>

        <phone></phone>

        <email>hanzx21@chinaunicom.cn</email>
      </address>
    </author>
	
	<author fullname="Junfeng Zhao" initials="J" surname="Zhao">
      <organization>CAICT</organization>

      <address>
        <postal>
          <street></street>
          
          <city>Beijing</city>
          
          <region></region>
  
          <code></code>

          <country>China</country>
        </postal>

        <phone></phone>

        <email>zhaojunfeng@caict.ac.cn</email>
      </address>
    </author>		

   <area>WIT</area>
    <workgroup></workgroup>
   <keyword></keyword>
   
   <abstract>
	
	<t>A High Performance Wide Area Network (HP-WAN) serves scientific
	research, academia, education and other data-intensive applications
	that demand high-volume data transmission over WANs. It needs to
	support large-scale data processing and provide efficient
	transmission services. This document outlines the problems for
	HP-WANs.</t>
	  
    </abstract>
  </front>
  <middle>
  
   <section numbered="true" toc="default"> <name>Introduction</name>
	
   <t>As described in [I-D.kcrh-hpwan-state-of-art], data is fundamental
   to research, academia, education, industry and other data-intensive
   applications, such as High Performance Computing (HPC) for scientific
   research, cloud storage and backup of industrial internet data, and
   distributed training of Artificial Intelligence (AI) models. These
   applications may generate huge volumes of data by using advanced
   instruments and high-end computing devices. The network needs to
   complete large-scale data transfers within a completion time and
   provide stable and efficient transmission services over non-dedicated
   Wide Area Networks (WANs). These WANs connect research institutions,
   universities, and data centers across large geographical areas, and
   usually carry massive data over long-distance links. For example, data
   shared between research institutes may travel over hundreds or
   thousands of kilometers. Moreover, some applications demand periodic or
   on-demand migration with variable transmission frequency, which
   requires timely data transmission. Large data transfers that co-exist
   with other services over WANs demand high performance, such as
   effective high throughput, fairness among multiple services, and high
   network utilization.</t>
	
   <t>More recently, massive data transmission and long-distance connections
   over complicated WANs have become key factors affecting the performance
   of existing technologies. For example, high-volume data transmitted
   over WANs depends on transport protocols such as the Transmission
   Control Protocol (TCP), QUIC, and Remote Direct Memory Access (RDMA).
   Traditional congestion control mechanisms, which are typically
   implemented at the hosts (sender and receiver) to control or prevent
   congestion, cannot achieve high performance in this environment. A host
   adjusts its sending rate based on feedback from the network when packet
   loss or congestion occurs. However, the long feedback loop degrades
   performance, and the adjustment can be inefficient without fine-grained
   awareness of network capability. The network, for its part, forwards
   packets reactively, which leads to low bandwidth utilization because of
   bottleneck links and instantaneous congestion. The network could instead
   regulate traffic to avoid incast congestion preemptively, and it could
   actively collaborate with the host to adjust the rate efficiently and
   rapidly when congestion occurs. Negotiation between the host and the
   network is required to assist the network operator's traffic
   management, bandwidth allocation and utilization optimization, and to
   help the host adjust its rate based on acknowledgement of network
   resource scheduling. Therefore, host congestion control combined with
   more active network coordination should be considered to improve the
   overall transmission performance of HP-WANs.</t>
   
   <t>A High Performance Wide Area Network (HP-WAN) is designed specifically
   to meet the high-speed, low-latency, and high-capacity needs of
   applications with massive data sets. These applications impose
   high-performance requirements such as effective high throughput,
   fairness among multiple services and high bandwidth utilization. This
   document outlines the problems for HP-WANs.</t>
	
    
      <section numbered="true" toc="default"><name>Requirements Language</name>
	  
	 <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
       "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
       "OPTIONAL" in this document are to be interpreted as described in BCP
       14 <xref target="RFC2119" pageno="false" format="default"/> 
	   <xref target="RFC8174" pageno="false" format="default"/> when, and only when, 
	   they appear in all capitals, as shown here.</t>
	   
      </section>
    </section>
	
    <section anchor="Terminology" numbered="true" toc="default"> <name>Terminology</name>
	<t>This document uses the following terminology.</t>
	
	<t>High Performance Wide Area Networks (HP-WANs): wide area networks
	designed specifically to meet the high-speed, low-latency, and
	high-capacity needs of research, academia, education, industrial and
	other data-intensive applications. The primary goal of an HP-WAN is to
	complete massive data transmission within a completion time, which
	imposes high-performance requirements such as effective high
	throughput, fairness among multiple services and high bandwidth
	utilization.</t>
	
	<t>This document also uses the following abbreviations:</t>
	   
	    <dl newline="false" spacing="normal" indent="15" pn="section-2-3">
		<dt>DC: </dt>
		<dd>Data Center</dd>	
	    <dt>DCI: </dt>
	    <dd>Data Center Interconnect</dd>
	    <dt>HPC: </dt>
	    <dd>High Performance Computing</dd>
	    <dt>WAN: </dt>
	    <dd>Wide Area Network</dd>
	    <dt>MAN: </dt>
	    <dd>Metropolitan Area Network</dd>
		<dt>PFC: </dt>
		<dd>Priority Flow Control</dd>	
	    <dt>ECN: </dt>
	    <dd>Explicit Congestion Notification</dd>
	    <dt>ECMP: </dt>
	    <dd>Equal-Cost Multipath</dd>
	    <dt>RTT: </dt>
	    <dd>Round-Trip Time</dd>
	    <dt>TCP: </dt>
	    <dd>Transmission Control Protocol</dd>
	    <dt>RDMA: </dt>
	    <dd>Remote Direct Memory Access</dd>
	    <dt>QUIC: </dt>
	    <dd>A UDP-based multiplexed and secure transport (the name is not an acronym)</dd>
		</dl>
    </section>   
   
    
   <section numbered="true" toc="default"><name>High-performance Goals for HP-WANs</name>
   
   <t>The services provided in HP-WANs mainly focus on timely transmission
   of massive data, while multiple services may co-exist over
   long-distance networks, as described below.</t>
  
   <ul spacing="normal">
   <li>Massive data transmission: bulk or high-volume data transfer, e.g.,
   a single flow may require 2 Gbps to 1 Tbps.</li>
   <li>Timely data transmission: a transfer has a completion time but no
   strict real-time requirement, e.g., from milliseconds to minutes.</li>
   <li>Predictable transmission: the transmission frequency is variable but
   predictable, e.g., a periodic or on-demand migration.</li>
   <li>Long-distance transmission over non-dedicated WANs between sites or
   DCs, e.g., over more than 100 km or 1000 km.</li>
   <li>Multiple services co-exist with concurrent flows.</li>
   <li>Minimized cost.</li>
   <li>Data security and integrity.</li>
   </ul>
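   <t>The relationship between data volume and completion time can be
   made concrete with simple arithmetic. The following Python sketch is
   illustrative only: the function name and the example values (1 TB to
   be moved within 10 minutes) are assumptions chosen for illustration,
   not requirements of this document.</t>

   <sourcecode type="python"><![CDATA[
def required_goodput_gbps(volume_bytes, completion_time_s):
    """Average goodput in Gbit/s needed to meet a completion time."""
    if completion_time_s <= 0:
        raise ValueError("completion_time_s must be positive")
    return volume_bytes * 8 / completion_time_s / 1e9

# Hypothetical example: 1 TB (10**12 bytes) within 10 minutes.
rate = required_goodput_gbps(1e12, 600)
print(f"required goodput: {rate:.2f} Gbit/s")
]]></sourcecode>

   <t>Under these assumed values, the flow has to sustain more than
   13 Gbit/s of goodput on average for the whole transfer.</t>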
   
	<t>From the application perspective, an HP-WAN flow requires effective
	high-throughput data transmission to meet its completion time. It is
	also crucial to maximize bandwidth utilization while ensuring fairness
	among multiple services. The resulting high-performance requirements
	for HP-WANs are described below.</t>
	
   <ul spacing="normal">
   
   <li>Effective high throughput: HP-WANs require high throughput for
   high-volume data transmission within a completion time over WANs.
   Throughput is affected by performance indicators such as bandwidth,
   packet loss ratio and latency; for example, throughput decreases as
   packet loss and RTT increase. HP-WANs need to achieve ultra-high
   goodput, ultra-low packet loss ratio, low latency and resilience to
   ensure effective high-throughput transmission.</li>
   
   <li>Fairness among multiple services: HP-WANs require fairness when
   multiple services co-exist with concurrent flows. Different types of
   services should obtain a reasonable share of network resources
   during resource allocation and management, so that each can meet its
   quality of service (QoS) requirements while resources are allocated
   fairly.</li>
 
   <li>Ultra-high bandwidth utilization: HP-WANs require high bandwidth
   utilization of the network. Available network capacity needs to be
   used efficiently to maximize data transfer rates and minimize latency,
   which keeps the cost of HP-WANs low. A bandwidth utilization rate
   exceeding 90% is required to ensure that network resources are fully
   utilized.</li>
   </ul>
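   <t>The dependence of throughput on packet loss and RTT noted above can
   be illustrated with a well-known simplified model for Reno-style
   loss-based congestion control, in which the achievable rate is
   bounded by (MSS / RTT) * C / sqrt(p). The Python sketch below is not
   part of any HP-WAN mechanism; the function name, the constant
   C = sqrt(3/2) and the example values are assumptions used only for
   illustration.</t>

   <sourcecode type="python"><![CDATA[
import math

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate):
    """Approximate upper bound for Reno-style loss-based control.

    Simplified model: rate <= (MSS / RTT) * C / sqrt(p), with
    C = sqrt(3/2) assumed for periodic loss.
    """
    c = math.sqrt(3.0 / 2.0)
    return (mss_bytes * 8 / rtt_s) * c / math.sqrt(loss_rate)

# Hypothetical values: 1460-byte MSS, loss ratio of 1e-6.
for rtt_ms in (1, 10, 100):
    bps = mathis_throughput_bps(1460, rtt_ms / 1000, 1e-6)
    print(f"RTT {rtt_ms:>3} ms: {bps / 1e9:.3f} Gbit/s")
]]></sourcecode>

   <t>In this model, a tenfold increase in RTT reduces the achievable
   throughput tenfold, and a fourfold increase in loss ratio halves it,
   which is why long-distance paths need an ultra-low packet loss
   ratio to sustain high throughput.</t>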
	
	</section>

   
   <section numbered="true" toc="default"> <name>Problem Statements</name>
   
   <t>Providing effective high-performance transmission is challenging in
   HP-WAN scenarios with massive concurrent services, long-distance delays
   and packet loss. Long-distance networks have more uncertainties, such as
   long Round-Trip Time (RTT), routing changes, network congestion, packet
   loss and link quality fluctuations, all of which may reduce throughput.
   The services are massive and concurrent, with multiple types and
   different traffic models. For example, elephant flows with short
   intervals, high speed and large data scale may occupy a large amount of
   network resources, which leads to unfairness among flows, low network
   utilization and poor cost-effectiveness.</t>
 
   <t>Existing network technologies have various problems and cannot
   meet these performance requirements. The following subsections outline
   the problems for HP-WANs.</t>
   
   <section  numbered="true" toc="default"> <name>Long-distance Delay and Slow Feedback</name>
   
   <t>Several classes of congestion control algorithms are deployed,
   including loss-based algorithms (e.g., Reno and CUBIC, which depend on
   packet loss as the congestion notification) and congestion-based
   algorithms (e.g., BBR, which depends on measurement of congestion).
   Long-distance transmission delays and large RTTs delay the feedback of
   network state, so the sender cannot adjust its transmission rate in a
   timely manner. This makes it challenging for congestion control in WANs
   to limit the total amount of data entering the network and keep traffic
   at an acceptable level. Feedback should be independent of the
   transmission distance and as timely as possible.</t>
   
   <t>For example, Explicit Congestion Notification (ECN)
   <xref target="RFC3168" pageno="false" format="default"/> can be used with
   Reno and CUBIC to provide end-to-end congestion notification at the IP
   and transport layers. When congestion occurs, the network signals it by
   ECN marking or by dropping packets, and the receiver passes this
   information back to the sender in transport-layer acknowledgements so
   that the sender can adjust its transmission rate. Over long distances,
   the notification is delayed and the feedback is slow, which results in
   untimely adjustment and buffer overflow and degrades network
   performance. Especially for incast congestion, where multiple sources
   send to the same destination, the network needs to send fast feedback
   based on the offered load.</t>
   
   <t>BBR actively measures the bottleneck bandwidth (BtlBw) and the
   round-trip propagation time (RTprop). Based on its model, it calculates
   the bandwidth-delay product (BDP) and then adjusts the transmission
   rate to maximize throughput and minimize latency. However, BBR relies on
   real-time measurement of these parameters, which may vary greatly and
   are fed back slowly in long-distance networks, and this affects the
   control precision of BBR.</t>
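   <t>The BDP is the product of BtlBw and RTprop. The following Python
   sketch shows the calculation; the function name and the example path
   (100 Gbit/s bottleneck, 50 ms round-trip propagation time) are
   hypothetical values chosen for illustration.</t>

   <sourcecode type="python"><![CDATA[
def bdp_bytes(btlbw_bps, rtprop_s):
    """Bandwidth-delay product in bytes: BDP = BtlBw * RTprop."""
    return btlbw_bps * rtprop_s / 8

# Hypothetical long-distance path: 100 Gbit/s, 50 ms RTprop.
bdp = bdp_bytes(100e9, 0.050)
print(f"BDP = {bdp / 1e6:.0f} MB")
]]></sourcecode>

   <t>On such a path, roughly 625 MB has to be in flight to fill the pipe,
   so even a small relative error in the measured BtlBw or RTprop
   corresponds to many megabytes of excess queue or unused capacity.</t>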
   
   <t>Moreover, other congestion control algorithms, such as Data Center
   Quantized Congestion Notification (DCQCN) and High Precision Congestion
   Control (HPCC++), cannot tolerate the slow feedback loop over WANs.</t>
   
   </section>
   
   <section  numbered="true" toc="default"> <name>Coarse-grained Exploitation of Network Capacities</name>
   
    <t>Existing congestion control mechanisms focus on rate adjustment: they
    control the sending rate of data flows at the source in order to avoid
    or reduce network congestion. It is challenging for the host to adjust
    its sending rate efficiently without awareness of network capacity. For
    example, CUBIC <xref target="RFC9438" pageno="false" format="default"/> reduces its congestion
    window by its multiplicative decrease factor when congestion is
    detected through packet loss or classic ECN marking, which gives the
    sending rate a sawtooth pattern. L4S <xref target="RFC9330" pageno="false" format="default"/> uses more
    frequent ECN marking to provide low latency and scalable throughput,
    reduce the convergence time and eliminate the sawtooth effect. However,
    ECN-based congestion feedback and frequent rate adjustment still cause
    significant changes in throughput, which affects bandwidth utilization
    and transmission efficiency. The host still lacks accurate network
    information, which is critical for closing the gap between its sending
    rate and the available network capacity, especially when transmitting
    high-volume data over WANs.</t>
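    <t>The cost of the CUBIC sawtooth on a large-BDP path can be estimated
    from the window growth function in RFC 9438,
    W_cubic(t) = C * (t - K)^3 + W_max, with C = 0.4 and a multiplicative
    decrease factor of 0.7. The Python sketch below is a simplified
    illustration: it ignores the Reno-friendly region, slow start and
    fast convergence, and the W_max value (a hypothetical 100 Gbit/s,
    50 ms path with 1500-byte segments) is an assumption.</t>

    <sourcecode type="python"><![CDATA[
C_CUBIC = 0.4     # constant C from RFC 9438
BETA_CUBIC = 0.7  # multiplicative decrease factor from RFC 9438

def cubic_recovery_time_s(w_max_segments):
    """K: seconds for the window to grow back to W_max after one
    congestion event, starting from BETA_CUBIC * W_max."""
    cwnd_epoch = BETA_CUBIC * w_max_segments
    return ((w_max_segments - cwnd_epoch) / C_CUBIC) ** (1.0 / 3.0)

def w_cubic(t_s, w_max_segments):
    """Window (in segments) t_s seconds after the congestion event."""
    k = cubic_recovery_time_s(w_max_segments)
    return C_CUBIC * (t_s - k) ** 3 + w_max_segments

# Hypothetical W_max: 625 MB of BDP in 1500-byte segments.
w_max = 625e6 / 1500
print(f"K = {cubic_recovery_time_s(w_max):.1f} s")
]]></sourcecode>

    <t>Under these assumptions, a single congestion event leaves the flow
    below its previous window for roughly 68 seconds, which illustrates
    how coarse window reduction lowers bandwidth utilization on
    long-distance, high-capacity paths.</t>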
	   
   	<t>Moreover, the sending rate of the host is often inconsistent with
	the transmission capability of the network, which prevents accurate
	rate adjustment. For example, when determining the starting rate of a
	data transmission, slow start limits overall throughput, leaves
	bandwidth underutilized and fails to exploit the full network capacity.
	A fast start, however, cannot adapt to the buffer capacity of network
	devices, especially when multiple flows share the same link, and so it
	causes network congestion, packet loss and transmission delay. For
	HP-WANs, fine-grained network-aware negotiation of the sending rate
	needs to consider factors such as predictable network bandwidth,
	latency and packet loss rate, while balancing bandwidth utilization and
	congestion avoidance in WANs.</t>
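	<t>The start-up cost of slow start on a long-distance path can be
	approximated by counting how many round trips the window needs to
	reach the BDP. The Python sketch below assumes an idealized slow
	start that doubles the window every RTT without loss and an initial
	window of 10 segments; the function name and the example target
	window are hypothetical.</t>

	<sourcecode type="python"><![CDATA[
import math

def slow_start_rtts(target_segments, initial_window=10):
    """Round trips needed for an idealized slow start (window
    doubling every RTT, no loss) to reach target_segments."""
    if target_segments <= initial_window:
        return 0
    ratio = target_segments / initial_window
    return math.ceil(math.log2(ratio))

# Hypothetical target: 625 MB of BDP in 1500-byte segments.
target = 416667
rtts = slow_start_rtts(target)
print(f"{rtts} RTTs, {rtts * 0.050:.1f} s at 50 ms RTT")
]]></sourcecode>

	<t>With these assumed values the window needs 16 round trips, or
	about 0.8 seconds at a 50 ms RTT, before the flow can use the full
	path capacity, whereas starting at the full rate without network
	awareness risks overflowing device buffers.</t>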

   </section>
   
   <section  numbered="true" toc="default"> <name>Instantaneous Traffic</name>
	 
	<t>From the network perspective, the network can only forward
	high-volume data reactively; it does not schedule predictable traffic
	and network resources to anticipate congestion preemptively. Without
	awareness of instantaneous traffic, which occupies a large amount of
	network resources, the network allocates resources unevenly and
	bandwidth utilization is low.</t>

	<t>For example, HP-WAN applications transmit large amounts of data;
	the data volume of a single flow may range from 10 GB to 1 TB. Such
	massive data transfers with large bursts may cause instantaneous
	congestion, packet loss and queuing delay within network devices in
	WANs. Traffic aggregates further at the edge of WANs, and bursts may
	accumulate as flows traverse, join and separate across hops.
	Controlling congestion for such bursty traffic is challenging.</t>
   
    <t>Moreover, the goodput bottleneck, together with constraints on
    transmission completion time and duration, makes traffic scheduling
    challenging. Applications may run multiple concurrent services that
    co-exist with existing dynamic flows. Because these services have
    various types and different traffic requirements, the traffic needs to
    be scheduled onto multiple paths and fine-grained network resources to
    achieve high utilization and guarantee QoS.</t>
   
   </section> 
   
   
   <section  numbered="true" toc="default"> <name>Incast Congestion upon Bottleneck Links</name>
   
   <t>Incast congestion caused by the limited bandwidth of bottleneck links
   is challenging in long-distance, multi-hop networks. It is difficult to
   control the resulting packet loss, queuing latency and jitter, which
   reduce throughput. Incast traffic is a major cause of congestion
   because senders transmit greedily. The network may regulate such traffic
   to avoid congestion preemptively. It may proactively avoid path-level
   congestion by reserving and allocating network bandwidth through a
   scheduler to match the bottleneck link bandwidth as closely as
   possible, thus fully utilizing bandwidth and preventing packet loss.</t>
   
   <t>Moreover, effective flow control can reduce congestion in the network
   and thereby the packet loss caused by buffer overflow. Flow control
   regulates the rate of data transmission so that data is transmitted
   efficiently and reliably, a fast sender does not overwhelm a slow
   receiver, and packet loss is prevented under congestion. However, it is
   challenging to ensure fairness among multiple services over different
   distances, because network resources are allocated unequally among flows
   with different RTTs. For example, some flows may occupy more bandwidth
   because they use larger window sizes, have smaller RTTs, or send larger
   packets.</t>
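   <t>The RTT unfairness described above follows from the fact that a
   window-limited flow sends about one window per RTT. The Python sketch
   below assumes that all flows hold the same window and share one
   bottleneck, so each flow's share is proportional to 1 / RTT; the
   function name and RTT values are hypothetical, and real algorithms
   show different degrees of RTT bias.</t>

   <sourcecode type="python"><![CDATA[
def bandwidth_shares(rtts_s):
    """Bottleneck shares of flows with equal windows: each flow's
    rate is window / RTT, so its share is proportional to 1 / RTT."""
    weights = [1.0 / rtt for rtt in rtts_s]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical flows: a 10 ms metro flow and a 100 ms WAN flow.
for rtt, share in zip((0.010, 0.100),
                      bandwidth_shares([0.010, 0.100])):
    print(f"RTT {rtt * 1000:.0f} ms: {share:.1%} of bottleneck")
]]></sourcecode>

   <t>Under this simplified assumption the short-RTT flow obtains about
   ten times the bandwidth of the long-RTT flow, which is the kind of
   imbalance that network-assisted allocation would need to correct.</t>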
   
   </section>
   </section>
   

   <section  numbered="true" toc="default"> <name>Security Considerations</name>
   <t>This document covers several representative applications and
   network scenarios that are expected to make use of HP-WAN
   technologies. None of the potential use cases raises security
   concerns or issues by itself, but each may have security
   considerations from both the use-specific perspective and
   the technology-specific perspective.</t>
   </section>
   <section numbered="true" toc="default"> <name>IANA Considerations</name>
   <t>This document makes no requests for IANA action.</t>
   </section>
	
   <section numbered="true" toc="default"> <name>Acknowledgements</name>
   <t>The authors would like to acknowledge Guangping Huang, Yao Liu and 
    Zheng Zhang for their thorough review and very helpful comments.</t>
   </section> 
   
  </middle>
  
  <!--  *****BACK MATTER ***** -->

 <back>
 
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8664.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9232.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7424.xml"/>	
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3168.xml"/>
		<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9438.xml"/>
		<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9330.xml"/>		
      </references>
    </references>
 
 </back>
</rfc>
