<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE rfc SYSTEM "rfc2629-xhtml.ent">
<?xml-stylesheet type='text/xsl' href='rfc2629.xslt' ?>
<rfc
      xmlns:xi="http://www.w3.org/2001/XInclude"
      category="info"
      docName="draft-xiong-hpwan-problem-statement-02"
      ipr="trust200902"
      obsoletes=""
      updates=""
      submissionType="IETF"
      xml:lang="en"
      tocInclude="true"
      tocDepth="4"
      symRefs="true"
      sortRefs="true"
      version="3">

 <!-- ***** FRONT MATTER ***** -->

 <front>

   <title abbrev="HP-WAN Problem Statement">Problem Statement for High Performance Wide Area Networks</title>
    <seriesInfo name="Internet-Draft" value="draft-xiong-hpwan-problem-statement-02"/>
   
   <author fullname="Quan Xiong" initials="Q" surname="Xiong">
      <organization>ZTE Corporation</organization>
      <address>
        <postal>
          <street/>
         <city></city>
          <region/>
          <code/>
          <country>China</country>
        </postal>
        <phone></phone>
        <email>xiong.quan@zte.com.cn</email>
     </address>
    </author>

	<author fullname="Kehan Yao" initials="K" surname="Yao">
      <organization>China Mobile</organization>
      <address>
        <postal>
          <street/>
         <city></city>
          <region/>
          <code/>
          <country>China</country>
        </postal>
        <phone></phone>
        <email>yaokehan@chinamobile.com</email>
     </address>
    </author>
	
    <author fullname="Cancan Huang" initials="C" surname="Huang">
      <organization>China Telecom</organization>

      <address>
        <postal>
          <street></street>
          
          <city></city>
          
          <region></region>
  
          <code></code>

          <country>China</country>
        </postal>

        <phone></phone>

        <email>huangcanc@chinatelecom.cn</email>
      </address>
    </author>
	
    <author fullname="Zhengxin Han" initials="Z" surname="Han">
      <organization>China Unicom</organization>

      <address>
        <postal>
          <street></street>
          
          <city></city>
          
          <region></region>
  
          <code></code>

          <country>China</country>
        </postal>

        <phone></phone>

        <email>hanzx21@chinaunicom.cn</email>
      </address>
    </author>
	
	<author fullname="Junfeng Zhao" initials="J" surname="Zhao">
      <organization>CAICT</organization>

      <address>
        <postal>
          <street></street>
          
          <city>Beijing</city>
          
          <region></region>
  
          <code></code>

          <country>China</country>
        </postal>

        <phone></phone>

        <email>zhaojunfeng@caict.ac.cn</email>
      </address>
    </author>		

   <area>WIT</area>
    <workgroup></workgroup>
   <keyword></keyword>
   
   <abstract>
	
	<t>A High Performance Wide Area Network (HP-WAN) serves data-intensive
	applications, such as those in scientific research, academia, and
	education, that demand high-speed data transmission over WANs. It
	needs to provide efficient transmission services that finish within
	a requested completion time. This document outlines the problems
	that HP-WANs need to address.</t>
	  
    </abstract>
  </front>
  <middle>
  
   <section numbered="true" toc="default"> <name>Introduction</name>
	
   <t>As described in <xref target="I-D.kcrh-hpwan-state-of-art" pageno="false" format="default"/>, data is fundamental
   to research, academic, educational, industrial, and other data-intensive
   applications, such as High Performance Computing (HPC) for scientific
   research, cloud storage and backup of industrial internet data, and
   distributed training of Artificial Intelligence (AI) models. Use cases in
   non-dedicated networks run by public operators, such as large file transfer,
   traffic across data centers, and traffic sharing between dedicated and
   non-dedicated networks, are described in
   <xref target="I-D.yx-hpwan-uc-requirements-public-operator" pageno="false" format="default"/>.</t>
   
   <t>These applications may generate huge volumes of data by using
   advanced instruments and high-end computing devices. Research
   institutions, universities, and data centers need to be connected
   across large geographical areas over long-distance links. For example,
   data shared between research institutes may have to travel hundreds or
   thousands of kilometers. The network needs to support large-scale data
   transfer and provide stable and efficient transmission services over
   Wide Area Networks (WANs). These applications may require periodic or
   on-demand high-speed transfers with variable start times, data volumes,
   and transmission patterns, each of which has to finish within a
   completion time.</t>
   
   <t>More recently, massive data transmission and long-distance connections
   over WANs have become key factors affecting the performance of existing
   transport protocols such as the Transmission Control Protocol (TCP),
   QUIC, and Remote Direct Memory Access (RDMA). Different transport
   protocols carrying massive data transfer requests will co-exist in the
   same network, and optimizing each of them separately may incur
   considerable overhead, including congestion control algorithm redesign,
   parameter tuning, hardware adaptation, and QoS policy changes. A
   transport protocol proxy may be deployed to adapt the functionality for
   different transport protocols.</t>

   <t>Moreover, traditional congestion control algorithms are typically
   implemented at the hosts (sender and receiver). They transmit blindly,
   controlling the size of the congestion window and adjusting the rate
   only after overloaded links are detected. Performance is therefore
   difficult to predict given the unpredictable behaviour of WANs. For
   example, a host that is unaware of the network capability converges
   slowly because of slow start and passive rate adjustment, which
   lengthens the completion time. The long feedback loop also leads to
   RTT fluctuation, because queues grow long in large buffers before the
   sender reacts. The network, for its part, carries unscheduled traffic
   with low bandwidth utilization because of bottleneck links and
   instantaneous congestion. All of the above degrade performance and
   result in the untimely transmission of high-volume data.</t>
   
   <t>A High Performance Wide Area Network (HP-WAN) is intended for the
   data-intensive applications described above, which demand high-speed
   data transmission over WANs, and it needs to provide efficient
   transmission services within a completion time. This document outlines
   the specific problems that stand in the way of meeting HP-WAN
   requirements.</t>
	
    
      <section numbered="true" toc="default"><name>Requirements Language</name>
	  
	 <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
       "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
       "OPTIONAL" in this document are to be interpreted as described in BCP
       14 <xref target="RFC2119" pageno="false" format="default"/> 
	   <xref target="RFC8174" pageno="false" format="default"/> when, and only when, 
	   they appear in all capitals, as shown here.</t>
	   
      </section>
    </section>
	
    <section anchor="Terminology" numbered="true" toc="default"> <name>Terminology</name>
	<t>This document adopts the terminology defined in <xref target="I-D.kcrh-hpwan-state-of-art" pageno="false" format="default"/>. </t>
	
	<t>This document also uses the following abbreviations:</t>
	   
	    <dl newline="false" spacing="normal" indent="15" pn="section-2-3">
		<dt>BDP: </dt>
		<dd>Bandwidth Delay Product</dd>			
		<dt>DC: </dt>
		<dd>Data Center</dd>	
	    <dt>DCI: </dt>
	    <dd>Data Center Interconnection</dd>
	    <dt>HPC: </dt>
	    <dd>High Performance Computing</dd>
	    <dt>WAN: </dt>
	    <dd>Wide Area Network</dd>
		<dt>PFC: </dt>
		<dd>Priority Flow Control</dd>	
	    <dt>ECN: </dt>
	    <dd>Explicit Congestion Notification</dd>
	    <dt>ECMP: </dt>
	    <dd>Equal-Cost Multipath</dd>
	    <dt>RTT: </dt>
	    <dd>Round-Trip Time</dd>
	    <dt>TCP: </dt>
	    <dd>Transmission Control Protocol</dd>
	    <dt>RDMA: </dt>
	    <dd>Remote Direct Memory Access</dd>
	    <dt>QUIC: </dt>
	    <dd>A UDP-based multiplexed and secure transport (QUIC is a name, not an acronym)</dd>
		</dl>
    </section>   
   
    
   <section numbered="true" toc="default"><name>Technical Goals for HP-WANs</name>
   
   <t>The services provided by HP-WANs mainly focus on the timely
   transmission of massive data, while multiple services may co-exist over
   long-distance WANs. Their characteristics are described below.</t>
  
   <ul spacing="normal">
   <li>Massive data transmission: high-volume data is transferred at high
   speed, e.g., the data rate of a flow could range from 2 Gbps to 1 Tbps.</li>
   <li>Requested completion time: the data transmission should be completed
   within a requested completion time, e.g., the completion time could
   range from milliseconds to minutes.</li>
   <li>Scheduled transmission: traffic patterns could be scheduled by the
   sender, e.g., data volume, start time, finish time, and service type.</li>
   <li>Long-distance transmission over non-dedicated WANs: paths have multiple
   hops and domains, long RTTs, routing changes, network congestion,
   packet loss, and link quality fluctuations, e.g., the distance between
   two sites or DCs could exceed 100 km or even 1000 km.</li>
   <li>Coexistence of multiple services: concurrent flows use different
   transport protocols for data transmission, such as QUIC, TCP, and RDMA.</li>

   </ul>
   
	<t>HP-WANs are required to achieve high-speed data transmission within
	a completion time. It is also crucial to maximize bandwidth
	utilization while ensuring fairness among multiple services. The
	resulting technical goals for HP-WANs are described below.</t>
	
   <ul spacing="normal">

   <li>High throughput: ensuring high-speed data transmission within
   a requested completion time for a flow, which could be impacted by
   the bandwidth, convergence speed, start time, and RTT.</li>
   
   <li>Efficient use of capacity: efficiently using available network 
   capacity with fairness to maximize data transfer rates and minimize 
   the completion time for multiple flows.</li>   
   
   </ul>
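   <t>As an illustrative sketch (not part of any HP-WAN specification), the
   relation between data volume, requested completion time, and the
   throughput goal above can be expressed as a simple calculation. The
   function name and the example figures are assumptions chosen for
   illustration.</t>
<sourcecode type="python"><![CDATA[
def required_rate_gbps(volume_bytes, completion_time_s):
    """Average rate (Gbps) needed to move volume_bytes in completion_time_s."""
    if completion_time_s <= 0:
        raise ValueError("completion time must be positive")
    return volume_bytes * 8 / completion_time_s / 1e9


# Hypothetical request: 1 TB (10**12 bytes) to be delivered within 10 minutes.
print(f"{required_rate_gbps(1e12, 600):.2f} Gbps")  # prints "13.33 Gbps"
]]></sourcecode>
   <t>Any time spent converging to this rate, or waiting to start, has to
   be made up with a higher rate later in the transfer.</t>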
	
	</section>

   
   <section numbered="true" toc="default"> <name>Problem Statement</name>
   
   <t>The specific requirements of HP-WANs may encompass a wide range of
   aspects. These include transport-related technologies such as proxies,
   flow control, QoS negotiation, congestion control, admission control,
   and traffic scheduling. They also involve routing-related
   technologies such as traffic engineering, resource scheduling, and load
   balancing.</t>
   
   <t>Existing network technologies face numerous challenges and fall short
   of meeting performance requirements. This document highlights the key
   issues associated with HP-WANs in the following sub-sections.</t>
   
   
    <section  numbered="true" toc="default"> <name>Poor Convergence Speed</name>
	
	<t>Traditional congestion control mechanisms transmit blindly: they
	control the size of the congestion window and adjust the rate only
	after overloaded links are detected. To the host, the WAN is a black
	box whose behavior for high-speed transmission is unpredictable because
	of multiple hops and domains, long Round-Trip Time (RTT), routing
	changes, network congestion, packet loss, and link quality fluctuations.
	The Bandwidth Delay Product (BDP), which represents the maximum amount
	of data that can be in transit on the network at any given time, varies
	over WANs, so the amount of inflight data is difficult to predict for
	host-based congestion control algorithms. The result is poor convergence
	speed: the host takes a long time, relative to the requested completion
	time, to identify the optimal sending rate.</t>
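	<t>To illustrate how strongly the BDP depends on the path, the following
	minimal sketch computes it for a fixed bottleneck bandwidth and several
	RTT values. The 10 Gbps bandwidth and the RTT values are assumptions
	chosen for illustration only.</t>
<sourcecode type="python"><![CDATA[
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth Delay Product in bytes: bandwidth (bit/s) times RTT (s)."""
    return bandwidth_bps * rtt_s / 8


# Hypothetical 10 Gbps bottleneck observed over paths with different RTTs.
for rtt_ms in (1, 10, 50, 100):
    bdp = bdp_bytes(10e9, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>3} ms -> BDP {bdp / 1e6:.2f} MB")
]]></sourcecode>
	<t>A routing change that moves a flow from a 10 ms path to a 100 ms path
	multiplies the amount of data that has to be in flight by ten, which a
	host cannot know in advance.</t>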
	
	<t>For example, slow start and blind probing without awareness of the
   network capability lead to long convergence times, e.g., over 50 s for
   CUBIC, over 30 s for BBR, and 30 to 50 s for BBRv2. BBR divides the
   entire process into four states: Startup, Drain, ProbeBW, and ProbeRTT.
   The probe cycle of the ProbeRTT state is long, e.g., 10 s. Convergence
   may take multiple probe cycles, which adds seconds to the completion
   time. During this period there is a significant gap between the host's
   sending rate and the available network capacity. The transport protocols
   should signal and collaborate with the network to negotiate the rate at
   which the host sends traffic.</t>
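	<t>The effect of RTT on ramp-up time can be sketched with an idealized
   slow-start model in which the congestion window doubles every RTT until
   it reaches the BDP. This is a simplification under stated assumptions
   (no loss, a 1460-byte segment size, an initial window of 10 segments);
   it does not reproduce the measured convergence times quoted above and
   only shows how the lower bound scales with RTT.</t>
<sourcecode type="python"><![CDATA[
import math


def slow_start_time_s(bandwidth_bps, rtt_s, mss_bytes=1460, init_segments=10):
    """Idealized time for a doubling window to grow from init_segments to
    the BDP. mss_bytes and init_segments are illustrative assumptions."""
    bdp_segments = bandwidth_bps * rtt_s / 8 / mss_bytes
    if bdp_segments <= init_segments:
        return 0.0
    # One doubling per round trip.
    rounds = math.ceil(math.log2(bdp_segments / init_segments))
    return rounds * rtt_s


# Hypothetical 10 Gbps path with a 100 ms RTT: 14 round trips.
print(f"{slow_start_time_s(10e9, 0.100):.1f} s")  # prints "1.4 s"
]]></sourcecode>
	<t>Even in this loss-free model the ramp-up alone takes more than a
   second, which already exceeds a completion time requested in
   milliseconds.</t>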

    </section>	
	
	<section  numbered="true" toc="default"> <name>Unscheduled Traffic</name>
	
	<t>A host that sends large volumes of unscheduled traffic without
	collaborating with the network causes instantaneous congestion in WANs.
	With multiple high-speed flows, the random arrival and departure of
	unscheduled cross-traffic makes the available capacity in WANs fluctuate
	significantly. The network infrastructure may struggle to handle
	high-volume data transfers efficiently if applications do not
	proactively schedule their traffic. Without awareness of the traffic
	patterns, the network cannot plan resource allocation, which leads to
	low bottleneck bandwidth utilization, reduced overall throughput,
	and uncontrolled completion time.</t>

	<t>For example, HPC applications transmit large amounts of data, e.g.,
	the data volume of a single flow may range from 10 GB to 1 TB. When the
	host sends such large traffic unscheduled, it causes instantaneous
	congestion, packet loss, and queuing delay within network devices in
	WANs, resulting in low throughput. When multiple services with various
	types of flows are considered, the optimal bandwidth and transmission
	time may differ per flow, and flows join and leave at random without
	being scheduled onto multiple paths and fine-grained network resources,
	so timely transmission cannot be achieved. The resources of WANs should
	be scheduled at the elements along the path to provide predictable
	capability for high-speed transmission.</t>
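	<t>One possible way to picture scheduled traffic is a deadline-aware
	admission check on a single bottleneck link. The sketch below is a
	hypothetical illustration, not a mechanism defined by this document:
	the Request type, the constant-rate reservation model, and the
	earliest-deadline ordering are all assumptions.</t>
<sourcecode type="python"><![CDATA[
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical transfer request announced by a sender."""
    name: str
    volume_bytes: float
    start_s: float
    deadline_s: float  # assumed to be later than start_s

    @property
    def min_rate_bps(self):
        # Constant rate that finishes exactly at the deadline.
        return self.volume_bytes * 8 / (self.deadline_s - self.start_s)


def peak_load_bps(reqs):
    """Peak of the summed reservations; load only rises at start times."""
    peak = 0.0
    for t in {r.start_s for r in reqs}:
        load = sum(r.min_rate_bps for r in reqs
                   if r.start_s <= t < r.deadline_s)
        peak = max(peak, load)
    return peak


def admit(requests, capacity_bps):
    """Admit requests in deadline order while the link is not overbooked."""
    admitted, rejected = [], []
    for req in sorted(requests, key=lambda r: r.deadline_s):
        if peak_load_bps(admitted + [req]) <= capacity_bps:
            admitted.append(req)
        else:
            rejected.append(req)
    return admitted, rejected


# Hypothetical 100 Gbps bottleneck and three announced transfers.
reqs = [Request("A", 1e12, 0, 100),   # needs 80 Gbps
        Request("B", 1e11, 0, 50),    # needs 16 Gbps
        Request("C", 1e11, 20, 60)]   # needs 20 Gbps
ok, refused = admit(reqs, 100e9)
print([r.name for r in ok], [r.name for r in refused])
]]></sourcecode>
	<t>With announced volumes and deadlines, the overload is detected before
	any packet is sent, and a refused request can be renegotiated (for
	example, with a later start time) instead of causing instantaneous
	congestion.</t>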
   
   </section> 
   
   <section  numbered="true" toc="default"> <name>Long Feedback Loop</name>
   
   <t>Congestion control algorithms control the size of the congestion
   window and adjust the sending rate based on feedback about the network
   status. Over WANs, this feedback is delayed by long-distance
   transmission and large RTTs, so the sender cannot adjust the
   transmission rate in a timely manner. It is therefore challenging for
   congestion control over WANs to limit the total amount of data entering
   the network and keep the traffic at an acceptable level. With high-speed
   transmission and a long network state feedback loop, long queues build
   up in the large buffers of network devices and cause RTT fluctuation.
   In particular, when multiple flows converge on an aggregation node, the
   peak queue occupancy can exceed the buffer capacity of the device.</t>
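   <t>A back-of-the-envelope sketch of this effect follows. It assumes,
   for illustration only, that senders cannot react sooner than one RTT
   after congestion starts and that flows send at constant rates in the
   meantime; the rates, capacity, and RTT are hypothetical.</t>
<sourcecode type="python"><![CDATA[
def queued_bytes_before_feedback(flow_rates_bps, capacity_bps, rtt_s):
    """Bytes that pile up at an aggregation node during one feedback delay."""
    excess_bps = max(0.0, sum(flow_rates_bps) - capacity_bps)
    return excess_bps * rtt_s / 8


# Hypothetical case: four 40 Gbps flows converge on a 100 Gbps link.
for rtt_ms in (1, 50):
    q = queued_bytes_before_feedback([40e9] * 4, 100e9, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>2} ms -> {q / 1e6:.1f} MB queued")
]]></sourcecode>
   <t>The same overload that a short-RTT path absorbs in a few megabytes of
   buffer needs hundreds of megabytes at a 50 ms RTT, so the excess is
   either dropped or adds queuing delay.</t>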
   
   <t>For example, loss-based congestion control algorithms, such as
   Reno and CUBIC <xref target="RFC9438" pageno="false" format="default"/>, depend on packet loss as the congestion notification.
   Explicit Congestion Notification (ECN) <xref target="RFC3168" pageno="false" format="default"/> can be used to achieve
   end-to-end congestion notification based on the IP and transport layers.
   When congestion occurs, the network may signal it by ECN markings or by
   dropping packets, and the receiver passes this information back to the
   sender in transport-layer acknowledgements, notifying the source to
   adjust the transmission rate. These algorithms use slow start and
   require large buffers, and both are affected by the multiple hops
   and long RTTs of WANs.</t>
   
   <t>Congestion-based congestion control algorithms such as BBR
   depend on measurements of congestion. BBR actively measures the
   bottleneck bandwidth (BtlBw) and the round-trip propagation time
   (RTprop), uses them in its model to calculate the BDP, and then adjusts
   the transmission rate to maximize throughput and minimize latency.
   Because BBR relies on real-time measurement of these parameters, it can
   reduce buffer overflow, but the benefit is not significant under large
   RTTs. For example, retransmissions increase when the buffer size is
   less than two BDPs, which affects the control precision of BBR
   in long-distance networks.</t>
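   <t>The measurement step described above can be sketched as two filters:
   a maximum over recent delivery-rate samples for BtlBw and a minimum over
   recent RTT samples for RTprop. The class below is a simplified
   illustration, not the BBR algorithm itself: it uses sample-count windows
   where BBR uses time-based windows, and the class name, window sizes, and
   sample values are assumptions.</t>
<sourcecode type="python"><![CDATA[
from collections import deque


class BdpEstimator:
    """Simplified BBR-style estimator (illustrative only)."""

    def __init__(self, bw_window=10, rtt_window=100):
        self.bw_samples = deque(maxlen=bw_window)    # bit/s
        self.rtt_samples = deque(maxlen=rtt_window)  # seconds

    def on_ack(self, delivered_bytes, interval_s, rtt_s):
        """Record one delivery-rate sample and one RTT sample."""
        self.bw_samples.append(delivered_bytes * 8 / interval_s)
        self.rtt_samples.append(rtt_s)

    @property
    def btl_bw_bps(self):
        return max(self.bw_samples)   # max filter

    @property
    def rt_prop_s(self):
        return min(self.rtt_samples)  # min filter

    @property
    def bdp_bytes(self):
        return self.btl_bw_bps * self.rt_prop_s / 8


# Hypothetical samples: (bytes delivered, interval in s, measured RTT in s).
est = BdpEstimator()
for sample in [(1.25e6, 0.001, 0.052),
               (1.00e6, 0.001, 0.050),
               (1.20e6, 0.001, 0.061)]:
    est.on_ack(*sample)
print(f"BDP estimate: {est.bdp_bytes / 1e6:.1f} MB")
]]></sourcecode>
   <t>Each sample reaches the sender only after a round trip, so on a
   long-distance path the estimate lags behind changes in the available
   capacity.</t>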
   </section>
   
   <section  numbered="true" toc="default"> <name>Adaptation of Multiple Transport Protocols</name>
   
   <t>Multiple services coexist for massive data transmission over
   WANs and use different transport protocols, such as QUIC, TCP, and RDMA.
   These protocols, each handling substantial data transfer
   requests, will coexist within the same network. Optimizing each of them
   can entail significant overhead, including redesigning congestion
   control algorithms, mapping parameters, adapting hardware components,
   and formulating QoS policies. To reduce this overhead, a more flexible
   deployment strategy, such as a transport protocol proxy, can be used to
   adapt the functionality to the requirements of different
   transport protocols. The proxy should support the functions needed for
   high-speed transmission, such as traffic classification, packet
   processing, and buffering, and should implement the collaboration and
   interaction between the proxy and the hosts.
   Seamless communication between hosts and network infrastructure requires
   adaptive coordination across heterogeneous transport protocols
   (e.g., TCP, UDP, QUIC, RDMA).</t>
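   <t>As a hedged illustration of the traffic classification function, a
   proxy could make a first-pass decision from the IP protocol number and
   the destination port. The sketch below assumes that RDMA traffic is
   carried as RoCEv2 over UDP port 4791 and treats UDP port 443 as a QUIC
   heuristic, since QUIC has no fixed port; the function name and the
   class labels are hypothetical.</t>
<sourcecode type="python"><![CDATA[
TCP, UDP = 6, 17      # IANA IP protocol numbers
ROCEV2_PORT = 4791    # IANA-assigned UDP port for RoCEv2
QUIC_PORTS = {443}    # assumption: heuristic only, QUIC has no fixed port


def classify(ip_proto, dst_port):
    """Return a coarse transport class for a packet header."""
    if ip_proto == TCP:
        return "tcp"
    if ip_proto == UDP:
        if dst_port == ROCEV2_PORT:
            return "rdma-rocev2"
        if dst_port in QUIC_PORTS:
            return "quic"
        return "udp"
    return "other"


for header in [(6, 5001), (17, 4791), (17, 443), (17, 53)]:
    print(header, "->", classify(*header))
]]></sourcecode>
   <t>A real proxy would need more than port numbers, for example
   inspection of the QUIC long header, but a coarse class is enough to
   select a per-protocol adaptation path.</t>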
   
   <t>Moreover, in some scenarios, it is difficult to provide both data
   encryption and high-speed transmission at the same time. Encryption
   algorithms (e.g., AES, RSA) require intensive CPU operations, which
   reduces the capacity available for data transmission. Edge computing
   nodes with limited CPU capabilities struggle to balance encryption and
   data processing. The proxy could perform optimizations (e.g., hardware
   acceleration, distributed encryption modules) to mitigate these
   bottlenecks.</t>
   
   </section>

   </section>
   

   <section  numbered="true" toc="default"> <name>Security Considerations</name>
   <t>This document covers several representative applications and
   network scenarios that are expected to make use of HP-WAN
   technologies. The potential use cases do not by themselves raise
   any security concerns or issues, but they may have security
   considerations from both the use-specific perspective and
   the technology-specific perspective.</t>
   </section>
   <section numbered="true" toc="default"> <name>IANA Considerations</name>
   <t>This document makes no requests for IANA action.</t>
   </section>
	
   <section numbered="true" toc="default"> <name>Acknowledgements</name>
   <t>The authors would like to acknowledge Guangping Huang, Yao Liu and 
    Zheng Zhang for their thorough review and very helpful comments.</t>
   </section> 
   
  </middle>
  
  <!--  *****BACK MATTER ***** -->

 <back>
 
    <references>
      <name>References</name>
      <references>
        <name>Normative References</name>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.2119.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8174.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.8664.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9232.xml"/>
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.7424.xml"/>	
        <xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.3168.xml"/>
		<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9438.xml"/>
		<xi:include href="https://xml2rfc.ietf.org/public/rfc/bibxml/reference.RFC.9331.xml"/>	
		<xi:include href="https://datatracker.ietf.org/doc/bibxml3/draft-yx-hpwan-uc-requirements-public-operator.xml"/>
        <xi:include href="https://datatracker.ietf.org/doc/bibxml3/draft-kcrh-hpwan-state-of-art.xml"/>
      </references>
    </references>
 
 </back>
</rfc>
