Multi-Layer Network Research: Optical Networks
In last month’s President’s Page, I discussed the merits of multi-layer network research and promised to expand on the subject in future columns. This month’s column is the first of four parts and focuses on optical networks. Next month we will discuss satellite networks, and the month after, integrated wireless and fiber networks together with computing and applications. The final installment will be on network security.
Fiber networks of the 1990s concentrated on Layer 1 physical transceivers and channels and on Layer 2 circuit switching (all in the digital domain). In fact, some early hardware was hardwired point-to-point! Until a few years ago, most circuits were quasi-static and connections hardly changed, usually staying in place for months if not years. Around 15 years ago MPLS (Multiprotocol Label Switching) allowed sub-wavelength-capacity digital circuits to be set up via Layer 2 switches; a single wavelength channel in a fiber can carry from 10 Gbps to, in recent months, 1.6 Tbps. The switching is all digital, and the optics serves merely as a pipe that could have been copper of the same rate with no noticeable difference in architecture. Since MPLS switching is done in the digital domain and the Layer 1 channels are totally static, new resources can be set up in as little as a few seconds (although this is rarely done because of business practices). MPLS is driven from Layer 3 and pays no attention to the states of Layer 1 and Layer 2. For switching the entire capacity of a wavelength, GMPLS (Generalized Multiprotocol Label Switching) was introduced later, but it was seldom used until the last few years because few customers required such a large capacity. New GMPLS connections that pass through inline Erbium-Doped Fiber Amplifiers (EDFAs) take tens of minutes to start up, because each new wavelength affects the quality of all existing traffic through nonlinear coupling in the mid-span amplifiers. Each new connection is turned on gently in many stages, and between stages the quality of all other connections is verified to satisfy QoS. A few years ago SDN (Software-Defined Networking), an advanced form of circuit provisioning, came along, but it is still a Layer 3 function with no detailed knowledge of the lower layers other than the topology and the size of the pipes. It brings no improvement in startup times because it lacks visibility into, and control of, the detailed physical properties of Layer 1.
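The staged turn-on procedure described above can be pictured with a minimal sketch. Everything named here is hypothetical and for illustration only: `set_power` stands in for whatever command sets the launch power of the new wavelength, `others_meet_qos` stands in for the quality check on all existing connections, and the equal steps in dB are an assumption, not a description of any real controller.

```python
def staged_turn_on(target_dbm, start_dbm, num_stages, set_power, others_meet_qos):
    """Raise a new wavelength's power in equal dB steps (illustrative only).

    After every step the existing connections are checked. If any of them
    no longer meets its QoS, the power is returned to the last level that
    passed the check and the turn-on is abandoned.
    Returns True only if the target power is reached with all checks passing.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    last_good = None
    for i in range(num_stages + 1):
        power = start_dbm + (target_dbm - start_dbm) * i / num_stages
        set_power(power)
        if not others_meet_qos():
            # Back off to the last level that passed, if there was one.
            # (A real system would presumably switch the channel off
            # entirely if even the first level fails.)
            if last_good is not None:
                set_power(last_good)
            return False
        last_good = power
    return True


# Toy usage: record the commanded power levels; the QoS check always passes.
history = []
reached = staged_turn_on(0.0, -20.0, 4, history.append, lambda: True)
print(reached, history)
```

The point of the sketch is the cost structure: every stage waits for a full quality verification of all other traffic, which is why the end-to-end startup is slow when each verification is slow.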
With the advent of data analytics, large amounts of data may be moved from point to point or among multiple points within a metro or long haul network. Many gigabytes of data can be moved from storage to cloud computing centers, High Performance Computing centers, and special-purpose accelerator computing services. Many of the applications have deadlines of tens of milliseconds to seconds. The burstiness of the offered traffic will increase by many orders of magnitude, and the statistical multiplexing that normally smooths traffic flows is often not present. To prevent excessive congestion delay, either the network must be overprovisioned or a dynamic network management controller must quickly allocate resources to meet peaky demands. The network management and control system will have to speed up manyfold (from minutes to ∼10 ms) to do this effectively. Thus, turning wavelengths on and off slowly (over tens of minutes) is not an option. There needs to be much better visibility into the states and the dynamic physical models of Layer 1, which is nonlinear and very complicated. This problem must therefore involve at least Layers 1, 2, and 3, if not every layer up to the Application Layer. Some believe machine learning and artificial intelligence can handle the complexity automatically. Unfortunately, ML/AI cannot handle the whole job. The cardinality of the detailed states of the network is exponentially large (we estimate the entropy change rate of a long haul network to be around 300 Gbps), and many possible states have never been visited before, so learning them is not feasible. ML/AI can be used for part of the problem, such as determining the significant sampling rate (see footnote 1) of the network states keyed on the offered traffic. ML/AI must be used in conjunction with real physical models of the dynamics of the channel, and thus cross-layer controls are imperative to achieve efficiency.
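A back-of-the-envelope calculation makes these time scales concrete. The specific numbers below (a 10-gigabyte transfer, 100 ms and 1 s deadlines, and 10 minutes as the low end of "tens of minutes") are illustrative assumptions chosen to fall within the ranges quoted above, and the function names are hypothetical.

```python
def required_rate_gbps(size_gigabytes, deadline_ms):
    """Sustained rate in Gbps needed to move the data within the deadline."""
    gigabits = size_gigabytes * 8
    return gigabits * 1000 / deadline_ms


def control_speedup(old_minutes, new_ms):
    """Factor by which the control loop must speed up."""
    return old_minutes * 60 * 1000 / new_ms


print(required_rate_gbps(10, 100))   # 800.0 Gbps for 10 GB in 100 ms
print(required_rate_gbps(10, 1000))  # 80.0 Gbps for 10 GB in 1 s
print(control_speedup(10, 10))       # 60000.0 times faster than 10 minutes
```

Under these assumptions a single burst can need a large fraction of a wavelength, or several wavelengths at the lower channel rates, for only a fraction of a second, and the control system has to react tens of thousands of times faster than a turn-on that takes minutes.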
In addition, the network should also have visibility into the applications, and in particular into their future intentions, for better planning (on a time scale of seconds). A more encompassing class of techniques, loosely called cognitive networking, will use physical modeling, Bayesian techniques, ML/AI, and combinatorial and optimization techniques to achieve the following:
- Inference of network states based on traffic and link state sensing, sometimes with active probing, and typically from sparse and stale data.
- Decisions and actions taken on circuit initiations and tear downs, load balancing, reconfiguration and restoration.
- Prediction of intentions of users and appropriate actions recommended or taken.
- Providing efficient resource usage of the all-optically switched architecture with fast agility.
- Prediction of optimum configuration for fast adaptation; improving delay performance without detailed assumptions about channel and traffic statistics and without overprovisioning.
- Detection of reliability- and security-related anomalies in the network and automatic reaction: isolation, reconstitution, re-optimization, insertion of key interconnects, …
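To give a flavor of the first item, inference from sparse and stale data, here is a deliberately tiny sketch. It assumes a two-state model of a single link (congested or clear), a hypothetical sensor that raises an alarm with known probabilities, and a simple Markov model for how the link state drifts between reports. All function names and parameter values are illustrative assumptions, not part of any deployed system.

```python
def age_belief(p_congested, p_stay_congested, p_become_congested, steps):
    """Propagate a stale belief forward in time with no new measurement.

    The belief drifts toward the long-run congestion probability of the
    assumed two-state Markov model.
    """
    p = p_congested
    for _ in range(steps):
        p = p * p_stay_congested + (1 - p) * p_become_congested
    return p


def bayes_update(p_congested, alarm, p_alarm_if_congested, p_alarm_if_clear):
    """Apply Bayes' rule to the belief when a sensor report arrives."""
    if alarm:
        like_congested, like_clear = p_alarm_if_congested, p_alarm_if_clear
    else:
        like_congested, like_clear = 1 - p_alarm_if_congested, 1 - p_alarm_if_clear
    numerator = like_congested * p_congested
    return numerator / (numerator + like_clear * (1 - p_congested))


# Toy usage: an old report said "congested"; five time steps pass with no
# data, then an alarm arrives from the (assumed) imperfect sensor.
belief = age_belief(1.0, 0.8, 0.1, steps=5)
belief = bayes_update(belief, alarm=True,
                      p_alarm_if_congested=0.9, p_alarm_if_clear=0.1)
print(round(belief, 3))
```

The aging step is what accounts for staleness: the older the last report, the less it says about the present state, so decisions rest more heavily on the model and on the next fresh measurement.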
Since the network state is very large and changes rapidly (within seconds), fully sampling it and reporting it to the network management system would waste sensing effort and transmission capacity (∼1 Tbps for a long haul system) and would require unwieldy processing power (HPC class) to compute optimum configurations. New techniques for essential sparse sampling, together with suboptimum but very fast algorithms, must be used to determine network configurations. This is a very complicated problem that must be solved at a very fast pace. There are notions of how this can be done, but thus far there have been no truly comprehensive solutions to the end-to-end network management and control problem. One impediment is that only a small group of researchers and engineers are conversant and proficient across all the network layers and applications. Nonetheless, research in this direction will be very rewarding and important. This is one example of why multi-layer network research is part of the ComSoc strategic direction.
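As one familiar example of what "suboptimum but very fast" can mean, consider first-fit wavelength assignment, a standard heuristic from the routing and wavelength assignment literature. It is offered here only as an illustration of the trade-off, not as the solution envisioned above, and it ignores the Layer 1 physical effects this column argues must be modeled. The link names and data layout are hypothetical.

```python
def first_fit_wavelength(path_links, in_use, num_wavelengths):
    """Assign the lowest-numbered wavelength that is free on every link.

    path_links: list of link identifiers along the chosen route.
    in_use: dict mapping a link identifier to the set of wavelength
            indices already occupied on that link (updated in place).
    Returns the chosen wavelength index, or None if the request is blocked.
    """
    for w in range(num_wavelengths):
        if all(w not in in_use.get(link, set()) for link in path_links):
            for link in path_links:
                in_use.setdefault(link, set()).add(w)
            return w
    return None


# Toy usage on a two-link route with four wavelengths per fiber.
occupied = {"A-B": {0}, "B-C": {1}}
print(first_fit_wavelength(["A-B", "B-C"], occupied, 4))  # 2
print(first_fit_wavelength(["A-B", "B-C"], occupied, 4))  # 3
print(first_fit_wavelength(["A-B", "B-C"], occupied, 4))  # None
```

The heuristic makes a single pass over the wavelengths and never revisits earlier decisions, so it is fast but can block requests that a global optimization would have accommodated.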
Finally, I want to comment on the free-space version of optical networks. There the channel is even more complicated, and the turbulent atmospheric channel changes on a time scale of milliseconds. For this problem, failing to look at the network across all layers will prevent the architect from arriving at an elegant and efficient solution.
1 The significant sampling rate is the minimum sampling rate (not uniform in general) driven by observed offered traffic and network loading that can effectively characterize the state of the network. This can be as small as 1/100 of the Nyquist sampling rate.