Inverse Multiplexing for ATM. Technical Operation, Applications and Performance Evaluation Study

Marcos Postigo-Boix¹, Joan Garcia-Haro², Mónica Aguilar-Igartua¹

¹Dpt. of Applied Mathematics & Telematics. Polytechnic University of Catalonia (UPC)
²Dpt. of Information Technologies & Communications. Polytechnic University of Cartagena (UPCT)
E-mail: (mpostigo@mat.upc.es; joang.haro@upct.es; maguilar@mat.upc.es)

Abstract

In an established Wide Area Network (WAN) infrastructure, one of the main problems that ATM network planners and users face when more than T1/E1 bandwidth is required is the high cost associated with T3/E3 links. The technology that covers the gap between T1/E1 and T3/E3 bandwidth at reasonable cost is known as Inverse Multiplexing for ATM (IMA). IMA allows multiple T1/E1 lines to be aggregated to support the transparent transmission of ATM cells over a single virtual trunk. In this paper, the fundamentals and major applications of IMA technology are described. The behavior of IMA multiplexers is also analyzed and a method to dimension them is proposed. For that purpose an IMA simulation tool has been developed, which permits the study of individual devices and the evaluation of the end-to-end performance of a logical trunk under several ATM input traffic patterns. The analytical study is based on a comparison with an M/D/C/(N+C) queue under Poisson input traffic.

1. Introduction

The dominant role of ATM in the LAN scenario is still unclear in comparison with competing technologies such as switched Ethernet, Fast Ethernet and even Gigabit Ethernet. ATM is well established in the backbone area, especially in private corporate environments. In the WAN, network operators are first providing access to the ATM network over the existing infrastructure, while slowly deploying new infrastructure and implementing pure ATM interfaces over it. Users, however, want the ATM bandwidth benefits for their high-speed applications soon, but in a cost-effective manner.

Currently, there are basically two options for providing access to ATM services at WAN scale. The first consists of T3/E3 links, which offer considerable bandwidth (44.736/34.368 Mbps) but are usually not justified, since most prospective users would underutilize them. Furthermore, the tariffs that carriers charge for them are very high. The other alternative is substantially cheaper and uses T1/E1 links (1.544/2.048 Mbps), but the offered bandwidth is insufficient for some user needs. Prices depend on several factors, such as distance and the particular carrier. Moreover, T3/E3 links are in general only available in big cities [1, 2]. Due to cost and availability of service, an intermediate solution offering enough bandwidth at reasonable cost is required.
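The size of the gap can be made concrete with a little arithmetic. The following minimal sketch uses only the nominal line rates quoted above; the function name `aggregate_rate` is illustrative, and the IMA overhead is ignored, so the figures are upper bounds.

```python
# Nominal line rates in Mbps, as quoted in the text.
T1, E1 = 1.544, 2.048
T3, E3 = 44.736, 34.368

def aggregate_rate(link_rate_mbps, num_links):
    """Raw aggregate rate of num_links parallel links (overhead ignored)."""
    return link_rate_mbps * num_links

for n in (2, 4, 8):
    print(f"{n} x T1 = {aggregate_rate(T1, n):.3f} Mbps, "
          f"{n} x E1 = {aggregate_rate(E1, n):.3f} Mbps")
```

Even eight aggregated links stay well below the T3/E3 rates, which is the intermediate range an inverse multiplexer is meant to cover.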

In July 1997, the ATM Forum published the Inverse Multiplexing for ATM specification, known as IMA [3], whose latest version was released in April 1999 [4]. IMA defines the transparent transmission of a high-speed ATM cell stream over one logical link composed of several T1/E1 lines. IMA distributes and transfers a single flow of ATM layer cell traffic onto multiple physical links. At the remote end, the traffic is recombined and the original ATM cell sequence is fully recovered and delivered to the higher layers for further processing.

The Inverse Multiplexer (IMUX) is the device in charge of grouping several T1/E1 physical circuits into a single logical trunk. An IMUX accepts ATM cell streams coming from different traffic sources, as well as traffic coming directly from LANs, which is adapted and converted to ATM cell format. In both cases, the IMUX distributes the resulting cells in a round-robin fashion over the physical links, maintaining the QoS required by each individual connection.

In this paper, the origins, applications and technical foundations of inverse multiplexing are explained in tutorial style. Then, a model for an IMUX is presented and intensively evaluated under Poisson traffic. To perform the evaluation under different input traffic distributions, an IMA system simulator has been developed. The idea is to elaborate a methodology that helps engineers and network planners to characterize and dimension an IMUX device, that is, to obtain the buffer size and the number of T1/E1 output links that guarantee the QoS parameters demanded by users, basically measured as Cell Loss Ratio (CLR) and mean cell delay. An approximate analysis that computes the IMUX performance easily is derived, avoiding the need for costly simulations.

The paper is organized as follows. Section 2 discusses the current knowledge regarding operation and main applications. The simulation model is introduced in section 3. The study under Poisson input traffic and the approximate analysis for the CLR and the mean cell delay are presented in section 4. Finally, section 5 summarizes the most relevant conclusions of this study.

2. Inverse Multiplexing Fundamentals and Network Applications

In a conventional multiplexer, various independent input channels are combined into one high-speed link for efficient transmission. The situation changes when applications and local traffic have the opposite requirement. Users have high-speed LAN data, video-on-demand, multimedia applications, etc., that must interconnect, interact and interoperate between remote ends. They could lease a wide-bandwidth circuit (e.g., T1/E1 or T3/E3), but its capacity has to be fully used over time to justify its cost. As an alternative, a cost-cutting scheme can be used. The original idea was to lease a number of small-bandwidth synchronized digital telephone circuits (56/64 Kbps) to obtain a higher-speed connection whose capacity is approximately the sum of the link capacities minus a small amount of overhead. It is therefore possible to transmit a wider-bandwidth signal over the existing switched digital telephone network on a dial-up basis. This process is the reverse of multiplexing, because it involves breaking a wider signal into multiple independent channels of smaller capacity for transmission. Furthermore, the customer uses the minimum required bandwidth (with the granularity of the channel rate) for the minimum time necessary, has the flexibility to use as much or as little bandwidth as needed on demand, and pays only for the time during which bandwidth is consumed. Inverse multiplexing for ATM follows this scheme, but uses multiple T1/E1 links to form an IMA group of the speed required to carry ATM services. This brings additional advantages. For example, if links fail, the connections involved can still be preserved, although the total bandwidth assigned is temporarily reduced. The failed T1/E1 links are automatically recovered and added back to the IMA group when they are restored. Consequently, IMA continues offering service to the applications, although the QoS is reduced during the failure.
Besides that, IMA could offer bandwidth on demand, dynamically adding more T1/E1 links to a session in progress if more bandwidth is requested. Links could be torn down when the bandwidth is no longer needed. Currently, this is one of the ATM Forum sub-working group goals.

The three typical network configurations where the IMUX technology applies are: access connection to the ATM network, internal network connection between ATM switches and dedicated bandwidth connection between two remote places or ATM leased line IMA.

IMA is a process in which an ATM cell stream is cyclically distributed onto multiple T1/E1 physical links as shown in Fig. 1. The original flow is reassembled at the receiving end. To control the different physical links and to reconstruct the ATM cell stream the IMA Control Protocol (ICP) is employed. ICP uses special OAM cells or ICP cells that are periodically inserted in each line. These cells carry information regarding the state of the link, the number of links being multiplexed, when to add or remove a T1/E1 link from an IMA group and the differential delay values among links. It allows the receiver to arrange IMA frames that arrive not aligned due to different delays suffered in each individual link. The IMA protocol also defines a new OAM cell, the Filler cell. Filler cells are appropriately injected to provide cell rate decoupling between the bit rate of incoming ATM layer cells and the IMUX operation nominal rate. This rate is known as IMA Data Cell Rate (IDCR).
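The cyclic distribution with Filler cells can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the ICP-aware procedure of the specification: ICP and stuff cells are left out, one loop iteration stands for one decision instant at IDCR, and the names `ima_transmit`, `arrivals` and the marker `"F"` are illustrative.

```python
from collections import deque

def ima_transmit(arrivals, num_links, num_ticks):
    """Distribute ATM layer cells over links in round-robin order.

    arrivals: dict mapping a tick index to the list of cell ids that
              arrive just before that tick (hypothetical input format).
    Returns one list per link with the cell sent at each visit;
    "F" marks a Filler cell used for cell rate decoupling.
    """
    queue = deque()
    links = [[] for _ in range(num_links)]
    for tick in range(num_ticks):
        queue.extend(arrivals.get(tick, []))
        link = tick % num_links                   # cyclic link selection
        cell = queue.popleft() if queue else "F"  # Filler if nothing is waiting
        links[link].append(cell)
    return links

print(ima_transmit({0: [1, 2], 3: [3]}, num_links=2, num_ticks=4))
```

In this example the third decision instant finds the queue empty, so a Filler cell is sent on the first link while cell 3 later goes out on the second one.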

**Figure 1. IMA transmit end.**

**Stuff events** (two consecutive SICP cells) are required to prevent de-synchronization among links operating with independent clocks. SICP (Stuff ICP) cells provide some tolerance to compensate for clock divergences.

Fig. 2 represents the IMA receiver. It is responsible for aligning the IMA frames, buffering them so that the traffic can be recombined into the original ATM layer cell sequence. The reading process is controlled by the IMA cell clock, so reception buffers are cyclically read at IDCR. Delay Compensation Buffers are read in round-robin order. When an ICP cell is found, the receiver ignores it and immediately advances to check the next link until a Filler or an ATM layer cell is found. Analogously, when a stuff event is found, the two SICP cells are ignored. If the receiver finds a Filler cell, it is discarded and the receiver waits for the next tick of the clock. If it finds an ATM layer cell, the cell is passed to the ATM layer.
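The reading rules just described can be sketched as follows. This is an illustrative simplification that assumes frames are already aligned in the Delay Compensation Buffers; the function name and the string markers `"ICP"`, `"SICP"` and `"F"` are hypothetical.

```python
from collections import deque

def ima_receive(link_buffers):
    """Rebuild the ATM layer cell sequence from per-link buffers.

    link_buffers: one list per link; entries are "ICP", "SICP",
                  "F" (Filler) or an ATM layer cell id.
    Returns the ATM layer cells in the order they are delivered.
    """
    buffers = [deque(b) for b in link_buffers]
    delivered = []
    link = 0
    while any(buffers):
        # One clock tick: skip control cells until a Filler or an
        # ATM layer cell is found on some link.
        while any(buffers):
            buf = buffers[link]
            link = (link + 1) % len(buffers)   # round-robin advance
            if not buf:
                continue
            cell = buf.popleft()
            if cell in ("ICP", "SICP"):
                continue                        # ignored, check next link at once
            if cell != "F":
                delivered.append(cell)          # passed to the ATM layer
            break                               # wait for the next tick
    return delivered

print(ima_receive([["ICP", 2, 4], [1, 3, "F"]]))
```

The first link starts with an ICP cell, so the first ATM layer cell is read from the second link and the original order 1, 2, 3, 4 is recovered.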

3. IMA Characterization and Simulation Model

To examine IMA operation under different traffic patterns, load conditions, numbers of output links and buffer sizes, an IMA simulator written in C++ has been developed. It simulates independent IMUX devices (sender and receiver) as well as a complete end-to-end IMA group. Basically, the system is composed of a traffic source, an IMUX, a variable number of transmission links, an inverse demultiplexer and a traffic collector. Input traffic of variable load arrives at the IMUX device, which operates at IDCR. The IMUX can be configured to have C output ports running at TRLCR (Timing Reference Link Cell Rate), perfectly synchronized. The number of IMA links is variable; however, typical commercial values range from 2 to 8. The IMUX device implements buffering of selectable size to temporarily store ATM layer cells in case they cannot be immediately injected into the output ports. IMA OAM cells are considered not to occupy memory, since they are generated at the same time they are transmitted. The memory at the sender is logically organized into FIFO queues, one for each output port and one for the incoming cells, from which cells are cyclically extracted at IDCR. The queue size depends on the number of arriving ATM layer cells and is limited to the total memory pool assigned to the IMA device. The memory is thus shared among all output ports. A cell is lost only when the memory is completely full of ATM layer cells. As Fig. 1 shows, the ATM layer cells are distributed onto the output links according to a round-robin algorithm. The insertion of control cells at their respective output ports is also indicated. The memory at the output ports is employed to schedule the departure of cells, and it is part of the total memory of the IMUX. This memory is used on demand. Therefore, empty buffers are only present at the input queue of the system. The IMUX output ports feed a network of links. The simulator only models a different length for each link in the IMA group, so that the impact of different delays on the cell delay and cell delay variation can be studied. However, we considered all the lengths identical. Consequently, the end-to-end delay is the minimum, since for different lengths the receiver would wait for cells arriving at the longest link buffer.

![Figure 2. IMA receive end.](image)

At the remote end, the IMA de-multiplexer receives the C lines and reconstructs the original cell stream. Incoming cells are stored into queues, one for each link, and then, they are cyclically read out at IDCR rate.

4. Performance Evaluation under Poisson Input Traffic

In this section, we compare the results obtained by simulation with those obtained from a proposed mathematical analysis that eases IMUX dimensioning without the need for costly simulations. We analyze the CLR and the mean waiting time of the ATM layer cells in the IMUX as the main QoS parameters. To gain insight and simplify the analysis, we consider that ATM cells arrive at the IMUX according to a Poisson process. In this way, IMUX operation can be related to the discrete-time behavior of an M/D/C/(N+C) queuing system. We know that Poisson traffic cannot be found in today's networks, but it is a starting point in teletraffic engineering and still a good choice when the arrival process is unknown and not very bursty (e.g., at intermediate network nodes).

We use a Poisson process of parameter $\lambda$, where $\lambda$ is the average ATM cell generation rate. Particularizing for an IMA system, a Poisson source generates cells at mean rate $\lambda$, and the IMUX is loaded by a factor $\rho = \lambda/\text{IDCR}$. This is also valid for any type of traffic, where $\lambda$ is the mean number of cells generated per unit of time.

4.1. CLR estimate in an IMA multiplexer by using a M/D/C/(N+C) system

Assuming Poisson input traffic and the same amount of available resources (C output links and an input queue of finite size N cells), the actual IMA multiplexer and its M/D/C/(N+C) representation are, a priori, similar. Both have N memory positions to hold N cells and C identical output ports, each with a constant service time. Time is divided into fixed-length slots and the slot is taken as the time unit to simplify the analysis. Although Filler cells may be considered to be transmitted in an M/D/C/(N+C) model as well when there are not enough ATM layer cells to serve, some differences remain, as explained below. The treatment of priorities and scheduling policies to support different traffic classes is outside the scope of this paper.
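As a rough illustration of the reference model, a slotted queue with C servers and N waiting places can be simulated in a few lines. This is a sketch under stated assumptions rather than the simulator or the analytical solution used in this paper: arrivals within a slot are admitted while there is room, up to C cells depart at the end of every slot, and the names `poisson` and `simulate_clr` as well as the default run length are illustrative.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson variate (Knuth's method, adequate for small means)."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_clr(rho, c, n, slots=200_000, seed=1):
    """Estimate the CLR of a slotted queue with c servers and n waiting places."""
    rng = random.Random(seed)
    capacity = n + c                      # waiting places plus cells in service
    in_system = arrived = lost = 0
    for _ in range(slots):
        a = poisson(rng, rho * c)         # offered load rho per output link
        accepted = min(a, capacity - in_system)
        arrived += a
        lost += a - accepted
        in_system += accepted
        in_system -= min(in_system, c)    # up to c departures at the slot end
    return lost / arrived if arrived else 0.0

print(simulate_clr(rho=0.8, c=4, n=8, slots=50_000))
```

A plain simulation of this kind only resolves loss ratios well above the inverse of the number of simulated cells, so very small CLR values need an analytical solution or much longer runs.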

Both systems differ in how incoming cells are processed and sent to their respective output links. Let us first analyze the effect produced by ICP/SICP cells. The IMA specification [3, 4] recommends that the ICP cell be inserted at a specific location within each IMA frame on a physical link. The ICP cell appears in a different slot of the frame on each link of the IMA group, but its position is the same from frame to frame on any given link. Consequently, in a time slot where an ICP cell is delivered, the transmit IMA serves ATM layer cells only on the remaining C-1 links. In comparison, an M/D/C/(N+C) system serves C cells if they are present in its buffers.

In an M/D/C/(N+C) system, cell delivery is always done at the end of a time slot. If the number of ATM layer cells stored in the system is smaller than the number of output ports, Filler cells are inserted on the corresponding output links. In an IMUX, the decision to place cells onto the output links is taken at IDCR in a round-robin way, as depicted in Fig. 3. Operating at IDCR, it is possible that no ATM layer cell is available to schedule on an output line at the first decision time, so a Filler cell is inserted instead. However, C or more ATM layer cells may arrive just after this instant (Fig. 3).

**Figure 3. M/D/C/(N+C) model vs. IMUX.**

An additional effect has to be considered. Cells are cyclically delivered at IDCR rate but ICP cells have to be periodically inserted onto a given output port. Therefore, when an ICP cell or a stuff event must be inserted, the corresponding ATM layer cell will be placed onto the next available link (see cell number 1 in Fig. 4). This event delays the delivery of ATM layer cells. Consequently, an ATM layer cell (cell number 4) arriving to the IMUX before others is transmitted later as it is illustrated in Fig. 4. This cell will leave after the two SICP cells preceding it. This incident lasts until the completion of the distribution of C-1 cells at IDCR rate.

**Figure 4. Delay introduced by ICP/SICP cells (4 links).**

All these events delay the delivery of ATM layer cells compared to an M/D/C/(N+C) model, causing higher memory occupation. This increases the CLR, the mean cell delay and the cell delay variation. The higher occupation can be interpreted as the number of "extra" cells that may be waiting at the IMUX due to ICP/SICP cell insertion. The particular output link for the departure of an ATM layer or Filler cell is decided at IDCR. In this way, IMA offers a virtual link of rate IDCR. As 1/TRLCR is not a multiple of 1/IDCR, the number of cell departure decision times per slot is not constant. Specifically, the departure of C ATM layer or Filler cells is decided in some slots and of C-1 cells in the others. This compensates for ICP/SICP cell delivery. The output time slots in which the departure of C-1 ATM layer or Filler cells is decided are denoted by δ's in the sequence S(n) [7].

\[ S(n) = \sum_{i=1}^{\infty} \delta \left[ n - \left\lfloor \frac{2049}{17 \cdot C} \cdot i + 1 \right\rfloor \right] \]

(1)
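The constant 2049/(17·C) in S(n) can be checked with exact arithmetic. The sketch below assumes the usual IMA relation IDCR = C · TRLCR · (M-1)/M · 2048/2049 with a frame length of M = 128 cells; neither the relation nor the value of M is stated in this section, so both are labeled as assumptions in the code, and the function names are illustrative.

```python
from fractions import Fraction

M = 128        # assumed IMA frame length in cells (one ICP cell per frame)
STUFF = 2049   # assumed stuffing period: one stuff cell per 2048 cells

def decisions_per_slot(c):
    """Mean number of ATM layer or Filler departure decisions per slot."""
    return c * Fraction(M - 1, M) * Fraction(STUFF - 1, STUFF)

def short_slot_spacing(c):
    """Mean number of slots between two slots with only c-1 decisions."""
    deficit = c - decisions_per_slot(c)   # missing decisions per slot
    return 1 / deficit

for c in (2, 4, 8):
    print(c, short_slot_spacing(c), float(short_slot_spacing(c)))
```

Under these assumptions the deficit per slot is exactly 17·C/2049, so a slot with only C-1 decisions occurs on average every 2049/(17·C) slots, about 30 slots for C = 4.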

On the other hand, ICP cells are located in an IMA frame at the offset(s) position (2), being s the link number and M the length of an IMA frame. Expression (3) represents the ICP or SICP cell insertion.

\[ \text{offset}(s) = \frac{2(s+1)}{2 \log_2(s+1)} - M \]

(2)

\[ ICP(n) = \sum_{s=0}^{C-1} \sum_{k=0}^{\infty} \left\lfloor \delta \left[ n - \text{offset}(s) - 2049 \cdot k \right] + \sum_{i=0}^{\infty} \delta \left[ n - \text{offset}(s) - \left\lfloor \frac{127 \cdot C}{\text{IDCR}} \cdot i \right\rfloor \right] \right\rfloor \]

(3)

When an ICP cell is inserted, it is delivered to its corresponding output link, but it affects ATM layer cell delivery, since the service of those cells is delayed. This can be modeled as if the memory pool were reduced by one cell buffer, since an ATM layer cell remains in memory instead of being served. However, in the slots where the departure of only C-1 ATM layer cells is decided, the memory occupation decreases, because the additional cell leaving the system is taken from the cells already stored in memory. Therefore, the number of extra cells, or the equivalent level of memory reduction, increases with each ICP or SICP cell service and decreases during the slots where only C-1 cell departures are decided (4). Fig. 5 shows the extra cell level for the case with 2 output links.

**Figure 5. Extra cell level in an IMUX (2 output links).**

\[ \text{extra}(n) = \sum_{m=0}^{n} \left[ ICP(m) - S(m) \right] \]

(4)
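The extra cell level is a running sum, which the following minimal sketch computes from two indicator sequences. The input sequences in the example are made-up toy values chosen only to show the mechanics; they are not derived from the ICP(n) and S(n) expressions of this paper.

```python
from itertools import accumulate

def extra_level(icp, s):
    """Running sum of ICP(m) - S(m) for m = 0..n, returned for every n."""
    return list(accumulate(i - d for i, d in zip(icp, s)))

# Toy indicator sequences (hypothetical, for illustration only).
icp = [1, 0, 0, 1, 0, 0]   # slots with an ICP/SICP cell insertion
s   = [0, 0, 1, 0, 0, 1]   # slots with only C-1 departure decisions
print(extra_level(icp, s))
```

Each control cell insertion raises the level by one and each slot with only C-1 decisions lowers it again, which is the sawtooth behavior the text attributes to Fig. 5.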

For each number of output links it is possible to calculate the probability that the IMUX will have a certain reduction of memory. From the mean value (m) and the standard deviation (σ) of this probability distribution, we obtain an approximate formula for the buffer reduction (5) by rounding up their sum.

\[ N_{\text{red}} = \lceil m + \sigma \rceil \]

(5)
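Given the probability of each extra cell level, the buffer reduction follows directly from the mean and the standard deviation. The sketch below is a minimal illustration; the function name and the example distribution are hypothetical and do not reproduce the values obtained for any particular number of links.

```python
import math

def buffer_reduction(level_probs):
    """Ceiling of mean plus standard deviation of the extra cell level.

    level_probs: dict mapping an extra cell level to its probability
                 (the probabilities are assumed to sum to 1).
    """
    mean = sum(level * p for level, p in level_probs.items())
    var = sum((level - mean) ** 2 * p for level, p in level_probs.items())
    return math.ceil(mean + math.sqrt(var))

# Made-up distribution: mean 1, standard deviation about 0.71.
print(buffer_reduction({0: 0.25, 1: 0.5, 2: 0.25}))
```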

We take this approach for the buffer reduction, since the mean value alone does not account for the width of the distribution.

Fig. 6 plots the CLR under an offered load per link of 80%. We have conducted simulations for an offered load per link ranging from 50% to 90% and for C taking values from 2 to 8. Simulation results are represented by dots. The analytical values derived from the solution of the M/D/C/(N+C-Nref(C)) system are denoted by lines. Simulations have been conducted in segments of two million cells. Due to the number of cells simulated, the 95% confidence intervals are only significant and visible for CLR values around 10⁻⁸. To reach simulated CLR values of very small order (CLR<10⁻¹²), it is necessary to introduce rare-event simulation techniques to reduce the execution time. It is observed that the proposed approximation improves as the offered load and the buffer size increase. For a given buffer size, the CLR increases as the applied load grows, since the number of cell arrivals is higher. Increasing the buffer size reduces the CLR, but more slowly as the offered load becomes larger. Also, as the offered traffic load grows, the choice between leasing few or many output links matters less, since at high load the losses are high in either case. In addition, the CLR grows when the number of links increases. That is, assuming that the offered load per output link remains fixed, the input cell rate to the IMUX increases when the number of links grows.

### 4.2. Estimate of the Mean Cell Delay at the IMA Multiplexer

Once a suitable M/D/C/(N+C-Nref(C)) model has been selected to approximate the IMUX CLR, we check whether it also provides a good estimate of the mean waiting time suffered by a cell in the system. Comparing the mean waiting time as a function of the offered load at the simulated IMUX with the one derived from its equivalent queuing model, both mean delays exhibit a similar behavior and differ by a roughly constant value.

In an M/D/C/(N+C-Nref(C)) model, the minimum cell waiting time (under input load close to zero) is always fixed and equal to 0.5 slots. In an IMUX, the mean cell delay depends on the number of output ports. Accordingly, as a first approach, we can add to the analytical results the difference between the minimum mean delays of the simulated IMUX and of the M/D/C/(N+C-Nref(C)) system. The minimum mean cell delay of an IMUX has two components: one due to the round-robin cell distribution at IDCR and the other caused by the delay that ICP/SICP cells introduce [7]. The first component means that an incoming ATM layer cell is not transmitted before the clocking time at IDCR, although some links may be filled with OAM cells at those instants. This delay component can be formulated by (6).

\[
W_{min,IDCR} = \int_0^1 (1-x)dx + \text{IDCR} \int_0^1 \frac{1}{1-x}dx
\]

(6)

The delay introduced by ICP/SICP cells arises because the system must always guarantee the delivery of C outgoing cells (ATM layer, Filler or ICP/SICP), one for each output port, at TRLCR. Thus, when a control cell is inserted, some ATM layer cell has to wait for its transmission. Calculating the mean cell delay that those cells add on the links [7] yields expression (7). Consequently, the minimum mean cell delay can be described by (8).

\[
W_{min,IMA} = W_{min,IDCR} + W_{ICP}
\]

(8)

Once this delay is computed, we approximate the results by the summation of mean delays in the M/D/C/(N+C-Nref(C)) system plus the difference between minimum mean delays (10). These values are further adjusted by using simulation (11) with the polynomial of minimum degree that approximates these results. The resulting expression to estimate the mean cell delay at the IMUX is denoted by (9).

\[
W_{IMA}(\rho, C) = W_{M/D/C/(N+C-N_{ref}(C))} + W_{2}(C) + W_{3}(\rho, C)
\]

(9)

\[
W_{2}(C) = W_{min,IMA} - W_{min,M/D/C/(N+C-N_{ref}(C))} = W_{min,IMA} - 0.5
\]

(10)

\[
W_{3}(\rho, C) = 0.1462\rho - 0.2926 \times 10^{-2} C + 0.01606
\]

(11)
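Expressions (9) to (11) combine into a short calculation, sketched below. The coefficients of `w3` are those of (11); the function names are illustrative, and the values passed for the queue-model delay and for the minimum IMUX delay in the example are placeholders that would have to come from the queue solution and from (8).

```python
def w3(rho, c):
    """Simulation-fitted correction term of expression (11), in slots."""
    return 0.1462 * rho - 0.2926e-2 * c + 0.01606

def w_ima(w_queue, w_min_ima, rho, c):
    """Mean IMUX cell delay following (9): queue delay plus W2 and W3.

    w_queue:   mean delay of the equivalent queue model (input, in slots)
    w_min_ima: minimum mean delay of the IMUX from (8) (input, in slots)
    """
    w2 = w_min_ima - 0.5    # (10): 0.5 slots is the queue model minimum
    return w_queue + w2 + w3(rho, c)

# Placeholder inputs, for illustration only.
print(w3(0.8, 4))
print(w_ima(w_queue=1.0, w_min_ima=0.9, rho=0.8, c=4))
```

For a load of 0.8 and 4 links the fitted term is roughly 0.12 slots, so it acts as a small correction on top of the constant offset of (10).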

Fig. 7 plots the comparison between simulated and approximated values for an offered load of 80%. Confidence intervals are not shown since their values are very small (relative errors lower than 2%).
The use of (11) is to make the simulated results available in a computable expression. See Fig. 8, where the results for an offered load of 80% are shown without the fine adjustment (11); the approximation is still good enough.

![Figure 7. IMA mean cell delay (load 80%).](image)

![Figure 8. IMA mean cell delay (load 80%). Without $\bar{W}_3$.](image)

Given some values defining the required QoS, the proposed method allows the dimensioning of the main parameters that characterize an IMUX. For that purpose, the resulting analytical expressions can be used (and their outcomes may be tabulated) instead of running costly simulations. For instance, using the figures shown above, if the maximum mean cell delay and the required CLR are fixed to two time slots and $10^{-4}$ respectively, then for an offered load per link of 0.8 Erlang the number of output links needed is 4 and the buffer capacity at the IMUX is 24 cells. Of course, this holds only if the Poisson input traffic hypothesis holds. Similarly, the IDCR and TRLCR rates can be dimensioned taking into consideration the offered load per link.
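The dimensioning step amounts to a search over the number of links and the buffer size. The following sketch shows one possible way to organize it; it is not the procedure used by the authors. The callables `clr_fn` and `delay_fn` are hypothetical stand-ins for the analytical expressions or tabulated results, and the search order (fewest links first, then smallest buffer) is an assumption.

```python
def dimension(clr_fn, delay_fn, max_clr, max_delay, link_range, buffer_range):
    """Return the first (links, buffer) pair meeting both QoS targets.

    clr_fn(c, n) and delay_fn(c, n) are caller-supplied estimates of the
    CLR and of the mean cell delay (in slots) for c links and n buffers.
    Returns None when no pair in the given ranges satisfies the targets.
    """
    for c in link_range:
        for n in buffer_range:
            if clr_fn(c, n) <= max_clr and delay_fn(c, n) <= max_delay:
                return c, n
    return None
```

A call such as `dimension(clr_fn, delay_fn, 1e-4, 2.0, range(2, 9), range(1, 65))` would scan 2 to 8 links, matching the range of link counts considered in this study.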

5. Conclusions

In this paper, Inverse Multiplexing for ATM and its primary network applications are described. IMA is a physical layer technology; therefore, it can be used to transport any service once it has been adapted to ATM cell format. To study the behavior and evaluate the performance of IMA systems, a flexible, object-oriented simulation tool has been developed [7]. IMA multiplexers have been characterized and analyzed under Poisson input traffic. First, the CLR as a function of the offered load and the number of output links has been investigated. An equivalent M/D/C/(N+C) queue is proposed to model the memory required at the IMUX. By reducing the queue size of this analytical model, an accurate approximation to the simulation results is obtained. The queuing system is further adjusted to match the mean cell delay. Nevertheless, the IMUX performance depends in any case on the input traffic pattern, as is true of any network node. Using Poisson traffic is almost the only way to obtain analytical models and closed mathematical expressions. Although Poisson traffic is not a realistic pattern for access network nodes or for the traffic currently measured in ATM networks, it is still a useful approach to validate the characterization and simulation models of these ATM network nodes. Operators, network planners and traffic engineers will continue developing device node libraries to incorporate into more or less complex network simulators. These simulators will be fed with realistic traffic patterns [6] according to users' changing needs. However, engineers will very probably still use simple and useful patterns, such as the Poisson one, to validate and verify their simulators. This is why we still believe in the usefulness of a "permanent" and understandable traffic pattern to study the IMUX and to analyze it mathematically [5, 7].

6. Acknowledgements

This work was supported by the Spanish Research Council under project SSADE (CICYT TEL99-0822).

7. References