The aggregate conversion from the complex physical network topology to a simple virtual topology reduces not only load overhead, but also the parameter distortion of links and nodes during the aggregation process, thereby increasing the accuracy of routing. To this end, a new topology aggregation algorithm (ML-S) is proposed for multi-domain optical networks. On top of stair generation, ML-S upgrades the linear segment fitting algorithm to a multiline fitting algorithm. It finds the mutation points of the stair to increase the number of fitting line segments while keeping redundancy low, thus obtaining a significant improvement in the description of topology information. In addition, ML-S integrates the stair fitting algorithm and effectively alleviates the contradiction between the complexity and accuracy of topology information. It dynamically chooses the algorithm that is more accurate and less redundant according to the specific topology information of each domain. The simulation results show that, under different topological conditions, ML-S maintains a low level of underestimation distortion, overestimation distortion, and redundancy, achieving an improved balance between aggregation degree and accuracy.

The scale of optical networks is expanding rapidly with the development of internet technology. If all nodes were located in a single domain, the growing management burden and overhead of the network would become difficult to handle [1]. Large-scale networks are generally made up of different operators' networks. For reasons of network management and security, the internal topology information of each network is usually invisible to the others [2]. A network system is therefore often divided into different domains, which has led to a trend toward multi-domain optical networks for optical transmission [3].

An important motivation of this study is to provide better scalability for the rapid growth in the number of internet users. Topology aggregation is one of the key techniques for addressing the scalability and security problems of large-scale multi-domain optical networks [4]. The basic principle is to aggregate the topology information of a single domain and to advertise it to other domains, effectively hiding confidential information while reducing the information exchanged among domains [5]. The aggregated virtual topology is a concise description of the nodes and links of the physical topology. At present, typical aggregation models include virtual node aggregation, total topology aggregation, full mesh aggregation, star aggregation, spanning tree aggregation, hybrid aggregation, etc. [6]. The existing topology aggregation algorithms mainly include the single point substitution algorithm (SP), the stretching factor algorithm, the linear segment fitting algorithm (LS), the polynomial fitting algorithm, the cubic spline curve fitting algorithm, the aggregation algorithm based on the polymerization coefficient, the aggregation algorithm based on the weighted dominating set, and so on [7]. While these algorithms aim at a better balance between aggregation and accuracy, their accuracy error is still considerable and greatly affects the accuracy of route selection. It is therefore necessary to improve the balance between aggregation and accuracy. The novelty of the proposed algorithm is that it measures aggregation accuracy with the distortion area, which makes the measurement of topology aggregation more accurate. This differs from measuring the performance of a topology aggregation algorithm by the success rate of a routing algorithm, because the routing algorithm itself can influence the measured aggregation accuracy to some extent.

2. ML-S Algorithm

2.1. Stair Generation Algorithm

This algorithm provides a detailed description of topology information. To describe a link $l(u, v)$ of a domain, a pair of QoS values $(d_l, w_l)$ is used, in which $d_l$ and $w_l$ represent the delay and bandwidth of the link, respectively. The delay and bandwidth demand of an arbitrary service request is represented by $r(r_d, r_w)$. When $r_d \ge d_l$ and $r_w \le w_l$, the link can provide service for the request.

If there are N optional paths between nodes a and b, $p = \{p_1, p_2, \cdots, p_N\}$ represents all physical paths between them. Each path is denoted by a pair of parameters $(d_{p_i}, w_{p_i})$, and the weight of the path is defined by:
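Assuming the usual QoS conventions, in which delay is an additive metric and bandwidth is a bottleneck (minimum) metric, the path weight would take the following form. This is a sketch under that assumption, written with the link symbols $d_l$ and $w_l$ introduced above, and is not necessarily the exact form of Equation (1):

$$d_{p_i} = \sum_{l \in p_i} d_l, \qquad w_{p_i} = \min_{l \in p_i} w_l$$

Under this reading, a path is only as wide as its narrowest link, while every link on the path adds to its delay.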

Figure 1 shows a simple physical network topology. In the basic process of the stair generation algorithm, nodes a and b are retained, and all links and nodes between them are aggregated into the virtual topology.

For the analysis, the first step is to find all the physical paths between a and b, as shown in Table 1, and then to abstract each of them into a logical path with weights. The first physical path is optimal if bandwidth is the only factor under consideration, while the sixth physical path is the best if only delay is considered. By the same token, when bandwidth and delay are both taken into account, the QoS values of any single physical path cannot fully represent the routing information between nodes a and b. In other words, no single path is optimal under both criteria.

In Figure 2, the bandwidth is plotted over the delay, and the topology information is represented by a set of inflection points forming a stair shape.

The shape of the stair is determined not by all the paths, but only by the paths {p6, p2, p3, p5, p1}, which are therefore defined as the stair representation points. In the coordinate plane, the lower right of the stair is defined as the acceptable service request area (ASA) and the upper left is defined as the unaccepted service request area (UASA).
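The stair construction can be sketched in a few lines of Python. The sketch assumes that each path is given as a (delay, bandwidth) pair and that the stair representation points are the paths not dominated by any other path, that is, no other path has both lower-or-equal delay and higher-or-equal bandwidth. The function names `stair_points` and `is_acceptable` and the sample values are hypothetical and are not taken from Table 1.

```python
def stair_points(paths):
    """Return the non-dominated (delay, bandwidth) pairs, sorted by delay.

    A path is kept only if it offers strictly more bandwidth than every
    path with lower or equal delay; these pairs form the stair corners.
    """
    stair = []
    best_bandwidth = float("-inf")
    # Sort by delay ascending; for equal delay, try larger bandwidth first.
    for delay, bandwidth in sorted(paths, key=lambda p: (p[0], -p[1])):
        if bandwidth > best_bandwidth:
            stair.append((delay, bandwidth))
            best_bandwidth = bandwidth
    return stair


def is_acceptable(stair, r_d, r_w):
    """True if the request (r_d, r_w) lies in the acceptable area (ASA)."""
    return any(d <= r_d and w >= r_w for d, w in stair)


# Hypothetical (delay, bandwidth) values for six paths between a and b.
paths = [(5, 2), (7, 4), (6, 1), (9, 4), (12, 8), (5, 1)]
stair = stair_points(paths)
```

A request is accepted when at least one stair point has delay no greater than `r_d` and bandwidth no smaller than `r_w`, which matches the lower-right region described above.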

The original stair accurately reflects the real information of the network topology, but its redundancy is not acceptable, so further fitting on its basis is required [9]. Inevitably, compressing the topology information impairs the descriptive accuracy of network resources, i.e., it introduces topology distortion. There are two kinds of topology distortion. One is the rejection of connection requests when service is available (Error Rejected Area, ERA), also known as underestimation distortion; the other is the blind acceptance of connection requests when service is not available (Crank Back Area, CBA), namely overestimation distortion [10]. Underestimation distortion often wastes available network resources, while overestimation distortion misleads service requests into an area where no physical path meets their QoS requirements.

Error Rejected Ratio (ERR) and Crank Back Ratio (CBR) are two important indexes of an aggregation algorithm. They describe how accurately the algorithm represents the original network topology information, and they reflect the influence of underestimation and overestimation distortions on network performance. They are calculated as:

$$\mathrm{ERR} = \frac{\text{Number of feasible requests rejected}}{\text{Number of overall feasible requests}} \tag{2}$$

$$\mathrm{CBR} = \frac{\text{Number of unfeasible requests accepted}}{\text{Number of requests accepted}} \tag{3}$$

The two indicators, ERR and CBR, together constitute the description accuracy of the network. Along with the aggregation degree, and the balance between aggregation degree and description accuracy, they are the focus of this study [11].
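Equations (2) and (3) translate directly into counting code. The sketch below assumes each request has already been labelled with two booleans: whether it is feasible on the original topology and whether the aggregated topology accepts it. The function name `err_cbr` and the convention of returning 0.0 for an empty denominator are assumptions made for illustration.

```python
def err_cbr(requests):
    """Compute (ERR, CBR) from (feasible, accepted) boolean pairs."""
    requests = list(requests)
    feasible = sum(1 for f, _ in requests if f)
    accepted = sum(1 for _, a in requests if a)
    feasible_rejected = sum(1 for f, a in requests if f and not a)
    unfeasible_accepted = sum(1 for f, a in requests if a and not f)
    # Assumed convention: an empty denominator yields a ratio of 0.0.
    err = feasible_rejected / feasible if feasible else 0.0
    cbr = unfeasible_accepted / accepted if accepted else 0.0
    return err, cbr


# Five hypothetical requests: (feasible, accepted).
sample = [(True, True), (True, False), (False, True), (False, False), (True, True)]
err, cbr = err_cbr(sample)
```

In this sample one of three feasible requests is rejected and one of three accepted requests is unfeasible, so both ratios equal one third.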

2.2. Stair Fitting Algorithm

As a part of the ML-S algorithm, the stair fitting algorithm has the advantages of fixed redundancy and low complexity. It can be applied to networks with an uncomplicated topological structure. A schematic diagram of the stair fitting algorithm is shown in Figure 3. The stair fitting algorithm generates a new even stair to approximate the original stair. $p_{down}(D_{\min}, W_{\min})$ and $p_{up}(D_{\max}, W_{\max})$ represent the lowest and highest points of the original stair, respectively, $f_L$ denotes the equation of the fitting segment L, and $f_L^{-1}$ is its inverse function.

A new fitted stair is calculated accordingly and its redundancy is fixed. It can be represented by the five parameters $(D_{\min}, D_{\max}, W_{\min}, W_{\max}, n)$.
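How the even stair is rebuilt from these five parameters is not spelled out here. The sketch below shows one plausible reading, offered purely as an assumption: n is treated as the number of stair points, spaced evenly in both delay and bandwidth between $p_{down}$ and $p_{up}$. The function name `even_stair` is hypothetical.

```python
def even_stair(d_min, d_max, w_min, w_max, n):
    """Rebuild n evenly spaced (delay, bandwidth) stair points.

    Assumption: n counts the stair points, the first being p_down and
    the last being p_up, with equal steps in delay and in bandwidth.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    step_d = (d_max - d_min) / (n - 1)
    step_w = (w_max - w_min) / (n - 1)
    return [(d_min + k * step_d, w_min + k * step_w) for k in range(n)]


fitted = even_stair(0, 6, 2, 8, 4)
```

Whatever the exact construction, the point of the five-parameter form is that the advertised description has the same size for every domain, which is why the redundancy is fixed.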

2.3. Multiline Fitting Algorithm

The multiline fitting algorithm is an improvement of the least squares fitting algorithm. The least squares fitting algorithm has fixed redundancy because it uses only one line segment. However, it can induce considerable topology distortion for complex network topology structures, mainly because of the inhomogeneity and mutability of the stair. To address this, the multiline fitting algorithm is proposed, as illustrated in Figure 4.

In contrast to the least squares fitting algorithm, the multiline fitting algorithm uses one or more fitting line segments. The number of segments is determined by those inflection points that strongly shape the stair, which are defined as the mutation points. Because the choice of mutation points decides the fitting accuracy, the key to this fitting algorithm lies in the selection of mutation points (Equation (4)):

$$\sum_{j=1}^{n} (w_v - w_j)^2 \le \text{the set value} \quad (2 \le i \le n) \tag{4}$$

For each inflection point, the true bandwidth at delay $d_j$ is $w_j$, and $w_v$ is the fitted bandwidth at delay $d_j$. The fitted bandwidth based on mutation point $i$ is decided by the least squares fitting algorithm applied on both sides of the point. Equation (4) gives the variance between the real stair and the fitting curve; the smaller the variance, the better the fit. If the variance does not exceed the set value, the fit is acceptable and $i$ becomes a mutation point. Otherwise, $i$ is an ordinary inflection point. Equation (5) gives the fitting line segments on both sides of the mutation point.
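The mutation point test of Equation (4) can be sketched as follows. The sketch assumes the stair is a list of (delay, bandwidth) inflection points, that candidate point i splits the list into a left part and a right part sharing that point, and that an ordinary least squares line is fitted to each part. The function names and the sample stair are hypothetical; the sketch covers only the test itself, not the full multiline fitting procedure.

```python
def fit_line(points):
    """Least-squares line w = slope * d + intercept through (d, w) points."""
    n = len(points)
    mean_d = sum(d for d, _ in points) / n
    mean_w = sum(w for _, w in points) / n
    var_d = sum((d - mean_d) ** 2 for d, _ in points)
    if var_d == 0:
        return 0.0, mean_w  # all delays equal: fall back to a flat line
    slope = sum((d - mean_d) * (w - mean_w) for d, w in points) / var_d
    return slope, mean_w - slope * mean_d


def squared_error(points, line):
    """Sum of squared differences between fitted and true bandwidth."""
    slope, intercept = line
    return sum((slope * d + intercept - w) ** 2 for d, w in points)


def split_error(points, i):
    """Left-hand side of Equation (4) for candidate index i (0-based)."""
    left, right = points[: i + 1], points[i:]
    return squared_error(left, fit_line(left)) + squared_error(right, fit_line(right))


def is_mutation_point(points, i, set_value):
    """True if splitting at index i keeps the error within the set value."""
    return split_error(points, i) <= set_value


# Hypothetical stair: a rising run followed by a flat run.
stair = [(0, 0), (1, 1), (2, 2), (3, 2), (4, 2)]
```

With this stair, splitting at index 2 fits both runs exactly, whereas splitting at index 1 leaves a residual on the right-hand part, so only index 2 passes a set value of 0.1.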

Step 1: Define a variable i with an initial value of 1, which counts the mutation points.

Step 2: Use Equation (4) to determine the ith mutation point of the stair. If no mutation point is found, go to Step 7.

Step 3: Use the least squares fitting algorithm to generate a line segment between the first inflection point of the unfitted stair and the inflection point preceding the mutation point. The endpoints of the line segment are $(a_i, b_i)$ and $(c_i, d_i)$, respectively.

Step 4: The shape of the original stair remains unchanged between the fitting line’s end point and the mutation point.

Step 5: Increase the value of i by 1, $i = i + 1$.

Step 6: Calculate the next mutation point of the stair by Equation (4). If no further mutation point is found, use the least squares fitting algorithm to generate a line segment between the last mutation point and the last inflection point of the stair, with endpoints (a, b) and (c, d), respectively, and end the algorithm. Otherwise, go to Step 3.

Step 7: Use the least squares fitting algorithm to generate a line segment between the first inflection point and the last inflection point of the stair, with endpoints $(a_0, b_0)$ and $(c_0, d_0)$, respectively. The algorithm ends.

The initial redundancy of the multiline fitting algorithm is two points, and it increases with the number of mutation points. Each additional mutation point adds one more fitting line and thus increases the redundancy by two points.

2.4. ML-S Algorithm

The redundancy of the stair fitting algorithm is fixed and its complexity is low, which makes it suitable for networks with a simple topology structure. The redundancy of the multiline fitting algorithm, in contrast, is positively correlated with the number of mutation points, which makes it applicable to networks with a complex topology structure. Combining the characteristics of both algorithms, we design an algorithm, ML-S, that aims at a good balance of aggregation degree and accuracy. The entire network topology is divided into several parts, and an appropriate aggregation algorithm is chosen for each part. For parts with fewer mutation points, the multiline fitting algorithm with high accuracy is adopted; because the number of mutation points is small, the redundancy is low. Otherwise, the stair fitting algorithm provides a better balance of aggregation degree and accuracy.

The basic procedure of the ML-S algorithm is as follows:

Step 1: Use the stair generation algorithm to produce the stair map of the original topology.

Step 2: Use Equation (4) to calculate the number of mutation points (m) in the original topology. The network topology is divided into N parts according to the number of inflection points, N = roundup(m/10), with n inflection points per part, n = m/N.

Step 3: Define a variable i with an initial value 1, representing part i of the topology.

Step 4: Calculate the number of mutation points (k) in part i of the topology. If k < n/4, go to Step 5; otherwise, go to Step 6.

Step 5: Use the stair fitting algorithm to fit the original stair in part i and add 1 to i, $i = i + 1$. When $i \le N$, go to Step 4; otherwise, end the algorithm.

Step 6: Use the multiline fitting algorithm to fit the original stair in part i and add 1 to i, $i = i + 1$. When $i \le N$, go to Step 4; otherwise, end the algorithm.
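The partitioning and selection logic of Steps 2 to 6 can be sketched as below. The sketch takes the count m from Step 2 as given and applies the k < n/4 rule literally as written in Steps 4 to 6; it does not perform the fitting itself. The function names `partition` and `choose_algorithm`, the string labels, and the sample counts are hypothetical.

```python
import math


def partition(m):
    """Step 2: return (N, n) with N = roundup(m / 10) and n = m / N."""
    if m <= 0:
        raise ValueError("m must be positive")
    parts = math.ceil(m / 10)
    return parts, m / parts


def choose_algorithm(k, n):
    """Steps 4 to 6 as written: k < n / 4 selects stair fitting."""
    return "stair" if k < n / 4 else "multiline"


# Hypothetical example: m = 34 and the mutation counts k of each part.
N, n = partition(34)
counts = [2, 3, 0, 5]
plan = [choose_algorithm(k, n) for k in counts]
```

Because the decision is made per part, a domain can mix both fitting algorithms, which is what lets the redundancy adapt to local stair shape.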

Complexity includes time complexity and space complexity. Space complexity corresponds to redundancy, which is already one of the performance indicators, so only time complexity is analyzed here. To make the analysis quantitative and easier to follow, the network model used in this paper is restated below.

The entire network is composed of many domains, and $N(G, L_d)$ represents the entire network. $G$ is the set of all domains; $G = \{ g_i \mid g_i = (V_i, E_i, B_i), 1 \le i \le |G| \}$ shows that the network is divided into $|G|$ domains. $V_i$, $E_i$ and $B_i$ represent the node set, link set, and boundary node set of domain $g_i$, respectively. $|V_i|$, $|E_i|$ and $|B_i|$ represent the number of nodes, links and boundary nodes in domain $g_i$, respectively. The number of mutation points in domain $g_i$ is $|M_i|$, and the number of mutation points in part $j$ of domain $g_i$ is $|m_j|$.

The simulation parameters of the network model are set as follows. The bandwidth of each physical link is randomly generated in the range [1, 20] units, where each unit can represent a certain number of Mbps. The delay of each physical link is randomly generated in the range [5, 20] units, where each unit can represent a certain number of ms. In each domain, 10% - 20% of the nodes are randomly selected as boundary nodes, and random links are generated to connect all the boundary nodes. 500 random requests are generated between the designated boundary nodes of the network topology. To guarantee that the distortion area is fully measured, the random requests need to satisfy Equation (6):

The following conclusions can be drawn from the data in Figure 7. In the fixed topology, the ML-S algorithm is superior to the least squares algorithm and the stair fitting algorithm on the three performance indicators (CBR, ERR, and CBR + ERR), mainly because it takes into account the inhomogeneity and mutability of the stair and selects significant mutation points to achieve a more accurate description of the topology. ML-S is close to the multiline fitting algorithm, which has the best distortion performance, mainly because the multiline fitting algorithm is a vital part of ML-S and ML-S takes full advantage of its distortion performance. Because the stair fitting algorithm helps to suppress the redundancy of ML-S, ML-S is far better than the multiline fitting algorithm in redundancy. Compared with the least squares algorithm and the stair fitting algorithm, ML-S obtains a significant improvement in the description of topology information with little additional redundancy, although its redundancy rate is somewhat higher. In terms of the balance between aggregation degree and accuracy, ML-S is significantly better than the other algorithms.

3.2. Experimental Scheme 2

The simulation of Experimental Scheme 2 is carried out on random network topologies. The topologies are generated randomly with the Waxman model, which is close to real networks and is widely used in the simulation of network routing algorithms. The generation rules of the network model are as follows: all nodes of each domain are randomly distributed within a specified area, and the position of each node is determined by randomly generated coordinates. The probability $p(u, v)$ of a direct link between two nodes is:

$$p(u, v) = \beta \exp\left( \frac{-d(u, v)}{L\alpha} \right) \tag{7}$$

$d(u, v)$ represents the distance between nodes u and v; L represents the maximum distance between two nodes in the model; $\alpha$ and $\beta$ are parameters within the range (0, 1). $\alpha$ represents the ratio of the number of longer links to shorter links: increasing $\alpha$ increases the number of long links, while the number of shorter links decreases correspondingly. $\beta$ represents the link density in the topology and guarantees the uniformity of link distribution.
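A Waxman topology following Equation (7) can be generated with the Python standard library alone. The sketch assumes nodes are placed uniformly at random in a square of side `size` and that L is taken as the largest pairwise distance among the generated nodes. The function names, the `size` and `seed` parameters, and the sample parameter values are hypothetical.

```python
import math
import random
from itertools import combinations


def waxman_probability(d, L, alpha, beta):
    """Equation (7): link probability for two nodes at distance d."""
    return beta * math.exp(-d / (L * alpha))


def waxman_graph(n, alpha, beta, size=100.0, seed=None):
    """Return (coords, edges) for a random Waxman topology with n nodes."""
    if n < 2:
        raise ValueError("at least two nodes are required")
    rng = random.Random(seed)
    coords = [(rng.uniform(0, size), rng.uniform(0, size)) for _ in range(n)]
    pairs = list(combinations(range(n), 2))
    dist = {(u, v): math.dist(coords[u], coords[v]) for u, v in pairs}
    L = max(dist.values())  # maximum distance between any two nodes
    edges = [
        (u, v)
        for u, v in pairs
        if rng.random() < waxman_probability(dist[(u, v)], L, alpha, beta)
    ]
    return coords, edges


coords, edges = waxman_graph(20, alpha=0.3, beta=0.4, seed=1)
```

Passing a fixed `seed` makes a run reproducible, which is convenient when results are averaged over many generated topologies. The generated graph is not guaranteed to be connected, so a simulation would typically check connectivity or regenerate.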

For the simulation, 50 Waxman topologies were randomly generated for each setting. The number of nodes in the domain was set to {10, 20, 30, 40, 50, 60, 70, 80, 90, 100} in turn. First, the ERR and CBR values of the 50 Waxman topologies with 10 nodes in the domain were measured, and the average of the 50 values was taken as the final experimental data. The ERR and CBR values for Waxman topologies with 20 to 100 nodes in the domain were obtained in the same way. 500 random requests were generated between the designated boundary nodes of the network topology. To ensure that the distortion area is fully measured, the random requests need to satisfy Equation (6).

To verify the stability of the ML-S algorithm, comparison experiment 2 was designed. The following conclusions can be drawn from the data in Figure 8. In the random topology, the ML-S algorithm is superior to the least squares algorithm and the stair fitting algorithm on the three performance indicators (CBR, ERR, and CBR + ERR), mainly because the redundancy of the least squares algorithm and the stair fitting algorithm is fixed whereas that of ML-S is not, so ML-S improves the accuracy of the network topology description by adding redundancy where appropriate. ML-S is close to the multiline fitting algorithm, which has the best distortion performance, and its indicators tend to stabilize as the number of nodes in the domain increases, mainly because ML-S dynamically chooses the algorithm that is more accurate and less redundant. For parts with fewer mutation points, the multiline fitting algorithm with high accuracy is adopted; otherwise, the stair fitting algorithm is adopted. Because of the smaller number of mutation points, ML-S is far better than the multiline fitting algorithm in redundancy, and this indicator also tends to stabilize as the number of nodes in the domain increases. Compared with the least squares algorithm and the stair fitting algorithm, ML-S reduces CBR and ERR to approximately half, greatly improving the access success rate for feasible requests. ML-S thus not only reduces the waste of network resources, but also improves the overall performance of the network.

4. Conclusion

The proposed ML-S algorithm achieves a better balance of aggregation degree and accuracy. Based on the stair generation algorithm, the stair fitting algorithm, and the multiline fitting algorithm, ML-S improves the accuracy of the network topology description by adding redundancy where appropriate, according to the specific topology information of each domain, and it dynamically chooses the algorithm that is more accurate and less redundant. Compared with other algorithms in both fixed and random topologies, the simulation results show that ML-S has stable and good aggregation performance and achieves a better balance between aggregation degree and accuracy. On the overall distortion performance indicators, ML-S achieves a reduction of nearly 60% compared with the least squares algorithm and the stair fitting algorithm. On redundancy, ML-S achieves a reduction of nearly 50% compared with the multiline fitting algorithm. The proposed algorithm should be studied further in the following respects. First, a dynamic updating mechanism for the topology could be considered in the process of topology aggregation. Second, a topology aggregation algorithm that combines additional parameters beyond delay and bandwidth should be designed.