Tối ưu hóa viễn thông và thích nghi Kỹ thuật Heuristic P17

Chia sẻ: Tien Van Van | Ngày: | Loại File: PDF | Số trang:23

lượt xem

Tối ưu hóa viễn thông và thích nghi Kỹ thuật Heuristic P17

Mô tả tài liệu
  Download Vui lòng tải xuống để xem tài liệu đầy đủ

In recent years, flow control and network management in high speed networks have drawn much attention by researchers in the telecommunications field. In large-scale network environments with high complexity, decentralized control and decision making are required. Many research papers concerned with multidisciplinary approaches to modeling, controlling and managing networks from a computational intelligence point of view appear in the key telecommunication journals every year.

Chủ đề:

Nội dung Text: Tối ưu hóa viễn thông và thích nghi Kỹ thuật Heuristic P17

  1. Telecommunications Optimization: Heuristic and Adaptive Techniques. Edited by David W. Corne, Martin J. Oates, George D. Smith Copyright © 2000 John Wiley & Sons Ltd ISBNs: 0-471-98855-3 (Hardback); 0-470-84163X (Electronic) 17 Intelligent Flow Control under a Game Theoretic Framework Huimin Chen and Yanda Li 17.1 Introduction In recent years, flow control and network management in high speed networks have drawn much attention by researchers in the telecommunications field. In large-scale network environments with high complexity, decentralized control and decision making are required. Many research papers concerned with multidisciplinary approaches to modeling, controlling and managing networks from a computational intelligence point of view appear in the key telecommunication journals every year. However, only a few of them concentrate on the similarity between resource allocation in a network environment and the market mechanism in economic theory. By taking an economic (market based and game theoretic) approach in the flow control of a communication network, we seek solutions where the intelligence and decision making is distributed, and is thus scalable, and the objective of a more efficient and fair utilization of shared resources results from the induced market dynamics. Unlike other decentralized approaches, our main focus is on using computational intelligence to model the distributed decision agents and to study the dynamic behavior under game-theoretic framework in allocating network resources. By viewing a network as a collection of resources which users are selfishly competing for, our research aims at finding efficient, decentralized algorithms, leading to network architectures which provide explicit Quality of Service (QoS) guarantees, the crucial issue in high speed multimedia networks. There are several parallel research projects from different institutions dealing with this complicated issue. COMET at Columbia Unversity (http://comet.columbia.edu/research/) Telecommunications Optimization: Heuristic and Adaptive Techniques, edited by D.W. Corne, M.J. Oates and G.D. Smith © 2000 John Wiley & Sons, Ltd
  2. 308 Telecommunications Optimization: Heuristic and Adaptive Techniques calls the decentralized decision making process Networking Games. Using network game approaches, they intend to solve the crucial issue of pricing from the ground-up of the network from the engineering point of view, rather than having, as in existing communication networks, a necessarily arbitrary price structure imposed post facto. By taking a market-based approach to distributed systems, the Michigan MARX Project (http://ai.eecs.umich.edu/MARX/) seeks solutions to enable adaptive allocation of resources in large-scale information systems. A very useful collection of links and various resources on network economics and related matters is maintained at Berkeley by H.R. Varian (http://www.sims.berkeley.edu/resources/infoecon/Networks.html). However, there are few published works which attempt to solve call admission control as well as traffic management in telecommunication networks from a game-theoretic point of view by using computational intelligence. In this chapter, we organize our work into two parts. The first part is dedicated to the Connection Admission Control (CAC) process in ATM networks. By taking dynamic CAC as a resource sharing problem, we use a cooperative game model with the product form of a certain combination of user preferences to optimize the call block probability while maintaining the negotiated QoS at the call setup stage. After deriving the product form of the user preference function, we use a genetic algorithm to optimize this objective function to maintain fair shares during the traffic transmission phase. The second part of our work models non-cooperative user behavior in the resource allocation process under a generalized auction framework. We propose an optimal auction rule that leads to equilibrium in a repeated dynamic game. We also use neural networks to model the user strategy in bidding, and briefly discuss the formation of common knowledge in repeated games and the allocation efficiency. The common trait for these two parts is that we use some computational intelligence techniques to solve the complicated modeling and optimization problems in resource allocation. 17.2 A Connection Admission Control Scheme Based on Game Theoretic Model in ATM Networks 17.2.1 Preliminaries of Congestion Control in ATM Networks Asynchronous Transfer Mode (ATM) has gradually become the standard in Broadband Integrated Services Digital Networks (B-ISDN). It aims to integrate all of the digital communication services, such as voice, image, video and data transfer, into a single integrated network. A fixed-length packet in an ATM network is called a cell. The network takes advantage of statistical multiplexing to improve the link utilization, while ensuring the requirements of different types of Quality of Service (QoS) from various traffic sources. The demand for intelligent flow control to prevent possible network congestion has become a major issue for network management. The congestion control schemes of the ATM network can be classified into two types: preventive and reactive control. The first approach is mainly applied to Constant Bit Rate (CBR) and Variable Bit Rate (VBR) services, while the second is applied to Available Bit Rate (ABR) services. Preventive congestion control includes Connection Admission Control (CAC) and traffic enforcement. CAC decides whether to accept or reject a user request for connection establishment. If a connection is established, the bandwidth (and possibly the cell buffers at each switching node) for the
  3. Intelligent Flow Control under a Game Theoretic Framework 309 incoming service along this connection has to be explicitly allocated during the call holding period. The number of connections accepted at any time has an upper bound due to finite network resources, which mainly depends upon statistical characteristics of traffic sources provided by the end users, QoS constraints and the pricing scheme added on each connection. Traffic enforcement regulates the traffic load in each connection by shaping or smoothing the incoming traffic according to the user declared traffic descriptors. There exist a lot of CAC schemes for bandwidth allocation. Early attempts often used a static approach to model the incoming traffic: during the call setup phase, the end user provides two sets of parameters, one indicating the QoS requirements (e.g. cell loss rate, maximal tolerable delay) and the other containing all the statistical descriptors of the traffic. If the network has enough bandwidth and buffers to transmit the new traffic source, then it will accept this user request and allocate the required resources. Otherwise, the call will be blocked or rejected. However, because of its reliance on the traffic descriptors provided by the end user, the above method has the following problems (Hsu and Walrand, 1996): • How to choose statistical parameters for different traffic sources has not yet been agreed upon. • Usually, at the call setup phase, the end user is not able to determine the traffic descriptors accurately. • Once the incoming traffic is multiplexed with other cell streams, its statistical characteristics will change, and the user declared parameters may not have the same accuracy for all traversed links during transmission. To overcome these problems, one simple solution is to reduce the parameters of the traffic descriptors from the end user (e.g. peak cell rate and mean cell rate), and the network makes a conservative resource allocation by assuming the ‘worst’ traffic pattern the user will provide. Details of various traffic models for ‘worst’ case service patterns can be found in Perros and Elsayed (1996). Although quite simple, this approach often leads to inefficient bandwidth utilization. Thus, it is necessary to develop a scheme using a dynamic allocation mechanism which can estimate the required bandwidth of each source by monitoring user-specified QoS parameters during cell transmission. A review of dynamic allocation strategies for ATM networks can be found in Chong et al. (1995). However, the diversity of traffic characteristics and the variety of QoS constraints make the optimization effort of dynamic bandwidth allocation a complicated problem. Various analytical methods and queuing models have been used, but most of them are highly computationally expensive (e.g. even the simple heterogeneous on/off queuing model in Lee and Mark (1995) has no analytical solution in closed form). Recently, heuristic approaches based on equivalent bandwidth have been presented using neural networks and fuzzy set theory. For example, Hiramatsu (1991) proposed a CAC and a traffic enforcement framework using a neural network; Chang and Cheng (1994) proposed a method of traffic regulation based on a fuzzy neural network; Ndousse (1994) implemented a modified leaky bucket to smooth the traffic by using fuzzy set theory; in Zhang and Li (1995), the authors derived a CAC scheme based on a fuzzy neural network, and integrated it with a genetic algorithm to optimize the network parameters. All of these schemes can be seen as a trade-off between the accuracy of traffic modeling and the implementation complexity. A major drawback is
  4. 310 Telecommunications Optimization: Heuristic and Adaptive Techniques the lack of fairness among different classes of service, especially when different pricing schemes are introduced to the individual user. The recent development of the differentiated service Internet model recalls the interest of fair shares among different services, and the CAC scheme also requires improvement to fit this need. In the following, we first briefly describe the measurement-based method to monitor the equivalent bandwidth of traffic sources; then we propose a cooperative game model to derive the measure of fair shares in resource allocation. Under the game theoretic framework, the fairness criterion of resource allocation can be derived through maximizing the product form of the user preference function (also called the utility function). Given the analytical form of the utility function based on the bandwidth-delay-product, we use a modified genetic algorithm to solve the optimization problem and derive the available bandwidth for each type of traffic. The CAC criterion is modified to maintain the fairness of accepting/rejecting the incoming calls. Simulation results show that our method can achieve a desired fairness for different traffic types and also maintain good link utilization. 17.2.2 Cooperative Game Model for Dynamic Bandwidth Allocation The equivalent bandwidth model of traffic sources We first briefly introduce the method to estimate the equivalent bandwidth of multiplexed traffic based on traffic measurement. Due to the diversity of the traffic arrival process, when heterogeneous traffic sources are multiplexed, the bandwidth allocated for each single traffic source has an equivalent representation, assuming that no multiplexing is induced. In terms of equivalent bandwidth, the estimation is based on large quantities of traffic multiplexing. Denote the traffic arrival process as {X1, X2, …} and the service rate as S. In a given traffic measurement interval, there are W such discrete arrivals, and we assume W is large enough. With regard to its asymptotic distribution, we have: W 2W X1 = ˆ ∑ X k , X 2 = ∑ X k ,⋅ ⋅ ⋅ ˆ (17.1) k =1 k =W +1 Assume the arrival process is stationary; the new process in equation 17.1 asymptotically has identical independent distribution. According to large deviation theory, the equivalent bandwidth b can be calculated by the following equation (Chen, 1996): n /W 1 W λW (θ ) = log ˆn W n ∑ eθXˆ i (17.2) i =1 δˆn = sup{λW (θ ) ≤ Sθ } W ˆn (17.3) θ >0 λW (δˆn ) ˆn W b = lim (17.4) n →∞ δˆn W
  5. Intelligent Flow Control under a Game Theoretic Framework 311 Details about the equivalent bandwidth can be found in Lee and Mark (1995), and measurement-based admission control procedures in Jamin et al. (1996). It is necessary to point out that the equivalent bandwidth of the individual traffic represents the required resource capacity that the network manager has to allocate for the established connection in order to meet the QoS requirement. The accepted traffic sources can also be regulated via end-to-end traffic enforcement (Chen, 1996). The above provides a brief illustration of dynamic bandwidth allocation, which is also the preliminary in developing a CAC strategy that we will discuss next. The fair share criterion based on the cooperative game model Early research on dynamic resource allocation mainly concentrated on improving link utilization under certain QoS constraints. Less attention has been paid to the fair shares issue among different service types when many incoming calls are competing for network connections and some calls must be blocked. Recently, more work has emphasized the problem of fairness among end users, as well as different types of traffic streams using network resources. For example, in Bolla et al. (1997), the authors try to balance the call block probability among the incoming calls of different types. In Jamin et al. (1998), the authors combine the CAC strategy with a usage-based pricing scheme to achieve a certain degree of fairness. In general, these works propose different objective functions that have to be optimized when admission control and the models of the objective function can be classified into two classes. One formulates the optimization problem taking the Grades of Service (GoS) as constraints of the objective function. The other approach assigns each class of traffic a certain weight or a reward parameter, and the distribution of the GoS is adjusted in resource allocation by changing the value of the weights. The implementation of the above approaches is by no means easy due to the lack of an efficient algorithm to optimize the objective function on-the-fly. Thus, it is difficult to satisfy the needs of a real- time CAC decision using the above schemes. However, from the user-preference point of view, the problem of fair shares among different traffic types is suitable for modeling as multi-player cooperative games. Early studies such as Sairamesh et al. (1995) and Dziong and Mason (1996) discussed the call block probability in resource allocation under the game theoretic model. In Dziong and Mason (1996) the Raiffa, Nash and modified Thomson solutions were compared with regard to their characteristics and the CAC boundary to apply the solution, but only a few simple traffic models are used in simulation to achieve nearly equal call blocking probabilities among the different traffic types. In the following, we first introduce the cooperative game model, and then define the utility function based on the form of bandwidth-delay-product of the incoming traffic. Using this model, we propose a CAC scheme to achieve a certain trade-off of call block probabilities among different traffic types. Finally, we try to optimize the objective function used in the cooperative game model by applying a genetic algorithm. To simplify the notation, suppose there are two types of incoming traffic competing for the network resources. All possible bandwidth allocation strategies between the different types of traffic form a strategy set, and the outcome of the game is evaluated with two players’ utilities namely u = {u1, u2}, ui ∈ R. Denote U as the set of all possible outcomes of the game between the two players, and assume that U is convex and compact. Note that this assumption is introduced in order to derive the unique optimal solution under the
  6. 312 Telecommunications Optimization: Heuristic and Adaptive Techniques cooperative game model. However, in solving the optimization problem using a genetic algorithm, we do not need this property of the set U. The outcome of the cooperative game with two players has the following properties: the increase of one player’s utility will decrease that of the other; each player’s cooperative utility is no less than the non-cooperative outcome. Suppose that when two types of traffic sources demand network bandwidth, the network manager uses a centralized decision rule to ensure the fairness of resource sharing, such as certain coordination between the users transmitting different types of traffic. This procedure is called a cooperative game. For convenience, assuming that the elements in U have been normalized to [0, 1], we define the player’s preference as his utility function, given below: v1 = u1 + β (1 − u 2 ) (17.5) v2 = u2 + β (1 − u1 ) (17.6) where β is a weight factor. In the cooperative game model with two players, by solving the optimization problem: max{v1 ⋅ v 2 } (17.7) u∈U the two players’ bandwidth utilization may achieve a ‘relatively’ fair share at some operation point with respect to β. When β = 0, the solution of equation 17.7 is called the Nash point; when β = 1, the solution is called the Raiffa point; and when β = −1, the solution is called the modified Thomson point. In the multi-player cooperative game model, player j 's utility function has the form v j = u j + | β ( N − 1) | − ∑ ui (17.8) j ≠i The economic meanings of each solution corresponding to equation 17.7 can be found in Shubik (1982). When using the cooperative game model, choosing an appropriate utility function is the key issue in ensuring fairness among the various services. Except for the diversity of the bandwidth requirements of the traffic sources, the statistical characteristics of the traffic and utilization of the resources along the end-to-end links should also be considered. In deriving a utility function of an appropriate form, we may want to include the above factors, and also impose a weight factor that controls the influence on each player's satisfaction at different GoS levels. Denote C to be the original data length of the traffic source; denote C′ to be the average length after compensating the data retransmission due to cell error or cell loss. We use the empirical formula C′ = C·[1+α(L)] to model the traffic amount, where L is the cell loss rate at one switching node and α(L) is a function of L called the influence factor, due to cell retransmission. Assume that B is the available bandwidth to be allocated to the user, and T is the mean end-to-end cell delay, then the average cell transmission time can be written as follows:
  7. Intelligent Flow Control under a Game Theoretic Framework 313 C′  C′  nC ′L D ( L, B ) ≈ + k  + 2T  (17.9) B B  1 − nC ′L In equation 17.9, k is a coefficient modeling the sliding window for traffic shaping, and can be chosen as half of the window size under a window-based traffic shaping mechanism. Notice that k may also be a variable in connection with the burst curve of the arrival traffic. The upper bound of k should not exceed the capacity of the leaky bucket under leaky bucket flow control. n is the number of nodes and/or switches that the traffic traverses from source to destination. In equation 17.9 we potentially assume that the cell loss rate at each node is approximately the same. Using the result derived above, the utility function based on the bandwidth-delay-product is defined as follows  nC ′L  C ( L, B) = f ( L) C ′ + k (C ′ + 2 BT ) (17.10)  1 − nC ′L   In equation 17.10, f (L) is a weight function that decreases with the increase in the cell loss rate, depending on different QoS requirements. From equation 17.10, we can see that the number of traffic sources and the bandwidth-delay-product represent the availability of network resources to a certain type of traffic during the call setup phase. f (L) is considered as the nominal QoS given that the connection is established for the incoming traffic. Hence we consider C(L,B) as the utility function of various types of traffic sources. Note that the utility function for each type of traffic is given as: ui = Ci ( Li , Bi ), i = 1,2,..., N (17.11) then the outcome of the cooperative game given in equation 17.7 ensures the fair share of bandwidth among different traffic sources in a quantitatively simplified measurement. Note that utility functions of other forms may also be introduced for differentiated or best effort services following a similar derivation; the genetic algorithm does not require the objective function to have a specific analytical form. Note also that in our approach, the cooperative game is not played by the end users but the network manager, who sets certain operation points of the CAC boundary according to the fair shares criterion. The CAC strategy with fair shares among various traffic sources After introducing the cooperative game model, we propose the CAC strategy, which includes the following stages: when the ith class of traffic arrives, the network estimates its ~ equivalent bandwidth bi using equations 17.1–17.4 proposed in section 17.2.2. If the statistical parameters of the incoming traffic are available, the equivalent bandwidth can also be calculated by means of queuing analysis through various numerical approaches. At the same time, the network manager uses a genetic algorithm to search for the optimal solution of the optimization problem of equation 17.7, thus achieving the fair shares bandwidth Bi available for this class of service. In case the bandwidth allocated to each traffic is per-flow guaranteed (e.g. static allocation), the network end can run genetic search off-line or at an earlier stage; while a network using dynamic bandwidth allocation, Bi is adjusted online via genetic optimization to achieve the desired optimal point. Assume the
  8. 314 Telecommunications Optimization: Heuristic and Adaptive Techniques ~ bandwidth for the ith class of traffic sources is Bi , as a result, the network decides to ~ ~ accept or reject the incoming traffic depending on whether Bi − Bi is greater than bi . In fact, when the ith and jth classes of traffic compete for the network resources, the network will accept the class at a higher priority whose allocated bandwidth has not exceeded its maximal bandwidth in fair share. In the next subsection, we will give the detailed procedure using a GA to solve the optimization problem of equation 17.7 and we will show that by properly choosing the weight factors of the utility function in equations 17.5 and 17.6, the desired fairness among various classes of service can be achieved. 17.2.3 Call Admission Control with Genetic Algorithm The basic principles of genetic algorithm Genetic Algorithms (GAs) are stochastic optimization methods that usually require the objective function to have the form given below: min{ f (C ) | C ∈ IB N } (17.12) Usually, we have: ∀C ∈ IB N = {0,1} N , 0 < f (C ′) < ∞ We can see that equation 17.7 satisfies the above requirement. Chapter 1 has introduced the basic operation of a genetic algorithm. In the following, we concentrate on the results of applying the technique in our application. Computer Simulation Results In our work we use the GA proposed in Zhang et al. (1994; 1995), which uses modified mutation and crossover operators. The parameters coded into strings for GA optimization are the bandwidth available for each type of traffic source. When there are ni sources accepted by the network with service type i, the cell loss rate of this type of service at the switch can be estimated by using equivalent bandwidth (Chen, 1996). Normalizing equations 17.10 and 17.11 and substituting them into equation 17.7, we can get the fitness value for GA optimization. The simulation results show that by using an improved GA approach, the global optimum (strictly speaking, it is sub-optimum or near optimum) is achieved with around only 20 iterations in GA evolution. The simulation scenario is chosen as follows. We assume there are two types of traffic source. One (traffic I) has a peak bit rate of 64kbps, a mean bit rate of 22kbps and an average burst length of 100 cells. The source-destination route traverses four switching nodes and the mean cell delay is 4ms. The QoS of traffic I requires a cell loss rate less than -4 10 . The other (traffic II) has a peak bit rate of 10Mbps, a mean bit rate of 1Mbps and an average burst length of 300 cells. The source-destination route traverses two switching nodes and the average delay is 0.28 ms. The traffic source has the QoS constraint of a cell -8 loss rate less than 10 . Assume that the link capacity of the network is 155.5 Mbps; the weight function in equation 17.10 takes the empirical formula as below:
  9. Intelligent Flow Control under a Game Theoretic Framework 315 f ( L) = e −10( L − QoS ) , α ( L) = 1.005L (17.13) The length of one string representing the normalized bandwidth in the GA is 15 bits. In computer simulation, the CAC boundary given different traffic scenarios (data length of traffic sources) as well as different buffer sizes is calculated via the measurement-based approach discussed in earlier. Concerning the fair shares between traffic I and II, the maximal number of connections for each type of traffic during a heavy load period is derived using GA optimization, and is listed in Table 17.1. Table 17.1 The CAC boundaries with different weight factors of two types of traffic sources. Buffer size Data length of Data length of The number of The number of Weight factor β ( in cells ) traffic I traffic II traffic I traffic II Nash 0 103 10 5 10 7 4046 3 Raiffa 1 103 105 107 4748 1 Thomson -1 103 105 107 2430 8 -0.25 103 105 107 3460 4 -0.75 103 105 107 2729 7 0.25 103 105 107 4328 2 0.75 103 105 107 4625 1 Nash 103 105 108 2628 7 Raiffa 103 105 108 3296 3 Thomson 103 105 108 1368 13 Nash 103 103 108 1966 12 Raiffa 103 103 108 4338 4 Thomson 103 103 108 76 17 Nash 102 105 107 2495 3 Raiffa 105 107 3868 1 102 Thomson 102 105 107 1245 5 From Table 17.1, we can see that the traffic sources with a longer data length (indicating a longer connection holding period) will get a relatively larger bandwidth. It is clear that traffic II needs a much larger bandwidth than traffic I. If fair share is not considered between the two types of traffic sources, traffic II is more likely to be blocked than traffic I during the heavy load period when both types of incoming traffic compete for network connection. The solution of the cooperative game has such properties that it guarantees certain reserved bandwidth for both of the traffic sources, despite the diversity
  10. 316 Telecommunications Optimization: Heuristic and Adaptive Techniques of their required bandwidth and the number of hops in their routes. The greater the amount of traffic, the more the reserved bandwidth for the usage of this traffic type at the network operation point. Table 17.1 also reveals that the CAC boundary will become smaller when the buffer size decreases, provided other configurations unchanged. However, the solution of the game theoretic model will not exceed the boundary without the constraint of fair shares in a CAC decision. The selection of the weight factor β of the utility function has a significant influence on the fair share among different types of traffic sources. The network manager may choose an appropriate β to make the trade-off of call block probabilities among different traffic sources. In Table 17.1, we can see that the available bandwidth for traffic I increases as β increases from −1 to 1. When β = 1, the utility function of each type of traffic considers its own gain as well as the loss of other types of traffic for the amount of bandwidth available. When β = −1, the net gains of the players’ cooperation in sharing the bandwidth are considered. When β = 0, each player only cares about individual profits through cooperation. Thus, the above three conditions have a clear economic meaning representing certain fair share criteria in a CAC decision. Through various simulation studies, we find that using a Nash solution as an additional constraint to the CAC decision may be a good candidate in making a balance of call block probabilities among different traffic types. As β = 0 and the data length of traffic varies, the upper bound of the bandwidth available for traffic I is derived from 28% to 87%, while for traffic II it is from 46% to 78%, and the link utilization remains relatively high. In short, the simulation results show that a certain trade- off between call block distribution among different traffic sources and the network efficiency can be achieved by setting an appropriate operation point at the CAC boundary, which is an optimization problem under the cooperative game framework. 17.2.4 Summary of Connection Admission Control using a Cooperative Game In this section, the CAC mechanisms based on dynamic bandwidth allocation are briefly discussed. The major issue of fair shares of the bandwidth among different traffic sources is analyzed under a cooperative game model. The utility function concerning the bandwidth- delay-product of each type of traffic is proposed, and the final optimization problem is solved using an improved genetic algorithm. To speed up our CAC scheme, the optimization problem to achieve fair shares can be calculated offline. In online tuning, we can set the initial population to include the previous solution for GA optimization. The proposed CAC scheme is still a centralized mechanism to regulate the traffic flows from end users, and we potentially assume that the network decision-maker has enough information (traffic statistics and QoS requirements) on the incoming traffic to force the link utilization to the desired operation point. The cooperative game model also has a clear economic meaning to interpret certain types of solution as the negotiation result among the users to improve all of their profits in using the available network bandwidth. When such information to derive the utility function of each type of traffic is unavailable, a distributed mechanism for resource allocation is required. We will model the decentralized resource allocation process as a generalized auction game in the next section.
  11. Intelligent Flow Control under a Game Theoretic Framework 317 17.3 Optimal Auction Design for Resource Sharing 17.3.1 Resource Allocation as an Auction Design Problem In the previous section, we have studied the connection admission control problem under a cooperative game framework. Under that framework, we assume that the network manager can explicitly control all of the user traffic during call setup and traffic transmission. However, for the traffic using best effort transmission, the cooperative approach may fail to set the operation point along the CAC boundary, due to the inherent nature of non- cooperative users in sharing the network resources. To characterize user behavior in a non- cooperative environment, a generalized auction model will be discussed in the following. From the non-cooperative game point of view, communication networks are often characterized by what the economists call externalities. That means the value of a network to a user depends upon the other users. The positive externalities are that a communication network is more valuable if more people are connected. The negative externalities are that users who cannot or will not coordinate their actions sufficiently to achieve the most desirable allocation of network resources may lead to network congestion. To avoid negative externalities, prices play a key role as allocation control signals. The telephone system and current Internet represent two extremes of the relationship between the resource allocation and pricing. The resources allocated to a telephone call are fixed, and prices are based on the predictable demand at any given time. In the case of the Internet, the current practice of pricing by the maximal capacity of a user’s connection decouples the pricing from resource allocation. In the emergence of multi-service networks (e.g. ATM), neither of these approaches are viable; the former because of the wide range of applications will make demand more difficult to predict; and the latter, because once the entry fee is paid, there is no incentive to limit usage, since increasing consumption benefits individual users, whereas limiting it to a sustainable level brings benefits which are shared by all. This makes it vulnerable to the well-known ‘tragedy of the commons’. Thus, there is a need to develop new approaches to price network resources as an alternative to flow control. In a distributed pricing environment, the network manager should charge each network user a price dynamically responsive to unpredictable demand. Auctions have long been considered as an efficient tool for resource allocation in a decentralized control system. To analyze the potential performance of different kinds of auction, we follow Vickrey (1961) and study the auctions as non-cooperative games with imperfect information. We consider the use of the auction model as a decentralized mechanism for efficiently allocating resources to the users of the network (called bidders). In our approach, allocations are for arbitrary shares of the total available quantity of network resources, as in Lazar and Semret (1997), against the auction of identical items proposed in Vickrey (1962). Following the analytical model in Milgrom (1981), we find that the auctioneer's best policy is a generalized Vickrey auction under certain regularity assumptions. However, any buyer who does not bid his highest possible price for a unit resource might be assigned a nonzero probability that he will get any resource at all when the manager performs optimal policy, which is different from the PSP auction rule derived in Lazar and Semret (1997). The discussion of auction design in section 3 can be generalized to other scheduling problems in distributed computing systems, while the allocation rule remains the same.
  12. 318 Telecommunications Optimization: Heuristic and Adaptive Techniques In the following discussion, we assume that the resource to be allocated to the potential bidders is a quantity of link capacity, and does not assume any specific mapping to a QoS requirement. Rather, the users are defined as having an explicit valuation (i.e. utility function) toward quantities of resource, and each user is free in choosing his bidding strategy to maximize his own valuation. The evolutionary behavior of the users’ knowledge of their opponents’ bidding strategies is also studied under the repeated auction game. 17.3.2 Basic Definitions and Assumptions for Optimal Auction Design Following Milgrom (1981), we use similar notation to derive the problem of optimal auction design with incomplete information on the bidders. Given the resource of quantity C, and a set of players N = {1,2,…,n} submitting bids, i.e. declaring their desired share of the total resource and the prices they are willing to pay for it, an auction mechanism is the rule used by the auctioneer to allocate resources to the players based on their bids. We will use i and j to represent typical bidders in N. The auctioneer’s problem derives from the fact that he does not know how much the bidders are willing to pay for the resource capacities they need. That is, for each bidder i, there is some value ti which is bidder i’s estimate of the unit resource capacity he wants to bid for. We shall assume that the auctioneer (e.g. network manager) uncertainty about the value estimate of bidder i can be described by a continuous probability distribution over a finite interval. Specifically, we let ai represent the lowest possible value that bidder i might assign to a unit resource; we let di represent the highest possible value that i might assign to a unit resource. Let fi:[ai,di]→R+ be the probability density function for i’s value estimate ti, assuming that 0 < ai < di < +∞, ∀ti ∈ [ai,di], fi(ti) > 0. Correspondingly, let Fi:[ai,di]→[0,1] denote the cumulative distribution function so that: ti Fi (t i ) = ∫ai f i ( s i )ds i Thus, Fi(ti) is the auctioneer’s assessment of the probability that bidder i has a value estimate no greater than ti. We will let T denote the set of all possible combinations of a bidder’s value estimates: T = [a1 , d1 ] × [a 2 , d 2 ] × L × [a n , d n ] For any bidder i, we let T–i denote the set of all possible combination of value estimates which might be held by bidders other than i, so that: T− i = × [a j , d j ] j∈N , j ≠ i We further assume that the value estimates of the n bidders are stochastically independent random variables. Thus, the joint density function on T for the vector t = (t1, t2, …, tn) of individual value estimates is:
  13. Intelligent Flow Control under a Game Theoretic Framework 319 f (t ) = ∏ f j (t j ) j∈N Of course, bidder i considers his own value estimate of unit resource to be a known quantity, not a random variable. However, we assume that bidder i assesses the probability distributions for other bidders’ value estimates in the same way as the auctioneer does. That is, both the resource seller and bidder i assess the joint density function on T–i for the vector: t −i = (t1 , L , t i −1 , L , t i +1 , L t n ) of values for all bidders other than i to be: f −i (t −i ) = ∏ f j (t j ) j∈N , j ≠i We denote the auctioneer’s personal value estimate for unit resource, if he were to keep it unsold, to be t0. We assume that the auctioneer has no private information about the resource, so that t0 is known to all the bidders. Suppose that the resource is infinitesimally divisible and the bidders’ budget limit in one-shot as well as repeated bidding game is B = {b1 , b2 ,L , bn }, so that bidder i can always request for bi / ti resource capacity, given that his value estimate is ti. 17.3.3 Feasible Auction Mechanisms Given the density function fi and each bidder’s evaluation estimate ti, the network manger’s problem is to select an auction mechanism to maximize his own expected utility. We must now develop the notation to describe the auction mechanisms that the manager might select. To begin, we will restrict our attention to a special class of auction mechanism: the direct revelation mechanism (Milgrom, 1981). In the direct revelation mechanism, the bidders simultaneously and confidentially announce their value estimates to the manager; then the manager determines who gets the (partial) resource according to their request, and how much each bidder should pay for unit resource capacity, as some function of the announced value estimates t = (t1 , t 2 , L, t n ) and required capacity r = (r1 ,L , rn ), ri = bi / t i of total resource. Thus, a direct revelation mechanism is described by a pair of outcome functions (p, x) of the form p: T→Rn and x: T→Rn such that, if t is the vector of the announced value estimates of unit capacity, then pi(t) is the probability that bidder i gets ri of the total resource and xi(t) is the expected amount of money that bidder i must pay for unit resource. Notice that in this case, we assume that the bidders’ budget limit B is deterministic, and for the case that all bidders bid for a capacity no less than the total resource, the auction problem reduces to a traditional one item auction. We shall assume throughout this section that the manager and the bidders are risk neutral and have additive separable utility functions for resource capacity being sold or consumed. Thus, if bidder i knows that his value estimate of unit capacity is ti , then his expected utility from the auction mechanism described by (p, x) is:
  14. 320 Telecommunications Optimization: Heuristic and Adaptive Techniques bi U i ( p , x, t i ) = ti ∫ [ti pi (t ) − xi (t )] f −i (t −i )dt −i (17.14) T− i where dt −i = dt 1L dt i −1dt i +1 L dt n . Similarly, the expected utility for the network manager from this auction mechanism is: bj U 0 ( p, x ) = ∑ ∫ t j [t 0 (1 − p j (t )) + x j (t )] f (t )dt (17.15) j∈N T where dt = dt 1L dt n . Not every pair of functions (p, x) represents a feasible auction mechanism. There are three types of constraint which must be imposed on (p, x). First, since there are only C unit capacities to be allocated, the function p should satisfy the following probability conditions: bj ∀i ∈ N , t ∈ T , 0 ≤ pi (t ) ≤ 1 and ∑ p j (t ) t j ≤C (17.16) j∈N Secondly, we assume that the manager cannot force a bidder to participate in an auction which offers him less expected utility than he could get on his own. If the bidder did not enter the auction, he would not pay any money, and his utility payoff would be zero. Thus, to guarantee that the bidder will participate in the auction, the following individual- rationality conditions must be satisfied: ∀i ∈ N , ∀t i ∈ [ai , d i ], U i ( p, x, ti ) ≥ 0 (17.17) Thirdly, we assume that the manager cannot prevent any bidder from lying about his individual value estimate of unit capacity if the bidder expects to gain from lying. Thus, the direct revelation mechanism can be implemented only if no bidder ever expects to gain from lying. That is, honest responses must form a Nash equilibrium in the auction game. To guarantee that no bidder has any incentive to lie about his value estimate, the following incentive-compatibility conditions must be satisfied: ∀i ∈ N , ∀si , ti ∈ [ai , d i ], bi (17.18) U i ( p , x, ti ) ≥ si T ∫ [ti pi (t − i , si ) − xi (t − t , si )] f − i (t − i )dt − i −i given ti is the true value estimate of bidder i. We call (p, x) a feasible auction mechanism if and only if equations 17.16–17.18 are all satisfied. Given an auction mechanism (p, x), we define: Qi ( p, t i ) = ∫ pi (t ) f −i (t −i )dt −i (17.19) T− i
  15. Intelligent Flow Control under a Game Theoretic Framework 321 to be the conditional probability for any bidder i that he will get ri resource capacity from the auction mechanism (p, x), given that his value estimate of unit resource capacity is ti. Our first result is a simplified characterization of the feasible auction mechanism. Lemma 1 (p, x) is a feasible auction mechanism if and only if the following conditions hold: ∀i ∈ N , ∀si , t i ∈ [ai , d i ] , if si ≤ t i , then Qi ( p, si ) / s i ≤ Qi ( p, t i ) / t i (17.20) ti Qi ( p, s i ) ∀i ∈ N , ∀t i ∈ [ai , d i ] , U i ( p, x, t i ) = U i ( p, x, ai ) + bi ∫a i si ds i (17.21) ∀i ∈ N , U i ( p, x, ai ) ≥ 0 (17.22) and together with equation 17.16. The proof is given in the appendix. Note that in a resource auction, there also exist many non-direct revelation approaches. For example, the Progressive Second Price (PSP) auction proposed in Milgrom (1981) is an auction rule with a 2-dimensional message space, and the author shows that this auction has a Nash equilibrium, hence the design objective is met at equilibrium. In this case, a PSP auction is feasible as ai = di in equation 17.20 when complete information of the valuations among all bidders is known to all participants (often called common knowledge). However, Lemma 1 may not apply to more general cases of a divisible auction unless one can guess the right mapping of direct revelation to desired mechanism and build it into the allocation rule from the start, as in Wurman (1997) . 17.3.4 Optimal Auction Design for Allocation of Arbitrarily Divisible Resource In the following, we will show the optimal condition for the auction design problem in the set of feasible mechanisms defined previously. With a simple regularity assumption, we can compute the optimal auction mechanism explicitly in an efficient way. This is important for the implementation of an auction game in a decentralized network control environment. Lemma 2 Suppose that p: T→Rn satisfies that ∀i ∈ N, t ∈ T, p maximizes   bi  1 − Fi (t i )    ∫ i∑ ti ti − t0 −  ∈N    pi (t ) f (t )dt f i (t i )    (17.23) T subject to the constraints of equations 17.16 and 17.20. Assume also that
  16. 322 Telecommunications Optimization: Heuristic and Adaptive Techniques  ti pi (t −i , si )  xi (t ) = ti  pi (t ) −  ∫a i si dsi   (17.24) Then (p, x) represents an optimal auction. With a simple regularity assumption, we can compute an optimal auction mechanism directly from Lemma 2. We say that the auction problem with a non-empty feasible mechanism is regular if for any i ∈ N, the function: 1 − Fi (t i ) ci (t i ) = t i − f i (t i ) is strictly increasing in ti. Now consider that the manager keeps the total resource if: t 0 > max c j (t j ) j∈N otherwise he first gives ri capacity of the resource to bidder i who has the highest ci(ti) with a probability pi (t ) = ti / d i if C–ri ≥ 0. If C–ri < 0, the manager only allocates capacity of an amount C to that bidder with the same probability. If: ci (t i ) = c j (t j ) = max c k (t k ) k∈N then the manager may break the tie by considering the bidder requiring smaller quantity of resource first, or by some other allocation rule. Ties will only happen with zero probability in the regular case. Given that there is still a certain amount of resource left after the first allocation, the manager can choose the bidder with the second highest ci(ti) from among the remaining bidders according to the probability constraints (17.20), and perform allocation recursively until the maximal value of ci(ti) is less than t0 or there is no capacity left at all. Let M = {1,2,L , m} be the re-ordered index of bidders satisfying that ∀i, j ∈ M, i < j; we have ci(ti) ≥ cj(tj) ≥ t0. Thus, the specific auction rule given above can be defined as follows:  +   i −1    ∑ Eri = ri ∧ C − r j p j (t )  pi (t )   (17.25)   j =1    where Eri means the expected amount of resource that bidder i will get in the sense of repeated games of this auction. The probability assignment has the form ∀i ∈ N: pi (t −i , si ) = si / d i (17.26) if bidder i is chosen by the auctioneer to have non-zero probability to win (i.e. i ∈ M ).
  17. Intelligent Flow Control under a Game Theoretic Framework 323 Theorem Given the auction problem with regularity assumption defined above, the auction rule (p, x) satisfying equations 17.24–17.26 represents an optimal auction mechanism. Several remarks are worth noting. In the auction considered above, any bidder needs to pay if he has a non-zero probability to win a certain amount of resource capacity. The money (whether relating to real money or ‘funny money’ based on quotas) he pays to the auctioneer is the minimal price he needs to have the same probability to win the required capacities as that assigned when he bids at his value estimate given his opponents' profile. For identical bidders (their ai, di, fi, bi are the same), the manager's rule is like generalized Vickrey auction (Vickrey, 1961), but there still exists economic inefficiency in the optimal auction mechanism due to incomplete information (lack of common knowledge) among the bidders and the auctioneer. Any bidder will have some risk of not getting any resources at all, even when there exists some resource capacity larger than he needs and his opponents bid lower than his value estimate of unit capacity. Note also that the network manager will –1 bid like a resource buyer by marking his reserved price to ci (t0) for unit capacity. For example, suppose there are 10 bidders with a budget of 1000 each to buy 10 unit resources in an auction game. To model the incomplete information among the bidders, we let ai = 100, di = 1000, t0 = 100, fi = 1/900. By straightforward calculation, we get the manager’s reserved price of 550, which is much higher than his own value estimate. This price does not change when the numbers of bidders and each bidder’s budget limit varies. Implementing the auction rule given in equations 17.24–17.26 used by the manager in a network environment is just like that in Lazar and Semret (1997), with minor modifications. An extended version of a PSP auction with a simulation study in a complicated network environment can be found in Lazar and Semret (1998). Our approach can also be applied to the decentralized allocation of network resources with the assumption of the bidders’ probability density functions and their budget limits as common knowledge. The auction game can be repeatedly played by the same users or some kind of software agents. The users and the manager can set an a priori probability density function and tune the function in the repeated games. If the manager has a channel to send a message indicating the users’ demand (e.g. average price of the unit capacity from previous bidding), the users will tune their bidding seals smartly so that the whole game will converge to the desired equilibrium. 17.3.5 Simulation Study of Individual Learning to Form Common Knowledge An issue of obvious concern is how the convergence of common knowledge scales with the number of bidders, as well as each bidder’s bidding strategy in the repeated game using the proposed auction rule. It should be noted that, in the repeated auction game, both the bidder and the auctioneer have an incentive to modify their valuation of the probability density function of their opponent's bidding profile, according to the history information. The dynamic behavior of the auction game will be studied experimentally in the following. We use TREX (implemented by Java – see Semret (1996)) as the prototype to simulate an interactive auction game using software agents. Unlike the scenarios of interconnected networks studied in Lazar and Semret (1998a), our simulations are limited to a single route
  18. 324 Telecommunications Optimization: Heuristic and Adaptive Techniques with a divisible resource capacity, and add learning strategies to the bidders as well as the auctioneer to formulate certain system dynamics (e.g. convergence of probability density to the true value estimation of the unit capacity). Detailed descriptions are in Chen (1998). In computer simulation, the resource capacity is normalized to 1, and five bidders with a budget constraint of 0.5 each enter the repeated auction game with incomplete information on their opponents’ true value estimations. Assume the bidders are named from 1 to 5 and their true value estimations of unit capacity are 0.6, 0.7, 0.8, 0.9 and 1.0, respectively. Initially, the bidders and the auctioneer assume their opponents' true value estimations to be uniformly distributed in [0, 10]. In every round of the auction game, the auctioneer performs the optimal auction rule derived in section 3.4 to allocate resources to the bidders. At the end of each round, the bidders and the auctioneer use learning strategies to tune their model of the opponents’ Probability Density Functions (PDFs) of true value estimation. We expect that the PDFs of each bidder's valuation (as well as the auctioneer's valuation of each bidder) will converge to the true value estimate in the repeated auction games. The learning strategy for the bidders and the auctioneer is very important to the formation (and perhaps the evolution) of common knowledge. There exist various bidding strategies optimizing the expected utility of each bidder in a non-cooperative manner. For simplicity, we only simulate two strategies: a Parzen window and a neural network approach to model the learning behavior of each participant in a repeated auction. To modify the opponents’ PDFs, each bidder needs to collect the history information of the opponents’ bids, and maximizes his own utility by setting the best sealed bid for the next round. In Parzen window learning, the PDFs are updated by using the new information of the opponents’ bidding seals after each round of the auction. In the neural network approach, each bidder uses a feedforward neural network to predict his opponents' bids according to the bids collected historically. Backpropagation training is used to tune the parameters of the neural network after 10 rounds of the auction. The PDFs are updated according to the empirical risk minimization rule. Extensive discussion on the simulation study of an optimal repeated auction can be found in Chen (1998). After 500 rounds of the auction, we pick bidder 5 to see his modified PDFs from bidders 1 to 4. Other bidders use the same type of learning strategies and yield similar dynamic behaviors. The simulation result using the Parzen window approach is given in Figure 17.1, and the result using the neural network approach is shown in Figure 17.2. Similar results are derived for bidders 1 to 4 and the auctioneer, and the plots are omitted. We can see that the peaks of the PDFs from bidders 1 to 4 are centered approximately at the bidders’ true value estimates of unit capacity after enough rounds of the repeated auction game. The neural network approach yields better estimation of true PDFs than the Parzen window in the same number of rounds. We expect that a good learning strategy may speed up the formation of common knowledge in the repeated auction game. In the extreme case, all the bidders and the auctioneer know their opponents' true value estimates of the unit capacity, therefore, the auction game reduces to a generalized Vickrey auction with complete information. As a limit of the repeated game, the best strategy for the bidder is simply to bid at his truthful value estimate with maximal possible capacity according to budget constraints. The allocation is economically efficient in that the bidder with the highest value estimate of unit capacity gets his resources first, and the network manager allocates the resources obeying exactly the order of the value estimates in sequence from the sealed bids.
  19. Intelligent Flow Control under a Game Theoretic Framework 325 Figure 17.1 The PDFs estimation from bidder 5 using the Parzen window. Figure 17.2 The PDFs estimation from bidder 5 using the neural network approach.
  20. 326 Telecommunications Optimization: Heuristic and Adaptive Techniques 17.4 Concluding Remarks In this chapter, we propose a game theoretic framework to analyze flow control and resource allocation in high speed networks. We organize our work in two parts. The first part deals with the connection admission control problem in ATM networks. By taking dynamic CAC from a different point of view, we propose a cooperative game model that considers a certain combination of user preferences to optimize the call block probability while maintaining the negotiated QoS. After deriving the product form of the utility function of different traffic types as user preferences, we use a genetic algorithm to optimize this objective function to maintain fair shares during the traffic transmission along each connection. The second part of the work tries to model the non-cooperative user behavior in resource allocation under a generalized auction framework. We propose an optimal auction rule that leads to equilibrium in a repeated dynamic game. We use the Parzen window and neural networks to model the user strategies in tuning the PDFs of the opponents, and briefly discuss the formation of common knowledge in the repeated game, as well as the allocation efficiency. Contrary to our CAC scheme as a centralized network control mechanism, the auction game to allocate network resources is suitable for the decentralized control environment. The common trait for these two parts is that we use some computational intelligence techniques to solve the complicated modeling and optimization problem in resource allocation. Our future work will be to study the interactions of bidders’ strategies with non-spot entrance and leaving, and to investigate the conditions to maintain a stable auction game with maximal allocation efficiency. Applications of our approach to the economic systems are also under investigation. 17.5 Appendix 17.5.1 Proof of Lemma 1: Note that: bi bi si ∫ [t i p i (t −i , si ) − xi (t −t , si )] f −i (t −i )dt −i = U i ( p, x, s i ) + si (t i − s i )Qi ( p, si T− i Thus, the incentive compatibility (equation 17.18) is equivalent to: bi ∀i ∈ N , ∀s i , t i ∈ [ai , d i ],U i ( p, x, t i ) ≥ U i ( p, x, s i ) + (t i − s i )Qi ( p, s i ) (17.27) si Using equation 17.27 twice with the role of ti and si switched, we get bi (t i − si ) bi (t i − si ) Qi ( p, si ) ≤ U i ( p, x, t i ) − U i ( p, x, s i ) ≤ Qi ( p, t i ) si ti
Đồng bộ tài khoản