After an HLR failure, the data in the backup are reloaded into the HLR. Data that have been changed in the period between the last backup checkpointing and restart of the HLR (this period is referred to as the uncovered period) cannot be recovered. Thus, the following HLR restoration procedure is executed.
HLR Restoration Procedure
Step 1 -- The HLR sends a database reset message to the VLRs where its MSs are located.
Step 2 -- The VLR derives all MSs of the HLR, and for each MS the VLR sends a registration message to the HLR. After the location update operation, the HLR record is recovered.
Note that the above HLR restoration procedure is not robust because during the uncovered period, an MS may move into a VLR unknown to the HLR at the last checkpointing time. If so, the HLR will not be able to locate the VLR of the MS at Step 1 of the HLR restoration procedure.
Let
be the probability that no MS (of the HLR) is in a VLR at the last HLR checkpointing time before an HLR failure, and some MSs have moved in the VLR before the failure occurs. In other words,
is the probability that the HLR fails to request update information from a VLR (which has updated location information during the uncovered period) in the restoration process. To compute it we make the following assumptions:
-
The MS arrival to the VLR is a Poisson process with the arrival rate
.
-
The VLR residence time (the period an MS resides in a VLR) is a general distribution with mean 30 min.
-
The checkpointing period at the HLR is a constant TH.
-
The interval between two failures is much longer than TH.
Following the above assumptions, we conduct several simulation experiments. In the simulation model, we generate the uncovered periods and the user arrivals to the VLR during these uncovered periods. Then we check if these arrivals will eventually leave the VLR at the HLR failure. According to [9], the uncovered periods are uniformly distributed in (0, TH]. The user arrivals are a Poisson process (and the interarrival times are exponentially distributed). We consider various user residence time distributions such as exponential, uniform, and Gamma, with different variances. We found that the results are insensitive to user residence time distribution. Every simulation run consists of 2 million user arrivals to ensure that the simulation experiments are not affected by the initial and final effects. Based on the simulation experiments, Table 1 lists
as a function of TH and
where the mean VLR residence time is 30 minutes. The table indicates that when the call arrival rate
is low,
is high. The table also shows that
increases as TH decreases because
is determined by two factors:
-
The probability p1 that no users of the HLR are in the VLR at a checkpointing
-
The probability p2 that some users enter the VLR after the checkpointing, but do not leave at the failure
For a fixed user arrival rate (
) and average VLR residence time, both p1 and p2 decrease as TH increases. Thus,
decreases as TH increases.
Another problem of the GSM HLR restoration procedure is that the location information of a VLR may not be changed during the uncovered period. However, the HLR still asks the VLR to resend the information, which results in unnecessary overhead. This situation may occur if all MSs entering a VLR during the uncovered period eventually leave the VLR before the HLR fails. Let ß be the probability that such a situation occurs. From the simulation experiments, Table 2 shows that ß is large for small
. We also observe that ß increases as TH decreases because probability ß is determined by two conflicting factors:
-
The probability p3 that no user enters the VLR during the uncovered period. This probability increases as TH decreases.
-
The probability p4 that some users enter the VLR, and all these users leave the VLR at the HLR failure. If TH is short, the entering users are unlikely to leave at the HLR failure, or p4 decreases as TH decreases. We found that p3 has a more significant effect on p4 in the cases we study; thus, ß increases as TH decreases.
In summary, our study indicates that the standard GSM HLR restoration procedure will incur large network overhead if
is small (which result in large
and ß). Small
to a VLR is typical if the VLR and HLR are in different countries. Thus, it is desirable to identify the exact VLRs that should be contacted by the HLR after an HLR failure. We will describe an algorithm to address this issue.
The VLR Identification Algorithm
We have proposed an algorithm [8] to identify a superset of the VLRs to be contacted by the HLR after a failure. In this article, we propose an improved algorithm called the VLR Identification Algorithm (VIA), which can identify the exact VLRs to be contacted by the HLR except for one case: due to the transient behavior of message sending, at the time of HLR failure, this set of VLRs may be a bit larger than the set of "true" VLRs to be contacted by the HLR. To simplify the description, we assume that every VLR covers exactly one MSC. Extension of our algorithm to accommodate multiple MSCs is trivial. The VIA consists of three procedures: Procedure 1 for checkpointing, Procedure 2 for registration, and Procedure 3 for restoration. Procedure 1 periodically saves the HLR records into the backup, and Procedure 3 restores the HLR records when an HLR failure occurs. Procedure 2 performs the standard GSM registration operations and keeps track of the VLRs that have been modified since the last checkpointing. (In Procedure 3, HLR will contact these VLRs to obtain correct user location information after an HLR failure.) Procedure 2 is based on the principle of counting where the HLR counts the number of new MSs that have entered a VLR since the last checkpointing. If such MSs exist, the VLR is marked in a list in the backup. Thus, at an HLR failure, the VLRs to be contacted by the HLR is in this list (which can be obtained in the backup).
To implement the VIA, extra data structures are required in the HLR, as shown in Fig. 3. This figure only shows the fields of the HLR record required to exercise the VIA. In the backup, the extra data structure is a set VLR_List* of VLRs that have been modified during the uncovered period. After a HLR failure, the HLR only needs to send the reset messages to the VLRs in VLR_List*.
In the HLR, every record includes two extra fields, as illustrated in Fig. 3.
Two extra data structures are introduced to the HLR:
-
TS is the last checkpointing (backup) time.
-
VLR_Counter is a set of (VLR,Count) pairs, where Count represents the "effective number" of MSs entering the VLR during the uncovered period. In Fig. 3, there are three effective MSs in VLR V1. Note that an MS is not effective to the VLR if it has entered the VLR area but eventually left the area during the uncovered period. Also note that the VLRs in VLR_Counter are the VLRs in VLR_List*.
The VIA works as follows. The HLR records are periodically saved into the backup by using the following checkpointing procedure.
VIA Procedure 1: Checkpointing
Step 1 -- for every location entry p in HLR* do
HLR[p]*.VLR
HLR[p].VLR;
Step 2 -- TS
current time;
Step 3 -- for every location entry p in HLR do
HLR[p].ts
TS; HLR[p].PVLR
HLR[p].VLR;
Step 4 -- VLR_Counter
Ø, VLR_List*
Ø;
In the checkpointing procedure, every location entry is saved into the backup (Step 1). The clock TS is set to the time of checkpointing (Step 2). The timestamp filed ts of every location entry in HLR is set to TS to indicate that the last location of the MS was updated no later than the latest checkpointing time TS (see Step 3). The PVLR is set to the current VLR address of the MS. Finally, both VLR_Counter and VLR_List* are set to empty to indicate that no VLR has a new roaming MS at TS(Step 4).
Suppose that MS p moves into a VLR area Vnew at time t. Following the standard GSM registration procedure, a registration message is sent from Vnew to the HLR. The following procedure at the HLR is triggered to perform the registration operation.
VIA Procedure 2: Registration
Step 1 -- (Update HLR)
Step 1.1 -- Vold
HLR[p].VLR;
/* Vold is the last VLR visited by p*/
Step 1.2 -- HLR[p].VLR
Vnew;
/* Vnew is the current VLR visited by p*/
Step 1.3 -- told
HLR[p].ts;
/* told is the time when p entered Vold*/
Step 1.4 -- HLR[p].ts
t;
/* t is the time when p entered Vnew*/
Step 1.5 -- Send a deregistration message to cancel the VLR entry of p at Vold;
Step 1.6 -- Send an acknowledgment to Vnew;
Step 2 -- (Update the VnewCount field in VLR_Counter)
if HLR[p].VLR
HLR[p].PVLR then
/* Vnew is not the VLR visited by p at the last checkpointing; thus, p is a "new" visitor to the VLR after the last checkpointing */
Step 2.1 -- if VLR_Count[Vnew] exists then
VLR_Counter[Vnew].Count
VLR_Counter[Vnew].Count + 1;
Step 2.2 -- else create VLR_Counter[Vnew] and
VLR_List*[Vnew];
VLR_Counter[Vnew]
1;
/* Steps 2 increments the number of MSs that entered Vnew after the last checkpointing
*/
Step 3 -- (Update the Vold counter entry)
if told > TS and Vold != HLR[p].PVLR then
/*p enters Vold in the uncovered period and Vold is not the VLR visited by p at the last checkpointing */
Step 3.1 -- VLR_Counter[Vold].Count
VLR_Counter[Vold].Count – 1;
Step 3.2 -- if VLR_Counter[Vold].Count = 0 then
/* No effective MS is in Vold*/
Step 3.2.1 -- delete VLR_Counter[Vold] and VLR_List*[Vold];
/* Step 3 decrements the number of MSs entered Vold after the last checkpointing */
In the registration procedure, the location information of the MS is updated (Step 1.2), its location record at the old VLR Vold is canceled (Step 1.5), and Vnew is acknowledged (Step 1.6). The last update time told is saved to be used in Step 3.
At Steps 2 and 3 of Procedure 2, VLR_Counter[]is used to count the "effective" number of MSs that entered the VLRs during the period [TS, t]. Note that if the MS was in Vnew before TS (i.e., HLR[p].VLR = HLR*[p].VLR = HLR[p].PVLR), then the HLR may consider that the MS never moves out of the VLR, and there is no need to increment the VLR counter (and Step 2 is skipped).
If the MS entered Vold in the uncovered period (i.e., tsold > TS) and Vold is not the VLR visited by the MS at the last checkpointing (i.e., Vold
HLR[p].PVLR), then it implies that the movement into Vold is not effective because the MS has moved out of Vold at t. Thus, the Vold counter should be decremented by 1 as described in Step 3. If Vold is the VLR visited by the MS at the last checkpointing, the MS is never considered an effective MS (see Step 2). In this case, there is no need to decrement the Vold counter when the MS moves out of Vold.
If VLR_Counter[V].Count > 1, any update to VLR_Counter[V] will not invoke modification to VLR_List*[]. In other words, access to the HLR backup is avoided. The purpose of Procedure 2 is to avoid updating the backup for every registration operation.
After an HLR failure, Procedure 3 is executed to restore the HLR. In this procedure, the HLR restores the location entries from the backup and requests current status of the MSs from all VLRs that have updated the MS information between the last checkpointing time and the HLR failure time. Note that after the execution of VIA Procedure 2, VLR_List* in the backup contains the VLRs the HLR should contact after an HLR failure. Due to the transient behavior of message sending, at the time of HLR failure, this set of VLRs may be a bit larger than the set of "true" VLRs to be contacted by the HLR.
VIA Procedure 3: Restoration
Step 1 -- TS
current time;
Step 2 -- for every location entry p in do
/* The HLR recovers the MS records from the backup */
HLR[p].PVLR = HLR[p].VLR
HLR[p]*.VLR;
HLR[p].ts
TS;
Step 3 -- for every VLR entry V in VLR_List* do
/* The HLR initiates the standard GSM HLR failure restoration procedure */
Send a reset message to V;
Some observations of the VIA are given below:
-
The VIA does not modify the standard GSM MAP protocol. In other words, we do not change the standard message flow of the GSM MAP, and no new GSM MAP message type is created. The VLRs are not modified. We only modify the HLR internal macros (e.g., Steps 2 and 3 of Procedure 2) for the VIA operations.
-
In the VIA, the backup (specifically VLR_List*) is only modified at Steps 2.2 and 3.2.1 of Procedure 2. Let N be the average number of registrations and deregistrations before a backup operation (i.e., Step 2.2 or 3.2.1) is executed in the VIA. The larger the N value, the smaller the HLR backup cost. Table 3 indicates that the backup access cost of the VIA is low. For example, the backup operation is executed every 54 location update operations for TH = 5 hr, and
= 6/hr. Thus, the execution of the VIA is expected to be efficient.
-
The VIA is especially useful if the GSM system accommodates international roaming. VLRs in foreign countries are seldom accessed, and ß and
can be large, as illustrated in Tables 1 and 2. In Taiwan, a typical GSM operator has roaming agreements with over 20 foreign GSM networks, and may potentially access over 100 VLRs. If the HLR has to contact all these VLRs after a failure, the network cost (specifically message exchanges across countries) can be huge. Thus, implementing the VIA in the HLR is desirable.
Conclusion
This article presented failure restoration procedures in GSM. In general, VLR failure recovery largely consists of asking the HLR for the relevant information on a need-to-know basis. HLR failures are more serious, and recovery is primarily based on restoration from a backup. The standard GSM HLR failure restoration procedure is not robust because the HLR may not contact the right VLRs to access the updated location information. To identify the VLRs to be contacted by the HLR after its failure, we propose an efficient VLR identification algorithm. The algorithm maintains timestamped HLR records and keeps counting the effective number of MSs entering each VLR since the last HLR checkpointing. As a result, at an HLR failure recovery the VLRs that need to be contacted for MS location update can easily be identified to speed the failure restoration procedure.
Acknowledgments
Thanks to the three anonymous reviewers, the quality of this article has been significantly improved. Yi-Bing Lin's work was supported in part by National Science Council, Contract No. NSCNSC87-2213-E009-014. Shu-Chin Su's work was supported in part by the Wireless Communication Program, Ministry of Economics Affairs.
References
[1] ETSI/TC Rec. GSM 09.02, "Mobile Application Part (MAP) Specification, Version 4.8.0," 1994.
[2] Y.-B. Lin and S. K. DeVries, "PCS Network Signaling Using SS7," IEEE Pers. Commun., June 1995, pp. 44–55; see also http://liny.csie.nctu.edu.tw.
[3] ETSI/TC Rec. GSM 03.03, "Numbering, Addressing and identification," 1993.
[4] ETSI/TC, "Organisation of Subscriber Data," Rec. GSM 03.08, 1993.
[5] Y.-J. Cho, Y.-B. Lin, and C.-H. Rao, "Reducing the Network Cost of Call Delivery to GSM Roamers," IEEE Network, vol. 11, no. 5, Sept./Oct. 1997, pp. 19–25.
[6] ETSI/TC Rec. GSM 03.07, "Restoration Procedures, Version 4.2.0," 1993.
[7] ITU-T Rec. Q.1004, "Location Register Restoration Procedures," 1993.
[8] Y.-B. Lin, "Failure Restoration of Mobility Databases for Personal Communication Networks," ACM-Baltzer J. Wireless Networks, vol. 1, 1995, pp. 365–72.
[9] A. R. Noerpel, L. F. Chang, and Y.-B. Lin, "Performance Modeling of Polling De-registration for Unlicensed PCS," IEEE JSAC, vol. 14, no. 4, 1996.
Biographies
Ming-Feng Chang received B.S. and M.S. degrees in electrical engineering from National Taiwan University in 1982 and 1984, respectively, and a Ph.D. degree in computer science from the University of Illinois at Urbana-Champaign in 1991. He is currently an associate professor in the Department of Computer Science and Information Engineering, National Chiao Tung University, Taiwan, Republic of China. His research interests include VLSI system design, personal communications systems, and mobile computing.
Yi-Bing Lin [SM] received his B.S.E.E. degree from National Cheng Kung University in 1983, and his Ph.D. degree in computer science from the University of Washington in 1990. Between 1990 and 1995 he was with the Applied Research Area at Bell Communications Research (Bellcore), Morristown, New Jersey. In 1995, he was appointed full professor of the Department of Computer Science and Information Engineering (CSIE), National Chiao Tung University. He is now chair of CSIE. His current research interests include design and analysis of personal communications services networks, mobile computing, distributed simulation, and performance modeling.
Shu-Chin Su received an M.S. degree in computer science from Stevens Institute of Technology in 1982. She is currently an executive assistant to the general director, Computer Communication Research Laboratories, Industrial Technology Research Institute. She is also pursuing a Ph.D. degree from Sun Yat-Sen University. Her research interests include personal communications systems, distributed systems, and database management systems.