Copyright 1999 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

This article was published in the July 1999 issue of IEEE Communications Magazine.


Abstract
Traditionally, different networks were developed to handle voice, data, and video. The circuit-switched telephone network carried voice and the packet network carried data. Because these networks were deployed separately, different services were developed for each, such as voice mail in the telephone network and electronic mail on the Internet. With the revolution of multimedia in the computer industry, voice, video, and data are now being carried on both networks. Supplementary services, such as transfer and forwarding (which were originally developed for private telephone networks and later migrated to public telephone networks), are now being developed for packet networks. The standards for packet networks are being defined in the H.323-based series of ITU-T recommendations. This article describes the H.323 architecture for supplementary services, the differences in deployment of these services between the circuit-switched and packet-switched networks, and interworking of these services across hybrid networks.

 



Supplementary Services in the H.323 IP Telephony Network


Markku Korpi, Siemens AG
Vineet Kumar, Intel Corporation

 

The International Telecommunication Union Telecommunication Standardization Sector (ITU-T) H.323 [1] series of recommendations describes terminals, equipment, and services for multimedia communication over packet-based networks (e.g., IP networks). The recommendations cover the protocols necessary for the operation of this equipment and for its interconnection with circuit-switched networks. H.323 terminals and equipment may carry real-time voice, data, facsimile, and video, or any combination, including videotelephony. In this section we provide some basic information on H.323, including the entities and protocols involved, and the use of these entities and protocols to provide the basic service of making a call between two users.

Functional Entities in H.323

An H.323 terminal is also known as an H.323 client. It terminates signaling and payload at an end user. The client could be a multimedia PC, videophone, IP phone, or a terminal adapter that connects an analog phone or facsimile machine to the H.323 network.
A gatekeeper routes calls to their destinations. It may also provide back-end services such as address resolution (e.g., to map a phone number or another alias address to an IP address), admission control (e.g., through authentication, authorization, bandwidth control), and accounting. Clients register with the gatekeeper when they go on-line.
A router routes packets over the IP backbone network.
A gateway interconnects H.323 and switched circuit networks (SCNs). The gateway provides conversion of signaling protocols as well as media transmission formats. The basic applications of H.323 gateways include:
  • PSTN/H.324 gateway, which converts voice and facsimile calls between H.324-compliant public switched telephone network (PSTN) terminals and H.323 on the IP network
  • H.320 gateway, which converts multimedia calls between H.320 on an integrated services digital network (ISDN) and H.323 on the IP network
  • Fax/voice gateway, which converts calls between telephones and facsimile machines connected to the SCN (e.g., PSTN, ISDN, Global System for Mobile Communication -- GSM) and H.323 on the IP network
A conference server is used for multipoint voice and video calls. It is also known as a multipoint control unit (MCU). The signaling is always centralized at the server. The media can be centralized or distributed. When centralized, the server mixes the voice and selects the video for all the clients in the conference. When distributed, for example in networks that allow multicasting, all clients in the conference mix voice packets and select the video from all other clients in the conference.
A feature server is used to provide supplementary services such as calling card authorization or call pickup/call park. These types of features require execution of functions that are commonly accessible from the associated endpoints.
Endpoints terminate signaling and payload channels. IP phones, IP videophones, gateways, conference servers, and feature servers are examples of endpoints.

Description of a Hypothetical H.323 Network

An H.323 network is shown in Fig. 1. Three calls (labeled 1, 2, and 3) are shown between two domains (labeled X and Y). Domain X has three terminals named A, C, and E, and domain Y has three terminals named B, D, and F. The subscript of the terminal name shows the number of the call in which the terminal is involved. For example, A1 indicates that terminal A is involved in call number 1. The first call (between A and B) is a voice-only call between an H.323 endpoint and a GSM phone. The second call (between C and D) is a fax call between facsimile machines on the H.323 and PSTN networks. The third call (between E and F) is a multimedia call between H.323 and H.320 clients. The network contains back-end services that can be used, for example, for address resolution, authentication, authorization, accounting, and enforcement of policies.
Domain X is an H.323 network and domain Y provides connectivity to the SCN. The H.323, PSTN, ISDN, and GSM networks are interconnected through gateways to allow seamless interoperability among devices on these different networks. Traditional telephony devices such as analog phones and facsimile machines can be used on an H.323 network by connecting them to an H.323 terminal adapter. Such a terminal adapter is a small fax/voice gateway and may provide the look and feel of a central office to these traditional devices. H.323 signaling is routed via a gatekeeper but the payload is transferred directly between endpoints.

Protocols In H.323

Some of the signaling and media protocols in H.323 are:
  • H.225.0 defines signaling between two endpoints (e.g., client and client, client and conference server, client and gateway, client and feature server) for basic call setup and release [2]. This signaling is usually routed through a gatekeeper and is derived from Q.931 ISDN signaling by using a subset of the Q.931 messages and adding functionality in the user information fields of the H.225.0 messages. The Q.931 core makes interconnection with ISDN networks easier.
  • H.225.0 RAS defines signaling between the client and the gatekeeper for discovery of and registration to a gatekeeper by a client [2]. H.225.0 RAS is also used by an endpoint to receive admission from a gatekeeper to place or receive a call. Through the use of keep-alive messages, a gatekeeper can track the operational status of registered clients.
  • H.245 defines signaling between endpoints to exchange capabilities and carry end-to-end control messages governing operation of the H.323 entity [3]. The endpoints have a variety of capabilities, but the call established uses only the capabilities that are common to both endpoints. This signaling is routed through the gatekeeper when a gatekeeper is used in the H.323 network.
  • H.450 series defines signaling between endpoints for supplementary services [1]. This signaling is routed through the gatekeeper when a gatekeeper is used in the H.323 network.
  • H.225.0 Annex G defines signaling between edge gatekeepers in different domains to advertise addresses handled by their domain [1].
  • H.235 defines signaling for authentication of users and entities, and encryption of data [1].
  • RTP defines delivery of real-time data from one endpoint directly to another endpoint in the case of unicast, or to multiple endpoints in the case of multicast [4].
  • H.341 defines the management information base (MIB) for controlling and managing an H.323 network [1].

Basic Call Completion In H.323

The following steps demonstrate how a basic call is made from client A to client B, as shown in Fig. 1. It is assumed that client A and the gateway representing client B have discovered and registered with their respective gatekeepers X and Y. It is also assumed that through inter-domain signaling, gatekeeper X knows that gatekeeper Y will complete calls destined for client B. So, when client A sends a setup message to its gatekeeper X with the alias address (e.g., telephone number, IP address, email address) of client B, gatekeeper X uses its address resolution server to find the IP address of gatekeeper Y. Gatekeeper X then forwards the setup message to gatekeeper Y, which then sends it to client B. If the user at client B answers the call, client B returns a connect message via the two gatekeepers to client A. The connect message triggers the gatekeepers to start the accounting for the call. The message exchange has enough information to allow the two clients to start sending multimedia packets to each other.
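The call flow described above can be pictured with a small simulation. This is only an illustrative sketch, not an implementation of H.225.0: the class and function names (Gatekeeper, forward_setup, place_call) and the endpoint labels are assumptions made for the example, and real messages carry far more information than a list of hops.

```python
from dataclasses import dataclass, field


@dataclass
class Gatekeeper:
    """Toy gatekeeper holding registrations and inter-domain knowledge."""
    name: str
    registered: dict = field(default_factory=dict)  # alias -> local endpoint
    remote: dict = field(default_factory=dict)      # alias -> peer Gatekeeper
    accounting: bool = False

    def forward_setup(self, alias):
        """Return the hops a setup message traverses from this gatekeeper."""
        if alias in self.registered:
            return [self.name, self.registered[alias]]
        if alias in self.remote:
            # Address resolution points at the gatekeeper of another domain.
            return [self.name] + self.remote[alias].forward_setup(alias)
        raise LookupError(f"{self.name} cannot resolve alias {alias!r}")


def place_call(caller, alias, home_gatekeeper, gatekeepers):
    """Route setup to the callee, then return connect along the reverse path."""
    setup_path = [caller] + home_gatekeeper.forward_setup(alias)
    connect_path = setup_path[::-1]
    for gk in gatekeepers:
        if gk.name in connect_path:
            gk.accounting = True  # the connect message triggers accounting
    return setup_path, connect_path


# Hypothetical topology: client B is reached through a gateway in domain Y.
gk_y = Gatekeeper("gatekeeper-Y", registered={"client-B": "gateway-B"})
gk_x = Gatekeeper("gatekeeper-X", remote={"client-B": gk_y})
setup, connect = place_call("client-A", "client-B", gk_x, [gk_x, gk_y])
print(" -> ".join(setup))
```

The sketch keeps only the routing decision: a gatekeeper either delivers the setup to a registered endpoint or hands it to the gatekeeper that handles the alias.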

Supplementary Services Architecture

Protocol Principles for Supplementary Service Control

The most important requirement for control of supplementary services in H.323 is that the protocol actions are performed using functional peer-to-peer signaling. The protocol is designed in such a manner that the functional entities communicate with their peer entities (servers, clients, or gateways) directly without assuming network intervention. Additional requirements are:
  • Use of an open, standardized protocol to enable vendor independence for the customer
  • Independence from the network configuration and topology
  • Ease of interworking with existing QSIG-based [5] PBX networks
  • Ease of interworking with public ISDN networks such as E-DSS1 [6] and National ISDN in North America [7] and CCS No.7 ISUP [8]
  • An extensible and transparent protocol that allows manufacturer-specific supplementary services and additional network features to be offered as competitive advantages
  • Ease of use of supplementary services as components by applications via application programming interfaces (APIs)

A Case for QSIG as the Protocol for Supplementary Services

The protocol that comes closest to fulfilling the requirements mentioned above is the supplementary services protocol of "QSIG." It is available as a worldwide standard from the International Organization for Standardization/International Electrotechnical Commission Joint Technical Committee 1 (ISO/IEC JTC1) for private ISDN telecommunication networks and is being standardized for B-ISDN as well. QSIG is a modular protocol containing a generic transport protocol and individual transaction protocols for each supplementary service. It is an end-to-end, peer-to-peer protocol and is extensible for future needs and for support of vendor-specific and network-specific features. The signaling mechanisms used in QSIG supplementary services are very similar to those used in the ISDN DSS1 and ISUP protocols. This means that the experience of the industry in implementing interworking functions between the various signaling systems can be reused. As a result, the migration path from current PBX networks to H.323-based enterprise networks becomes more straightforward, and the network carriers can utilize the signaling transparency for their value-added purposes.

Basic Supplementary Services

Supplementary services in H.323 are specified in a multi-tier approach. Basic services consist of building blocks or primitives from which more complex services can be developed. Compound services are developed from two or more basic services. Both basic and compound services are used by applications to provide multimedia services to an end user. Since the telephony industry has gained long experience in identifying valuable services, a large subset of these services has been specified as basic services in the H.450 series of recommendations. Some basic services are:
  • Multiple call handling -- allows a multimedia client to handle multiple calls simultaneously.
  • Call transfer -- enables a user A to transform an existing call (user A -- user B) into a new call between user B and a user C selected by user A. Usually a conversation between users A and C precedes the transfer, but it is not necessary.
  • Call forwarding -- also known as call diversion, it comprises the services call forwarding unconditional, call forwarding busy, call forwarding no reply, and call deflection. It applies during call establishment by providing a diversion of an incoming call to another destination alias address. Any of the above variants of call forwarding may operate on all calls, or selectively on calls fulfilling specific conditions programmed or manually selected by the user.
  • Call hold -- The simplest form of hold is called near-end hold. In this case the holding client A provides some form of media on hold (MOH) until the held client B is subsequently retrieved. The other form of hold is remote-end hold. In this case the holding client A sends a hold request to the remote client B requiring the held client B to provide MOH to the held user. The holding user A may perform other actions while user B is being held, e.g., consulting with another user C.
  • Call park and pickup -- Call park is a service that enables the parking user A to place an existing call with user B (the parked user) in a parking position and to pick up the call later from the same or another terminal. The pickup service can also be used to answer calls in an alerting group.
  • Call waiting -- permits a served user who is busy with one or more calls to be informed of an incoming call by an indication. The user then has the choice of accepting, rejecting, or ignoring the waiting call. The user calling the busy party is informed that the call is a waiting call at the called destination.
  • Message waiting indication -- provides general-purpose notifications of waiting messages, including the number, type, and subject of the messages. The priority of the highest priority message is also provided.
  • N-way conferencing -- allows a multiparty conference to be established. This can happen as a result of two or more simultaneous calls being merged into one conference or as a result of an initial two-party call later being expanded into a conference. The limit on the number of participants in a conference is usually based on the policy of the entity hosting the n-way conference.

Architectural Model

In H.323, services are distributed across endpoints, based on the suitability of the service at that endpoint, as shown below:
  • An H.323 client maintains the states of the calls it is handling. Some of the services that are suitable for implementation in a client are multiple call handling, call transfer, call forwarding, call hold, call waiting, message waiting, and n-way conferencing.
  • A conference server maintains the state of the calls that it is handling. The service it provides is n-way conferencing.
  • A feature server implements services that are not suitable for client implementation. Services such as call park/pickup are used in an automatic call distribution environment when calls are not directed to any user but to the first available agent with a specific skill. Such services are best implemented in a feature server, which then interfaces with a group of clients.
Another service that may be implemented in a feature server is calling card authorization. Such a service is also not directed at a specific user. Once a call has been authorized, the feature server can transfer the call to the specified user by using call transfer.
A feature server can also be used as a proxy or secondary client for those clients that are non-operational (e.g., powered-down). Upon detection of the operational failure of a primary client, the gatekeeper notifies and routes all calls destined for the non-operational client to the feature server. The proxy could then provide such services as call forwarding and messaging (for voice, facsimile and electronic mail).
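The proxy behavior just described can be sketched as follows. The sketch assumes that the gatekeeper judges a client non-operational when no keep-alive has arrived within a timeout; the class name ProxyingGatekeeper, the timeout value, and the endpoint labels are hypothetical and are not taken from the H.323 recommendations.

```python
class ProxyingGatekeeper:
    """Routes calls to a feature server when the primary client is silent."""

    def __init__(self, keepalive_timeout, proxy):
        self.keepalive_timeout = keepalive_timeout  # seconds (assumed unit)
        self.proxy = proxy                          # feature server acting as proxy
        self.last_seen = {}                         # client -> time of last keep-alive

    def keep_alive(self, client, now):
        """Record a keep-alive message received from a registered client."""
        self.last_seen[client] = now

    def is_operational(self, client, now):
        seen = self.last_seen.get(client)
        return seen is not None and now - seen <= self.keepalive_timeout

    def route(self, client, now):
        """Return the endpoint that should receive a call destined for client."""
        return client if self.is_operational(client, now) else self.proxy


gk = ProxyingGatekeeper(keepalive_timeout=30, proxy="feature-server")
gk.keep_alive("client-B", now=0)
print(gk.route("client-B", now=10))   # client still considered operational
print(gk.route("client-B", now=100))  # keep-alive expired, proxy takes the call
```

Time is passed in explicitly so that the routing decision stays deterministic and easy to test.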

Use of Basic Services

The previous sections described a number of basic services specified in H.323. This section shows how compound services can be built from basic services, and how these services can be used effectively by multimedia applications for the end user. Three examples are shown: consultation transfer, programmable call forwarding, and conference out of consultation.

Consultation Transfer

For a consultation transfer, the user needs to do the following:
  • Put a multimedia call on hold and retrieve it later.
  • Call another person and possibly alternate between the two calls.
  • Transfer the two calls together.
Figure 2 shows a signaling flow for one consultation transfer scenario. Note that while the first call between A and B is held, the initiating client A establishes a second call to client C of the consulted party. Since there is no fixed limit on the number of H.323 calls a client can make simultaneously, this second call can usually be established in the same manner as the first call -- as a basic call. Since party A is in two calls, one being held and another being active, party A may alternate between these two calls simply by putting the active call on hold and retrieving a previously held call.
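The hold, consult, alternate, and transfer steps can be pictured as operations on the call state kept in client A. The following is a minimal sketch under that assumption; the class ConsultingClient and its method names are invented for the illustration and do not correspond to H.450 message names.

```python
class ConsultingClient:
    """Tracks one client's calls as remote party -> 'active' or 'held'."""

    def __init__(self):
        self.calls = {}

    def _party_in(self, state):
        matches = [p for p, s in self.calls.items() if s == state]
        if len(matches) != 1:
            raise RuntimeError(f"expected exactly one {state} call")
        return matches[0]

    def call(self, party):
        """Establish a new basic call; any existing call must be on hold."""
        if "active" in self.calls.values():
            raise RuntimeError("put the active call on hold first")
        self.calls[party] = "active"

    def hold(self, party):
        if self.calls.get(party) != "active":
            raise RuntimeError(f"no active call with {party}")
        self.calls[party] = "held"

    def retrieve(self, party):
        if self.calls.get(party) != "held":
            raise RuntimeError(f"no held call with {party}")
        if "active" in self.calls.values():
            raise RuntimeError("put the active call on hold first")
        self.calls[party] = "active"

    def alternate(self):
        """Swap the active and the held call."""
        active, held = self._party_in("active"), self._party_in("held")
        self.hold(active)
        self.retrieve(held)

    def transfer(self):
        """Join the held and active parties; this client leaves both calls."""
        pair = (self._party_in("held"), self._party_in("active"))
        self.calls.clear()
        return pair


a = ConsultingClient()
a.call("B")
a.hold("B")
a.call("C")           # consultation call while B is held
a.alternate()         # B active, C held
a.alternate()         # B held, C active
print(a.transfer())   # B and C are now connected to each other
```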

Programmable Call Forwarding

In traditional telephony, a user is provided with only a few choices by which a call can be forwarded. But in H.323, forwarding can be made dependent on a variety of conditions, such as the state of the called party -- busy, no reply, absent; the caller identification; the time of day or day of the week. For each scenario, the user can program the forwarding of incoming calls to different destination addresses. Programming of the destination address can be done locally at the home client or by remote programming via a connection to the home client.
As shown in Fig. 2 at the signaling level, call forwarding takes place by rerouting the call to the new destination under control of the forwarding multimedia client. After receipt of a setup message, client B sends a signaling message containing the forwarding destination address of client C to the calling client A. The calling client A then establishes a connection directly to the forwarded-to client C.
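One way to picture programmable forwarding in the forwarding client is an ordered list of rules in which the first matching rule supplies the destination. The sketch below is an assumption about how such programming might be represented; ForwardingRule, forwarding_destination, and the destination names are hypothetical and are not defined by H.450.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass(frozen=True)
class ForwardingRule:
    """A rule matches when every condition that is set (not None) holds."""
    destination: str
    state: Optional[str] = None            # "busy", "no_reply", or "absent"
    caller: Optional[str] = None           # caller identification
    start: Optional[time] = None           # time-of-day window, start inclusive
    end: Optional[time] = None             # time-of-day window, end exclusive
    weekdays: Optional[frozenset] = None   # 0 = Monday ... 6 = Sunday

    def matches(self, state, caller, when):
        if self.state is not None and self.state != state:
            return False
        if self.caller is not None and self.caller != caller:
            return False
        if self.weekdays is not None and when.weekday() not in self.weekdays:
            return False
        # The time window is applied only when both bounds are given.
        if self.start is not None and self.end is not None:
            if not (self.start <= when.time() < self.end):
                return False
        return True


def forwarding_destination(rules, state, caller, when):
    """Return the destination of the first matching rule, or None."""
    for rule in rules:
        if rule.matches(state, caller, when):
            return rule.destination
    return None


rules = [
    ForwardingRule("voice-mail", state="busy"),
    ForwardingRule("mobile", weekdays=frozenset({5, 6})),
    ForwardingRule("assistant", caller="client-A", start=time(9), end=time(17)),
]
print(forwarding_destination(rules, "busy", "client-D", datetime(1999, 7, 5, 10, 0)))
```

When a rule matches, client B would return the chosen destination address to the calling client, which then sets up the call to that address itself.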

Conference Out of Consultation

The steps in a consultation conference scenario are call hold, consultation, possibly alternating, and conference. The steps involving hold, consultation, and alternating are described above. For conferencing, the signaling consists of client A transferring its calls with B and C to a conference server, and then making its own call to the conference server. In H.323, the address of the conference server is declared by each client as part of its capability exchange at call setup time. The conference server could reside in client A or be a resource of the network.
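The conference step can be written down as an ordered list of signaling actions. This is a schematic sketch only: the function name, the tuple layout, and the server address "mcu" are assumptions made for the example.

```python
def conference_out_of_consultation(initiator, held_party, active_party, server):
    """Return the signaling steps and the resulting conference members.

    Each step is a tuple (action, party, target): the initiator transfers
    both of its calls to the conference server and then calls the server.
    """
    steps = [
        ("transfer", held_party, server),
        ("transfer", active_party, server),
        ("call", initiator, server),
    ]
    members = [party for _, party, _ in steps]
    return steps, members


steps, members = conference_out_of_consultation("A", "B", "C", "mcu")
for step in steps:
    print(step)
```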

H.323 and Public ISDN Models for Supplementary Services

The Two Models

The H.323 model for supplementary services uses the internet/intranet model where the network does the routing and applications run in the endpoints (e.g., desktop computers, servers, and gateways). In this peer-to-peer model, both the payload and the signaling are sent transparently through the network without requiring processing by any network entities.
In contrast, the ISDN model follows the traditional telephony model, where intelligence for signaling resides in the network and the payload is routed end-to-end. Even though ISDN DSS1 signaling assumes intelligence in the endpoints, the endpoints are "slaves" of the network and cannot perform supplementary services with other ISDN endpoints without network involvement. The only exception is the user-to-user signaling service, but it is of limited use since its contents are not standardized and the ISDN service providers place strict rules on its use. Some ISDN service providers also deploy stimulus control of supplementary services (e.g., National ISDN in North America) [7].

Comparisons of the Two Models

The two models differ in the way they maintain the state of the calls. In the telephony model, the state of all calls handled by the network is centralized in the network. In the H.323 model, the state is distributed to the endpoints involved in the call.
The two models also differ in the manner in which services are deployed. The ISDN network is intelligent, so services are deployed in the network and then offered to end users by a service provider. This dependence of ISDN signaling on the network leads to a long chain of dependencies before services become available to the end users. First, the standards must be developed and implemented before they can be deployed in the ISDN network exchanges. Next, the CCS No.7 [8] must be updated in the inter-exchange network. This involves a large up-front cost in the network infrastructure for the service provider before any return on investment can be achieved from the users of the service. In contrast, services for H.323 can be developed by any manufacturer and sold directly to the end user for deployment. This translates into low-cost entry for the service provider.
Due to the differences in deployment of services, the model for charging for these services is also different. In the telephony model, customers are usually charged on a monthly basis. In the H.323 model, the customer pays for the software up front for unlimited use.
On the question of provisioning of new services, it is generally believed that the telephony model reduces some complexity by deploying new services at a centralized location in the network. This model works best for stimulus-type terminals, such as "black telephones." The stimulus protocol consists of keypad input and display output, which are intended to be interpreted by humans only, and not by applications. In H.323, on the other hand, new services reside in the endpoints or in their proxies and are distributed in much the same way as any other software that is sold in the market today.
The two models also deal with service incompatibility issues very differently. In the telephony model, the more capable network executes the service on behalf of the less capable terminal that does not have the needed service implemented for the call. This is true in the case of stimulus-type terminals. But in the case of the functional ISDN DSS1 protocol, a terminal equipment upgrade is needed anyway to use the new service. In H.323, the clients exchange their capabilities and only use those services that are common to both clients. Therefore, services that are present in one client are simply not used if they are not present in the other client involved in the call.
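In the H.323 case, the outcome of the capability exchange amounts to a set intersection. A minimal sketch, with service labels chosen only for the illustration:

```python
def common_services(local_capabilities, remote_capabilities):
    """Return the services both clients support, in a stable sorted order."""
    return sorted(set(local_capabilities) & set(remote_capabilities))


client_a = ["call_hold", "call_transfer", "call_waiting"]
client_b = ["call_transfer", "call_hold", "message_waiting"]
print(common_services(client_a, client_b))
```

Services outside the intersection, such as call waiting on one side or message waiting on the other, are simply not offered on this call.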

Operation of Supplementary Services over Hybrid Networks

As carriers, service providers, and business users start to deploy IP multimedia telephony, the new systems must be adapted to existing communication networks and they must be able to interwork with legacy communication equipment attached to the IP network. Without the historic load of legacy systems it would be much easier to design a new network architecture. Many simplifications could be made, but such an isolated, homogeneous environment exists only in rare cases. In this section, the network requirements for the operation of supplementary services are considered from both the corporate-network and carrier-network perspectives.
Figure 3 shows an example of a hybrid network. It shows some aspects of both a carrier and a corporate network. Native H.323 calls for voice, fax, and other multimedia can be made between any users of a corporate H.323 network without involvement of any gateways.
On all these calls, supplementary services are available if both endpoints support H.450 protocols and have subscribed to these services. The service provider may, for example, charge for the use of supplementary services or block the use of services to which the user has not subscribed. This could be controlled in the gatekeeper, which has access to back-end services containing a service subscription database.
Supplementary service functionality and availability on calls between legacy systems and an H.323 multimedia network depend on the capabilities of the entities involved in the call signaling along the path of the call. Again, the gateway provider may charge not only for the call duration but also for supplementary service usage.
Figure 4 shows the structure of a QSIG/H.323 gateway and the protocol stacks for the control plane and the user plane.

Terminal Adapters

Terminal adapters are used for the attachment of one or more legacy fax and telephone devices to an H.323 network. Their structure is analogous to the QSIG gateway described above except that the SCN side of the protocol stack is replaced with the appropriate analog-signaling scheme. All H.450 supplementary services that are necessary for the operation of the devices can be implemented within the terminal adapter. Invocation of features between the telephone and terminal adapter over the analog interface is performed using dual-tone multifrequency (DTMF) tones.

Gateway Interworking for Supplementary Services

As shown in Fig. 4, it is the task of the interworking layer of the gateway to map the supplementary service signaling procedures, messages, and protocol elements between the two networks. If the call models of the networks are similar, the mapping function does not need to be aware of the state of the underlying call, and becomes straightforward. But if they are not the same, the mapping function needs to keep track of the state of each side and perform state-dependent actions.
Actions on Nonmatching Signaling -- In cases where the SCN side of the gateway does not support a supplementary service, the gateway must reject or ignore an invocation of that service. Alternatively, the gateway may emulate the supplementary service on behalf of the SCN user. For example, a gateway may become the H.450.3 call forward rerouting point of a call. In cases where the H.323 side does not support a feature but the SCN side does, the gateway will act in a similar manner on behalf of the H.323 side.
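The choice the gateway faces for each invocation can be sketched as a small decision function. The names Action and gateway_action and the service labels are assumptions for the illustration; whether a rejected invocation is answered or silently ignored is left to local policy.

```python
from enum import Enum


class Action(Enum):
    MAP = "map to the far-side signaling"
    EMULATE = "emulate in the gateway"
    REJECT = "reject or ignore"


def gateway_action(service, far_side_services, emulated_services):
    """Decide how the gateway handles one supplementary service invocation.

    far_side_services: services the network on the other side supports.
    emulated_services: services the gateway can perform on that side's behalf.
    """
    if service in far_side_services:
        return Action.MAP
    if service in emulated_services:
        return Action.EMULATE
    return Action.REJECT


scn_side = {"call_hold"}
gateway_emulates = {"call_forwarding"}  # e.g., gateway acts as rerouting point
for service in ("call_hold", "call_forwarding", "call_park"):
    print(service, "->", gateway_action(service, scn_side, gateway_emulates).name)
```

The same function applies in the opposite direction by passing the H.323 side's services as the far side.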
Table 1 gives a guide for the correspondence between the different standards.
SCN Signaling Transport -- In cases of SCN-to-SCN calls via the H.323 network, the gateway may provide network-specific signaling transport (e.g., CCS No.7 [8], manufacturer-specific enhancements of QSIG, GSM [9], DPNSS [10], etc.) via the encapsulation mechanism provided in H.450.1 as shown in Fig. 5. This capability is very important, because it allows deployment of H.323 as a universal multimedia network protocol, while preserving full transparency for specific signaling of legacy networks.
An alternative approach would be that each gateway would transport its SCN signaling scheme directly on the IP. This creates a practically endless list of different methods of making calls over IP, none of which are compatible with any others. Interworking would be left to the terminating side, leading to the implementation of numerous protocols by each terminating gateway and endpoint implementation.
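The encapsulation approach, as opposed to carrying each SCN protocol natively over IP, can be pictured as wrapping the SCN message in an opaque envelope that only the two gateways interpret. The sketch below is a schematic illustration and not the H.450.1 encoding: the type TunnelledSignaling, the protocol labels, and the payload bytes are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TunnelledSignaling:
    """Opaque SCN signaling carried unchanged across the H.323 network."""
    protocol: str   # label identifying the SCN signaling scheme
    payload: bytes  # the SCN message, not interpreted in transit


def encapsulate(protocol, scn_message):
    """Originating gateway: wrap the SCN message without modifying it."""
    return TunnelledSignaling(protocol, bytes(scn_message))


def decapsulate(envelope, supported_protocols):
    """Terminating gateway: unwrap if the protocol is understood, else None."""
    if envelope.protocol not in supported_protocols:
        return None
    return envelope.payload


message = b"\x08\x02\x00\x01"  # placeholder bytes, not a real SCN message
envelope = encapsulate("QSIG", message)
print(decapsulate(envelope, {"QSIG", "DPNSS"}) == message)
```

Because the envelope is opaque, entities between the two gateways need no knowledge of the tunnelled protocol, which is the transparency property described above.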

Conclusion

The model and architecture of H.323/H.450 supplementary services described in this article are based on provisioning and distribution of functions between H.323 endpoints and servers. The architecture enables scalability, because clients and servers can be individually added to the network without mandating centralized provisioning entities. The cost of required processing power and memory in the endpoints is unlikely to be a limiting factor. We can learn from GSM that a uniform network infrastructure is most important, since it allows mass production of user equipment, which in turn will lower the prices. As an indication, H.323 telephones have already started to appear on the market.
The architecture is suitable for use by both corporate networks and new-generation telcos. The deployment of H.323 has already begun in corporate networks. Corporate users can in many cases immediately utilize their existing PCs, workstations, and servers in their H.323 infrastructure. Additionally, the QSIG-based core of H.450 enables smooth migration from currently installed PBX networks to H.323 multimedia networks. Thus, the goal of lowering the short-term and long-term cost and allowing easy, gradual growth of the network is reached.
In a similar manner, H.323 will enable new telcos to provide not only basic telephony and some related features, but also IP-based value-added services, such as unified messaging, call distribution, VPN, and one-number service. The H.450 supplementary services provide signaling methods and standard building blocks for such services. The H.323 gatekeeper-routed model supports any billing model for these network services. Existing analogue telephone and fax equipment can be connected to the H.323 network via terminal adapters and residential gateways. The incentive to use new telcos' services will very likely be based on their ability to offer a variety of services for a competitive total price.

References
[1] ITU-T Rec. H.323, "Packet-Based Multimedia Communications Systems," Geneva, Switzerland, Jan. 1998; (link to substandards H.450.x, H.235, etc.).
[2] ITU-T Rec. H.225.0, "Call Signaling Protocols and Media Stream Packetization for Packet-Based Multimedia Communications Systems," Geneva, Switzerland, Jan. 1998.
[3] ITU-T Rec. H.245, "Control Protocol for Multimedia Communications," Geneva, Switzerland, Jan. 1998.
[4] IETF RFC, "Real-Time Transport Protocol."
[5] ISO/IEC Std. CD 11579, "Reference configurations for PISN exchanges (PINX)," Geneva, Switzerland, 1994.
[6] ETSI Standard EN 300 403-1 V1.2.2, "Digital Subscriber Signaling System No. 1 (DSS1) Protocol; Signaling network layer for circuit-mode basic call control," Sophia Antipolis, France, Apr. 1998.
[7] Bellcore Std. SR-3875 Issue 3, "National ISDN 1999," May 1998.
[8] ITU-T Rec. Q.761, "Signaling System No.7 -- ISDN User Part Functional Description," Geneva, Switzerland, Sept. 1997.
[9] ETSI Std. GTS GSM 04.07 V5.0.0, "Digital cellular telecommunications system (Phase 2+); Mobile radio interface signaling layer 3; General aspects," Sophia Antipolis, France, Feb. 1996.
[10] BTNR 188: Digital Private Networking Signaling System No 1, London, U.K.

Biographies
Markku Korpi received his M.S. (1975) degree in computer science from Helsinki University of Technology, Finland. He is a chief systems architect at Siemens Information and Communication Networks, Munich, Germany. His current interests are in the design of IP-based multimedia networks. He has been active in the standardization of H.323 and has served as an editor of H.450 standards at the ITU-T.
Vineet Kumar received his B.S. (1980), M.S. (1982), and Ph.D. (1985) degrees from Iowa State University, Ames. He is a distinguished member of staff at Intel Architecture Labs, Intel. His current interests are in the standardization, implementation, and commercialization of IP multimedia telephony. He has been active in the standardization of H.323 and has served as an editor of several H.323-related Recommendations at the ITU-T.