Minutes of the IP Telephony (iptel) Working Group
43rd IETF, December 7-11, 1998
Orlando, FL

Prepared By: Jonathan Rosenberg
Notes taken By: Joerg Ott, Wilhelm Wimmreuter

Within the body of the minutes below, sections surrounded by **** denote decisions made by the group during the meeting. This is to facilitate rapid perusal of the minutes to determine concrete outcomes. Pointers to specific slides are also given throughout the text, surrounded by [].

The iptel group met for a single session on Thursday, Dec. 10, for two hours. First was a presentation of the agenda by the chair:

1. Agenda Bashing [5 mins, J. Rosenberg]
2. ITU interactions [J. Rosenberg]
3. Open Issues in GLP framework [30 mins, J. Rosenberg]
4. Location of Local Gateways [5 mins, L. Slutsman]
5. PGRP and GLP [15 mins, R. Davis]
6. CPL Framework Issues [40 mins, J. Lennox]
7. IN call models and the CPL [15 mins, J. Dobrowolski]

There not being any changes, the chair proceeded to the first item.

ITU Interactions
----------------
J. Rosenberg
Slides at http://www.bell-labs.com/mailing-lists/iptel/iptel_dec98.pdf

The ITU is currently working on Annex G of H.323 in Study Group 16. The annex specifies communication between administrative domains, which right now is limited to exchange of gateway and terminal addresses. The annex looks similar to the gateway location protocol (GLP) model. We would like to communicate and cooperate on this matter.

What we have done so far is send a liaison statement to the rapporteur of the group, encouraging communication. The iptel chairman has also been in communication with the editor of Annex G. We have also appointed a representative from iptel to attend ITU meetings.

The plan for moving forward is to assess which problems Annex G is trying to solve, which it does in fact solve, and which it does not. We must also ascertain whether the architectural assumptions in Annex G are appropriate for our model. Based on this study, we can move forward.
The chair agreed to provide the group with updates as things progress.

Open Issues in GLP Framework
----------------------------
J. Rosenberg
Slides at http://www.bell-labs.com/mailing-lists/iptel/iptel_dec98.pdf

There are a number of open issues remaining in the GLP framework. These include aggregation and attributes, QoS measures, cost metrics, and LS/signaling server co-residency [slide 3].

The aggregation issue is a complex one. There are lots of attributes we'd like to have, such as cost, codecs, protocols, administrator, policy parameters, etc. However, this makes aggregation difficult. So, there is a tradeoff between aggregation and information; i.e., a tradeoff in scalability vs. information richness [slide 4].

There are a number of solutions possible. The first one is to preserve all information end to end. Advertisements can only be aggregated together if their attributes can be combined in some way without loss of information. In the example [slide 5], assume that the two advertisements from the upstream LS's have telephone prefixes which can be aggregated without loss of information (516* union 517* = 51*). The advertisements can be aggregated only because the other attributes are identical.

Another solution is to make the attributes hop by hop [slide 6]. In this case, the information which cannot be aggregated is simply discarded. The information is only used for hop by hop policy decision making, and has no end to end significance.

Fortunately for us, the decision about all attributes does not need to be made in the framework document. But the chair felt something should be said about the problem in general. So, the open issue is: what, if anything, to say regarding aggregation and attributes in the framework. Also, we must try to come to some decisions to begin preparation of the actual protocol.

A comment was made that there can be difficulties if individual implementations have free rein over which attributes to use.
It could lead to an explosion in the attribute space and non-scalability of the algorithm.

Dave Oran proposed that the framework might be able to define classes for each attribute, such as INTEGER or BOOLEAN. Then, the protocol could define rules for aggregating these attributes based purely on the class, without understanding the semantics of the particular attribute. Jonathan was concerned that if an LS didn't understand the semantics of a particular attribute, it wouldn't be able to determine how to aggregate if there was more than one choice. For example, if BOOLEAN classes could be ANDed, ORed, or discarded, and the LS had no notion of what the particular boolean was, how could it decide which to apply?

Another proposal from the floor was to define a minimal set and allow extensions.

**** The chair then proposed that the framework document say nothing about particular attributes, but just mention that there is a tradeoff. There being no objections, consensus was achieved on this. ****

The next open issue was then discussed, QoS measures. The problem is that it is desirable to know what kind of voice quality to expect from a given gateway, and that we would ideally like to know this information from the GLP. However, speech quality depends on two factors - (1) the loading of the gateway, in terms of number of calls, for example, and (2) the loss/delay/jitter along the path from a particular user to the gateway. Determination of the second factor is severely problematic. Since the path is different for each user connected to the gateway, there is no way to represent path quality information in attributes advertised by the gateway. Furthermore, such information changes dynamically. Thus, getting up to date information would require constant updating. Doing this is known to cause routing instabilities in dynamic routing protocols, causing route flaps between routes during congestion.
An additional difficulty is that the signaling and media may take completely different paths [slide 8].

An example was then provided [slide 9]. The dotted lines represent signaling relationships between gateways and location servers. The red router is congested. User 2 experiences no loss or delay, since the path between it and the gateway is uncongested (using a least hop route), whereas the path between user 1 and the gateway is congested. So, when the gateway propagates information about its "QoS", for which user should it represent the information?

The proposal was not to solve the QoS path problem [slide 10]. The chairman indicated that an LS could use any means at its disposal to try and ascertain this information to guide policy decisions. For example, an LS might ping the gateway. It was pointed out that such pings in fact won't help anyway, due to the disparity of signaling and media paths.

However, it would make sense to allow a capacity metric to describe the gateway. This metric could indicate DSP resources, access bandwidth, or number of circuits available. An LS could then load balance based on these quantities. An example was given to show this [slide 11].

A question was asked about whether we could advertise current loads, not just total capacity. The chair expressed concerns about doing this, again since it leads to instability and route flaps. Also, the dynamic nature of this information makes its use questionable.

Dave Oran pointed out that using an absolute metric for capacity makes load balancing problematic. If a gateway advertises 5 circuits, and sends such an advertisement to two different LS's, each LS might think the gateway can handle 5 calls, and each might send it 5 calls. Thus, the capacity can only be used as a relative metric. In other words, a gateway with a capacity of 4 can handle twice as many calls as a gateway with capacity 2. This relative unit would also be dimensionless.
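The relative capacity metric discussed above can be illustrated with a small sketch. It assumes a hypothetical LS that spreads calls across gateways in proportion to their advertised capacities; the function names (`expected_shares`, `pick_gateway`), the gateway names, and the weighted-random policy are assumptions made for illustration, not anything the group specified. Capacities are taken to be positive integers with no unit.

```python
import random
from fractions import Fraction


def expected_shares(capacities):
    """Return the fraction of calls each gateway should receive.

    `capacities` maps a gateway name to its advertised capacity, an
    integer that is relative and dimensionless (not a circuit count).
    """
    total = sum(capacities.values())
    if total <= 0:
        raise ValueError("at least one gateway needs a positive capacity")
    return {gw: Fraction(cap, total) for gw, cap in capacities.items()}


def pick_gateway(capacities, rng=random):
    """Pick one gateway at random, weighted by relative capacity."""
    names = list(capacities)
    weights = [capacities[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    advertised = {"gw-a": 4, "gw-b": 2}  # hypothetical advertisements
    shares = expected_shares(advertised)
    # Ratio of the two shares: gw-a should carry twice what gw-b does.
    print(shares["gw-a"] / shares["gw-b"])
    print(pick_gateway(advertised))
```

Because only the ratio between capacities matters, two LS's that receive the same advertisements each split their own traffic the same way, without either one assuming it owns a fixed number of circuits.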
**** There was consensus to include a capacity metric, and not to address path QoS issues. There was also consensus on the use of a relative, dimensionless metric for capacity. Issues on how to aggregate this metric were to be taken to the list. ****

The next issue to address was cost metrics [slide 13]. Gateways charge a certain amount for a call. It would be desirable to include such information in a gateway advertisement, so that users could do least-cost gateway selection. However, cost structures can be arbitrarily complex. There are issues related to currency units, the breadth of different calling plans, the fact that plans are often subscriber tailored, and the fact that cost may depend on past usage histories. Representing all this information is hard. Also, the IETF cannot constrain business practice. By limiting the ways to express cost, we would limit business practice, and we can't do that.

A number of possible solutions were proposed, including doing nothing, supporting only simple costs, including URL's, or type-length-values to represent cost [slide 14]. The chair indicated that this was his preferred solution. Dave Oran indicated that this issue is complex and out of scope for us. Joerg Ott, however, indicated concerns about using TLV. An LS would be required to understand the meaning of them, and so would end users. The issue is further complicated by time zones, he indicated. The chair indicated that these were complex issues, and we didn't need to solve them now but just give warnings in the framework document.

It was then suggested that an airline broker type of system could be used. People could ask for the cheapest call to somewhere, and the brokering systems could search and get the answer, just like airlines work today. It was pointed out that the scale of airline reservations is much smaller than that of calls, and also that call setup latencies need to be much smaller than those of airline booking queries.
The chair pointed out further that this kind of solution had been proposed before, and that there were serious scalability problems with it.

**** There was consensus not to do complex cost structures in the protocol, but no consensus on exactly what to do (TLV, URL, nothing, etc.). It was agreed to continue discussion on the list. ****

The final semi-open issue was co-residency of the LS and signaling server (i.e., gatekeeper or SIP server) [slide 15]. There seemed to be consensus on the list that these two could be separate or co-resident. When separate, there would need to be communication between the signaling server and LS to access the database. This communication could consist of a query (such as LDAP), or an actual transfer of the database. The chair wanted to emphasize that we would not be specifying this right now.

**** There was agreement that the LS and signaling server could be separate or co-resident, and that we would not specify the communication protocol between them for now. ****

Location of Local Gateways
--------------------------
Lev Slutsman, AT&T
Slides at http://www.bell-labs.com/mailing-lists/iptel/slutsman.ppt

Lev proposed some ideas regarding location of local gateways. He first discussed some motivational services [slide 2]. The main service is as follows: a customer wants to route their calls via the Internet, provided that the quality is sufficient and the price is low; otherwise the call should be routed along the PSTN as normal. Doing this requires the originating exchange for the call to have knowledge of the Internet gateways. In particular, it needs to know the set of ingress (i.e., local) gateways for the first PSTN-IP conversion, along with the egress (i.e., remote) gateways for the IP to PSTN conversion.

Lev presented an architectural view of how the problem might be solved [slide 4]. The ingress switch (XOS) queries its SCP to find the route for the call.
The SCP in turn gets information from a controller which is connected to the Internet. This controller is a participant in the GLP. It therefore knows about the various gateways, and might know about the quality of the connection between pairs of gateways. As such, it can tell the SCP (which tells the originating switch) which gateway to route the call to for the PSTN-IP hop, or, if the IP quality is not good, or too expensive, the controller can indicate usage of the telephone interexchange carriers for completing the call.

Lev's proposal therefore was to incorporate information into the protocol for allowing a controller to determine both the ingress and egress gateway jointly [slide 5].

Jonathan Rosenberg pointed out that effectively what Lev wanted was integrated PSTN/IP routing, taking into account the whole end to end system. He indicated that doing that was likely politically and technically impossible. Dave Oran agreed. He indicated that you could look at networks in two ways: one could be embedded in the other, so that we abstract away internals of the embedded network, or they could be peers, exchanging full routing information. The peering issue is extremely hard.

**** There was no consensus on how/if to move forward on this, due in part to confusion about exactly what was being proposed. ****

Peer Gateway Routing Protocol (PGRP)
------------------------------------
Ronald Davis, Lucent Technologies
Slides at http://www.bell-labs.com/mailing-lists/iptel/pgrp.pdf

Ron spoke about a protocol called PGRP. It is used to exchange information among gatekeepers. It makes use of a topology server to support communication in an area, and to aggregate information from another area. It uses multicast as a way to exchange information.

A question was immediately raised concerning scalability. Ron pointed out that this was an INTRA domain solution, not INTER domain.

Ron then addressed a number of questions that had been raised on the list beforehand.
These included issues like "what is the list of elements", and how the out-of-service transition state works.

The chair emphasized that the discussion was good, but that the group was not currently doing an intra-domain protocol, and therefore PGRP was out of scope. However, there were lots of good ideas here, and perhaps some could be borrowed for our inter-domain protocol.

Call Processing Language Framework
----------------------------------
Jonathan Lennox, Columbia University
Slides available at http://www.bell-labs.com/mailing-lists/iptel/cpl-orlando1.pdf, http://www.bell-labs.com/mailing-lists/iptel/cpl-orlando1.ps

Jonathan presented an overview of open issues on the call processing language framework document. The first issue was the scope of work [slide 1]. The original requirements document was very broad and ambitious. It included things like call distribution services, database access, etc. So, the scope has been narrowed to just be "user creation of basic network services". The emphasis is on user creation. The CPL is geared for end user uploading of scripts to define services. It is not meant as a full blown IN service creation environment.

The types of services to be supported in the first version are those derived from the basic primitives of proxy (forward), redirect, and rejection, coupled with conditional testing, arranged in a decision tree [slide 3]. Example services are call forward busy, call forward on no answer, one number service, time dependent routing, call screening, mobility, private numbering plans, etc. These are similar in scope to IN's CS-1 services.

Jonathan then described what is meant by decision trees [slide 4]. These are basic interconnections of objects to describe flow of control. Each block is some primitive with a number of outputs that can lead to other blocks. This is the same model used for SIB's in IN.
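The decision-tree model described above can be sketched in a few lines. This is an illustration only and is not CPL syntax: the node classes (`Condition`, `Action`), the call fields (`from`, `hour`), the addresses, and the `run` function are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union


@dataclass
class Action:
    """Leaf node: one of the basic primitives (proxy, redirect, reject)."""
    primitive: str
    target: str = ""


@dataclass
class Condition:
    """Inner node: a test whose result selects one of several outputs."""
    test: Callable[[dict], str]
    outputs: Dict[str, "Node"]


Node = Union[Action, Condition]


def run(node: Node, call: dict) -> Action:
    """Walk the tree from `node` until an action is reached."""
    while isinstance(node, Condition):
        node = node.outputs[node.test(call)]
    return node


# Hypothetical call-screening service: reject one caller, redirect calls
# to voicemail outside office hours, and proxy everything else.
screening = Condition(
    test=lambda call: "blocked" if call["from"] == "sip:spam@example.com" else "ok",
    outputs={
        "blocked": Action("reject"),
        "ok": Condition(
            test=lambda call: "day" if 9 <= call["hour"] < 17 else "night",
            outputs={
                "day": Action("proxy", "sip:alice@office.example.com"),
                "night": Action("redirect", "sip:alice@voicemail.example.com"),
            },
        ),
    },
)
```

Calling `run(screening, {"from": "sip:bob@example.com", "hour": 10})` walks two condition blocks and ends at the proxy action. Each block only names its outputs, so a service is fully described by how the blocks are wired together.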
Jonathan emphasized [slide 5] that we are not trying to solve (1) call centers, (2) user agent control - the CPL runs on a network server, (3) media server control, or (4) persistent call state (although this was less clear). He indicated that these things might be added in later versions, but were not for version 1.

Lawrence Conroy asked why media server control was left out. Jonathan responded that it was for simplicity; other groups have studied complex IN problems, requiring a lot of time. We should not try to tackle this in version 1. Also, the CPL is targeted for end user services. Media control is generally not an end user service, but rather an administrator service. This was reiterated by the chair.

Someone proposed that 911 services be covered by the CPL. The chair rejected this, saying it wasn't something a CPL worries about, and that there are lots of other issues to be solved, way outside our scope, to make 911 service work on the Internet.

Jonathan L. then asked if this is what our call model is [slide 6]. The slide shows two states, "not in a call" and "in a call". Dave Oran asked if a script can regain control of a call once it's been redirected. Jonathan L. indicated that no, you cannot. If you want to retain control, you should proxy and not redirect. The definition of redirect is to give up control. Jonathan L. indicated that this is inherent in the SIP model; the chair indicated that this works in a similar way in H.323.

Jonathan went on to propose that calls be effectively viewed as RPCs [slide 7]. The input is the destination, next hop server, timeout, etc., and the output is one of pickup, busy, no-answer, rejected, etc. This fits nicely with decision trees.

From a network perspective [slide 8], we cannot possibly have a network wide view of the service. Thus, a CPL defines a service as seen by ONE proxy/gatekeeper/server. Controlling services across many servers requires many independent scripts.
Interaction is dictated solely by the signaling protocol. We make no attempt at global coordination.

Comments from the group expressed concerns that CS-1 is not sufficient to define a service. Proprietary add-ons are required to implement the desired functions. Dave Oran also pointed out that there are inconsistencies in the call model. Busy is a state and not an output; likewise, server failure is an event, not an output.

Jonathan then discussed the issue of script interactions [slide 9]. He proposed that when the primitives are redirect, reject, and forward, scripts need not be aware of the existence of other scripts. Furthermore, if two scripts run on the same server, they simply act as if they were running on two different servers. In addition, server administrators control the order of execution of scripts on the same machine.

Interaction between servers [slide 10] is the responsibility of the signaling protocol. We don't know what kind of logic will execute on a remote server; it may not even be a CPL (it could be SIP-CGI or some hard coded logic). Signaling protocols must be able to handle inter-server problems. For example, SIP provides a loop detection functionality. We should, however, design SIP servers to have failure behavior - like alerting a user by email if a script fails.

Jonathan then went on to discuss language design issues, and in particular proposed XML. XML has many advantages [slide 11]. It is readable and producible by humans and machines, allows easy static parsing, has a natural tree syntax for decision trees, existing link syntaxes, existing namespace support, familiar syntax, and many freely-available parsers. The disadvantages [slide 12] are that it is verbose, and not a real programming language. However, this may be a good thing; we want to restrict the functions a user can invoke anyway.

An important language design issue is failure modes [slide 13]. The main question is what a server should do if it can't understand a script.
This is a big issue especially if we have primitives in the script defined using XML namespaces. Some possibilities are to reject the script on submission, try to work with what it understands, or have constructs in the script that define what to do if something is not understood. All raise interoperability issues.

Another issue is one file vs. many files [slide 14]. Usage of many files allows referencing external subroutines. This facilitates reuse and could provide for more powerful and flexible service creation. It also allows for customization. The disadvantages are that it is a maintenance nightmare, makes cycle detection much harder, and introduces additional failure possibilities (the classic DLL not found problems...).

The final issue raised is separation of the CPL server from signaling server [slide 15]. An advantage of separation is that serving a CPL is different from being an IP telephony signaling server, and we could allow one CPL server to control many signaling servers. Using separate servers is currently done in the IN; the SCP and SSP are separate. The difficulty is that it introduces the need for a protocol between them, filling the role INAP plays in the IN world. Jonathan proposed that an implementor can separate them, but we won't standardize the protocols needed for communications.

**** Consensus therefore was reached on the following:
The scope of the CPL is limited to user-defined services.
Execution of the CPL is defined for network servers.
The types of services are limited to those similar in flavor to IN CS-1.
The CPL server and signaling server may be separate, but no protocol is going to be defined between them.
The script defines a service as seen on one server.
The CPL does not define interactions between servers.
The CPL is defined by DAGs connecting primitives together.

No consensus was reached on the following:
Whether a CPL could reside in one or more files.
How failure modes are to be supported.
The exact scope of services supported. ****

IN Call Models and the CPL
--------------------------
Janusz Dobrowolski, Lucent Technologies
Slides at http://www.bell-labs.com/mailing-lists/iptel/inmodels.ppt

Janusz presented arguments on why call models are a good idea for the CPL. He indicated that call models are basically state machines [slide 1], and these have been in use all over since the 1960's. He then indicated that the advantages of call models are that they are needed for many services, whereas Internet servers are stateless. It was then pointed out that there are many types of Internet servers, such as web, COPS, SMTP, SIP, etc., and most maintain state. Even HTTP servers maintain state through cookies.

Janusz then proposed that the CPL be able to maintain state as an option, and that the CPL be constructed as a set of condition/action pairs. He also discussed some interactions between the CPL and IN devices and protocols [slide 4]. He also gave pointers to existing call models [slide 5], and gave applications where they were needed [slide 6].

The chair then indicated that he felt he understood how the call model fit into the CPL. In IN terminology, the CPL is the "global functional plane". It expresses an abstract view of the service. The call model is not present at this point in the IN conceptual model, nor is it present in the CPL. There is unquestionably a notion of state, but the details of the state machine are hidden from the CPL in lower layers. For example, if implemented using SIP, SIP provides a state machine, but it's not seen by the CPL. Thus, we have a call model, but we are not defining it in the CPL since it is not necessary. Definition of the call model might be necessary if we wished to separate the CPL and signaling servers, to facilitate communications. In IN terms, you need the call model for INAP to work. If the SCP and SSP were co-resident, who cares what the call model is?
**** There was no consensus to include or not include a call model in the CPL. ****