SYSTEM AND METHOD FOR MONITORING THE
PERFORMANCE AND QUALITY OF SERVICE PROVIDED
BY A COMMUNICATION SERVICE OR NETWORK
Technical Field
The present invention relates to a system and method for monitoring the performance and quality of service delivered over communication services and networks.
Background Art
In the prior art there are many systems available for monitoring specific network elements. However, none of them is able to refine and integrate information in a manner that directly measures customer Quality of Service ("QOS"), let alone measure QOS with respect to particular objects, classes of objects, or an entire communication service or network. Without a direct measurement of QOS, a network operations staff is severely handicapped in its ability to promptly correct problems that affect customer service. Instead, when a customer calls with complaints related to inadequate service performance, it is often a labor-intensive task to research and isolate the likely causes of the trouble that is adversely affecting the customer. Even when a specific problem does not exist, an objective assessment of performance is very difficult. Various factors affect network performance, including throughput, provided bandwidth, and reliability of performance, all of which greatly impact the QOS contracted for and realistically expected by a customer.
There are several reasons why current network monitoring systems are inadequate. For example, most systems available today for real-time network performance monitoring are network-centric. As a result, only a very low-level view of performance, from the standpoint of individual hardware of network elements, can be monitored. Further, required customer services typically combine numerous network elements from different vendors that often use dissimilar technologies. Some individual services even use multiple available network technologies to provide end-to-end customer service. Yet present-day systems are all limited to the specific technology being monitored, with little ability for the outputs of different monitoring systems to be integrated into an overall view of the performance of the communication service or network.
Additionally, network elements are typically diversely routed. They also include varying levels of fault tolerance.
As a result, not only is technical trouble-shooting greatly complicated as noted above, but so is the development of any meaningful objective metric related to QOS. A useful QOS approach must (1) selectively or collectively aggregate the performance of many network elements; (2) take into account the differing performance criteria of elements from distinct vendors, possibly even using dissimilar technologies; and (3) develop a protocol that recognizes that a simple aggregation of various performance metrics is typically inappropriate in a sophisticated network environment.
Disclosure of Invention
A monitoring system is disclosed for providing information relative to the performance of a communication service or network and the quality of service being provided. The monitoring system can be added to an existing network or it can be added as a network is assembled. The inventive system includes the use of mechanisms that sense the state of objects, collect data related to the objects, assemble the collected data into a form representative of a characteristic, and then present the characteristic to a user to enable the user to determine performance. Preferably, the mechanisms include an element manager, an element adapter, a server, and a performance monitoring client data platform.
In another aspect of the invention, the system includes the ability to prepare a plurality of models representing varied service or network configurations, the ability to select from one of the plurality of models, and the ability to collect and assemble data so that the performance of a particular model can be compared to other models as well as the current configuration of the service. Moreover, the system includes the ability to compare an active state of one or more objects with a stored historical state for the same objects, permitting real-time evaluation to obtain either general or particular information relative to service performance or quality. The stored model configurations can also be used to provide plans for rapid recovery of the communication service or network following a catastrophic event.
From the standpoint of the performance monitoring client data platform, the system facilitates the display of information related to service performance, the interrogation of the service in a manner that enables the substantially transparent probing of various service levels to gather performance data, and mechanisms for selectively processing the gathered data to present the processed performance data in a form beneficial to an end-user.
The monitoring system of the present invention has the ability to adapt to any network or user model. Each network technology to be monitored has its own types of associated network elements and facilities as well as different performance metrics to be monitored. Designing the present system to have a dynamic modeling capability addresses the desire to be able to model any network component or even different interconnected or separate networks. Dynamic modeling allows any type of object to be described and monitored without any changes in the core software. Thus, the system can model real objects such as switches, cards, or trunks, as well as logical objects including regions, customers, or network services.
To directly monitor QOS, a computational engine gathers raw performance metrics from the network and sorts the information into objective measures of customer service performance including throughput, provided bandwidth, and reliability of performance. The system maintains a model of the network and computes network state updates based on comparisons of previously collected performance data or a theoretical model of expected network performance. The system of the present invention also provides for modeling of navigational and display aspects of network objects within each network. Navigational modeling provides a user with a mechanism for browsing the model information and choosing specific network objects or classes of objects to monitor within a modeled network as well as the relationship between objects. A display model provides a method for customizing the display of monitored information in a user interface. A load model provides a means to distribute developed network models across multiple server machines, thereby providing a highly scalable system.
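As an illustration of how a computational engine might turn raw element metrics into the customer-level measures named above (throughput, provided bandwidth, and reliability of performance), the following is a minimal Python sketch. The `RawSample` fields, the function name `qos_summary`, and the formulas (a bottleneck minimum along a serial path, and a product of availabilities for elements in series) are assumptions chosen for illustration; the specification does not state which formulas the engine applies.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class RawSample:
    """One raw measurement from a network element (hypothetical fields)."""
    element: str
    bytes_carried: int     # bytes moved during the interval
    interval_s: float      # length of the measurement interval, in seconds
    capacity_bps: float    # bandwidth provisioned on the element
    availability: float    # fraction of the interval the element was up (0..1)


def qos_summary(path: list[RawSample]) -> dict[str, float]:
    """Summarize a serial path of elements into customer-level measures."""
    if not path:
        raise ValueError("path must contain at least one sample")
    # End-to-end throughput is limited by the slowest element on the path.
    throughput = min(s.bytes_carried * 8 / s.interval_s for s in path)
    # Provided bandwidth is likewise limited by the narrowest element.
    bandwidth = min(s.capacity_bps for s in path)
    # Elements in series must all be up, so their availabilities multiply.
    reliability = prod(s.availability for s in path)
    return {
        "throughput_bps": throughput,
        "provided_bandwidth_bps": bandwidth,
        "reliability": reliability,
    }


# Hypothetical two-element path: a switch feeding a trunk.
path = [
    RawSample("switch-a", 1_000_000, 10.0, 2_000_000.0, 0.5),
    RawSample("trunk-1", 500_000, 10.0, 1_000_000.0, 0.5),
]
summary = qos_summary(path)
```

The sketch deliberately avoids summing or averaging per-element figures, since a customer on a serial path experiences the weakest element rather than the mean.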
All of the above is accomplished on a system having distributed, fault tolerant architecture.
The system architecture and design incorporate the latest software technology to create a robust and scalable system. The user displays are preferably JAVA-based and therefore runnable on a wide range of hardware and operating platforms, resulting in an easy, cost-effective deployment across multiple business units or enterprises. The communication infrastructure uses standard CORBA (Common Object Request Broker Architecture) services to provide ease of integration with other operating systems.
In contrast to the present invention, currently available performance monitoring systems are unable to provide a true end-to-end view of network Quality of Service (QOS), let alone the ability to focus on different aspects of performance desired by a user. The present system allows performance monitoring of many network objects, taking into account the differing performance criteria of various objects, using a methodology that takes into account the operational complexity of a realistic sophisticated network.
Brief Description of Drawings
Figure 1 is a schematic view of the performance monitoring system architecture.
Figure 2 is a schematic diagram showing the various programming languages used to facilitate the implementation and operation of the system on a host network.
Figure 3 is a schematic diagram of the connectivity of the several subsystems incorporated in a system server.
Figure 4 is a basic schematic drawing of the interface between server side and client side subsystems.
Best Mode for Carrying Out the Invention
A performance monitoring system 10 of the present invention, as illustrated in Figure 1, is adaptable to any network that can be monitored. Performance monitoring system 10 can employ one or more computational servers 11 for load distribution purposes. In Figure 1, server 11 includes a plurality of element adapters 13 that are connected to a plurality of element managers 15, which in turn are in contact with a plurality of networks 17, each of which is capable of individual or collective monitoring. Element managers 15 provide a uniform interface to networks 17 and to each associated object 22 making up a network, and are responsible for sensing the various aspects of an object, including its current state. Server 11 is also connected to a plurality of JAVA-based client data platforms 19 configured to request and receive information relative to the status of networks 17 or objects 22 contained within one or more of the networks. The status is provided through the generation of state signals representing the current real-time state of objects 22 using the interface of element managers 15.
Performance monitoring system 10 preferably uses the JAVA programming language for implementing the display of real-time network performance through client data platforms 19 in view of its ability to function on a wide range of hardware and operating system platforms, including Microsoft® Windows® and UNIX variants. The use of JAVA or similar programming languages with similar portability characteristics enables parties, such as network customers or system maintainers, to use their existing data processing platform or whatever other data platform the party prefers, thereby facilitating the implementation of performance monitoring system 10.
Within each server 11, a plurality of dynamically configurable models (e.g., network or circuit configurations) are stored in an alterable memory 21 that enables a party to seek real-time information or sense the current state of one or more network objects 22 within a network 17, classes of objects 22, or even entire networks 17. The current state may preferably be compared with a stored state from archived model or object data held in a non-volatile storage system, model storage (MS) 23, to facilitate model comparisons and a determination of overall Quality of Service (QOS) between a current state and a known past state. The use of stored or archived models enables a party to compare real-time and historical information and provides insight as to the current status of different levels of monitored networks 17, even down to the component level. In some situations, it may be desirable to compare a current state with a theoretical state associated with a stored model to provide a base-line comparison.
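The comparison of a current state against a stored or theoretical state can be sketched as follows. This is a minimal Python illustration only: the function name `compare_with_baseline`, the metric names, and the 10% relative tolerance are assumptions, and the specification does not prescribe a particular comparison rule.

```python
def compare_with_baseline(current: dict[str, float],
                          baseline: dict[str, float],
                          tolerance: float = 0.10) -> dict[str, float]:
    """Return the relative deviation of each metric that strays from the baseline.

    Only metrics whose deviation exceeds `tolerance` (an assumed threshold)
    are reported; the sign shows whether the metric rose or fell.
    """
    deviations = {}
    for name, expected in baseline.items():
        if name not in current or expected == 0:
            continue  # nothing meaningful to compare against
        relative = (current[name] - expected) / expected
        if abs(relative) > tolerance:
            deviations[name] = relative
    return deviations


# Hypothetical stored (archived) state and current real-time state of one object.
stored_state = {"throughput_bps": 1000.0, "error_rate": 0.5}
current_state = {"throughput_bps": 500.0, "error_rate": 0.5}
flagged = compare_with_baseline(current_state, stored_state)
```

The same function could take a theoretical model's expected values as the `baseline` argument to give the base-line comparison mentioned above.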
Since client data platforms 19 can be used to easily select objects 22 and networks 17 to be monitored in real-time, a party utilizing client data platforms 19 in combination with the comparison data available in MS 23 enjoys numerous advantages. For example, it is possible to proactively identify potential object or network problem areas before they become major network disruptions. From a security standpoint, the ability to analyze current and past customer and related user trends enables a network maintainer to identify aberrant or unexpected network traffic or other system utilization. It is also possible to optimize the performance of networks 17 dynamically through such things as the simple adjustment of switching transfer points based on performance gains obtained under similar load and utilization conditions. Even when real-time analysis is not possible, the ability to analyze past models and network states can be very beneficial in revising and optimizing new network and object configurations, such as through the use of theoretical "what if" inquiries. Stored model configurations can even provide a blueprint for rapid recovery of a network 17 following catastrophic events such as hurricanes, tornados, or earthquakes, which may significantly damage whole portions of the network.
Even in its operational backend associated with servers 11, performance monitoring system 10 is not constrained to a single operational programming language. For data processing operations within the server 11, high performance programming languages such as C++ or UNIX are preferably used. It should be noted that the server 11 is not limited to these languages. They merely represent examples of suitable high performance, high level languages.
Instead, performance monitoring system 10 relies on core software with a dynamic modeling module to permit any type of network object 22 to be modeled without changes to the core software. Thus, performance monitoring system 10 can model real objects such as switches, cards, or trunks, as well as logical objects such as network regions, user access points, an account, available services, a network session, or an expanded class of objects, as understood by those skilled in the art.
Nevertheless, in a preferred embodiment of the invention, within servers 11 Common Object Request Broker Architecture (CORBA) has been selected as the architecture of choice for use with respect to objects 22. CORBA is a distributed object architecture that allows objects 22 to inter-operate across networks 17 regardless of the programming language originally associated with the operation or use of a particular object. The details of an object 22 are encapsulated in a standard interface program. Object 22 can then be used on any server 11 using virtually any computer-based language. Another advantage of a CORBA modeled object 22 is that it can function both as a network client and as a network server. When an object 22 provides services to another object 22, it acts as a server. When an object 22 requests services from another object 22, it acts as a client. The ability of CORBA to model objects as both network clients and servers is particularly helpful when modeling the interaction of classes of objects 22 within a network 17 for the purpose of performance analysis.
Referring now to Figure 2, each element manager 15 preferably uses CORBA to provide a communication link between servers 11 and their associated networks 17. In turn, each server 11 preferably uses CORBA as well to provide a communications link with performance monitoring client data platforms 19. Finally, the various subsystems or modules of servers 11 and performance monitoring client data platforms 19 also use CORBA for intercommunication.
Server 11 includes several subsystems that work together to provide services to client applications associated with a performance monitoring client data platform 19. The subsystems include an object initializer 25, model processor 29, archiver 31, topology manager 32, object persistence 33, and system executive 45. The subsystems will usually have configuration data associated with them, which are referred to as models. Each subsystem uses its own model.
To facilitate analysis of network performance, the invention relies on the ability to break down a network object 22 into different aspects, the aspects being maintained by different sets of modules, with a managed object identifier (MOID) shared by each of the modules to unify the aspects into a single managed object 22. The basic aspects of an object typically include navigation, state, and element management. As shown in Figure 3, topology manager 32 implements the navigation aspects of an object 22. Object initializer 25, model processor 29, and system executive 45 implement the computation aspects. As discussed below, modules 25, 29, and 45 provide static and real-time computed states of an object 22. Object persistence 33 is used by the other modules to obtain system configuration from a persistent database represented by MS 23 and accessed through an archiver 31. A client manager 47, shown in Figure 4, provides a single point of contact for performance monitoring client data platform 19 to talk to servers 11, acting as a proxy for services provided by other server modules.
Element management is controlled by element adapter 13. Element adapter 13, shown in Figure 1, is responsible for collecting data from remote devices by way of element manager 15 and delivering the results in the proper format to the portion of server 11 responsible for implementing the computation aspects.
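The idea of separate modules each holding one aspect of an object, unified by a shared MOID, can be sketched as follows. This is a minimal Python illustration under stated assumptions: `AspectStore`, `ManagedObjectView`, `assemble`, the MOID string format, and the record contents are all hypothetical names and values, not taken from the specification.

```python
from dataclasses import dataclass, field


class AspectStore:
    """One module's records for a single aspect, keyed by MOID (hypothetical helper)."""

    def __init__(self) -> None:
        self._records: dict[str, dict] = {}

    def put(self, moid: str, record: dict) -> None:
        self._records[moid] = dict(record)

    def get(self, moid: str) -> dict:
        # An object may not have a record for every aspect yet.
        return dict(self._records.get(moid, {}))


@dataclass
class ManagedObjectView:
    """The aspects of one managed object, joined on its MOID."""
    moid: str
    navigation: dict = field(default_factory=dict)
    state: dict = field(default_factory=dict)
    element: dict = field(default_factory=dict)


def assemble(moid: str, navigation: AspectStore, state: AspectStore,
             element: AspectStore) -> ManagedObjectView:
    """Look up each aspect by the shared MOID and combine the results."""
    return ManagedObjectView(moid, navigation.get(moid),
                             state.get(moid), element.get(moid))


# Each store stands in for a different set of modules.
navigation_store, state_store, element_store = AspectStore(), AspectStore(), AspectStore()
navigation_store.put("moid-0001", {"parent": "region-west"})
state_store.put("moid-0001", {"status": "up"})
view = assemble("moid-0001", navigation_store, state_store, element_store)
```

Because the only thing the stores share is the key, a module can be replaced or extended without the others needing to know its internal record format.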
More specifically, object initializer 25 starts and stops computational object instances on request and requests data collection from subsystem element adapter 13, which communicates with networks 17 through element managers 15, as discussed with respect to Figure 2 above. The object initializer 25 also serves as a name service for CORBA object references, so that each object 22 is given a unique managed object identifier (MOID), which is maintained for that object even as its state changes or as different aspects of an object 22 are analyzed or modified. Further, object initializer 25 acts as a mediator between model processor 29 and system executive 45, accepting directives to activate a computational state on a local server 11 from the system executive, determining the details of how to comply with a directive, and configuring the model processor accordingly.
Model processor 29 implements objects 22 into a computational model comprising a representation of the computational state of each object and their relationships, the code to perform computations, and an interpreter to execute the code.
Archiver 31 provides persistent storage of performance data by maintaining a history of the state of managed objects 22. On start-up, archiver 31 accesses information in persistent storage through object persistence 33, which has access to MS 23 for historical data and can obtain the MOID through the object initializer 25. The combination of subsystems enables archiver 31 to provide information permitting a comparison between a current real-time state and a prior stored or theoretical state of any managed object 22. Archiver 31 also has direct access to MS 23, where it maintains the state of managed objects. As a practical matter, MS 23 may be one or more file or data storage systems.
Topology manager 32 provides navigation information to users interfaced to performance monitoring system 10 through performance monitoring client data platform 19. Topology manager 32 does not directly depend on any other module, other than to save its data through object persistence 33 to MS 23. It maintains navigation information about objects 22 and their relationship to one another and the rest of a modeled network 17. As such, topology manager 32 maintains objects 22 for abstract entities (e.g., logical objects) relevant to the user, such as a geographic region, as well as the typical network elements (e.g., physical objects) such as a switch.
Along with providing navigational information, TM 32 coordinates the introduction of new equipment into the network and cooperates with other modules to propagate the topology updates for the new equipment as represented by real objects 22.
TM 32 also provides expansion abstractions through logical objects 22 or a class of objects 22, which is a shortcut defined with respect to particular managed objects that will display information relevant in a user's mind (e.g., the set of switches in a region). It provides the ability to navigate from one object 22 to one or more additional objects by invoking an expansion function that gathers a class of discrete objects 22 into a logical object that can be analyzed as a whole.
There may not be, however, any direct relationship between the perceptual model of the topology and its physical layout. Thus, a further advantage of the invention is the ability to model a physical network 17 in many different ways to provide enhanced ability to perceive and monitor the performance of network objects 22 in desirable ways of interest to the user without a limitation as to the physical elements themselves that make up the network. In short, topology manager 32 is used to reflect a user's desired perception of a network 17 apart from or on top of the actual physical network topology.
System executive module 45 manages the system state of server 11 as well as load, fault recovery, and backup. System executive 45 also starts server 11 and loads it with the modeled objects 22 appropriate for that server to perform its performance analysis role within one or more networks 17. System executive module 45 has direct access to object initializer 25 and object persistence subsystem 33. System executive 45 tells object initializer 25 to send object persistence subsystem 33 any required MOID for a managed object 22, so that it can go to non-volatile memory 35 through the interface between object persistence subsystem 33 and archiver 31, to obtain any stored system model requested to initialize the system for system executive 45.
In Figure 4, a communications interface 45 is shown between server 11 and performance monitoring client data platforms 19 of performance monitoring system 10. The client side components represented by a performance monitoring client data platform 19 can be assembled in a desktop or portable computer-like unit, which can be positioned in a location convenient for those interested in monitoring the performance of one or more modeled networks 17 and related objects 22. Similar units can also be located where network service technicians are housed to enable them to continuously monitor modeled network states.
A client manager (CM) 47 communicates with the local server services associated with a server 11. CM 47 receives data requests from and provides responses to a client task manager (CTM) 49, a managed object browser (MOB) 51, and a performance monitor display (PMD) 53, all associated with performance monitoring client data platform 19. Thus, CM 47 functions as a proxy for the performance monitoring client data platform 19, presenting requests to server 11 and sending data from the server to the performance monitoring client data platform in response to the requests.
CTM 49 is where a user of the monitoring system begins. CTM 49 provides initial services such as authentication login/logout. CTM 49 is also responsible for session management and inter-application communication, including launching and maintaining the state of the other client side modules, such as MOB 51 and PMD 53. It provides the ability for users to save and restore session states. CTM 49 communicates with MOB 51 and PMD 53, but the latter two modules do not communicate directly with each other. This approach is preferably undertaken to provide separation of the module implementation and semantics from the other modules.
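The arrangement in which the browser and the display never talk to each other directly, with the task manager relaying between them, can be sketched as a simple mediator. This is a minimal Python illustration only; the class names, the `register` and `send` methods, and the message dictionary layout are assumptions and do not represent the CORBA interfaces of the preferred embodiment.

```python
class ClientTaskManager:
    """Relays messages between client modules so they stay decoupled (sketch)."""

    def __init__(self) -> None:
        self._handlers = {}

    def register(self, name, handler) -> None:
        self._handlers[name] = handler

    def send(self, target, message):
        # The sender only knows the target's registered name, not its class.
        return self._handlers[target](message)


class PerformanceDisplay:
    """Stand-in for the display module; records which objects it monitors."""

    def __init__(self) -> None:
        self.monitored = []

    def handle(self, message) -> None:
        if message["action"] == "monitor":
            self.monitored.append(message["moid"])


class ObjectBrowser:
    """Stand-in for the browser module; holds a reference to the mediator only."""

    def __init__(self, task_manager: ClientTaskManager) -> None:
        self._task_manager = task_manager

    def activate_monitor(self, moid: str) -> None:
        self._task_manager.send("PMD", {"action": "monitor", "moid": moid})


task_manager = ClientTaskManager()
display = PerformanceDisplay()
task_manager.register("PMD", display.handle)
browser = ObjectBrowser(task_manager)
browser.activate_monitor("sw-1")
```

Here `ObjectBrowser` has no import of, or reference to, `PerformanceDisplay`, which is the separation of implementation and semantics the paragraph above describes.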
CTM 49 is the primary coordinator of client-side activities within the performance monitoring system 10. As such, it is responsible for user and system interfaces; for starting new client-based components and modules; for saving and loading user state, such as shortcuts that open multiple windows; and for coordinating with servers 11 in the performance monitoring system 10 to provide users with reliable access to the system.
CTM 49 also provides for seamless switchover to backup facilities whenever a primary server 11 goes offline. CTM 49 keeps a list of references to available servers 11 so that, in the event of a server failure, it can switch operations to another server, update the server environment, and notify all client components associated with performance monitoring client data platform 19 that they should refresh their state using the newly created environment.
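The switchover behavior described above (keep a list of server references, move to another server on failure, and tell client components to refresh) can be sketched as follows. This is a minimal Python illustration under stated assumptions: `FailoverCoordinator`, the `is_alive` health-check callback, and the server names are hypothetical, and a real client would hold CORBA object references rather than strings.

```python
class ServerUnavailable(Exception):
    """Raised when no server in the list can be reached."""


class FailoverCoordinator:
    """Tracks the active server and switches to a backup on failure (sketch)."""

    def __init__(self, servers, is_alive) -> None:
        if not servers:
            raise ValueError("at least one server reference is required")
        self._servers = list(servers)
        self._is_alive = is_alive      # assumed health-check callback
        self._listeners = []
        self.active = self._servers[0]

    def subscribe(self, callback) -> None:
        """Register a client component to be told when the server changes."""
        self._listeners.append(callback)

    def ensure_active(self):
        """Return a reachable server, switching and notifying if needed."""
        if self._is_alive(self.active):
            return self.active
        for candidate in self._servers:
            if candidate != self.active and self._is_alive(candidate):
                self.active = candidate
                for notify in self._listeners:
                    notify(candidate)  # components refresh against the new server
                return candidate
        raise ServerUnavailable("no reachable server in the list")


# Hypothetical scenario: the primary is down and the backup is up.
alive = {"primary": False, "backup": True}
refreshed = []
coordinator = FailoverCoordinator(["primary", "backup"], lambda s: alive[s])
coordinator.subscribe(refreshed.append)
chosen = coordinator.ensure_active()
```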
Once authenticated, a user presents a request for information or data relative to an object 22 or network 17 to CTM 49, which presents the request to CM 47. CM 47 then organizes the subsystems available in server 11 to obtain the requested information or data. How the information or data is obtained is transparent to the user. The user only has to be concerned with the request to the server 11 and the answer received.
MOB 51 implements a human interface for navigating managed objects 22 and exploring their relationship. It provides navigation capabilities for the users of performance monitoring system 10. MOB 51 displays summary performance information for each of the managed objects 22 in the system. The basic role of MOB 51 is to enable easy navigation to reach and then analyze the performance of managed objects 22.
MOB 51 is responsible for browsing both real and logical objects 22. Users must be able to navigate based on the "perceptual topology" of a network 17 (e.g., how the users perceive the network) rather than the physical topology. MOB 51 must also deal with transitive relationships between objects 22, which are not directly reflected in the physical network. A managed object 22 can be anything referenced by a constructed model of the system. A managed object 22 can be real (e.g., a switch or a link) or logical (e.g., a region, an account, a session, or a class of objects).
Significantly, because they provide enhanced perception of portions of a network 17 of particular concern to a user, classes of managed objects 22 may be created that amount to a generic or collective name for a group of objects. For example, MOB 51 may have an entry for all switches in California to let the user activate a performance monitor on all of these switches. In an ordinary system design, an MOB 51 would have to realize all the switches (perhaps thousands) and then pass all the individual objects 22 represented by each individual switch to PMD 53. Using classes of managed objects 22, the communication between MOB 51 and PMD 53, by way of CTM 49, can be reduced to two statements, one for "California" and one for the expansion "all switches". Classes of managed objects 22 can also be used to provide shortcuts, thereby enabling performance monitoring system 10 to jump down levels of network hierarchy to retrieve whatever information or data is required to provide appropriate performance monitoring of any aspect of a modeled network 17. Thus, MOB 51 preferably advises users when any portion of an active and modeled network 17 is down. It also may be configured to provide notification if there are any changes in managed objects 22 as represented by network topology or physical equipment.
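The "California" plus "all switches" example can be sketched as a two-field class reference that is expanded only where the topology is held, so that thousands of individual object references never have to cross the client. This is a minimal Python illustration; `ObjectClassRef`, the `TOPOLOGY` table, the `EXPANSIONS` predicates, and the object names are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectClassRef:
    """A class of managed objects named by two statements (sketch)."""
    scope: str       # e.g. a region such as "California"
    expansion: str   # e.g. "all switches"


# Hypothetical topology: each managed object has a kind and a region.
TOPOLOGY = {
    "sw-1": {"kind": "switch", "region": "California"},
    "sw-2": {"kind": "switch", "region": "California"},
    "tr-1": {"kind": "trunk", "region": "California"},
    "sw-3": {"kind": "switch", "region": "Nevada"},
}

# Hypothetical named expansions, each a predicate over an object's attributes.
EXPANSIONS = {
    "all switches": lambda attrs: attrs["kind"] == "switch",
    "all trunks": lambda attrs: attrs["kind"] == "trunk",
}


def expand(ref: ObjectClassRef, topology=TOPOLOGY) -> list[str]:
    """Resolve a class reference into the individual objects it covers."""
    matches = EXPANSIONS[ref.expansion]
    return sorted(
        moid for moid, attrs in topology.items()
        if attrs["region"] == ref.scope and matches(attrs)
    )


# The client passes only the two-field reference; expansion happens on demand.
california_switches = expand(ObjectClassRef("California", "all switches"))
```

If a switch is later added to the topology table, the same reference expands to a different result, which is why an expansion can be re-evaluated when the network changes.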
MOB 51 must also reflect changes in the network topology, for example, when new pieces of equipment are added to the network, or when faulty equipment is brought down. When MOB 51 notices that the result of an expansion has changed, it notifies the user about the change and updates its expansion displays appropriately.
MOB 51 also takes part in session management activities of CTM 49. Upon request by a user, CTM 49 attempts to save the session information (e.g., screen configuration) by requesting that all client modules, such as MOB 51, produce a snapshot of the current visual state. Later, when the same session is loaded, MOB 51 must be able to start from a snapshot and reconstruct the corresponding visual state.
Topology manager 32, shown in Figure 3, is involved in the navigation requests associated with MOB 51, by way of CM 47. Hence, topology manager 32 must cooperate with user interface-intensive programs such as MOB 51 at user interaction speeds. Thus, topology manager 32 preferably keeps an on-line cache of an entire monitored network 17 so that expansions will only require in-memory access.
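The session save and restore described above, in which each client module produces a snapshot of its visual state and later rebuilds itself from that snapshot, can be sketched as follows. This is a minimal Python illustration; `BrowserModule`, its fields, and the use of JSON as the saved form are assumptions, since the specification does not state how snapshots are represented.

```python
import json


class BrowserModule:
    """Stand-in for a client module with visual state (hypothetical fields)."""

    def __init__(self) -> None:
        self.expanded_nodes = []
        self.selected = None

    def snapshot(self) -> dict:
        """Capture the current visual state as plain data."""
        return {"expanded_nodes": list(self.expanded_nodes),
                "selected": self.selected}

    def restore(self, snap: dict) -> None:
        """Rebuild the visual state from a previously captured snapshot."""
        self.expanded_nodes = list(snap["expanded_nodes"])
        self.selected = snap["selected"]


def save_session(modules: dict) -> str:
    """Ask every module for a snapshot and serialize the collection."""
    return json.dumps({name: m.snapshot() for name, m in modules.items()},
                      sort_keys=True)


def load_session(text: str, modules: dict) -> None:
    """Hand each module the snapshot saved under its name, if any."""
    saved = json.loads(text)
    for name, module in modules.items():
        if name in saved:
            module.restore(saved[name])


browser = BrowserModule()
browser.expanded_nodes = ["California"]
browser.selected = "sw-1"
saved_text = save_session({"MOB": browser})

restored = BrowserModule()
load_session(saved_text, {"MOB": restored})
```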
PMD 53 provides real-time monitoring of an active network model, permitting information to be displayed in many different formats such as tables, charts, forms, graphs, and the like. Preferably, PMD 53 utilizes a graphical user interface (GUI). Historical data can be presented as well as real-time data. The historical and real-time data can be combined to show trending data. Thus, a user can work through CTM 49 to set up PMD 53 to present gathered data in the preferred or most meaningful form. To facilitate information dissemination and review, PMD 53 preferably includes the ability to request and receive CORBA data from a server 11.
The combination of client task manager 49, managed object browser 51, and performance monitor display 53 within a performance monitoring client data platform 19 enables a user to move through different levels of a modeled network 17 to transparently seek out or probe for information, without regard to the complexity of the underlying technologies that may be involved, in the form of real-time analysis of current states represented by managed objects 22, classes of objects 22, and networks 17. As noted above, it is possible to compare a real-time analysis with historical or theoretical states to provide a base-line for appropriate consideration of network performance.
Thus, through a highly modular approach that gathers object data, separates the data in accordance with basic object aspects, analyzes it, and then permits the reconstitution of the object data in an organized form under user control, performance monitoring system 10 has the ability to aggregate the performance aspects of many service or network objects 22, both real and logical, permitting a detailed analysis of both individual objects and classes of objects. It includes the ability to take into account the differing performance criteria of elements from distinct vendors, possibly even using dissimilar technologies, through a robust analysis and modeling protocol, currently using such programming approaches as JAVA and CORBA. It is recognized, of course, that different languages may also be appropriate. Moreover, through the ability to analyze the current state of different objects and classes of objects in real-time, and through a comparison with past states, the inventive system recognizes that a simple aggregation of various performance metrics is typically inappropriate in a sophisticated service or network environment. Performance monitoring system 10 is easily scalable, adaptable, and even can be used in "what if" scenarios to optimize service performance as circumstances change. Stored model configurations can even be used to provide plans for rapid recovery of the service or network following catastrophic events.