"A Data Management System"
INTRODUCTION
The invention relates to a data management system for interfacing between client systems and a database controller.
It is known to provide a data management system which reads from a database such as an OLAP database in .NET and transmits the data together with a data reading object to the client system. The data reading object includes meta data describing the data format and program code for viewing the data. However, such objects are large, resulting in a high bandwidth requirement and the need to execute a large program on the client system.
WO03/098477 describes a search and presentation engine in which Attributes to objects and to Kinds, and a search query includes an Attribute and a list of Kinds associated with the attributes if the number of objects having the attributes is longer than a predetermined object list length.
The invention is directed towards proving an improved data management system which allows higher speed data retrieval in situations where significant volumes of data are retrieved to generate a presentation on a requesting client system.
SUMMARY OF THE INVENTION
According to the invention, there is provided a data management system comprising an internal meta data store, and data request functions for retrieving data from a database according to the meta data, and presentation functions for generating instructions for presentation of retrieved data according to the meta data,
wherein the data request functions:
retrieve meta data for a request and process the meta data to generate a meta data hierarchical structure, and
use the meta data hierarchical structure to fetch data from the database, and pass the retrieved data to the presentation functions in the structure of the meta data hierarchical structure;
wherein the presentation functions generate a data presentation dataset with said structure and with a writeback instruction for receiving an amended data value from a user; and
wherein a writeback function automatically modifies data values which are linked with a modified value in the presentation dataset according to the hierarchical structure.
In one embodiment, the data request functions process the meta data by:
retrieving a first set of meta data without access control,
retrieving a second set of meta data with access control restrictions associated with the data request,
generating a hierarchical meta data structure from the first set,
modifying the structure according to the access control restrictions.
In one embodiment, the data request functions:
subtract the second set from the first set to provide an intermediate set; and
subtract the intermediate set from the structure in a manner which preserves integrity of hierarchical relationships between meta data which satisfies the access control restrictions, to provide a modified structure.
In one embodiment, the step of subtracting the intermediate set comprises linking sub- elements with a fresh parent element if an original parent element is removed because of access control restrictions.
In one embodiment, the data request functions further modify the modified structure according to hidden members blocked by access control restrictions to provide a virtual hierarchical structure.
In one embodiment, data request objects cache the processed meta data and data retrieved from the database.
In one embodiment, the system further comprises a cache function for automatically determining if the cache stores required meta data and for activating a meta data retrieval operation from the internal database if it receives a request for meta data which is not stored in the cache.
In one embodiment, the cache comprises separate spaces for meta data and data.
In one embodiment, the cache function uses a similar schema for both spaces.
In one embodiment, the data request functions comprises a factory mechanism for instantiating data report objects from a data report class, and for writing retrieved data to a data report object as attributes of the object.
In one embodiment, each data value of a row of a relational database table is written as an attribute.
In one embodiment, all data report objects for a request are written to an array, and the array is returned to the presentation object.
In one embodiment, the writeback mechanism automatically modifies a presentation dataset by generating a writeback table for a user writeback request, the table including a user identifier, a time stamp, and a value of amended data.
In one embodiment, the writeback mechanism automatically appends additional values to the writeback table as further writeback requests are received.
In one embodiment, the writeback mechanism executes as a data request function according to a multi-threaded model.
In one embodiment, the presentation objects comprise workflow objects for managing access control restrictions for generation of presentation datasets.
In one embodiment, the presentation objects create report structures, at least some report structures comprising a plurality of data slices, each data slice corresponding to a meta data hierarchical structure.
In one embodiment, the data request functions and/or the presentation functions are objects in the object-oriented paradigm.
In another aspect, the invention provides data management method carried out by a system comprising an internal meta data store, data request functions for retrieving data from a database according to the meta data, and presentation functions for generating instructions for presentation of retrieved data according to the meta data, wherein the data request functions:
retrieve meta data for a request and process the meta data to generate a meta data hierarchical structure , and
use the meta data hierarchical structure to fetch data from the database, and pass the retrieved data to the presentation functions in the structure of the meta data hierarchical structure;
wherein the presentation functions generate a data presentation dataset with said structure and with a writeback instruction for receiving an amended data value from a user; and wherein a writeback function automatically modifies data values which are linked with a modified value in the presentation dataset according to the hierarchical structure.
DETAILED DESCRIPTION OF THE INVENTION
Brief Description of the Drawings
The invention will be more clearly understood from the following description of some embodiments thereof, given by way of example only with reference to the accompanying drawings in which: -
Fig. 1 is a diagram showing a data management system of the invention together with client systems and a database with which it interfaces;
Fig. 2 is a diagram showing architecture of the system in more detail;
Figs. 3 (a) and 3(b) are together a flow diagram showing a method for retrieving meta data; and
Fig. 4 is flow diagram showing how data is retrieved.
Description of the Embodiments
Referring to Fig. 1 a data management system 1 interfaces with an external database 2 and with clients 3.
At a high level, a client 3 in a configuration mode creates (a) a slice, which is a template for retrieval and presentation of data for a report or part of a report. The server 1 checks (b) with the database 2 if such a slice is possible and if so the client 3
and the server 1 create and preview (c, d) a report with either the single slice or multiple slices. These steps are all in a configuration mode. Subsequently, in real time, a user requests (e) a report, in response to which the client 3 downloads (f) a master report viewer, which is used by the client 3 to create (g) a report viewer and to (h) request data. The server 1 then (i) retrieves the data slices from the database 2.
Upon download of the data the client 3 generates a presentation using the report viewer. The presentation may take a rich format with "3D" representation of data with versatile options for user interaction both for display of the data of the report and for modifying some data and viewing consequent changes in other data of the report.
Referring to Fig. 2 the system 1 comprises data request objects 25 which inter- work with presentation objects 26 for interfacing with the clients 3. They also access an internal database 27 for retrieving meta data, and they retrieve data from an external database 2. The presentation objects 26 include the following groups of objects:
26(a), objects for low-level interfacing functionality such as button controls and searches within a GUI;
26(b) common objects used by many other presentation objects 26, for generating displays such as grids, overviews, charts, and trees; 26(c) workflow objects for managing access control security and priority queuing or requests; and
26(d) display presentation objects for generating visually rich displays such as representations of data as cubes, carousels, a "3D city" in which textures and colours are used to convey data trends, a pie chart ("Superpie"), and a thermometer.
In this specification, the work "object" means an object in the object-oriented paradigm.
Meta data
The presentation objects 26 allow user selection on a simple GUI of data queries, with selection/deletion of options. The data request objects 25 generate corresponding
queries and retrieve meta data from the internal database 27. The meta data for a request is stored in a caching mechanism which is refreshed periodically.
The meta data contains, among others, definitions of data sources, data queries, reports, filters, security settings, and user settings (styles, various selections). Data is requested by both an end-user application (the report viewer) and an administrator application executing on client systems 3. The data is available to the client systems by means of results of data request object 25 method calls.
Results of queries issued on the internal database 27 can be read either sequentially (row by row) or as a whole structure (dataset). When data request objects 25 send data to the client machine, the data is small, i.e. the amount of extra "meta-data", describing the data format, is kept small.
The meta data 50 in the internal database 27 describes the structure of data in the external database 2 and its presentation, hierarchies, dimensions and properties. It is used by several presentation objects 26.
The retrieval process of the meta data involves the data request objects 25 performing the following:
A call to GetDimensions and GetMembers methods. The call is performed using administrative privileges to allow the retrieval of all the dimensions and members, regardless of the security settings (XML file 55). - A call to GetDimensions and GetMembers methods using current user's privileges. Only those dimensions and members are retrieved which are visible for the user (XML file 56).
Both above results (A and B) are each returned as separate flat XML files 55 and 56 containing one node entry for each dimension and member, without regard to their place in the hierarchy.
The XML files are then compared and a difference between all dimensions/members and filtered (by user privileges) dimensions/members is
computed. This results in a list 57 of elements which are hidden for a given user (C).
Basing on the first XML result file 55 (containing all elements) a hierarchy of OLAP dimensions and members is reconstructed. The result is an XML file (H) 58 with nodes properly contained under their respective parents.
Next, the hidden elements (C) 57 are removed from the hierarchical list (H) 58. If an element being removed has any sub-elements, these are moved up the hierarchy so that they now belong to the parent of the removed element. This ensures that no more nodes than necessary are removed and that the whole hierarchy is valid, as each of its nodes has a valid parent.
At this point a hierarchy 60 is constructed and contains every element (dimension or member) that is visible to the user. This list is used for purposes such as designing queries and data control binding definitions. The application has a possibility to further restrict visibility of given members. By doing so, the meta data is once again adjusted to hide the selected members and any necessary reconstruction of the hierarchy is performed. The selection of hidden members is done on a per-slice basis, i.e. each slice based on the same data source (thus the same meta data and resulting hierarchy) can have its own distinct selection of hidden members. This results in 'virtual' hierarchies being created, as they don't reflect the actual hierarchy contained in the database.
The resulting hierarchy is used by the query creator objects 26 whenever a query to an external database is requested. The MDX query is constructed by listing every member in the selection on its own, without resorting to use of the grouping functions (like DESCENDANTS, ALL_BUT_SELF). These functions would not be valid because they are resolved on the database and would use the server 'real' hierarchy, not the 'virtual' hierarchy created by removing hidden members. The query is then passed to the database and appropriate data is returned. It will be appreciated that this method allows dynamic modification of meta data according to user access control, which still adhering to the basic structure of a data slice.
Caching
The data returned is cached by a caching mechanism. Whenever a report is about to be displayed by the report viewer presentation object, the internal mechanisms check what kind of data the report will be utilizing. Then calls to the cache are performed to request the data. A cache object checks whether the data is already stored and if it is, then it is returned to the caller. If the requested results are not in the cache yet, the cache system relays the call to the data layer. The result is then returned up to the caller, as well as being stored in the cache internal storage. Any subsequent call to the same data will result in the cache returning the stored data.
The decision whether the data stored and the data requested is the same takes into consideration a number of factors: the data source definition, the dimensions used, requested members, any functions performed on those members, or any filtering directives used. Each distinct query is assigned an identifier under which it is stored in the cache. Upon every call to the cache the incoming query is analyzed and if a matching identifier is found in the cache storage, the data is returned immediately.
The cache consists of two separate storage spaces: one for the actual data and one for the meta data. Both of these caches work independently and can be used for separate client systems. Yet both share the same paradigms and mechanisms internally. This makes the cache mechanism easily extendible and flexible in supporting different storage types.
Data Retrieval
The data request objects 25 include a tables factory mechanism that converts the tabular data read from the database into arrays of objects, so that the rows of data can be accessed easier (not only sequentially). Each row is transformed into an object whose fields stand for columns in the database. The arrays of objects serialize with much less additional meta data than has heretofore been the case.
Referring to Fig. 4 the database 2 tables are indicated by the numeral 80. A data reader object 81 of the tables factory mechanism fills an object instantiated from a class TReport 82 with data from an individual row as attributes of the object. The mechanism 35 then writes the object to an array 85 of objects 83. Presentation objects 26 then process the table 85 with excellent versatility to generate a required display. The display may be a "3D" representation, possibly with shading or texturing being used to highlight data. Such display graphics can be generated because attributes of the objects 85 can drive display characteristics.
The reason why the array 85 is more compact and versatile than prior objects downloaded to clients for data presentation is because the individual objects 83 are simple to manipulate by the report viewer on the client system 3.
The data is read from the external database 2 in a sequential order. Then an instance of a derived class 82 is created for each row of retrieved data as described above with respect to Fig. 3. An array 85 of such objects is then returned as a result of a web method call. As the class 82 is serialized its instances can be sent as a SOAP (Simple Object Access Protocol) message.
Presentation
The presentation objects 26 provide data in the form of "cubes". A presentation object 26 implements a "writeback" method in which end-users can modify the values already existing in the cubes. For example, users can insert new values at the leaf level and watch the aggregated change accordingly - and this way formulate new data presentation versions. The presentation object displays data in which certain cells are marked (for example with a different background color) as "writeable back". In these cells the user can insert new values, and then order the system to "write them back" into the database. After the data is entered into the grid on the client side, the changes made are sent to the server, where a SQL INSERT operation is executed into a writeback table associated with the given cube. That statement consists of: date, username, identifiers of the dimension (excluding virtual ones) members from the cube plus value, i.e.:
INSERT INTO Writebackjable '10/28/2005', 'OlapWB', ID_DIM1, ID_DIM2,...,ID_DIMN, Value. - such a vector identifies one cell in the cube.
Because the writeback mechanism can writeback into the same cell many times, the writeback table can have many records that refer to one cell in the cube. A mechanism (grouping and summing) displays the correct value after writeback. For example: original value in the cube: 10 the user wants the value to be 4 - so the value in the writeback table is "-6", so that the resulting value is: 10 - 6 = 4 - next - the user want the value to be 20 - so another value in the writeback table (in a new row) would be: "16", so that the resulting value is: 10 - 6 + 16 = 20
The writeback mechanism can execute a trigger to do something else with the writeback data, as configured beforehand.
In order to create an INSERT statement properly, the writeback mechanism needs the following data: name of the writeback table for the given cube, and - names of all dimensions and columns corresponding with the writeback table.
It is possible to read that data using a Decision Support Objects (DSO) mechanism - which is a COM library.
The writeback procedure is executed by data request objects 25 running in a multi¬ threaded model. A control mechanism creates a dedicated, separate thread passing multiplexed data between presentation objects and data request objects.
It will be appreciated that the invention provides for high speed retrieval of data even though there may be a large volume of data required for a request. This is primarily achieved because of:
the manner in which meta data is stored, retrieved, and processed in a hierarchical structure; the manner in which the writeback mechanism operates, particularly in the context of the meta data hierarchical structure; the manner in which access control restrictions are applied in the meta data retrieval process, involving simple subtraction operations with XML documents; separation of the writeback procedure from the external database; and operation of the cache , both for meta data and for the actual data retrieved from the database.
The invention is not limited to the embodiments described but may be varied in construction and detail.