CN104424018A - Distributed calculating transaction processing method and device


Publication number: CN104424018A
Authority: CN (China)
Legal status: Granted (active)
Application number: CN201310372973.3A
Other languages: Chinese (zh)
Other versions: CN104424018B (en)
Inventor: 方亮
Current assignee: Advanced New Technologies Co Ltd
Original assignee: Alibaba Group Holding Ltd

Abstract

The invention discloses a method and device for processing distributed computing transactions. The method comprises: receiving an execution command for a distributed computing transaction; looking up the process configuration file of the distributed computing transaction according to the execution command; determining, from the process configuration file, the processor corresponding to each execution step of the transaction and the execution order of those processors; and calling the processors in that order to execute the transaction. The method makes the execution flow of distributed computing transactions unified and standardized, saves development resources, lowers development difficulty and the technical demands on developers, and helps advance the rapid development and application of distributed computing technology.

Description

Distributed computing transaction processing method and device
Technical field
The application relates to the field of distributed computing technology, and in particular to a method and device for processing distributed computing transactions.
Background
A distributed computing transaction is a computing method in which the execution logic defined in the transaction allows a computing requirement to be completed as a whole. In the prior art, distributed computing transactions are implemented by writing code, but what code is written is up to its author: different people write different code, the code cannot be partially reused across implementations, and each piece of code can use only one computation model. This wastes resources to some extent, steadily raises the difficulty of developing distributed computations, demands too much programming knowledge of developers, and cannot keep pace with the rapid development of distributed computing technology.
Summary of the invention
An embodiment of the application provides a distributed computing transaction processing method that avoids wasted resources, lowers development difficulty, and keeps pace with the rapid development of distributed computing technology. The method comprises:
Receiving an execution command for a distributed computing transaction;
Looking up, according to the execution command, the process configuration file of the distributed computing transaction;
Determining, according to the process configuration file, the processor corresponding to each execution step of the distributed computing transaction and the execution order of the processors;
Calling the processors in the execution order to execute the distributed computing transaction.
Another embodiment of the application provides a distributed computing transaction processing device that avoids wasted resources, lowers development difficulty, and keeps pace with the rapid development of distributed computing technology. The device comprises:
A receiving module, for receiving an execution command for a distributed computing transaction;
A lookup module, for looking up, according to the execution command, the process configuration file of the distributed computing transaction;
A determining module, for determining, according to the process configuration file, the processor corresponding to each execution step of the distributed computing transaction and the execution order of the processors;
An execution module, for calling the processors in the execution order to execute the distributed computing transaction.
In the embodiments of the application, each execution step of a distributed computing transaction has its own corresponding processor. To execute a transaction, it is only necessary to find the corresponding process configuration file, determine from it which processors to call and in what order, and call them in that order. This makes the execution flow of distributed computing transactions unified and standardized, and avoids the wasted resources and development difficulty that arise when different developers write different code for the same transaction. Moreover, each processor can be reused across different distributed computing transactions: developers do not need to rewrite code to implement a different transaction, but only need to define a process configuration file, that is, configure the processors and their execution order. This saves development resources, lowers development difficulty and the technical demands on developers, and helps advance the rapid development and application of distributed computing technology. In addition, because transactions are executed by calling processors, computation models from several distributed computing environments can easily be used within one transaction, so that the strengths of multiple computation models can be combined to handle complex computing scenarios. Compared with the prior art, development is less difficult and the range of application is wider.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the application more clearly, the drawings used in describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the application; a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
Fig. 1 is a flowchart of the distributed computing transaction processing method in an embodiment of the application;
Fig. 2 is a schematic diagram of a specific example of registering distributed computing transactions in an embodiment of the application;
Fig. 3 is a schematic diagram of the processors corresponding to the execution steps of a distributed computing transaction in an embodiment of the application;
Fig. 4 is a schematic diagram of the filters in an embodiment of the application;
Fig. 5 is a schematic diagram of the processing when the distributed computing transaction is an online distributed computing transaction in an embodiment of the application;
Fig. 6 is a schematic diagram of the business flow of a distributed computation in an embodiment of the application;
Fig. 7 is a structural diagram of the distributed computing transaction processing device in an embodiment of the application;
Fig. 8 is a structural diagram of a specific example of the distributed computing transaction processing device in an embodiment of the application;
Fig. 9 is a structural diagram of a specific example of the distributed computing transaction processing device in an embodiment of the application.
Detailed description
To make the purpose, technical solutions, and advantages of the embodiments of the application clearer, the embodiments are described in further detail below with reference to the drawings. The illustrative embodiments and their descriptions serve to explain the application and are not intended to limit it.
To avoid wasted resources, lower development difficulty, and keep pace with the rapid development of distributed computing technology, the embodiments of the application provide a distributed computing transaction processing method. Each execution step of a distributed computing transaction is mapped to its own processor. When a transaction is executed, the corresponding process configuration file is found first; the processors to call and their execution order are determined from that file; and the processors are then called in that order to complete the transaction. This makes the execution flow of distributed computing transactions unified and standardized. As an example, a scorecard model roughly works as follows: several scorecard sub-models each produce a score, and a final algorithm combines all of the scores into one overall score. The steps required may include:
1. Prepare the data for each scorecard sub-model: a sub-model may draw on a single table or on several tables; in the latter case the tables are merged into one wide table, so that each scorecard sub-model has its own wide table;
2. Parallel processing: compute the score of each scorecard sub-model separately and write all scores to a single table A;
3. Statistical calculation: run a statistical calculation over the data in table A to obtain a final score together with a reference explanation of the scoring conclusion.
The application treats the scorecard model above as a distributed computing transaction. The processors corresponding to its execution steps may include:
1. Table join processor (P1): merges one or more tables and writes the result to a newly defined table;
2. Scorecard sub-model processor (P2): computes one scorecard sub-model and writes the result to a defined output table;
3. Statistics processor (P3): computes the overall score with the configured statistical algorithm and writes it to a specified output table.
The process configuration file of this distributed computing transaction is a flow definition, Flow1, made up of several steps: execute one or more P1 in sequence; execute P2 in a loop; execute P3.
Defining Flow1 together with P1, P2, and P3 not only makes it possible to implement other, similar scoring scenarios; P1 can also be used in model computations that are not scorecards, and P3 can support more, and more complex, statistical methods by extending its statistical algorithms.
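The scorecard flow can be sketched in a few lines of Python. This is an illustration only, not the patented implementation: the function names, the in-memory list-of-dict tables, the weighted-sum sub-model, and the use of the mean as the statistical algorithm are all assumptions made for the example.

```python
from statistics import mean
from typing import Dict, List

Row = Dict[str, float]


def p1_join(tables: List[List[Row]]) -> List[Row]:
    """P1 (assumed form): merge several tables row by row into one wide table."""
    wide = []
    for rows in zip(*tables):
        merged: Row = {}
        for row in rows:
            merged.update(row)
        wide.append(merged)
    return wide


def p2_submodel(wide: List[Row], weights: Row) -> List[float]:
    """P2 (assumed form): score every row of one sub-model as a weighted sum."""
    return [sum(row[k] * w for k, w in weights.items()) for row in wide]


def p3_statistics(table_a: List[List[float]]) -> List[float]:
    """P3 (assumed algorithm): combine the sub-model scores of each row by mean."""
    return [mean(scores) for scores in zip(*table_a)]


def flow1(submodels: List[dict]) -> List[float]:
    """Flow1: run P1 per sub-model, loop P2 over the sub-models, then run P3."""
    wide_tables = [p1_join(m["tables"]) for m in submodels]
    table_a = [p2_submodel(w, m["weights"])
               for w, m in zip(wide_tables, submodels)]
    return p3_statistics(table_a)
```

Because P1 and P3 know nothing about scorecards, the same functions could be placed in a different flow, which is the reuse the paragraph above describes.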
The concrete implementation of the distributed computing transaction processing method in the embodiments of the application is described in detail below.
Fig. 1 is a flowchart of the distributed computing transaction processing method in an embodiment of the application. As shown in Fig. 1, the method may comprise:
Step 101: receive an execution command for a distributed computing transaction;
Step 102: according to the received execution command, look up the process configuration file of the distributed computing transaction;
Step 103: according to the process configuration file found, determine the processor corresponding to each execution step of the distributed computing transaction and the execution order of the processors;
Step 104: call the processors in the determined execution order to execute the distributed computing transaction.
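Steps 101 to 104 amount to a small dispatch loop. The sketch below shows one possible shape in Python; the dictionaries `PROCESSORS` and `FLOW_CONFIGS`, the command name `run_scorecard`, and the dict-based context are hypothetical stand-ins for the process configuration file and the processors, not names taken from the application.

```python
from typing import Callable, Dict, List

# A processor here is any callable that takes and returns a context dict
# (an assumption made for this sketch).
Processor = Callable[[dict], dict]

# Hypothetical processor table.
PROCESSORS: Dict[str, Processor] = {
    "join": lambda ctx: {**ctx, "joined": True},
    "score": lambda ctx: {**ctx, "score": 42},
}

# Hypothetical mapping: execution command -> ordered processor names.
# In the described method this information comes from the process
# configuration file.
FLOW_CONFIGS: Dict[str, List[str]] = {
    "run_scorecard": ["join", "score"],
}


def handle_command(command: str) -> dict:
    """Receive a command (101), look up its configuration (102),
    resolve the processors and their order (103), and run them (104)."""
    config = FLOW_CONFIGS[command]                    # step 102
    ordered = [PROCESSORS[name] for name in config]   # step 103
    ctx: dict = {}
    for processor in ordered:                         # step 104
        ctx = processor(ctx)
    return ctx
```

Changing the transaction then means editing the configuration entry, not the driver.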
As the flow in Fig. 1 shows, the distributed computing transaction processing method in the embodiments of the application can serve as a specification and execution framework that defines, for offline or online distributed computing, how data preparation, data conversion, data computation, and data input and output are handled. It performs no computation itself; instead it provides the logic that drives computation, letting the computing units (processors) carry out a complete distributed computing transaction logic in the configured order. The method can be applied to running credit and risk policy models with a decision engine, for example in the cloud decision product AGDS (Alibaba General Decision Service), where it may be called the CE (Computing Engine) framework and is deployed in the production environment for data modeling and policy deployment.
In the embodiments of the application, the process configuration file of a distributed computing transaction determines which processors the transaction needs and the order in which they run; calling the processors in that order executes the transaction. This makes the execution flow of distributed computing transactions unified and standardized, and avoids the wasted resources and development difficulty that arise when different developers write different code for the same transaction. Each processor can also be reused across different distributed computing transactions: configuring the processors and their execution order in a process configuration file is enough to execute a different transaction, which saves development resources and lowers development difficulty. In addition, because transactions are executed by calling processors, computation models from several distributed computing environments can easily be used within one transaction, so development is less difficult and the range of application is wide.
In one embodiment, after an execution command for a distributed computing transaction is received, the process configuration file of the transaction is looked up according to that command. In implementation, the correspondence between distributed computing transactions and their process configuration files can be established in advance, so that the file can later be found through this correspondence; alternatively, the process configuration file can be determined on the fly during lookup from certain input information. In a preferred embodiment, the process configuration file is determined when the distributed computing transaction is registered. For example, before the execution command is received, the method further comprises a registration process: receiving a registration request for the distributed computing transaction and, according to the request, registering the execution command of the transaction and determining its process configuration file.
Fig. 2 is a schematic diagram of a specific example of registering distributed computing transactions in an embodiment of the application. In this example, the method is applied as the CE framework in the AGDS cloud decision product. As shown in Fig. 2, the CE framework is deployed in a distributed computing environment and includes a CE registration center with which distributed computing transactions register. For several distributed computing transactions (1 to 6) to run in the distributed computing environment, each must first be packaged and distributed to the CE registration center in the distributed computing environment for registration. Each distributed computing transaction can be regarded as a plug-in of the CE framework. A transaction registers its own execution commands with the CE registration center to obtain an entry point for execution; an external caller triggers the transaction by issuing the corresponding execution command. A transaction can have several execution commands, and each command can have its own parameters. When a transaction registers with the CE registration center, the center can determine the transaction's process configuration file. Then, once an execution command is received, the transaction to trigger can be identified, its process configuration file found, and the transaction executed according to that file.
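The registration step can be pictured as a lookup table from execution commands to transactions and their configuration files. The Python sketch below is a minimal in-memory illustration; the class names `Registry` and `Registration`, the method names, and the file path are hypothetical and do not describe the actual CE registration center.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Registration:
    transaction: str
    config_path: str                       # location of the process configuration file
    params: List[str] = field(default_factory=list)


class Registry:
    """Hypothetical in-memory stand-in for a registration center."""

    def __init__(self) -> None:
        self._by_command: Dict[str, Registration] = {}

    def register(self, transaction: str, config_path: str,
                 commands: Dict[str, List[str]]) -> None:
        # One transaction may register several commands, each with its
        # own parameter list.
        for command, params in commands.items():
            if command in self._by_command:
                raise ValueError(f"command already registered: {command}")
            self._by_command[command] = Registration(
                transaction, config_path, list(params))

    def lookup(self, command: str) -> Registration:
        """Find the transaction and configuration file behind a command."""
        return self._by_command[command]
```

Rejecting a duplicate command is a design choice of this sketch; the application does not say how conflicts are handled.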
The process configuration file defines the processor corresponding to each execution step of a distributed computing transaction and the execution order of the processors; it may be an XML (Extensible Markup Language) configuration file. In one embodiment, after the process configuration file of a transaction is found, it is distributed to the distributed computing environment. When the transaction is executed, the processor for each execution step and the execution order of the processors are first determined from the distributed process configuration file, for example by parsing, loading, and executing that file.
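The application does not give the XML schema, so the following Python sketch invents one purely for illustration: the element names `flow` and `step` and the attributes `order`, `processor`, and `loop` are assumptions. It shows how an execution order could be recovered from such a file with the standard library parser.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical schema; steps are deliberately listed out of order to show
# that the order comes from the 'order' attribute, not from file position.
FLOW_XML = """
<flow name="Flow1">
  <step order="2" processor="P2" loop="true"/>
  <step order="1" processor="P1"/>
  <step order="3" processor="P3"/>
</flow>
"""


def load_execution_order(xml_text: str) -> List[Tuple[str, bool]]:
    """Return (processor name, is_loop) pairs sorted by the 'order' attribute."""
    root = ET.fromstring(xml_text)
    steps = sorted(root.findall("step"), key=lambda s: int(s.get("order")))
    return [(s.get("processor"), s.get("loop") == "true") for s in steps]
```

A driver could then map each processor name to a processor object and call them in the returned order.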
A processor, which corresponds to one execution step of a distributed computing transaction, is an abstraction of a computing unit: it defines a processing unit that can process and convert batches of data. Each processor has its own characteristics, and data processing capability can be extended by extending processors. Different computing scenarios can have different processors, replacing some hand-written, purely computational code. For example, different computation models in a distributed computing environment can have different processors, such as:
A processor that executes DESQL (ODPS SQL, the data engine's Structured Query Language) or HSQL (Hive SQL): a processor that obtains an SQL script from a file or from configuration and executes it;
An MR (Map/Reduce) processor: a processor that can run Map/Reduce computations; it can define multiple inputs and outputs and specify the processing order of its built-in filters;
A Select processor: a processor that queries database tables to obtain data.
In one embodiment, some processors may contain filters. A filter is a processing unit that operates on a single row of data. A row is the smallest unit in distributed computing, and most problems in conventional numerical statistics can be reduced to computations on rows. A filter can perform operations on a row such as data output, algorithmic computation, and data conversion. In implementation, data processing capability can be extended by extending filters.
To keep a unified context across the heterogeneous servers of a distributed computation, so that the settings in the XML configuration file can take effect, and to hide the implementation differences between distributed computing environments such as hadoop and odps, in a preferred embodiment some processors can also encapsulate the distributed context by proxy, according to the heterogeneous servers or the different distributed computing environments involved. For example, an MR proxy processor can proxy every stage of Map/Reduce processing, including mapper (map), combine (merge), reduce, and group. The proxy approach implements subclasses of each processing class in the Map/Reduce process, redefines the implementation logic of certain key functions as needed, adds extension functions, and then exposes these subclasses as basic abstract parent classes. Other developers can then implement the functions of these subclasses without having to care how Map/Reduce actually runs across heterogeneous servers or in different distributed computing environments such as hadoop and odps.
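The proxy idea can be illustrated with an abstract parent class that owns the run loop while subclasses supply only the map and reduce logic. The Python sketch below is a single-process stand-in under stated assumptions: the class name `MapReduceProxy`, its method names, and the word-count subclass are hypothetical, and a real proxy would delegate to the platform's own mapper, combiner, reducer, and grouping classes instead of looping locally.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


class MapReduceProxy:
    """Hypothetical abstract parent: subclasses override map_row and
    reduce_group and never touch the execution backend."""

    def map_row(self, row: str) -> Iterable[Tuple[str, int]]:
        raise NotImplementedError

    def reduce_group(self, key: str, values: List[int]) -> int:
        raise NotImplementedError

    def run(self, rows: Iterable[str]) -> Dict[str, int]:
        # Local stand-in for the backend: map every row, group the
        # emitted pairs by key, then reduce each group.
        groups: Dict[str, List[int]] = defaultdict(list)
        for row in rows:
            for key, value in self.map_row(row):
                groups[key].append(value)
        return {k: self.reduce_group(k, v) for k, v in groups.items()}


class WordCount(MapReduceProxy):
    """Example subclass: counts words without knowing how run() works."""

    def map_row(self, row):
        return [(word, 1) for word in row.split()]

    def reduce_group(self, key, values):
        return sum(values)
```

Swapping the body of `run` for a call into a specific platform would leave `WordCount` unchanged, which is the point of hiding the environment behind the parent class.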
Fig. 3 is a schematic diagram of the processors corresponding to the execution steps of a distributed computing transaction in an embodiment of the application. Fig. 3 lists some processors, grouped by their characteristics:
1. Basic processor: defines the basic operation interface of a processor. Basic processors comprise filter processors and ordinary processors.
2. Filter processor: a processor that uses filters to process and convert data. There are two kinds, the list filter processor and the lookup-table filter processor. List filter processors include the MR proxy processor, which proxies the execution of MR computations and covers processors for specific Map/Reduce stages, such as the Mapper processor, the Reducer processor, and the Combiner processor.
3. Ordinary processor: in contrast to a filter processor, it does not use filters to process or convert data. It is a general-purpose processor that implements some of the basic processor operation interfaces. Ordinary processors can be extended into processors with special processing modes, including:
SQL processor: a processor that can execute Structured Query Language;
Query processor: a processor that can issue data queries against a particular storage medium, such as OTS (Open Table Service), Tair (a non-relational database), OceanBase (a distributed database), or a database. As shown in Fig. 3, query processors may include a database query processor, an OTS query processor, a Tair query processor, and an OceanBase query processor;
Data verification processor: a processor that can validate data by type, fluctuation rate, range, expression evaluation, and the like.
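As one illustration of the data verification processor, the Python sketch below composes type, range, and expression checks over rows. All names are hypothetical, the row format is assumed to be a dict, and fluctuation-rate checks are left out because the application does not define how the rate is computed.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Check = Callable[[Row], bool]


def type_check(column: str, expected: type) -> Check:
    """Pass when the column value has the expected type."""
    return lambda row: isinstance(row[column], expected)


def range_check(column: str, low: float, high: float) -> Check:
    """Pass when the column value lies within [low, high]."""
    return lambda row: low <= row[column] <= high


def expression_check(predicate: Callable[[Row], bool]) -> Check:
    """Pass when an arbitrary row-level expression holds."""
    return predicate


def verify(rows: List[Row], checks: List[Check]) -> List[int]:
    """Return the indexes of rows that fail at least one check.
    Checks run in list order and stop at the first failure, so a type
    check placed first protects later numeric comparisons."""
    return [i for i, row in enumerate(rows)
            if not all(check(row) for check in checks)]
```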
Fig. 4 is a schematic diagram of the filters in an embodiment of the application. As shown in Fig. 4, filters can be grouped by function as follows:
Basic filter: defines the basic operation interface of a filter. Basic filters comprise data conversion filters, data output filters, and algorithm filters, where:
Data conversion filter: comprises type conversion filters and column conversion filters. A type conversion filter converts the type of one or more columns in a row. A column conversion filter rearranges or expands one or more columns in a row; column conversion filters comprise the many-to-one, many-to-many, and one-to-many column conversion filters.
Data output filter: comprises the table output filter, the message output filter, and the remote call output filter. The table output filter writes a row to a data table; the message output filter sends a row to a message service; the remote call output filter calls an interface on a remote server to transmit a row.
Algorithm filter: comprises the decision engine filter, the distributed algorithm filter, and the script calculation filter. The decision engine filter computes on a row using a decision engine; the distributed algorithm filter computes on a row using a distributed algorithm; the script calculation filter computes on a row using a script in a specific language. Distributed algorithm filters comprise the Fourier algorithm filter, the decision tree algorithm filter, the logistic regression algorithm filter, and the scorecard algorithm filter.
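The column conversion filters lend themselves to a short sketch. In the Python below, a row is assumed to be a dict and a filter a function from row to row; the factory names `many_to_one` and `one_to_many` are hypothetical.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def many_to_one(sources: List[str], target: str,
                combine: Callable[..., object]) -> Callable[[Row], Row]:
    """Column conversion filter: fold several columns into one new column."""
    def apply(row: Row) -> Row:
        out = {k: v for k, v in row.items() if k not in sources}
        out[target] = combine(*(row[s] for s in sources))
        return out
    return apply


def one_to_many(source: str, targets: List[str],
                split: Callable[[object], List[object]]) -> Callable[[Row], Row]:
    """Column conversion filter: expand one column into several columns."""
    def apply(row: Row) -> Row:
        out = {k: v for k, v in row.items() if k != source}
        out.update(zip(targets, split(row[source])))
        return out
    return apply
```

Both factories return new rows instead of mutating their input, so the same filter can be applied to many rows safely.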
In one embodiment, after the processor for each execution step and the execution order of the processors have been determined, the processors are called in that order to execute the distributed computing transaction. A distributed computing transaction can be executed in two ways: with input parameters and a return value, or with neither. The former suits the real-time distributed computing mode, because real-time distributed computing usually needs the result immediately. The latter suits the offline distributed computing mode, whose results are usually materialized through other storage or delivery means, such as result tables or files. One example of the real-time mode is an online distributed computing transaction: one or more externally supplied events are passed to one or more servers, filtered and computed layer by layer according to the configured flow, and a result is finally returned. Another example is a distributed message processing transaction: an externally supplied message is passed, sequentially or in parallel, to one or more message servers for computation, and the result is finally forwarded. Examples of the offline mode include hadoop and odps. In implementation, if the transaction is an offline distributed computing transaction, the result data is stored after the transaction is processed; if it is a real-time distributed computing transaction, the input parameters it requires are obtained before processing, and the result data is output after processing.
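The two execution modes differ mainly in their calling convention, which the following Python sketch makes concrete. The function names and the dict used as a result store are assumptions for illustration only.

```python
from typing import Callable, Dict, List

Step = Callable[[dict], dict]


def run_realtime(steps: List[Step], params: dict) -> dict:
    """Real-time mode: takes input parameters and returns the result."""
    ctx = dict(params)
    for step in steps:
        ctx = step(ctx)
    return ctx


def run_offline(steps: List[Step], store: Dict[str, dict], key: str) -> None:
    """Offline mode: no input parameters and no return value; the result
    is written to a store (a dict stands in for a result table or file)."""
    ctx: dict = {}
    for step in steps:
        ctx = step(ctx)
    store[key] = ctx
```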
Fig. 5 is a schematic diagram of the processing when the distributed computing transaction is a real-time distributed computing transaction in an embodiment of the application. As shown in Fig. 5, a computation model is a distributed computing transaction executed in a distributed computing environment. After being packaged, the computation model is distributed to the CE registration center in the distributed computing environment and registered. The computation model JAR (Java Archive) contains scripts and an XML configuration file. The scripts may be computation code, rule scripts, and so on. The XML configuration file describes how the computation model runs, that is, its transaction logic in the distributed computation, including the processor for each execution step and the execution order of the processors. The JAR package may also contain physical resources such as resource files, which supply the data the scripts need at run time. Once triggered by an execution command, the computation model executes its processors one after another in the order set in the XML configuration file. The computation model can obtain the data and input parameters required for the computation from outside, and after the computation completes it can output data as configured. A processor can contain one or more data filtering processes, each with a list of filters; a row read by the processor is processed and converted layer by layer by the filters in the data filtering channel, and finally a new row is output.
Fig. 6 is a schematic diagram of the business flow of a distributed computation in an embodiment of the application. As shown in Fig. 6, starting from the first step, every step has a corresponding processor: the processor for step 1 joins multiple tables in DESQL and creates a temporary table; the processor for step 2 runs an MR, MPI (Message Passing Interface), or BSP (Bulk Synchronous Parallel) computation model, or performs rule-engine decision analysis; the processor for step 3 performs data verification for the MR, MPI, BSP, or SQL computation model and outputs the result; the processor for step 4 cleans up the temporary table in DESQL. Fig. 6 also shows that, in an embodiment of the application, executing a distributed computing transaction by calling processors makes it easy to use computation models from several distributed computing environments, such as DESQL and MR, in one transaction. The strengths of multiple computation models can thus be combined to handle complex computing scenarios. Compared with the prior art, development is less difficult and the range of application is wider, which helps advance the rapid development and application of distributed computing technology.
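The four-step flow of Fig. 6 can be imitated on a single machine. In the Python sketch below, SQLite stands in for DESQL and a plain Python ratio stands in for the MR, MPI, or BSP computation; the table names, column names, and the range used for verification are all invented for the example.

```python
import sqlite3
from typing import List, Tuple


def run_flow(conn: sqlite3.Connection) -> List[Tuple[int, float]]:
    cur = conn.cursor()
    # Step 1: join the source tables into a temporary table.
    cur.execute(
        "CREATE TEMP TABLE tmp_wide AS "
        "SELECT u.id AS id, u.income AS income, d.debt AS debt "
        "FROM users u JOIN debts d ON u.id = d.user_id"
    )
    try:
        # Step 2: compute (a debt-to-income ratio stands in for the model).
        rows = cur.execute(
            "SELECT id, income, debt FROM tmp_wide ORDER BY id").fetchall()
        results = [(rid, debt / income) for rid, income, debt in rows]
        # Step 3: verify the results before they are output.
        for rid, ratio in results:
            if not 0.0 <= ratio <= 10.0:
                raise ValueError(f"ratio out of range for id {rid}")
        return results
    finally:
        # Step 4: clean up the temporary table, even if a step failed.
        cur.execute("DROP TABLE tmp_wide")
```

Placing the cleanup in a `finally` block is a choice of this sketch, so that a failed verification does not leave the temporary table behind.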
As noted above, in one embodiment, when a processor is called and it contains filters that operate on rows, the filters can be run to perform data filtering. Depending on what the processor supports, filters can run in two ways: sequentially, that is, the filters are applied to a row in serial order; or in parallel as configured, that is, the filters are applied to rows in parallel.
A typical scenario for applying filters to a row in serial order is when the row needs several successive conversions. For example, a type conversion filter f1 can be combined with a table output filter f2: when a row is passed to f1, f1 converts certain fields of the row to the configured types and passes the row to f2, which writes it to the specified table. As another example, a decision engine filter f1 can be combined with a table output filter f2: when a row is passed to f1, f1 evaluates rules against the row, computes a result row, and passes the result to f2, which writes it to the specified table.
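The first serial example (type conversion followed by table output) might look as follows in Python. The filter factories and the list used as an output table are hypothetical stand-ins chosen for this sketch.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Filter = Callable[[Row], Row]


def type_conversion_filter(casts: Dict[str, type]) -> Filter:
    """f1: cast the named fields to the configured types."""
    def apply(row: Row) -> Row:
        return {k: casts[k](v) if k in casts else v for k, v in row.items()}
    return apply


def table_output_filter(table: List[Row]) -> Filter:
    """f2: append the row to a target table (a list stands in for a table)."""
    def apply(row: Row) -> Row:
        table.append(row)
        return row
    return apply


def run_serial(row: Row, filters: List[Filter]) -> Row:
    """Pass one row through the filters in list order."""
    for f in filters:
        row = f(row)
    return row
```

For instance, `run_serial(row, [type_conversion_filter({"id": int}), table_output_filter(out)])` converts the `id` field and then appends the converted row to `out`.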
Applying filters to rows in parallel can be used in decision scoring scenarios, where rows are computed in different ways and the results produced by the different ways are then aggregated and judged. For example, a Fourier algorithm filter f1, a logistic regression algorithm filter f2, a scorecard algorithm filter f3, and a table output filter f4 can be combined: when a row is passed to f1, f1 computes a series of indicator values and hands them to f2, which decides whether another pass is needed; if so, processing returns to f1; otherwise the indicator values go to f3 for an overall evaluation, and the evaluation result is handed to f4 and written to a data table. Such computations are usually run on many rows in parallel, with the overall evaluation finally left to f3, which is why the parallel mode is used.
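A simplified picture of the parallel mode is to score many rows concurrently with several algorithm filters and then let a judging function combine each row's scores. The Python sketch below uses a thread pool on one machine as a stand-in for distributed execution; the function names are hypothetical, and the loop back from f2 to f1 described above is not modelled.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

Row = Dict[str, float]
Scorer = Callable[[Row], float]


def score_row(row: Row, scorers: List[Scorer]) -> List[float]:
    """Apply every algorithm filter to one row."""
    return [scorer(row) for scorer in scorers]


def run_parallel(rows: List[Row], scorers: List[Scorer],
                 judge: Callable[[List[float]], float]) -> List[float]:
    """Score rows concurrently, then let the judging step combine the
    scores of each row into one value."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Executor.map returns results in input order even though the
        # rows are processed concurrently.
        per_row = list(pool.map(lambda r: score_row(r, scorers), rows))
    return [judge(scores) for scores in per_row]
```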
The embodiments of the present application can use the SDK (Software Development Kit) provided in a distributed computing environment to process distributed computing transactions. They can address transaction processing in both stand-alone and distributed computing environments; process, convert, and transmit data; and make the management of business processes configuration-driven. They are applicable to offline or online distributed computing processes, including distributed computing, distributed algorithms, and distributed rule engines. Distributed algorithms may be, for example, Fourier, logistic regression, or neural network algorithms; a distributed rule engine uses a specific rule engine mode, such as decision tables, decision trees, or rules, to perform distributed rule evaluation.
Because most distributed computing environments today, such as Hadoop and ODPS, are written in Java, the distributed computing transaction processing method of the embodiments of the present application may also be developed in Java. In implementation, each processor can run different scripts, such as Python or Scala scripts, and these scripts can be statically compiled into Java code. Other languages may of course be used as needed to develop the method. If a distributed computing platform is to run on Hadoop or ODPS, only the corresponding process configuration file needs to be configured; no code needs to be developed.
Based on the same inventive concept, the embodiments of the present application also provide a distributed computing transaction processing device, as described in the following embodiments. Because the device solves the problem on a principle similar to that of the distributed computing transaction processing method, its implementation may refer to the implementation of the method, and repeated details are omitted.
Fig. 7 is a schematic structural diagram of the distributed computing transaction processing device in an embodiment of the present application. As shown in Fig. 7, the device may include:
Receiving module 701, configured to receive an execution instruction of a distributed computing transaction. Receiving module 701 is the part of the device responsible for receiving instructions, requests, or information; it may be software, hardware, or a combination of the two, for example an input device or an input interface;
Lookup module 702, configured to look up the process configuration file of the distributed computing transaction according to the received execution instruction. Lookup module 702 is the part of the device responsible for looking up the process configuration file; it may be software, hardware, or a combination of the two, for example a component such as a processing chip that provides the lookup function;
Determination module 703, configured to determine, according to the process configuration file found, the processor corresponding to each execution step in the distributed computing transaction and the execution order of the processors. Determination module 703 is the part of the device responsible for determining the processors and their execution order; it may be software, hardware, or a combination of the two, for example a component such as a processing chip that provides the determination function;
Execution module 704, configured to call the processors in the determined execution order to execute the distributed computing transaction. Execution module 704 is the part of the device responsible for calling the processors to execute the distributed computing transaction; it may be software, hardware, or a combination of the two, for example a component such as a processing chip that provides the execution function.
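The cooperation of the four modules can be sketched as follows. This is a sketch under assumptions, not the device itself: the XML element and attribute names (`transaction`, `step`, `order`, `processor`), the instruction name `daily_score`, and the three toy processors are all hypothetical, since the text does not give the schema of the process configuration file.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

# Hypothetical process configuration file; the schema is assumed.
CONFIG_XML = """
<transaction name="daily_score">
  <step order="2" processor="score"/>
  <step order="1" processor="load"/>
  <step order="3" processor="store"/>
</transaction>
"""

# Reusable processors, keyed by the name used in the configuration file.
PROCESSORS: Dict[str, Callable[[dict], dict]] = {
    "load": lambda ctx: {**ctx, "rows": [1, 2, 3]},
    "score": lambda ctx: {**ctx, "score": sum(ctx["rows"])},
    "store": lambda ctx: {**ctx, "stored": True},
}

# Execution instruction -> process configuration file.
CONFIGS: Dict[str, str] = {"daily_score": CONFIG_XML}


def find_config(instruction: str) -> str:
    """Role of lookup module 702: find the configuration for an instruction."""
    return CONFIGS[instruction]


def determine_order(config_xml: str) -> List[str]:
    """Role of determination module 703: processors in execution order."""
    root = ET.fromstring(config_xml)
    steps = sorted(root.findall("step"), key=lambda s: int(s.get("order")))
    return [s.get("processor") for s in steps]


def execute(instruction: str) -> dict:
    """Roles of modules 701 and 704: accept the instruction, call processors."""
    ctx: dict = {}
    for name in determine_order(find_config(instruction)):
        ctx = PROCESSORS[name](ctx)
    return ctx


final_context = execute("daily_score")
```

A different transaction would reuse the same processors and differ only in its configuration file, which is the reuse argument the description makes.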
In this embodiment, the determination module uses the process configuration file of the distributed computing transaction to determine the processors needed to execute the transaction and the execution order of those processors, and the execution module calls the processors in that order to execute the transaction. This makes the execution process of distributed computing transactions unified and standardized, and avoids the waste of resources and the development difficulty that arise when different developers write different code for the same distributed computing transaction. Further, each processor can be reused across different distributed computing transactions; different transactions are executed by configuring the processors and their execution order in the process configuration file, which saves development resources and reduces development difficulty. In addition, because the execution module executes distributed computing transactions by calling processors, computation models from multiple distributed computing environments can be used in a transaction in a simple way, so development difficulty is low and the range of application is wide.
In a preferred embodiment, receiving module 701 may also be configured to receive a registration request for the distributed computing transaction before receiving the execution instruction of the distributed computing transaction;
As shown in Fig. 8, the distributed computing transaction processing device shown in Fig. 7 may further include:
Registration module 801, configured to register the execution instruction of the distributed computing transaction according to the registration request, and to determine the process configuration file of the distributed computing transaction. Registration module 801 is the part of the device responsible for registering distributed computing transactions; it may be software, hardware, or a combination of the two, for example a component such as a processing chip that provides the registration function.
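Registration before execution can be pictured with a small registry. The class name `TransactionRegistry`, its methods, and the example file path are illustrative assumptions; the text only states that the execution instruction is registered and the process configuration file is determined.

```python
from typing import Dict


class TransactionRegistry:
    """Hypothetical registry mapping execution instructions to config files."""

    def __init__(self) -> None:
        self._configs: Dict[str, str] = {}

    def register(self, instruction: str, config_path: str) -> None:
        """Register an execution instruction and fix its configuration file."""
        if instruction in self._configs:
            raise ValueError(f"already registered: {instruction}")
        self._configs[instruction] = config_path

    def lookup(self, instruction: str) -> str:
        """Return the configuration file of a registered instruction."""
        if instruction not in self._configs:
            raise LookupError(f"not registered: {instruction}")
        return self._configs[instruction]


registry = TransactionRegistry()
registry.register("daily_score", "flows/daily_score.xml")  # example path only
config_path = registry.lookup("daily_score")
```

With this arrangement, an execution instruction that was never registered fails at lookup time rather than partway through execution.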
As shown in Fig. 9, in one embodiment, the distributed computing transaction processing device shown in Fig. 7 may further include:
Distribution module 901, configured to distribute the process configuration file to the distributed computing environment after lookup module 702 has found the process configuration file of the distributed computing transaction. Distribution module 901 is the part of the device responsible for distributing the process configuration file; it may be software, hardware, or a combination of the two, for example a component such as a processing chip that provides the distribution function.
Determination module 703 may be specifically configured to determine, according to the process configuration file distributed to the distributed computing environment, the processor corresponding to each execution step in the distributed computing transaction and the execution order of the processors.
In one embodiment, execution module 704 may be specifically configured to:
If the distributed computing transaction is an offline distributed computing transaction, store the transaction processing result data after the transaction is processed;
If the distributed computing transaction is a real-time distributed computing transaction, obtain the input parameters required for transaction processing before the transaction is processed, and output the transaction processing result data after the transaction is processed.
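The difference between the two transaction types can be sketched as a single dispatch function. The mode strings `"offline"` and `"realtime"`, the function `run_transaction`, and the toy `score` processor are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Optional

Params = Dict[str, int]
Result = Dict[str, int]


def run_transaction(
    mode: str,
    process: Callable[[Params], Result],
    storage: List[Result],
    params: Optional[Params] = None,
) -> Optional[Result]:
    """Offline: process, then store. Real-time: take input, process, output."""
    if mode == "offline":
        result = process({})
        storage.append(result)  # result data is stored, not returned
        return None
    if mode == "realtime":
        if params is None:
            raise ValueError("a real-time transaction needs input parameters")
        return process(params)  # result data is output to the caller
    raise ValueError(f"unknown mode: {mode}")


def score(params: Params) -> Result:
    """Toy processing step: double the 'amount' parameter (default 0)."""
    return {"score": params.get("amount", 0) * 2}


stored: List[Result] = []
offline_output = run_transaction("offline", score, stored)
realtime_output = run_transaction("realtime", score, stored, {"amount": 21})
```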
In order to maintain a unified context among the heterogeneous servers in distributed computing, so that the settings in the XML configuration file can be carried out, and in order to shield the implementation differences between distributed computing environments such as Hadoop and ODPS, in a preferred embodiment each processor may include: a processor that encapsulates the distributed context by means of a proxy, according to the heterogeneous servers in the distributed computation or the different distributed computing environments.
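One way to picture the proxy-based encapsulation is shown below. The two context classes are stand-ins invented for this sketch and do not reproduce the real Hadoop or ODPS APIs; `ContextProxy`, `wrap`, and the key `output.table` are likewise hypothetical. The point is only that a processor written against the proxy does not need to know which environment it runs in.

```python
from typing import Callable, Dict


class HadoopStyleContext:
    """Stand-in for one environment's native context (not the real API)."""

    def __init__(self, conf: Dict[str, str]) -> None:
        self._conf = conf

    def get_conf(self, key: str) -> str:
        return self._conf[key]


class OdpsStyleContext:
    """Stand-in for another environment with a different method name."""

    def __init__(self, settings: Dict[str, str]) -> None:
        self._settings = settings

    def read_setting(self, name: str) -> str:
        return self._settings[name]


class ContextProxy:
    """Uniform context seen by processors, whatever the environment."""

    def __init__(self, getter: Callable[[str], str]) -> None:
        self._getter = getter

    def get(self, key: str) -> str:
        return self._getter(key)


def wrap(native: object) -> ContextProxy:
    """Pick the native accessor that matches the environment."""
    if isinstance(native, HadoopStyleContext):
        return ContextProxy(native.get_conf)
    if isinstance(native, OdpsStyleContext):
        return ContextProxy(native.read_setting)
    raise TypeError(f"unsupported context: {type(native).__name__}")


def output_table_processor(ctx: ContextProxy) -> str:
    """A processor written once against the proxy."""
    return "writing to " + ctx.get("output.table")


on_hadoop = output_table_processor(wrap(HadoopStyleContext({"output.table": "t1"})))
on_odps = output_table_processor(wrap(OdpsStyleContext({"output.table": "t2"})))
```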
In one embodiment, execution module 704 may be specifically configured to:
When each processor is called for execution, if a processor includes a filter, run the filter to perform data filtering, where the filter is a processing unit that performs calculations on data rows.
In one embodiment, execution module 704 may be specifically configured to:
When the filter is run to perform data filtering, apply data filtering to the data rows in serial order, or apply data filtering to the data rows in parallel.
In summary, in the embodiments of the present application, each execution step in a distributed computing transaction has its own corresponding processor. To execute a distributed computing transaction, it is only necessary to find the corresponding process configuration file, determine from it the processors to be called and their execution order, and call those processors in that order. This makes the execution flow of distributed computing transactions unified and standardized, and avoids the waste of resources and the development difficulty that arise when different developers write different code for the same distributed computing transaction. Moreover, each processor can be reused across different distributed computing transactions. Developers do not need to rewrite code to implement a different transaction; they only need to determine the process configuration file, that is, to configure the processors and their execution order. This saves development resources, lowers development difficulty and the technical requirements placed on developers, and helps advance the rapid development and application of distributed computing technology. In addition, because distributed computing transactions are executed by calling processors, computation models from multiple distributed computing environments can be used in a transaction in a simple way, and the strengths of multiple computation models can be combined to complete composite computations and adapt to complex computing scenarios. Compared with the prior art, development difficulty is lower and the range of application is wider.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.
The application is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The specific embodiments described above further explain the purpose, technical solutions, and beneficial effects of the present application. It should be understood that the foregoing is only a description of specific embodiments of the application and is not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the application shall fall within the scope of protection of the application.

Claims (10)

1. A distributed computing transaction processing method, characterized by comprising:
receiving an execution instruction of a distributed computing transaction;
looking up a process configuration file of the distributed computing transaction according to the execution instruction;
determining, according to the process configuration file, the processor corresponding to each execution step in the distributed computing transaction and the execution order of the processors;
calling the processors in the execution order to execute the distributed computing transaction.
2. The method of claim 1, characterized in that, before receiving the execution instruction of the distributed computing transaction, the method further comprises: receiving a registration request for the distributed computing transaction; and, according to the registration request, registering the execution instruction of the distributed computing transaction and determining the process configuration file of the distributed computing transaction.
3. The method of claim 1, characterized in that, after the process configuration file of the distributed computing transaction is found, the method further comprises: distributing the process configuration file to a distributed computing environment;
determining, according to the process configuration file, the processor corresponding to each execution step in the distributed computing transaction and the execution order of the processors comprises: determining, according to the process configuration file distributed to the distributed computing environment, the processor corresponding to each execution step in the distributed computing transaction and the execution order of the processors.
4. The method of claim 1, characterized in that calling the processors in the execution order to execute the distributed computing transaction comprises:
if the distributed computing transaction is an offline distributed computing transaction, storing the transaction processing result data after the transaction is processed;
if the distributed computing transaction is a real-time distributed computing transaction, obtaining the input parameters required for transaction processing before the transaction is processed, and outputting the transaction processing result data after the transaction is processed.
5. The method of claim 1, characterized in that each processor comprises: a processor that encapsulates the distributed context by means of a proxy, according to the heterogeneous servers in the distributed computation or the different distributed computing environments.
6. The method of any one of claims 1 to 5, characterized in that calling the processors in the execution order to execute the distributed computing transaction comprises:
when each processor is called for execution, if a processor includes a filter, running the filter to perform data filtering, the filter being a processing unit that performs calculations on data rows.
7. A distributed computing transaction processing device, characterized by comprising:
a receiving module, configured to receive an execution instruction of a distributed computing transaction;
a lookup module, configured to look up a process configuration file of the distributed computing transaction according to the execution instruction;
a determination module, configured to determine, according to the process configuration file, the processor corresponding to each execution step in the distributed computing transaction and the execution order of the processors;
an execution module, configured to call the processors in the execution order to execute the distributed computing transaction.
8. The device of claim 7, characterized by further comprising:
a distribution module, configured to distribute the process configuration file to a distributed computing environment after the lookup module has found the process configuration file of the distributed computing transaction;
the determination module being specifically configured to determine, according to the process configuration file distributed to the distributed computing environment, the processor corresponding to each execution step in the distributed computing transaction and the execution order of the processors.
9. The device of claim 7, characterized in that each processor comprises: a processor that encapsulates the distributed context by means of a proxy, according to the heterogeneous servers in the distributed computation or the different distributed computing environments.
10. The device of any one of claims 7 to 9, characterized in that the execution module is specifically configured to:
when each processor is called for execution, if a processor includes a filter, run the filter to perform data filtering, the filter being a processing unit that performs calculations on data rows.
CN201310372973.3A 2013-08-23 2013-08-23 Distributed Calculation transaction methods and device Active CN104424018B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310372973.3A CN104424018B (en) 2013-08-23 2013-08-23 Distributed Calculation transaction methods and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310372973.3A CN104424018B (en) 2013-08-23 2013-08-23 Distributed Calculation transaction methods and device

Publications (2)

Publication Number Publication Date
CN104424018A true CN104424018A (en) 2015-03-18
CN104424018B CN104424018B (en) 2018-02-16

Family

ID=52973089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310372973.3A Active CN104424018B (en) 2013-08-23 2013-08-23 Distributed Calculation transaction methods and device

Country Status (1)

Country Link
CN (1) CN104424018B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020087366A1 (en) * 2000-12-30 2002-07-04 Collier Timothy R. Tentative-hold-based protocol for distributed transaction processing
CN101072282A (en) * 2006-02-03 2007-11-14 株式会社理光 Image processor and image processing method
CN101848210A (en) * 2009-03-24 2010-09-29 奥林巴斯株式会社 Distributed processing system(DPS)
CN102073540A (en) * 2010-12-15 2011-05-25 北京新媒传信科技有限公司 Distributed affair submitting method and device thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
耿骞,韩圣龙,傅湘玲: "《信息系统分析与设计 (第二版)》", 30 January 2008 *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106095391B (en) * 2016-05-31 2019-03-26 携程计算机技术(上海)有限公司 Calculation method and system based on big data platform and algorithm model
CN106502720A (en) * 2016-09-26 2017-03-15 海尔优家智能科技(北京)有限公司 A kind of data processing method and device
CN106502720B (en) * 2016-09-26 2019-11-08 海尔优家智能科技(北京)有限公司 A kind of data processing method and device
CN108063680A (en) * 2016-11-09 2018-05-22 深圳市太易云互联科技有限公司 Resource allocation control method and device
CN106651568A (en) * 2016-12-26 2017-05-10 中国建设银行股份有限公司 Business processing method and device
CN107832125A (en) * 2017-10-10 2018-03-23 中国银联股份有限公司 Method for processing business and device under a kind of distributed environment
CN107844363A (en) * 2017-10-27 2018-03-27 东软集团股份有限公司 Business transaction processing method, device, storage medium and equipment
CN107844363B (en) * 2017-10-27 2020-08-28 东软集团股份有限公司 Business transaction processing method, device, storage medium and equipment
CN108924184A (en) * 2018-05-31 2018-11-30 阿里巴巴集团控股有限公司 data processing method and server
CN108924184B (en) * 2018-05-31 2022-02-25 创新先进技术有限公司 Data processing method and server
CN109002462B (en) * 2018-06-04 2020-11-27 北京明朝万达科技股份有限公司 Method and system for realizing distributed transaction
CN109002462A (en) * 2018-06-04 2018-12-14 北京明朝万达科技股份有限公司 A kind of method and system for realizing distributed things
CN109189468A (en) * 2018-08-06 2019-01-11 北京马上慧科技术有限公司 A kind of access of examination & approval data source configurationization and XML map configurationization system
CN109189468B (en) * 2018-08-06 2022-12-30 北京马上慧科技术有限公司 Examination and approval data source configuration access and XML mapping configuration system
CN110327625A (en) * 2019-07-08 2019-10-15 网易(杭州)网络有限公司 Processing method, device, processor, terminal and the server of file
CN110327625B (en) * 2019-07-08 2023-07-21 网易(杭州)网络有限公司 File processing method, device, processor, terminal and server
CN110597602A (en) * 2019-09-17 2019-12-20 北京字节跳动网络技术有限公司 Transaction processing method and device, computer equipment and storage medium
CN112765152A (en) * 2019-11-05 2021-05-07 北京京东振世信息技术有限公司 Method and apparatus for merging data tables
CN112765152B (en) * 2019-11-05 2024-04-12 北京京东振世信息技术有限公司 Method and apparatus for merging data tables

Also Published As

Publication number Publication date
CN104424018B (en) 2018-02-16

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20191224

Address after: P.O. Box 31119, grand exhibition hall, hibiscus street, 802 West Bay Road, Grand Cayman, British Cayman Islands

Patentee after: Innovative advanced technology Co., Ltd

Address before: Greater Cayman, British Cayman Islands

Patentee before: Alibaba Group Holding Co., Ltd.