CN103645864A - Magnetic disc array dual-control system and realization method thereof - Google Patents
- Publication number
- CN103645864A (application CN201310733225.3A)
- Authority
- CN
- China
- Prior art keywords
- controller
- link
- message
- port
- sends
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention discloses a magnetic disc array dual-control system and a realization method thereof. The realization method comprises the following steps: S1, configuring dual-control interconnection ports for two controllers; S2, establishing a dual-control memory map of the two controllers; S3, adopting a dual-control link mechanism to establish a dual-control link; S4, adopting a dual-control information mechanism to realize information passing between the two controllers. Due to the adoption of the scheme, the dual-control link mechanism is adopted to establish the dual-control link, then the dual-control information mechanism is adopted to realize information passing, so as to combine two independent array controllers into the magnetic disc array dual-control system through which dual-control redundancy fault tolerance, I/O flow load balancing and the like are realized. Accordingly, the data storage risk is reduced to the minimum to ensure the high reliability, safety and feasibility of the system, therefore, the system has a very high market application value.
Description
Technical field
The present invention relates to disk arrays, and in particular to a dual-control disk array system and a method for implementing it.
Background
In recent years, with the rapid development of computer technology, disk array performance has improved substantially and disk arrays have become increasingly common as carriers of stored data. For systems that demand high availability and high security, in industries such as finance, post and telecommunications, electric power, insurance, and securities, the market places ever higher requirements on the fault tolerance of disk array systems.
The core of a disk array storage system is the array controller. The array controller is built from high-speed general-purpose computer components; for example, its hardware includes a CPU, a cache, and Fibre Channel (FC) interfaces. It mainly handles the store-and-forward of data and the management of the whole array. The array controller is the data path between the host and the disks, with a host-facing interface and a disk-facing interface. It processes information from the host and the disks and, by operating on multiple member disks in parallel, delivers a transfer rate far higher than that of a single disk, balancing the data rate between the front end and the disk devices. The software platform of an array controller is usually a small, fast, and efficient embedded real-time operating system.
As the core of the disk array system, the disk array controller directly determines the availability of the disk array. A single-controller disk array has only one disk array controller; once that controller fails, the system goes down, and in the worst case destructive data corruption occurs and business can no longer continue.
The prior art therefore has defects and needs improvement.
Summary of the invention
The technical problem to be solved by the present invention is to provide a new dual-control disk array system and a method for implementing it.
The technical solution of the present invention is as follows: a method for implementing a dual-control disk array system, comprising the following steps: S1, configuring dual-control interconnect ports for two controllers; S2, establishing the dual-control memory mapping of the two controllers; S3, establishing a dual-control link using a dual-control link mechanism; S4, passing messages between the two controllers using a dual-control message mechanism.
Preferably, in step S1 of the method, two states are defined for each of the first controller 0 and the second controller 1, a link-ready state and a ready-acknowledge state, giving four states in total.
Preferably, in step S1 of the method, several partitions and their ports are configured, and each partition connects to a CPU and to a data channel through different ports.
Preferably, in step S2 of the method, the peer's memory is mapped to the local end so that the peer's memory can be read and written.
Preferably, in step S4 of the method, the message mechanism comprises several messages, each with a function state and an acknowledge state, used to implement data reads and writes in the dual-control disk array system.
Preferably, in the method, the message mechanism comprises the following interrupts: send command descriptor block, command descriptor block acknowledge, send data, data acknowledge, send exchange, exchange acknowledge, send small exchange, small exchange acknowledge, send write-cache push, write-cache push acknowledge, get buffer pointer, get message buffer pointer, and buffer acknowledge.
Preferably, in the method, after step S3 and before step S4, the following step S40 is also performed: each of the two ends obtains the peer's buffer address and message format region address, so that the peer can obtain the contents of the local buffer. Preferably, step S40 comprises the following steps: S401, either local end prepares its local buffer address, sends a request-buffer interrupt message to the peer, and requests the peer's buffer address and message format region address; S402, the peer processes the request-buffer interrupt message and then sends an acknowledge-buffer interrupt message to the local end; S403, the local end receives the acknowledge-buffer interrupt message and confirms that both the local end and the peer have finished processing the buffers.
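Steps S401 to S403 can be illustrated with a minimal Python sketch. The class `Endpoint`, the function `exchange_buffers`, the tags `REQ_BUFFER` and `ACK_BUFFER`, and the idea that the acknowledge carries the peer's two addresses are all assumptions made for illustration; the text does not specify the message layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Endpoint:
    """One controller end. All field names are illustrative assumptions."""
    buffer_addr: int
    msg_format_addr: int
    peer_buffer_addr: Optional[int] = None
    peer_msg_format_addr: Optional[int] = None


def exchange_buffers(local: Endpoint, peer: Endpoint) -> bool:
    # S401: the local end prepares its buffer address and sends the request.
    request = ("REQ_BUFFER", local.buffer_addr, local.msg_format_addr)

    # S402: the peer processes the request, then answers with an acknowledge
    # (assumed here to carry the peer's own addresses).
    _, peer.peer_buffer_addr, peer.peer_msg_format_addr = request
    ack = ("ACK_BUFFER", peer.buffer_addr, peer.msg_format_addr)

    # S403: the local end receives the acknowledge; both ends are now set up.
    _, local.peer_buffer_addr, local.peer_msg_format_addr = ack
    return None not in (local.peer_buffer_addr, peer.peer_buffer_addr)
```

After the call, each end holds the other's buffer address and message format region address, which is the precondition the text gives for step S4.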
Preferably, in the method, step S3 comprises the following steps: determining whether each of the two controllers has started; if both have started, executing the wait flow to bring the two controllers together at the link point as the two controllers of the dual-control disk array system, and then executing the link flow; if only one has started, handling it as the single-controller startup state.
The wait flow is as follows: when one controller reaches the link point and determines that the other controller has not yet reached it, it waits for a preset period for the other controller to reach the link point; if the other controller has not reached the link point within the preset period, this is handled as a timeout.
The single-controller startup state is handled as follows: before the controller that has started reaches the link point, it determines whether the peer controller has started; if so, it enters the wait flow, re-reads the general-purpose input/output hardware information, forms the dual-control disk array system, and then enters the dual-control flow; otherwise it runs as a single-controller system and executes the link flow after a peer controller is inserted.
The link flow comprises the following steps:
S31, the first controller 0 sends its link-ready state to the first output message register of the non-transparent bridge port of the second controller 1, and sends the link detection count to the third output register of the non-transparent bridge port of the second controller 1;
S32, through the message mechanism, the first input message register and the third input register of the root port of the second controller 1 read the received information, namely the link-ready state of the first controller 0 and the link detection count;
S33, when the received information is correct, the second controller 1 sends its link-ready state to the first output message register of the non-transparent bridge port of the first controller 0, decrements the link detection count by 1, and sends it to the third output register of the non-transparent bridge port of the first controller 0;
S34, through the message mechanism, the first input message register and the third input register of the root port of the first controller 0 read the received information, namely the link-ready state of the second controller 1 and the link detection count; it is then determined whether the link detection count is 0; if not, execution returns to step S31; if so, step S35 is executed;
S35, the first controller 0 sends its link acknowledge to the second controller 1; after receiving it, the second controller 1 sends its own link acknowledge to the first controller 0; the first controller 0 ends the link flow after receiving the link acknowledge of the second controller 1.
Another technical solution of the present invention is as follows: a dual-control disk array system comprising a first controller 0 and a second controller 1; each of the two controllers is provided with dual-control interconnect ports; a dual-control memory mapping is established between the two controllers; a dual-control link is established between the two controllers using a dual-control link mechanism; and messages are passed between the two controllers using a dual-control message mechanism.
With this solution, the present invention establishes a dual-control link using the dual-control link mechanism and then passes messages using the dual-control message mechanism, organizing two independent array controllers into a dual-control disk array system. The system can provide dual-control redundancy and fault tolerance, read/write traffic load balancing, and similar functions, reducing the risk to stored data and ensuring high reliability, high security, and high availability, and therefore has high market value.
Brief description of the drawings
Fig. 1 is a schematic diagram of the architecture of one embodiment of the dual-control system of the present invention;
Fig. 2 is a schematic diagram of direct address translation in one embodiment of the present invention;
Fig. 3 is a schematic diagram of lookup table address translation in one embodiment of the present invention;
Fig. 4 is a schematic diagram of TLP address translation in one embodiment of the present invention;
Fig. 5 is a schematic diagram of the NT mapping table in one embodiment of the present invention;
Fig. 6 is a schematic diagram of dual-control link preparation in one embodiment of the present invention;
Fig. 7 is a schematic diagram of message register mapping in one embodiment of the present invention;
Fig. 8 is a schematic diagram of doorbell message passing in one embodiment of the present invention;
Fig. 9 is a schematic diagram of the dual-control message flow in one embodiment of the present invention;
Fig. 10 is a schematic diagram of one embodiment of the implementation method of the present invention.
Detailed description of the embodiments
To make the present invention easier to understand, it is described in more detail below with reference to the drawings and specific embodiments. The drawings show preferred embodiments of the invention. The invention can, however, be implemented in many different forms and is not limited to the embodiments described in this specification. Rather, these embodiments are provided so that the disclosure of the invention is understood more thoroughly and completely.
It should be noted that when an element is said to be "fixed to" another element, it may be directly on the other element, or an intervening element may be present. When an element is said to be "connected to" another element, it may be directly connected to the other element, or an intervening element may be present. The terms "vertical", "horizontal", "left", "right", and similar expressions used in this specification are for illustration only.
Unless otherwise defined, all technical and scientific terms used in this specification have the meaning commonly understood by those skilled in the art to which the invention belongs. The terms used in this specification serve only to describe specific embodiments and are not intended to limit the invention. The term "and/or" as used in this specification includes any and all combinations of one or more of the associated listed items.
In the following, based on the hardware design architecture, the port configuration of an IDT (Integrated Device Technology, Inc.) chip is used as an example to describe the EEPROM (Electrically Erasable Programmable Read-Only Memory) configuration process. It should be noted that, besides IDT chips, those skilled in the art can equally apply this patent to other related chips in accordance with this specification, and the IDT chip should not be regarded as limiting the scope of protection claimed by this patent. The description then covers how the dual-control memory mapping is set up through PCIe (Peripheral Component Interconnect Express) TLP translation, the design and implementation of the dual-control link mechanism, and the design and implementation of the dual-control message mechanism.
It should be noted that the present invention covers the complete system that takes the two controllers from configuration, through normal linking, to exchanging messages with each other. Other functions, such as the detailed flow of dual-control read and write operations, the concrete implementation of Active-Active operation, and failover Take-Over, can be implemented on the basis of the present invention with reference to the prior art. The foundation of all these functions, however, is the system and method of the present invention, and they are implemented on top of the dual-control message mechanism.
The implementation of the dual-control disk array system (the "dual-control system" for short) consists mainly of the following parts: dual-control interconnect port configuration, dual-control memory mapping, the dual-control link mechanism, and the dual-control message mechanism. As shown in Fig. 10, one embodiment of the method of the present invention is a method for implementing a dual-control disk array system, comprising the following steps: S1, configuring dual-control interconnect ports for two controllers; S2, establishing the dual-control memory mapping of the two controllers; S3, establishing a dual-control link using a dual-control link mechanism; S4, passing messages between the two controllers using a dual-control message mechanism.
The embodiments and details of each step are described below.
Step S1: configure dual-control interconnect ports for the two controllers. Preferably, in step S1, several partitions and their ports are configured, and each partition connects to a CPU and to a data channel through different ports. For example, to interconnect the two controllers, the ports of the two IDT chips must be connected through the backplane, so the ports on each IDT chip need to be configured as follows:
(1) Configure partition 0: port 0 connects to CPU0 and port 8 connects to the FC (Fibre Channel) card; these form one group.
(2) Configure partition 1: port 2 connects to CPU1 and port 16 connects to the SAS (Serial Attached SCSI) card; these form one group.
(3) Configure partition 2: port 6 connects to the peer's Port 6, forming the dual-control interconnect.
Here, because port 0 connects to the CPU, it is the upstream NTB (non-transparent bridge) port and is configured as a UP+NTB+DMA (Direct Memory Access) port; port 6 connects to the peer port and is configured as an NTB port.
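The partition and port layout above can be written down as a small data structure. The following Python sketch is only an illustration: the names `PortConfig`, `PORTS`, `ports_in_partition`, and `interconnect_port` are hypothetical, and ports whose mode the text does not state are marked "unspecified" rather than guessed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PortConfig:
    port: int
    partition: int
    attached_to: str
    mode: str = "unspecified"  # the text gives a mode only for ports 0 and 6


PORTS = [
    PortConfig(0, 0, "CPU0", "UP+NTB+DMA"),
    PortConfig(8, 0, "FC card"),
    PortConfig(2, 1, "CPU1"),
    PortConfig(16, 1, "SAS card"),
    PortConfig(6, 2, "peer Port 6", "NTB"),
]


def ports_in_partition(partition: int) -> List[int]:
    """Return the port numbers grouped into the given partition."""
    return sorted(p.port for p in PORTS if p.partition == partition)


def interconnect_port() -> PortConfig:
    """Return the single port configured purely as NTB (the peer link)."""
    (port,) = [p for p in PORTS if p.mode == "NTB"]
    return port
```

For instance, `ports_in_partition(0)` yields ports 0 and 8, and `interconnect_port()` yields the entry for port 6 in partition 2.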
Step S2: establish the dual-control memory mapping of the two controllers. Preferably, the peer's memory is mapped to the local end so that the peer's memory can be read and written. Once step S1 has configured the dual-control interconnect ports for the two controllers, the peer's memory must also be mapped to the local end so that it can be accessed, that is, read and written. Each NTB port of the IDT chip has six BARs (Base Address Registers), BAR0 to BAR5, which can be combined in pairs to form 64-bit BARs; for example, when the CPU is a 64-bit Cavium MIPS64 CPU, the BARs must be used in pairs.
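How two 32-bit BARs form one 64-bit BAR can be shown with a short Python sketch. It relies on the standard PCI encoding of a memory BAR (bit 0 is 0 for memory, bits 2:1 equal to binary 10 mark a 64-bit BAR, and the low four bits are flags rather than address bits); the function name `combine_bar_pair` is hypothetical.

```python
def combine_bar_pair(bar_low: int, bar_high: int) -> int:
    """Return the 64-bit base address held in two consecutive 32-bit BARs."""
    if bar_low & 0x1:
        raise ValueError("an I/O BAR cannot be a 64-bit BAR")
    if (bar_low >> 1) & 0x3 != 0x2:
        raise ValueError("the low BAR is not marked as a 64-bit memory BAR")
    # The low four bits of the low BAR are flags, not address bits.
    return ((bar_high & 0xFFFFFFFF) << 32) | (bar_low & 0xFFFFFFF0)
```

With the low BAR holding 0x0000000C (64-bit, prefetchable) and the high BAR holding 0x00000001, the combined base address is 1:0000:0000 in the notation used in this description.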
TLP (Transaction Layer Packet) translation is then performed. There are two modes: direct address translation and lookup table address translation. Direct address translation is chosen for a contiguous address range; for non-contiguous addresses, lookup table translation is generally chosen.
Direct address translation is shown in Fig. 2. Each NTB port has two kinds of base address: the base address (bar2: 0x18) and the translated base address (bar2: 0x498). With direct address translation, the address of the transaction to be translated is split into a base address and an offset; the base address is replaced by the translated base address held in the translated base address register, and combining it with the offset gives the translated address.
Direct address translation can map the peer's memory addresses directly onto local PCIe addresses, so that accessing a local PCIe address accesses the peer's memory. The address range must be contiguous, however. It is set up as follows.
Enable the local Port0 with 64-bit addressing and forward TLPs to partition 2. Set the PCIe base address of Port0 to 1:0000:0000 and the translated base address to 2:0000:0000, so that TLPs are sent from Port0 to the peer's Port6.
Enable the local Port6 with 64-bit addressing and forward TLPs to partition 0. Set the PCIe base address of Port6 to 2:0000:0000 and the translated base address to 0:0000:0000, so that TLPs arriving through Port6 are delivered to memory.
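The two translation stages can be traced with a small Python sketch. The base addresses are the ones given above; the 4 GB window size is an assumption (it is stated only for the lookup table setup), and the function names are hypothetical.

```python
WINDOW_SIZE = 4 << 30  # assumed 4 GB window


def direct_translate(addr: int, base: int, translated_base: int,
                     size: int = WINDOW_SIZE) -> int:
    """Replace the window base with the translated base, keeping the offset."""
    offset = addr - base
    if not 0 <= offset < size:
        raise ValueError("address is outside the BAR window")
    return translated_base + offset


def local_to_peer_memory(addr: int) -> int:
    # Stage 1, local Port0: 1:0000:0000 -> 2:0000:0000
    on_link = direct_translate(addr, 0x1_0000_0000, 0x2_0000_0000)
    # Stage 2, peer Port6: 2:0000:0000 -> 0:0000:0000 (peer memory)
    return direct_translate(on_link, 0x2_0000_0000, 0x0_0000_0000)
```

Under these assumptions, a local access to 1:0000:1000 ends up at offset 0x1000 in the peer's memory.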
Lookup table address translation is shown in Fig. 3. It maps address spaces that are non-contiguous, or that are not on a single IDT chip, onto local addresses through a table, giving access to contiguous or non-contiguous address spaces on one or more IDT chips.
Lookup table address translation comes in two variants, 12-entry and 24-entry. For the same translation window size, the 24-entry table can reach more non-contiguous regions, although each region accessed is smaller. In this example a 12-entry LUT is sufficient.
Both ports are set to 12-entry LUT mode. Field [11:12]: 0 = direct, 1 = 12-entry, 2 = 24-entry, 3 = reserved.
Enable Port0 with 64-bit addressing, forwarding TLPs to partition 2, with a size of 4 GB and base address 1:0000:0000. Enable Port6 with 64-bit addressing, forwarding TLPs to partition 0, with a size of 4 GB and base address 2:0000:0000.
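The idea behind lookup table translation can be sketched in Python as follows. This is a generic model, not the chip's register layout: the window is assumed to be divided into equal slices, one per table entry, and the slice size and target addresses in the usage note are made up for illustration.

```python
from typing import List, Optional


def lut_translate(addr: int, base: int, entry_size: int,
                  table: List[Optional[int]]) -> int:
    """Translate addr through a table of per-slice translated base addresses."""
    offset = addr - base
    index, within = divmod(offset, entry_size)
    if offset < 0 or index >= len(table) or table[index] is None:
        raise ValueError("address does not hit a valid lookup table entry")
    return table[index] + within


# A 12-entry table in which only the first two entries are populated,
# pointing at two non-contiguous target regions (hypothetical values).
LUT_12 = [0x8000_0000, 0x2000_0000] + [None] * 10
```

With 1 MB slices, `lut_translate(0x1_0010_0004, 0x1_0000_0000, 0x10_0000, LUT_12)` selects entry 1 and returns 0x2000_0004, which shows how neighbouring local addresses can land in unrelated target regions.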
Once TLP translation has been set up, addresses are mapped from the peer to the local end, but when packets are sent and received it is still unknown which port and which partition a packet comes from and which partition it should go to. The NT mapping table for ID translation must therefore also be set up before the peer's memory space becomes fully visible. TLP address translation is shown in Fig. 4.
The NT mapping table is then built. TLPs are of two kinds, request TLPs and completion TLPs; both require ID translation, and the NT mapping table exists for that purpose. Each IDT chip holds one global NT mapping table of 64 entries, 8 per partition. Each entry holds what ID translation needs: the destination partition and the bus number, device number, and function number of the destination device, which together identify the destination port.
The NT mapping table is shown in Fig. 5. For ID translation to work, the NT mapping table must be filled in. The local upstream port Port0 is in partition 0, whose NT mapping table entries are 0-7. The entries to fill are those for the CPU port, the IDT P2P bridge port, the DMA port, and the NTB port, four entries in all.
Because the PCIe address of the local Port0 is mapped to the Port6 address of the peer controller, the four ID mappings for the CPU port, the IDT P2P bridge port, the DMA port, and the NTB port must also be filled in for the peer's Port6. The peer's Port6 is in partition 2, whose NT mapping table entries are 16-23, so entries 16, 17, 18, and 19 are filled in to correspond to local entries 0, 1, 2, and 3.
At this point the peer controller's memory is mapped onto the local PCIe addresses, and reading and writing the PCIe addresses of local port 0 reads and writes the peer's memory space. To improve read and write speed, the data is transferred using DMA.
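The entry numbering of the NT mapping table can be made concrete with a short Python sketch. The table geometry (64 entries, 8 per partition) and the filled entries follow the description; the class `NtEntry`, the helper names, and the bus, device, and function numbers are placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

TOTAL_ENTRIES = 64
ENTRIES_PER_PARTITION = 8
REQUESTERS = ["CPU port", "IDT P2P bridge port", "DMA port", "NTB port"]


@dataclass
class NtEntry:
    requester: str
    dest_partition: int
    bus: int   # placeholder value
    dev: int   # placeholder value
    func: int  # placeholder value


def entry_range(partition: int) -> range:
    """Table indexes owned by a partition: 8 consecutive entries each."""
    start = partition * ENTRIES_PER_PARTITION
    return range(start, start + ENTRIES_PER_PARTITION)


def build_table(own_partition: int, dest_partition: int) -> List[Optional[NtEntry]]:
    """Fill the first four entries of own_partition, one per requester."""
    table: List[Optional[NtEntry]] = [None] * TOTAL_ENTRIES
    for slot, name in enumerate(REQUESTERS):
        table[entry_range(own_partition)[slot]] = NtEntry(name, dest_partition, 0, slot, 0)
    return table


local_table = build_table(own_partition=0, dest_partition=2)  # local Port0
peer_table = build_table(own_partition=2, dest_partition=0)   # peer Port6
```

In this model the local chip uses entries 0 to 3 and the peer chip uses entries 16 to 19, matching the correspondence described above.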
Step S3: establish the dual-control link using the dual-control link mechanism. Preferably, step S3 comprises the following steps: determining whether each of the two controllers has started; if both have started, executing the wait flow to bring the two controllers together at the link point as the two controllers of the dual-control disk array system, and then executing the link flow; if only one has started, handling it as the single-controller startup state.
The wait flow is as follows: when one controller reaches the link point and determines that the other controller has not yet reached it, it waits for a preset period for the other controller to reach the link point; if the other controller has not reached the link point within the preset period, this is handled as a timeout. For example, after one controller reaches the link point, if the other has not, a timer starts; if the other controller still has not reached the link point after 10 seconds, 30 seconds, 1 minute, or 10 minutes, this is handled as a timeout.
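The wait flow amounts to polling with a deadline. The Python sketch below is one possible shape under stated assumptions: `wait_for_peer`, the polling interval, and the injectable `clock` and `sleep` parameters are hypothetical, and how the peer's arrival is actually detected is left to the caller-supplied `peer_at_link_point` function.

```python
import time
from typing import Callable


def wait_for_peer(peer_at_link_point: Callable[[], bool],
                  timeout_s: float,
                  poll_s: float = 0.1,
                  clock: Callable[[], float] = time.monotonic,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True if the peer reaches the link point before the timeout."""
    deadline = clock() + timeout_s
    while True:
        if peer_at_link_point():
            return True       # both controllers are at the link point
        if clock() >= deadline:
            return False      # handled by the caller as a timeout
        sleep(poll_s)
```

A caller would pass one of the preset periods mentioned above, for example `wait_for_peer(check, timeout_s=30)`, and run the link flow only when the result is true.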
The single-controller startup state is handled as follows: before the controller that has started reaches the link point, it determines whether the peer controller has started. If so, it enters the wait flow, re-reads the general-purpose input/output (GPIO) hardware information, forms the dual-control disk array system, and then enters the dual-control flow. Otherwise it runs as a single-controller system and executes the link flow after a peer controller is inserted.
Preferably, the link flow comprises the following steps:
S31, the first controller 0 sends its link-ready state to the first output message register of the non-transparent bridge port of the second controller 1, and sends the link detection count to the third output register of the non-transparent bridge port of the second controller 1;
S32, through the message mechanism, the first input message register and the third input register of the root port of the second controller 1 read the received information, namely the link-ready state of the first controller 0 and the link detection count;
S33, when the received information is correct, the second controller 1 sends its link-ready state to the first output message register of the non-transparent bridge port of the first controller 0, decrements the link detection count by 1, and sends it to the third output register of the non-transparent bridge port of the first controller 0;
S34, through the message mechanism, the first input message register and the third input register of the root port of the first controller 0 read the received information, namely the link-ready state of the second controller 1 and the link detection count; it is then determined whether the link detection count is 0; if not, execution returns to step S31; if so, step S35 is executed;
S35, the first controller 0 sends its link acknowledge to the second controller 1; after receiving it, the second controller 1 sends its own link acknowledge to the first controller 0; the first controller 0 ends the link flow after receiving the link acknowledge of the second controller 1.
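Steps S31 to S35 can be traced with a small Python simulation. The four state names are the ones used in this description; the function `run_link_flow`, the trace format, and the choice to model register writes as list entries are assumptions made for illustration, and the check that the received information is correct is not modelled.

```python
from typing import List, Optional, Tuple

CTRL_0_LINK_READY = "CTRL_0_LINK_READY"
CTRL_0_READY_ACK = "CTRL_0_READY_ACK"
CTRL_1_LINK_READY = "CTRL_1_LINK_READY"
CTRL_1_READY_ACK = "CTRL_1_READY_ACK"

# (sending controller, state written, link detection count written)
Step = Tuple[int, str, Optional[int]]


def run_link_flow(detect_count: int) -> List[Step]:
    """Simulate the link handshake and return the sequence of messages."""
    if detect_count < 1:
        raise ValueError("the link detection count must be at least 1")
    trace: List[Step] = []
    count = detect_count
    while True:
        # S31: controller 0 sends link-ready and the current count.
        trace.append((0, CTRL_0_LINK_READY, count))
        # S32/S33: controller 1 reads them, decrements the count, replies.
        count -= 1
        trace.append((1, CTRL_1_LINK_READY, count))
        # S34: controller 0 reads the reply and stops when the count is 0.
        if count == 0:
            break
    # S35: the two acknowledges close the handshake.
    trace.append((0, CTRL_0_READY_ACK, None))
    trace.append((1, CTRL_1_READY_ACK, None))
    return trace
```

With a detection count of 3, the simulated handshake takes three ready round trips followed by the two acknowledges, eight messages in total.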
For example, when both controllers are inserted before startup, the hardware GPIO (general-purpose input/output) information shows that both controllers are inserted, so both must be brought together at the link point. When one end reaches the link point and the other has not, the first must wait for the other to arrive; once both controllers have reached the link point, the link operation can proceed. If the peer controller is pulled out during the wait, the start is treated as a single-controller start. If the other controller never reaches the link point during the wait, timeout handling is performed when the preset time expires.
As another example, when only one controller starts, the hardware GPIO shows that only one controller is starting, and the start is treated directly as a single-controller start. If a peer controller is inserted before the link point is reached, the GPIO information can be read again, the system becomes dual-control, and the dual-control flow is entered. If no peer controller has been inserted by the time the link point is reached, the wait flow (wait_for_interlink) is not entered, but the link flow can still be executed after a peer controller is inserted.
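The cases in the two examples above can be summarized as a decision function. The Python sketch below is illustrative only: `StartResult`, `startup_result`, and its four boolean inputs are hypothetical names standing in for what the GPIO reads and the wait flow would report.

```python
from enum import Enum


class StartResult(Enum):
    DUAL = "dual-control flow"
    SINGLE = "single-controller start"
    TIMEOUT = "timeout handling"


def startup_result(peer_at_boot: bool,
                   peer_inserted_before_link_point: bool,
                   peer_removed_while_waiting: bool,
                   peer_arrived_in_time: bool) -> StartResult:
    # No peer seen by the time the link point is reached: no wait flow.
    if not (peer_at_boot or peer_inserted_before_link_point):
        return StartResult.SINGLE
    # Peer pulled out during the wait: fall back to single control.
    if peer_removed_while_waiting:
        return StartResult.SINGLE
    # Peer present but never reached the link point within the preset time.
    if not peer_arrived_in_time:
        return StartResult.TIMEOUT
    return StartResult.DUAL
```

A controller in the single-controller result can still run the link flow later, once a peer is inserted; that later transition is outside this sketch.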
Preferably, in step S1 of the method, two states are defined for each of the first controller 0 and the second controller 1, a link-ready state and a ready-acknowledge state, giving four states in total. In the module that designs and implements the dual-control link mechanism, this embodiment defines two states for each of controller 0 and controller 1, link-ready and ready-acknowledge, four states in total. The implementation takes into account problems that may arise during linking, such as a controller being plugged or unplugged during the link process or a controller needing replacement after a fault; in these cases the embodiments of the invention handle the situation accordingly so that the link can eventually complete normally. In step S3, dual-control link preparation is shown in Fig. 6. First, controller 0 sends its link-ready state (CTRL_0_LINK_READY) to the outmsg0 message register of controller 1's NTB port, and sends the link detection count to the outmsg2 message register of controller 1's NTB port. Through the message mechanism, inmsg0 and inmsg2 on the root port of controller 1 read controller 0's link-ready state (CTRL_0_LINK_READY) and the link detection count. If the received information is correct, controller 1 likewise sends its link-ready state (CTRL_1_LINK_READY) and the link detection count to controller 0's NTB port. Because both controllers share the same detection count and one detection has now been completed, the count is decremented by 1. After controller 0 receives the data successfully, it sends again; controller 1 receives the data, decrements the detection count by 1, and sends it back to controller 0. Controller 0 and controller 1 exchange data in this way repeatedly until the link detection process completes.
After the specified number of link detections has been completed, the link between the two controllers is stable. Controller 0 sends its link acknowledge (CTRL_0_READY_ACK) to controller 1; after receiving it, controller 1 sends its own link acknowledge (CTRL_1_READY_ACK) to controller 0. Once both sides have received the other's acknowledge, the link process is complete. The dual-control link states are defined in Table 1.
State | State name | Description |
Controller 0 link ready | CTRL_0_LINK_READY | Controller 0's link is ready; the message registers can be read and written normally. |
Controller 0 link acknowledge | CTRL_0_READY_ACK | Confirm that … |
Controller 1 link ready | CTRL_1_LINK_READY | Controller 1's link is ready; the message registers can be read and written normally. |
Controller 1 link acknowledge | CTRL_1_READY_ACK | Confirm that … |
Table 1: Dual-control link state definitions
The dual-control message registers are set up as follows. Internal communication between the NTB (Non-Transparent Bridge) ports of the IDT chip takes two forms: message registers and doorbell registers. These two mechanisms allow NTB ports to exchange information across chips.
In message register mode, each NTB port supports four inbound message registers (INMSG[3:0]) and four outbound message registers (OUTMSG[3:0]); each message register carries 32 bits of data.
Each IDT chip has 32 global message control registers: each of the 8 partitions has 4 partition message control registers. They are named SWPxMSGCTLy, meaning switch partition x message control y. For example, to use partition 0 and partition 2, the two partitions are mapped to each other as shown in Fig. 7.
In this way NTB0 and NTB1 are mapped to each other, where the NTB0 port belongs to partition 0 and the NTB1 port belongs to partition 2. The four registers SWP0MSGCTL0, SWP0MSGCTL1, SWP0MSGCTL2, and SWP0MSGCTL3 associate the outmsg registers of the NTB0 port with the inmsg registers of the NTB1 port, so that when an outmsg register of NTB0 holds a message, the message mechanism delivers it automatically to the corresponding inmsg register. Likewise, the outmsg registers of NTB1 are mapped to the inmsg registers of NTB0, which enables internal message communication between the two ports.
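The effect of this mapping can be modelled in a few lines of Python. The model below is a behavioural sketch only: the class `NtbPort` and its methods are hypothetical, the routing dictionary stands in for the partition message control registers, and real hardware delivery (interrupts, status bits) is not modelled.

```python
class NtbPort:
    """Behavioural model of an NTB port with four 32-bit message registers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inmsg = [0, 0, 0, 0]
        self._route = {}  # outmsg index -> (destination port, inmsg index)

    def map_outmsg(self, index: int, dest: "NtbPort", dest_index: int) -> None:
        """Associate one outbound register with a peer inbound register."""
        self._route[index] = (dest, dest_index)

    def write_outmsg(self, index: int, value: int) -> None:
        """Writing an outbound register delivers the value to the mapped inmsg."""
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("a message register carries 32 bits")
        dest, dest_index = self._route[index]
        dest.inmsg[dest_index] = value


def link_ports(a: NtbPort, b: NtbPort) -> None:
    """Map all four outmsg registers of each port to the other's inmsg."""
    for i in range(4):
        a.map_outmsg(i, b, i)
        b.map_outmsg(i, a, i)
```

After `link_ports(ntb0, ntb1)`, a call such as `ntb0.write_outmsg(0, value)` makes the value visible in `ntb1.inmsg[0]`, which is the behaviour the link handshake relies on for outmsg0/inmsg0 and outmsg2/inmsg2.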
The dual-control doorbell registers are described as follows. The doorbell register is the other means of internal communication between the NTB ports of the IDT chip. Using the doorbell registers, the embodiments of the present invention implement a mechanism for passing function messages between the two controllers. That mechanism is introduced in a later section; this section covers the principle of the doorbell register. A doorbell register carries 32 bits of data and is generally used by the transfer protocol.
Each NTB port of the IDT chip has an OUTDBELLSET (outbound doorbell set) register, which is writable; a value from a self-defined protocol is written into it. There are also an INDBELLSTS (inbound doorbell status) register and an INDBELLMASK register, which respectively hold the value written by the remote port and mask out unwanted values. These three registers allow information to be passed from port to port. For example, the remote IDT chip writes a doorbell message to the outbound doorbell of port 3; the doorbell mechanism then delivers the message to the inbound doorbell status of port 5, where it is read. Port 3 belongs to partition 3, so when OUTDBELLSET of port 3 holds data, the data is ANDed with global outbound doorbell mask 3 and then ANDed with the global inbound doorbell mask. The other bits are masked out, and only the useful data is delivered to port 5.
As another example, using an IDT chip or another chip as shown in Figure 8, the remote chip writes a doorbell message to the outbound doorbell of port 6; the doorbell mechanism then delivers the message to the inbound doorbell status of port 0, where it is read.
As shown in Figure 8, port 6 belongs to partition 2, so when OUTDBELLSET of port 6 holds data, the data is ANDed with global outbound doorbell mask 2 and then ANDed with the global inbound doorbell mask. The other bits are masked out, and only the useful data is delivered to port 0. In this way, data arriving at the interconnect port, port 6, is forwarded to port 0, which connects to the CPU, and is then processed by the CPU.
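The two AND operations in the doorbell path can be written out as a short worked example. This is a minimal sketch that follows the description literally (data AND outbound mask AND inbound mask); the function name `deliver_doorbell` and the sample mask values are assumptions chosen for illustration, not values taken from the chip documentation.

```python
MASK32 = 0xFFFFFFFF  # doorbell registers carry 32 bits


def deliver_doorbell(outdbellset, outbound_mask, inbound_mask):
    """Return the bits that reach the destination port's inbound doorbell status.

    Follows the text: the OUTDBELLSET data is ANDed with the partition's
    global outbound doorbell mask, then with the global inbound doorbell mask.
    """
    after_outbound = outdbellset & outbound_mask & MASK32
    return after_outbound & inbound_mask


# Hypothetical values: port 6 raises bits 1, 2, 4, 5, 6 and 7.
raised = 0b1111_0110
delivered = deliver_doorbell(raised, outbound_mask=0x0F, inbound_mask=0x06)
```

With these sample values only bits 1 and 2 survive both masks, so `delivered` equals `0x06`.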
Step S4: a dual-control message mechanism is used to pass messages between the two controllers. Preferably, the message mechanism comprises a number of messages, each of which has two states, function and acknowledgement, and which together implement data reads and writes in the disk array dual-control system. In the module that designs and implements the dual-control link mechanism, the embodiments of the present invention define a complete protocol in which every item has both a function state and an acknowledgement state, for example: get buffer address and acknowledge that the buffer address was obtained; send CDB (Command Descriptor Block) and acknowledge that the CDB was received; send data and acknowledge that the data was received. A corresponding dual-control message flow is also designed, which ensures that the two controllers can complete functions such as reading and writing data normally.
Preferably, in the implementation method, the message mechanism comprises the send-CDB interrupt, CDB-acknowledge interrupt, send-data interrupt, data-acknowledge interrupt, send-exchange interrupt, exchange-acknowledge interrupt, send-small-exchange interrupt, small-exchange-acknowledge interrupt, send-Wpush (write cache push) interrupt, Wpush-acknowledge interrupt, get-buffer-pointer interrupt, get-mbuffer (message buffer) pointer interrupt, and buffer-acknowledge interrupt. Wpush, the write cache push, indicates that the local end has locked the cache region in question, so the peer end cannot use it for the time being. The mbuffer is the message buffer, the buffer area that stores message data. The buffer is the buffer area that stores commands and control information. For doorbell message passing between the two controllers, the present invention defines this message mechanism; through its message interrupts, the peer port can receive the SCSI command CDB, data, exchange, buffer and so on sent by the local end, which realizes message passing between the two controllers. The definitions and meanings of the messages are shown in Table 2 below.
Table 2: Dual-control message mechanism
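As an illustration of how the function/acknowledgement message pairs listed above might be encoded in a 32-bit doorbell, the following sketch assigns one bit per message. Only `GET_BUFPTRS_INT` and `GET_BUF_ACK_INT` are names that appear in the description; every other identifier, the bit positions, and the pairing of both pointer requests with the single buffer acknowledgement are assumptions made for this example.

```python
from enum import IntFlag


class DualCtlMsg(IntFlag):
    """One doorbell bit per message (names and bit positions are assumptions)."""
    SEND_CDB_INT = 1 << 0
    CDB_ACK_INT = 1 << 1
    SEND_DATA_INT = 1 << 2
    DATA_ACK_INT = 1 << 3
    SEND_XCHG_INT = 1 << 4
    XCHG_ACK_INT = 1 << 5
    SEND_SMALL_XCHG_INT = 1 << 6
    SMALL_XCHG_ACK_INT = 1 << 7
    SEND_WPUSH_INT = 1 << 8
    WPUSH_ACK_INT = 1 << 9
    GET_BUFPTRS_INT = 1 << 10
    GET_MBUFPTRS_INT = 1 << 11
    GET_BUF_ACK_INT = 1 << 12


# Each function message and the acknowledgement expected in reply.
ACK_FOR = {
    DualCtlMsg.SEND_CDB_INT: DualCtlMsg.CDB_ACK_INT,
    DualCtlMsg.SEND_DATA_INT: DualCtlMsg.DATA_ACK_INT,
    DualCtlMsg.SEND_XCHG_INT: DualCtlMsg.XCHG_ACK_INT,
    DualCtlMsg.SEND_SMALL_XCHG_INT: DualCtlMsg.SMALL_XCHG_ACK_INT,
    DualCtlMsg.SEND_WPUSH_INT: DualCtlMsg.WPUSH_ACK_INT,
    DualCtlMsg.GET_BUFPTRS_INT: DualCtlMsg.GET_BUF_ACK_INT,
    DualCtlMsg.GET_MBUFPTRS_INT: DualCtlMsg.GET_BUF_ACK_INT,
}


def decode(doorbell_status):
    """List the messages whose bits are set in an inbound doorbell status value."""
    return [m for m in DualCtlMsg if doorbell_status & m]


pending = decode(int(DualCtlMsg.SEND_CDB_INT | DualCtlMsg.GET_BUF_ACK_INT))
```

A one-bit-per-message layout lets several pending interrupts share one doorbell word; a real implementation could equally encode a message number in the 32-bit value.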
In combination with any of the above embodiments, preferably, after step S3 and before step S4, the following step S40 is also performed: each of the two ends obtains the buffer address and the mbuffer address of its peer, so that the peer can access the local buffer contents. Preferably, step S40 comprises the following steps: S401, either local end prepares its local buffer address, sends a request-buffer interrupt to the peer end, and requests the buffer address and mbuffer address of the peer end; S402, the peer end processes the request-buffer interrupt and then sends an acknowledge-buffer interrupt to the local end; S403, the local end receives the acknowledge-buffer interrupt and confirms that both the local end and the peer end have completed buffer processing.
For example, as shown in Figure 9, once the dual-control link is complete, each end first obtains the buffer address and mbuffer address of its peer. The buffer addresses comprise the CDB buffer address, iov data address, task address, wpush address, exchange address and small exchange address. These addresses must be sent to the peer end so that the peer end can access the local buffers. For example, the mBuffer variable receives the data passed from the upper layer, and mBuffer grows with every reception, so that in the end mBuffer holds all the data received. After the application has taken the data out of mBuffer through a custom interface, the old data in mBuffer must be cleared and its size reset to 0 before the next run.
First, after the two ends have linked and the local buffer address is ready, the local end sends a GET_BUFPTRS_INT interrupt to the peer end. The peer end receives the GET_BUFPTRS_INT interrupt and processes it, and after processing is complete it sends a GET_BUF_ACK_INT acknowledgement interrupt back. When the local end receives this interrupt, both the local end and the peer end have completed buffer processing. If the GET_BUF_ACK_INT acknowledgement interrupt from the peer end has not been received, the local end waits until both ends have completed buffer processing.
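The GET_BUFPTRS_INT / GET_BUF_ACK_INT exchange can be modelled as two objects that call each other. This is a sketch under assumptions: the `Controller` class, its method names, the synchronous delivery of interrupts, and the sample addresses are all hypothetical; only the two interrupt names and the step order come from the description.

```python
class Controller:
    """Toy model of one end of the buffer-address handshake (hypothetical)."""

    def __init__(self, name, buffer_addrs):
        self.name = name
        self.local_addrs = dict(buffer_addrs)  # addresses of the local buffers
        self.peer_addrs = None                 # learned from the peer's request
        self.buffers_ready = False             # set once the peer acknowledges
        self.peer = None

    def request_buffers(self):
        """S401: publish the local addresses with a GET_BUFPTRS_INT interrupt."""
        self.peer.on_interrupt("GET_BUFPTRS_INT", self.local_addrs)

    def on_interrupt(self, kind, payload=None):
        if kind == "GET_BUFPTRS_INT":
            # S402: record the sender's addresses, then acknowledge.
            self.peer_addrs = dict(payload)
            self.peer.on_interrupt("GET_BUF_ACK_INT")
        elif kind == "GET_BUF_ACK_INT":
            # S403: the peer has finished processing our buffer addresses.
            self.buffers_ready = True
        else:
            raise ValueError(f"unknown interrupt: {kind}")


# Sample addresses are made up for the example.
c0 = Controller("controller0", {"cdb": 0x1000, "iov": 0x2000, "mbuffer": 0x3000})
c1 = Controller("controller1", {"cdb": 0x9000, "iov": 0xA000, "mbuffer": 0xB000})
c0.peer, c1.peer = c1, c0

c0.request_buffers()  # controller 0 runs S401 to S403
c1.request_buffers()  # controller 1 does the same in the other direction
```

After both requests, each controller holds the other's address table and has `buffers_ready` set, which corresponds to both ends having completed buffer processing.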
For example, a further embodiment of the present invention is as follows: a method for implementing a disk array dual-control system, comprising the following steps:
S1: configure dual-control interconnect ports for the two controllers, and define two states for each of the first controller 0 and the second controller 1, namely a ready state and a ready-acknowledged state. For each controller, set up a number of partitions and their ports; each partition connects to the CPU and to the data channel through different ports.
S2: establish the dual-control memory mapping of the two controllers by mapping the peer end's memory to the local end, so that the peer memory can be read and written. Each controller can then read the peer memory directly through the local mapping, which speeds up processing and enables the dual-control functions.
S3: determine whether each of the two controllers has started. If both have started, execute the wait flow so that the two controllers are synchronized at the link point as the two controllers of the disk array dual-control system, and then execute the link flow. If only one has started, handle it as the single-controller startup state.
S401: either local end prepares its local buffer address, sends a request-buffer interrupt to the peer end, and requests the buffer address and mbuffer address of the peer end;
S402: the peer end processes the request-buffer interrupt and then sends an acknowledge-buffer interrupt to the local end;
S403: the local end receives the acknowledge-buffer interrupt and confirms that both the local end and the peer end have completed buffer processing.
S4: use the dual-control message mechanism to pass messages between the two controllers, wherein the message mechanism comprises a number of messages, each of which has two states, function and acknowledgement, for implementing data reads and writes in the disk array dual-control system.
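The link flow that step S3 invokes is set out as steps S31 to S35 in claim 9, and its countdown can be traced with a short simulation. This is a sketch under assumptions: the constants `LINK_READY` and `LINK_CONFIRM`, the function `run_link_flow`, and the choice that controller 0 resends the decremented count when it returns to S31 are hypothetical; the register roles (first and third message registers) follow the claim.

```python
LINK_READY = 0x1    # hypothetical encoding of the link-ready message
LINK_CONFIRM = 0x2  # hypothetical encoding of the link-confirm message


def run_link_flow(detect_count):
    """Return the counts seen by controller 0 and whether the link completed."""
    if detect_count < 1:
        raise ValueError("detect_count must be at least 1")
    seen_by_c0 = []
    count = detect_count
    while True:
        # S31: controller 0 writes link-ready (register 0) and the link
        # detection count (register 2) for controller 1.
        c1_inmsg = {0: LINK_READY, 2: count}
        # S32: controller 1 reads and validates what it received.
        if c1_inmsg[0] != LINK_READY:
            raise RuntimeError("controller 1 received an unexpected message")
        # S33: controller 1 replies with link-ready and the count minus 1.
        c0_inmsg = {0: LINK_READY, 2: c1_inmsg[2] - 1}
        # S34: controller 0 reads the reply; a count of 0 ends the loop,
        # otherwise the flow returns to S31 with the new count (assumption).
        count = c0_inmsg[2]
        seen_by_c0.append(count)
        if count == 0:
            break
    # S35: controller 0 sends link-confirm, controller 1 answers in kind.
    c1_received = LINK_CONFIRM
    c0_received = LINK_CONFIRM if c1_received == LINK_CONFIRM else 0
    return seen_by_c0, c0_received == LINK_CONFIRM


counts, linked = run_link_flow(3)
```

With an initial count of 3, controller 0 sees the counts 2, 1 and 0 over three round trips before the confirmations are exchanged.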
A further embodiment of the present invention is as follows: a disk array dual-control system that uses the implementation method of any of the above embodiments. The dual-control system comprises a first controller 0 and a second controller 1; each of the two controllers is provided with a dual-control interconnect port; a dual-control memory mapping is set up between the two controllers; a dual-control link is established between the two controllers using the dual-control link mechanism; and messages are passed between the two controllers using the dual-control message mechanism.
For example, as shown in Figure 1, the disk array dual-control system comprises controller 1 with its power supply 1 and fan 1, controller 2 with its power supply 2 and fan 2, a system midplane, and a number of hard disks: hard disk 1, hard disk 2, hard disk 3, ..., hard disk n. Controller 1 with its power supply 1 and fan 1 is connected to the system midplane; controller 2 with its power supply 2 and fan 2 is connected to the system midplane; and each hard disk is connected to the system midplane. Controller 1 and controller 2 control the hard disks through the system midplane, and through the system midplane they also establish the memory mapping and the dual-control link and pass messages using the dual-control message mechanism.
In this way, two independent array controllers are organized into a disk array dual-control system. The dual-control system provides dual-control redundancy and fault tolerance, I/O traffic load balancing and so on, which reduces the risk to stored data to a minimum and ensures high reliability, high security and high availability of the system.
Further, the embodiments of the present invention also include disk array dual-control systems formed by combining the technical features of the above embodiments with one another. To give the storage system higher availability, the above embodiments of the present invention provide dual-control redundancy and fault tolerance: when the controller carrying the service fails, the service is automatically switched to the healthy controller; preferably, the upper layer is also notified so that it can handle the event.
Moreover, the disk array dual-control system of the embodiments of the present invention expands from two FC ports to four FC ports, doubling I/O performance. With I/O traffic load balancing, the load (tasks) on one controller is balanced and shared across the two controllers, which increases throughput, strengthens network data-processing capability, and improves the flexibility and availability of the network.
It should be noted that the above technical features may be further combined with one another to form various embodiments not enumerated above, all of which are regarded as within the scope described in this specification. Furthermore, those of ordinary skill in the art may make improvements or variations based on the above description, and all such improvements and variations shall fall within the scope of protection of the appended claims of the present invention.
Claims (10)
1. A method for implementing a disk array dual-control system, characterized in that it comprises the following steps:
S1: configuring dual-control interconnect ports for two controllers;
S2: establishing a dual-control memory mapping of the two controllers;
S3: establishing a dual-control link using a dual-control link mechanism;
S4: passing messages between the two controllers using a dual-control message mechanism.
2. The implementation method according to claim 1, characterized in that, in step S1, two states are defined for each of the first controller 0 and the second controller 1, namely a ready state and a ready-acknowledged state, giving four states in total.
3. The implementation method according to claim 1, characterized in that, in step S1, a number of partitions and their ports are set up, and each partition connects to the CPU and to the data channel through different ports.
4. The implementation method according to claim 1, characterized in that, in step S2, the peer end's memory is mapped to the local end for reading and writing the peer memory.
5. The implementation method according to claim 1, characterized in that, in step S4, the message mechanism comprises a number of messages, each of which has two states, function and acknowledgement, for implementing data reads and writes in the disk array dual-control system.
6. The implementation method according to claim 5, characterized in that the message mechanism comprises a send-command-descriptor-block interrupt, a command-descriptor-block-acknowledge interrupt, a send-data interrupt, a data-acknowledge interrupt, a send-exchange interrupt, an exchange-acknowledge interrupt, a send-small-exchange interrupt, a small-exchange-acknowledge interrupt, a send-write-cache-push interrupt, a write-cache-push-acknowledge interrupt, a get-buffer-pointer interrupt, a get-message-buffer-pointer interrupt, and a buffer-acknowledge interrupt.
7. The implementation method according to claim 6, characterized in that, after step S3 and before step S4, the following step S40 is also performed: each of the two ends obtains the buffer address and the message buffer address of its peer, so that the peer can access the local buffer contents.
8. The implementation method according to claim 7, characterized in that step S40 comprises the following steps:
S401: either local end prepares its local buffer address, sends a request-buffer interrupt to the peer end, and requests the buffer address and message buffer address of the peer end;
S402: the peer end processes the request-buffer interrupt and then sends an acknowledge-buffer interrupt to the local end;
S403: the local end receives the acknowledge-buffer interrupt and confirms that both the local end and the peer end have completed buffer processing.
9. The implementation method according to any one of claims 1 to 8, characterized in that step S3 comprises the following steps: determining whether each of the two controllers has started; if both have started, executing a wait flow so that the two controllers are synchronized at the link point as the two controllers of the disk array dual-control system, and then executing a link flow; if only one has started, handling it as a single-controller startup state;
the wait flow comprises: when the controller on one side reaches the link point and determines that the controller on the other side has not yet reached the link point, waiting within a preset time period for the other controller to reach the link point; wherein, if the other controller does not reach the link point within the preset time period, the case is handled as a timeout;
the single-controller startup state handling comprises: in the single-controller startup state, before the controller that has started reaches the link point, determining whether the peer controller has started; if so, entering the wait flow, re-acquiring the general-purpose input/output hardware information, forming the disk array dual-control system, and then entering the dual-control flow; otherwise, operating as a single-controller system and executing the link flow after the peer controller is inserted;
the link flow comprises the following steps:
S31: the first controller 0 sends its link-ready message to the first output message register of the non-transparent bridge port of the second controller 1, and sends the link detection count to the third output register of the non-transparent bridge port of the second controller 1;
S32: the first input message register and the third input register of the root port of the second controller 1 read, through the message mechanism, the received link-ready message and link detection count of the first controller 0;
S33: when the received information is correct, the second controller 1 sends its link-ready message to the first output message register of the non-transparent bridge port of the first controller 0, and sends the link detection count, decremented by 1, to the third output register of the non-transparent bridge port of the first controller 0;
S34: the first input message register and the third input register of the root port of the first controller 0 read, through the message mechanism, the received link-ready message and link detection count of the second controller 1; it is determined whether the link detection count is 0; if not, the flow returns to step S31; if so, step S35 is executed;
S35: the first controller 0 sends its link confirmation to the second controller 1; after receiving the link confirmation of the first controller 0, the second controller 1 sends its own link confirmation to the first controller 0; the first controller 0 ends the link flow after receiving the link confirmation of the second controller 1.
10. A disk array dual-control system, characterized in that it comprises a first controller 0 and a second controller 1;
each of the two controllers is provided with a dual-control interconnect port;
a dual-control memory mapping is set up between the two controllers;
a dual-control link is established between the two controllers using a dual-control link mechanism;
messages are passed between the two controllers using a dual-control message mechanism.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310733225.3A CN103645864B (en) | 2013-12-26 | 2013-12-26 | A kind of magnetic disc array dual-control system and its implementation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103645864A true CN103645864A (en) | 2014-03-19 |
CN103645864B CN103645864B (en) | 2016-08-24 |
Family
ID=50251091
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310733225.3A Active CN103645864B (en) | 2013-12-26 | 2013-12-26 | A kind of magnetic disc array dual-control system and its implementation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103645864B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1350295A (en) * | 2000-10-23 | 2002-05-22 | 国际商业机器公司 | Method and apparatus for executing up date of disk array controller based on driver |
US20060020720A1 (en) * | 2004-07-23 | 2006-01-26 | Lsi Logic Corporation | Multi-controller IO shipping |
US7171525B1 (en) * | 2002-07-31 | 2007-01-30 | Silicon Image, Inc. | Method and system for arbitrating priority bids sent over serial links to a multi-port storage device |
CN101576805A (en) * | 2009-06-12 | 2009-11-11 | 浪潮电子信息产业股份有限公司 | Parallel processing-based multi-host interface redundancy SAN controller |
CN101639811A (en) * | 2009-08-21 | 2010-02-03 | 成都市华为赛门铁克科技有限公司 | Data writing method, controller and multi-controller system |
CN102629225A (en) * | 2011-12-31 | 2012-08-08 | 成都市华为赛门铁克科技有限公司 | Dual-controller disk array, storage system and data storage path switching method |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105760319A (en) * | 2014-12-15 | 2016-07-13 | 中兴通讯股份有限公司 | Data successfully sent confirmation method and device |
CN105183677A (en) * | 2015-08-31 | 2015-12-23 | 北京神州云科数据技术有限公司 | Asynchronous non-transparent bridge based data transmission method and system |
CN106293985A (en) * | 2016-08-12 | 2017-01-04 | 浪潮(北京)电子信息产业有限公司 | The method of the abort message of communication between multi-controller based on NTB |
CN106293985B (en) * | 2016-08-12 | 2019-05-10 | 浪潮(北京)电子信息产业有限公司 | The method of the abort message communicated between multi-controller based on NTB |
CN107547329A (en) * | 2017-09-07 | 2018-01-05 | 郑州云海信息技术有限公司 | A kind of dual control data transmission method and system based on NTB |
CN117596212A (en) * | 2024-01-18 | 2024-02-23 | 苏州元脑智能科技有限公司 | Service processing method, device, equipment and medium |
CN117596212B (en) * | 2024-01-18 | 2024-04-09 | 苏州元脑智能科技有限公司 | Service processing method, device, equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN103645864B (en) | 2016-08-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | |
Effective date of registration: 2019-01-17. Address after: 518000 Fuyuan Industrial Building, 138 Freedom Road, Baoan 47 District, Baoan District, Shenzhen City, Guangdong Province, 721-2, 7th floor. Patentee after: Shenzhen Feiteng Xin'an Technology Co., Ltd. Address before: 518000 South Pass No. 2 Zhiheng Strategic Emerging Industrial Park, Nanshan District, Shenzhen City, Guangdong Province, 30 buildings and 5 floors. Patentee before: Shenzhen Data Fault Tolerance System Co., Ltd.