CN102446123B

CN102446123B - Method and device for processing SCSI sensing data

Info

Publication number: CN102446123B
Application number: CN2010105052288A
Authority: CN
Inventors: 徐磊; 郑劭馨; 张日新; 汪文敏; 金堂
Original assignee: Hangzhou H3C Technologies Co Ltd
Current assignee: New H3C Technologies Co Ltd
Priority date: 2010-10-09
Filing date: 2010-10-09
Publication date: 2013-11-27
Anticipated expiration: 2030-10-09
Also published as: CN102446123A

Abstract

The invention provides a method and a device for processing SCSI (Small Computer System Interface) sense data, applicable to a storage system comprising a SCSI driver unit and a hard disk device. The method comprises the following steps: after receiving a command response containing sense data sent by the hard disk device, the SCSI driver unit determines that the hard disk device is in an abnormal state and carries out the following abnormality recovery process: cutting off the power supply to the hard disk device by sending a power-off command, and starting a timer; and, after the timer expires, sending a power-on command to restore the power supply to the hard disk device. The invention can effectively improve the fault tolerance of the system.

Description

A method and apparatus for processing SCSI sense data

Technical field

The present invention relates to computer communication technology, and in particular to a method and apparatus for processing Small Computer System Interface (SCSI) sense data.

Background

The Small Computer System Interface (SCSI) is a high-speed, high-efficiency storage bus protocol aimed at enterprise-level applications, and it provides a detailed error-status indication mechanism. The SCSI bus protocol defines an error information code that is returned with a command response and indicates why a command failed or which abnormal state the hard disk is in. This error information code is the SCSI sense data, which defines nearly 200 specific error or abnormal-state conditions. For example, when a hard disk device receives an I/O command, such as a management control command or an access command, it returns a command response containing success information if it processes the command normally. If the hard disk device encounters an error and cannot process the command normally, it returns a command response containing SCSI sense data.

Usually, sense data is processed in the SCSI driver. As shown in Fig. 1, the SCSI driver has a layered architecture with three layers: an upper layer, a middle layer and a lower layer. Sense data is processed in the upper and middle layers. Existing processing of sense data mainly takes the following forms:

First approach: after receiving SCSI sense data from the hard disk device, the SCSI driver directly notifies the upper-layer application that the I/O command was not executed successfully.

Second approach: after receiving SCSI sense data from the hard disk device, the SCSI driver immediately resends the I/O command to the hard disk device.

However, among these existing approaches, the first applies no fault-tolerance mechanism at all. The second, which simply resends the I/O command, appears simple and effective, but if the abnormality of the hard disk device is persistent or requires additional intervention, repeatedly resending the I/O command cannot resolve it. It also easily causes I/O command timeouts that hang or block the entire storage system and, in serious cases, can even crash the storage system.

Summary of the invention

In view of this, the invention provides a method and apparatus for processing SCSI sense data, so as to effectively improve the fault tolerance of a storage system.

A method for processing SCSI sense data, applied to a storage system comprising a SCSI driver unit and a hard disk device, is characterized in that, after the SCSI driver unit receives a command response containing sense data sent by the hard disk device, it determines that the hard disk device is abnormal and performs the following abnormality recovery processing:

A. cutting off the power supply of the hard disk device by sending a power-off command, and simultaneously starting a timer;

B. after the timer expires, restoring the power supply of the hard disk device by sending a power-on command.

An apparatus for processing SCSI sense data, applied to a storage system comprising a SCSI driver unit and a hard disk device, is characterized in that the apparatus comprises: an abnormality determining unit, an abnormality recovery unit and a timer.

The abnormality determining unit is configured to determine that the hard disk device is abnormal after the SCSI driver unit receives a command response containing sense data sent by the hard disk device, and to send an abnormality recovery notification to the abnormality recovery unit.

The abnormality recovery unit is configured to perform the following abnormality recovery processing after receiving the abnormality recovery notification: cutting off the power supply of the hard disk device by sending a power-off command, and simultaneously starting the timer; after the timer expires, restoring the power supply of the hard disk device by sending a power-on command.

As can be seen from the above technical solutions, in the present invention, after receiving a command response containing sense data sent by the hard disk device, the SCSI driver unit recovers the hard disk device from the abnormality by powering it off and then powering it on after a delay. The hard disk device can then process I/O commands promptly after recovery, which prevents I/O command timeouts from hanging or blocking the entire storage system and thereby effectively improves the fault tolerance of the storage system.

Brief description of the drawings

Fig. 1 is a schematic diagram of the layered architecture of the SCSI driver;

Fig. 2 is a flowchart of the method provided by Embodiment 1 of the present invention;

Fig. 3 is a flowchart of the method provided by Embodiment 2 of the present invention;

Fig. 4 is a flowchart of the method provided by Embodiment 3 of the present invention;

Fig. 5 is a flowchart of the method provided by Embodiment 4 of the present invention;

Fig. 6 is a flowchart of the method provided by Embodiment 5 of the present invention;

Fig. 7 is a schematic structural diagram of the apparatus provided by the present invention.

Detailed description

To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described below with reference to the drawings and specific embodiments.

The method provided by the invention mainly comprises: after receiving a command response containing sense data sent by a hard disk device, cutting off the power supply of the hard disk device by sending a power-off command and simultaneously starting a timer; after the timer expires, restoring the power supply of the hard disk device by sending a power-on command.

That is, after sense data is received from a hard disk device, the hard disk device is determined to be abnormal, and it is returned to a normal state by powering it off and then powering it on again after a delay. The method is described in detail below through specific embodiments.
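The power-off, delay, power-on sequence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `DiskController`, `power_cycle` and the delay value are hypothetical names and parameters that do not appear in the patent, and the controller merely records commands instead of driving real hardware.

```python
import threading


class DiskController:
    """Hypothetical stand-in for the component that switches disk power."""

    def __init__(self):
        self.log = []

    def power_off(self, disk_id):
        self.log.append(("off", disk_id))

    def power_on(self, disk_id):
        self.log.append(("on", disk_id))


def power_cycle(controller, disk_id, delay_s):
    """Step A: cut power and start a timer. Step B: restore power on expiry."""
    controller.power_off(disk_id)
    timer = threading.Timer(delay_s, controller.power_on, args=(disk_id,))
    timer.start()
    return timer
```

Returning the timer lets a caller wait for, or cancel, the pending power-on; a real driver would more likely use its own kernel timer facility.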

Embodiment 1. As shown in Fig. 2, the method may specifically comprise the following steps:

Step 201: The SCSI driver receives a command response containing sense data sent by a hard disk device.

Step 202: Determine whether the hard disk device is in the topology of the storage system. If so, go to step 203; otherwise, end the fault-tolerance processing flow for this sense data.

The storage system can generate a logical view of its topology. Once a hard disk device has been successfully added to the storage system, it is present in this logical view. This step can therefore determine whether the hard disk device is in the topology of the storage system by querying whether it is present in the logical view.

Step 203: Determine whether the hard disk device is already undergoing abnormality recovery processing. If so, end the fault-tolerance processing flow for this sense data; otherwise, go to step 204.

The abnormality recovery processing referred to in this step is the process of powering the device off and then powering it on after a delay.

Step 204: In the error node corresponding to the hard disk device, increment the abnormality count of the hard disk device by 1.

To manage the abnormalities of each hard disk device, an error node is usually created for a hard disk device when an abnormality is first detected, for example when a command response containing sense data is first received from it. The error node can contain the abnormality count of the hard disk device (0 when the node is created), abnormal event information, and so on.

Step 205: Determine whether the abnormality count of the hard disk device exceeds a preset abnormality count threshold. If not, go to step 206; if so, go to step 209.

In some cases the hard disk device still cannot return to a normal state after being powered off and powered on after a delay, and the SCSI driver performs the same operation the next time it receives sense data from this hard disk device. Sometimes, however, the hard disk device still cannot return to a normal state even after several such power cycles. Recovery of the hard disk device can then be abandoned altogether to avoid a long delay.

Step 206: Send a recovery processing tolerance event to the upper-layer redundant array of independent disks (RAID) to prevent the RAID from removing the hard disk device.

When the upper-layer RAID receives no command response for a long time, it may remove the unresponsive hard disk device from the array. To prevent this from happening while the hard disk device is undergoing abnormality recovery, a recovery processing tolerance event can be sent to the RAID. After receiving this event, the RAID does not remove the hard disk device from the array.

Step 207: Send a power-off command for the hard disk device to the hard disk control module, and simultaneously start a timer.

The hard disk control module is responsible for controlling and managing all hard disk devices in the disk enclosure. After receiving the power-off command for the hard disk device, it cuts off the power supply to that device; after receiving the power-on command for the hard disk device in step 208, it restores the power supply to that device.

Step 208: After the timer expires, send a power-on command for the hard disk device to the hard disk control module, and end the fault-tolerance processing flow for the current sense data.

Step 209: Send a recovery processing failure event to the RAID and, after waiting for a set duration, delete the error node corresponding to the hard disk device.

When the abnormality count of the hard disk device exceeds the preset abnormality count threshold, a timer can be started, and the error node corresponding to the hard disk device is deleted after this timer expires. Waiting for the set duration before deleting the error node gives the RAID time to remove the hard disk device. Once the hard disk device has been removed, its abnormality information no longer needs to be kept, so the corresponding error node is deleted.

After receiving the recovery processing failure event, the RAID can remove the hard disk device.

This concludes the flow of Embodiment 1.
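The decision flow of steps 202 to 209 can be summarized as a small state machine. The sketch below is a simplified model, not the patented implementation: the names `Driver`, `ErrorNode`, `schedule`, `delay_s` and `grace_s` are assumptions made for illustration, RAID events and power commands are only appended to a list, and the timer is an injected callback so the logic can be exercised without real delays.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ErrorNode:
    abnormal_count: int = 0      # 0 when the node is created
    recovering: bool = False     # True while a power cycle is pending


@dataclass
class Driver:
    topology: set                                  # disks in the logical view
    threshold: int                                 # preset abnormality count threshold
    schedule: Callable[[float, Callable[[], None]], None]  # hypothetical timer hook
    events: list = field(default_factory=list)     # RAID events and power commands
    nodes: dict = field(default_factory=dict)      # disk id -> ErrorNode

    def on_sense_data(self, disk, delay_s=5.0, grace_s=10.0):
        if disk not in self.topology:                      # step 202
            return "ignored"
        node = self.nodes.setdefault(disk, ErrorNode())
        if node.recovering:                                # step 203
            return "ignored"
        node.abnormal_count += 1                           # step 204
        if node.abnormal_count > self.threshold:           # step 205
            self.events.append(("raid_failure", disk))     # step 209
            self.schedule(grace_s, lambda: self.nodes.pop(disk, None))
            return "failed"
        self.events.append(("raid_tolerate", disk))        # step 206
        node.recovering = True
        self.events.append(("power_off", disk))            # step 207
        self.schedule(delay_s, lambda: self._power_on(disk))
        return "recovering"

    def _power_on(self, disk):                             # step 208
        self.events.append(("power_on", disk))
        node = self.nodes.get(disk)
        if node is not None:
            node.recovering = False
```

Injecting `schedule` keeps the sketch deterministic; in a real driver it would be backed by the timer started in step 207.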

In Embodiment 1, the reset of powering the hard disk device off and then powering it on after a delay, as shown in the flow of Fig. 2, can be performed for any sense data. Preferably, however, when the sense data indicates that the hard disk device has an unrecoverable abnormality, for example a hardware fault, the hard disk device is instead powered off permanently. That is, a further decision can be made before step 202 of Fig. 2: determine whether the sense data indicates that the hard disk device has an unrecoverable abnormality. If not, continue with step 202; if so, proceed according to the flow of Embodiment 2 below.

Embodiment 2. As shown in Fig. 3, when the SCSI driver receives sense data sent by a hard disk device and the sense data indicates that the hard disk device has an unrecoverable abnormality, the following steps are performed:

Step 301: Determine whether the hard disk device is in the topology of the storage system. If so, go to step 302; otherwise, end the fault-tolerance processing flow for this sense data.

Step 302: Determine whether the hard disk device is already undergoing permanent power-off processing. If so, end the fault-tolerance processing flow for this sense data; otherwise, go to step 303.

Step 303: Send a power-off command for the hard disk device to the hard disk control module to cut off the power supply to the hard disk device.

This concludes the flow of Embodiment 2.
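A minimal sketch of the permanent power-off path follows. The patent does not say which sense data counts as unrecoverable beyond the hardware-fault example, so the check on the SCSI sense key HARDWARE ERROR (0x04) is an assumption, as are the function and parameter names.

```python
HARDWARE_ERROR = 0x04  # SCSI sense key; treating only this key as "unrecoverable" is an assumption


def handle_unrecoverable(disk, sense_key, topology, powered_down, send_power_off):
    """Return the action taken for one command response carrying sense data."""
    if sense_key != HARDWARE_ERROR:
        return "recoverable"        # fall through to the Embodiment 1 flow
    if disk not in topology:        # step 301
        return "ignored"
    if disk in powered_down:        # step 302: already permanently powered off
        return "ignored"
    powered_down.add(disk)
    send_power_off(disk)            # step 303: no power-on timer is started
    return "powered_off"
```

Unlike the delayed power-on reset, no timer is started here, so power is never restored automatically.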

Fault-tolerance processing of sense data can follow the flows of Embodiment 1 and Embodiment 2 in every scenario. Preferably, however, the scenario in which the sense data occurs is identified, and the fault-tolerance processing mode suited to that scenario is applied.

The command response can be extended so that, in addition to the sense data, it carries scenario information describing where the sense data occurred. The I/O command type and the state of the hard disk device can serve as the two basic components of the scenario information, each carried in its own extension field of the command response. With m I/O command types and n hard disk device states, there are m × n classes of scenario in total. In this embodiment, I/O command types are assumed to be divided into management control commands and I/O access commands, where I/O access commands are further divided into read commands and write commands, and the state of the hard disk device is divided into the device-adding state and the static access state. These combine into the six classes of scenario shown in Table 1.

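Table 1

Scenario | I/O command type | State of the hard disk device
Scenario 1 | Management control command | Static access state
Scenario 2 | Read command | Static access state
Scenario 3 | Write command | Static access state
Scenario 4 | Management control command | Device-adding state
Scenario 5 | Read command | Device-adding state
Scenario 6 | Write command | Device-adding state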

When the SCSI driver receives a command response containing sense data, it first parses the scenario information contained in the command response, determines the fault-tolerance processing mode corresponding to that scenario information, and then proceeds according to the determined mode.
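The correspondence between scenario information and fault-tolerance processing mode can be held in a lookup table keyed by the two extension fields. The sketch below is illustrative only: the enum names and handler labels are hypothetical, and the mapping simply restates the handling that Embodiments 3 to 5 and the remaining scenarios describe.

```python
from enum import Enum


class Cmd(Enum):
    MANAGEMENT = "management"
    READ = "read"
    WRITE = "write"


class State(Enum):
    STATIC_ACCESS = "static_access"
    DEVICE_ADD = "device_add"


# m = 3 command types and n = 2 device states give m * n = 6 scenarios.
HANDLERS = {
    (Cmd.MANAGEMENT, State.STATIC_ACCESS): "retry_or_power_cycle",
    (Cmd.READ, State.STATIC_ACCESS): "write_repair",
    (Cmd.WRITE, State.STATIC_ACCESS): "power_cycle",
    (Cmd.MANAGEMENT, State.DEVICE_ADD): "pause_then_power_cycle",
    (Cmd.READ, State.DEVICE_ADD): "power_cycle",
    (Cmd.WRITE, State.DEVICE_ADD): "power_cycle",
}


def select_handler(cmd, state):
    """Look up the fault-tolerance mode for the parsed scenario information."""
    return HANDLERS[(cmd, state)]
```

A table keeps the scenario-to-mode relationship predefined and easy to extend if more command types or device states are distinguished.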

Several of the scenarios in Table 1 are described below through embodiments.

Embodiment 3. After the SCSI driver receives a command response containing sense data, the parsed scenario information is: the I/O command is a management control command and the hard disk device is in the static access state, that is, Scenario 1 in Table 1. As shown in Fig. 4, the following steps are performed:

Step 401: Determine whether the management control command affects normal access to the hard disk device. If so, proceed according to the fault-tolerance processing mode of Embodiment 1, that is, reset the hard disk device by powering it off and then powering it on after a delay; if not, go to step 402.

If the management control command is an administrative command that obtains information such as the serial number or identifier of the hard disk device, it does not affect normal access to the hard disk device. If the management control command is a control command such as enabling or disabling the cache, it does affect normal access to the hard disk device.

Step 402: Determine whether the number of retries for the hard disk device has reached a preset retry threshold. If so, go to step 403; otherwise, go to step 404.

Step 403: Report the error information of the hard disk device to the user and advise the user to replace the hard disk device; at the same time, reset the retry count for the hard disk device to zero and end the fault-tolerance processing flow for the current sense data.

Step 404: Resend the management control command to the hard disk device.

This concludes the flow of Embodiment 3.
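The retry logic of steps 401 to 404 can be sketched as follows. The function name, the dictionary of retry counts and the returned action labels are hypothetical, and incrementing the counter on each resend is an assumption that the text implies but does not state.

```python
def handle_management_failure(disk, affects_access, retries, threshold):
    """Decide how to react when a management control command returns sense data.

    retries maps a disk id to its current retry count.
    """
    if affects_access:                       # step 401
        return "power_cycle"                 # Embodiment 1 handling
    if retries.get(disk, 0) >= threshold:    # step 402
        retries[disk] = 0                    # step 403: clear the count
        return "report_error"                # and advise replacing the disk
    retries[disk] = retries.get(disk, 0) + 1
    return "resend"                          # step 404
```

With a threshold of 2, two failures lead to resends and the third is reported to the user, after which the count starts again from zero.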

Embodiment 4. After the SCSI driver receives a command response containing sense data, the parsed scenario information is: the I/O command is a read command and the hard disk device is in the static access state, that is, Scenario 2 in Table 1. As shown in Fig. 5, the following steps are performed:

Step 501: The SCSI driver notifies the upper-layer RAID to perform write repair processing on the hard disk device.

If the I/O command is a read command, the hard disk device may have returned sense data because the physical device is normal but the logical device is abnormal. Such an abnormality can often be resolved by write repair processing, that is, by rewriting data to the abnormal location.

Step 502: Determine whether the write repair succeeded. If so, go to step 503; otherwise, go to step 504.

Step 503: Resend the read command to the hard disk device, and end the fault-tolerance processing flow for the current sense data.

Step 504: Proceed according to the fault-tolerance processing mode of Embodiment 1, that is, reset the hard disk device by powering it off and then powering it on after a delay.
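Steps 501 to 504 reduce to "try a write repair, then either reread or fall back to the power cycle". In this sketch the three callables are hypothetical hooks standing in for the RAID write repair, the read resend and the Embodiment 1 recovery; none of these names come from the patent.

```python
def handle_read_failure(disk, write_repair, resend_read, power_cycle):
    """React to sense data returned for a read command in the static access state."""
    if write_repair(disk):      # steps 501-502: RAID rewrites the abnormal location
        resend_read(disk)       # step 503
        return "read_resent"
    power_cycle(disk)           # step 504: fall back to Embodiment 1 handling
    return "power_cycled"
```

Passing the operations in as callables keeps the decision logic separate from the RAID layer and the hard disk control module.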

Embodiment 5. After the SCSI driver receives a command response containing sense data, the parsed scenario information is: the I/O command is a management control command and the hard disk device is in the device-adding state, that is, Scenario 4 in Table 1. As shown in Fig. 6, the following steps are performed:

Step 601: The SCSI driver suspends fault-tolerance processing of this sense data.

Because the hard disk device is in the special phase of being added, fault-tolerance processing of the sense data can be suspended for a period of time. The suspension should last long enough for the hard disk device to finish being added, so that the normal addition of the hard disk to the storage system is not otherwise affected.

Step 602: After the hard disk device has finished being added, resume fault-tolerance processing of the sense data, that is, proceed according to the fault-tolerance processing mode of Embodiment 1 and reset the hard disk device by powering it off and then powering it on after a delay.

This concludes the flow of Embodiment 5.
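One way to model the suspend-and-resume behaviour of steps 601 and 602 is to defer recovery for any disk that is still being added. The class below is a hypothetical sketch: `PausedHandler`, `begin_add` and `finish_add` are invented names, and how the driver learns that device addition has completed is not specified in the text.

```python
class PausedHandler:
    """Defers fault-tolerance processing while a disk is being added."""

    def __init__(self, recover):
        self.recover = recover      # Embodiment 1 recovery, supplied by the caller
        self.adding = set()         # disks currently in the device-adding state
        self.deferred = set()       # disks whose sense data is waiting

    def begin_add(self, disk):
        self.adding.add(disk)

    def on_sense_data(self, disk):
        if disk in self.adding:     # step 601: suspend processing
            self.deferred.add(disk)
            return "paused"
        self.recover(disk)
        return "recovered"

    def finish_add(self, disk):
        self.adding.discard(disk)
        if disk in self.deferred:   # step 602: resume processing
            self.deferred.discard(disk)
            self.recover(disk)
```

Using a set means repeated sense data received during the pause triggers a single recovery once the addition completes.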

If, after the SCSI driver receives a command response containing sense data, the parsed scenario information is that the I/O command is a write command and the hard disk device is in the static access state, or that the I/O command is a read command and the hard disk device is in the device-adding state, or that the I/O command is a write command and the hard disk device is in the device-adding state, that is, Scenario 3, Scenario 5 or Scenario 6 in Table 1, processing can follow the fault-tolerance processing mode of Embodiment 1: the hard disk device is reset by powering it off and then powering it on after a delay. For Scenario 5 and Scenario 6, the fact that a read or write command has been received shows that device addition is nearly complete, so the power-off and delayed power-on reset can be applied directly.

The method provided by the present invention has been described in detail above; the apparatus provided by the present invention is described in detail below. Fig. 7 shows the apparatus for processing SCSI sense data provided by the invention. The apparatus is applied to a storage system comprising a SCSI driver unit and a hard disk device, and can be arranged in the SCSI driver unit. As shown in Fig. 7, the apparatus can comprise: an abnormality determining unit 700, an abnormality recovery unit 710 and a timer 720.

The abnormality determining unit 700 is configured to determine that the hard disk device is abnormal after the SCSI driver unit receives a command response containing sense data sent by the hard disk device, and to send an abnormality recovery notification to the abnormality recovery unit 710.

The abnormality recovery unit 710 is configured to perform the following abnormality recovery processing after receiving the abnormality recovery notification: cutting off the power supply of the hard disk device by sending a power-off command, and simultaneously starting the timer 720; after the timer 720 expires, restoring the power supply of the hard disk device by sending a power-on command.

The abnormality recovery unit 710 specifically comprises: an abnormality counting subunit 711, a first judging subunit 712, an event reporting subunit 713 and an abnormality recovery subunit 714.

The abnormality counting subunit 711 is configured to, after receiving the abnormality recovery notification, increment the abnormality count of the hard disk device by 1 in the error node corresponding to the hard disk device and send a first judgment notification to the first judging subunit 712; and, when the judgment result of the first judging subunit 712 is yes, to delete the error node corresponding to the hard disk device after waiting for a set duration.

The first judging subunit 712 is configured to determine, after receiving the first judgment notification, whether the abnormality count of the hard disk device exceeds a preset abnormality count threshold.

The event reporting subunit 713 is configured to, when the judgment result of the first judging subunit 712 is no, send a recovery processing tolerance event to the RAID of the storage system to prevent the RAID from removing the hard disk device; and, when the judgment result of the first judging subunit 712 is yes, to send a recovery processing failure event to the RAID.

The abnormality recovery subunit 714 is configured to perform the abnormality recovery processing when the judgment result of the first judging subunit 712 is no.

Further, the abnormality recovery unit 710 can also comprise a second judging subunit 715, configured to receive the abnormality recovery notification sent by the abnormality determining unit 700 before the abnormality counting subunit 711 does, and to determine whether the hard disk device is already undergoing abnormality recovery processing. If so, it discards the abnormality recovery notification; otherwise, it sends the abnormality recovery notification to the abnormality counting subunit 711.

Preferably, the abnormality recovery unit 710 can also comprise a third judging subunit 716, configured to receive the abnormality recovery notification sent by the abnormality determining unit 700 before the second judging subunit 715 does, and to determine whether the hard disk device is in the topology of the storage system. If so, it sends the abnormality recovery notification to the second judging subunit 715; otherwise, it discards the abnormality recovery notification.

On the basis of the above structure, the apparatus can also comprise a permanent power-off unit 730, which specifically comprises a fourth judging subunit 731 and a permanent power-off subunit 732.

The fourth judging subunit 731 is configured to receive the abnormality recovery notification sent by the abnormality determining unit 700 before the abnormality recovery unit 710 does, and to determine whether the sense data indicates that the hard disk device has an unrecoverable abnormality. If not, it sends the abnormality recovery notification to the abnormality recovery unit 710; if so, it sends a permanent power-off notification to the permanent power-off subunit 732.

The permanent power-off subunit 732 is configured to perform the following permanent power-off processing after receiving the permanent power-off notification: cutting off the power supply to the hard disk device by sending a power-off command.

Further, the permanent power-off unit 730 can also comprise a fifth judging subunit 733 and a sixth judging subunit 734.

The fifth judging subunit 733 is configured to receive the abnormality recovery notification sent by the abnormality determining unit 700 before the fourth judging subunit 731 does, and to determine whether the hard disk device is in the topology of the storage system. If so, it sends the abnormality recovery notification to the sixth judging subunit 734; otherwise, it discards the abnormality recovery notification.

The sixth judging subunit 734 is configured to determine, after receiving the abnormality recovery notification, whether the hard disk device is already undergoing permanent power-off processing. If so, it discards the abnormality recovery notification; otherwise, it sends the abnormality recovery notification to the fourth judging subunit 731.

When the specific scenario in which the sense data occurs is distinguished, the abnormality determining unit 700 can specifically comprise: an abnormality determining subunit 701, a scenario parsing subunit 702 and a mode determining subunit 703.

The abnormality determining subunit 701 is configured to determine that the hard disk device is abnormal after the SCSI driver unit receives a command response containing sense data sent by the hard disk device.

The scenario parsing subunit 702 is configured to parse the scenario information contained in the command response after the abnormality determining subunit 701 determines that the hard disk device is abnormal.

The mode determining subunit 703 is configured to determine, according to a predefined correspondence between scenario information and fault-tolerance processing modes, the fault-tolerance processing mode corresponding to the scenario information parsed by the scenario parsing subunit 702, and, if the determined mode is abnormality recovery processing, to send an abnormality recovery notification to the abnormality recovery unit 710. The scenario information comprises the type of the I/O command to which the command response corresponds and the state of the hard disk device.

If the apparatus includes the permanent power-off unit 730, the mode determining subunit 703 sends the abnormality recovery notification to the abnormality recovery unit 710 after it has been processed by the permanent power-off unit 730 (as shown in Fig. 7). If the apparatus does not include the permanent power-off unit 730, the mode determining subunit 703 sends the abnormality recovery notification directly to the abnormality recovery unit 710.

For the different scenarios, the apparatus can have one of the following structures or any combination of them (Fig. 7 shows the combination of all the structures):

Structure 1: the apparatus also comprises a retry processing unit 740.

The mode determining subunit 703 is configured to send a retry processing notification to the retry processing unit 740 when the scenario information parsed by the scenario parsing subunit 702 is: the type of the I/O command is a management control command and the hard disk device is in the static access state.

The retry processing unit 740 is configured to determine, on receiving the retry processing notification, whether the management control command affects normal access to the hard disk device. If so, it sends an abnormality recovery notification to the abnormality recovery unit 710. If not, it further determines whether the number of retries for the hard disk device has reached a preset retry threshold; if so, it reports the error information of the hard disk device to the user and resets the retry count for the hard disk device to zero; otherwise, it resends the management control command to the hard disk device.

Structure 2: the apparatus also comprises a write repair processing unit 750.

The mode determining subunit 703 is configured to send a write repair notification to the write repair processing unit 750 when the scenario information parsed by the scenario parsing subunit 702 is: the type of the I/O command is a read command and the hard disk device is in the static access state.

The write repair processing unit 750 is configured to notify, on receiving the write repair notification, the RAID in the storage system to perform write repair processing on the hard disk device, and to determine whether the write repair processing succeeded. If so, it resends the read command to the hard disk device; otherwise, it sends an abnormality recovery notification to the abnormality recovery unit 710.

Structure 3: the apparatus can also comprise a pause processing unit 760.

The mode determining subunit 703 is configured to send a pause processing notification to the pause processing unit 760 when the scenario information parsed by the scenario parsing subunit 702 is: the type of the I/O command is a management control command and the hard disk device is in the device-adding state.

The pause processing unit 760 is configured to suspend, on receiving the pause processing notification, the fault-tolerance processing of the sense data by the apparatus, and to resume it after the hard disk device has finished being added.

Structure 4: the mode determining subunit 703 is configured to send an abnormality recovery notification to the abnormality recovery unit 710 when the scenario information parsed by the scenario parsing subunit 702 is: the type of the I/O command is a write command and the hard disk device is in the static access state; or the type of the I/O command is a read command and the hard disk device is in the device-adding state; or the type of the I/O command is a write command and the hard disk device is in the device-adding state.

By above description, can be found out, method and apparatus provided by the invention possesses following advantage:

1) in the present invention; the SCSI driver element is after receiving the command response that comprises sense data that hard disc apparatus sends; by the mode that time delay after electricity under hard disc apparatus is powered on; hard disc apparatus is carried out to abnormal restoring; make hard disc apparatus can after abnormal restoring, process the I/O order in time; avoid because of the I/O command timeout, whole storage system being hung up or being sunk into blocked state, thereby effectively improve the fault-tolerance of storage system.

2) the present invention can send Recovery processing tolerance event to RAID when carrying out the abnormal restoring processing, avoids RAID in the process of hard disc apparatus being carried out to the abnormal restoring processing that hard disc apparatus is deleted.Normal condition RAID with redundancy properties can not demote, hard disc apparatus extremely invisible to the user, thus improve the robustness of storage system.

3) for hard disc apparatus occur expendable abnormal, the present invention is by it is carried out forever descend the mode of electric treatment, the once and for all elimination source of trouble, avoid the source of trouble for a long time the performance of storage system to be caused to adverse effect.

4) The present invention distinguishes the scenarios in which sense data arises and marks the scenario in the command response, so that each scenario receives a fault-tolerant processing mode suited to it, which makes the fault-tolerant processing more targeted.
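
The delayed power cycle described in advantage 1) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class name `PowerCycleRecovery`, the `send_command` callable, and the command strings `"POWER_OFF"` and `"POWER_ON"` are hypothetical and do not appear in the source.

```python
import threading


class PowerCycleRecovery:
    """Power a disk off, wait for a timer, then power it back on."""

    def __init__(self, send_command, delay_seconds):
        # send_command is a hypothetical callable: (disk_id, command) -> None
        self.send_command = send_command
        # delay_seconds is an assumed power-off duration; the source gives no value
        self.delay_seconds = delay_seconds

    def recover(self, disk_id):
        # Cut the power supply and start the timer together.
        self.send_command(disk_id, "POWER_OFF")
        timer = threading.Timer(
            self.delay_seconds, self.send_command, args=(disk_id, "POWER_ON")
        )
        # When the timer expires, the power-on command is sent.
        timer.start()
        return timer
```

Returning the timer lets a caller wait for the power-on step or cancel it; that choice is an assumption of this sketch.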

The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A method for processing small computer system interface (SCSI) sense data, applied to a storage system comprising a SCSI drive unit and a hard disk device, characterized in that, after the SCSI drive unit receives a command response containing sense data sent by the hard disk device, it determines that an abnormality has occurred in the hard disk device and performs the following abnormality recovery processing:

A. cutting off the power supply of the hard disk device by sending a power-off command, and starting a timer;

B. after the timer expires, restoring the power supply of the hard disk device by sending a power-on command;

wherein, before the power supply of the hard disk device is cut off by sending the power-off command, step A further comprises: A1. incrementing by 1 the abnormality count of the hard disk device in the error node corresponding to the hard disk device; A2. judging whether the abnormality count of the hard disk device exceeds a preset abnormality count threshold; if not, sending a recovery-processing tolerance event to the redundant array of independent disks (RAID) of the storage system to prevent the RAID from removing the hard disk device, and continuing with the cutting off of the power supply of the hard disk device by sending the power-off command; otherwise, performing step A3; A3. sending a recovery-processing failure event to the RAID and, after waiting for a set duration, deleting the error node corresponding to the hard disk device.
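
Steps A1 to A3 of the claim above amount to a per-disk counter with a threshold. The sketch below is illustrative only: `ErrorNode`, `RecoveryGate`, the event names, and the return values are hypothetical, and the wait for the set duration before the error node is deleted is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorNode:
    abnormal_count: int = 0


@dataclass
class RecoveryGate:
    threshold: int
    error_nodes: dict = field(default_factory=dict)
    raid_events: list = field(default_factory=list)  # stands in for events sent to the RAID

    def on_abnormal(self, disk_id):
        node = self.error_nodes.setdefault(disk_id, ErrorNode())
        node.abnormal_count += 1                       # step A1
        if node.abnormal_count <= self.threshold:      # step A2: threshold not exceeded
            self.raid_events.append((disk_id, "RECOVERY_TOLERANCE"))
            return "power_cycle"                       # caller continues with power-off
        # step A3: report failure, then drop the error node
        # (the claim waits a set duration first; that wait is left out here)
        self.raid_events.append((disk_id, "RECOVERY_FAILURE"))
        del self.error_nodes[disk_id]
        return "give_up"
```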

2. The method according to claim 1, characterized in that, before step A1, step A further comprises:

A0. judging whether the hard disk device is already undergoing the abnormality recovery processing; if so, ending the fault-tolerant processing of the sense data; otherwise, continuing with step A1.

3. The method according to claim 2, characterized in that, before step A0, step A further comprises: judging whether the hard disk device is in the topology of the storage system; if so, continuing with step A0; otherwise, ending the fault-tolerant processing of the sense data.

4. The method according to claim 1, characterized in that, after determining that an abnormality has occurred in the hard disk device and before performing the abnormality recovery processing, the method further comprises: judging whether the sense data indicates that an unrecoverable abnormality has occurred in the hard disk device; if not, continuing with the abnormality recovery processing; if so, performing the following permanent power-off processing:

C. cutting off the power supply to the hard disk device by sending a power-off command, and ending the fault-tolerant processing of the sense data.

5. The method according to claim 4, characterized in that, before step C, the method further comprises:

C1. judging whether the hard disk device is in the topology of the storage system; if so, performing step C2; otherwise, ending the fault-tolerant processing of the sense data;

C2. judging whether the hard disk device is already undergoing the permanent power-off processing; if so, ending the fault-tolerant processing of the sense data; otherwise, continuing with step C.
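
The guards C1 and C2 ahead of step C can be read as two early exits before the power-off command is sent. The following is a sketch under assumptions: the function name, the `topology` and `powering_off` sets, and the returned strings are hypothetical names chosen for illustration.

```python
def permanent_power_off(disk_id, topology, powering_off, send_command):
    """Apply checks C1 and C2, then power the disk off permanently (step C)."""
    if disk_id not in topology:          # C1: disk is not part of the storage topology
        return "not_in_topology"
    if disk_id in powering_off:          # C2: permanent power-off already under way
        return "already_in_progress"
    powering_off.add(disk_id)            # remember that this disk is being handled
    send_command(disk_id, "POWER_OFF")   # C: no power-on follows
    return "powered_off"
```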

6. The method according to claim 1, characterized in that, after determining that an abnormality has occurred in the hard disk device and before performing the abnormality recovery processing, the method further comprises: parsing the scenario information contained in the command response, determining the fault-tolerant processing mode corresponding to the scenario information according to a predefined correspondence between scenario information and fault-tolerant processing modes, and performing fault-tolerant processing according to the determined mode; wherein at least one fault-tolerant processing mode in the correspondence is the abnormality recovery processing, and the scenario information comprises the type of the I/O command corresponding to the command response and the state of the hard disk device.

7. The method according to claim 6, characterized in that determining the fault-tolerant processing mode corresponding to the scenario information according to the predefined correspondence between scenario information and fault-tolerant processing modes, and performing fault-tolerant processing according to the determined mode, specifically comprises:

if the scenario information is that the I/O command is a management and control command and the hard disk device is in the static access state, performing the following steps:

D1. judging whether the management and control command affects normal access to the hard disk device; if so, performing the abnormality recovery processing; otherwise, performing step D2;

D2. judging whether the number of retries for the hard disk device has reached a preset retry threshold; if so, performing step D3; otherwise, performing step D4;

D3. reporting the error information of the hard disk device to the user, resetting the number of retries for the hard disk device to zero, and ending the fault-tolerant processing of the sense data;

D4. sending the management and control command to the hard disk device again.
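
Steps D1 to D4 describe a bounded retry. The sketch below is illustrative only; the function name, the `retries` dictionary, and the returned strings are hypothetical. The claim does not say when the retry count is incremented, so incrementing it on each resend is an assumption of this sketch.

```python
def handle_management_command_error(disk_id, affects_access, retries, retry_limit):
    """Decide how to handle a failed management and control command."""
    if affects_access:                               # D1: normal access is affected
        return "recover"
    if retries.get(disk_id, 0) >= retry_limit:       # D2: retry threshold reached
        retries[disk_id] = 0                         # D3: reset the count, report to user
        return "report_error"
    # Assumption: count one retry each time the command is sent again.
    retries[disk_id] = retries.get(disk_id, 0) + 1
    return "resend"                                  # D4
```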

8. The method according to claim 6, characterized in that determining the fault-tolerant processing mode corresponding to the scenario information according to the predefined correspondence between scenario information and fault-tolerant processing modes, and performing fault-tolerant processing according to the determined mode, specifically comprises:

if the scenario information is that the I/O command is a read command and the hard disk device is in the static access state, performing the following steps:

E1. notifying the RAID in the storage system to perform write-repair processing on the hard disk device;

E2. judging whether the write-repair processing succeeded; if so, performing step E3; otherwise, performing the abnormality recovery processing;

E3. sending the read command to the hard disk device again.
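
Steps E1 to E3 can be sketched as a repair-then-retry sequence with the abnormality recovery processing as the fallback. The three callables and the returned strings are hypothetical placeholders; how the RAID performs write repair is outside the scope of this sketch.

```python
def handle_read_error(disk_id, raid_write_repair, resend_read, recover):
    """Try RAID write repair first; fall back to abnormality recovery."""
    # E1 and E2: ask the RAID to repair the disk and check the outcome.
    if raid_write_repair(disk_id):
        resend_read(disk_id)      # E3: repeat the failed read command
        return "read_resent"
    recover(disk_id)              # repair failed: run abnormality recovery
    return "recovering"
```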

9. The method according to claim 6, characterized in that determining the fault-tolerant processing mode corresponding to the scenario information according to the predefined correspondence between scenario information and fault-tolerant processing modes, and performing fault-tolerant processing according to the determined mode, specifically comprises:

if the scenario information is that the I/O command is a management and control command and the hard disk device is in the device-adding state, performing the following steps:

F1. suspending the fault-tolerant processing of the sense data;

F2. after the hard disk device has completed device adding, resuming the fault-tolerant processing of the sense data.

10. The method according to claim 6, characterized in that determining the fault-tolerant processing mode corresponding to the scenario information according to the predefined correspondence between scenario information and fault-tolerant processing modes, and performing fault-tolerant processing according to the determined mode, specifically comprises:

if the scenario information is that the I/O command is a write command and the hard disk device is in the static access state, or that the I/O command is a read command and the hard disk device is in the device-adding state, or that the I/O command is a write command and the hard disk device is in the device-adding state, performing the abnormality recovery processing.
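
The correspondence between scenario information and fault-tolerant processing modes recited in claims 6 to 10 can be pictured as a lookup table keyed by command type and disk state. The enum names, the mode strings, and `choose_handling` are hypothetical and serve only as an illustration.

```python
from enum import Enum


class Command(Enum):
    MANAGEMENT = "management"
    READ = "read"
    WRITE = "write"


class DiskState(Enum):
    STATIC_ACCESS = "static_access"
    DEVICE_ADDING = "device_adding"


# (command type, disk state) -> fault-tolerant processing mode
HANDLING = {
    (Command.MANAGEMENT, DiskState.STATIC_ACCESS): "retry",         # claim 7
    (Command.READ, DiskState.STATIC_ACCESS): "write_repair",        # claim 8
    (Command.MANAGEMENT, DiskState.DEVICE_ADDING): "pause",         # claim 9
    (Command.WRITE, DiskState.STATIC_ACCESS): "recover",            # claim 10
    (Command.READ, DiskState.DEVICE_ADDING): "recover",             # claim 10
    (Command.WRITE, DiskState.DEVICE_ADDING): "recover",            # claim 10
}


def choose_handling(command, state):
    """Return the processing mode for the parsed scenario information."""
    return HANDLING[(command, state)]
```

A table keeps the six combinations visible in one place; a chain of conditionals would work equally well.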

11. A device for processing small computer system interface (SCSI) sense data, applied to a storage system comprising a SCSI drive unit and a hard disk device, characterized in that the device comprises: an abnormality determining unit, an abnormality recovery unit, and a timer;

the abnormality recovery unit is configured to, after receiving the abnormality recovery notification, perform the following abnormality recovery processing: cutting off the power supply of the hard disk device by sending a power-off command and simultaneously starting the timer; and, after the timer expires, restoring the power supply of the hard disk device by sending a power-on command;

wherein the abnormality recovery unit specifically comprises: an abnormality counting subunit, a first judging subunit, an event reporting subunit, and an abnormality recovery subunit;

the abnormality counting subunit is configured to, after receiving the abnormality recovery notification, increment by 1 the abnormality count of the hard disk device in the error node corresponding to the hard disk device, and send a first judgment notification to the first judging subunit; and, when the judgment result of the first judging subunit is yes, delete the error node corresponding to the hard disk device after waiting for a set duration;

the first judging subunit is configured to, after receiving the first judgment notification, judge whether the abnormality count of the hard disk device exceeds a preset abnormality count threshold;

the event reporting subunit is configured to, when the judgment result of the first judging subunit is no, send a recovery-processing tolerance event to the redundant array of independent disks (RAID) of the storage system to prevent the RAID from removing the hard disk device; and, when the judgment result of the first judging subunit is yes, send a recovery-processing failure event to the RAID;

the abnormality recovery subunit is configured to perform the abnormality recovery processing when the judgment result of the first judging subunit is no.

12. The device according to claim 11, characterized in that the abnormality recovery unit further comprises: a second judging subunit, configured to receive, ahead of the abnormality counting subunit, the abnormality recovery notification sent by the abnormality determining unit, and to judge whether the hard disk device is already undergoing the abnormality recovery processing; if so, to discard the abnormality recovery notification; otherwise, to send the abnormality recovery notification to the abnormality counting subunit.

13. The device according to claim 12, characterized in that the abnormality recovery unit further comprises: a third judging subunit, configured to receive, ahead of the second judging subunit, the abnormality recovery notification sent by the abnormality determining unit, and to judge whether the hard disk device is in the topology of the storage system; if so, to send the abnormality recovery notification to the second judging subunit; otherwise, to discard the abnormality recovery notification.

14. The device according to claim 11, characterized in that the device further comprises a permanent power-off unit, which specifically comprises: a fourth judging subunit and a permanent power-off subunit;

the fourth judging subunit is configured to receive, ahead of the abnormality recovery unit, the abnormality recovery notification sent by the abnormality determining unit, and to judge whether the sense data indicates that an unrecoverable abnormality has occurred in the hard disk device; if not, to send the abnormality recovery notification to the abnormality recovery unit; if so, to send a permanent power-off notification to the permanent power-off subunit;

the permanent power-off subunit is configured to, after receiving the permanent power-off notification, perform the following permanent power-off processing: cutting off the power supply to the hard disk device by sending a power-off command.

15. The device according to claim 14, characterized in that the permanent power-off unit further comprises: a fifth judging subunit and a sixth judging subunit;

the fifth judging subunit is configured to receive, ahead of the fourth judging subunit, the abnormality recovery notification sent by the abnormality determining unit, and to judge whether the hard disk device is in the topology of the storage system; if so, to send the abnormality recovery notification to the sixth judging subunit; otherwise, to discard the abnormality recovery notification;

the sixth judging subunit is configured to, after receiving the abnormality recovery notification, judge whether the hard disk device is already undergoing the permanent power-off processing; if so, to discard the abnormality recovery notification; otherwise, to send the abnormality recovery notification to the fourth judging subunit.

16. The device according to claim 11, characterized in that the abnormality determining unit specifically comprises: an abnormality determining subunit, a scenario parsing subunit, and a mode determining subunit;

the abnormality determining subunit is configured to determine that an abnormality has occurred in the hard disk device after the SCSI drive unit receives a command response containing sense data sent by the hard disk device;

the scenario parsing subunit is configured to parse the scenario information contained in the command response after the abnormality determining subunit determines that an abnormality has occurred in the hard disk device;

the mode determining subunit is configured to determine, according to a predefined correspondence between scenario information and fault-tolerant processing modes, the fault-tolerant processing mode corresponding to the scenario information parsed by the scenario parsing subunit, and, if the determined fault-tolerant processing mode is the abnormality recovery processing, to send the abnormality recovery notification to the abnormality recovery unit; wherein the scenario information comprises the type of the I/O command corresponding to the command response and the state of the hard disk device.

17. The device according to claim 16, characterized in that the device further comprises: a retry processing unit;

the mode determining subunit is configured to send a retry processing notification to the retry processing unit when the scenario information parsed by the scenario parsing subunit is that the I/O command is a management and control command and the hard disk device is in the static access state;

the retry processing unit is configured to, upon receiving the retry processing notification, judge whether the management and control command affects normal access to the hard disk device; if so, to send the abnormality recovery notification to the abnormality recovery unit; if not, to further judge whether the number of retries for the hard disk device has reached a preset retry threshold; if so, to report the error information of the hard disk device to the user and reset the number of retries for the hard disk device to zero; otherwise, to send the management and control command to the hard disk device again.

18. The device according to claim 16, characterized in that the device further comprises: a write-repair processing unit;

the mode determining subunit is configured to send a write-repair notification to the write-repair processing unit when the scenario information parsed by the scenario parsing subunit is that the I/O command is a read command and the hard disk device is in the static access state;

the write-repair processing unit is configured to, upon receiving the write-repair notification, notify the RAID in the storage system to perform write-repair processing on the hard disk device and judge whether the write-repair processing succeeded; if so, to send the read command to the hard disk device again; otherwise, to send the abnormality recovery notification to the abnormality recovery unit.

19. The device according to claim 16, characterized in that the device further comprises: a suspension processing unit;

the mode determining subunit is configured to send a suspension processing notification to the suspension processing unit when the scenario information parsed by the scenario parsing subunit is that the I/O command is a management and control command and the hard disk device is in the device-adding state;

the suspension processing unit is configured to, upon receiving the suspension processing notification, suspend the device's fault-tolerant processing of the sense data and, after the hard disk device has completed device adding, resume the device's fault-tolerant processing of the sense data.

20. The device according to claim 16, characterized in that the mode determining subunit is configured to send the abnormality recovery notification to the abnormality recovery unit when the scenario information parsed by the scenario parsing subunit is that the I/O command is a write command and the hard disk device is in the static access state, or that the I/O command is a read command and the hard disk device is in the device-adding state, or that the I/O command is a write command and the hard disk device is in the device-adding state.