CN102446123A - Method and device for processing SCSI sensing data - Google Patents

Method and device for processing SCSI sensing data

Info

Publication number
CN102446123A
Authority
CN
China
Prior art keywords
disc apparatus
hard disc
unit
abnormal restoring
fault
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2010105052288A
Other languages
Chinese (zh)
Other versions
CN102446123B (en)
Inventor
徐磊
郑劭馨
张日新
汪文敏
金堂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Technologies Co Ltd
Original Assignee
Hangzhou H3C Technologies Co Ltd
Application filed by Hangzhou H3C Technologies Co Ltd
Priority to CN2010105052288A
Publication of CN102446123A
Application granted
Publication of CN102446123B
Legal status: Active

Abstract

The invention provides a method and a device for processing SCSI (Small Computer System Interface) sense data, applicable to a storage system comprising a SCSI driver unit and a hard disk device. In the method, after receiving a command response containing sense data from the hard disk device, the SCSI driver unit determines that the hard disk device is in an abnormal state and carries out the following anomaly recovery process: it cuts off power to the hard disk device by sending a power-off command and starts a timer at the same time; after the timer expires, it sends a power-on command to restore power to the hard disk device. The invention can effectively improve the fault tolerance of the storage system.

Description

A method and apparatus for processing SCSI sense data
Technical field
The present invention relates to computer communication technology, and in particular to a method and apparatus for processing Small Computer System Interface (SCSI) sense data.
Background
The Small Computer System Interface (SCSI) is a high-speed, high-efficiency storage bus protocol aimed at enterprise applications, and it provides a detailed error-status indication mechanism. The SCSI bus protocol defines an error information code that is returned with a command response to indicate why a command failed or which error state the hard disk is in. This error information code is the SCSI sense data, which defines nearly 200 specific error or error-state conditions. For example, when a hard disk device receives an I/O command, such as a management control command or an access command, and processes it normally, the device returns a command response containing success information; if the device encounters an error and cannot process the command normally, it returns a command response containing SCSI sense data.
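To make the notion of sense data concrete, the following minimal Python sketch decodes the three fields that usually identify an error in fixed-format SCSI sense data: the sense key, the additional sense code (ASC) and the additional sense code qualifier (ASCQ). The names `SenseInfo` and `parse_fixed_sense` are hypothetical and are not taken from the patent; the byte offsets follow the common fixed-format layout and descriptor-format sense data is not handled.

```python
from typing import NamedTuple


class SenseInfo(NamedTuple):
    sense_key: int  # broad error class, e.g. 0x3 = MEDIUM ERROR
    asc: int        # additional sense code
    ascq: int       # additional sense code qualifier


def parse_fixed_sense(buf: bytes) -> SenseInfo:
    """Decode the key fields of fixed-format SCSI sense data."""
    if len(buf) < 14:
        raise ValueError("fixed-format sense data needs at least 14 bytes")
    response_code = buf[0] & 0x7F
    if response_code not in (0x70, 0x71):  # current / deferred error
        raise ValueError(f"not fixed-format sense data: 0x{response_code:02x}")
    # Sense key is the low nibble of byte 2; ASC and ASCQ are bytes 12 and 13.
    return SenseInfo(sense_key=buf[2] & 0x0F, asc=buf[12], ascq=buf[13])


# Example buffer: MEDIUM ERROR (0x3) with ASC/ASCQ 0x11/0x00 (unrecovered read error).
sample = bytes([0x70, 0, 0x03, 0, 0, 0, 0, 0x0A, 0, 0, 0, 0, 0x11, 0x00, 0, 0, 0, 0])
info = parse_fixed_sense(sample)
```

A driver could use these fields to decide which fault-tolerance handling to apply; how the patent's implementation classifies individual codes is not stated in the text.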
The processing of sense data is usually performed in the SCSI driver. As shown in Fig. 1, the SCSI driver has a layered architecture with three layers: an upper layer, a middle layer and a lower layer. Sense data is processed in the upper and middle layers. Existing ways of processing sense data mainly include the following:
First approach: after receiving SCSI sense data from the hard disk device, the SCSI driver directly notifies the upper-layer application that the I/O command was not executed successfully.
Second approach: after receiving SCSI sense data from the hard disk device, the SCSI driver immediately resends the I/O command to the hard disk device.
However, the first approach uses no fault-tolerance mechanism at all. The second approach, simply resending the I/O command, looks simple and effective, but if the anomaly of the hard disk device is persistent or requires additional intervention, repeatedly resending the I/O command cannot resolve it. It also easily causes I/O command timeouts that leave the whole storage system hung or blocked and, in severe cases, can even crash the storage system.
Summary of the invention
In view of this, the present invention provides a method and apparatus for processing SCSI sense data, so as to effectively improve the fault tolerance of a storage system.
A method for processing SCSI sense data, applied to a storage system comprising a SCSI driver unit and a hard disk device, characterized in that after the SCSI driver unit receives a command response containing sense data sent by the hard disk device, it determines that the hard disk device is abnormal and performs the following anomaly recovery processing:
A. cutting off power to the hard disk device by sending a power-off command, and starting a timer at the same time;
B. after the timer expires, restoring power to the hard disk device by sending a power-on command.
An apparatus for processing SCSI sense data, applied to a storage system comprising a SCSI driver unit and a hard disk device, characterized in that the apparatus comprises an anomaly determination unit, an anomaly recovery unit and a timer.
The anomaly determination unit is configured to determine, after the SCSI driver unit receives a command response containing sense data sent by the hard disk device, that the hard disk device is abnormal, and to send an anomaly recovery notification to the anomaly recovery unit.
The anomaly recovery unit is configured to perform the following anomaly recovery processing after receiving the anomaly recovery notification: cutting off power to the hard disk device by sending a power-off command and starting the timer at the same time; and, after the timer expires, restoring power to the hard disk device by sending a power-on command.
As can be seen from the above technical solution, in the present invention the SCSI driver unit, after receiving a command response containing sense data sent by the hard disk device, recovers the hard disk device from the anomaly by powering it off and then powering it on after a delay. The hard disk device can then process I/O commands promptly after recovery, which prevents I/O command timeouts from leaving the whole storage system hung or blocked and thus effectively improves the fault tolerance of the storage system.
Description of drawings
Fig. 1 is a schematic diagram of the layered architecture of a SCSI driver;
Fig. 2 is a flowchart of the method provided by Embodiment one of the present invention;
Fig. 3 is a flowchart of the method provided by Embodiment two of the present invention;
Fig. 4 is a flowchart of the method provided by Embodiment three of the present invention;
Fig. 5 is a flowchart of the method provided by Embodiment four of the present invention;
Fig. 6 is a flowchart of the method provided by Embodiment five of the present invention;
Fig. 7 is a schematic structural diagram of the apparatus provided by the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the invention is described below with reference to the drawings and specific embodiments.
The method provided by the present invention mainly comprises: after a command response containing sense data sent by a hard disk device is received, cutting off power to the hard disk device by sending a power-off command and starting a timer at the same time; and, after the timer expires, restoring power to the hard disk device by sending a power-on command.
In other words, once sense data is received from a hard disk device, the device is determined to be abnormal, and it is returned to a normal state by powering it off and powering it on again after a delay. The method is described in detail below through specific embodiments.
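The core idea, power off, start a timer, power on when the timer expires, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `power_cycle` and the two callback parameters are hypothetical names, the callbacks stand in for whatever mechanism actually delivers the power-off and power-on commands, and the delay value is arbitrary because the text does not specify one.

```python
import threading
from typing import Callable, List, Tuple


def power_cycle(disk_id: str,
                send_power_off: Callable[[str], None],
                send_power_on: Callable[[str], None],
                delay_s: float) -> threading.Timer:
    """Cut power immediately and restore it when the timer expires."""
    send_power_off(disk_id)
    # threading.Timer runs send_power_on(disk_id) once after delay_s seconds.
    timer = threading.Timer(delay_s, send_power_on, args=(disk_id,))
    timer.start()
    return timer


# Usage with stand-in callbacks that only record what was sent.
log: List[Tuple[str, str]] = []
t = power_cycle("disk0",
                lambda d: log.append(("off", d)),
                lambda d: log.append(("on", d)),
                delay_s=0.05)
t.join()  # wait until the power-on callback has run
```

Returning the timer lets the caller cancel a pending power-on, for example if the device is removed while it is powered off; whether the patented implementation allows this is not stated.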
Embodiment one. As shown in Fig. 2, the method may specifically comprise the following steps:
Step 201: The SCSI driver receives a command response containing sense data sent by the hard disk device.
Step 202: Determine whether the hard disk device is in the topology of the storage system. If so, go to step 203; otherwise, end the fault-tolerance processing flow for this sense data.
A storage system can generate a logical view of its topology. Once a hard disk device has been successfully added to the storage system, it appears in this logical view. This step can therefore determine whether the hard disk device is in the topology of the storage system by querying whether it is present in the logical view.
Step 203: Determine whether anomaly recovery processing is already in progress for the hard disk device. If so, end the fault-tolerance processing flow for this sense data; otherwise, go to step 204.
The anomaly recovery processing referred to in this step is the process of powering the device off and then powering it on after a delay.
Step 204: In the error node corresponding to the hard disk device, increment the anomaly count of the hard disk device by 1.
To manage the anomalies of each hard disk device, an error node is usually created for a hard disk device when an anomaly is first detected on it, for example when a command response containing sense data is first received from it. The error node may contain the anomaly count of the hard disk device (0 when first created), anomaly event information, and so on.
Step 205: Determine whether the anomaly count of the hard disk device exceeds a preset anomaly count threshold. If not, go to step 206; if so, go to step 209.
In some cases, the hard disk device still cannot return to a normal state after being powered off and then powered on after a delay, and the SCSI driver performs the same operation the next time it receives sense data from that device. Sometimes, however, the device still cannot recover even after several such power cycles; recovery of the device can then be abandoned altogether to avoid long delays.
Step 206: Send a recovery-tolerance event to the upper-layer redundant array of independent disks (RAID) to prevent the RAID from removing the hard disk device.
An upper-layer RAID that receives no command response for a long time may remove the unresponsive hard disk device from the array. To prevent this from happening while the hard disk device is undergoing anomaly recovery, a recovery-tolerance event can be sent to the RAID; after receiving this event, the RAID does not remove the hard disk device from the array.
Step 207: Send a power-off command for the hard disk device to the hard disk control module, and start a timer at the same time.
The hard disk control module is responsible for controlling and managing all hard disk devices in the disk enclosure. After receiving the power-off command for the hard disk device, it cuts off power to that device; after receiving the power-on command for the device in step 208, it restores power to that device.
Step 208: After the timer expires, send a power-on command for the hard disk device to the hard disk control module, and end the fault-tolerance processing flow for the current sense data.
Step 209: Send a recovery-failure event to the RAID and, after waiting for a set duration, delete the error node corresponding to the hard disk device.
When the anomaly count of the hard disk device exceeds the preset anomaly count threshold, a timer can be started, and the error node corresponding to the hard disk device is deleted after this timer expires. Waiting for a set duration before deleting the error node gives the RAID time to remove the hard disk device. Once the hard disk device has been removed, its anomaly information no longer needs to be kept, so the corresponding error node is deleted.
After receiving the recovery-failure event, the RAID can remove the hard disk device from its set of storage devices.
This concludes the flow of Embodiment one.
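Steps 202 to 209 can be sketched as a small state machine. The Python below is a simplified model under stated assumptions, not the patented implementation: the class and method names are hypothetical, commands and RAID events are recorded as strings instead of being sent, and timer expiry is modelled as an explicit `on_timer_expired` call so that the control flow stays easy to follow.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ErrorNode:
    anomaly_count: int = 0  # 0 when the node is first created
    events: List[str] = field(default_factory=list)


@dataclass
class SenseHandler:
    topology: Set[str]                      # logical view of the storage system
    threshold: int = 3                      # preset anomaly count threshold (value assumed)
    recovering: Set[str] = field(default_factory=set)
    error_nodes: Dict[str, ErrorNode] = field(default_factory=dict)
    actions: List[str] = field(default_factory=list)

    def on_sense_data(self, disk: str) -> None:
        if disk not in self.topology:        # step 202
            return
        if disk in self.recovering:          # step 203
            return
        node = self.error_nodes.setdefault(disk, ErrorNode())
        node.anomaly_count += 1              # step 204
        if node.anomaly_count > self.threshold:                # step 205
            self.actions.append(f"raid:recovery_failed:{disk}")  # step 209
            self.actions.append(f"schedule_error_node_delete:{disk}")
            return
        self.actions.append(f"raid:recovery_tolerance:{disk}")   # step 206
        self.actions.append(f"power_off:{disk}")                 # step 207
        self.recovering.add(disk)

    def on_timer_expired(self, disk: str) -> None:
        self.actions.append(f"power_on:{disk}")                  # step 208
        self.recovering.discard(disk)


handler = SenseHandler(topology={"disk0"}, threshold=1)
handler.on_sense_data("disk0")      # first anomaly: power cycle
handler.on_timer_expired("disk0")
handler.on_sense_data("disk0")      # second anomaly exceeds the threshold
handler.on_sense_data("unknown")    # not in the topology: ignored
```

Tracking devices under recovery in a set is one simple way to implement the check of step 203, so that sense data arriving while a device is powered off does not trigger a second power cycle.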
In Embodiment one, the recovery method of Fig. 2, powering the hard disk device off and then on after a delay, can be applied to any sense data. Preferably, however, for sense data indicating that the hard disk device has an unrecoverable anomaly, for example a hardware fault, the hard disk device can instead be permanently powered off. That is, before step 202 of Fig. 2, a further determination can be made as to whether the sense data indicates an unrecoverable anomaly of the hard disk device. If not, continue with step 202; if so, proceed according to the flow of Embodiment two below.
Embodiment two. As shown in Fig. 3, when the SCSI driver receives sense data sent by the hard disk device and the sense data indicates that the hard disk device has an unrecoverable anomaly, the following steps are performed:
Step 301: Determine whether the hard disk device is in the topology of the storage system. If so, go to step 302; otherwise, end the fault-tolerance processing flow for this sense data.
Step 302: Determine whether permanent power-off processing is already in progress for the hard disk device. If so, end the fault-tolerance processing flow for this sense data; otherwise, go to step 303.
Step 303: Send a power-off command for the hard disk device to the hard disk control module to cut off power to the device.
This concludes the flow of Embodiment two.
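The extra determination before step 202 and the permanent power-off flow of steps 301 to 303 might look as follows in Python. The function name and return strings are hypothetical. Treating the SCSI sense key HARDWARE ERROR (0x4) as the "unrecoverable" case is an assumption made only for this sketch; the text gives a hardware fault as an example but does not say which sense codes are classified as unrecoverable.

```python
from typing import List, Set

HARDWARE_ERROR = 0x4  # SCSI sense key; using it as the "unrecoverable" test is an assumption


def handle_sense(disk: str, sense_key: int, topology: Set[str],
                 powered_off: Set[str], actions: List[str]) -> str:
    """Route sense data either to delayed power-on recovery or to permanent power-off."""
    if sense_key != HARDWARE_ERROR:
        return "recoverable"          # continue with step 202 of Embodiment one
    if disk not in topology:          # step 301
        return "ignored"
    if disk in powered_off:           # step 302: permanent power-off already done
        return "ignored"
    powered_off.add(disk)
    actions.append(f"power_off:{disk}")  # step 303: no timer, no power-on
    return "permanent_power_off"


actions: List[str] = []
off: Set[str] = set()
first = handle_sense("disk0", HARDWARE_ERROR, {"disk0"}, off, actions)
second = handle_sense("disk0", HARDWARE_ERROR, {"disk0"}, off, actions)
other = handle_sense("disk0", 0x3, {"disk0"}, off, actions)
```

The difference from Embodiment one is simply that no timer is started, so the power-on command is never sent.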
Sense data can be handled in every scenario according to the flows of Embodiment one and Embodiment two. More preferably, however, the scenario in which the sense data was generated can be distinguished, and a fault-tolerance handling mode suited to each scenario can be applied.
The command response can be extended so that, in addition to the sense data, it also carries information about the scenario in which the sense data was generated. The I/O command type and the state of the hard disk device can serve as the two basic elements of the scenario information, each carried in its own extension field of the command response. With m I/O command types and n hard disk device states, there are m × n scenario classes. In this embodiment, I/O command types are assumed to be divided into management control commands and I/O access commands, the latter comprising read commands and write commands, and the state of the hard disk device is divided into the device addition state and the static access state. These combine into the six scenario classes shown in Table 1.
Table 1
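Scenario | I/O command type | State of the hard disk device
A | Management control command | Static access state
B | Read command | Static access state
C | Write command | Static access state
D | Management control command | Device addition state
E | Read command | Device addition state
F | Write command | Device addition state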
When the SCSI driver receives a command response containing sense data, it first parses the scenario information contained in the command response, determines the fault-tolerance handling mode corresponding to that scenario information, and proceeds according to that mode.
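The mapping from scenario information to handling mode is essentially a lookup keyed by the pair (I/O command type, device state). A minimal Python sketch follows; the enum and handler names are hypothetical, and the mode assigned to each pair is taken from the embodiments described in this document.

```python
from enum import Enum


class Cmd(Enum):
    MANAGEMENT = "management_control"
    READ = "read"
    WRITE = "write"


class State(Enum):
    STATIC_ACCESS = "static_access"
    DEVICE_ADDITION = "device_addition"


# m = 3 command types and n = 2 device states give m * n = 6 scenario classes.
HANDLING = {
    (Cmd.MANAGEMENT, State.STATIC_ACCESS): "retry_or_power_cycle",      # Embodiment three
    (Cmd.READ, State.STATIC_ACCESS): "write_repair",                    # Embodiment four
    (Cmd.WRITE, State.STATIC_ACCESS): "power_cycle",                    # Embodiment one
    (Cmd.MANAGEMENT, State.DEVICE_ADDITION): "pause_then_power_cycle",  # Embodiment five
    (Cmd.READ, State.DEVICE_ADDITION): "power_cycle",                   # Embodiment one
    (Cmd.WRITE, State.DEVICE_ADDITION): "power_cycle",                  # Embodiment one
}


def select_handling(cmd: Cmd, state: State) -> str:
    """Return the fault-tolerance handling mode for a parsed scenario."""
    return HANDLING[(cmd, state)]


mode = select_handling(Cmd.READ, State.STATIC_ACCESS)
```

Keeping the correspondence in a table makes it straightforward to add command types or device states later without touching the dispatch logic.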
Several of the scenarios in Table 1 are described below through embodiments.
Embodiment three. After the SCSI driver receives a command response containing sense data, the parsed scenario information is: the I/O command is a management control command and the hard disk device is in the static access state, that is, Scenario A in Table 1. As shown in Fig. 4, the following steps are performed:
Step 401: Determine whether the management control command affects normal access to the hard disk device. If so, proceed according to the fault-tolerance handling mode of Embodiment one, that is, power the hard disk device off and then on after a delay; if not, go to step 402.
A management control command that obtains information such as the serial number or identifier of the hard disk device does not affect normal access to the device; a management control command that controls the device, such as enabling or disabling the cache, does.
Step 402: Determine whether the number of retries for the hard disk device has reached a preset retry threshold. If so, go to step 403; otherwise, go to step 404.
Step 403: Report the error information of the hard disk device to the user and advise the user to replace the device; at the same time, reset the retry count for the hard disk device to zero and end the fault-tolerance processing flow for the current sense data.
Step 404: Resend the management control command to the hard disk device.
This concludes the flow of Embodiment three.
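Steps 401 to 404 amount to a bounded retry with an escape to power-cycle recovery. The Python sketch below illustrates this; the function name and return strings are hypothetical, and incrementing the retry counter on each resend is an assumption, since the text only states that the counter is compared with a threshold and reset in step 403.

```python
from typing import Dict


def handle_management_failure(disk: str, affects_access: bool,
                              retries: Dict[str, int], max_retries: int) -> str:
    """Decide how to react when a management control command returns sense data."""
    if affects_access:                         # step 401
        return "power_cycle"                   # handled as in Embodiment one
    if retries.get(disk, 0) >= max_retries:    # step 402
        retries[disk] = 0                      # step 403: reset counter, alert the user
        return "alert_user"
    retries[disk] = retries.get(disk, 0) + 1   # assumed bookkeeping for step 404
    return "resend_command"                    # step 404


retry_counts: Dict[str, int] = {}
outcomes = [handle_management_failure("disk0", False, retry_counts, max_retries=2)
            for _ in range(3)]
```

With `max_retries=2`, the command is resent twice and the third failure produces the user alert, after which the counter starts again from zero.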
Embodiment four. After the SCSI driver receives a command response containing sense data, the scenario information parsed by the SCSI driver is: the I/O command is a read command and the hard disk device is in the static access state, that is, Scenario B in Table 1. As shown in Fig. 5, the following steps are performed:
Step 501: The SCSI driver notifies the upper-layer RAID to perform write repair on the hard disk device.
When the I/O command is a read command, the hard disk device may have returned sense data because the physical device is normal but the logical device is abnormal. Such an anomaly can often be resolved by write repair, that is, by rewriting data to the abnormal location.
Step 502: Determine whether the write repair succeeded. If so, go to step 503; otherwise, go to step 504.
Step 503: Resend the read command to the hard disk device and end the fault-tolerance processing flow for the current sense data.
Step 504: Proceed according to the fault-tolerance handling mode of Embodiment one, that is, power the hard disk device off and then on after a delay.
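The write-repair flow of steps 501 to 504 reduces to a single branch on the repair result. In the Python sketch below, `raid_write_repair` is a hypothetical callback standing in for the upper-layer RAID; it is assumed to return True when the data at the abnormal location was rewritten successfully.

```python
from typing import Callable


def handle_read_failure(disk: str, raid_write_repair: Callable[[str], bool]) -> str:
    """Try write repair first and fall back to power-cycle recovery."""
    if raid_write_repair(disk):   # steps 501 and 502
        return "resend_read"      # step 503
    return "power_cycle"          # step 504: handled as in Embodiment one


repaired = handle_read_failure("disk0", lambda d: True)
not_repaired = handle_read_failure("disk0", lambda d: False)
```

Trying the cheaper repair first means the disruptive power cycle is only used when rewriting the data does not clear the anomaly.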
Embodiment five. After the SCSI driver receives a command response containing sense data, the parsed scenario information is: the I/O command is a management control command and the hard disk device is in the device addition state, that is, Scenario D in Table 1. As shown in Fig. 6, the following steps are performed:
Step 601: The SCSI driver suspends fault-tolerance processing of this sense data.
Because the hard disk device is in the special phase of device addition, fault-tolerance processing of the sense data can be suspended for a period of time long enough to ensure that the hard disk device completes device addition, so that the normal addition of the hard disk to the storage system is not otherwise affected.
Step 602: After the hard disk device completes device addition, resume fault-tolerance processing of the sense data, which can proceed according to the handling mode of Embodiment one, that is, power the hard disk device off and then on after a delay.
This concludes the flow of Embodiment five.
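One possible way to model the suspension of steps 601 and 602 is to wait on an event that is set when device addition completes. The Python sketch below uses `threading.Event` for this; the function name, the timeout parameter and the "still_paused" outcome are assumptions introduced for illustration, since the text only says that processing is suspended long enough for device addition to finish.

```python
import threading


def handle_during_addition(addition_done: threading.Event, timeout_s: float) -> str:
    """Suspend fault-tolerance processing until device addition completes."""
    # Step 601: block here; Event.wait returns True once the event is set.
    if addition_done.wait(timeout_s):
        return "power_cycle"   # step 602: resume, handled as in Embodiment one
    return "still_paused"      # assumed outcome if addition has not finished in time


done = threading.Event()
done.set()                     # device addition already finished
resumed = handle_during_addition(done, timeout_s=0.01)

pending = threading.Event()    # device addition still in progress
waiting = handle_during_addition(pending, timeout_s=0.01)
```

An event-based wait avoids having to guess a fixed pause length, at the cost of requiring some component to signal when device addition has completed.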
If, after the SCSI driver receives a command response containing sense data, the parsed scenario information is that the I/O command is a write command and the hard disk device is in the static access state, or that the I/O command is a read command and the hard disk device is in the device addition state, or that the I/O command is a write command and the hard disk device is in the device addition state, that is, Scenario C, Scenario E or Scenario F in Table 1, the fault-tolerance handling mode of Embodiment one can be applied in each case: power the hard disk device off and then on after a delay. For Scenario E and Scenario F, the fact that a read or write command has been received indicates that device addition is about to finish, so the power-off and delayed power-on recovery can be applied directly.
The above is a detailed description of the method provided by the present invention; the apparatus provided by the present invention is described in detail below. Fig. 7 shows the apparatus for processing SCSI sense data provided by the present invention. The apparatus is applied to a storage system comprising a SCSI driver unit and a hard disk device, and may be located in the SCSI driver unit. As shown in Fig. 7, the apparatus may comprise an anomaly determination unit 700, an anomaly recovery unit 710 and a timer 720.
The anomaly determination unit 700 is configured to determine, after the SCSI driver unit receives a command response containing sense data sent by the hard disk device, that the hard disk device is abnormal, and to send an anomaly recovery notification to the anomaly recovery unit 710.
The anomaly recovery unit 710 is configured to perform the following anomaly recovery processing after receiving the anomaly recovery notification: cutting off power to the hard disk device by sending a power-off command and starting the timer 720 at the same time; and, after the timer 720 expires, restoring power to the hard disk device by sending a power-on command.
The anomaly recovery unit 710 specifically comprises an anomaly count subunit 711, a first determination subunit 712, an event reporting subunit 713 and an anomaly recovery subunit 714.
The anomaly count subunit 711 is configured to, after receiving the anomaly recovery notification, increment the anomaly count of the hard disk device by 1 in the error node corresponding to the device and send a first determination notification to the first determination subunit 712; and, when the result of the first determination subunit 712 is yes, to delete the error node corresponding to the hard disk device after waiting for a set duration.
The first determination subunit 712 is configured to determine, after receiving the first determination notification, whether the anomaly count of the hard disk device exceeds the preset anomaly count threshold.
The event reporting subunit 713 is configured to send a recovery-tolerance event to the RAID of the storage system when the result of the first determination subunit 712 is no, to prevent the RAID from removing the hard disk device, and to send a recovery-failure event to the RAID when the result of the first determination subunit 712 is yes.
The anomaly recovery subunit 714 is configured to perform the anomaly recovery processing when the result of the first determination subunit 712 is no.
Further, the anomaly recovery unit 710 may also comprise a second determination subunit 715, configured to receive the anomaly recovery notification sent by the anomaly determination unit 700 before the anomaly count subunit 711 does, and to determine whether anomaly recovery processing is already in progress for the hard disk device. If so, it discards the anomaly recovery notification; otherwise, it forwards the notification to the anomaly count subunit 711.
Preferably, the anomaly recovery unit 710 may also comprise a third determination subunit 716, configured to receive the anomaly recovery notification sent by the anomaly determination unit 700 before the second determination subunit 715 does, and to determine whether the hard disk device is in the topology of the storage system. If so, it forwards the anomaly recovery notification to the second determination subunit 715; otherwise, it discards the notification.
On the basis of the above structure, the apparatus may further comprise a permanent power-off unit 730, which specifically comprises a fourth determination subunit 731 and a permanent power-off subunit 732.
The fourth determination subunit 731 is configured to receive the anomaly recovery notification sent by the anomaly determination unit 700 before the anomaly recovery unit 710 does, and to determine whether the sense data indicates that the hard disk device has an unrecoverable anomaly. If not, it forwards the anomaly recovery notification to the anomaly recovery unit 710; if so, it sends a permanent power-off notification to the permanent power-off subunit 732.
The permanent power-off subunit 732 is configured to perform the following permanent power-off processing after receiving the permanent power-off notification: cutting off power to the hard disk device by sending a power-off command.
Further, the permanent power-off unit 730 may also comprise a fifth determination subunit 733 and a sixth determination subunit 734.
The fifth determination subunit 733 is configured to receive the anomaly recovery notification sent by the anomaly determination unit 700 before the fourth determination subunit 731 does, and to determine whether the hard disk device is in the topology of the storage system. If so, it forwards the anomaly recovery notification to the sixth determination subunit 734; otherwise, it discards the notification.
The sixth determination subunit 734 is configured to determine, after receiving the anomaly recovery notification, whether permanent power-off processing is already in progress for the hard disk device. If so, it discards the anomaly recovery notification; otherwise, it forwards the notification to the fourth determination subunit 731.
When the specific scenario in which the sense data was generated is distinguished, the anomaly determination unit 700 may specifically comprise an anomaly determination subunit 701, a scenario parsing subunit 702 and a mode determination subunit 703.
The anomaly determination subunit 701 is configured to determine, after the SCSI driver unit receives a command response containing sense data sent by the hard disk device, that the hard disk device is abnormal.
The scenario parsing subunit 702 is configured to parse the scenario information contained in the command response after the anomaly determination subunit 701 determines that the hard disk device is abnormal.
The mode determination subunit 703 is configured to determine, according to a predefined correspondence between scenario information and fault-tolerance handling modes, the handling mode corresponding to the scenario information parsed by the scenario parsing subunit 702, and, if the determined handling mode is anomaly recovery processing, to send an anomaly recovery notification to the anomaly recovery unit 710. The scenario information comprises the type of the I/O command to which the command response corresponds and the state of the hard disk device.
If the apparatus includes the permanent power-off unit 730, the mode determination subunit 703 sends the anomaly recovery notification to the anomaly recovery unit 710 after it has been processed by the permanent power-off unit 730 (as shown in Fig. 7); if the apparatus does not include the permanent power-off unit 730, the mode determination subunit 703 sends the anomaly recovery notification directly to the anomaly recovery unit 710.
For the different scenarios, the apparatus may have one of the following structures or any combination of them (Fig. 7 shows the combination of all the structures):
Structure one: the apparatus further comprises a retry processing unit 740.
The mode determination subunit 703 sends a retry processing notification to the retry processing unit 740 when the scenario information parsed by the scenario parsing subunit 702 is: the I/O command is a management control command and the hard disk device is in the static access state.
The retry processing unit 740 is configured to, on receiving the retry processing notification, determine whether the management control command affects normal access to the hard disk device. If so, it sends an anomaly recovery notification to the anomaly recovery unit 710. If not, it further determines whether the number of retries for the hard disk device has reached a preset retry threshold; if so, it reports the error information of the hard disk device to the user and resets the retry count for the hard disk device to zero; otherwise, it resends the management control command to the hard disk device.
Structure two: the apparatus further comprises a write repair processing unit 750.
The mode determination subunit 703 sends a write repair notification to the write repair processing unit 750 when the scenario information parsed by the scenario parsing subunit 702 is: the I/O command is a read command and the hard disk device is in the static access state.
The write repair processing unit 750 is configured to, on receiving the write repair notification, notify the RAID in the storage system to perform write repair on the hard disk device and determine whether the write repair succeeded. If so, it resends the read command to the hard disk device; otherwise, it sends an anomaly recovery notification to the anomaly recovery unit 710.
Structure three: the apparatus may further comprise a suspension processing unit 760.
The mode determination subunit 703 sends a suspension processing notification to the suspension processing unit 760 when the scenario information parsed by the scenario parsing subunit 702 is: the I/O command is a management control command and the hard disk device is in the device addition state.
The suspension processing unit 760 is configured to, on receiving the suspension processing notification, suspend the apparatus's fault-tolerance processing of the sense data and, after the hard disk device completes device addition, resume the apparatus's fault-tolerance processing of the sense data.
Structure four: the mode determination subunit 703 sends an anomaly recovery notification to the anomaly recovery unit 710 when the scenario information parsed by the scenario parsing subunit 702 is that the I/O command is a write command and the hard disk device is in the static access state, or that the I/O command is a read command and the hard disk device is in the device addition state, or that the I/O command is a write command and the hard disk device is in the device addition state.
As can be seen from the above description, the method and device provided by the present invention have the following advantages:
1) In the present invention, after receiving a command response that includes sense data sent by the hard disk device, the SCSI driver unit recovers the hard disk device by powering it off and then powering it back on after a delay, so that the hard disk device can process I/O commands promptly once recovered. This prevents the whole storage system from hanging or becoming blocked because of I/O command timeouts, and thereby effectively improves the fault tolerance of the storage system.
2) The present invention can send a recovery-processing tolerance event to the RAID when performing abnormality recovery, which prevents the RAID from removing the hard disk device while the device is being recovered. Under normal circumstances, a RAID with redundancy is not degraded and the abnormality of the hard disk device is invisible to the user, which improves the robustness of the storage system.
3) For a hard disk device with an unrecoverable abnormality, the present invention performs permanent power-off processing on the device, eliminating the fault source once and for all and preventing it from adversely affecting the performance of the storage system over the long term.
4) The present invention distinguishes the scene in which the sense data was generated, identifies that scene in the command response, and applies the fault-tolerant processing mode suited to each scene, which makes the fault-tolerant processing more targeted.
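The recovery behavior summarized in the advantages above (delayed power cycle, abnormality counting with RAID events, and permanent power-off) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the class `DiskRecovery`, the event strings, the default delay and threshold values, and the injectable `timer_factory` are all hypothetical. The topology check, error-node deletion, and locking between threads are omitted for brevity.

```python
import threading
from typing import Callable


class DiskRecovery:
    """Illustrative per-device abnormality recovery state machine."""

    def __init__(
        self,
        power_off: Callable[[], None],
        power_on: Callable[[], None],
        notify_raid: Callable[[str], None],
        delay_s: float = 5.0,      # assumed power-off duration
        max_abnormal: int = 3,     # assumed abnormality count threshold
        timer_factory=threading.Timer,
    ) -> None:
        self.power_off = power_off
        self.power_on = power_on
        self.notify_raid = notify_raid
        self.delay_s = delay_s
        self.max_abnormal = max_abnormal
        self.timer_factory = timer_factory
        self.abnormal_count = 0
        self.recovering = False

    def on_sense_data(self, unrecoverable: bool = False) -> str:
        if unrecoverable:
            # Permanent power-off: no timer, the device stays off.
            self.power_off()
            return "permanent_off"
        if self.recovering:
            # A recovery is already in progress for this device.
            return "ignored"
        self.abnormal_count += 1
        if self.abnormal_count > self.max_abnormal:
            # Too many abnormalities: tell the RAID that recovery failed.
            self.notify_raid("recovery_failed")
            return "failed"
        # Ask the RAID to tolerate the outage so it does not remove the device.
        self.notify_raid("recovery_tolerance")
        self.recovering = True
        self.power_off()
        self.timer_factory(self.delay_s, self._power_back_on).start()
        return "recovering"

    def _power_back_on(self) -> None:
        # Runs when the timer expires.
        self.power_on()
        self.recovering = False
```

Passing a different `timer_factory` allows the delay to be simulated in tests; by default `threading.Timer` runs the power-on callback on a separate thread after `delay_s` seconds.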
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (22)

1. A method for processing small computer system interface (SCSI) sense data, applied to a storage system comprising a SCSI driver unit and a hard disk device, characterized in that, after receiving a command response that includes sense data sent by the hard disk device, the SCSI driver unit determines that the hard disk device is abnormal and performs the following abnormality recovery processing:
A. cutting off the power supply to the hard disk device by sending a power-off command, and starting a timer at the same time;
B. after the timer expires, restoring the power supply to the hard disk device by sending a power-on command.
2. The method according to claim 1, characterized in that, before cutting off the power supply to the hard disk device by sending the power-off command, step A further comprises:
A1. incrementing the abnormality count of the hard disk device by 1 in the error node corresponding to the hard disk device;
A2. determining whether the abnormality count of the hard disk device exceeds a preset abnormality count threshold; if not, sending a recovery-processing tolerance event to the redundant array of independent disks (RAID) of the storage system to prevent the RAID from removing the hard disk device, and continuing with cutting off the power supply to the hard disk device by sending the power-off command; otherwise, performing step A3;
A3. sending a recovery-processing failure event to the RAID and, after waiting for a set duration, deleting the error node corresponding to the hard disk device.
3. The method according to claim 2, characterized in that, before step A1, step A further comprises:
A0. determining whether the hard disk device is already undergoing the abnormality recovery processing; if so, ending the fault-tolerant processing of the sense data; otherwise, proceeding to step A1.
4. The method according to claim 3, characterized in that, before step A0, step A further comprises: determining whether the hard disk device is in the topology of the storage system; if so, proceeding to step A0; otherwise, ending the fault-tolerant processing of the sense data.
5. The method according to claim 1, characterized in that, after determining that the hard disk device is abnormal and before performing the abnormality recovery processing, the method further comprises: determining whether the sense data indicates that an unrecoverable abnormality has occurred in the hard disk device; if not, proceeding with the abnormality recovery processing; if so, performing the following permanent power-off processing:
C. cutting off the power supply to the hard disk device by sending a power-off command, and ending the fault-tolerant processing of the sense data.
6. The method according to claim 5, characterized in that, before step C, the method further comprises:
C1. determining whether the hard disk device is in the topology of the storage system; if so, performing step C2; otherwise, ending the fault-tolerant processing of the sense data;
C2. determining whether the hard disk device is already undergoing the permanent power-off processing; if so, ending the fault-tolerant processing of the sense data; otherwise, proceeding to step C.
7. The method according to claim 1, characterized in that, after determining that the hard disk device is abnormal and before performing the abnormality recovery processing, the method further comprises: parsing the scene information included in the command response; determining, according to a predefined correspondence between scene information and fault-tolerant processing modes, the fault-tolerant processing mode corresponding to the scene information; and performing fault-tolerant processing in the determined mode; wherein at least one fault-tolerant processing mode in the correspondence is the abnormality recovery processing, and the scene information comprises the type of the I/O command corresponding to the command response and the state of the hard disk device.
8. The method according to claim 7, characterized in that determining, according to the predefined correspondence between scene information and fault-tolerant processing modes, the fault-tolerant processing mode corresponding to the scene information and performing fault-tolerant processing in the determined mode specifically comprises:
if the scene information is that the type of the I/O command is a management control command and the hard disk device is in the static access state, performing the following steps:
D1. determining whether the management control command affects normal access to the hard disk device; if so, performing the abnormality recovery processing; otherwise, performing step D2;
D2. determining whether the number of retries for the hard disk device has reached a preset retry threshold; if so, performing step D3; otherwise, performing step D4;
D3. reporting the error information of the hard disk device to the user, resetting the retry count for the hard disk device to zero, and ending the fault-tolerant processing of the sense data;
D4. resending the management control command to the hard disk device.
9. The method according to claim 7, characterized in that determining, according to the predefined correspondence between scene information and fault-tolerant processing modes, the fault-tolerant processing mode corresponding to the scene information and performing fault-tolerant processing in the determined mode specifically comprises:
if the scene information is that the type of the I/O command is a read command and the hard disk device is in the static access state, performing the following steps:
E1. notifying the RAID in the storage system to perform write-repair processing on the hard disk device;
E2. determining whether the write-repair processing succeeded; if so, performing step E3; otherwise, performing the abnormality recovery processing;
E3. resending the read command to the hard disk device.
10. The method according to claim 7, characterized in that determining, according to the predefined correspondence between scene information and fault-tolerant processing modes, the fault-tolerant processing mode corresponding to the scene information and performing fault-tolerant processing in the determined mode specifically comprises:
if the scene information is that the type of the I/O command is a management control command and the hard disk device is in the device-adding state, performing the following steps:
F1. suspending the fault-tolerant processing of the sense data;
F2. after the hard disk device has finished being added, resuming the fault-tolerant processing of the sense data.
11. The method according to claim 7, characterized in that determining, according to the predefined correspondence between scene information and fault-tolerant processing modes, the fault-tolerant processing mode corresponding to the scene information and performing fault-tolerant processing in the determined mode specifically comprises:
if the scene information is that the type of the I/O command is a write command and the hard disk device is in the static access state, or the type of the I/O command is a read command and the hard disk device is in the device-adding state, or the type of the I/O command is a write command and the hard disk device is in the device-adding state, performing the abnormality recovery processing.
12. A device for processing small computer system interface (SCSI) sense data, applied to a storage system comprising a SCSI driver unit and a hard disk device, characterized in that the device comprises an abnormality determination unit, an abnormality recovery unit, and a timer;
the abnormality determination unit is configured to, after the SCSI driver unit receives a command response that includes sense data sent by the hard disk device, determine that the hard disk device is abnormal and send an abnormality recovery notification to the abnormality recovery unit;
the abnormality recovery unit is configured to, after receiving the abnormality recovery notification, perform the following abnormality recovery processing: cutting off the power supply to the hard disk device by sending a power-off command and starting the timer at the same time; and, after the timer expires, restoring the power supply to the hard disk device by sending a power-on command.
13. The device according to claim 12, characterized in that the abnormality recovery unit specifically comprises an abnormality count subunit, a first judgment subunit, an event reporting subunit, and an abnormality recovery subunit;
the abnormality count subunit is configured to, after receiving the abnormality recovery notification, increment the abnormality count of the hard disk device by 1 in the error node corresponding to the hard disk device and send a first judgment notification to the first judgment subunit; and, when the judgment result of the first judgment subunit is yes, delete the error node corresponding to the hard disk device after waiting for a set duration;
the first judgment subunit is configured to, after receiving the first judgment notification, determine whether the abnormality count of the hard disk device exceeds a preset abnormality count threshold;
the event reporting subunit is configured to, when the judgment result of the first judgment subunit is no, send a recovery-processing tolerance event to the redundant array of independent disks (RAID) of the storage system to prevent the RAID from removing the hard disk device; and, when the judgment result of the first judgment subunit is yes, send a recovery-processing failure event to the RAID;
the abnormality recovery subunit is configured to perform the abnormality recovery processing when the judgment result of the first judgment subunit is no.
14. The device according to claim 13, characterized in that the abnormality recovery unit further comprises a second judgment subunit, configured to receive, ahead of the abnormality count subunit, the abnormality recovery notification sent by the abnormality determination unit and determine whether the hard disk device is already undergoing the abnormality recovery processing; if so, discard the abnormality recovery notification; otherwise, forward the abnormality recovery notification to the abnormality count subunit.
15. The device according to claim 14, characterized in that the abnormality recovery unit further comprises a third judgment subunit, configured to receive, ahead of the second judgment subunit, the abnormality recovery notification sent by the abnormality determination unit and determine whether the hard disk device is in the topology of the storage system; if so, forward the abnormality recovery notification to the second judgment subunit; otherwise, discard the abnormality recovery notification.
16. The device according to claim 12, characterized in that the device further comprises a permanent power-off unit, which specifically comprises a fourth judgment subunit and a permanent power-off subunit;
the fourth judgment subunit is configured to receive, ahead of the abnormality recovery unit, the abnormality recovery notification sent by the abnormality determination unit and determine whether the sense data indicates that an unrecoverable abnormality has occurred in the hard disk device; if not, forward the abnormality recovery notification to the abnormality recovery unit; if so, send a permanent power-off notification to the permanent power-off subunit;
the permanent power-off subunit is configured to, after receiving the permanent power-off notification, perform the following permanent power-off processing: cutting off the power supply to the hard disk device by sending a power-off command.
17. The device according to claim 16, characterized in that the permanent power-off unit further comprises a fifth judgment subunit and a sixth judgment subunit;
the fifth judgment subunit is configured to receive, ahead of the fourth judgment subunit, the abnormality recovery notification sent by the abnormality determination unit and determine whether the hard disk device is in the topology of the storage system; if so, forward the abnormality recovery notification to the sixth judgment subunit; otherwise, discard the abnormality recovery notification;
the sixth judgment subunit is configured to, after receiving the abnormality recovery notification, determine whether the hard disk device is already undergoing the permanent power-off processing; if so, discard the abnormality recovery notification; otherwise, forward the abnormality recovery notification to the fourth judgment subunit.
18. The device according to claim 12, characterized in that the abnormality determination unit specifically comprises an abnormality determination subunit, a scene parsing subunit, and a mode determination subunit;
the abnormality determination subunit is configured to determine that the hard disk device is abnormal after the SCSI driver unit receives a command response that includes sense data sent by the hard disk device;
the scene parsing subunit is configured to parse the scene information included in the command response after the abnormality determination subunit determines that the hard disk device is abnormal;
the mode determination subunit is configured to determine, according to a predefined correspondence between scene information and fault-tolerant processing modes, the fault-tolerant processing mode corresponding to the scene information parsed by the scene parsing subunit and, if the determined fault-tolerant processing mode is the abnormality recovery processing, send an abnormality recovery notification to the abnormality recovery unit; wherein the scene information comprises the type of the I/O command corresponding to the command response and the state of the hard disk device.
19. The device according to claim 18, characterized in that the device further comprises a retry processing unit;
the mode determination subunit is configured to send a retry processing notification to the retry processing unit when the scene information parsed by the scene parsing subunit is that the type of the I/O command is a management control command and the hard disk device is in the static access state;
the retry processing unit is configured to, upon receiving the retry processing notification, determine whether the management control command affects normal access to the hard disk device; if so, send an abnormality recovery notification to the abnormality recovery unit; if not, further determine whether the number of retries for the hard disk device has reached a preset retry threshold; if so, report the error information of the hard disk device to the user and reset the retry count for the hard disk device to zero; otherwise, resend the management control command to the hard disk device.
20. The device according to claim 18, characterized in that the device further comprises a write-repair processing unit;
the mode determination subunit is configured to send a write-repair notification to the write-repair processing unit when the scene information parsed by the scene parsing subunit is that the type of the I/O command is a read command and the hard disk device is in the static access state;
the write-repair processing unit is configured to, upon receiving the write-repair notification, notify the RAID in the storage system to perform write-repair processing on the hard disk device and determine whether the write-repair processing succeeded; if so, resend the read command to the hard disk device; otherwise, send an abnormality recovery notification to the abnormality recovery unit.
21. The device according to claim 18, characterized in that the device further comprises a suspend processing unit;
the mode determination subunit is configured to send a suspend processing notification to the suspend processing unit when the scene information parsed by the scene parsing subunit is that the type of the I/O command is a management control command and the hard disk device is in the device-adding state;
the suspend processing unit is configured to, upon receiving the suspend processing notification, suspend the device's fault-tolerant processing of the sense data and, after the hard disk device has finished being added, resume the device's fault-tolerant processing of the sense data.
22. The device according to claim 18, characterized in that the mode determination subunit is configured to send an abnormality recovery notification to the abnormality recovery unit when the scene information parsed by the scene parsing subunit is that the type of the I/O command is a write command and the hard disk device is in the static access state, or the type of the I/O command is a read command and the hard disk device is in the device-adding state, or the type of the I/O command is a write command and the hard disk device is in the device-adding state.
CN2010105052288A 2010-10-09 2010-10-09 Method and device for processing SCSI sensing data Active CN102446123B (en)


Publications (2)

Publication Number Publication Date
CN102446123A (en) 2012-05-09
CN102446123B (en) 2013-11-27






Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP03 Change of name, title or address

Address after: 310052 Binjiang District Changhe Road, Zhejiang, China, No. 466, No.

Patentee after: New H3C Technologies Co., Ltd.

Address before: 310053 Hangzhou hi tech Industrial Development Zone, Zhejiang province science and Technology Industrial Park, No. 310 and No. six road, HUAWEI, Hangzhou production base

Patentee before: Hangzhou H3C Technologies Co., Ltd.