US20040220794A1 - Methods and apparatus for generating effective test code for out of order superscalar microprocessors - Google Patents


Info

Publication number
US20040220794A1
Authority
US
United States
Prior art keywords
processor
instruction
instructions
stream
test executable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/855,600
Inventor
Carl Ramey
Daniel Leibholz
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • G06F9/3851 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution from multiple instruction streams, e.g. multistreaming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/22 Detection or location of defective computer hardware by testing during standby operation or during idle time, e.g. start-up testing
    • G06F11/26 Functional testing
    • G06F11/261 Functional testing by simulating additional hardware, e.g. fault simulation

Definitions

  • The process of designing a data processor typically includes testing for design flaws at various stages of development. Such testing often involves running one or more test executables through a processor simulation system during a processor simulation stage of development, or through an actual processor in semiconductor form after a fabrication stage. In general, these test executables attempt to stress particular circuits and features of the processor.
  • a superscalar processor is a processor that is capable of executing multiple instructions simultaneously. Such processors typically include an execution stage having multiple execution units (execution circuits), each of which can execute an instruction independently of other execution units. Designers typically test superscalar processors using test executables created from source code having few or no instruction dependencies, or source code having weak instruction dependencies.
  • An instruction dependency (also referred to as a data hazard) exists when two instructions attempt to access the same register.
  • the strongest type of instruction dependency is a read-after-write (RAW) dependency in which an initial instruction writes a result to a register and a subsequent instruction reads the result from that register. The subsequent instruction must wait until the initial instruction completes writing the result before it can read the result.
  • Other types of instruction dependencies include write-after-read (WAR) and write-after-write (WAW) dependencies.
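  • The dependency types named above can be illustrated with a small classifier. This is a minimal sketch and not part of the patent text: the `Instr` tuple and the `classify` function are hypothetical names, and each instruction is assumed to have two source registers and one destination register, as in the three-operand examples used in this document.

```python
from typing import NamedTuple, Set


class Instr(NamedTuple):
    """Hypothetical three-operand instruction: op src0, src1, dest."""
    op: str
    src0: str
    src1: str
    dest: str


def classify(first: Instr, second: Instr) -> Set[str]:
    """Return the register dependencies of `second` on the earlier `first`."""
    kinds = set()
    first_reads = {first.src0, first.src1}
    second_reads = {second.src0, second.src1}
    if first.dest in second_reads:
        kinds.add("RAW")  # second reads what first wrote
    if second.dest in first_reads:
        kinds.add("WAR")  # second overwrites what first read
    if second.dest == first.dest:
        kinds.add("WAW")  # both write the same register
    return kinds
```

  • An empty result means the two instructions share no written register, so under these assumptions neither has to wait for the other.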
  • Instruction streams with weak instruction dependencies or no instruction dependencies stress the multiple execution capabilities of superscalar processors since there is little or no need to delay the instructions of such streams. Accordingly, instructions generally can execute as soon as an execution unit becomes available.
  • Stream #1 includes no instruction dependencies, and stresses the multiple issue feature of superscalar processors.
  • STREAM #1
    Inst. #  OP    SRC 0  SRC 1  DEST.
    1        addq  R01,   R02,   R03
    2        subl  R04,   R05,   R06
    3        addq  R07,   R08,   R09
    4        subl  R10,   R11,   R12
  • Instruction 1 adds the contents of source register R01 to source register R02, and stores the result in destination register R03.
  • Instruction 2 subtracts the contents of R04 from R05, and stores the result in R06.
  • Instruction 3 adds the contents of R07 to register R08, and stores the result in R09.
  • Instruction 4 subtracts the contents of R10 from R11, and stores the result in R12. Since none of the instructions access the same registers, there are no instruction dependencies. Accordingly, subsequent instructions do not need to be delayed while earlier instructions complete, and instructions may issue as soon as execution units become available to execute them. As a result, the execution units of the superscalar processor are consistently kept busy. For these reasons, designers of superscalar processors often create large executables similar to Stream #1, and use such executables to test the superscalar capabilities of their processor designs.
  • An out-of-order processor is a processor that obtains instructions in a program order, and that is capable of executing instructions in an order that is different than the program order (i.e., capable of executing instructions out-of-order).
  • Such processors typically include an issue queue that queues the instructions obtained in program order, and that is capable of issuing instructions out-of-order when instruction dependencies require that the processor delay issuance of instructions next in line.
  • Designers typically test out-of-order processors using a test executable created from source code having a large number of instructions with strong dependencies.
  • Stream #2 includes instructions with strong dependencies, and stresses the out-of-order issue feature of out-of-order processors.
  • STREAM #2
    Inst. #  OP    SRC 0  SRC 1  DEST.
    1        addq  R01,   R02,   R03
    2        subl  R03,   R04,   R05
    3        addq  R03,   R06,   R07
    4        subl  R08,   R09,   R10
  • Instruction 1 adds the contents of source register R01 to source register R02, and stores the result in destination register R03.
  • Instruction 2 subtracts the contents of R03 from R04, and stores the result in R05.
  • Instruction 3 adds the contents of R03 to R06, and stores the result in R07.
  • Instruction 4 subtracts the contents of R08 from R09, and stores the result in R10. Since Instruction 1 stores its result in R03 and each of Instructions 2 and 3 reads from R03, Instructions 2 and 3 have instruction dependencies with Instruction 1. Accordingly, Instructions 2 and 3 cannot issue until Instruction 1 stores its result.
  • Instruction 4 can issue at any time relative to Instructions 1, 2 or 3 since Instruction 4 does not access any registers that are accessed by the other instructions. Accordingly, an out-of-order processor may issue Instruction 1, and subsequently issue Instruction 4 prior to issuing Instructions 2 and 3. For these reasons, designers of out-of-order processors often create large executables from instruction streams similar to Stream #2 to cause instructions to issue out-of-order, and then use such executables to stress the out-of-order capabilities of their processor designs.
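  • The reasoning about Stream #2 can be checked mechanically. The sketch below is illustrative only: `raw_producers` is a hypothetical helper, instructions are modeled as `(op, src0, src1, dest)` tuples, and only RAW dependencies are tracked. It records, for each instruction, which earlier instructions produce a value it reads; instructions with no producer are free to issue ahead of stalled ones.

```python
def raw_producers(stream):
    """Map each instruction number to the earlier instructions whose result it reads."""
    last_writer = {}  # register name -> number of the instruction that last wrote it
    deps = {}
    for number, (op, src0, src1, dest) in enumerate(stream, start=1):
        deps[number] = sorted(
            {last_writer[reg] for reg in (src0, src1) if reg in last_writer}
        )
        last_writer[dest] = number
    return deps


STREAM_2 = [
    ("addq", "R01", "R02", "R03"),
    ("subl", "R03", "R04", "R05"),
    ("addq", "R03", "R06", "R07"),
    ("subl", "R08", "R09", "R10"),
]

deps = raw_producers(STREAM_2)
# Instructions with no RAW producer do not have to wait on an earlier result.
independent = [number for number, producers in deps.items() if not producers]
```

  • With Stream #2 as input, Instructions 2 and 3 both list Instruction 1 as their producer, while Instructions 1 and 4 have none, which matches the issue order described above.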
  • Some processors include both superscalar and out-of-order features.
  • The superscalar feature of such a processor can be tested by running a test executable having instructions without dependencies similar to that of Stream #1 (shown above). Additionally, the out-of-order feature can be tested by running another test executable having instructions with dependencies similar to that of Stream #2 (shown above).
  • Stream #1 may stress a processor's superscalar capabilities, but does not stress the processor's out-of-order capabilities simultaneously.
  • Stream #2 may stress a processor's out-of-order capabilities, but does not stress the processor's superscalar capabilities simultaneously.
  • Unfortunately many design problems in complex processors will only be discovered when multiple processor features are stressed simultaneously.
  • a stream suitable for testing a processor's superscalar capabilities with few or no dependencies can be modified by introducing strong instruction dependencies, e.g., read-after-write (RAW) dependencies.
  • Increasing the number of RAW instruction dependencies reduces the number of independent instructions (instructions without dependencies) within the stream. That is, the resulting stream may be more likely to cause out-of-order execution, but it may no longer consistently stress the superscalar structures of the processor. Accordingly, some execution units may become idle and the throughput of the processor will decrease.
  • An embodiment of the invention is directed to a technique that can produce, in a computer, a test executable that can simultaneously test the superscalar and out-of-order capabilities of a processor.
  • the technique involves forming multiple instruction streams, dividing the multiple instruction streams into portions, and generating a combined instruction stream having the portions interleaved.
  • the technique further involves creating a test executable from the combined instruction stream.
  • Formation of multiple instruction streams preferably involves constructing the multiple instruction streams such that the multiple instruction streams access different groups of registers.
  • Each instruction stream can provide instructions with strong dependencies for testing the out-of-order capabilities of the processor. Additionally, the instructions within any particular stream are independent of the instructions of the other streams such that multiple execution units of the processor can be consistently kept busy.
  • Construction of the multiple instruction streams may involve operating a code generator such that the code generator provides each of the multiple instruction streams.
  • such construction may involve operating a code generator such that the code generator provides a particular instruction stream, and forming other instruction streams according to the particular instruction stream.
  • the technique may involve interleaving the portions within the combined instruction stream such that the portions alternate in a round-robin manner.
  • the technique may involve interleaving the portions within the combined instruction stream such that the portions alternate in a pseudo random manner. Interleaving in a pseudo random manner may introduce nuances within the instruction stream that uncover design flaws that would otherwise be undetected.
  • the technique may further involve, prior to creating the test executable, including conflict instructions (e.g., instructions that cause conflicts) within the combined instruction stream.
  • LOAD instructions that cause cache misses may be included within the instruction stream to purposefully stall instructions with dependencies within the instruction stream.
  • the LOAD instructions would more fully stress the processor's out-of-order capabilities by adding delays to particular instructions depending on the LOAD instructions.
  • the formation of the multiple instruction streams may involve constructing the multiple instruction streams such that the multiple instruction streams communicate with each other.
  • the multiple instruction streams can be formed such that they access common registers.
  • the multiple instruction streams can be formed such that they share common memory spaces. The sharing of common registers or memory spaces enhances the breadth of the processor test by also testing interstream communication aspects of the processor.
  • Another embodiment of the invention is directed to a simulation system for testing a simulated processor.
  • the system includes an input that receives a test executable created from a combined instruction stream having interleaved portions of multiple instruction streams.
  • The system further includes a processor simulator, coupled to the input, that runs the test executable to generate processor results.
  • The system includes a reference model, coupled to the input, that runs the test executable to generate reference results.
  • The system includes a compare module, coupled to the processor simulator and the reference model, that compares the processor results and the reference results to determine whether the simulated processor operates correctly.
  • The system simultaneously stresses the superscalar and out-of-order capabilities of the processor simulator such that design flaws can be detected and corrected prior to fabrication of the actual processor.
  • FIG. 1 is a block diagram of an apparatus for producing a test executable.
  • FIG. 2 is a flow diagram of a method for producing a test executable.
  • FIG. 3 shows multiple code streams that are formed by a multiple instruction code stream generator circuit of FIG. 1.
  • FIG. 4 is a combined code stream that is generated by an interleaver circuit of FIG. 1.
  • FIG. 5 is a block diagram of a simulation system for testing a simulated processor using a test executable that is created by the apparatus of FIG. 1.
  • FIG. 6A is a chart showing contents of an issue queue of FIG. 5 after a first fetch of instructions.
  • FIG. 6B is a chart showing contents of the issue queue of FIG. 5 after a second fetch of instructions.
  • FIG. 6C is a chart showing contents of the issue queue of FIG. 5 after a third fetch of instructions.
  • FIG. 6D is a chart showing contents of the issue queue of FIG. 5 after a fourth fetch of instructions.
  • FIG. 7 is a chart showing issue times for each of the instructions of the combined code stream of FIG. 4 when executed by the simulation system of FIG. 5.
  • An embodiment of the invention is directed to a technique for producing a test executable that can stress both the superscalar and out-of-order capabilities of a processor design.
  • the test executable is created from a combined instruction stream having interleaved portions of multiple instruction streams.
  • FIG. 1 shows an apparatus for producing the test executable.
  • the apparatus 20 includes a multiple instruction stream generator circuit 22, an interleaver circuit 24 and a compiler circuit 26. As will now be explained, the circuits of the apparatus 20 perform a method 50 as shown in FIG. 2.
  • the multiple instruction stream generator circuit 22 forms multiple instruction streams, and stores the instruction streams in respective files 30 (e.g., lis files).
  • the instruction streams access different groups of registers as indicated by configuration information 28 that is received by the multiple instruction stream generator circuit 22.
  • In step 54, the interleaver circuit 24 divides the multiple instruction streams into portions, and generates a combined instruction stream having the portions interleaved.
  • the combined instruction stream is stored within a file 32 (e.g., a mar file).
  • the number of instructions in each portion is controlled by the configuration information 28.
  • In step 56, the compiler circuit 26 creates a test executable from the combined instruction stream.
  • the compiler circuit compiles the combined instruction stream, and stores the test executable as a file 34 (e.g., a dxe file).
  • the multiple instruction stream generator circuit 22 includes a constructor circuit 36 and a storage device 38 (e.g., disk memory), as shown in FIG. 1.
  • the constructor circuit 36 includes a code generator 40 and a control circuit 42.
  • the control circuit 42 operates the code generator 40 to form the multiple instruction streams.
  • the control circuit 42 can run the code generator 40 to produce a single instruction stream, make multiple copies of the single instruction stream, and modify the copies such that they access different groups of registers based on the configuration information 28.
  • Alternatively, the control circuit 42 runs the code generator 40 multiple times to form the multiple instruction streams that access the different groups of registers.
  • the constructor circuit 36 stores the multiple instruction streams within the files 30 in the storage device 38.
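  • The copy-and-modify approach described above (one generated stream, copied and retargeted at other register groups) might be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: `remap_registers` and `make_streams` are hypothetical names, instructions are assumed to be text lines naming registers as `Rnn`, the base stream is assumed to use only the first register group, and each group is assumed to hold eight registers.

```python
import re


def remap_registers(stream, group_index, group_size=8):
    """Copy `stream`, shifting every register number by group_index * group_size."""
    offset = group_index * group_size

    def shift(match):
        return f"R{int(match.group(1)) + offset:02d}"

    return [re.sub(r"\bR(\d+)\b", shift, line) for line in stream]


def make_streams(base_stream, count, group_size=8):
    """Form `count` streams from one base stream, one register group per stream."""
    return [remap_registers(base_stream, index, group_size) for index in range(count)]


base = ["addq R01, R02, R03", "subl R03, R04, R05"]
streams = make_streams(base, 4)
```

  • Because only register numbers change, every copy keeps the dependency pattern of the base stream while remaining independent of the other copies.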
  • the apparatus 20 is preferably a general purpose computer having code for producing the test executables.
  • the code controls the general purpose computer such that it functions at various times as the multiple instruction stream generator 22, the interleaver circuit 24 and the compiler circuit 26.
  • the apparatus 20 may be a specialized apparatus designed specifically to perform the method 50 of FIG. 2.
  • FIG. 3 shows four instruction streams (STREAM A, STREAM B, STREAM C and STREAM D) that can be formed by the multiple instruction stream generator circuit 22.
  • the configuration information 28 that is used by the multiple instruction stream generator circuit 22 controls particular aspects of the multiple instruction streams such as the number of streams that are formed, their length (the number of instructions within each stream), which registers are accessed by each instruction stream, and the type of instructions within each instruction stream (e.g., load, add, shift, etc).
  • the instructions within STREAM A access a first group of registers, namely R01 through R08.
  • the instructions following Instruction 1 have strong dependencies on preceding instructions. For example, Instruction 1 writes to R01, and Instruction 2 reads from R01. Accordingly, Instruction 1 must complete writing to R01 before Instruction 2 can read from R01. In a similar manner, Instruction 3 depends on Instructions 1 and 2, and so on.
  • STREAM B, STREAM C and STREAM D include instructions that are arranged in a manner similar to that of STREAM A, except that these instruction streams access different groups of registers.
  • STREAM B accesses registers R09 through R16, STREAM C accesses registers R17 through R24, and STREAM D accesses registers R25 through R32.
  • Each instruction stream formed by the multiple instruction stream generator circuit 22 is stored, at least temporarily, in the storage device 38 for use by the interleaver circuit 24.
  • FIG. 4 shows a combined instruction stream that is generated by the interleaver circuit 24 from the instruction streams shown in FIG. 3.
  • the interleaver circuit 24 divides the instruction streams into portions, and then interleaves the portions to generate the combined instruction stream.
  • the configuration information 28 controls the size and ordering of the portions within the combined instruction stream. As shown in FIG. 4, the first five instructions of the combined instruction stream are from a portion of STREAM A (see FIG. 3). Similarly, the next five instructions of the combined instruction stream are from a portion of STREAM B, and so on.
  • the manner of interleaving is based on the configuration information 28.
  • the interleaver circuit 24 can generate the combined instruction stream such that it cycles through portions of STREAM A, STREAM B, STREAM C and STREAM D. Such an arrangement of portions is considered to be a round-robin ordering of the portions.
  • the interleaver circuit 24 can generate the combined instruction stream such that it includes portions of the streams in a pseudo random order.
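  • The two orderings described above, round-robin and pseudo random, can be sketched in a few lines. This is a hedged illustration: `split_portions`, `interleave`, and the `order` and `seed` parameters are hypothetical, and streams are modeled as plain lists of instruction strings. In both modes the portions of any one stream stay in their original order; only the choice of which stream contributes next differs.

```python
import random


def split_portions(stream, size):
    """Divide one stream into consecutive portions of at most `size` instructions."""
    return [stream[i:i + size] for i in range(0, len(stream), size)]


def interleave(streams, portion_size, order="round_robin", seed=0):
    """Combine streams portion by portion, round-robin or pseudo randomly."""
    pending = [split_portions(stream, portion_size) for stream in streams]
    rng = random.Random(seed)  # seeded so that a failing test can be reproduced
    combined = []
    turn = 0
    while any(pending):
        live = [i for i, portions in enumerate(pending) if portions]
        if order == "round_robin":
            # Pick the first stream at or after `turn` that still has portions.
            index = min(live, key=lambda i: (i - turn) % len(pending))
            turn = index + 1
        else:
            index = rng.choice(live)
        combined.extend(pending[index].pop(0))
    return combined


stream_a = ["A1", "A2", "A3", "A4"]
stream_b = ["B1", "B2", "B3", "B4"]
round_robin = interleave([stream_a, stream_b], portion_size=2)
shuffled = interleave([stream_a, stream_b], portion_size=2, order="pseudo_random", seed=7)
```

  • Seeding the pseudo random generator is a design choice made for this sketch so that an ordering that exposes a flaw can be regenerated; the source does not say how the ordering is derived.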
  • the compiler circuit 26 compiles the combined instruction stream to create a test executable that is suitable for execution on either a simulated processor or an actual processor.
  • the simulation system 60 includes a simulation device 62 that receives a test executable 64 (e.g., executable code such as the test executable 34 of FIG. 1) and environment information 66 (e.g., a dxe file), simulates execution of the test executable, and provides results 68 of the execution (e.g., a log file).
  • the simulation device 62 includes a processor simulator module 70, a reference model module 72, a system or motherboard simulator module 76, and a compare module 74.
  • the processor simulator module 70 operates according to processor design information and is connected with the system or motherboard simulator module 76 which simulates environmental conditions (e.g., provides external clock rates).
  • the test executable 64 is executed by both the processor simulator module 70 and the reference model module 72.
  • the processor simulator module 70 includes a simulated issue queue 78, and a simulated execution stage 80 having multiple simulated execution units and processor registers.
  • results of the execution are passed to the compare module 74.
  • the reference model module 72 determines what the correct results of execution should be, and passes the correct results to the compare module 74.
  • the compare module 74 matches the results from both the processor simulator module 70 and the reference model 72, and points out discrepancies in the results as an error output 68 (e.g., the log file).
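  • The comparison step can be pictured as a field-by-field diff of two result sets. The sketch below is an assumption-laden illustration: the source does not specify the form of the results, so they are modeled here as dictionaries mapping a register name to its final value, and `compare_results` is a hypothetical name.

```python
def compare_results(processor_results, reference_results):
    """Return discrepancy messages; an empty list means the two runs agree."""
    errors = []
    # Visit every register reported by either side so missing entries are caught.
    for key in sorted(set(processor_results) | set(reference_results)):
        got = processor_results.get(key)
        expected = reference_results.get(key)
        if got != expected:
            errors.append(f"{key}: processor={got!r} reference={expected!r}")
    return errors


log = compare_results({"R03": 7, "R05": 2}, {"R03": 7, "R05": 3})
```

  • Each message in the returned list corresponds to one line that might be written to the error log.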
  • FIGS. 6A through 6D show the contents of the simulated issue queue 78 of the processor simulator module 70, after the occurrence of various multi-instruction fetches of the test executable 34 (i.e., the test executable created by compiling the combined code stream of FIG. 4).
  • the simulated issue queue 78 loads the first four instructions of the test executable 34 during an initial processor cycle (time 0). Since Instruction 1 is the first instruction and does not depend on any other instruction, Instruction 1 is free to issue. However, Instructions 2, 3 and 4 cannot issue due to their RAW dependencies with Instruction 1. Accordingly, during the next processor cycle (time 1), only Instruction 1 will issue (indicated by the rectangle around Instruction 1).
  • Instruction 1 is removed from the simulated issue queue 78 after issuing in time 1. Additionally, the remaining three instructions are advanced in their queue positions, and the next four instructions of the test executable 34 are fetched and loaded into the simulated issue queue 78.
  • Instruction 2 can issue since the simulated processor is capable of issuing instructions speculatively. However, Instructions 3, 4 and 5 cannot issue since they have RAW dependencies with Instruction 2. Instructions 6, 7 and 8 do not depend on these previously fetched instructions. Rather, Instruction 6 has no dependencies, and Instructions 7 and 8 have RAW dependencies with Instruction 6. Although Instruction 6 can issue, Instructions 7 and 8 cannot issue because of their instruction dependencies. Accordingly, in the next processor cycle (time 2), Instructions 2 and 6 issue while the other queued instructions must wait.
  • The test executable 34 has begun to stress both the superscalar and out-of-order capabilities of the processor simultaneously. In particular, two instructions (Instructions 2 and 6) issue simultaneously, which tests the superscalar feature, and Instruction 6 (stored in issue queue position 5) is issued out-of-order to test the simulated processor's out-of-order feature.
  • Instructions 2 and 6 are removed from the simulated issue queue 78 after issuing in time 2. Additionally, the remaining instructions are advanced in their queue positions, and the next four instructions of the test executable are fetched and loaded into the simulated issue queue 78. Instructions 3 and 7 can issue since the processor supports speculative execution. Additionally, Instruction 11 has no dependencies and can issue. The rest of the instructions have RAW dependencies with other instructions in the issue queue and must wait. Accordingly, Instructions 3, 7 and 11 issue simultaneously in the next processor cycle (time 3). As shown in FIG. 6C, three instructions are issued from various positions within the issue queue 78 such that both the superscalar and out-of-order features of the simulated processor are stressed.
  • Instructions 3, 7 and 11 are removed from the simulated issue queue 78 after issuing in time 3. Furthermore, the remaining instructions are advanced in their queue positions, and the next four instructions of the test executable are fetched and loaded into the simulated issue queue 78. Instructions 4, 8 and 12 can issue if the processor supports speculative execution. Additionally, Instruction 16 has no dependencies and can issue. The rest of the instructions have RAW dependencies with other instructions in the issue queue and must wait. Accordingly, Instructions 4, 8, 12 and 16 issue simultaneously in the next processor cycle (time 4). It should be understood that four instructions are issued from various positions within the issue queue 78 such that both the superscalar and out-of-order features of the simulated processor are further stressed.
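  • The cycle-by-cycle walk-through above can be reproduced with a toy model of the issue queue. This is a simplified sketch and not the patent's simulator: `simulate_issue` is a hypothetical function, the queue is assumed to be unbounded, four instructions are fetched per cycle, the combined stream is assumed to consist of five-instruction portions (as described for FIG. 4), and each instruction other than the first of a portion is assumed to become issuable in the cycle after the instruction immediately before it has issued, which stands in for speculative issue.

```python
def simulate_issue(total=20, portion_size=5, fetch_width=4, cycles=4):
    """Return {cycle: [instruction numbers issued in that cycle]}."""
    # The first instruction of each portion has no producer; every other
    # instruction waits on the instruction immediately before it.
    producer = {
        n: None if (n - 1) % portion_size == 0 else n - 1
        for n in range(1, total + 1)
    }
    queue = list(range(1, min(fetch_width, total) + 1))  # fetch at time 0
    next_fetch = len(queue) + 1
    issued = set()
    schedule = {}
    for cycle in range(1, cycles + 1):
        # An instruction is ready when its producer issued in an earlier cycle.
        ready = [n for n in queue if producer[n] is None or producer[n] in issued]
        schedule[cycle] = ready
        issued.update(ready)
        queue = [n for n in queue if n not in ready]
        # Fetch the next group of instructions behind the ones still waiting.
        last = min(next_fetch + fetch_width - 1, total)
        queue.extend(range(next_fetch, last + 1))
        next_fetch = last + 1
    return schedule


schedule = simulate_issue()
```

  • Under these assumptions the model issues one instruction at time 1, then Instructions 2 and 6, then 3, 7 and 11, then 4, 8, 12 and 16, so the number of simultaneous issues grows by one each cycle as new portions reach the queue.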
  • the issue queue receives instructions at a first end, and scans for instructions to issue beginning at the opposite end.
  • the instructions migrate from the first end of the issue queue to the opposite end.
  • the test executable 34 is well suited for testing such a processor.
  • queued instructions issue from positions throughout the issue queue as they migrate from the first end of the issue queue to the opposite end.
  • FIG. 7 shows the instructions within test executable created from the combined instruction stream of FIG. 4, with their respective fetch (F), issue (I), execute (E) and retire (R) times. As illustrated, multiple instructions issue and execute simultaneously and out-of-order thereby stressing the superscalar and out-of-order capabilities of the simulated processor. Similar results occur when running the test executable 34 on an actual processor.
  • The test executable can run on a processor without speculative execution capabilities. In this situation, more fetches must occur to further fill the issue queue with instructions without dependencies before the processor's superscalar capabilities are stressed. Otherwise, the processor behaves in a manner similar to that above for a processor capable of issuing and executing speculatively.
  • LOAD instructions may be inserted within the combined instruction stream such that cache misses occur during execution. This would more fully stress the processor's out-of-order capabilities.
  • the conflict instructions can replace instructions within the combined instruction stream. Such insertions or replacements can be controlled by setting parameters within the configuration information 28.
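  • Inserting or replacing instructions at configured positions might look like the sketch below. It is illustrative only: `add_conflicts` and its `every` and `replace` parameters are hypothetical stand-ins for settings in the configuration information, and the string `"LOAD"` is a placeholder rather than real assembly. Whether a given load actually misses the cache depends on the address used and on the processor, which this sketch does not model.

```python
def add_conflicts(combined, conflict, every, replace=False):
    """Insert `conflict` after every `every`-th instruction, or replace that instruction."""
    result = []
    for position, instruction in enumerate(combined, start=1):
        if position % every == 0:
            if not replace:
                result.append(instruction)  # keep the original, then add the conflict
            result.append(conflict)
        else:
            result.append(instruction)
    return result


inserted = add_conflicts(["i1", "i2", "i3", "i4"], "LOAD", every=2)
replaced = add_conflicts(["i1", "i2", "i3", "i4"], "LOAD", every=2, replace=True)
```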
  • the groups of registers can be modified such that the different groups overlap.
  • STREAM A and STREAM B can be formed such that both instruction streams access register R08.
  • Such a modification provides an opportunity for inter-stream communication.
  • Another way of adding inter-stream communication is to make multiple streams access overlapping memory spaces.
  • Such features can be controlled by setting parameters within the configuration information 28.
  • Some processors treat registers identified within instructions as logical registers, and internally map the logical registers of instructions to physical registers. This operation is called register renaming.
  • the test executable produced by the above described technique is suitable for testing such processors. In particular, running the test executable on such a processor would stress that processor's renaming features simultaneously with its superscalar and out-of-order features. To enhance testing of the register renaming capabilities of the processor, more instruction streams should be added or the different register groups should be widened such that each logical register is accessed by at least one instruction stream.
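  • The condition that each logical register is accessed by at least one instruction stream can be checked before compiling. The sketch below is a hypothetical helper, not part of the described apparatus: `unaccessed_registers` assumes instructions are text lines naming registers as `Rnn` and that the logical registers are numbered R01 through R32, matching the register groups described for FIG. 3.

```python
import re


def unaccessed_registers(streams, register_count=32):
    """Return the logical registers R01..Rnn that no stream accesses."""
    used = set()
    for stream in streams:
        for line in stream:
            used.update(re.findall(r"\bR\d+\b", line))
    all_registers = [f"R{n:02d}" for n in range(1, register_count + 1)]
    return [reg for reg in all_registers if reg not in used]


missing = unaccessed_registers([["addq R01, R02, R03"], ["subl R03, R05, R06"]], register_count=6)
```

  • A non-empty result suggests adding streams or widening the register groups, as the paragraph above recommends.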
  • the number of instruction streams formed by the multiple instruction stream generator circuit 22 can be more or less than four (as shown in the example of FIG. 3).
  • the instruction types and the lengths of the instruction stream portions can be changed as well. Accordingly, processor designers can produce multiple test executables that stress various combinations of particular processor features, at different times.

Abstract

A technique for producing a test executable in a computer. The technique involves forming multiple instruction streams. The technique further involves dividing the multiple instruction streams into portions, and generating a combined instruction stream having the portions interleaved. Additionally, the technique involves creating a test executable from the combined instruction stream. The test executable can be used for testing a simulated processor in a computer. In particular, the test executable is loaded. Then, the test executable is run through the simulated processor to generate processor results and through a reference model to generate reference results. The processor results and the reference results are compared to determine whether the simulated processor operates correctly.

Description

    RELATED APPLICATIONS
  • [0001] This application is a continuation of U.S. application Ser. No. 09/106,691, filed Jun. 29, 1998. The entire teachings of the above application are incorporated herein by reference.
  • BACKGROUND OF THE INVENTION
  • [0002] The process of designing a data processor typically includes testing for design flaws at various stages of development. Such testing often involves running one or more test executables through a processor simulation system during a processor simulation stage of development, or through an actual processor in semiconductor form after a fabrication stage. In general, these test executables attempt to stress particular circuits and features of the processor.
  • [0003] A superscalar processor is a processor that is capable of executing multiple instructions simultaneously. Such processors typically include an execution stage having multiple execution units (execution circuits), each of which can execute an instruction independently of other execution units. Designers typically test superscalar processors using test executables created from source code having few or no instruction dependencies, or source code having weak instruction dependencies.
  • [0004] An instruction dependency (also referred to as a data hazard) exists when two instructions attempt to access the same register. The strongest type of instruction dependency is a read-after-write (RAW) dependency in which an initial instruction writes a result to a register and a subsequent instruction reads the result from that register. The subsequent instruction must wait until the initial instruction completes writing the result before it can read the result. The weakest type of instruction dependency arises from instructions attempting to read from the same register. Other types of instruction dependencies include write-after-read (WAR) and write-after-write (WAW) dependencies.
  • [0005] Instruction streams with weak instruction dependencies or no instruction dependencies stress the multiple execution capabilities of superscalar processors since there is little or no need to delay the instructions of such streams. Accordingly, instructions generally can execute as soon as an execution unit becomes available.
  • Stream #1, as shown below, includes no instruction dependencies, and stresses the multiple issue feature of superscalar processors. [0006]
    STREAM #1
    Inst. # OP SRC 0 SRC 1 DEST.
    1 addq R01, R02, R03
    2 subl R04, R05, R06
    3 addq R07, R08, R09
    4 subl R10, R11, R12
  • Instruction 1 adds the contents of source register R01 to source register R02, and stores the result in destination register R03. Instruction 2 subtracts the contents of R04 from R05, and stores the result in R06. Instruction 3 adds the contents of R07 to register R08, and stores the result in R09. Instruction 4 subtracts the contents of R10 from R11, and stores the result in R12. Since none of the instructions access the same registers, there are no instruction dependencies. Accordingly, subsequent instructions do not need to be delayed while earlier instructions complete, and instructions may issue as soon as execution units become available to execute them. As a result, the execution units of the superscalar processor are consistently kept busy. For these reasons, designers of superscalar processors often create large executables similar to Stream #1, and use such executables to test the superscalar capabilities of their processor designs. [0007]
  • Another type of processor is called an out-of-order processor. An out-of-order processor is a processor that obtains instructions in a program order, and that is capable of executing instructions in an order that is different from the program order (i.e., capable of executing instructions out-of-order). Such processors typically include an issue queue that queues the instructions obtained in program order, and that is capable of issuing instructions out-of-order when instruction dependencies require that the processor delay issuance of instructions next in line. Designers typically test out-of-order processors using a test executable created from source code having a large number of instructions with strong dependencies. [0008]
  • Stream #2 includes instructions with strong dependencies, and stresses the out-of-order issue feature of out-of-order processors. [0009]
    STREAM #2
    Inst. # OP SRC 0 SRC 1 DEST.
    1 addq R01, R02, R03
    2 subl R03, R04, R05
    3 addq R03, R06, R07
    4 subl R08, R09, R10
  • Instruction 1 adds the contents of source register R01 to source register R02, and stores the result in destination register R03. Instruction 2 subtracts the contents of R03 from R04, and stores the result in R05. Instruction 3 adds the contents of R03 to R06, and stores the result in R07. Instruction 4 subtracts the contents of R08 from R09, and stores the result in R10. Since Instruction 1 stores its result in R03 and each of the Instructions 2 and 3 reads from R03, Instructions 2 and 3 have instruction dependencies with Instruction 1. Accordingly, Instructions 2 and 3 cannot issue until Instruction 1 stores its result. [0010]
  • In contrast, Instruction 4 can issue at any time relative to Instructions 1, 2 or 3 since Instruction 4 does not access any registers that are accessed by the other instructions. Accordingly, an out-of-order processor may issue Instruction 1, and subsequently issue Instruction 4 prior to issuing Instructions 2 and 3. For these reasons, designers of out-of-order processors often create large executables from instruction streams similar to Stream #2 to cause instructions to issue out-of-order, and then use such executables to stress the out-of-order capabilities of their processor designs. [0011]
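  • The dependency analysis of Stream #2 can be illustrated with a minimal Python sketch. The tuple layout `(op, src0, src1, dest)` and the function name `classify_dependency` are assumptions made for illustration only; they do not appear in the original disclosure. RAR here denotes a read-after-read relationship, i.e., two instructions reading the same register.

```python
# Stream #2 from the text, as hypothetical (op, src0, src1, dest) tuples.
STREAM_2 = [
    ("addq", "R01", "R02", "R03"),
    ("subl", "R03", "R04", "R05"),
    ("addq", "R03", "R06", "R07"),
    ("subl", "R08", "R09", "R10"),
]


def classify_dependency(earlier, later):
    """Return the set of register dependencies of `later` on `earlier`."""
    earlier_reads, earlier_write = {earlier[1], earlier[2]}, earlier[3]
    later_reads, later_write = {later[1], later[2]}, later[3]

    kinds = set()
    if earlier_write in later_reads:
        kinds.add("RAW")  # later reads what earlier wrote
    if later_write in earlier_reads:
        kinds.add("WAR")  # later overwrites what earlier read
    if later_write == earlier_write:
        kinds.add("WAW")  # both write the same register
    if earlier_reads & later_reads:
        kinds.add("RAR")  # both read the same register
    return kinds


if __name__ == "__main__":
    for j in range(1, len(STREAM_2)):
        for i in range(j):
            kinds = classify_dependency(STREAM_2[i], STREAM_2[j])
            print(f"Instruction {j + 1} on Instruction {i + 1}: {sorted(kinds)}")
```

    Under these assumptions, Instructions 2 and 3 each show a RAW dependency on Instruction 1, while Instruction 4 shows no dependency on any earlier instruction, which is consistent with the discussion above.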
  • Some processors include both superscalar and out-of-order features. The superscalar feature of such a processor can be tested by running a test executable having instructions without dependencies similar to that of Stream #1 (shown above). Additionally, the out-of-order feature can be tested by running another test executable having instructions with dependencies similar to that of Stream #2 (shown above). [0012]
  • SUMMARY OF THE INVENTION
  • Stream #1, shown above, may stress a processor's superscalar capabilities, but does not stress the processor's out-of-order capabilities simultaneously. Similarly, Stream #2, shown above, may stress a processor's out-of-order capabilities, but does not stress the processor's superscalar capabilities simultaneously. Unfortunately, many design problems in complex processors will only be discovered when multiple processor features are stressed simultaneously. [0013]
  • A stream suitable for testing a processor's superscalar capabilities with few or no dependencies (e.g., Stream #1 above) can be modified by introducing strong instruction dependencies, e.g., read-after-write (RAW) dependencies. However, increasing the number of RAW instruction dependencies reduces the number of independent instructions (instructions without dependencies) within the stream. That is, the modification may improve the stream's opportunity to cause out-of-order execution, but the resulting stream may no longer be able to consistently stress the superscalar structures of the processor. Accordingly, some execution units may become idle and the throughput of the processor will decrease. [0014]
  • An embodiment of the invention is directed to a technique that can produce, in a computer, a test executable that can simultaneously test the superscalar and out-of-order capabilities of a processor. The technique involves forming multiple instruction streams, dividing the multiple instruction streams into portions, and generating a combined instruction stream having the portions interleaved. The technique further involves creating a test executable from the combined instruction stream. [0015]
  • Formation of multiple instruction streams preferably involves constructing the multiple instruction streams such that the multiple instruction streams access different groups of registers. Each instruction stream can provide instructions with strong dependencies for testing the out-of-order capabilities of the processor. Additionally, the instructions within any particular stream are independent of the instructions of the other streams such that multiple execution units of the processor can be consistently kept busy. [0016]
  • Construction of the multiple instruction streams may involve operating a code generator such that the code generator provides each of the multiple instruction streams. Alternatively, such construction may involve operating a code generator such that the code generator provides a particular instruction stream, and forming other instruction streams according to the particular instruction stream. [0017]
  • To divide the streams into portions and generate a combined instruction stream having the stream portions, the technique may involve interleaving the portions within the combined instruction stream such that the portions alternate in a round-robin manner. Alternatively, the technique may involve interleaving the portions within the combined instruction stream such that the portions alternate in a pseudo random manner. Interleaving in a pseudo random manner may introduce nuances within the instruction stream that uncover design flaws that would otherwise be undetected. [0018]
  • Additional nuances within the instruction stream can be introduced in other ways, as well. In particular, the technique may further involve, prior to creating the test executable, including conflict instructions (e.g., instructions that cause conflicts) within the combined instruction stream. For example, LOAD instructions that cause cache misses may be included within the instruction stream to purposefully stall instructions with dependencies within the instruction stream. The LOAD instructions would more fully stress the processor's out-of-order capabilities by adding delays to particular instructions depending on the LOAD instructions. [0019]
  • Furthermore, the formation of the multiple instruction streams may involve constructing the multiple instruction streams such that the multiple instruction streams communicate with each other. In particular, the multiple instruction streams can be formed such that they access common registers. Additionally, the multiple instruction streams can be formed such that they share common memory spaces. The sharing of common registers or memory spaces enhances the breadth of the processor test by also testing interstream communication aspects of the processor. [0020]
  • Another embodiment of the invention is directed to a simulation system for testing a simulated processor. The system includes an input that receives a test executable created from a combined instruction stream having interleaved portions of multiple instruction streams. The system further includes a processor simulator, coupled to the input, that runs the test executable to generate processor results. Additionally, the system includes a reference model, coupled to the input, that runs the test executable to generate reference results. Furthermore, the system includes a compare module, coupled to the processor simulator and the reference model, that compares the processor results and the reference results to determine whether the simulated processor operates correctly. The system simultaneously stresses the superscalar and out-of-order capabilities of the processor simulator such that design flaws can be detected and corrected prior to fabrication of the actual processor. [0021]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The foregoing and other objects, features and advantages of the invention will be apparent from the following more particular description of preferred embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. [0022]
  • The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention. [0023]
  • FIG. 1 is a block diagram of an apparatus for producing a test executable. [0024]
  • FIG. 2 is a flow diagram of a method for producing a test executable. [0025]
  • FIG. 3 shows multiple code streams that are formed by a multiple instruction code stream generator circuit of FIG. 1. [0026]
  • FIG. 4 is a combined code stream that is generated by an interleaver circuit of FIG. 1. [0027]
  • FIG. 5 is a block diagram of a simulation system for testing a simulated processor using a test executable that is created by the apparatus of FIG. 1. [0028]
  • FIG. 6A is a chart showing contents of an issue queue of FIG. 5 after a first fetch of instructions. [0029]
  • FIG. 6B is a chart showing contents of the issue queue of FIG. 5 after a second fetch of instructions. [0030]
  • FIG. 6C is a chart showing contents of the issue queue of FIG. 5 after a third fetch of instructions. [0031]
  • FIG. 6D is a chart showing contents of the issue queue of FIG. 5 after a fourth fetch of instructions. [0032]
  • FIG. 7 is a chart showing issue times for each of the instructions of the combined code stream of FIG. 4 when executed by the simulation system of FIG. 5.[0033]
  • DETAILED DESCRIPTION OF THE INVENTION
  • An embodiment of the invention is directed to a technique for producing a test executable that can stress both the superscalar and out-of-order capabilities of a processor design. The test executable is created from a combined instruction stream having interleaved portions of multiple instruction streams. [0034]
  • Reference is now made to the drawings wherein the same reference numbers are used throughout multiple figures to designate the same or similar components. FIG. 1 shows an apparatus for producing the test executable. The apparatus 20 includes a multiple instruction stream generator circuit 22, an interleaver circuit 24 and a compiler circuit 26. As will now be explained, the circuits of the apparatus 20 perform a method 50 as shown in FIG. 2. [0035]
  • In step 52, the multiple instruction stream generator circuit 22 forms multiple instruction streams, and stores the instruction streams in respective files 30 (e.g., lis files). The instruction streams access different groups of registers as indicated by configuration information 28 that is received by the multiple instruction stream generator circuit 22. [0036]
  • In step 54, the interleaver circuit 24 divides the multiple instruction streams into portions, and generates a combined instruction stream having the portions interleaved. The combined instruction stream is stored within a file 32 (e.g., a mar file). The number of instructions in each portion is controlled by the configuration information 28. [0037]
  • In step 56, the compiler circuit 26 creates a test executable from the combined instruction stream. In particular, the compiler circuit compiles the combined instruction stream, and stores the test executable as a file 34 (e.g., a dxe file). Such an executable is suitable for execution by a simulated processor or an actual processor. [0038]
  • Further details of the multiple instruction stream generator circuit 22 will now be provided. The multiple instruction stream generator circuit includes a constructor circuit 36 and a storage device 38 (e.g., disk memory), as shown in FIG. 1. The constructor circuit 36 includes a code generator 40 and a control circuit 42. The control circuit 42 operates the code generator to form the multiple instruction streams. Preferably, the control circuit 42 runs the code generator 40 to produce a single instruction stream, makes multiple copies of the single instruction stream, and modifies the copies such that they access different groups of registers based on the configuration information 28. Alternatively, the control circuit 42 runs the code generator 40 multiple times to form the multiple instruction streams that access the different groups of registers. The constructor circuit 36 stores the multiple instruction streams within the files 30 in the storage device 38. [0039]
  • It should be understood that the apparatus 20 is preferably a general purpose computer having code for producing the test executables. In particular, the code controls the general purpose computer such that it functions at various times as the multiple instruction stream generator 22, the interleaver circuit 24 and the compiler circuit 26. Alternatively, the apparatus 20 may be a specialized apparatus designed specifically to perform the method 50 of FIG. 2. [0040]
  • The operation of the apparatus 20 will be further explained by way of example. FIG. 3 shows four instruction streams (STREAM A, STREAM B, STREAM C and STREAM D) that can be formed by the multiple instruction stream generator circuit 22. The configuration information 28 that is used by the multiple instruction stream generator circuit 22 controls particular aspects of the multiple instruction streams such as the number of streams that are formed, their length (the number of instructions within each stream), which registers are accessed by each instruction stream, and the type of instructions within each instruction stream (e.g., load, add, shift, etc.). [0041]
  • The instructions within STREAM A access a first group of registers, namely R01 through R08. The instructions following Instruction 1 have strong dependencies on preceding instructions. For example, Instruction 1 writes to R01, and Instruction 2 reads from R01. Accordingly, Instruction 1 must complete writing to R01 before Instruction 2 can read from R01. In a similar manner, Instruction 3 depends on Instructions 1 and 2, and so on. [0042]
  • STREAM B, STREAM C and STREAM D include instructions that are arranged in a manner similar to that of STREAM A, except that these instruction streams access different groups of registers. In particular, STREAM B accesses registers R09 through R16, STREAM C accesses registers R17 through R24, and STREAM D accesses registers R25 through R32. Each instruction stream formed by the multiple instruction stream generator circuit 22 is stored, at least temporarily, in the storage device 38 for use by the interleaver circuit 24. [0043]
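  • The copy-and-modify approach, in which one generated stream is duplicated and each copy is shifted to its own register group, might be sketched as follows. This is an illustrative Python sketch only: the function names, the tuple layout `(op, src0, src1, dest)`, the two-instruction base stream, and the fixed group size of eight registers are all assumptions, since the contents of FIG. 3 are not reproduced in the text.

```python
def remap_registers(stream, offset):
    """Return a copy of `stream` with every register number shifted by `offset`."""
    def shift(reg):
        number = int(reg[1:]) + offset  # "R01" -> 1, then add the group offset
        return f"R{number:02d}"

    return [(op, shift(s0), shift(s1), shift(d)) for op, s0, s1, d in stream]


def make_streams(base_stream, num_streams, group_size=8):
    """Form `num_streams` streams, each confined to its own register group."""
    return [remap_registers(base_stream, k * group_size) for k in range(num_streams)]


if __name__ == "__main__":
    # Hypothetical base stream confined to R01 through R08; Instruction 1
    # writes R01 and Instruction 2 reads R01, giving a RAW dependency.
    base = [
        ("addq", "R02", "R03", "R01"),
        ("subl", "R01", "R04", "R05"),
    ]
    for name, stream in zip("ABCD", make_streams(base, 4)):
        print("STREAM", name, stream)
```

    Because each copy keeps the internal dependencies of the base stream while touching a disjoint register group, the copies remain independent of one another.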
  • FIG. 4 shows a combined instruction stream that is generated by the interleaver circuit 24 from the instruction streams shown in FIG. 3. The interleaver circuit 24 divides the instruction streams into portions, and then interleaves the portions to generate the combined instruction stream. The configuration information 28 controls the size and ordering of the portions within the combined instruction stream. As shown in FIG. 4, the first five instructions of the combined instruction stream are from a portion of STREAM A (see FIG. 3). Similarly, the next five instructions of the combined instruction stream are from a portion of STREAM B, and so on. [0044]
  • The manner of interleaving is based on the configuration information 28. In particular, the interleaver circuit 24 can generate the combined instruction stream such that it cycles through portions of STREAM A, STREAM B, STREAM C and STREAM D. Such an arrangement of portions is considered to be a round-robin ordering of the portions. Alternatively, the interleaver circuit 24 can generate the combined instruction stream such that it includes portions of the streams in a pseudo random order. [0045]
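  • The two interleaving orders can be sketched in Python as shown below. The function names, the `mode` and `seed` parameters, and the representation of a stream as a list are hypothetical and serve only to illustrate the idea; the sketch assumes that, in either mode, the portions of any one stream are emitted in their original program order.

```python
import random


def split_into_portions(stream, portion_len):
    """Divide one instruction stream into consecutive portions."""
    return [stream[i:i + portion_len] for i in range(0, len(stream), portion_len)]


def interleave(streams, portion_len, mode="round_robin", seed=0):
    """Combine portions of several streams into one instruction stream."""
    pending = [split_into_portions(s, portion_len) for s in streams]
    rng = random.Random(seed)  # seeded so a failing test can be reproduced
    combined = []
    while any(pending):
        live = [portions for portions in pending if portions]
        if mode == "round_robin":
            # Take the next portion from each stream in turn.
            for portions in live:
                combined.extend(portions.pop(0))
        else:
            # Pseudo random: take the next portion of a randomly chosen stream.
            combined.extend(rng.choice(live).pop(0))
    return combined


if __name__ == "__main__":
    a = ["a1", "a2", "a3", "a4"]
    b = ["b1", "b2", "b3", "b4"]
    print(interleave([a, b], portion_len=2))
    print(interleave([a, b], portion_len=2, mode="pseudo_random", seed=7))
```

    Seeding the pseudo random generator is a design choice of this sketch, made so that a combined stream that exposes a flaw can be regenerated exactly.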
  • After the interleaver circuit 24 generates the combined instruction stream, the compiler circuit 26 compiles the combined instruction stream to create a test executable that is suitable for execution on either a simulated processor or an actual processor. [0046]
  • Another embodiment of the invention is directed to a simulation system 60 that is suitable for executing the created test executable 34. As shown in FIG. 5, the simulation system 60 includes a simulation device 62 that receives a test executable 64 (e.g., executable code such as the test executable 34 of FIG. 1) and environment information 66 (e.g., a dxe file), simulates execution of the test executable, and provides results 68 of the execution (e.g., a log file). [0047]
  • As shown in FIG. 5, the simulation device 62 includes a processor simulator module 70, a reference model module 72, a system or motherboard simulator module 76, and a compare module 74. The processor simulator module 70 operates according to processor design information and is connected with the system or motherboard simulator module 76 which simulates environmental conditions (e.g., provides external clock rates). [0048]
  • During simulation, the test executable 64 is executed by both the processor simulator module 70 and the reference model module 72. The processor simulator module 70 includes a simulated issue queue 78, and a simulated execution stage 80 having multiple simulated execution units and processor registers. As the processor simulator module 70 executes the test executable 64, results of the execution are passed to the compare module 74. Similarly, the reference model module 72 determines what the correct results of execution should be, and passes the correct results to the compare module 74. The compare module 74 matches the results from both the processor simulator module 70 and the reference model 72, and points out discrepancies in the results as an error output 68 (e.g., the log file). [0049]
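  • The comparison of processor results against reference results can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the function names, the use of final register values as the "results", and a reference model that simply executes instructions in program order. The semantics follow the wording of the Background (addq stores SRC 0 plus SRC 1; subl subtracts SRC 0 from SRC 1) and ignore operand widths.

```python
def run_reference_model(stream, registers):
    """Execute (op, src0, src1, dest) tuples in program order; return final registers."""
    regs = dict(registers)
    for op, src0, src1, dest in stream:
        if op == "addq":
            regs[dest] = regs[src0] + regs[src1]
        elif op == "subl":
            # Per the Background text: subtract the contents of SRC 0 from SRC 1.
            regs[dest] = regs[src1] - regs[src0]
        else:
            raise ValueError(f"unsupported op: {op}")
    return regs


def compare_results(processor_results, reference_results):
    """Return (register, processor value, reference value) for every mismatch."""
    mismatches = []
    for reg in sorted(set(processor_results) | set(reference_results)):
        got = processor_results.get(reg)
        expected = reference_results.get(reg)
        if got != expected:
            mismatches.append((reg, got, expected))
    return mismatches


if __name__ == "__main__":
    stream_2 = [
        ("addq", "R01", "R02", "R03"),
        ("subl", "R03", "R04", "R05"),
        ("addq", "R03", "R06", "R07"),
        ("subl", "R08", "R09", "R10"),
    ]
    initial = {f"R{i:02d}": i for i in range(1, 11)}
    reference = run_reference_model(stream_2, initial)
    # Hypothetical faulty processor: Instruction 3 never updated R07.
    processor = dict(reference, R07=initial["R07"])
    for reg, got, expected in compare_results(processor, reference):
        print(f"MISMATCH {reg}: processor={got} reference={expected}")
```

    An empty list of mismatches would correspond to the simulated processor operating correctly for that test executable.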
  • The operation of the simulation system 60 will be described further by way of example. This example involves testing a superscalar out-of-order processor that is capable of speculatively issuing and executing instructions. FIGS. 6A through 6D show the contents of the simulated issue queue 78 of the processor simulator module 70, after the occurrence of various multi-instruction fetches of the test executable 34 (i.e., the test executable created by compiling the combined code stream of FIG. 4). In particular, in FIG. 6A, the simulated issue queue 78 loads the first four instructions of the test executable 34 during an initial processor cycle (time 0). Since Instruction 1 is the first instruction and does not depend on any other instruction, Instruction 1 is free to issue. However, Instructions 2, 3 and 4 cannot issue due to their RAW dependencies with Instruction 1. Accordingly, during the next processor cycle (time 1), only Instruction 1 will issue (indicated by the rectangle around Instruction 1). [0050]
  • As shown in FIG. 6B, Instruction 1 is removed from the simulated issue queue 78 after issuing in time 1. Additionally, the remaining three instructions are advanced in their queue positions, and the next four instructions of the test executable 34 are fetched and loaded into the simulated issue queue 78. At this point, Instruction 2 can issue since the simulated processor is capable of issuing instructions speculatively. However, Instructions 3, 4 and 5 cannot issue since they have RAW dependencies with Instruction 2. Instructions 6, 7 and 8 do not depend on these previously fetched instructions. Rather, Instruction 6 has no dependencies, and Instructions 7 and 8 have RAW dependencies with Instruction 6. Although Instruction 6 can issue, Instructions 7 and 8 cannot issue because of their instruction dependencies. Accordingly, in the next processor cycle (time 2), Instructions 2 and 6 issue while the other queued instructions must wait. [0051]
  • At this point, it should be understood that the test executable 34 has begun to stress both the superscalar and out-of-order capabilities of the processor simultaneously. In particular, two instructions (Instructions 2 and 6) have issued for simultaneous execution to test the simulated processor's superscalar feature. Additionally, Instruction 6 (stored in issue queue position 5) is issued out-of-order to test the simulated processor's out-of-order feature. [0052]
  • As shown in FIG. 6C, Instructions 2 and 6 are removed from the simulated issue queue 78 after issuing in time 2. Additionally, the remaining instructions are advanced in their queue positions, and the next four instructions of the test executable are fetched and loaded into the simulated issue queue 78. Instructions 3 and 7 can issue since the processor supports speculative execution. Additionally, Instruction 11 has no dependencies and can issue. The rest of the instructions have RAW dependencies with other instructions in the issue queue and must wait. Accordingly, Instructions 3, 7 and 11 issue simultaneously in the next processor cycle (time 3). As shown in FIG. 6C, three instructions are issued from various positions within the issue queue 78 such that both the superscalar and out-of-order features of the simulated processor are stressed. [0053]
  • As shown in FIG. 6D, Instructions 3, 7 and 11 are removed from the simulated issue queue 78 after issuing in time 3. Furthermore, the remaining instructions are advanced in their queue positions, and the next four instructions of the test executable are fetched and loaded into the simulated issue queue 78. Instructions 4, 8 and 12 can issue if the processor supports speculative execution. Additionally, Instruction 16 has no dependencies and can issue. The rest of the instructions have RAW dependencies with other instructions in the issue queue and must wait. Accordingly, Instructions 4, 8, 12 and 16 issue simultaneously in the next processor cycle (time 4). It should be understood that four instructions are issued from various positions within the issue queue 78 such that both the superscalar and out-of-order features of the simulated processor are further stressed. [0054]
  • It should be clear from a comparison of FIGS. 6A through 6D that execution of the test executable 34 results in instructions issuing from a variety of different locations within the issue queue 78. Accordingly, the out-of-order capabilities of the processor are well tested. [0055]
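  • The issue pattern described for FIGS. 6A through 6D can be reproduced with a simplified Python model. This is a sketch under stated assumptions, not the simulated processor itself: it assumes four streams interleaved in portions of five instructions, a fetch of four instructions per cycle, that each instruction depends only on the preceding instruction of its own portion, and that a dependent instruction may issue in the cycle after its producer issues. The function name `simulate_issue` and its parameters are hypothetical.

```python
def simulate_issue(num_streams=4, portion_len=5, fetch_width=4, cycles=4):
    """Return {time: [instruction numbers issued at that time]}."""
    total = num_streams * portion_len
    # The first instruction of each portion has no producer; every other
    # instruction waits on the instruction immediately before it.
    producer = {
        i: None if (i - 1) % portion_len == 0 else i - 1
        for i in range(1, total + 1)
    }

    queue = list(range(1, min(fetch_width, total) + 1))  # fetch at time 0
    next_fetch = len(queue) + 1
    issued = set()
    schedule = {}

    for t in range(1, cycles + 1):
        # Issue every queued instruction whose producer issued in an earlier cycle.
        ready = [i for i in queue if producer[i] is None or producer[i] in issued]
        schedule[t] = ready
        issued.update(ready)
        queue = [i for i in queue if i not in ready]

        # Remaining instructions advance; the next group is fetched behind them.
        last = min(next_fetch + fetch_width - 1, total)
        queue.extend(range(next_fetch, last + 1))
        next_fetch = last + 1

    return schedule


if __name__ == "__main__":
    for time, instructions in simulate_issue().items():
        print(f"time {time}: issue {instructions}")
```

    With the default parameters, the model issues Instruction 1 at time 1, Instructions 2 and 6 at time 2, Instructions 3, 7 and 11 at time 3, and Instructions 4, 8, 12 and 16 at time 4, matching the sequence described above.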
  • In some processors, the issue queue receives instructions at a first end, and scans for instructions to issue beginning at the opposite end. For such processors, the instructions migrate from the first end of the issue queue to the opposite end. The test executable 34 is well suited for testing such a processor. In particular, queued instructions issue from positions throughout the issue queue as they migrate from the first end of the issue queue to the opposite end. [0056]
  • FIG. 7 shows the instructions within the test executable created from the combined instruction stream of FIG. 4, with their respective fetch (F), issue (I), execute (E) and retire (R) times. As illustrated, multiple instructions issue and execute simultaneously and out-of-order, thereby stressing the superscalar and out-of-order capabilities of the simulated processor. Similar results occur when running the test executable 34 on an actual processor. [0057]
  • Furthermore, the test executable can run on a processor without speculative execution capabilities. In this situation, more fetches must occur to further fill the issue queue with instructions without dependencies before the processor's superscalar capabilities are stressed. Otherwise, the processor behaves in a manner similar to that above for a processor capable of issuing and executing speculatively. [0058]
  • Equivalents [0059]
  • While this invention has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims. [0060]
  • For example, special instructions (e.g., instructions that cause conflicts) may be inserted within the combined instruction streams generated by the interleaver circuit 24 to cause various situations to occur. For example, LOAD instructions may be inserted within the combined instruction stream such that cache misses occur during execution. This would more fully stress the processor's out-of-order capabilities. As an alternative to inserting conflict instructions within the combined instruction stream, the conflict instructions can replace instructions within the combined instruction stream. Such insertions or replacements can be controlled by setting parameters within the configuration information 28. [0061]
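  • Insertion of conflict instructions at a configurable interval might look like the following Python sketch. The function name `insert_conflicts`, the `interval` parameter, and the placeholder string standing in for a LOAD that is expected to miss the cache are all hypothetical; the text does not specify how insertion points are chosen.

```python
def insert_conflicts(combined, interval, conflict):
    """Return a copy of `combined` with `conflict` added after every `interval` instructions."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    result = []
    for count, instruction in enumerate(combined, start=1):
        result.append(instruction)
        if count % interval == 0:
            result.append(conflict)  # e.g., a LOAD chosen to miss the cache
    return result


if __name__ == "__main__":
    combined = ["i1", "i2", "i3", "i4", "i5"]
    print(insert_conflicts(combined, interval=2, conflict="LOAD"))
```

    The replacement alternative mentioned above could be sketched similarly by overwriting the selected instruction instead of appending after it.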
  • Additionally, the groups of registers can be modified such that the different groups overlap. For example, STREAM A and STREAM B can be formed such that both instruction streams access register R08. Such a modification provides an opportunity for inter-stream communication. Another way of adding inter-stream communication is to make multiple streams access overlapping memory spaces. Such features can be controlled by setting parameters within the configuration information 28. [0062]
  • Some processors treat registers identified within instructions as logical registers, and internally map the logical registers of instructions to physical registers. This operation is called register renaming. The test executable produced by the above described technique is suitable for testing such processors. In particular, running the test executable on such a processor would stress that processor's renaming features simultaneously with its superscalar and out-of-order features. To enhance testing of the register renaming capabilities of the processor, more instruction streams should be added or the different register groups should be widened such that each logical register is accessed by at least one instruction stream. [0063]
  • Furthermore, it should be understood that particular aspects of the combined instruction stream can be changed. For example, the number of instruction streams formed by the multiple instruction stream generator circuit 22 can be more or less than four (as shown in the example of FIG. 3). Similarly, the instruction types and the lengths of the instruction stream portions can be changed as well. Accordingly, processor designers can produce multiple test executables that stress various combinations of particular processor features, at different times. [0064]

Claims (4)

What is claimed is:
1. A method for testing a simulated processor in a computer, comprising the steps of:
loading a test executable created from a combined instruction stream having interleaved portions of multiple instruction streams;
running the loaded test executable through the simulated processor to generate processor results and through a reference model to generate reference results; and
comparing the processor results and the reference results to determine whether the simulated processor operates correctly.
2. The method of claim 1, wherein the multiple instruction streams access different groups of registers, and wherein the step of running includes the step of:
executing the test executable such that each of the different groups of registers is accessed.
3. A simulation system for testing a simulated processor, comprising:
an input that receives a test executable created from a combined instruction stream having interleaved portions of multiple instruction streams;
a processor simulator, coupled to the input, that runs the test executable to generate processor results;
a reference model, coupled to the input, that runs the test executable to generate reference results; and
a compare module, coupled to the processor simulator and the reference model, that compares the processor results and the reference results to determine whether the simulated processor operates correctly.
4. The simulation system of claim 3, wherein the multiple instruction streams access different groups of registers, and wherein the processor simulator includes:
registers that are accessed in the different groups when the processor simulator runs the test executable.
US10/855,600 1998-06-29 2004-05-26 Methods and apparatus for generating effective test code for out of order superscalar microprocessors Abandoned US20040220794A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/855,600 US20040220794A1 (en) 1998-06-29 2004-05-26 Methods and apparatus for generating effective test code for out of order superscalar microprocessors

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/106,691 US6813702B1 (en) 1998-06-29 1998-06-29 Methods and apparatus for generating effective test code for out of order super scalar microprocessors
US10/855,600 US20040220794A1 (en) 1998-06-29 2004-05-26 Methods and apparatus for generating effective test code for out of order superscalar microprocessors

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/106,691 Continuation US6813702B1 (en) 1998-06-29 1998-06-29 Methods and apparatus for generating effective test code for out of order super scalar microprocessors

Publications (1)

Publication Number Publication Date
US20040220794A1 true US20040220794A1 (en) 2004-11-04

Family

ID=33297783

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/106,691 Expired - Fee Related US6813702B1 (en) 1998-06-29 1998-06-29 Methods and apparatus for generating effective test code for out of order super scalar microprocessors
US10/855,600 Abandoned US20040220794A1 (en) 1998-06-29 2004-05-26 Methods and apparatus for generating effective test code for out of order superscalar microprocessors

Country Status (1)

Country Link
US (2) US6813702B1 (en)

US5615167A (en) 1995-09-08 1997-03-25 Digital Equipment Corporation Method for increasing system bandwidth through an on-chip address lock register
US5864660A (en) * 1996-03-12 1999-01-26 Electronic Data Systems Corporation Testing the integration of a plurality of elements in a computer system using a plurality of tests codes, each corresponding to an alternate product configuration for an associated element

Also Published As

Publication number Publication date
US6813702B1 (en) 2004-11-02

Similar Documents

Publication Publication Date Title
US6285974B1 (en) Hardware verification tool for multiprocessors
US9639365B2 (en) Indirect function call instructions in a synchronous parallel thread processor
US5421022A (en) Apparatus and method for speculatively executing instructions in a computer system
US5884060A (en) Processor which performs dynamic instruction scheduling at time of execution within a single clock cycle
US5428807A (en) Method and apparatus for propagating exception conditions of a computer system
US8413086B2 (en) Methods and apparatus for adapting pipeline stage latency based on instruction type
US6732297B2 (en) Pipeline testing method, pipeline testing system, pipeline test instruction generation method and storage method
US5420990A (en) Mechanism for enforcing the correct order of instruction execution
US6212626B1 (en) Computer processor having a checker
US5805470A (en) Verification of instruction and data fetch resources in a functional model of a speculative out-of order computer system
EP3574405B1 (en) Error detection using vector processing circuitry
US5592674A (en) Automatic verification of external interrupts
JP3595158B2 (en) Instruction assignment method and instruction assignment device
US8027828B2 (en) Method and apparatus for synchronizing processors in a hardware emulation system
US6813702B1 (en) Methods and apparatus for generating effective test code for out of order super scalar microprocessors
JP3373607B2 (en) Method and apparatus for automatically generating instruction sequence for verifying control mechanism of processor
US6704861B1 (en) Mechanism for executing computer instructions in parallel
US20040006751A1 (en) System verifying apparatus and method
US7111152B1 (en) Computer system that operates in VLIW and superscalar modes and has selectable dependency control
JP3146058B2 (en) Parallel processing type processor system and control method of parallel processing type processor system
US5729729A (en) System for fast trap generation by creation of possible trap masks from early trap indicators and selecting one mask using late trap indicators
JP3743155B2 (en) Pipeline controlled computer
CN117033101A (en) Processor fuzzy test method supporting run-time instruction variation
Huang et al. Hardware/Software Resolutions for Pipeline
Frey Superscalar SMIPS Processor

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION