WO2000008555A1 - Data processing device - Google Patents

Data processing device

Publication number
WO2000008555A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
stage
register file
result
processing
Prior art date
Application number
PCT/EP1999/005520
Other languages
French (fr)
Inventor
Fransiscus W. Sijstermans
Original Assignee
Koninklijke Philips Electronics N.V.
Priority date
1998-08-06
Filing date
1999-07-29
Publication date
2000-02-17
Application filed by Koninklijke Philips Electronics N.V.

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30025 Format conversion instructions, e.g. Floating-Point to Integer, decimal conversion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30036 Instructions to perform operations on packed data, e.g. vector, tile or matrix operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30141 Implementation provisions of register files, e.g. ports
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30181 Instruction operation extension or modification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3854 Instruction completion, e.g. retiring, committing or graduating
    • G06F9/3858 Result writeback, i.e. updating the architectural state or memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3867 Concurrent instruction execution, e.g. pipeline, look ahead using instruction pipelines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units

Abstract

The data processing device has an instruction execution pipeline containing at least a first and second processing stage directly or indirectly in series. The stages execute a first and second stage of instruction execution, a mutually different first and second number of processing cycles after the instruction enters the pipeline. The first and second stage are both coupled to a register file, for writing a result of processing by the first and/or the second stage to the register file upon completion of the first and second number of processing cycles respectively.

Description

Data processing device.
The invention relates to a data processing device with an instruction execution pipeline.
PCT patent application No. WO 98/11483 teaches a data processing device with an instruction pipeline. The pipeline contains a series of processing stages from a front end to a back end, for performing successive operations during the execution of an instruction. The final stage of the back end writes back a processing result to a register file.
The pipeline can process several instructions in parallel, because the front end processing stages can start executing an instruction before the back end processing stages have produced and written back the result of an earlier instruction.
Amongst others, it is an object of the invention to reduce the average time needed for processing instructions using the pipeline.
The data processing device according to the invention is described in Claim 1. The invention provides for the possibility to write back results from different processing stages in the pipeline directly after such a processing stage completes processing of an instruction, that is, without passing the entire pipeline and before the entire pipeline has had the opportunity to process the instruction.
For example, a first processing stage might perform an arithmetic operation and a second processing stage might perform a clipping operation on the result of the arithmetic operation. In this case, one may include two types of instruction in the instruction set of the data processing device, one type for arithmetic operations with clipping and one type for arithmetic operations without clipping. In case of an operation with clipping the result would be written back from the second processing stage (after completion of the clipping operation) and in case of an operation without clipping the result would be written back from the first processing stage (before completion of the clipping operation).
The data processor may even write the result of both the first and the second stage (e.g. with and without clipping) in response to some instructions. This means that the result is written back to the register file directly after the processing stage produces its result, that is, earlier than if the processor has to wait for a time period corresponding to the time needed by the second processing stage. Writing to the register file is normally followed by writing to a register after a predetermined delay, but without deviating from the invention, some types of register file may introduce a variable delay until writing is complete, for example in order to resolve access conflicts.
When instruction execution is pipelined, this may mean that the result of a first instruction is written back to the register file before or at the same time as the result of a second instruction that entered the pipeline before the first instruction. Preferably, the register file is provided with more than one write port, so that results from different stages of the pipeline can be written back in parallel. Also preferably, different write ports of the register file are assigned to different processing stages, so that the pipeline is connected to more write ports than are needed for writing the result of an individual instruction, in order to be able to write results of different instructions in the pipeline from different processing stages in parallel.
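The role of the separate write ports can be made concrete with a small model. The following is a minimal sketch, not the patented circuit: the class name, the port numbering and the rule that two ports may not target the same register in one cycle are assumptions made for illustration only.

```python
class RegisterFile:
    """Toy register file with a fixed number of write ports (hypothetical model)."""

    def __init__(self, n_registers, n_write_ports):
        self.regs = [0] * n_registers
        self.n_write_ports = n_write_ports

    def write_cycle(self, writes):
        """Apply all writes requested in one processing cycle.

        `writes` maps a write-port index to a (register, value) pair, so each
        port carries at most one write per cycle by construction.
        """
        for port in writes:
            if not 0 <= port < self.n_write_ports:
                raise ValueError(f"no such write port: {port}")
        targets = [reg for reg, _ in writes.values()]
        # Assumption: simultaneous writes must go to different registers.
        if len(targets) != len(set(targets)):
            raise ValueError("two ports target the same register in one cycle")
        for reg, value in writes.values():
            self.regs[reg] = value


rf = RegisterFile(n_registers=8, n_write_ports=2)
# Port 0 is assigned to the first stage, port 1 to the second stage.
# Both stages deliver results of different instructions in the same cycle.
rf.write_cycle({0: (3, 40), 1: (5, 127)})
```

With a single write port the second write in the same cycle would have to wait; with one port per stage both results reach their registers in one cycle.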
These and other advantageous aspects of the invention will be described using the following figures.
Figure 1 shows an architecture of a data processing device.
Figure 2 shows a functional unit.
Figure 1 shows the architecture of a data processor. By way of example a VLIW processor has been shown, although the invention is not limited to VLIW processors. The processor contains a register file 10, a number of functional units 12a-f and an instruction issue unit 14. The instruction issue unit 14 has instruction issue connections to the functional units 12a-f. The functional units 12a-f are connected to the register file 10 via read and write ports. A first one of the functional units 12a has two read ports and two write ports connected to the register file 10.
Figure 2 shows the first one of the functional units 70, with a cascade of a first and second sub-unit 72, 74. An output of the first sub-unit is coupled to an input of the second sub-unit and to a write port of the register file 10. An output of the second sub-unit is coupled to another write port of the register file 10. The functional unit 70 contains two control units 76, 78 coupled to a control input of the first and second sub-unit 72, 74 respectively. An input of the first control unit 76 is coupled to an output of the instruction issue unit for receiving an opcode. An output of the first control unit 76 is coupled to an input of the second control unit 78.
In operation, the instruction issue unit 14 fetches successive instruction words from an instruction memory (not shown explicitly). Each instruction word may contain several instructions for different ones of the functional units 12a-f. Normally, each instruction contains fields specifying an opcode, one or more source registers and one result register. When the instruction issue unit 14 has fetched an instruction word from instruction memory, the fields specifying the source registers in a particular instruction are decoded and used to address the register file 10. In response, the register file 10 supplies the content of the source registers to the functional unit 12a-f that will execute the particular instruction.
In case the particular instruction is intended for the first functional unit 12a, 70, the field specifying the opcode and the content of the source registers are supplied to the functional unit 70. The functional unit 70 operates in successive processing cycles. In a first processing cycle a control signal for the first sub-unit 72 is generated by the first control unit 76, dependent on the opcode. Under control of this control signal the first sub-unit 72 generates a result which the first sub-unit may write to the register file via the write port (writing depends on the control signal). The result (and possible additional information) is passed to the second sub-unit 74. Also a further control signal dependent on the opcode is passed from the first control unit 76 to the second control unit 78. In a second processing cycle, which is later than the first processing cycle, the second sub-unit 74 processes the result generated by the first sub-unit 72 under control of the control signal passed by the second control unit 78. A second result, generated by the second sub-unit 74, may be written to the register file via a write port (writing depends on the control signal from the second control unit 78). In the cycle in which the second sub-unit 74 operates under control of the second control unit 78, the first control unit 76 may already cause the first sub-unit 72 to process a subsequent instruction.
This can be illustrated with a pipeline table:
     C1  C2  C3  C4
I1   IF  S1  S2  WB
I2       IF  S1  WB
This table shows the execution of two instructions I1, I2 in successive clock cycles C1, C2, C3, C4. These instructions involve a number of processing steps: IF, S1, S2, WB. "IF" refers to an instruction fetch and operand fetch step, "S1", "S2" refer to execution in the first and second sub-unit 72, 74 respectively. "WB" refers to a step of writing a result back to the register file 10. It is seen in the table that the result of the second instruction is written back without an "S2" step. As a consequence the WB step for I2 occurs in the same cycle as for I1, even though the two instructions I1, I2 started after one another on the same functional unit.
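The timing in this table can be reproduced with a few lines of code. The sketch below is illustrative only: the step names follow the table, while the function names and the representation of a schedule as a cycle-to-step mapping are assumptions.

```python
def schedule(issue_cycle, uses_s2):
    """Return {cycle: step} for one instruction whose IF step is in issue_cycle."""
    steps = ["IF", "S1"] + (["S2"] if uses_s2 else []) + ["WB"]
    return {issue_cycle + i: step for i, step in enumerate(steps)}


def pipeline_table(instructions, n_cycles):
    """Render one text row per instruction, one column per clock cycle."""
    header = "    " + " ".join(f"C{c:<3}" for c in range(1, n_cycles + 1))
    rows = [header.rstrip()]
    for name, issue_cycle, uses_s2 in instructions:
        sched = schedule(issue_cycle, uses_s2)
        cells = " ".join(f"{sched.get(c, ''):<4}" for c in range(1, n_cycles + 1))
        rows.append(f"{name:<4}{cells}".rstrip())
    return "\n".join(rows)


i1 = schedule(1, uses_s2=True)    # I1 passes through both sub-units
i2 = schedule(2, uses_s2=False)   # I2 is written back from the first sub-unit
print(pipeline_table([("I1", 1, True), ("I2", 2, False)], 4))
```

In this model both `i1` and `i2` map cycle 4 to "WB", which corresponds to the shared write-back cycle in the table.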
This is especially advantageous for processors that have two or more functional units that can start processing different instructions in parallel, such as VLIW processors. These processors can execute further instructions I3 and I4 that use the results of I1 and I2 respectively. Due to the invention such a processor can start I3 and I4 in the same cycle, which makes processing faster.
The first sub-unit 72 may be for example an ALU and the second sub-unit 74 may be a clipping unit or a rounding unit. In this case, the instruction may be for example an "ADD" instruction. In response to a first type of ADD instruction the first sub-unit 72 adds the source operands and writes the sum to the register file via its write port, i.e. without involvement of the second sub-unit 74; the second sub-unit 74 refrains from writing to its write port if it receives this first type of ADD instruction.
In response to a second type of ADD instruction the first sub-unit 72 adds the source operands, but it refrains from writing the sum to the register file via its write port; the second sub-unit 74 responds to the second type of ADD instruction e.g. by rounding or clipping the sum, which the second sub-unit 74 receives from the first sub-unit 72. Also in response to the second type of ADD instruction the second sub-unit 74 writes the result of its operation on the sum to the write port of the second sub-unit 74. Of course, adding and rounding or clipping are used here merely by way of example. Many other types of operations that produce meaningful intermediate results may be used, e.g. other arithmetic or logic operations, or vector operations, instead of ADD, and further arithmetic or logic operations on the result of the first sub-unit 72 instead of rounding or clipping. According to the invention, the functional unit may respond to some instructions by writing back from both of the sub-units. This leads to the following pipeline table.
     C1  C2  C3         C4
I1   IF  S1  S2 and WB  WB
Of course, each sub-unit 72, 74 itself may contain one or more further sub-units, or pipeline stages which process the instruction in successive processing stages. Alternatively, more sub-units for implementing different pipeline stages may be placed in series with the first and second sub-unit 72, 74. According to the invention, more than two of such sub-units may each be connected to their own write port to the register file 10 for writing a result produced at an intermediate stage in the pipeline. In this case the pipeline table may be
     C1  C2    Ck..  Cm..  Cn
I1   IF  S1..  S2..  S2..  WB
I2       IF    S1..  WB
(Here Ck, Cm, Cn refer to clock cycles later than C3, C4, C5 respectively).
Also, forks in the pipeline may be included, where one sub-unit feeds two or more further subunits in parallel, one or more of these sub-units having their own write ports for writing results back to the register file 10.
Furthermore, one may include one or more sub-units (not shown) in parallel to the first sub-unit 72, each having its own instruction and operand inputs and its own write port for writing to the register file 10. In this case these one or more sub-units and the first sub-unit 72 may feed a single second sub-unit 74 in parallel via a multiplexer (not shown), the pipelined instructions determining from which of the sub-units the multiplexer passes results to the second sub-unit 74. In this case, several instructions may be executed in parallel and a selected one of them may be followed by postprocessing in the second sub-unit 74.
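The variants described above share one mechanism: each sub-unit has its own write port, and the opcode determines which sub-units actually write. A minimal sketch of the two-stage ADD example follows. The opcode names, the port labels and the 8-bit signed clipping range are hypothetical, and the cycle numbering assumes the IF, S1, S2 sequence of the pipeline tables with a write-back in the cycle after the producing step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opcode:
    name: str
    write_s1: bool  # first sub-unit writes its result to its write port
    write_s2: bool  # second sub-unit writes its result to its write port


# Hypothetical encodings of the three instruction types in the description.
ADD = Opcode("ADD", write_s1=True, write_s2=False)
ADD_CLIP = Opcode("ADD_CLIP", write_s1=False, write_s2=True)
ADD_BOTH = Opcode("ADD_BOTH", write_s1=True, write_s2=True)


def clip(value, lo=-128, hi=127):
    """Clip to an assumed 8-bit signed range."""
    return max(lo, min(hi, value))


def execute(op, a, b, issue_cycle):
    """Return the (cycle, port, value) write-backs caused by one instruction."""
    writes = []
    s1_cycle = issue_cycle + 1           # IF in issue_cycle, S1 one cycle later
    total = a + b                        # first sub-unit (ALU)
    if op.write_s1:
        writes.append((s1_cycle + 1, "port1", total))
    s2_cycle = s1_cycle + 1
    clipped = clip(total)                # second sub-unit (clipping unit)
    if op.write_s2:
        writes.append((s2_cycle + 1, "port2", clipped))
    return writes


plain = execute(ADD, 100, 100, issue_cycle=1)
clipped_only = execute(ADD_CLIP, 100, 100, issue_cycle=1)
both = execute(ADD_BOTH, 100, 100, issue_cycle=1)
```

Under these assumptions the plain ADD writes the sum in cycle 3, the clipping ADD writes the clipped value in cycle 4, and the third type produces both writes.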
A compiler for the processor will have to schedule operations in such a way that results are produced timely, without conflicts about the use of functional units 12a-f or registers. The compiler can treat the functional unit 70 more or less as two or more conceptually different functional units, one for processing instructions without processing by the second sub-unit 74 and one for processing instructions including processing by the second sub-unit 74. These conceptually different functional units have different latencies. The compiler will avoid scheduling instructions simultaneously at the functional unit, but the compiler may schedule the start of a further instruction at a time when the second sub-unit 74 is still working on the previous instruction. Owing to the invention the compiler can schedule instructions that use the result of the further instruction earlier, for example as early as an instruction that uses a result of the previous instruction.
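The scheduling consequence can be sketched as a latency calculation. This is an illustration under stated assumptions rather than a description of an actual compiler: the latency values are taken from the two-stage example (IF, S1, optional S2, WB), the names are hypothetical, and it is assumed that a consumer can be issued in the cycle after the producer's write-back.

```python
# Cycles from the IF step to the write-back step, per conceptual functional unit.
LATENCY = {"short": 2, "long": 3}   # without / with the second sub-unit


def result_cycle(issue_cycle, kind):
    """Cycle in which the result is written back to the register file."""
    return issue_cycle + LATENCY[kind]


def earliest_consumer_issue(issue_cycle, kind):
    """Assumption: a consumer may issue one cycle after the write-back."""
    return result_cycle(issue_cycle, kind) + 1


def schedule_consumers(producers):
    """producers: list of (name, issue_cycle, kind) on one functional unit."""
    issues = [issue for _, issue, _ in producers]
    if len(issues) != len(set(issues)):
        raise ValueError("two instructions issued to the unit in one cycle")
    return {name: earliest_consumer_issue(issue, kind)
            for name, issue, kind in producers}


# I1 uses the second sub-unit, I2 does not and is issued one cycle later.
with_early_wb = schedule_consumers([("I1", 1, "long"), ("I2", 2, "short")])
# For comparison: every result leaves through the end of the pipeline.
without_early_wb = schedule_consumers([("I1", 1, "long"), ("I2", 2, "long")])
```

In this model the consumers of I1 and I2 can both be issued in cycle 5 when I2 is written back from the first sub-unit, whereas the consumer of I2 would have to wait until cycle 6 if every result passed through the whole pipeline.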

Claims

CLAIMS:
1. A data processing device comprising
- an instruction execution pipeline containing at least a first and second processing stage directly or indirectly in series, for executing a first and second stage of instruction execution, a mutually different first and second number of processing cycles after the instruction enters the pipeline respectively;
- a register file for receiving a result of the instruction from the pipeline in a register addressed by the instruction, characterized in that the first and second stage are both coupled to the register file, for making a result of processing by the first and/or the second stage available for writing to the register file from completion of the first and second number of processing cycles respectively.
2. A data processing device according to Claim 1, the data processing device having an instruction set that contains a first and a second type of instruction, the pipeline being arranged to make only the result of processing by the first stage or only the result of processing by the second stage available to the register file, in response to the first and second type of instruction respectively.
3. A data processing device according to Claim 2, the instruction set comprising a third type of instruction, the pipeline being arranged to make both the result of processing by the first stage and the result of processing by the second stage available to the register file in response to the third type of instruction.
4. A data processing device according to Claim 1, the first and second stage being arranged to process mutually different first and second pipelined instructions in parallel and to make results of processing the first and second pipelined instruction available for writing to the register file both in the same processing cycle.
5. A data processing device according to Claim 1, the register file being a multiport register file, having at least a first and a second write port, for writing results to different registers in parallel, the first and second processing stage being coupled to the first and the second write port respectively.
6. A data processing device according to Claim 1 containing two or more functional units for starting execution of different instructions in parallel to one another, the instruction execution pipeline being comprised in a first one of the functional units, the device being programmed for executing a first instruction followed by a second instruction in the first one of the functional units, and executing a third and fourth instruction, which use a result of the first and second instruction written to the register file from the second and first stage respectively, the third and fourth instruction being started in parallel on different ones of the functional units.
7. A method of compiling a program for a processing device according to Claim 1, the method comprising scheduling a first instruction followed by a second instruction in the instruction pipeline, the first and second instruction specifying writing to the register file after the second stage and after the first stage of the instruction execution pipeline respectively, and scheduling a third instruction, which uses a result of the second instruction, before or at the same time as a result of the first instruction becomes available to the register file.
PCT/EP1999/005520 1998-08-06 1999-07-29 Data processing device WO2000008555A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP98202647.8 1998-08-06
EP98202647 1998-08-06
EP98203425.8 1998-10-09
EP98203425 1998-10-09

Publications (1)

Publication Number Publication Date
WO2000008555A1 (en) 2000-02-17

Family

ID=26150605

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP1999/005520 WO2000008555A1 (en) 1998-08-06 1999-07-29 Data processing device

Country Status (1)

Country Link
WO (1) WO2000008555A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4228497A (en) * 1977-11-17 1980-10-14 Burroughs Corporation Template micromemory structure for a pipelined microprogrammable data processing system
EP0653703A1 (en) * 1993-11-17 1995-05-17 Sun Microsystems, Inc. Temporary pipeline register file for a superpipelined superscalar processor

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"METHOD TO MAINTAIN PIPELINE THROUGHPUT WHILE PIPELINE DEPTH IS ALLOWED TO VARY", IBM TECHNICAL DISCLOSURE BULLETIN,US,IBM CORP. NEW YORK, vol. 39, no. 5, pages 31-32, XP000584045, ISSN: 0018-8689 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100418645C (en) * 2003-03-21 2008-09-17 迪纳帕克压紧设备股份公司 Regulator for regulating eccentric moment of roller drum eccentric shaft
EP2866138B1 (en) * 2013-10-23 2019-08-07 Teknologian tutkimuskeskus VTT Oy Floating-point supportive pipeline for emulated shared memory architectures
EP2887207B1 (en) * 2013-12-19 2019-10-16 Teknologian tutkimuskeskus VTT Oy Architecture for long latency operations in emulated shared memory architectures
JP2021168189A (en) * 2020-07-15 2021-10-21 ベイジン バイドゥ ネットコム サイエンス アンド テクノロジー カンパニー リミテッド Apparatus and method for writing back instruction execution result, and processing apparatus
EP3940531A1 (en) * 2020-07-15 2022-01-19 Kunlunxin Technology (Beijing) Company Limited Apparatus and method for writing back instruction execution result and processing apparatus
JP7229305B2 (en) 2020-07-15 2023-02-27 ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド Apparatus, method, and processing apparatus for writing back instruction execution results

Similar Documents

Publication Publication Date Title
US20020169942A1 (en) VLIW processor
JP2918631B2 (en) decoder
EP1658559B1 (en) Instruction controlled data processing device and method
US5404552A (en) Pipeline risc processing unit with improved efficiency when handling data dependency
JP3881763B2 (en) Data processing device
EP0118830A2 (en) Pipelined processor
EP1050808A1 (en) Computer instruction scheduling
US8589664B2 (en) Program flow control
US6260189B1 (en) Compiler-controlled dynamic instruction dispatch in pipelined processors
JP2002512399A (en) RISC processor with context switch register set accessible by external coprocessor
US6154828A (en) Method and apparatus for employing a cycle bit parallel executing instructions
JP2003005958A (en) Data processor and method for controlling the same
US6145074A (en) Selecting register or previous instruction result bypass as source operand path based on bypass specifier field in succeeding instruction
JP3578883B2 (en) Data processing device
JP2874351B2 (en) Parallel pipeline instruction processor
WO2000008555A1 (en) Data processing device
US7111152B1 (en) Computer system that operates in VLIW and superscalar modes and has selectable dependency control
US6099585A (en) System and method for streamlined execution of instructions
JP3182591B2 (en) Microprocessor
JP2878792B2 (en) Electronic computer
JPH08272611A (en) Microprocessor
US7302555B2 (en) Zero overhead branching and looping in time stationary processors
US6032249A (en) Method and system for executing a serializing instruction while bypassing a floating point unit pipeline
US9135006B1 (en) Early execution of conditional branch instruction with pc operand at which point target is fetched
JP3743155B2 (en) Pipeline controlled computer

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE

121 EP: the EPO has been informed by WIPO that EP was designated in this application
122 EP: PCT application non-entry in European phase