CA1269756A - Extended floating point operations supporting emulation of source instruction execution - Google Patents

Extended floating point operations supporting emulation of source instruction execution

Info

Publication number
CA1269756A
Authority
CA
Canada
Prior art keywords
instruction
target
floating point
source
operand
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CA000547682A
Other languages
French (fr)
Inventor
John F. Bechdel
James A. Mitchell
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/3017 Runtime instruction translation, e.g. macros
    • G06F9/30174 Runtime instruction translation, e.g. macros for non-native instruction set, e.g. Javabyte, legacy code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3877 Concurrent instruction execution, e.g. pipeline, look ahead using a slave processor, e.g. coprocessor
    • G06F9/3879 Concurrent instruction execution, e.g. pipeline, look ahead using a slave processor, e.g. coprocessor for non-native instruction execution, e.g. executing a command; for Java instruction set

Abstract

EXTENDED FLOATING POINT OPERATIONS SUPPORTING
EMULATION OF SOURCE INSTRUCTION EXECUTION

ABSTRACT
In a system which emulates execution of source CPU instructions and includes a translating unit (translator) for converting source instructions to target instructions and a target CPU instruction unit for processing and issuing translated target instructions, provision is made for accelerating instruction translation, issue, and execution when certain source floating point arithmetic instructions are emulated for execution. When a source floating point arithmetic instruction is emulated, a token is placed in a wait queue in the translator to prevent the translation of any source instructions and issue of any target instructions until condition and interrupt information is available and validated. In addition, emulation of source RX-type floating point instructions is enhanced by provision of registers in the instruction unit which receive X-field denoted operands, and which thereby permit a target CPU execution unit to perform the emulation by conducting register-to-register operations.

Description

EXTENDED FLOATING POINT OPERATIONS SUPPORTING
EMULATION OF SOURCE INSTRUCTION EXECUTION

BACKGROUND OF THE INVENTION
This invention is in the field of machine emulation and especially concerns a system which emulates operation of a source CPU by translating source CPU instructions into target CPU instructions for issue and execution by a target CPU. More particularly, the invention relates to extensions of the emulation ability of such a system through the provision of means for accelerating the translating and issuing functions when source floating point arithmetic instructions are being emulated, and of means which emulate the execution of source floating point RX-type instructions by execution of target register-to-register floating point arithmetic instructions.
Emulation is the imitation of the operation of a first ("source") CPU by a second ("target") CPU. The target CPU is specially programmed and architected to permit it to execute programs written for the source CPU. A program written for the source CPU comprises a sequence of source instructions which are provided, one-by-one, to the target CPU. The target CPU responds to each source instruction by executing one or more target instructions.
In U. S. Patent No. 4,587,612 of Fisk et al., assigned to the present assignee and incorporated herein by reference, an emulation assist processor (EAP) receives a source instruction stream and maps each source instruction to one or more target instructions, the target instructions being passed to an instruction processing unit (IPU) of the target CPU. In the incorporated patent, the EAP converts multi-field source instructions into multi-field target instructions and streams the target instructions to the IPU for processing and issuing.
As is known, when the source CPU comprises a machine such as an IBM* 370 host computer (described in U. S. Patent No. 3,400,371, assigned to the present assignee, and incorporated herein by reference), the source instruction set includes multi-field floating point arithmetic instructions, primarily of the RX-type. Source instruction programs utilizing the IBM 370 instruction set characteristically are conditioned by the state of a condition code (CC), an indicator which is set according to the outcome of certain instructions, among which are floating point arithmetic instructions. When an IBM 370 floating point instruction produces an abnormal outcome (such as an all zero result) or attempts an abnormal operation (divide by zero), an interrupt indicator is set which transfers control from the executing program to a supervisor for certain interrupt procedures. Emulation of the branching and interrupt features of an IBM 370 source program requires that the target program maintain condition code and interrupt indicators to effectively map program branches and interrupts.
*Registered Trade Mark
While conceding that the branching and interrupt correspondence between source and target programs must be maintained, it is recognized that the speed of emulation can be enhanced by the ability to reliably predict the state of the condition code and interrupt indicators before the completion of executing floating point arithmetic operations. However, any such enhancement must account not only for the translation of source to target instructions, but also must take into account the issuance of converted target instructions.
One of the potential operational environments of the EAP of U. S. Patent No. 4,587,612 imposes certain architectural bottlenecks on execution of target floating point instructions. In this regard, the instruction unit and floating point unit (FPU) of the target CPU are interconnected by a 32-bit wide databus. When source RX-type floating point instructions are emulated, an operand denoted in the X field of the source instruction must be fetched from memory for provision to the FPU. Passage of the responsibility for obtaining the X-field operand from memory lengthens the instruction execution time. Execution of a target floating point instruction is further lengthened when the source instruction is an extended RX-type requiring two sequential memory accesses to obtain a 64-bit operand over the 32-bit databus.

THE INVENTION
The invention has application in a system for emulating the execution of source CPU instructions, which system includes a target CPU embracing a memory, a number of functional units, and a target instruction unit which processes and issues translated target instructions for execution by target functional units. The emulation system also includes an instruction translating unit (translator) for receiving and converting source instructions to target instructions. The invention resides in an improvement to the system and has the form of a mechanism for accelerating the translation and execution of floating point instructions.
In this regard, the invention includes a token generator in the translator which generates a floating point wait token signal in response to translation of a source floating point instruction. An outcome generator in a floating point functional unit of the target CPU produces completion signals indicative of the outcome of issued target floating point instructions in advance of completion of the execution of the issued instructions. The system includes system means for restraining the translation and issue of instructions in response to the token signal. Finally, a multi-stage wait queue located in the translator and connected to the system means receives and retains the wait token signal when it is generated and extinguishes the token signal in response to the completion signals indicating the outcome of the target floating point instruction translated from the source floating point instruction.

The invention stems from a number of unanticipated observations. The inventors have observed that: if instruction translation and target instruction issue are responsive to an accelerated indication of the execution results of target floating point instructions, translation and issuing, and therefore emulation, can be accelerated; if a source RX-type instruction X-field operand can be obtained prior to translated target instruction issue, a source RX floating point instruction can be emulated by execution of a target RR instruction, thereby obviating a memory access cycle after instruction issue; and if the two halves of the X-field operand of an extended floating point source instruction are obtained on successive memory access cycles, the source instruction can be emulated by a set of target floating point instructions which transfer the operand and initiate execution of an RR target instruction sequence which reduces the number of cycles required to emulate the source instruction.
Therefore, the primary objective of the present invention is to improve the operation of a system which emulates the performance of a source CPU program that includes floating point arithmetic instructions of the RX-type by accelerating the translation of source to target instructions and the issue of converted target instructions.
A further objective of the present invention is to improve the operation of such an emulation system by reducing the time required for target CPU operations that emulate execution of source RX-type floating point arithmetic instructions.

These and other objects and further attendant advantages of the subject invention will become more evident when the detailed description is read in connection with the below-described drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 illustrates the architectural arrangement of a system for emulating a program provided by a source CPU of the IBM 370 type.
Figure 2 illustrates in greater detail functional elements of the emulator-assist processor (EAP) and instruction processing unit (IPU) which accelerate the translation and target issuing functions in response to accelerated completion indications provided by a floating point unit (FPU) prior to the completion of target floating point arithmetic operations.
Figure 3 illustrates the functional elements included in the IPU and FPU that support the emulation of RX-type floating point source instructions by RR-type floating point target instructions.
Figure 4 is a flow diagram illustrating the sequence of operations executed by the arrangement of Figure 3 in emulating RX short source instructions.
Figure 5 is a flow diagram illustrating the sequence of operations executed by the arrangement of Figure 3 in emulating extended RX-type floating point source instructions.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
Figure 1 illustrates the application environment of the invention. This environment includes a reduced instruction set computer (RISC) which forms the target CPU. As described by Radin in "The 801 Minicomputer," published in the IBM Journal of Research and Development, Volume 27, No. 3, May 1983, at pages 237-246, such a computer includes separate instruction and data streams sourced by respective caches. In this computer, instructions are obtained from an instruction cache and data from a separate data cache, both of which are located in and managed by an instruction processing unit (IPU) 10. The contents of the caches are replenished by IPU access of a memory 12. The IPU 10 processes cached instructions and issues them for execution by one or more processing units (PU) 14 and a floating point processing unit (FPU) 16. Since the units 10, 12, 14, and 16 form the target CPU, the instructions issued by the IPU are termed target instructions and are issued to the processing units on a target instruction bus 19. Operand and result (O/R) data is exchanged between the IPU 10 and the processing units 14 and 16 on the O/R databus 18.
In the postulated environment, the exemplary 801 computer has a 32-bit architecture which is reflected in a corresponding width for the data and instruction buses 18 and 19. Further, in addition to the data cache, the IPU 10 of the 801 includes a bank of general-purpose registers.
When used as an emulator, the target CPU operates in conjunction with an emulator-assist processor (EAP) 20. In the emulation model a program consisting of instructions from a source CPU is streamed through a portion of the data cache in the IPU 10 on the data path 21, 10, 22 to the EAP 20. In the EAP 20, the source instructions are converted into target instructions and inserted into a target CPU instruction stream which does not disturb the normal target CPU instruction execution sequence executed by the IPU 10.
The converted instruction stream is provided to the IPU 10 on datapath 23. The cooperative operations of the IPU 10 and EAP 20 in performing instruction conversion and generating the target instruction stream are well explained in the incorporated Fisk Patent and will not be repeated here. However, certain of the structures and functions involved in the instruction conversion must be set out in order to adequately explain the invention.
Refer now to Figure 2, where the EAP 20, IPU 10, and FPU 16 are shown. The IPU is shown in two functional sections 10a and 10b only to facilitate the following discussion; the IPU 10 in fact comprises a single unit. In Figure 2, the portion 10a of the IPU 10 which receives the source instruction stream on 21 includes a data cache 24 which terminates the source stream, and an instruction cache 25 which contains microinstructions. The source instruction stream is provided to the EAP 20 from the data cache on signal line 22 and staged, instruction by instruction, through a source instruction register (SIR) 26 which provides a platform against which an instruction mapping circuit 27 can operate. As explained in the Fisk patent, the instruction mapping circuit 27 generates a next instruction address which is provided to the instruction cache 25 in the IPU 10, and which results in the provision of a microinstruction from the instruction cache, the microinstruction being provided to the EAP on the signal line 30. The microinstruction on the signal line 30 is staged to the EAP in the microinstruction register (MIR) 31.
As is taught in the incorporated Fisk patent, the microinstruction held in the register 31 also includes a next instruction field which can generate the next instruction address on line 29. Thus, a source instruction in the register 26 can result in the generation of one or a sequence of microinstructions which are fed to the EAP on the signal line 30. Relatedly, emulation of a source instruction can involve retrieval of a microinstruction sequence from the instruction cache, and execution of a resulting target instruction sequence by the target processor. The microinstruction provided to the register 31 consists of a control section and a skeleton target instruction. The skeleton target instruction is provided on signal line 23a to an instruction merge register 32 in the IPU 10. As taught in the above-noted Fisk patent, the skeleton instruction consists of an OP code field and register and displacement fields. The control portion of the microinstruction in the register 31 and the operand fields of the source instruction in the register 26 are used by the mapping circuit 27 to fill in zeroed operand and/or control fields of the skeleton instruction in the instruction merge register 32 in the IPU 10. The fill information is provided from the mapping circuit on the signal line 23b.
Thus, the signal lines 23a and 23b form the signal line 23 of Figure 1 on which the target instruction stream is provided to the IPU 10. When an instruction is issued by the IPU 10, it is transferred from the instruction merge register 32 to an instruction register (I REG) 33. When the target instruction resides in the I register 33, the IPU 10 undertakes a course of instruction issue operations, summarized below.
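The merge of a skeleton target instruction with fill information can be pictured with a minimal sketch. The field names (rt, ra, rb), the mnemonic FADD, and the helper merge are hypothetical and chosen only for illustration; the description does not give the actual target instruction layout.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TargetInstruction:
    op: str       # OP code, supplied by the skeleton (line 23a)
    rt: int = 0   # result register field, zeroed in the skeleton
    ra: int = 0   # first operand register field, zeroed in the skeleton
    rb: int = 0   # second operand register field, zeroed in the skeleton


def merge(skeleton: TargetInstruction, fill: dict) -> TargetInstruction:
    """Return a copy of the skeleton with the given operand fields filled in."""
    if "op" in fill:
        # In this sketch the fill path (line 23b) never touches the OP code.
        raise ValueError("fill information may not replace the OP code")
    return replace(skeleton, **fill)


skeleton = TargetInstruction(op="FADD")  # hypothetical target mnemonic
merged = merge(skeleton, {"rt": 2, "ra": 2, "rb": 7})
print(merged)
```

The skeleton is left unchanged and a new merged instruction is produced, which loosely mirrors the merge register receiving the skeleton and the fill data over separate lines.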

The operational context of the invention assumes a source instruction stream including multi-field instructions such as are included in the IBM 370 CPU instruction set. At least one distinctive format in this set is termed the RX (register/index) format. The RX format has the following form:

OP R1 X2 B2 D2

In an RX instruction, the operation is indicated by the OP code field. A first operand is located in register R1, while the second operand is at a main memory location having the address X2 + B2 + D2. X2 and B2 refer to general registers functioning, for the purpose of the instruction, as index and base registers, while D2 is a displacement. When the instruction is executed, the result is placed in register R1.
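The second-operand address calculation for an RX instruction can be illustrated with a short sketch. The field widths (8-bit OP, 4-bit R1, X2, and B2, 12-bit D2), the rule that register number 0 contributes zero, and the 24-bit address width are taken from the published System/370 architecture and are assumptions here, since this description does not state them. The function names are hypothetical.

```python
def decode_rx(word: int) -> dict:
    """Split a 32-bit RX instruction into its fields (8/4/4/4/12 bits, assumed)."""
    return {
        "op": (word >> 24) & 0xFF,
        "r1": (word >> 20) & 0xF,
        "x2": (word >> 16) & 0xF,
        "b2": (word >> 12) & 0xF,
        "d2": word & 0xFFF,
    }


def effective_address(fields: dict, gpr: list) -> int:
    """Sum index register, base register, and displacement."""
    # A register number of 0 in X2 or B2 means "no register" and adds zero.
    index = gpr[fields["x2"]] if fields["x2"] else 0
    base = gpr[fields["b2"]] if fields["b2"] else 0
    return (index + base + fields["d2"]) & 0xFFFFFF  # 24-bit address (assumed)


gpr = [0] * 16
gpr[3] = 0x100     # index register
gpr[12] = 0x2000   # base register
fields = decode_rx(0x7A43C010)  # OP=0x7A, R1=4, X2=3, B2=12, D2=0x010
addr = effective_address(fields, gpr)
print(hex(addr))   # 0x2110
```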
As is known, RX-type instructions include a subset of instructions involving arithmetic operations that are performed on floating point operands. Floating point arithmetic is well-understood. Floating point data format, terminology, and operations are reviewed at length in the article by Anderson et al. entitled "The IBM System/360 Model 91: Floating-Point Execution Unit," which appeared in the IBM Journal dated January 1967 at pages 34-53.
RX-type floating point arithmetic source instructions are converted into floating point arithmetic target instructions by the EAP 20 for issue by the IPU 10 and execution by the FPU 16.

An RX-type floating point arithmetic instruction can produce a result which alters the condition code (CC) or which raises a program interrupt request (IR). As is known, the state of the CC conditions certain branches in the executing source program. An interrupt request will result in transfer of program control from the executing program.
Therefore, these indicators must be accounted for in the emulation of a source instruction stream. In the EAP 20, the current state of the CC and IR are maintained in a status register 34. The instruction mapping circuit 27 inspects the contents of the status register 34 in order to undertake branching and interrupt activities, when necessary.
The CC and IR are provided by, among other sources, the FPU 16, where floating point instructions which affect the indicators are performed. The CC and IR are provided on the signal lines 35 and 36, respectively. Also provided by the FPU 16 is an accelerated validity response (AVR) signal on a signal line 38. The AVR is raised by the FPU 16 at the time when the CC and IR validly reflect the outcome of a floating point arithmetic instruction currently being executed by the FPU 16. The FPU 16 can set the CC and IR in advance of the completion of a floating point arithmetic instruction whose result determines their states. When the CC and IR have been generated, the FPU 16 raises the AVR, signalling to the EAP 20 that the contents of the register 34 can be inspected. The time of occurrence of the AVR can range from soon after commencement of a floating point arithmetic instruction to the time of completion of an instruction, which is the time at which the CC and IR are conventionally sampled.
In order to take advantage of the accelerated production of the CC and IR by the FPU 16, the invention provides the EAP 20 with a wait token circuit which notifies the EAP 20 and IPU 10 when a floating point arithmetic target instruction has been translated and issued, restrains further translation and issuance of target instructions until production of the CC and IR in response to execution of the issued floating point arithmetic target instruction, and notifies the EAP 20 and IPU 10 when the CC and IR have been produced, permitting the early resumption of instruction translation and issue.
When a source floating point arithmetic instruction is contained in the register 26, the instruction mapping circuit 27 generates the first address of a microinstruction sequence adapted for translating the source instruction. All microinstruction sequences generated as the result of a floating point arithmetic instruction terminate with a WAIT microinstruction. The WAIT microinstruction is the signal to the invention to begin production of a wait token. In this regard, when the source floating point arithmetic instruction is registered at 26, its OP code field is provided to a conventional decoder (D) 40 which activates a first token precursor signal. When the WAIT microinstruction at the end of the stimulated sequence is registered at 31, the OP code of the microinstruction is also provided to the decoder 40 which generates a second token precursor signal. A coincidence detector (CD) 42 responds to the two token precursor signals by activating a WAIT TOKEN signal on signal line 43. The WAIT TOKEN signal is provided to a wait token queue 45, consisting of a multi-stage latch sequence including latches 46 and 47.
When the WAIT microinstruction is in the microinstruction register 31, a skeleton target floating point instruction is entered into the instruction merge register 32 on the signal line 23a, with the operand field data for the instruction being provided by the instruction mapping circuit 27 on signal line 23b. The WAIT TOKEN signal is clocked from the latch 46 to the latch 47 by one phase (CLK A) of a multi-phase pipeline clock. When the WAIT TOKEN signal enters the latch 47, the latch activates a HOLD signal that is provided to the instruction mapping circuit 27, the instruction merge and instruction registers 32 and 33, and to the status register 34. The HOLD signal prevents alteration of data in the registers 32, 33, and 34 for so long as the HOLD signal is active. In addition, the HOLD signal suspends the operation of the instruction mapping circuit 27.
Concurrent with the transfer of the WAIT TOKEN signal into the latch 47 in response to a second multi-phase clock (CLK B), the target floating point instruction shifts from the instruction merge register 32 to the instruction register 33 in the IPU 10. In the instruction register 33, the instruction issues and is provided to the FPU 16. The FPU 16 undertakes operations for executing the issued instruction and also operates to set the CC and IR to the states reflecting the outcome of the executed instruction. When the CC and IR have been produced by the FPU 16, the AVR is raised, which clears the latch 47, thereby deactivating the HOLD signal.

When the HOLD signal is deactivated, the current CC and IR are entered into the status register 34 from the FPU 16. When the entry is made, the instruction mapping circuit 27 inspects the contents of the status register 34 to undertake any required branching or interrupt operations. Following deactivation of the HOLD signal and the generation of the next skeleton instruction, a merged instruction will lodge in the register 32 and be transferred for issue to the instruction register 33.
It will be evident to those skilled in the art that the WAIT TOKEN circuit 45 responds to acceleration of the AVR
and permits target instruction conversion and issue to take place in advance of the completion of an executing target floating point arithmetic instruction. The skilled artisan will appreciate that this accelerated operation can only enhance the concurrency of operations in the emulator of Figure 1, thereby increasing the overall speed of its operations.
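The behaviour of the wait token queue can be approximated by a small state model. This is a sketch under simplifying assumptions: the two clock phases are collapsed into a single clock() call, and the class and method names are hypothetical rather than taken from the description.

```python
class WaitTokenQueue:
    """Two-stage latch sequence: latch 46 feeds latch 47, which drives HOLD."""

    def __init__(self):
        self.latch46 = False
        self.latch47 = False

    def detect(self, source_is_fp: bool, micro_is_wait: bool) -> None:
        # Coincidence detector: a token is produced only when both
        # token precursor signals are active at the same time.
        if source_is_fp and micro_is_wait:
            self.latch46 = True

    def clock(self) -> None:
        # Pipeline clock moves a pending token from latch 46 to latch 47.
        if self.latch46:
            self.latch46, self.latch47 = False, True

    def avr(self) -> None:
        # The accelerated validity response clears latch 47, dropping HOLD.
        self.latch47 = False

    @property
    def hold(self) -> bool:
        # HOLD is active for as long as latch 47 retains the token.
        return self.latch47


queue = WaitTokenQueue()
trace = []
queue.detect(source_is_fp=True, micro_is_wait=True)
trace.append(queue.hold)   # token is in latch 46, HOLD not yet active
queue.clock()
trace.append(queue.hold)   # token reached latch 47, HOLD active
queue.avr()
trace.append(queue.hold)   # AVR raised, HOLD released
print(trace)               # [False, True, False]
```

In this model translation and issue would be gated on queue.hold, so they resume as soon as avr() is called, which may be well before the floating point operation itself finishes.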
Reference to Figure 3 will engender an understanding of how the invention achieves the second of the stated objectives: that of reducing the time required for execution of target machine operations that emulate execution of source RX-type floating point arithmetic instructions. In Figure 3, the IPU 10 includes, in addition to the instruction register 33, a bank of general purpose registers (GPR) 50, a decode/control circuit 52, and a pair of multiplexers (MUX) 54, 56. The general purpose registers are used for instruction processing operations in the IPU 10. The decode/control circuit conventionally receives OP code information from the OP field of the target instruction in the register 33 and information in the D field of the instruction. The D field specifies a control operation to be executed and identifies the controller responsible for the execution. When designated, the decode/control circuit 52 generates control signals for the MUXES 54, 56 and the GPR's 50 to execute the OP code indicated instruction. When an instruction processing operation is required, the D field identifies the decode/control circuit 52 and enables the circuit to conduct a specified operation. Such an operation includes data fetched from memory, entailing operation of the MUX 54 to conduct the fetched data to a designated GPR.
When data is to be exchanged with the O/R databus 18, the circuit 52 operates the MUX 56 to conduct the data and also operates a GPR to source or sink the data. The complement of operations performed by the decode/control circuit 52 also includes a LOAD IMMEDIATE instruction which will cause a value in one of the register fields (RT, RA, or RB) of the target instruction in the register 33 to be entered into a predetermined one of the GPR registers.
Figure 3 also illustrates, in greater detail, the FPU 16. The FPU 16 includes a control unit 60 which can be designated by a D field of an issued target instruction and which can control and synchronize the operation of FPU resources to conduct the execution of an instruction specified in the OP and D fields of the issued target instruction. The FPU has internal resources including a bank of floating point registers (FPR) 62. The FPR's 62 comprise a plurality of general purpose 64-bit registers for temporary storage of the operands and results of floating point operations conducted by the FPU 16. FPU arithmetic and logic operations are conducted by execution units (EX) 64 and 65. As is known, such units can include, for example, a multiply/divide execution unit, an add/subtract/shift unit, and a unit for performing radical operations. Since the IPU and its other associated functional units are assembled in a 32-bit data and instruction architecture, the data interface of the FPU 16 is a 32-bit interface register (IR) 66, which exchanges data with the 32-bit databus 18.
Referring now to Figures 3 and 4 together, the operation of the invention in emulating "short" RX-type floating point arithmetic source instructions can be understood. The term "short" indicates that the operands of an RX-type instruction are 32 bits in length. Thus, the data interface 21 between the IPU 10 and FPU 16 can transfer an entire operand or result in a single cycle. In step S1 of Figure 4, when a short source RX-type instruction enters the EAP 20, the EAP will undertake a microinstruction sequence which includes provision of a LOAD IMMEDIATE instruction to the IPU 10, designating the IPU as the unit of execution.
The instruction will carry in one of its register fields (RT, RA, or RB) the memory address corresponding to the sum of the X2, B2, and D2 fields of the RX-type instruction.
The address will be loaded into GPRi. Next, during the microinstruction sequence, a LOAD instruction to the IPU will result in the operand stored at the address held in GPRi being loaded into GPRj. This is step S2 in Figure 4.
Now, the final instruction of the microinstruction sequence results in a target floating point arithmetic instruction being loaded into the register 33 for issue. The issued instruction is referred to as an extended-RX (ERX) short-type target instruction. Such an instruction results in the IPU control circuit 52 placing the contents of GPRj on the databus 21; this is the X-field operand of the source RX instruction. The FPU receives and decodes the issued instruction as a register-to-register (RR) floating point arithmetic instruction in which the first operand is located in the FPR designated by the RA field of the instruction and the second operand is located in the interface register 66, which is designated in the RB field of the issued instruction. The result of the target instruction execution is placed in the FPR denoted in the RT field of the instruction.
The procedure of Figure 4 is an efficient one for emulating source RX-type floating point arithmetic instructions by execution of target register-to-register floating point instructions. Without the RX enhancement of Figure 4, it would be necessary to insert the added step of transferring the operand held in GPRj to a floating point register and then executing a floating point register-to-register floating point instruction. The extra step would be inserted between steps S2 and S3 of Figure 4.
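The short RX sequence can be walked through in a minimal sketch. Memory, the general purpose registers, and the floating point registers are modelled as plain dictionaries, Python floats stand in for short floating point operands, and the function and parameter names are hypothetical; none of this reflects the actual hardware encoding.

```python
import operator


def emulate_short_rx(memory, gpr, fpr, address, ra, rt, op):
    """Model the steps of the short RX sequence on plain Python containers."""
    gpr["i"] = address             # S1: LOAD IMMEDIATE places the operand address in GPRi
    gpr["j"] = memory[gpr["i"]]    # S2: LOAD fetches the 32-bit operand into GPRj
    interface_register = gpr["j"]  # ERX short: the IPU drives GPRj onto the databus
    # The FPU treats the instruction as register-to-register: first operand
    # from the FPR named by RA, second from the interface register, result to RT.
    fpr[rt] = op(fpr[ra], interface_register)
    return fpr[rt]


memory = {0x2110: 1.5}             # illustrative operand at an illustrative address
gpr = {}
fpr = {2: 2.25}
result = emulate_short_rx(memory, gpr, fpr, 0x2110, ra=2, rt=2, op=operator.add)
print(result)                      # 3.75
```

Because the operand is already sitting in a GPR when the floating point instruction issues, no separate GPR-to-FPR transfer step appears in the sequence.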
Reference now to Figures 3 and 5 will provide an understanding of how the invention operates to emulate a source RX floating point arithmetic instruction designating a long (64-bit) operand. In the microinstruction sequence undertaken by the EAP in response to an extended RX-type instruction, the sequence of Figure 4 is initially followed by providing, in step S10, a LOAD IMMEDIATE target instruction for execution by the IPU, which places the X field address of the RX instruction into GPRi. It will be appreciated that the address in GPRi defines a double-word operand comprising 64 bits. Thus, following the LOAD IMMEDIATE instruction, in steps S11 and S12, two LOAD instructions are sent in sequence, causing the IPU to first load the high order (HO) four bytes of the X-field operand into GPRj and then the low order (LO) four bytes of the operand into GPRk. In step S13, the penultimate instruction provided to the IPU as a result of the microinstruction sequence undertaken by the EAP consists of an extended RX (ERX) long-type floating point instruction to be issued to the FPU. Essentially, the instruction is recognized by the FPU as an extended register-to-register instruction involving a pair of 64-bit operands, one stored in the FPR designated by the RA field of the instruction. The FPU obtains the second operand from the interface register 66. The second operand is staged to the FPU through the interface register 66 by transferring first the high order four bytes in GPRj concurrently with issue of the instruction in step S13, and then, in step S14, transferring the low order four bytes in response to the final target instruction resulting from the microinstruction sequence. The final instruction is denoted as FINRX and is executed only by the IPU. Instruction execution consists of transfer of the low order four bytes from GPRk to the interface register 66. Thus, the execution sequence of Figure 5 permits efficient emulation of an extended source RX floating point instruction by means of an extended target register-to-register floating point operation. The FINRX dummy instruction adapts the emulation sequence to the 32-bit architecture of the target machine.
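The five-step sequence for a long operand (S10 through S14) can be sketched in Python as follows. This is an illustration under stated assumptions only: the names `InterfaceRegister`, `split_long` and `emulate_erx_long` are hypothetical, memory is modeled as a dictionary of 32-bit words with the low order word assumed to sit four bytes after the high order word, and the 64 bits are interpreted as an IEEE 754 double purely for convenience, which is not necessarily the floating point format of the source machine.

```python
import struct


def split_long(value):
    """Split a 64-bit operand into (high order, low order) 32-bit words."""
    bits = struct.unpack(">Q", struct.pack(">d", value))[0]
    return bits >> 32, bits & 0xFFFFFFFF


class InterfaceRegister:
    """Stand-in for interface register 66; accepts one 32-bit word per transfer."""

    def __init__(self):
        self.words = []

    def transfer(self, word):
        if not 0 <= word <= 0xFFFFFFFF:
            raise ValueError("the data path is modeled as 32 bits wide")
        self.words.append(word)

    def take_long(self):
        # Reassemble the staged halves into one 64-bit operand.
        hi, lo = self.words
        self.words = []
        return struct.unpack(">d", struct.pack(">II", hi, lo))[0]


def emulate_erx_long(memory, x_address, fpr, rt, ra, op):
    """Run the assumed S10..S14 sequence; return the list of steps taken."""
    trace = []
    gpr = {}
    iface = InterfaceRegister()

    gpr["i"] = x_address                 # S10: X-field address into GPRi
    trace.append("S10 LOAD IMMEDIATE")
    gpr["j"] = memory[gpr["i"]]          # S11: high order four bytes into GPRj
    trace.append("S11 LOAD")
    gpr["k"] = memory[gpr["i"] + 4]      # S12: low order four bytes into GPRk
    trace.append("S12 LOAD")
    iface.transfer(gpr["j"])             # S13: ERX issues, HO word goes with it
    trace.append("S13 ERX long")
    iface.transfer(gpr["k"])             # S14: FINRX sends the LO word
    trace.append("S14 FINRX")

    # The FPU combines the FPR operand with the reassembled 64-bit operand.
    fpr[rt] = op(fpr[ra], iface.take_long())
    return trace
```

For example, storing the two halves returned by `split_long(2.5)` at addresses `0x200` and `0x204`, with `fpr = {0: 1.25}` and `op` set to addition, leaves `3.75` in `fpr[0]` after five steps. The second transfer exists only because the modeled data path carries 32 bits at a time, which is the role the text assigns to FINRX.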


Obviously, many modifications and variations of this invention are possible in light of these teachings, and it is therefore understood that the appended claims permit the invention to be practiced other than as specifically described.


Claims (5)

The embodiments of the invention in which an exclusive property or privilege is claimed are defined as follows:
1. In a system for emulating the execution of source CPU instructions received from a computer external to said system, said system including a memory, a target instruction unit which processes and issues translated target instructions, target functional units for executing target instructions, and an instruction translating unit for receiving and translating said source instructions to translated target instructions, an improvement for accelerating said translating and issuing, said improvement comprising:
signal means in said translator for generating a floating point token signal in response to translation of a source floating point arithmetic instruction;
outcome means in said floating point unit for generating completion signals indicative of the outcome of the target floating point arithmetic instruction issued in response to said translation, said completion signals being generated in advance of completing the execution of said issued target floating point arithmetic instruction;
a multi-stage wait queue including a wait stage in said translator, said multi-stage wait queue connected to said signal and outcome means for receiving and moving said token signal to said wait stage in response to said translation and for removing said token signal from said wait stage in response to said completion signals; and
system means connected to said wait stage, said translator, and said target instruction unit for preventing the translation and issue of instructions until said token is removed from said wait stage.
2. The improvement of Claim 1 wherein said source floating point arithmetic instruction is an RX-type instruction which includes an X field for indicating a first operand in a memory location, and which indicates a second operand in a register location, and said target instruction is an RR-type instruction which indicates a first operand in a register location and a second operand in a register location, said target instruction being executable against the operands of said RX instruction and further including:
a register in said target instruction unit for receiving a floating point operand indicated by the X-field of said source instruction;
a data interface means in said floating point unit selectively connectable to said register for buffering said operand to said floating point unit; and
multiplexing means in said target instruction unit connected to said register, said memory, and said data interface means for transferring said operand from a location in said memory indicated by said X-field to said register prior to the issue of said target instruction and for connecting said data interface means to said register to transfer said operand concurrently with the issued target RR instruction.
3. The improvement of Claim 1 wherein said source instruction is an extended RX-type instruction which includes an X field for indicating a first operand in a memory location, and which indicates a second operand in a register location, and said target instruction is an RR-type instruction which indicates a first operand in a register location and a second operand in a register location, said target instruction being executable against the operands of said RX instruction and further including:
first and second registers in said target instruction unit for receiving respective first and second halves of an extended floating point operand indicated by the X-field of said source instruction;
a data interface means in said floating point unit selectively connectable to said first and second registers for sequentially buffering said first and second operand halves to said floating point unit; and
multiplexing means in said target instruction unit connected to said first and second registers, said memory and said data interface means for transferring the first and second halves of said operand from a location in said memory indicated by said X-field to said first and second registers, respectively, prior to the issue of said target instruction and for connecting said data interface means successively to said first and second registers to transfer said first and second operand halves concurrently with the issue of said target RR instruction.
4. In a method for emulating execution of a source floating point RX-type instruction (source instruction), said method executable by an emulation system including a memory for storing operands, a Fisk-translator for translating said source instruction into an RR-type target floating point instruction (target instruction), a target system instruction unit for issuing said target instruction, and a target system floating point unit for executing said target instruction, an improvement comprising steps of:
providing a source instruction in the form of RX-type floating point instruction from an instruction source, said RX-type instruction including an X field for indicating a memory location containing an operand;
moving the operand indicated by said X-field of said source instruction to a register located in said instruction unit;
assembling and issuing a target instruction to the floating point unit for execution using operands of said source instruction;
concurrently with said issuing of said target instruction, transferring said operand from said instruction unit register to said floating point unit; and
executing said target instruction by combining said operand with an operand contained in said floating point unit.
5. In a method for emulating execution of an extended source floating point RX-type instruction (source instruction), said method executable in an emulation system including a memory for storing operands, a Fisk-type translator for translating said source instruction into an RR-type target floating point instruction (target instruction) for issuing said target instruction, and a target system floating point unit for executing said target instruction, an improvement, comprising the steps of:
providing a source instruction in the form of an RX-type floating point instruction from an instruction source, said RX-type instruction including an X field for indicating a memory location containing an operand with a first half and a second half;
moving said first half of said operand indicated by the X field of said RX-type instruction into a first register located in said instruction unit;
moving said second half of said operand into a second register located in said instruction unit;
in response to said source instruction, assembling and issuing a target instruction to the floating point unit for execution using operands of said source instruction;

concurrently with the issue of said target instruction, transferring said first operand half from said first register to said floating point unit;
transferring said second operand half from said second register to said floating point unit; and
executing said target instruction by combining said first operand with a second operand in said floating point unit.
CA000547682A 1986-10-06 1987-09-24 Extended floating point operations supporting emulation of source instruction execution Expired - Fee Related CA1269756A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US915,423 1978-06-14
US06/915,423 US4841476A (en) 1986-10-06 1986-10-06 Extended floating point operations supporting emulation of source instruction execution

Publications (1)

Publication Number Publication Date
CA1269756A true CA1269756A (en) 1990-05-29

Family

ID=25435704

Family Applications (1)

Application Number Title Priority Date Filing Date
CA000547682A Expired - Fee Related CA1269756A (en) 1986-10-06 1987-09-24 Extended floating point operations supporting emulation of source instruction execution

Country Status (11)

Country Link
US (1) US4841476A (en)
EP (1) EP0263288B1 (en)
JP (1) JPH0758466B2 (en)
KR (1) KR910000364B1 (en)
AR (1) AR240723A1 (en)
AT (1) ATE103085T1 (en)
BR (1) BR8704431A (en)
CA (1) CA1269756A (en)
DE (1) DE3789345T2 (en)
HK (1) HK79994A (en)
MY (1) MY102468A (en)

Families Citing this family (57)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5093784A (en) * 1987-02-27 1992-03-03 Nec Corporation Data processor with efficient transfer between subroutines and main program
CA1327080C (en) * 1987-05-26 1994-02-15 Yoshiko Yamaguchi Reduced instruction set computer (risc) type microprocessor
JPH0628036B2 (en) * 1988-02-01 1994-04-13 インターナショナル・ビジネス・マシーンズ・コーポレーシヨン Simulation method
US5167023A (en) * 1988-02-01 1992-11-24 International Business Machines Translating a dynamic transfer control instruction address in a simulated CPU processor
US4951195A (en) * 1988-02-01 1990-08-21 International Business Machines Corporation Condition code graph analysis for simulating a CPU processor
CA2002201C (en) * 1988-12-06 1999-04-27 John Charles Goettelmann Translation technique
US5430862A (en) * 1990-06-29 1995-07-04 Bull Hn Information Systems Inc. Emulation of CISC instructions by RISC instructions using two pipelined stages for overlapped CISC decoding and RISC execution
US5625836A (en) * 1990-11-13 1997-04-29 International Business Machines Corporation SIMD/MIMD processing memory element (PME)
US5966528A (en) * 1990-11-13 1999-10-12 International Business Machines Corporation SIMD/MIMD array processor with vector processing
DE69131272T2 (en) * 1990-11-13 1999-12-09 Ibm Parallel associative processor system
US5963745A (en) * 1990-11-13 1999-10-05 International Business Machines Corporation APAP I/O programmable router
US5809292A (en) * 1990-11-13 1998-09-15 International Business Machines Corporation Floating point for simid array machine
US5590345A (en) * 1990-11-13 1996-12-31 International Business Machines Corporation Advanced parallel array processor(APAP)
US5765015A (en) * 1990-11-13 1998-06-09 International Business Machines Corporation Slide network for an array processor
US5815723A (en) * 1990-11-13 1998-09-29 International Business Machines Corporation Picket autonomy on a SIMD machine
US5963746A (en) * 1990-11-13 1999-10-05 International Business Machines Corporation Fully distributed processing memory element
US5765012A (en) * 1990-11-13 1998-06-09 International Business Machines Corporation Controller for a SIMD/MIMD array having an instruction sequencer utilizing a canned routine library
US5794059A (en) * 1990-11-13 1998-08-11 International Business Machines Corporation N-dimensional modified hypercube
US5734921A (en) * 1990-11-13 1998-03-31 International Business Machines Corporation Advanced parallel array processor computer package
US5828894A (en) * 1990-11-13 1998-10-27 International Business Machines Corporation Array processor having grouping of SIMD pickets
US5765011A (en) * 1990-11-13 1998-06-09 International Business Machines Corporation Parallel processing system having a synchronous SIMD processing with processing elements emulating SIMD operation using individual instruction streams
US5617577A (en) * 1990-11-13 1997-04-01 International Business Machines Corporation Advanced parallel array processor I/O connection
US5752067A (en) * 1990-11-13 1998-05-12 International Business Machines Corporation Fully scalable parallel processing system having asynchronous SIMD processing
US5630162A (en) * 1990-11-13 1997-05-13 International Business Machines Corporation Array processor dotted communication network based on H-DOTs
US5588152A (en) * 1990-11-13 1996-12-24 International Business Machines Corporation Advanced parallel processor including advanced support hardware
DE69216020T2 (en) * 1991-03-07 1997-07-10 Digital Equipment Corp IMPROVED TROUBLESHOOTING SYSTEM AND METHOD, PARTICULARLY FOR TROUBLESHOOTING IN A MULTI-ARCHITECTURE ENVIRONMENT
US5652869A (en) * 1991-03-07 1997-07-29 Digital Equipment Corporation System for executing and debugging multiple codes in a multi-architecture environment using jacketing means for jacketing the cross-domain calls
US5594918A (en) * 1991-05-13 1997-01-14 International Business Machines Corporation Parallel computer system providing multi-ported intelligent memory
JPH079632B2 (en) * 1991-06-18 1995-02-01 インターナショナル・ビジネス・マシーンズ・コーポレイション Address translation device and method
FR2678401A1 (en) * 1991-06-28 1992-12-31 Philips Electronique Lab INFORMATION PROCESSING DEVICE MORE PARTICULARLY ADAPTED TO A CHAIN LANGUAGE, OF THE FORTH TYPE IN PARTICULAR.
US5438668A (en) * 1992-03-31 1995-08-01 Seiko Epson Corporation System and method for extraction, alignment and decoding of CISC instructions into a nano-instruction bucket for execution by a RISC computer
JP2642039B2 (en) * 1992-05-22 1997-08-20 インターナショナル・ビジネス・マシーンズ・コーポレイション Array processor
JPH0773046A (en) * 1992-12-07 1995-03-17 Intel Corp Method and equipment for emulation of circuit in computer system
WO1994027214A1 (en) * 1993-05-07 1994-11-24 Apple Computer, Inc. Method for decoding sequences of guest instructions for a host computer
US5392408A (en) * 1993-09-20 1995-02-21 Apple Computer, Inc. Address selective emulation routine pointer address mapping system
US5408622A (en) * 1993-09-23 1995-04-18 Apple Computer, Inc. Apparatus and method for emulation routine control transfer via host jump instruction creation and insertion
US5542059A (en) * 1994-01-11 1996-07-30 Exponential Technology, Inc. Dual instruction set processor having a pipeline with a pipestage functional unit that is relocatable in time and sequence order
US5481684A (en) * 1994-01-11 1996-01-02 Exponential Technology, Inc. Emulating operating system calls in an alternate instruction set using a modified code segment descriptor
US5781750A (en) * 1994-01-11 1998-07-14 Exponential Technology, Inc. Dual-instruction-set architecture CPU with hidden software emulation mode
US5685009A (en) * 1994-07-20 1997-11-04 Exponential Technology, Inc. Shared floating-point registers and register port-pairing in a dual-architecture CPU
US5481693A (en) * 1994-07-20 1996-01-02 Exponential Technology, Inc. Shared register architecture for a dual-instruction-set CPU
JPH08339298A (en) * 1995-02-02 1996-12-24 Ricoh Co Ltd Instruction addition method in microprocessor and microprocessor using the same
US5619665A (en) * 1995-04-13 1997-04-08 International Business Machines Corporation Method and apparatus for the transparent emulation of an existing instruction-set architecture by an arbitrary underlying instruction-set architecture
US5819063A (en) * 1995-09-11 1998-10-06 International Business Machines Corporation Method and data processing system for emulating a program
US5812823A (en) * 1996-01-02 1998-09-22 International Business Machines Corporation Method and system for performing an emulation context save and restore that is transparent to the operating system
US5758140A (en) * 1996-01-25 1998-05-26 International Business Machines Corporation Method and system for emulating instructions by performing an operation directly using special-purpose register contents
US6711667B1 (en) * 1996-06-28 2004-03-23 Legerity, Inc. Microprocessor configured to translate instructions from one instruction set to another, and to store the translated instructions
JPH113225A (en) * 1997-06-13 1999-01-06 Nec Corp Information processor
US5864690A (en) * 1997-07-30 1999-01-26 Integrated Device Technology, Inc. Apparatus and method for register specific fill-in of register generic micro instructions within an instruction queue
DE69820027T2 (en) * 1997-10-02 2004-07-08 Koninklijke Philips Electronics N.V. DEVICE FOR EXECUTING VIRTUAL MACHINE COMMANDS
EP0997815A3 (en) * 1998-10-29 2004-05-26 Texas Instruments Incorporated Interactive translation system and method
US7149883B1 (en) * 2000-03-30 2006-12-12 Intel Corporation Method and apparatus selectively to advance a write pointer for a queue based on the indicated validity or invalidity of an instruction stored within the queue
US6862565B1 (en) * 2000-04-13 2005-03-01 Hewlett-Packard Development Company, L.P. Method and apparatus for validating cross-architecture ISA emulation
US7243217B1 (en) * 2002-09-24 2007-07-10 Advanced Micro Devices, Inc. Floating point unit with variable speed execution pipeline and method of operation
US7293159B2 (en) * 2004-01-15 2007-11-06 International Business Machines Corporation Coupling GP processor with reserved instruction interface via coprocessor port with operation data flow to application specific ISA processor with translation pre-decoder
GB2447968B (en) * 2007-03-30 2010-07-07 Transitive Ltd Improvements in and relating to floating point operations
JP4849273B2 (en) * 2008-03-26 2012-01-11 株式会社日立プラントテクノロジー H type protector for discharge electrode of dry type electrostatic precipitator

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3881173A (en) * 1973-05-14 1975-04-29 Amdahl Corp Condition code determination and data processing
JPS57150039A (en) * 1981-03-11 1982-09-16 Hitachi Ltd Data processor
WO1984001635A1 (en) * 1982-10-22 1984-04-26 Ibm Accelerated instruction mapping external to source and target instruction streams for near realtime injection into the latter

Also Published As

Publication number Publication date
EP0263288A2 (en) 1988-04-13
MY102468A (en) 1992-06-30
EP0263288A3 (en) 1991-06-12
AR240723A1 (en) 1990-09-28
HK79994A (en) 1994-08-19
BR8704431A (en) 1988-05-24
US4841476A (en) 1989-06-20
KR910000364B1 (en) 1991-01-24
DE3789345T2 (en) 1994-09-29
EP0263288B1 (en) 1994-03-16
JPH0758466B2 (en) 1995-06-21
ATE103085T1 (en) 1994-04-15
KR880005516A (en) 1988-06-29
JPS6398739A (en) 1988-04-30
DE3789345D1 (en) 1994-04-21

Similar Documents

Publication Publication Date Title
CA1269756A (en) Extended floating point operations supporting emulation of source instruction execution
US4589087A (en) Condition register architecture for a primitive instruction set machine
US5341482A (en) Method for synchronization of arithmetic exceptions in central processing units having pipelined execution units simultaneously executing instructions
US4569016A (en) Mechanism for implementing one machine cycle executable mask and rotate instructions in a primitive instruction set computing system
EP0464494B1 (en) A high performance pipelined emulator
EP0399762B1 (en) Multiple instruction issue computer architecture
US5530804A (en) Superscalar processor with plural pipelined execution units each unit selectively having both normal and debug modes
CA1180455A (en) Pipelined microprocessor with double bus architecture
US4947316A (en) Internal bus architecture employing a simplified rapidly executable instruction set
JP2001195250A (en) Instruction translator and instruction memory with translator and data processor using the same
US5887175A (en) Apparatus and method for managing interrupt delay on floating point error
US5761491A (en) Data processing system and method for storing and restoring a stack pointer
US5233698A (en) Method for operating data processors
AU644065B2 (en) Arithmetic unit
US20020138712A1 (en) Data processing device with instruction translator and memory interface device
US5179691A (en) N-byte stack-oriented CPU using a byte-selecting control for enhancing a dual-operation with an M-byte instruction word user program where M<N<2M
US5864701A (en) Apparatus and method for managing interrupt delay associated with mask flag transition
EP0081336A2 (en) Shifting apparatus
EP0374598B1 (en) Control store addressing from multiple sources
CA1304823C (en) Apparatus and method for synchronization of arithmetic exceptions in central processing units having pipelined execution units simultaneously executing instructions
EP0992893B1 (en) Verifying instruction parallelism
EP0015276B1 (en) A digital pipelined computer
WO1979000959A1 (en) A computer system having enhancement circuitry for memory accessing
JP2579817B2 (en) Microprocessor
Sylvain et al. The design and evaluation of the array machine: a high-level language processor

Legal Events

Date Code Title Description
MKLA Lapsed