US20110060892A1 - Speculative forwarding of non-architected data format floating point results - Google Patents
- Publication number
- US20110060892A1 (application number US12/820,662)
- Authority
- US
- United States
- Prior art keywords
- adf
- result
- instruction
- floating
- point unit
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/38—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
- G06F7/48—Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
- G06F7/483—Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2207/00—Indexing scheme relating to methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F2207/38—Indexing scheme relating to groups G06F7/38 - G06F7/575
- G06F2207/3804—Details
- G06F2207/3808—Details concerning the type of numbers or the way they are handled
- G06F2207/3812—Devices capable of handling different types of numbers
- G06F2207/3824—Accepting both fixed-point and floating-point numbers
Abstract
A microprocessor having an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands includes first and second floating-point units. The first floating-point unit is configured to speculatively forward a non-ADF result generated by the first floating-point unit to the second floating-point unit. The non-ADF result is associated with a first instruction. The second floating-point unit is configured to use the speculatively forwarded non-ADF result associated with the first instruction as a source operand to generate a result of a second instruction. The second floating-point unit is further configured to convert the non-ADF result to an ADF result and to determine whether the non-ADF result creates an exception condition when converted to the ADF result. The microprocessor is configured to cancel the second instruction, in response to determining that the non-ADF result creates an exception condition when converted to the ADF result.
Description
- This application claims priority based on U.S. Provisional Application Ser. No. 61/240,753, filed Sep. 9, 2009, entitled FAST FLOATING POINT RESULT FORWARDING USING NON-ARCHITECTED DATA FORMAT, which is hereby incorporated by reference in its entirety.
- This application is related to U.S. Non-Provisional Application TBD, filed concurrently herewith, entitled FAST FLOATING POINT RESULT FORWARDING USING NON-ARCHITECTED DATA FORMAT, which is incorporated by reference herein in its entirety, and which is subject to an obligation of assignment to common assignee VIA Technologies, Inc.
- The present invention relates in general to the field of pipelined microprocessor architectures, and particularly to the forwarding of floating-point results from one instruction to another.
- The x86 architecture specifies multiple data formats for floating point operands, namely, single-precision, double-precision, and extended double-precision. Taken naively, this implies that the floating point units have a different multiplier, adder, etc. for each architected data format, which is an inefficient use of space and power. So, to reduce the number of multipliers, adders, etc., the floating point units include a single multiplier, adder, etc., each capable of operating on operands that are in a single non-architected data format. The floating point units convert the received source operands from their architected data format to the non-architected data format, perform the operation on the non-architected data format operands to generate a result in the non-architected data format, and then convert the result back to the architected data format. The architected data format results are then forwarded to the floating point units as source operands, as illustrated by the conventional floating point units 112 shown in FIG. 4.
- In one aspect, the present invention provides a microprocessor having an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands. The microprocessor includes first and second floating-point units. The first floating-point unit is configured to speculatively forward a non-ADF result generated by the first floating-point unit to the second floating-point unit, wherein the non-ADF result is associated with a first instruction. The second floating-point unit is configured to use the speculatively forwarded non-ADF result associated with the first instruction as a source operand to generate a result of a second instruction. The second floating-point unit is further configured to convert the non-ADF result to an ADF result and to determine whether the non-ADF result creates an exception condition when converted to the ADF result. The microprocessor is configured to cancel the second instruction, in response to determining that the non-ADF result creates an exception condition when converted to the ADF result.
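The exception determination in the paragraph above can be illustrated with a minimal Python sketch. It is not the patented circuit. It assumes the embodiment described later in this document (a 15-bit exponent field in the largest ADF and a 17-bit exponent field in the non-ADF), and it further assumes an IEEE-754-style biased exponent whose all-zeros and all-ones codes are reserved. The names `normal_exponent_range` and `conversion_exception` are hypothetical.

```python
ADF_EXP_BITS = 15    # exponent width of the largest ADF in the described embodiment
NADF_EXP_BITS = 17   # wider exponent of the non-architected format


def normal_exponent_range(exp_bits):
    """Return (min, max) unbiased exponent of normal values.

    Assumption: IEEE-754-style bias, with the all-zeros and all-ones
    exponent codes reserved for special values.
    """
    bias = (1 << (exp_bits - 1)) - 1
    return 1 - bias, bias


def conversion_exception(unbiased_exp, exp_bits=ADF_EXP_BITS):
    """Classify what happens when a value with this exponent is narrowed."""
    lowest, highest = normal_exponent_range(exp_bits)
    if unbiased_exp > highest:
        return "overflow"
    if unbiased_exp < lowest:
        return "underflow"
    return None


# A product of two operands whose exponents are each 16000 has an exponent
# of roughly 32000: representable in the wider format, not in the ADF.
product_exp = 16000 + 16000
print(conversion_exception(product_exp, NADF_EXP_BITS))  # None
print(conversion_exception(product_exp, ADF_EXP_BITS))   # overflow
```

Under these assumptions, a value can be perfectly usable as a forwarded operand and still fault when narrowed, which is why the forwarding has to be treated as speculative.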
- In another aspect, the present invention provides a method for processing floating-point instructions in a microprocessor having first and second floating-point units, wherein the microprocessor has an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands. The method includes speculatively forwarding a non-ADF result generated by the first floating-point unit from the first floating-point unit to the second floating-point unit, wherein the non-ADF result is associated with a first instruction. The method also includes the second floating-point unit using the speculatively forwarded non-ADF result associated with the first instruction as a source operand to generate a result of a second instruction. The method also includes determining whether the non-ADF result creates an exception condition when converted to an ADF result. The method also includes canceling the second instruction, in response to determining that the non-ADF result creates an exception condition when converted to the ADF result.
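The four steps of the method can be sketched in software as follows. This is a simplified behavioral model under stated assumptions, not the claimed implementation: exact `Fraction` values stand in for the wider non-ADF, significand rounding is ignored, any nonzero magnitude below the smallest normal value is treated as an underflow, and the limits `ADF_OVERFLOW_LIMIT` and `ADF_MIN_NORMAL` are approximations for a format with a 15-bit exponent. All names are hypothetical.

```python
from fractions import Fraction

# Assumed ADF limits for a format with a 15-bit exponent (bias 16383).
ADF_OVERFLOW_LIMIT = Fraction(2) ** 16384   # magnitudes at or above this overflow
ADF_MIN_NORMAL = Fraction(1, 2 ** 16382)    # nonzero magnitudes below this underflow


def adf_exception(nadf_value):
    """Report the exception, if any, raised by narrowing a non-ADF value."""
    magnitude = abs(nadf_value)
    if magnitude >= ADF_OVERFLOW_LIMIT:
        return "overflow"
    if 0 < magnitude < ADF_MIN_NORMAL:
        return "underflow"
    return None


def execute_pair(first_instruction, second_instruction):
    """Run two dependent instructions with speculative non-ADF forwarding."""
    # Step 1: the first unit produces a non-ADF result and forwards it as is.
    forwarded = first_instruction()
    # Step 2: the second unit consumes the forwarded value as a source operand.
    speculative_result = second_instruction(forwarded)
    # Step 3: the conversion check (in hardware this overlaps with step 2).
    exception = adf_exception(forwarded)
    # Step 4: cancel the second instruction if the conversion faulted.
    if exception is not None:
        return {"exception": exception, "second_result": None, "cancelled": True}
    return {"exception": None, "second_result": speculative_result, "cancelled": False}


huge = Fraction(2) ** 16000
print(execute_pair(lambda: huge * huge, lambda x: x + 1)["cancelled"])            # True
print(execute_pair(lambda: Fraction(3) * 4, lambda x: x + 1)["second_result"])    # 13
```

The sketch runs the steps sequentially; the point of the method is that the second instruction does not wait for step 3, and its result is simply discarded when the check fails.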
- In yet another aspect, the present invention provides a computer program product encoded in at least one computer readable medium for use with a computing device, the computer program product comprising computer readable program code embodied in said medium for specifying a microprocessor having an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands. The computer readable program code includes first program code for specifying a first floating-point unit and second program code for specifying a second floating-point unit. The first floating-point unit is configured to speculatively forward a non-ADF result generated by the first floating-point unit to the second floating-point unit, wherein the non-ADF result is associated with a first instruction. The second floating-point unit is configured to use the speculatively forwarded non-ADF result associated with the first instruction as a source operand to generate a result of a second instruction. The second floating-point unit is further configured to convert the non-ADF result to an ADF result and to determine whether the non-ADF result creates an exception condition when converted to the ADF result. The microprocessor is configured to cancel the second instruction, in response to determining that the non-ADF result creates an exception condition when converted to the ADF result.
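As a rough illustration of program code that specifies two floating-point units, the following Python sketch models each unit as an input converter, an operand selector, and an arithmetic element whose raw result is visible to the other unit. It is a toy behavioral model, not HDL and not the patented design; the class `FloatingPointUnit`, the operand tags `"adf"` and `"fwd"`, and the use of `Fraction` as a stand-in for the non-ADF are all assumptions made for the example. Exception detection is deliberately left out.

```python
import operator
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Optional


@dataclass
class FloatingPointUnit:
    name: str
    operate: Callable[[Fraction, Fraction], Fraction]  # non-ADF arithmetic element
    nadf_result: Optional[Fraction] = None             # value on the forwarding bus

    def execute(self, a, b):
        """Execute one instruction.

        Each operand is ("adf", value) for a register operand, or
        ("fwd", unit) for a result forwarded from a floating-point unit.
        """
        operands = [self._select(source) for source in (a, b)]
        self.nadf_result = self.operate(*operands)
        return self.nadf_result

    @staticmethod
    def _select(source):
        kind, payload = source
        if kind == "fwd":
            return payload.nadf_result   # forwarded value bypasses the converter
        return Fraction(payload)         # input converter: ADF to non-ADF


multiplier = FloatingPointUnit("first unit", operator.mul)
adder = FloatingPointUnit("second unit", operator.add)

multiplier.execute(("adf", 1.5), ("adf", 4))
print(adder.execute(("fwd", multiplier), ("adf", 0.25)))  # 25/4
```

Keeping the forwarded value in its raw form is the design choice the sketch is meant to show: the consumer reads `nadf_result` directly instead of a converted copy.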
- FIG. 1 is a block diagram illustrating a microprocessor that incorporates latency-reducing non-architected data format result forwarding.
- FIG. 2 is a block diagram illustrating in more detail the floating point units of FIG. 1.
- FIG. 3 is a flowchart illustrating an example of operation of the microprocessor of FIG. 1.
- FIG. 4 is a block diagram illustrating related art floating point units that do not forward non-architected data format results.
- The forwarding of architected data format results described above with respect to FIG. 4, or more specifically the data format conversions performed, is time-wasteful in the sense that it adds latency in cases where the result-generating and result-consuming instructions are scheduled back-to-back for execution. To reduce latency, embodiments described herein include modified floating point units that forward the non-architected data format (NADF) result without converting it to the architected data format (ADF) and that are capable of receiving and operating directly on the NADF operands without converting them from the ADF to the NADF. This reduces latency by removing the conversion time in and out of the floating point units from the critical path. The latency reduction may be particularly significant when there is a sequence of back-to-back result-generating and result-consuming instructions such that the modified floating point units are able to forward the NADF results. In one embodiment, the NADF includes additional exponent bits beyond the number of exponent bits specified by the largest ADF. For example, in one embodiment the largest ADF is the 80-bit extended double-precision format, which includes a 15-bit exponent field, and the NADF includes a 17-bit exponent field to accommodate overflows and underflows.
- Referring now to FIG. 1, a block diagram illustrating a microprocessor 100 that incorporates the latency-reducing NADF result forwarding described above is shown. The microprocessor 100 includes a plurality of floating point units (FPU) 112. In one embodiment, the floating point units 112 include a first floating point unit 112A that includes a floating point multiplier 226 (see FIG. 2) that generates a first ADF result 162, and a second floating point unit 112B that includes a floating point adder 236 (see FIG. 2) that generates a second ADF result 164. The floating point units 112 receive ADF source operands 152 from a multiplexer 116 that receives ADF source operands from general purpose registers (GPRs) 118, from temporary registers of a reorder buffer (ROB) 114, and the ADF results 162/164 from the floating point units 112 themselves. Additionally, the floating point units 112 generate respective exception signals 172/174 to the ROB 114 to indicate that an instruction created an exception condition, such as an overflow or underflow, as described in more detail below.
- In one embodiment, the microprocessor 100 is an x86 (also referred to as IA-32) architecture microprocessor; however, other microprocessor architectures may be employed. A microprocessor is an x86 architecture processor if it can correctly execute a majority of the application programs that are designed to be executed on an x86 microprocessor. An application program is correctly executed if its expected results are obtained. In particular, the microprocessor 100 executes instructions of the x86 instruction set and includes the x86 user-visible register set.
- Referring now to FIG. 2, a block diagram illustrating in more detail the floating point units 112 of FIG. 1 is shown. Floating point unit 112A includes a converter 222, coupled to a mux 224, coupled to a NADF multiplier 226, coupled to a second converter 228. Floating point unit 112B includes a converter 232, coupled to a mux 234, coupled to a NADF adder 236, coupled to a second converter 238.
- The converter 222 converts the ADF operands 152 into NADF operands 272 that are provided to the mux 224. The mux 224 also receives a NADF result 252 forwarded from the NADF multiplier 226 and a NADF result 254 forwarded from the NADF adder 236. From its inputs, the mux 224 selects NADF operands 266 for provision to the NADF multiplier 226, which multiplies the operands 266 to generate the NADF result 252. The converter 228 converts the NADF result 252 to the ADF result 162 of FIG. 1. Additionally, the converter 228 generates an exception indicator 172 of FIG. 1 if it detects that the ADF result 162 created an exception condition, such as an underflow or overflow. That is, the NADF may have accommodated the result 252 without creating an underflow or overflow; however, the smaller ADF may not sufficiently accommodate the NADF result 252, such that the conversion from the NADF to the ADF creates an exception condition.
- The converter 232 converts the ADF operands 152 into NADF operands 274 that are provided to the mux 234. The mux 234 also receives the NADF result 252 forwarded from the NADF multiplier 226 and the NADF result 254 forwarded from the NADF adder 236. From its inputs, the mux 234 selects NADF operands 268 for provision to the NADF adder 236, which adds the operands 268 to generate the NADF result 254. The converter 238 converts the NADF result 254 to the ADF result 164 of FIG. 1. Additionally, the converter 238 generates an exception indicator 174 of FIG. 1 if it detects that the ADF result 164 created an exception condition, such as an underflow or overflow.
- As may be observed by comparing FIGS. 2 and 4, the floating point units 112 of FIG. 2 can advantageously reduce instruction execution latency by directly forwarding to one another their NADF results 252/254. This is in contrast to the conventional floating point units 112 of FIG. 4, which incur the latency of converting the NADF results to ADF results, forwarding the converted ADF results, and then reconverting to NADF operands.
- Floating point operations may generate exception conditions, such as overflow or underflow. A side-effect of the NADF is that some results that would overflow/underflow in the ADF would not do so in the NADF, e.g., because of the larger exponent, as discussed above. Consequently, the forwarding of the NADF results 252/254 is speculative because the programmer may not want the instruction that receives the forwarded NADF result 252/254 to execute with a value that would cause an exception when converted to ADF. Therefore, in parallel with the speculative forwarding of NADF results 252/254, the converters 228/238 also perform the conversion to ADF, and if the conversion yields an overflow/underflow, then they generate an exception 172/174 on the forwarding instruction and the microprocessor 100 kills the instruction that executed using the speculatively forwarded NADF result, as described in more detail with respect to FIG. 3.
- Referring now to FIG. 3, a flowchart illustrating an example of operation of the microprocessor 100 of FIG. 1 is shown. Flow begins at block 302.
- At block 302, floating point unit 112A receives an instruction-B for execution. The mux 224 detects that one of the source operands is the NADF result 254 of a previous instruction-A that has been forwarded from the NADF adder 236 and accordingly selects the forwarded NADF result 254. The mux 224 may also select as the other operand the forwarded NADF result 252 from the NADF multiplier 226 or the converted NADF operands 272. Flow proceeds to block 304.
- At block 304, the NADF multiplier 226 multiplies the NADF operands 266 to generate the NADF result 252 for instruction-B. Flow proceeds concurrently from block 304 to blocks 306 and 322.
- At block 306, the forwarding buses forward the NADF result 252 of instruction-B to the NADF adder 236. Flow proceeds to block 308.
- At block 308, floating point unit 112B receives an instruction-C for execution. The mux 234 detects that one of the source operands is the NADF result 252 of instruction-B that has been forwarded at block 306 from the NADF multiplier 226 and accordingly selects the forwarded NADF result 252. The mux 234 may also select as the other operand the forwarded NADF result 254 from the NADF adder 236 or the converted NADF operands 274. Flow proceeds to block 312.
- At block 312, the NADF adder 236 adds the NADF operands 268 to generate the NADF result 254 for instruction-C. Flow ends at block 312, although it is understood that the forwarding of NADF results 252 and/or 254 may advantageously continue for a long sequence of instructions, thereby reducing latency and speeding up the execution of the sequence of instructions relative to the conventional floating point units 112 of FIG. 4, which include the ADF-to-NADF conversion and NADF-to-ADF conversion in the forwarding paths.
- At block 322, the converter 228 converts the NADF result 252 of instruction-B to ADF result 162. Flow proceeds to decision block 324.
- At decision block 324, the converter 228 determines whether the NADF result 252 of instruction-B creates an exception condition when converted to ADF. If so, flow proceeds to block 326; otherwise, flow proceeds to block 328.
- At block 326, the converter 228 asserts the exception indicator 172 to the ROB 114. Consequently, the microprocessor 100 will take an exception, and the ROB 114 will flush instruction-C since instruction-C is newer in program sequence than instruction-B, which caused the exception. This is necessary since the NADF result 252 of instruction-B was speculatively forwarded to the NADF adder 236 without knowledge of whether the NADF result 252 was a good operand, i.e., without knowledge of whether the NADF result 252 was a non-underflowed/overflowed value from an ADF perspective. That is, the programmer may not have desired instruction-C to execute with a non-good operand. However, advantageously the NADF results 252/254 are speculatively forwarded to potentially reduce the latency of instruction execution, and in most cases both the forwarding and the receiving instructions will complete successfully. Flow ends at block 326.
- At block 328, floating point unit 112A provides the ADF result 162 to the ROB 114 for storage in a temporary register therein. Flow proceeds to block 332.
- At block 332, the ROB 114 retires the ADF result 162 from the temporary register to the appropriate GPR 118. Flow ends at block 332.
- While various embodiments of the present invention have been described herein, it should be understood that they have been presented by way of example, and not limitation. It will be apparent to persons skilled in the relevant computer arts that various changes in form and detail can be made therein without departing from the scope of the invention. For example, software can enable the function, fabrication, modeling, simulation, description and/or testing of the apparatus and methods described herein. This can be accomplished through the use of general programming languages (e.g., C, C++), hardware description languages (HDL) including Verilog HDL, VHDL, and so on, or other available programs. Such software can be disposed in any known computer usable medium such as magnetic tape, semiconductor, magnetic disk, or optical disc (e.g., CD-ROM, DVD-ROM, etc.), a network, wire line, wireless or other communications medium. Embodiments of the apparatus and method described herein may be included in a semiconductor intellectual property core, such as a microprocessor core (e.g., embodied in HDL) and transformed to hardware in the production of integrated circuits. Additionally, the apparatus and methods described herein may be embodied as a combination of hardware and software. Thus, the present invention should not be limited by any of the exemplary embodiments described herein, but should be defined only in accordance with the following claims and their equivalents. Specifically, the present invention may be implemented within a microprocessor device which may be used in a general purpose computer.
Finally, those skilled in the art should appreciate that they can readily use the disclosed conception and specific embodiments as a basis for designing or modifying other structures for carrying out the same purposes of the present invention without departing from the scope of the invention as defined by the appended claims.
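By way of illustration only, and not as part of the disclosed embodiments, the flow of FIG. 3 may be modeled in software. The following Python sketch rests on several assumptions that are not taken from the specification: the NADF is stood in for by a Python float (IEEE double precision) and the ADF by IEEE single precision, so that the NADF has the wider exponent range; the names `convert_to_adf` and `run_chain` are hypothetical; and rounding at the very edge of the single precision range is ignored.

```python
import struct

# Assumed stand-ins: ADF = IEEE single precision, NADF = Python float (double).
ADF_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127   # largest finite single-precision value
ADF_MIN_NORMAL = 2.0 ** -126                # smallest normal single-precision value


def convert_to_adf(nadf_value):
    """Model blocks 322/324: convert an NADF result to ADF and report any exception.

    Returns (adf_value, exception), where exception is None, "overflow" or "underflow".
    Simplification: values are range-checked before rounding.
    """
    magnitude = abs(nadf_value)
    if magnitude > ADF_MAX:
        return None, "overflow"
    if 0.0 < magnitude < ADF_MIN_NORMAL:
        return None, "underflow"
    # Round the in-range value to single precision.
    adf_value = struct.unpack("f", struct.pack("f", nadf_value))[0]
    return adf_value, None


def run_chain(a, b, c):
    """Model instruction-B (multiply) forwarding its NADF result to instruction-C (add)."""
    product = a * b            # block 304: NADF multiplier produces the result of B
    forwarded = product        # block 306: forwarded without conversion to ADF
    total = forwarded + c      # blocks 308/312: NADF adder executes C speculatively

    # Blocks 322/324 proceed in parallel in hardware; here they simply run afterwards.
    adf_product, exception = convert_to_adf(product)
    if exception is not None:
        # Block 326: B takes the exception and the newer instruction C is flushed.
        return {"exception": exception, "retired_B": None,
                "C_flushed": True, "C_result": None}
    # Blocks 328/332: the ADF result of B is retired; C keeps its speculative result.
    return {"exception": None, "retired_B": adf_product,
            "C_flushed": False, "C_result": total}


if __name__ == "__main__":
    print(run_chain(2.0, 3.0, 4.0))        # in-range product: B retires, C survives
    print(run_chain(1e30, 1e30, -1e60))    # product exceeds the ADF range: C is flushed
```

In the second call the product is representable in the wider stand-in format but not in the narrower one, which is the situation that makes the forwarding speculative: the dependent addition has already executed by the time the conversion reports the overflow, and its result is discarded.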
Claims (20)
1. A microprocessor having an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands, the microprocessor comprising:
first and second floating-point units;
wherein the first floating-point unit is configured to speculatively forward a non-ADF result generated by the first floating-point unit to the second floating-point unit, wherein the non-ADF result is associated with a first instruction;
wherein the second floating-point unit is configured to use the speculatively forwarded non-ADF result associated with the first instruction as a source operand to generate a result of a second instruction;
wherein the second floating-point unit is further configured to convert the non-ADF result to an ADF result and to determine whether the non-ADF result creates an exception condition when converted to the ADF result;
wherein the microprocessor is configured to cancel the second instruction, in response to determining that the non-ADF result creates an exception condition when converted to the ADF result.
2. The microprocessor of claim 1, wherein the exception condition comprises an underflow or overflow.
3. The microprocessor of claim 1, wherein the ISA is an x86 ISA.
4. The microprocessor of claim 1, wherein the non-ADF comprises a larger number of bits than the ADF for specifying a floating-point operand exponent.
5. The microprocessor of claim 1, wherein the first floating-point unit is configured to use a speculatively forwarded second non-ADF result as a source operand to generate the speculatively forwarded non-ADF result associated with the first instruction, wherein the second non-ADF result is associated with a third instruction, wherein the microprocessor is further configured to cancel the first instruction, in response to determining that the second non-ADF result creates an exception condition when converted to an ADF result.
6. The microprocessor of claim 1, wherein the non-ADF result associated with the first instruction is a product and the non-ADF result associated with the second instruction is a sum.
7. The microprocessor of claim 1, wherein the non-ADF result associated with the first instruction is a sum and the non-ADF result associated with the second instruction is a product.
8. The microprocessor of claim 1, wherein the microprocessor is further configured to retire the ADF result to an architected register of the microprocessor, in response to determining that the non-ADF result does not create an exception condition when converted to the ADF result.
9. The microprocessor of claim 1,
wherein the second floating-point unit is further configured to speculatively forward a second non-ADF result generated by the second floating-point unit from the second floating-point unit to the second floating-point unit, wherein the second non-ADF result is associated with a third instruction, wherein the second floating-point unit is configured to speculatively forward the second non-ADF result associated with the third instruction concurrently with the first floating-point unit speculatively forwarding the non-ADF result associated with the first instruction to the second floating-point unit;
wherein the second floating-point unit is further configured to use the speculatively forwarded non-ADF result associated with the first instruction and the speculatively forwarded second non-ADF result associated with the third instruction as source operands to generate the result of the second instruction;
wherein the second floating-point unit is further configured to determine whether the second non-ADF result associated with the third instruction creates an exception condition when converted to an ADF result; and
wherein the microprocessor is further configured to cancel the second instruction, in response to determining that either of the non-ADF results associated with the first and third instructions creates an exception condition when converted to an ADF result.
10. A method for processing floating-point instructions in a microprocessor having first and second floating-point units, wherein the microprocessor has an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands, the method comprising:
speculatively forwarding a non-ADF result generated by the first floating-point unit from the first floating-point unit to the second floating-point unit, wherein the non-ADF result is associated with a first instruction;
using, by the second floating-point unit, the speculatively forwarded non-ADF result associated with the first instruction as a source operand to generate a result of a second instruction;
determining whether the non-ADF result creates an exception condition when converted to an ADF result; and
canceling the second instruction, in response to determining that the non-ADF result creates an exception condition when converted to the ADF result.
11. The method of claim 10, wherein the exception condition comprises an underflow or overflow.
12. The method of claim 10, wherein the ISA is an x86 ISA.
13. The method of claim 10, wherein the non-ADF comprises a larger number of bits than the ADF for specifying a floating-point operand exponent.
14. The method of claim 10, further comprising:
using, by the first floating-point unit, a speculatively forwarded second non-ADF result as a source operand to generate the speculatively forwarded non-ADF result associated with the first instruction, wherein the second non-ADF result is associated with a third instruction; and
canceling the first instruction, in response to determining that the second non-ADF result creates an exception condition when converted to an ADF result.
15. The method of claim 10, wherein the non-ADF result associated with the first instruction is a product and the non-ADF result associated with the second instruction is a sum.
16. The method of claim 10, wherein the non-ADF result associated with the first instruction is a sum and the non-ADF result associated with the second instruction is a product.
17. The method of claim 10, further comprising:
retiring the ADF result to an architected register of the microprocessor, in response to determining that the non-ADF result does not create an exception condition when converted to the ADF result.
18. The method of claim 10, further comprising:
speculatively forwarding a second non-ADF result generated by the second floating-point unit from the second floating-point unit to the second floating-point unit, wherein the second non-ADF result is associated with a third instruction, wherein said speculatively forwarding the second non-ADF result associated with the third instruction is performed concurrently with said speculatively forwarding the non-ADF result associated with the first instruction;
using, by the second floating-point unit, the speculatively forwarded non-ADF result associated with the first instruction and the speculatively forwarded second non-ADF result associated with the third instruction as source operands to generate the result of the second instruction;
determining whether the second non-ADF result associated with the third instruction creates an exception condition when converted to an ADF result; and
canceling the second instruction, in response to determining that either of the non-ADF results associated with the first and third instructions creates an exception condition when converted to an ADF result.
19. A computer program product encoded in at least one computer readable medium for use with a computing device, the computer program product comprising:
computer readable program code embodied in said medium, for specifying a microprocessor having an instruction set architecture (ISA) that specifies at least one architected data format (ADF) for floating-point operands, the computer readable program code comprising:
first program code for specifying a first floating-point unit; and
second program code for specifying a second floating-point unit;
wherein the first floating-point unit is configured to speculatively forward a non-ADF result generated by the first floating-point unit to the second floating-point unit, wherein the non-ADF result is associated with a first instruction;
wherein the second floating-point unit is configured to use the speculatively forwarded non-ADF result associated with the first instruction as a source operand to generate a result of a second instruction;
wherein the second floating-point unit is further configured to convert the non-ADF result to an ADF result and to determine whether the non-ADF result creates an exception condition when converted to the ADF result;
wherein the microprocessor is configured to cancel the second instruction, in response to determining that the non-ADF result creates an exception condition when converted to the ADF result.
20. The computer program product of claim 19, wherein the at least one computer readable medium is selected from the set of a disk, tape, or other magnetic, optical, or electronic storage medium and a network, wire line, wireless or other communications medium.
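For illustration only, and without limiting the claims, the cancellation behavior recited where a result is generated from two speculatively forwarded non-ADF operands may be sketched as follows. The sketch is hypothetical: it assumes single precision as the ADF and a Python float as the non-ADF, it checks results in program order, and it flushes every instruction newer than the first one whose conversion raises an exception, following the description of the ROB at block 326. The names `Instr`, `adf_exception` and `execute` are illustrative and do not appear in the specification.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple, Union

# Assumed stand-ins: ADF = IEEE single precision, non-ADF = Python float (double).
ADF_MAX = (2.0 - 2.0 ** -23) * 2.0 ** 127
ADF_MIN_NORMAL = 2.0 ** -126


def adf_exception(value: float) -> Optional[str]:
    """Report whether converting a non-ADF value to the ADF would overflow or underflow."""
    magnitude = abs(value)
    if magnitude > ADF_MAX:
        return "overflow"
    if 0.0 < magnitude < ADF_MIN_NORMAL:
        return "underflow"
    return None


@dataclass
class Instr:
    name: str
    op: Callable[[float, float], float]
    # Each source is a constant, or the name of an older instruction whose
    # non-ADF result is forwarded speculatively.
    srcs: Tuple[Union[float, str], Union[float, str]]


def execute(program: List[Instr]) -> Dict[str, str]:
    """Execute speculatively in the non-ADF, then resolve in program order."""
    forwarded: Dict[str, float] = {}
    for ins in program:
        operands = [forwarded[s] if isinstance(s, str) else s for s in ins.srcs]
        forwarded[ins.name] = ins.op(*operands)   # no conversion on the forwarding path

    status: Dict[str, str] = {}
    faulted = False
    for ins in program:
        if faulted:
            status[ins.name] = "flushed"          # newer than the faulting instruction
            continue
        exception = adf_exception(forwarded[ins.name])
        if exception is None:
            status[ins.name] = "retired"
        else:
            status[ins.name] = exception
            faulted = True
    return status


if __name__ == "__main__":
    program = [
        Instr("third", operator.add, (1.0, 2.0)),          # producer in the second unit
        Instr("first", operator.mul, (1e20, 1e20)),        # producer in the first unit
        Instr("second", operator.add, ("first", "third")), # consumer of both results
    ]
    print(execute(program))
```

In this example the consumer is cancelled because one of its two forwarded operands comes from an instruction whose result does not fit the assumed ADF, even though the other operand is in range.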
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/820,662 US20110060892A1 (en) | 2009-09-09 | 2010-06-22 | Speculative forwarding of non-architected data format floating point results |
TW099128795A TWI450191B (en) | 2009-09-09 | 2010-08-27 | Microprocessor and methods for processing floating-point instructions |
CN201010270067.9A CN101916182B (en) | 2009-09-09 | 2010-08-30 | Transmission of fast floating point result using non-architected data format |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US24075309P | 2009-09-09 | 2009-09-09 | |
US12/820,662 US20110060892A1 (en) | 2009-09-09 | 2010-06-22 | Speculative forwarding of non-architected data format floating point results |
Publications (1)
Publication Number | Publication Date |
---|---|
US20110060892A1 (en) | 2011-03-10 |
Family
ID=43648501
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/820,662 Abandoned US20110060892A1 (en) | 2009-09-09 | 2010-06-22 | Speculative forwarding of non-architected data format floating point results |
US12/820,578 Active 2031-09-20 US8375078B2 (en) | 2009-09-09 | 2010-06-22 | Fast floating point result forwarding using non-architected data format |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/820,578 Active 2031-09-20 US8375078B2 (en) | 2009-09-09 | 2010-06-22 | Fast floating point result forwarding using non-architected data format |
Country Status (2)
Country | Link |
---|---|
US (2) | US20110060892A1 (en) |
TW (1) | TWI450191B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110060785A1 (en) * | 2009-09-09 | 2011-03-10 | Via Technologies, Inc. | Fast floating point result forwarding using non-architected data format |
US20150309799A1 (en) * | 2014-04-25 | 2015-10-29 | Broadcom Corporation | Stunt box |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107168682B (en) | 2011-12-23 | 2021-01-26 | 英特尔公司 | Instruction for determining whether a value is within a range |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5619664A (en) * | 1994-01-04 | 1997-04-08 | Intel Corporation | Processor with architecture for improved pipelining of arithmetic instructions by forwarding redundant intermediate data forms |
US5687106A (en) * | 1995-03-31 | 1997-11-11 | International Business Machines Corporation | Implementation of binary floating point using hexadecimal floating point unit |
US5878266A (en) * | 1995-09-26 | 1999-03-02 | Advanced Micro Devices, Inc. | Reservation station for a floating point processing unit |
US5996065A (en) * | 1997-03-31 | 1999-11-30 | Intel Corporation | Apparatus for bypassing intermediate results from a pipelined floating point unit to multiple successive instructions |
US20020095451A1 (en) * | 2001-01-18 | 2002-07-18 | International Business Machines Corporation | Floating point unit for multiple data architectures |
US20060179100A1 (en) * | 2005-02-09 | 2006-08-10 | International Business Machines Corporation | System and method for performing floating point store folding |
US20070226288A1 (en) * | 2006-03-23 | 2007-09-27 | Fujitsu Limited | Processing method and computer system for summation of floating point data |
US20110060785A1 (en) * | 2009-09-09 | 2011-03-10 | Via Technologies, Inc. | Fast floating point result forwarding using non-architected data format |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5191335A (en) * | 1990-11-13 | 1993-03-02 | International Business Machines Corporation | Method and apparatus for floating-point data conversion with anomaly handling facility |
US7529912B2 (en) * | 2002-02-12 | 2009-05-05 | Via Technologies, Inc. | Apparatus and method for instruction-level specification of floating point format |
US8595279B2 (en) * | 2006-02-27 | 2013-11-26 | Qualcomm Incorporated | Floating-point processor with reduced power requirements for selectable subprecision |
GB2447968B (en) * | 2007-03-30 | 2010-07-07 | Transitive Ltd | Improvements in and relating to floating point operations |
2010
- 2010-06-22: US, application US12/820,662, patent US20110060892A1 (en), status: Abandoned
- 2010-06-22: US, application US12/820,578, patent US8375078B2 (en), status: Active
- 2010-08-27: TW, application TW099128795A, patent TWI450191B (en), status: Active
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5619664A (en) * | 1994-01-04 | 1997-04-08 | Intel Corporation | Processor with architecture for improved pipelining of arithmetic instructions by forwarding redundant intermediate data forms |
US5687106A (en) * | 1995-03-31 | 1997-11-11 | International Business Machines Corporation | Implementation of binary floating point using hexadecimal floating point unit |
US5878266A (en) * | 1995-09-26 | 1999-03-02 | Advanced Micro Devices, Inc. | Reservation station for a floating point processing unit |
US5996065A (en) * | 1997-03-31 | 1999-11-30 | Intel Corporation | Apparatus for bypassing intermediate results from a pipelined floating point unit to multiple successive instructions |
US20020095451A1 (en) * | 2001-01-18 | 2002-07-18 | International Business Machines Corporation | Floating point unit for multiple data architectures |
US20060179100A1 (en) * | 2005-02-09 | 2006-08-10 | International Business Machines Corporation | System and method for performing floating point store folding |
US20070226288A1 (en) * | 2006-03-23 | 2007-09-27 | Fujitsu Limited | Processing method and computer system for summation of floating point data |
US20110060785A1 (en) * | 2009-09-09 | 2011-03-10 | Via Technologies, Inc. | Fast floating point result forwarding using non-architected data format |
US8375078B2 (en) * | 2009-09-09 | 2013-02-12 | Via Technologies, Inc. | Fast floating point result forwarding using non-architected data format |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110060785A1 (en) * | 2009-09-09 | 2011-03-10 | Via Technologies, Inc. | Fast floating point result forwarding using non-architected data format |
US8375078B2 (en) | 2009-09-09 | 2013-02-12 | Via Technologies, Inc. | Fast floating point result forwarding using non-architected data format |
US20150309799A1 (en) * | 2014-04-25 | 2015-10-29 | Broadcom Corporation | Stunt box |
US10713049B2 (en) * | 2014-04-25 | 2020-07-14 | Avago Technologies International Sales Pte. Limited | Stunt box to broadcast and store results until retirement for an out-of-order processor |
Also Published As
Publication number | Publication date |
---|---|
TWI450191B (en) | 2014-08-21 |
TW201110019A (en) | 2011-03-16 |
US20110060785A1 (en) | 2011-03-10 |
US8375078B2 (en) | 2013-02-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment | Owner name: VIA TECHNOLOGIES, INC., TAIWAN. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: HENRY, G. GLENN; PARKS, TERRY; REEL/FRAME: 024796/0881. Effective date: 20100729 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |