WO2004095234A3 - Efficient multiplication sequence for large integer operands wider than the multiplier hardware - Google Patents

Efficient multiplication sequence for large integer operands wider than the multiplier hardware

Info

Publication number
WO2004095234A3
Authority
WO
WIPO (PCT)
Prior art keywords
operand
multiply
wider
sequence
Prior art date
Application number
PCT/US2004/008715
Other languages
French (fr)
Other versions
WO2004095234A2 (en)
Inventor
Vincent Dupaquis
Laurent Paris
Original Assignee
Atmel Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2003-04-07
Filing date
2004-03-22
Publication date
2005-11-03
Application filed by Atmel Corp
Priority to DE602004023067T (DE602004023067D1)
Priority to EP04759716A (EP1614027B1)
Publication of WO2004095234A2
Publication of WO2004095234A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only
    • G06F7/53 Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
    • G06F7/5324 Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel partitioned, i.e. using repetitively a smaller parallel parallel multiplier or using an array of such smaller multipliers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/3001 Arithmetic instructions
    • G06F9/30014 Arithmetic instructions with variable precision
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/32 Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G06F9/322 Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address
    • G06F9/325 Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address for loops, e.g. loop detection or loop counter
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
    • G06F9/3893 Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units controlled in tandem, e.g. multiplier-accumulator

Abstract

A method of operating a multiplication circuit (21) to perform multiply-accumulate operations on multi-word operands is characterized by an operations sequencer (23) that is programmed to direct the transfer of operand segments between RAM (15) and internal data registers (27; RX, RY, RZ, RR) in a specified sequence. The sequence (e.g., Figs. 5A-5C) processes groups of two adjacent result word-weights (columns), with the multiply cycles within a group proceeding in a zigzag fashion by alternating columns with steadily increasing or decreasing operand segment weights. In multiplier embodiments having additional internal cache registers (C_A0, C_A1, C_B0, C_B1, C_B2), these store frequently used operand segments so they aren't reloaded from memory multiple times. In this case, the sequence within a group need not proceed in a strict zigzag fashion, but can jump to a multiply operation involving at least one operand segment stored in a cache.
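The zigzag ordering described in the abstract can be sketched in software. The following is a minimal illustration only, not the sequence of Figs. 5A-5C (which this record does not reproduce): it assumes a 32-bit word size, models the accumulator as an unbounded Python integer, and uses hypothetical names (`zigzag_order`, `multiply_zigzag`, `to_words`, `from_words`). It shows one plausible reading of the ordering, in which each group covers two adjacent result columns and consecutive multiplies share one operand word.

```python
WORD_BITS = 32                      # assumed multiplier width; not from the source
MASK = (1 << WORD_BITS) - 1


def to_words(x, n):
    """Split a non-negative integer into n little-endian words."""
    return [(x >> (WORD_BITS * i)) & MASK for i in range(n)]


def from_words(words):
    """Reassemble little-endian words into one integer."""
    return sum(w << (WORD_BITS * i) for i, w in enumerate(words))


def zigzag_order(k, n, m):
    """Yield (i, j) for every product in columns k and k+1, alternating columns.

    A indices rise while B indices fall, and consecutive products share one
    operand word, so each multiply after the first needs only one new load.
    """
    for i in range(n):
        for j in (k + 1 - i, k - i):    # column k+1 first, then column k
            if 0 <= j < m:
                yield i, j


def multiply_zigzag(a_words, b_words):
    """Return (product words, operand loads) for A * B."""
    n, m = len(a_words), len(b_words)
    result, acc, loads = [], 0, 0
    for k in range(0, n + m, 2):        # one group = result columns k and k+1
        prev_i = prev_j = None          # registers treated as empty per group
        for i, j in zigzag_order(k, n, m):
            loads += (i != prev_i) + (j != prev_j)
            prev_i, prev_j = i, j
            # Multiply-accumulate; column k+1 products sit one word higher.
            acc += (a_words[i] * b_words[j]) << (WORD_BITS * (i + j - k))
        result.append(acc & MASK)
        result.append((acc >> WORD_BITS) & MASK)
        acc >>= 2 * WORD_BITS           # carry into the next group
    return result[: n + m], loads
```

For two four-word operands, `multiply_zigzag(to_words(a, 4), to_words(b, 4))` performs 16 multiplies and counts 20 operand loads under this model, compared with 32 if both operands were fetched for every multiply. A hardware accumulator would need enough guard bits above two words to absorb the carries that the Python integer hides here.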

Priority Applications (2)

Application Number Priority Date Filing Date Title
DE602004023067T DE602004023067D1 (en) 2003-04-07 2004-03-22 EFFICIENT MULTIPLICATION SEQUENCE FOR LARGE NUMBERS OF OPERANDS WIDER THAN THE MULTIPLIER HARDWARE
EP04759716A EP1614027B1 (en) 2003-04-07 2004-03-22 Efficient multiplication sequence for large integer operands wider than the multiplier hardware

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
FR0304299A FR2853425B1 (en) 2003-04-07 2003-04-07 EFFICIENT MULTIPLICATION SEQUENCE FOR OPERANDS HAVING LARGER WHOLE ENTIRE NUMBERS THAN MULTIPLIER EQUIPMENT
FR03/04299 2003-04-07

Publications (2)

Publication Number Publication Date
WO2004095234A2 WO2004095234A2 (en) 2004-11-04
WO2004095234A3 WO2004095234A3 (en) 2005-11-03

Family

ID=32982290

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2004/008715 WO2004095234A2 (en) 2003-04-07 2004-03-22 Efficient multiplication sequence for large integer operands wider than the multiplier hardware

Country Status (7)

Country Link
US (1) US7392276B2 (en)
EP (1) EP1614027B1 (en)
CN (1) CN100489764C (en)
DE (1) DE602004023067D1 (en)
FR (1) FR2853425B1 (en)
TW (1) TWI338858B (en)
WO (1) WO2004095234A2 (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8538015B2 (en) * 2007-03-28 2013-09-17 Intel Corporation Flexible architecture and instruction for advanced encryption standard (AES)
US8028015B2 (en) * 2007-08-10 2011-09-27 Inside Contactless S.A. Method and system for large number multiplication
CN101271570B (en) * 2008-05-07 2011-08-17 威盛电子股份有限公司 Apparatus and method for large integer multiplication operation
US20110106872A1 (en) * 2008-06-06 2011-05-05 William Hasenplaugh Method and apparatus for providing an area-efficient large unsigned integer multiplier
CN101562594B (en) * 2009-05-25 2011-09-07 哈尔滨工业大学 Phase factor combined circuit based on stream line operation
US8495125B2 (en) * 2009-05-27 2013-07-23 Microchip Technology Incorporated DSP engine with implicit mixed sign operands
EP2365659B1 (en) * 2010-03-01 2017-04-12 Inside Secure Method to test the resistance of an integrated circuit to a side channel attack
EP2761430B1 (en) 2011-09-27 2015-07-29 Technische Universität Graz Multiplication of large operands
CN106371808B (en) * 2015-07-22 2019-07-12 华为技术有限公司 A kind of method and terminal of parallel computation
CN115480730A (en) * 2016-10-20 2022-12-16 英特尔公司 Systems, devices, and methods for fused multiply-add
US11599334B2 (en) * 2020-06-09 2023-03-07 VeriSilicon Microelectronics Enhanced multiply accumulate device for neural networks

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4893268A (en) * 1988-04-15 1990-01-09 Motorola, Inc. Circuit and method for accumulating partial products of a single, double or mixed precision multiplication
US5457804A (en) * 1992-06-10 1995-10-10 Nec Corporation Accumulating multiplication circuit executing a double-precision multiplication at a high speed

Family Cites Families (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4240144A (en) 1979-01-02 1980-12-16 Honeywell Information Systems Inc. Long operand alignment and merge operation
JPH061438B2 (en) * 1984-04-26 1994-01-05 日本電気株式会社 Double precision multiplier
US4809212A (en) * 1985-06-19 1989-02-28 Advanced Micro Devices, Inc. High throughput extended-precision multiplier
US4754421A (en) 1985-09-06 1988-06-28 Texas Instruments Incorporated Multiple precision multiplication device
US4876660A (en) 1987-03-20 1989-10-24 Bipolar Integrated Technology, Inc. Fixed-point multiplier-accumulator architecture
US5121431A (en) 1990-07-02 1992-06-09 Northern Telecom Limited Processor method of multiplying large numbers
US5606677A (en) 1992-11-30 1997-02-25 Texas Instruments Incorporated Packed word pair multiply operation forming output including most significant bits of product and other bits of one input
EP0924601B1 (en) 1993-11-23 2001-09-26 Hewlett-Packard Company, A Delaware Corporation Parallel data processing in a single processor
US5446651A (en) 1993-11-30 1995-08-29 Texas Instruments Incorporated Split multiply operation
US6295599B1 (en) 1995-08-16 2001-09-25 Microunity Systems Engineering System and method for providing a wide operand architecture
US5953241A (en) 1995-08-16 1999-09-14 Microunity Engeering Systems, Inc. Multiplier array processing system with enhanced utilization at lower precision for group multiply and sum instruction
US6385634B1 (en) 1995-08-31 2002-05-07 Intel Corporation Method for performing multiply-add operations on packed data
US5862067A (en) 1995-12-29 1999-01-19 Intel Corporation Method and apparatus for providing high numerical accuracy with packed multiply-add or multiply-subtract operations
DE19637369C2 (en) 1996-09-13 2001-11-15 Micronas Gmbh Digital signal processor with multiplier and method
US5996066A (en) 1996-10-10 1999-11-30 Sun Microsystems, Inc. Partitioned multiply and add/subtract instruction for CPU with integrated graphics functions
US5943250A (en) 1996-10-21 1999-08-24 Samsung Electronics Co., Ltd. Parallel multiplier that supports multiple numbers with different bit lengths
KR100222032B1 (en) * 1996-12-24 1999-10-01 윤종용 Double precision multiplier
US6233597B1 (en) 1997-07-09 2001-05-15 Matsushita Electric Industrial Co., Ltd. Computing apparatus for double-precision multiplication
US6026421A (en) 1997-11-26 2000-02-15 Atmel Corporation Apparatus for multiprecision integer arithmetic
US6202077B1 (en) 1998-02-24 2001-03-13 Motorola, Inc. SIMD data processing extended precision arithmetic operand format
US6055554A (en) 1998-03-04 2000-04-25 Internatinal Business Machines Corporation Floating point binary quad word format multiply instruction unit
US6523055B1 (en) * 1999-01-20 2003-02-18 Lsi Logic Corporation Circuit and method for multiplying and accumulating the sum of two products in a single cycle

Also Published As

Publication number Publication date
FR2853425A1 (en) 2004-10-08
TWI338858B (en) 2011-03-11
FR2853425B1 (en) 2006-01-13
TW200504593A (en) 2005-02-01
EP1614027A2 (en) 2006-01-11
WO2004095234A2 (en) 2004-11-04
US20040199562A1 (en) 2004-10-07
DE602004023067D1 (en) 2009-10-22
US7392276B2 (en) 2008-06-24
CN100489764C (en) 2009-05-20
CN1809805A (en) 2006-07-26
EP1614027B1 (en) 2009-09-09
EP1614027A4 (en) 2006-06-21

Similar Documents

Publication Publication Date Title
EP3513281B1 (en) Vector multiply-add instruction
US5579253A (en) Computer multiply instruction with a subresult selection option
US20060149804A1 (en) Multiply-sum dot product instruction with mask and splat
US8631224B2 (en) SIMD dot product operations with overlapped operands
US20090043836A1 (en) Method and system for large number multiplication
WO2004095234A3 (en) Efficient multiplication sequence for large integer operands wider than the multiplier hardware
US20060224656A1 (en) Methods and apparatus for efficient complex long multiplication and covariance matrix implementation
US20030014457A1 (en) Method and apparatus for vector processing
EP3436928B1 (en) Complex multiply instruction
CN111381880B (en) Processor, medium, and operation method of processor
CN111381939B (en) Register file in a multithreaded processor
WO2007050444A3 (en) Integrated processor array, instruction sequencer and i/o controller
EP3299952B1 (en) Circuit for performing a multiply-and-accumulate operation
US8892615B2 (en) Arithmetic operation circuit and method of converting binary number
US20200310820A1 (en) Apparatus and method for controlling complex multiply-accumulate circuitry
US6813627B2 (en) Method and apparatus for performing integer multiply operations using primitive multi-media operations that operate on smaller operands
US6535901B1 (en) Method and apparatus for generating a fast multiply accumulator
EP3264261B1 (en) Processor and control method of processor
EP3716050B1 (en) Using fuzzy-jbit location of floating-point multiply-accumulate results
EP0933703A2 (en) Method and apparatus for processing program loops
US8055883B2 (en) Pipe scheduling for pipelines based on destination register number
US10846056B2 (en) Configurable SIMD multiplication circuit
EP0809179A2 (en) Digital microprocessor device having variable-delay division hardware
US10664237B2 (en) Apparatus and method for processing reciprocal square root operations
Brunie Towards the basic linear algebra unit: replicating multi-dimensional FPUs to accelerate linear algebra applications

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2004759716

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 20048091607

Country of ref document: CN

WWP Wipo information: published in national office

Ref document number: 2004759716

Country of ref document: EP

DPEN Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed from 20040101)