WO2001080483A2 - Authentication engine architecture and method - Google Patents

Publication number
WO2001080483A2
Authority
WO
WIPO (PCT)
Prior art keywords
hash
round
authentication
sha1
data
Prior art date
Application number
PCT/US2001/040507
Other languages
French (fr)
Other versions
WO2001080483A3 (en)
Inventor
Mark Buer
Patrick Y. Law
Zheng Qi
Original Assignee
Broadcom Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Broadcom Corporation
Priority to AT01927441T (ATE304759T1)
Priority to DE60113395T (DE60113395T2)
Priority to EP01927441A (EP1273129B1)
Priority to AU2001253888A (AU2001253888A1)
Publication of WO2001080483A2
Publication of WO2001080483A3

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols; the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0643: Hash functions, e.g. MD5, SHA, HMAC or f9 MAC
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00: Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/12: Details relating to cryptographic hardware or logic circuitry
    • H04L2209/125: Parallelization or pipelining, e.g. for accelerating processing of cryptographic operations

Abstract

Provided is an architecture (hardware implementation) for an authentication engine to increase the speed at which multi-loop and/or multi-round authentication algorithms may be performed on data packets transmitted over a computer network. Authentication engines in accordance with the present invention apply a variety of techniques that may include, in various applications, collapsing two multi-round authentication algorithm (e.g., SHA1 or MD5 or variants) processing rounds into one; reducing operational overhead by scheduling the additions required by a multi-round authentication algorithm in such a manner as to reduce the overall critical timing path ("hiding the adds"); and, for a multi-loop (e.g., HMAC) variant of a multi-round authentication algorithm, pipelining the inner and outer loops. In one particular example of applying the invention in an authentication engine using the HMAC-SHA1 algorithm of the IPSec protocol, collapsing of the conventional 80 SHA1 rounds into 40 rounds, hiding the adds, and pipelining the inner and outer loops allows HMAC-SHA1 to be conducted in approximately the same time as conventional SHA1.

Description

PATENT APPLICATION
AUTHENTICATION ENGINE ARCHITECTURE AND METHOD
BACKGROUND OF THE INVENTION
The present invention relates generally to the field of cryptography, and more
specifically to an architecture and method for cryptography acceleration. In particular,
the invention is directed to a hardware implementation to increase the speed at which authentication procedures may be performed on data packets transmitted over a computer network.
Many methods to perform cryptography are well known in the art and are
discussed, for example, in Applied Cryptography, Bruce Schneier, John Wiley &
Sons, Inc. (1996, 2nd Edition), herein incorporated by reference. In order to improve the speed of cryptography processing, specialized cryptography accelerator chips have
been developed. Cryptography accelerator chips may be included in routers or
gateways, for example, in order to provide automatic IP packet encryption/decryption.
By embedding cryptography functionality in network hardware, both system performance and data security are enhanced.
Cryptography protocols typically incorporate both encryption/decryption and
authentication functionalities. Encryption/decryption relates to enciphering and deciphering data; authentication is concerned with data integrity, including confirming
the identity of the transmitting party and ensuring that a data packet has not been tampered with en route to the recipient. It is known that by incorporating both encryption and authentication functionalities in a single accelerator chip, overall
system performance can be enhanced.
Examples of cryptography protocols which incorporate encryption/decryption
and authentication functionalities include SSL (Netscape Communications
Corporation), commonly used in electronic commerce transactions, and the more
recently promulgated industry security standard known as "IPSec." These protocols
and their associated algorithms are well known in the cryptography art and are described in detail in National Institute of Standards and Technology (NIST), IETF
and other specifications, some of which are identified (for example, by IETF RFC#)
below for convenience. These specifications are incorporated herein by reference for
all purposes.
SSL (v3) uses a variant of HMAC (RFC2104) for authentication. The underlying hash algorithm can be either MD5 (RFC 1321) or SHA1 (NIST). In
addition, the key generation algorithm in SSL also relies on a sequence of MD5 and
SHA1 operations. SSL deploys algorithms such as RC4, DES, and triple DES for encryption/decryption operations.
The IP layer security standard protocol, IPSec (RFC2406), specifies two
standard algorithms for performing authentication operations, HMAC-MD5-96
(RFC2403) and HMAC-SHA1-96 (RFC2404). These algorithms are based on the underlying MD5 and SHA1 algorithms, respectively. The goal of the authentication computation is to generate a unique digital representation, called a digest, for the input
data. Both MD5 and SHA1 specify that data is to be processed in 512-bit blocks. If the data in a packet to be processed is not a multiple of 512 bits, padding is applied
to round up the data length to a multiple of 512 bits. Thus, if a data packet that is
received by a chip for authentication is larger than 512 bits, the packet is broken into 512-bit data blocks for authentication processing. If the packet is not a multiple of 512 bits, the data left over following splitting of the packet into complete 512-bit
blocks must be padded in order to reach the 512-bit block processing size. The same
is true if a packet contains fewer than 512 bits of data. For reference, a typical
Ethernet packet is up to 1,500 bytes. When such a packet is split into 512-bit blocks, only the last block is padded, so that overall a relatively small percentage of padding overhead is required. However, for shorter packets, the padding
overhead can be much higher. For example, if a packet has just over 512 bits, it will
need to be divided into two 512-bit blocks, the second of which is mostly padding, so that padding overhead approaches 50% of the processed data. The authentication of
such short data packets is particularly burdensome and time consuming using the
conventionally implemented MD5 and SHA1 authentication algorithms.
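The padding arithmetic described above can be sketched in a few lines of Python. This is an illustration only, not part of the application: the function names are hypothetical, and the 65-bit minimum padding (a single 1 bit plus a 64-bit length field) comes from the MD5 and SHA1 specifications rather than from the text above.

```python
def padded_blocks(data_bits: int) -> int:
    """Number of 512-bit blocks processed after MD5/SHA1 padding.

    Assumption taken from the MD5/SHA1 specifications: padding appends
    one 1 bit, zero bits as needed, and a 64-bit length field, so at
    least 65 bits are always added.
    """
    return (data_bits + 1 + 64 + 511) // 512


def padding_overhead(data_bits: int) -> float:
    """Fraction of the processed bits that is padding rather than data."""
    total_bits = padded_blocks(data_bits) * 512
    return (total_bits - data_bits) / total_bits


for label, bits in [("1,500-byte packet", 1500 * 8), ("just over 512 bits", 520)]:
    print(f"{label}: {padded_blocks(bits)} blocks, "
          f"{padding_overhead(bits):.1%} padding")
```

Under these assumptions the long packet spends only a few percent of its processed bits on padding, while the 520-bit packet spends close to half.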
For each 512-bit data block, a set of operations including non-linear functions, shift functions and additions, called a "round," is applied to the block repeatedly.
MD5 and SHA1 specify 64 rounds and 80 rounds, respectively, based on different
non-linear and shift functions, as well as different operating sequences. In every
round, the operation starts with certain hash states (referred to as "context") held by hash state registers (in hardware) or variables (in software), and ends with a new set of
hash states (i.e., an initial "set" of hash states and an end set; a "set" consists of 4 or 5
states, matching the number of registers used by MD5 and SHA1, respectively). MD5 and SHA1 each specify a set of constants as the initial hash states for the first 512-bit block. The
following blocks use initial hash states resulting from additions of the initial hash
states and the ending hash states of the previous blocks.
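For readers who want to see the round structure concretely, the following is a minimal software sketch of SHA1 written from the public SHA1 specification, not from the hardware described in this application. The names `compress`, `f_and_k` and `sha1` are illustrative. It shows the 80 rounds applied to each 512-bit block and the addition of the ending hash states to the initial hash states that seeds the next block.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
# Constant initial hash states for the first 512-bit block (SHA1 specification).
INITIAL_STATE = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def rotl(x: int, n: int) -> int:
    """32-bit circular left shift."""
    return ((x << n) | (x >> (32 - n))) & MASK


def f_and_k(t: int, b: int, c: int, d: int):
    """Non-linear function value and round constant for round t."""
    if t < 20:
        return (b & c) | (~b & d), 0x5A827999
    if t < 40:
        return b ^ c ^ d, 0x6ED9EBA1
    if t < 60:
        return (b & c) | (b & d) | (c & d), 0x8F1BBCDC
    return b ^ c ^ d, 0xCA62C1D6


def compress(state, block: bytes):
    """Apply the 80 SHA1 rounds to one 512-bit block."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        f, k = f_and_k(t, b, c, d)
        a, b, c, d, e = (rotl(a, 5) + f + e + w[t] + k) & MASK, a, rotl(b, 30), c, d
    # Ending hash states are added to the initial hash states of this block.
    return tuple((s + v) & MASK for s, v in zip(state, (a, b, c, d, e)))


def sha1(message: bytes) -> bytes:
    """Pad the message, split it into 512-bit blocks and chain the states."""
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    state = INITIAL_STATE
    for i in range(0, len(padded), 64):
        state = compress(state, padded[i:i + 64])
    return struct.pack(">5I", *state)


print(sha1(b"abc") == hashlib.sha1(b"abc").digest())
```

The final line compares the sketch against Python's `hashlib` as a sanity check.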
Typically, MD5 and SHA1 rounds are translated into clock cycles in hardware
implementations. The addition of the hash states, to the extent that it cannot be
performed in parallel with other round operations, requires overhead clock cycles in
the whole computation. The computation of the padded portion of the data is also generally considered performance overhead because it is not part of the true data. Accordingly, the performance of MD5 and SHA1 degrades the most when the length of
the padding is about the same as the length of the data (e.g., as described above, when
a packet has just fewer than 512 bits of data and the padding logic requires an extra
512-bit block to be added for holding the pad values).
Moreover, the HMAC-MD5-96 and HMAC-SHA1-96 algorithms used in
IPSec expand MD5 and SHA1, respectively, by performing two loops of operations.
The HMAC algorithm for either MD5 or SHA1 (HMAC-x algorithm) is depicted in Fig. 1. The inner hash (inner loop) and the outer hash (outer loop) use different initial hash states. The outer hash is used to compute a digest based on the result of the inner
hash. Since the result of the inner hash is 128 bits long for MD5 and 160 bits long for
SHA1, the result must always be padded up to 512 bits and the outer hash only
processes the one 512-bit block of data. HMAC-MD5-96 and HMAC-SHA1-96
provide a higher level of security; however, additional time is needed to perform the outer hash operation. This additional time becomes significant when the length of the data to be processed is short, in which case the time required to perform the outer
hash operation is comparable to the time required to perform the inner hash operation.
Authentication represents a significant proportion of the time required to complete cryptography operations in the application of cryptography protocols
incorporating both encryption/decryption and MD5 and/or SHA1 authentication
functionalities. In the case of IPSec, authentication is often the time limiting step, particularly for the processing of short packets, and thus creates a data processing
bottleneck. Accordingly, techniques to accelerate authentication and relieve this
bottleneck would be desirable. Further, accelerated implementations of multi-round authentication algorithms would benefit any application of these authentication
algorithms.
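The two-loop HMAC structure discussed above can be illustrated with Python's standard `hashlib`. This is a software sketch under stated assumptions, not the hardware engine: `precompute_states` and `hmac_96` are hypothetical names, and the 0x36/0x5C pad bytes are taken from RFC 2104. The inner and outer hashes start from different initial states derived once from the key, and the outer hash processes only the short inner digest.

```python
import hashlib
import hmac

BLOCK_BYTES = 64  # 512-bit block size shared by MD5 and SHA1


def precompute_states(key: bytes, hash_name: str = "sha1"):
    """Derive the inner and outer initial hash states from the key once."""
    if len(key) > BLOCK_BYTES:
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(BLOCK_BYTES, b"\x00")
    inner = hashlib.new(hash_name, bytes(k ^ 0x36 for k in key))
    outer = hashlib.new(hash_name, bytes(k ^ 0x5C for k in key))
    return inner, outer


def hmac_96(states, payload: bytes) -> bytes:
    """Inner loop over the payload, outer loop over the inner digest only."""
    inner, outer = (s.copy() for s in states)  # reuse the key states per packet
    inner.update(payload)
    outer.update(inner.digest())  # 128-bit or 160-bit result fits one block
    return outer.digest()[:12]    # truncate to 96 bits


key = b"example-key-not-from-the-source"
states = precompute_states(key)
tag = hmac_96(states, b"short packet")
print(tag == hmac.new(key, b"short packet", hashlib.sha1).digest()[:12])
```

Because the key-dependent states are computed once and copied per packet, the per-packet cost in this sketch is the inner hash over the payload plus a single-block outer hash, which is why the outer hash dominates for short packets.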
SUMMARY OF THE INVENTION
In general, the present invention provides an architecture (hardware
implementation) for an authentication engine to increase the speed at which multi-loop and/or multi-round authentication algorithms may be performed on data packets
transmitted over a computer network. As described in this application, the invention has particular application to the variants of the SHA1 and MD5 authentication
algorithms specified by the IPSec cryptography standard. In accordance with the IPSec standard, the invention may be used in conjunction with data
encryption/decryption architecture and protocols. However, it is also suitable for use
in conjunction with other non-IPSec cryptography algorithms, and for applications in
which encryption/decryption is not conducted (in IPSec or not) and where it is purely
authentication that is accelerated. Among other advantages, an authentication engine in accordance with the present invention provides improved performance with regard to the processing of short data packets.
Authentication engines in accordance with the present invention apply a variety of techniques that may include, in various applications, collapsing two multi-round
authentication algorithm (e.g., SHA1 or MD5 or variants) processing rounds
into one; reducing operational overhead by scheduling the additions required by a multi-round authentication algorithm in such a manner as to reduce the overall critical
timing path ("hiding the adds"); and, for a multi-loop (e.g., HMAC) variant of a multi-round
authentication algorithm, pipelining the inner and outer loops. In one particular
example of applying the invention in an authentication engine using the HMAC-SHA1 algorithm of the IPSec protocol, collapsing of the conventional 80 SHA1 rounds into 40 rounds, hiding the adds, and pipelining the inner and outer loops allows HMAC-SHA1 to be conducted in approximately the same time as conventional
SHA1.
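A rough cycle-count model makes the claimed equivalence easier to follow. The sketch below is a toy model, not a description of the actual engine: it assumes one round per clock cycle, ignores every overhead cycle (state additions, buffer loading), assumes the HMAC initial states are precomputed, and uses hypothetical function names.

```python
def padded_blocks(data_bits: int) -> int:
    """512-bit blocks after padding (1 bit plus 64-bit length, per the SHA1 spec)."""
    return (data_bits + 65 + 511) // 512


def sha1_cycles(data_bits: int, rounds: int = 80) -> int:
    """Cycles for plain SHA1, assuming one round per clock cycle."""
    return padded_blocks(data_bits) * rounds


def hmac_latency(data_bits: int, rounds: int = 80) -> int:
    """Inner hash over every block, then one outer block, back to back."""
    return sha1_cycles(data_bits, rounds) + rounds


def hmac_pipelined_interval(data_bits: int, rounds: int = 80) -> int:
    """Steady-state cycles per packet when the outer hash of one packet
    overlaps the inner hash of the next packet."""
    return max(sha1_cycles(data_bits, rounds), rounds)


short_packet = 320  # bits; fits in one padded block
print("conventional SHA1:       ", sha1_cycles(short_packet))
print("conventional HMAC-SHA1:  ", hmac_latency(short_packet))
print("HMAC-SHA1, 40 rounds:    ", hmac_latency(short_packet, rounds=40))
print("...and pipelined, per pkt:", hmac_pipelined_interval(short_packet, rounds=40))
```

In this simplified model, halving the round count alone brings the single-block HMAC-SHA1 latency down to that of one conventional SHA1 block, and pipelining then removes the outer hash from the per-packet interval.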
In one aspect, the present invention pertains to an authentication engine
architecture for a multi-loop, multi-round authentication algorithm. The architecture
includes a first instantiation of a multi-round authentication algorithm hash round
logic in an inner hash engine, and a second instantiation of a multi-round authentication algorithm hash round logic in an outer hash engine. A dual-frame
payload data input buffer configured for loading one new data block while another
data block is being processed in the inner hash engine, an initial hash state input buffer configured for loading initial hash states to the inner and outer hash engines for concurrent inner hash and outer hash operations, and a dual-ported ROM
configured for concurrent constant lookups for both inner and outer hash engines are
also included. The multi-loop, multi-round authentication algorithm may be HMAC-MD5
or HMAC-SHA1.
In another aspect, the invention pertains to an authentication engine architecture for a multi-round authentication algorithm. The architecture includes a
hash engine configured to implement hash round logic for a multi-round
authentication algorithm. The hash round logic implementation includes at least one addition module having a plurality of carry save adders for computation of partial products, and a carry look-ahead adder for computation and propagation of a final
sum. The multi-round authentication algorithm may be MD5 or SHA1.
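The idea behind pairing carry save adders with a single carry-propagating adder can be modeled in software. The sketch below is a behavioral illustration with hypothetical names, not the gate-level design: each carry save step reduces three 32-bit operands to two without propagating carries, and only the final addition (standing in for the carry look-ahead adder) propagates a carry chain.

```python
import random

MASK = 0xFFFFFFFF


def carry_save_add(x: int, y: int, z: int):
    """3-to-2 reduction: a bitwise sum word and a shifted carry word."""
    partial_sum = x ^ y ^ z
    carry = ((x & y) | (x & z) | (y & z)) << 1
    return partial_sum & MASK, carry & MASK


def add_many(operands) -> int:
    """Reduce the operands with carry save steps, then do one final add."""
    ops = list(operands)
    while len(ops) > 2:
        s, c = carry_save_add(ops.pop(), ops.pop(), ops.pop())
        ops += [s, c]
    return sum(ops) & MASK  # the only carry-propagating addition


values = [random.getrandbits(32) for _ in range(5)]
print(add_many(values) == sum(values) & MASK)
```

With five operands, as in the SHA1 round sum, this sketch performs three carry save steps and a single carry-propagating add, rather than four carry-propagating adds in sequence.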
In another aspect, the invention pertains to an authentication engine architecture for an SHA1 authentication algorithm. The architecture includes at least one hash engine configured to implement hash round logic. The logic implementation
includes five hash state registers, and one critical and four non-critical data paths associated with the five registers. In successive SHA1 rounds, the registers having the critical path alternate.
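The split between one critical and four non-critical data paths follows from the SHA1 round definition itself, which the following sketch (written from the public SHA1 specification, with illustrative names) makes visible: only the new value of register a needs the non-linear function and the additions, while the other four registers receive copies or fixed rotations of existing values.

```python
MASK = 0xFFFFFFFF


def rotl(x: int, n: int) -> int:
    """32-bit circular left shift."""
    return ((x << n) | (x >> (32 - n))) & MASK


def parity(b: int, c: int, d: int) -> int:
    """One of the SHA1 non-linear functions, used here as an example."""
    return b ^ c ^ d


def sha1_round(state, w: int, k: int, f=parity):
    a, b, c, d, e = state
    # Critical path: non-linear function followed by a five-operand addition.
    new_a = (rotl(a, 5) + f(b, c, d) + e + w + k) & MASK
    # Non-critical paths: plain copies and one fixed rotation (wiring only).
    return (new_a, a, rotl(b, 30), c, d)


state = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
print([hex(v) for v in sha1_round(state, w=0x61626380, k=0x6ED9EBA1)])
```

How the hardware alternates which physical register carries the critical path is not shown here; the sketch only illustrates why four of the five updates are cheap.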
In another aspect, the invention pertains to a method of authenticating data
transmitted over a computer network. The method involves receiving a data packet stream, splitting the packet data stream into fixed-size data blocks, and processing the
fixed-size data blocks using a multi-loop, multi-round authentication engine
architecture having a hash engine core with an inner hash engine and an outer hash
engine. The architecture is configured to pipeline the hash operations of the inner hash and outer hash engines, collapse and rearrange multi-round logic to reduce
rounds of hash operations, and implement multi-round logic to schedule addition computations to be conducted in parallel with round operations. The multi-loop,
multi-round authentication algorithm may be HMAC-MD5 or HMAC-SHA1.
In another aspect, the invention pertains to a method of authenticating data transmitted over a computer network. The method involves receiving a data packet
stream, splitting the packet data stream into fixed-size data blocks, and processing the fixed-size data blocks using a multi-round authentication engine architecture. The
architecture implements hash round logic for a multi-round authentication algorithm
configured to schedule addition computations to be conducted in parallel with round
operations. The multi-round authentication algorithm may be MD5 or SHA1.
In still another aspect, the invention pertains to a method of authenticating data transmitted over a computer network using an SHA1 authentication algorithm. The method involves providing five hash state registers, and providing data paths from the five state registers such that four of the five data paths from the registers in any SHA1 round are not timing critical.
These and other features and advantages of the present invention will be
presented in more detail in the following specification of the invention and the accompanying figures which illustrate by way of example the principles of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention will be readily understood by the following detailed
description in conjunction with the accompanying drawings, wherein like reference
numerals designate like structural elements, and in which:
Fig. 1 is a high-level block diagram depicting the HMAC-x algorithm (HMAC for either MD5 or SHA1) implemented in the IPSec standard protocol.
Fig. 2 is a high-level block diagram of an authentication engine architecture in
accordance with one embodiment of the present invention.
Fig. 3 is a time study diagram illustrating the critical path of the conventional
round logic of the SHA1 authentication algorithm.
Fig. 4 is a time study diagram illustrating the critical path of the round logic of the SHA1 authentication algorithm in accordance with one embodiment of the present
invention.
Fig. 5 is a high-level block diagram of an SHA1 hash engine illustrating the major elements of a round logic design in accordance with one embodiment of the present invention.
Fig. 6 is a lower-level block diagram illustrating details of the scheduling of
the additions within the round logic design of Fig. 5.
DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
Reference will now be made in detail to some specific embodiments of the
invention including the best modes contemplated by the inventors for carrying out the
invention. Examples of these specific embodiments are illustrated in the
accompanying drawings. While the invention is described in conjunction with these specific embodiments, it will be understood that it is not intended to limit the
invention to the described embodiments. On the contrary, it is intended to cover
alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims. In the following description, numerous, specific details are set forth in order to provide a thorough
understanding of the present invention. The present invention may be practiced
without some or all of these specific details. In other instances, well known process operations have not been described in detail in order not to unnecessarily obscure the present invention.
In general, the present invention provides an architecture (hardware
implementation) for an authentication engine to increase the speed at which multi-
loop and/or multi-round authentication algorithms may be performed on data packets transmitted over a computer network. Authentication engines in accordance with the present invention apply a variety of techniques that may include, in various
applications, collapsing two multi-round authentication algorithm (e.g., SHA1 or
MD5 or variants) processing rounds into one; reducing operational overhead by scheduling the additions required by a multi-round authentication algorithm (e.g.,
SHA1 or variants) in such a manner as to reduce the overall critical timing path
("hiding the adds"); and, for an HMAC (multi-loop) variant of a multi-round authentication algorithm, pipelining the inner and outer loops. Among other
advantages, an authentication engine in accordance with the present invention
provides improved performance with regard to the processing of short data packets.
In this specification and the appended claims, the singular forms "a," "an," and "the" include plural reference unless the context clearly dictates otherwise. Unless
defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood to one of ordinary skill in the art to which this invention belongs.
The present invention may be implemented in a variety of ways. As described
in this application, the invention has particular application to the variants of the SHA1 and MD5 authentication algorithms specified by the IPSec cryptography standard. In the following description, the invention is discussed primarily in connection with the
IPSec protocol. However, one of skill in the art will recognize that various aspects of
the invention may also be applied to multi-loop and/or multi-round authentication
algorithms generally, whether or not used with IPSec or in conjunction with cryptography operations at all. Further, while the aspects of the present invention described below are used together in a preferred embodiment of the invention, some
aspects may be used independently to accelerate authentication operations. For
example, the pipelining operations are particularly applicable to multi-loop, multi-round authentication algorithms; the round-collapsing operations are particularly applicable to SHA1 and variant authentication algorithms; while the scheduling of the additions may be applied to any multi-round authentication algorithm.
Pipelining Inner and Outer Hash Operations
Fig. 2 is a high-level block diagram of an authentication engine architecture in
accordance with one embodiment of the present invention. The engine architecture
implements a pipelined structure to hide the time required for performing the outer
hash operation when multiple data payloads are fed to the engine continuously. The engine architecture includes a core having two instantiations of the hash round logic;
in this instance, inner and outer hash engines (inner and outer loops) for each of the
MD5 hash round logic and the SHA1 hash round logic supported by the IPSec
protocol. Pipeline control logic ensures that the outer hash operation for one data
payload is performed in parallel with the inner hash operation of the next data payload
in the packet stream fed to the authentication engine. A dual-frame input buffer is used for the inner hash engine, allowing one new 512-bit block to be loaded while
another one is being processed, and the initial hash states are double buffered for
concurrent inner hash and outer hash operations. In addition, dual-ported ROM is
used for concurrent constant lookups by both inner and outer hash engines.
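The benefit of overlapping the outer hash with the next payload's inner hash can be sketched with a toy cycle count. This is only an illustrative model, not a description of the actual control logic: it assumes a hypothetical fixed cost per 512-bit block (40 cycles here, chosen arbitrarily), pre-hashed keys so that the outer hash is exactly one block, and no other overhead. The function names are invented for the example.

```python
def sequential_cycles(blocks_per_packet, cycles_per_block):
    """One hash engine: inner blocks, then one outer block, packet by packet."""
    return sum((n + 1) * cycles_per_block for n in blocks_per_packet)


def pipelined_cycles(blocks_per_packet, cycles_per_block):
    """Two engines: the outer hash of one packet overlaps the next inner hash."""
    inner_free = 0  # cycle at which the inner engine is next idle
    outer_free = 0  # cycle at which the outer engine is next idle
    for n in blocks_per_packet:
        inner_done = inner_free + n * cycles_per_block
        inner_free = inner_done
        # The outer hash needs the inner digest and an idle outer engine.
        outer_start = max(inner_done, outer_free)
        outer_free = outer_start + cycles_per_block
    return outer_free


packets = [1, 1, 1, 1]  # four short packets, one 512-bit block each (assumed)
print(sequential_cycles(packets, 40))  # 320
print(pipelined_cycles(packets, 40))   # 200
```

Under these assumptions the gain is largest for short packets, where the single outer block is a large share of the total work.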
Referring to Fig. 2, the engine 200 includes a dual-frame input data payload buffer 201, in this instance having a left frame 202 and a right frame 204. Input data
payloads received by the engine 200, for example from data packets received off a
network by a chip on which the engine architecture is implemented, are distributed
between the frames 202, 204 of the input data buffer 201 so that one data block may be loaded into the buffer while another one is being processed downstream in the data
flow. Since Fig. 2 illustrates an implementation of the present invention for
processing IPSec packets, the architecture includes hash engines for the MD5 and
SHA1 authentication protocols supported by IPSec. In accordance with the MD5 and
SHA1 protocols, the input data payloads are loaded into the dual frames of the input data buffer 201, split into 512-bit data blocks, padded if necessary (i.e., where the data
block is less than 512 bits) and stored prior to being passed to an inner hash engine for processing. A multiplexer 206 controls the flow of 512-bit data blocks from the frames of the input buffer to an inner hash engine.
Initial hash states are needed on a per-packet basis for the first data block of each
packet. Initial hash states are generated by software based on the authentication key and some default constant states based on the HMAC algorithm (pre-hashed), in accordance with the specifications for these algorithms. This is typically done once per
key. Alternatively, the initial states may be derived from the default constant states
and the authentication key using the same hardware for every packet that requires authentication.
The initial hash states for the inner hash of a given data block are loaded into a
buffer 214 associated with the inner hash engine(s) 210, 212. The initial hash states
for the outer hash of that data block are loaded into the first 215 of a pair of buffers
215, 216 (referred to as an HMAC state buffer) associated with the outer hash engine(s) 220, 222. When the initial hash states are passed to the inner hash engine
for processing of the data block, the outer hash states for that block are loaded into the
second buffer 216, and the inner and outer initial hash states for the next packet to be
processed are loaded into the buffers 214, 215, respectively. In this way, the synchronization of the inner and outer hash states for a given data block is maintained,
and the initial hash states are available for concurrent inner hash and outer hash operations. Further, the double buffering of the hash states allows initial hash states of the second packet to be loaded while the first packet is being processed so that the data processing is continuous from packet to packet, thereby maximizing the efficiency and processing power of the hash engine.
The engine 200 further includes a dual-ported ROM 218. The dual-ported
ROM 218 further facilitates the parallel inner and outer hash operations by allowing for
concurrent constant lookups by both inner and outer hash engines.
The inner hash is conducted on all 512-bit blocks of a given data packet. The
result of the inner hash is 128 bits long for MD5 and 160 bits long for SHA1. The result
is padded up to 512 bits and the outer hash processes the one 512-bit block of data to
compute a digest based on the result of the inner hash. An output buffer 230 stores the digest and outputs it through a multiplexer 232.
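The inner/outer structure and the pre-hashed initial states described above can be sketched in software with Python's standard `hashlib`. This is a minimal sketch, not the hardware data path: the `copy()` of a hash object stands in for a stored initial hash state, the function names `prehash_states` and `hmac_96` are invented for the example, and the 12-byte truncation follows the "-96" suffix of the IPSec variants.

```python
import hashlib
import hmac

BLOCK = 64  # MD5 and SHA1 both process 512-bit (64-byte) blocks


def prehash_states(key, hash_name="sha1"):
    """Hash the padded key once per key; the results act as initial states."""
    if len(key) > BLOCK:
        key = hashlib.new(hash_name, key).digest()
    key = key.ljust(BLOCK, b"\x00")
    inner = hashlib.new(hash_name, bytes(b ^ 0x36 for b in key))
    outer = hashlib.new(hash_name, bytes(b ^ 0x5C for b in key))
    return inner, outer


def hmac_96(inner_init, outer_init, payload):
    """Inner hash over the whole payload, outer hash over the inner digest."""
    inner = inner_init.copy()      # reload the inner initial state
    inner.update(payload)
    outer = outer_init.copy()      # reload the outer initial state
    outer.update(inner.digest())   # 128- or 160-bit result, padded by the hash
    return outer.digest()[:12]     # truncate to 96 bits


key = b"example key (hypothetical)"
inner_state, outer_state = prehash_states(key)
tag = hmac_96(inner_state, outer_state, b"payload bytes")
print(tag == hmac.new(key, b"payload bytes", hashlib.sha1).digest()[:12])
```

Because the two pre-hashed states depend only on the key, they can be computed once and reused for every packet authenticated under that key.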
Collapsing Multi-Round Authentication Algorithm Processing Rounds
Of the two algorithms supported by the IPSec protocol, HMAC-SHA1-96 is
about twenty-five percent slower than HMAC-MD5-96 in terms of the total computation rounds. One way to improve HMAC-SHA1-96 in an IPSec-supporting
hardware implementation is to collapse multiple rounds of logic into a single clock
cycle so that the total number of clocks required for an HMAC-SHA1-96 operation is
reduced. The same approach may be applied to any multi-round authentication algorithm. However, simply collapsing the logic for multiple rounds into a single clock cycle can cause the delay to compute the collapsed logic to increase, thereby
reducing the maximum clock frequency.
Fig. 3 is a time study diagram illustrating the timing critical path of the conventional round logic of the SHA1 authentication algorithm. Registers a, b, c, d and e hold the intermediate hash states between rounds. They are duplicated in this figure to demonstrate the ending points of the logic paths clearly. In the actual design, the paths are fed back to the same set of registers because the round logic is reused 80 times. The "+" symbols identify standard adders implemented as carry look-ahead
adders (CLAs). W_i represents the incoming payload. K_i represents a constant,
obtained from ROM, used in the authentication computations. The figure shows
that the timing critical paths run from registers b, c and d, through the non-linear function (defined by the SHA1 specification) and the adders, and end at register a. Registers b, c, d and e each receive a non-critical input (b receives a, etc.).
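The conventional round described above can be written out as a short software sketch. It follows the published SHA1 round definition rather than any detail of Fig. 3; the names `rotl`, `f` and `sha1_round` are chosen for the example. Only the new value of register a requires additions; the other four registers receive copies or fixed rotations.

```python
MASK = 0xFFFFFFFF
K = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)  # one constant per 20 rounds
H0 = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def rotl(x, n):
    """32-bit circular left shift."""
    return ((x << n) | (x >> (32 - n))) & MASK


def f(t, b, c, d):
    """Non-linear function of SHA1 for round t (0-based)."""
    if t < 20:
        return (b & c) | (~b & d)
    if 40 <= t < 60:
        return (b & c) | (b & d) | (c & d)
    return b ^ c ^ d


def sha1_round(state, t, w):
    """One round: a is recomputed, b..e take non-critical inputs."""
    a, b, c, d, e = state
    new_a = (rotl(a, 5) + f(t, b, c, d) + e + w + K[t // 20]) & MASK
    return (new_a, a, rotl(b, 30), c, d)


# First round of the message "abc" (first message word 0x61626380).
print([f"{x:08X}" for x in sha1_round(H0, 0, 0x61626380)])
# ['0116FC33', '67452301', '7BF36AE2', '98BADCFE', '10325476']
```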
Fig. 4 is a time study diagram illustrating the timing critical path of the collapsed round logic of the SHA1 authentication algorithm in accordance with one embodiment of the present invention. The SHA1 algorithm specifies five registers. As illustrated above, the data paths of four of the five registers in any SHA1 round are not critical (time limiting). In accordance with this invention, in successive SHA1 rounds the registers having the critical path are alternated so that four registers' worth of data may always be passed on to the next round prior to completion of the critical path in the current round. Thus, when two rounds of SHA1 are put together, the critical path computation of the second round is independent of that of the first round, since the receiving register of the critical path of the first round (i.e., register a) is not the driving register of the critical path of the second round (i.e., register e). This approach demonstrates how two SHA1 rounds may be collapsed together while maintaining the same amount of delay for the timing critical path, and how by alternating the critical path from register to register between rounds in this way, the adding operations may be "hidden."
In a preferred embodiment, the eighty rounds of an SHA1 loop are collapsed into forty rounds. As described and illustrated above, the collapsing of rounds is accomplished by having a single set of registers (the preferred embodiment has 5 registers as defined by the IPSec protocol) with two rounds of logic. It is contemplated that the techniques of the invention described herein can also be applied to further collapse the number of SHA1 rounds in an SHA1 loop into twenty or even fewer rounds.
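The collapsing of eighty rounds into forty can be checked functionally in software. The sketch below is an assumption-laden illustration, not the hardware design: `double_round` evaluates two consecutive SHA1 rounds from a single read of the five registers, and the result is compared with Python's `hashlib`. It says nothing about gate delays. It only shows that most operands of the second round come directly from the old register values.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
K = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)
H0 = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK


def f(t, b, c, d):
    if t < 20:
        return (b & c) | (~b & d)
    if 40 <= t < 60:
        return (b & c) | (b & d) | (c & d)
    return b ^ c ^ d


def expand(block):
    """Message schedule: 16 input words expanded to 80."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    return w


def double_round(state, t, w):
    """Rounds t and t+1 (t even) computed from one register read."""
    a, b, c, d, e = state
    # First round: five-operand sum feeding register a.
    s1 = (rotl(a, 5) + f(t, b, c, d) + e + w[t] + K[t // 20]) & MASK
    # Second round: these four operands use only the old register values,
    # so they do not have to wait for s1.
    partial = (f(t + 1, a, rotl(b, 30), c) + d + w[t + 1] + K[(t + 1) // 20]) & MASK
    s2 = (rotl(s1, 5) + partial) & MASK
    return (s2, s1, rotl(a, 30), rotl(b, 30), c)


def sha1_collapsed(message):
    padded = message + b"\x80" + b"\x00" * ((55 - len(message)) % 64)
    padded += struct.pack(">Q", 8 * len(message))
    h = H0
    for off in range(0, len(padded), 64):
        state, w = h, expand(padded[off:off + 64])
        for t in range(0, 80, 2):  # forty collapsed steps instead of eighty
            state = double_round(state, t, w)
        h = tuple((x + y) & MASK for x, y in zip(h, state))
    return struct.pack(">5I", *h)


print(sha1_collapsed(b"abc") == hashlib.sha1(b"abc").digest())
```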
Scheduling the Additions
As described above, both the MD5 and SHA1 algorithms specify that the final hash states of every 512-bit block be added to the initial hash states.
The results are then used as the initial states of the next 512-bit block. In MD5,
values of four pairs of 32-bit registers need to be added and in SHA1, five pairs.
Considering that each 32-bit addition takes one clock cycle, a typical hardware implementation would use four extra cycles in MD5 and five extra cycles in SHA1 to perform these additions if hardware resources are limited.
As noted above with reference to Figs. 3 and 4, in both MD5 and SHA1, only
one state register is re-computed every round. The rest of the state registers use shifted or non-shifted contents from neighboring registers. Thus, the final hash states
are not generated in the final round, but rather in the last four consecutive MD5
rounds or five SHA1 rounds, respectively. The present invention exploits this
observation by providing architecture and logic enabling the scheduling of the additions as early as the final hash state is available, hiding the computation time completely behind the round operations. This is illustrated in the following scheduling tables in which 'Ti' represents one clock cycle and 'rnd i' represents round
operation. The initial hash states are represented by ia, ib, ic, id and ie. Parallel
operations are listed in the same column.
MD5 scheduling table (image imgf000020_0001, not reproduced)
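The observation that the final hash states become available over the last five SHA1 rounds can be illustrated with a small software model. This is a sketch under stated assumptions: the `READY` table is derived from the standard SHA1 register rotation, not from the scheduling tables of this document (whose exact cycle assignment is not reproduced here), and in software the add is simply written right after the round that produces the value, whereas hardware would perform it alongside a later round.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF
K = (0x5A827999, 0x6ED9EBA1, 0x8F1BBCDC, 0xCA62C1D6)
H0 = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)

# 0-based round after which a final state word is known:
# round -> (register index in a..e, rotation still applied to the round output)
READY = {75: (4, 30), 76: (3, 30), 77: (2, 30), 78: (1, 0), 79: (0, 0)}


def rotl(x, n):
    return ((x << n) | (x >> (32 - n))) & MASK


def f(t, b, c, d):
    if t < 20:
        return (b & c) | (~b & d)
    if 40 <= t < 60:
        return (b & c) | (b & d) | (c & d)
    return b ^ c ^ d


def expand(block):
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    return w


def compress_early_adds(init, block):
    """SHA1 compression where each final add is issued as soon as possible."""
    w = expand(block)
    a, b, c, d, e = init
    out = list(init)
    for t in range(80):
        new_a = (rotl(a, 5) + f(t, b, c, d) + e + w[t] + K[t // 20]) & MASK
        a, b, c, d, e = new_a, a, rotl(b, 30), c, d
        if t in READY:
            reg, shift = READY[t]
            # This value only shifts between registers from here on, so the
            # add with the initial state does not need to wait for round 79.
            out[reg] = (init[reg] + rotl(new_a, shift)) & MASK
    return tuple(out)


block = b"abc" + b"\x80" + b"\x00" * 52 + struct.pack(">Q", 24)
digest = struct.pack(">5I", *compress_early_adds(H0, block))
print(digest == hashlib.sha1(b"abc").digest())
```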
In one embodiment of the invention, a plurality of adds with the final hash
states may be accomplished in a single clock cycle. An example is shown in the
"collapsed SHA1" table, in which the five adds are performed in just three clock
cycles T39, T40 and T1 of the next loop. One of skill in the art will recognize that, consistent with the principles of this invention described herein, it is possible to
perform more than two adds in parallel in one clock cycle. Moreover, it should be
noted that, as illustrated in the tables, this aspect of the present invention is applicable to both collapsed and non-collapsed multi-round authentication algorithms. Implementation of this aspect of the present invention in conjunction with a collapsed
multi-round algorithm is particularly advantageous since hiding of adding steps
becomes increasingly important as the number of rounds is decreased. Adds that are not hidden in the manner of this aspect of the present invention would represent an even larger proportion of overhead in a collapsed round implementation than in an
implementation with a higher number of rounds.
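The proportion argument can be made concrete with simple arithmetic. The sketch below assumes one round per clock cycle and one clock cycle per unhidden 32-bit add, as in the discussion above. The round counts (64 for MD5, 80 for SHA1, 40 for collapsed SHA1) are the only inputs, and `add_overhead` is a name invented for the example.

```python
def add_overhead(rounds, add_cycles):
    """Share of cycles per block spent on unhidden final additions."""
    return add_cycles / (rounds + add_cycles)


print(f"MD5, 64 rounds + 4 adds:            {add_overhead(64, 4):.1%}")   # 5.9%
print(f"SHA1, 80 rounds + 5 adds:           {add_overhead(80, 5):.1%}")   # 5.9%
print(f"collapsed SHA1, 40 rounds + 5 adds: {add_overhead(40, 5):.1%}")   # 11.1%
```

Under these assumptions, halving the round count roughly doubles the share of time that unhidden adds would consume.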
Logic Design
Fig. 5 is a high-level block diagram of an SHA1 hash engine illustrating the major elements of a collapsed round logic design in accordance with one embodiment of the present invention, consistent with the timing critical path study of Fig. 4. The design makes use of carry save adders (CSA; delay is equivalent to a 1-bit adder), taking advantage of their capacity to add multiple quantities together. CSAs efficiently add multiple quantities together to generate partial products which are not propagated. Two comprehensive addition modules, add5to1 and add4to1 in the figure, each use several stages of CSA followed by a carry look-ahead (CLA) adder, as illustrated and described in more detail with reference to Fig. 6, below.
The hash engine has five registers, A, B, C, D and E. The initial hash state in register A (a_i) goes through a 5-bit circular shift and is added to the initial hash state in register E (e_i), the payload data (W_i), a constant (K_i), and the result of a function (Ft) of the initial hash states in registers B, C and D by an add5to1 adder module that is built from CSA and CLA adders. The initial hash state in register D (d_i) is added to the payload data (W_i+1), a constant (K_i+1), and the result of a function (Ft) of the initial hash states in registers A, B (which passes through a 30-bit circular shift) and C by an add4to1 adder module that is built from CSA and CLA adders.
The adder modules conclude with a carry look-ahead (CLA) adder. The sum
of each adder module is added by a CLA adder to generate and propagate a final sum for the round which is then fed back into register A for the next round. The most
timing critical input of these two modules needs only to go through the last CLA
stage.
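A carry save adder reduces three operands to a sum word and a carry word without propagating carries, so only the last stage needs a full carry-propagating add. The sketch below models this at the bit level. It is a hedged illustration: the serial chaining of CSA stages is an assumption made for simplicity, and the actual arrangement of stages inside the add5to1 and add4to1 modules is not taken from the figures. The Python function names mirror the module names only for readability.

```python
MASK = 0xFFFFFFFF


def csa(x, y, z):
    """3-to-2 reduction: no carry ripples between bit positions."""
    s = x ^ y ^ z                                    # per-bit sum
    c = (((x & y) | (x & z) | (y & z)) << 1) & MASK  # per-bit carry, shifted up
    return s, c


def add5to1(v0, v1, v2, v3, v4):
    """Five 32-bit operands: three CSA stages, then one propagating add."""
    s, c = csa(v0, v1, v2)
    s, c = csa(s, c, v3)
    s, c = csa(s, c, v4)
    return (s + c) & MASK  # stands in for the final CLA


def add4to1(v0, v1, v2, v3):
    """Four 32-bit operands: two CSA stages, then one propagating add."""
    s, c = csa(v0, v1, v2)
    s, c = csa(s, c, v3)
    return (s + c) & MASK


vals = (0xFFFFFFFF, 0x80000000, 0x12345678, 0x0F0F0F0F, 0x00000001)
print(add5to1(*vals) == sum(vals) & MASK)
print(add4to1(*vals[:4]) == sum(vals[:4]) & MASK)
```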
Fig. 6 is a lower-level block diagram illustrating details of the scheduling of
the additions within the round logic design of Fig. 5. Unrolling two rounds of SHA1
operation will lead to a speed path of:
S = (((a <<< 5) + f(b, c, d) + e + w + k) <<< 5) + f(b, c, d) + e + w + k,
where a, b, c, d, e, w and k are 32-bit quantities and <<< denotes a circular left shift. In accordance with the embodiment
of the present invention depicted in Fig. 5, the operation is done in two steps. Step 1
uses module add5to1 to generate:
S1 = (a <<< 5) + f(b, c, d) + e + w + k.
Step 2 uses module add4to1 and a 32-bit carry look-ahead adder (CLA) to generate:
S = (S1 <<< 5) + f(b, c, d) + e + w + k.
In each step, carry save adders (CSA) are used to perform 3-2 input reduction before the 32-bit CLA is applied. The overall delay is equivalent to two 32-bit CLA delays
plus one 32-bit CSA delay plus the delay of the function f on the most timing critical
path. After all the reductions are completed via CSAs, Step 1 and Step 2 become:
S = ((A + B) <<< 5) + C + D.
Implementations of the invention using this logic design in an authentication
engine using the HMAC-SHA1 algorithm of the IPSec protocol, collapsing of the
conventional 80 SHA1 rounds into 40 rounds, hiding the adds, and pipelining the inner and outer loops have enabled HMAC-SHA1 to be conducted in approximately
the same time as conventional SHA1.
Conclusion
Although the foregoing invention has been described in some detail for
purposes of clarity of understanding, those skilled in the art will appreciate that
various adaptations and modifications of the just-described preferred embodiments
can be configured without departing from the scope and spirit of the invention. For
example, while the present invention has been described primarily in connection with
the IPSec protocol, the principles of the invention may also be applied to multi-round authentication algorithms generally, whether or not used in conjunction with
cryptography operations. Therefore, the described embodiments should be taken as illustrative and not restrictive, and the invention should not be limited to the details
given herein but should be defined by the following claims and their full scope of equivalents.
What is claimed is:

Claims

1. An authentication engine architecture for a multi-loop, multi-round authentication algorithm, comprising:
a first instantiation of a multi-round authentication algorithm hash round logic in an inner hash engine;
a second instantiation of a multi-round authentication algorithm hash round logic in an outer hash engine;
a dual-frame payload data input buffer configured for loading one new data block while another data block is being processed in the inner hash engine;
an initial hash state input buffer configuration for loading initial hash states to the inner and outer hash engines for concurrent inner hash and outer hash operations; and
a dual-ported ROM configured for concurrent constant lookups for both inner and outer hash engines.
2. The authentication engine architecture of claim 1, wherein the multi-loop, multi-round authentication algorithm is HMAC-MD5.
3. The authentication engine architecture of claim 1, wherein the multi-loop, multi-round authentication algorithm is HMAC-SHA1.
4. The authentication engine architecture of claim 1, wherein at least one of the inner and outer hash engines is configured to implement hash round logic including at least one addition module comprising:
a plurality of carry save adders for computation of partial products; and
a carry look-ahead adder for computation and propagation of a final sum.
5. The authentication engine of claim 4, wherein the carry save adders and the carry look-ahead adder are configured such that addition computations are conducted in parallel with round operations.
6. The authentication engine architecture of claim 3, wherein at least one of the inner and outer hash engines is configured to implement hash round logic comprising:
five hash state registers;
one critical and four non-critical data paths associated with the five registers, such that in successive SHA1 rounds, registers having the critical path are alternated.
7. The authentication engine architecture of claim 6, wherein said hash round logic is implemented such that eighty rounds of an SHA1 loop are collapsed into forty rounds.
8. The authentication engine architecture of claim 3, wherein at least one of the inner and outer hash engines is configured to implement hash round logic comprising:
five hash state registers;
a 5-bit circular shifter;
an add5to1 adder module having a plurality of CSAs and a CLA adder;
a 30-bit circular shifter; and
an add4to1 adder module having a plurality of CSAs and a CLA adder.
9. An authentication engine architecture for a multi-round authentication algorithm, comprising:
a hash engine configured to implement hash round logic for a multi-round authentication algorithm, said hash round logic implementation including at least one addition module comprising,
a plurality of carry save adders for computation of partial products, and
a carry look-ahead adder for computation and propagation of a final sum.
10. The authentication engine of claim 9, wherein the carry save adders and the carry look-ahead adder are configured such that addition computations are conducted in parallel with round operations.
11. The authentication engine architecture of claim 9, wherein the multi-round authentication algorithm is MD5.
12. The authentication engine architecture of claim 9, wherein the multi-round authentication algorithm is SHA1.
13. The authentication engine architecture of claim 12, wherein the hash round logic implementation comprises:
five hash state registers;
a 5-bit circular shifter;
an add5to1 adder module having a plurality of CSAs and a CLA adder;
a 30-bit circular shifter; and
an add4to1 adder module having a plurality of CSAs and a CLA adder.
14. An authentication engine architecture for an SHA1 authentication algorithm, comprising:
at least one hash engine configured to implement hash round logic comprising:
five hash state registers;
one critical and four non-critical data paths associated with the five registers, such that in successive SHA1 rounds, registers having the critical path are alternated.
15. The authentication engine architecture of claim 14, wherein said hash round logic is implemented such that eighty rounds of an SHA1 loop are collapsed into forty rounds.
16. A method of authenticating data transmitted over a computer network, comprising:
receiving a data packet stream;
splitting the packet data stream into fixed-size data blocks; and
processing the fixed-size data blocks using a multi-loop, multi-round authentication engine architecture having a hash engine core comprising an inner hash engine and an outer hash engine, said architecture configured to,
pipeline hash operations of said inner hash and outer hash engines,
collapse and rearrange multi-round logic to reduce rounds of hash operations, and
implement multi-round logic to schedule addition computations to be conducted in parallel with round operations.
17. The method of claim 16, wherein said pipelining comprises performance of an outer hash operation for one data payload in parallel with an inner hash operation of a second data payload in a packet stream fed to the authentication engine.
18. The method of claim 17, wherein a dual-frame input buffer is used for the inner hash engine.
19. The method of claim 18, wherein initial hash states for the hash operations are double buffered for concurrent inner hash and outer hash operations.
20. The method of claim 19, wherein concurrent constant lookups are performed from a dual-ported ROM by both inner and outer hash engines.
21. The method of claim 16, wherein the multi-loop, multi-round authentication algorithm is MD5.
22. The method of claim 16, wherein the multi-loop, multi-round authentication algorithm is SHA1.
23. The method of claim 22 wherein said scheduling of additions comprises:
conducting a 5-bit circular shift on data from a first register;
adding an initial hash state in a second register, a first payload data block, a first constant, and the result of a function (Ft) of the initial hash states in third, fourth and fifth additional registers with an add5to1 adder module having a plurality of CSAs and a CLA adder;
conducting a 30-bit circular shift on data from the third additional register; and
adding the initial hash state in the fourth additional register to a second payload block, a second constant, and the result of a function (Ft) of the initial hash states in the first and fifth registers and the shifted hash state of the third register with an add4to1 adder module having a plurality of CSAs and a CLA adder.
24. The method of claim 22, wherein said collapsing and rearranging of the multi-round logic comprises:
providing five hash state registers; and
providing data paths from said five state registers such that four of the five data paths from the registers in any SHA1 round are not timing critical.
25. The method of claim 24, wherein, in successive SHA1 rounds, registers having the critical path are alternated.
26. The method of claim 25, wherein eighty rounds of an SHA1 loop are collapsed into forty rounds.
27. A method of authenticating data transmitted over a computer network, comprising:
receiving a data packet stream;
splitting the packet data stream into fixed-size data blocks; and
processing the fixed-size data blocks using a multi-round authentication engine architecture, said architecture implementing hash round logic for a multi-round authentication algorithm configured to schedule addition computations to be conducted in parallel with round operations.
28. The method of claim 27 wherein said hash round logic comprises:
conducting a 5-bit circular shift on data from a first register;
adding an initial hash state in a second register, a first payload data block, a first constant, and the result of a function (Ft) of the initial hash states in third, fourth and fifth additional registers with an add5to1 adder module having a plurality of CSAs and a CLA adder;
conducting a 30-bit circular shift on data from the third additional register; and
adding the initial hash state in the fourth additional register to a second payload block, a second constant, and the result of a function (Ft) of the initial hash states in the first and fifth registers and the shifted hash state of the third register with an add4to1 adder module having a plurality of CSAs and a CLA adder.
29. A method of authenticating data transmitted over a computer network using an SHA1 authentication algorithm, comprising:
providing five hash state registers; and
providing data paths from said five state registers such that four of the five data paths from the registers in any SHA1 round are not timing critical.
30. The method of claim 29, wherein, in successive SHA1 rounds, registers having the critical path are alternated.
31. The method of claim 30, wherein eighty rounds of an SHA1 loop are collapsed into forty rounds.
PCT/US2001/040507 2000-04-13 2001-04-11 Authentication engine architecture and method WO2001080483A2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
AT01927441T ATE304759T1 (en) 2000-04-13 2001-04-11 METHOD AND ARCHITECTURE FOR AUTHENTICATION
DE60113395T DE60113395T2 (en) 2000-04-13 2001-04-11 METHOD AND ARCHITECTURE FOR AUTHENTICATION
EP01927441A EP1273129B1 (en) 2000-04-13 2001-04-11 Authentication engine architecture and method
AU2001253888A AU2001253888A1 (en) 2000-04-13 2001-04-11 Authentication engine architecture and method

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US19715200P 2000-04-13 2000-04-13
US60/197,152 2000-04-13
US26142501P 2001-01-12 2001-01-12
US60/261,425 2001-01-12
US09/827,882 2001-04-04
US09/827,882 US7177421B2 (en) 2000-04-13 2001-04-04 Authentication engine architecture and method

Publications (2)

Publication Number Publication Date
WO2001080483A2 true WO2001080483A2 (en) 2001-10-25
WO2001080483A3 WO2001080483A3 (en) 2002-04-04

Family

ID=27393706

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2001/040507 WO2001080483A2 (en) 2000-04-13 2001-04-11 Authentication engine architecture and method

Country Status (6)

Country Link
US (2) US7177421B2 (en)
EP (1) EP1273129B1 (en)
AT (1) ATE304759T1 (en)
AU (1) AU2001253888A1 (en)
DE (1) DE60113395T2 (en)
WO (1) WO2001080483A2 (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002056538A2 (en) * 2001-01-12 2002-07-18 Broadcom Corporation Implementation of the shai algorithm
WO2002101978A2 (en) * 2001-06-13 2002-12-19 Corrent Corporation Apparatus and method for a hash processing system using multiple hash storage areas
WO2002101525A2 (en) * 2001-06-13 2002-12-19 Corrent Corporation Apparatus and methods for a hash processing system using integrated message digest and secure hash architectures
US6971006B2 (en) 1999-07-08 2005-11-29 Broadcom Corporation Security chip architecture and implementations for cryptography acceleration
KR100581662B1 (en) 2005-08-31 2006-05-22 주식회사 칩스앤미디어 Common engine for plural hash functions having different algorithms
US7177421B2 (en) 2000-04-13 2007-02-13 Broadcom Corporation Authentication engine architecture and method
US7191341B2 (en) 2002-12-18 2007-03-13 Broadcom Corporation Methods and apparatus for ordering data in a cryptography accelerator
US7266703B2 (en) 2001-06-13 2007-09-04 Itt Manufacturing Enterprises, Inc. Single-pass cryptographic processor and method
US7277542B2 (en) 2000-09-25 2007-10-02 Broadcom Corporation Stream cipher encryption application accelerator and methods thereof
US7403615B2 (en) 2001-08-24 2008-07-22 Broadcom Corporation Methods and apparatus for accelerating ARC4 processing
US7434043B2 (en) 2002-12-18 2008-10-07 Broadcom Corporation Cryptography accelerator data routing unit
CN100449986C (en) * 2003-01-28 2009-01-07 华为技术有限公司 Method for raising operational speed of key-hashing method
US7555121B2 (en) 2000-09-25 2009-06-30 Broadcom Corporation Methods and apparatus for implementing a cryptography engine
US7568110B2 (en) 2002-12-18 2009-07-28 Broadcom Corporation Cryptography accelerator interface decoupling from cryptography processing cores
US7600131B1 (en) 1999-07-08 2009-10-06 Broadcom Corporation Distributed processing in a cryptography acceleration chip
US7861104B2 (en) 2001-08-24 2010-12-28 Broadcom Corporation Methods and apparatus for collapsing interrupts
US8295484B2 (en) 2004-12-21 2012-10-23 Broadcom Corporation System and method for securing data from a remote input device
US9264426B2 (en) 2004-12-20 2016-02-16 Broadcom Corporation System and method for authentication via a proximate device
CN107835071A (en) * 2017-11-03 2018-03-23 中国人民解放军国防科技大学 Method and device for improving operation speed of key-in-hash method
CN112564922A (en) * 2020-12-22 2021-03-26 创元网络技术股份有限公司 Multifunctional integrated high-speed HMAC-SHA1 password recovery method based on mimicry calculation

Families Citing this family (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040064737A1 (en) * 2000-06-19 2004-04-01 Milliken Walter Clark Hash-based systems and methods for detecting and preventing transmission of polymorphic network worms and viruses
US7328349B2 (en) * 2001-12-14 2008-02-05 Bbn Technologies Corp. Hash-based systems and methods for detecting, preventing, and tracing network worms and viruses
US20040073617A1 (en) * 2000-06-19 2004-04-15 Milliken Walter Clark Hash-based systems and methods for detecting and preventing transmission of unwanted e-mail
US7200105B1 (en) 2001-01-12 2007-04-03 Bbn Technologies Corp. Systems and methods for point of ingress traceback of a network attack
US7489779B2 (en) * 2001-03-22 2009-02-10 Qstholdings, Llc Hardware implementation of the secure hash standard
US20020191783A1 (en) * 2001-06-13 2002-12-19 Takahashi Richard J. Method and apparatus for creating a message digest using a multiple round, one-way hash algorithm
US7360076B2 (en) 2001-06-13 2008-04-15 Itt Manufacturing Enterprises, Inc. Security association data cache and structure
TWI230532B (en) * 2002-03-05 2005-04-01 Admtek Inc Pipelined engine for encryption/authentication in IPSEC
US7237262B2 (en) * 2002-07-09 2007-06-26 Itt Manufacturing Enterprises, Inc. System and method for anti-replay processing of a data packet
US20040123123A1 (en) * 2002-12-18 2004-06-24 Buer Mark L. Methods and apparatus for accessing security association information in a cryptography accelerator
US20040123120A1 (en) * 2002-12-18 2004-06-24 Broadcom Corporation Cryptography accelerator input interface data handling
US7181009B1 (en) 2002-12-18 2007-02-20 Cisco Technology, Inc. Generating message digests according to multiple hashing procedures
US8041957B2 (en) * 2003-04-08 2011-10-18 Qualcomm Incorporated Associating software with hardware using cryptography
US20040268123A1 (en) * 2003-06-27 2004-12-30 Nokia Corporation Security for protocol traversal
US7908484B2 (en) * 2003-08-22 2011-03-15 Nokia Corporation Method of protecting digest authentication and key agreement (AKA) against man-in-the-middle (MITM) attack
US7826614B1 (en) 2003-11-05 2010-11-02 Globalfoundries Inc. Methods and apparatus for passing initialization vector information from software to hardware to perform IPsec encryption operation
US7747020B2 (en) * 2003-12-04 2010-06-29 Intel Corporation Technique for implementing a security algorithm
US7684563B1 (en) * 2003-12-12 2010-03-23 Sun Microsystems, Inc. Apparatus and method for implementing a unified hash algorithm pipeline
CN1969526B (en) * 2004-04-14 2010-10-13 北方电讯网络有限公司 Securing home agent to mobile node communication with HA-MN key
JP4549303B2 (en) * 2005-02-07 2010-09-22 株式会社ソニー・コンピュータエンタテインメント Method and apparatus for providing a message authentication code using a pipeline
US8059551B2 (en) * 2005-02-15 2011-11-15 Raytheon Bbn Technologies Corp. Method for source-spoofed IP packet traceback
US7921303B2 (en) 2005-11-18 2011-04-05 Qualcomm Incorporated Mobile security system and method
US7995584B2 (en) * 2007-07-26 2011-08-09 Hewlett-Packard Development Company, L.P. Method and apparatus for detecting malicious routers from packet payload
US8363827B2 (en) * 2007-12-03 2013-01-29 Intel Corporation Method and apparatus for generic multi-stage nested hash processing
GB0812593D0 (en) * 2008-07-09 2008-08-20 Univ Belfast Data security devices and methods
JP2010128392A (en) * 2008-11-28 2010-06-10 Canon Inc Hash processing apparatus and hash processing method
US20110019814A1 (en) * 2009-07-22 2011-01-27 Joseph Roy Hasting Variable sized hash output generation using a single hash and mixing function
US8514855B1 (en) * 2010-05-04 2013-08-20 Sandia Corporation Extensible packet processing architecture
WO2013095547A1 (en) * 2011-12-22 2013-06-27 Intel Corporation Apparatus and method of execution unit for calculating multiple rounds of a skein hashing algorithm
US8874933B2 (en) * 2012-09-28 2014-10-28 Intel Corporation Instruction set for SHA1 round processing on 128-bit data paths
US10097345B2 (en) * 2015-04-14 2018-10-09 PeerNova, Inc. Secure hash algorithm in digital hardware for cryptographic applications
US11070380B2 (en) 2015-10-02 2021-07-20 Samsung Electronics Co., Ltd. Authentication apparatus based on public key cryptosystem, mobile device having the same and authentication method
US10262164B2 (en) 2016-01-15 2019-04-16 Blockchain Asics Llc Cryptographic ASIC including circuitry-encoded transformation function
US10454670B2 (en) * 2016-06-10 2019-10-22 Cryptography Research, Inc. Memory optimization for nested hash operations
US10372943B1 (en) 2018-03-20 2019-08-06 Blockchain Asics Llc Cryptographic ASIC with combined transformation and one-way functions
US10256974B1 (en) 2018-04-25 2019-04-09 Blockchain Asics Llc Cryptographic ASIC for key hierarchy enforcement
CN111899104B (en) * 2018-11-27 2023-12-01 创新先进技术有限公司 Service execution method and device
US11714620B1 (en) 2022-01-14 2023-08-01 Triad National Security, Llc Decoupling loop dependencies using buffers to enable pipelining of loops

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5870474A (en) * 1995-12-04 1999-02-09 Scientific-Atlanta, Inc. Method and apparatus for providing conditional access in connection-oriented, interactive networks with a multiplicity of service providers
US4041292A (en) * 1975-12-22 1977-08-09 Honeywell Information Systems Inc. High speed binary multiplication system employing a plurality of multiple generator circuits
JPS60140428A (en) * 1983-12-28 1985-07-25 Hitachi Ltd Divider
US4801935A (en) * 1986-11-17 1989-01-31 Computer Security Corporation Apparatus and method for security of electric and electronic devices
WO1991015820A1 (en) * 1990-04-04 1991-10-17 International Business Machines Corporation Early scism alu status determination apparatus
US5297206A (en) * 1992-03-19 1994-03-22 Orton Glenn A Cryptographic method for communication and electronic signatures
US5548544A (en) * 1994-10-14 1996-08-20 Ibm Corporation Method and apparatus for rounding the result of an arithmetic operation
US5936967A (en) * 1994-10-17 1999-08-10 Lucent Technologies, Inc. Multi-channel broadband adaptation processing
US5796836A (en) * 1995-04-17 1998-08-18 Secure Computing Corporation Scalable key agile cryptography
US5943338A (en) * 1996-08-19 1999-08-24 3Com Corporation Redundant ATM interconnect mechanism
US6111858A (en) * 1997-02-18 2000-08-29 Virata Limited Proxy-controlled ATM subnetwork
AUPO799197A0 (en) * 1997-07-15 1997-08-07 Silverbrook Research Pty Ltd Image processing method and apparatus (ART01)
US5940877A (en) * 1997-06-12 1999-08-17 International Business Machines Corporation Cache address generation with and without carry-in
US6216167B1 (en) * 1997-10-31 2001-04-10 Nortel Networks Limited Efficient path based forwarding and multicast forwarding
US6304657B1 (en) * 1999-05-26 2001-10-16 Matsushita Electric Industrial Co., Ltd. Data encryption apparatus using odd number of shift-rotations and method
JP3864675B2 (en) * 2000-03-09 2007-01-10 株式会社日立製作所 Common key encryption device
US7177421B2 (en) 2000-04-13 2007-02-13 Broadcom Corporation Authentication engine architecture and method

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
BELLARE M: "MESSAGE AUTHENTICATION USING HASH FUNCTIONS - THE HMAC CONSTRUCTION" RSA LABORATORIES' CRYPTOBYTES, vol. 2, no. 1, 1996, pages 1-5, XP002184520 *
SCHNEIER B: "APPLIED CRYPTOGRAPHY, SECOND EDITION" 1996 , JOHN WILEY & SONS , NEW YORK US XP002184521 cited in the application page 436, paragraph 18.5 -page 440 page 442, paragraph 18.7 -page 444 *
STALLINGS W: "SHA: THE SECURE HASH ALGORITHM PUTTING MESSAGE DIGESTS TO WORK" DR. DOBBS JOURNAL, REDWOOD CITY, CA, US, 1 April 1994 (1994-04-01), page 32,34 XP000570561 *
TOUCH J D: "PERFORMANCE ANALYSIS OF MD5" COMPUTER COMMUNICATIONS REVIEW, ASSOCIATION FOR COMPUTING MACHINERY. NEW YORK, US, vol. 25, no. 4, 1 October 1995 (1995-10-01), pages 77-86, XP000541653 ISSN: 0146-4833 *

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7996670B1 (en) 1999-07-08 2011-08-09 Broadcom Corporation Classification engine in a cryptography acceleration chip
US7600131B1 (en) 1999-07-08 2009-10-06 Broadcom Corporation Distributed processing in a cryptography acceleration chip
US6971006B2 (en) 1999-07-08 2005-11-29 Broadcom Corporation Security chip architecture and implementations for cryptography acceleration
US7124296B2 (en) 1999-07-08 2006-10-17 Broadcom Corporation Security chip architecture and implementations for cryptography acceleration
US7177421B2 (en) 2000-04-13 2007-02-13 Broadcom Corporation Authentication engine architecture and method
US8000469B2 (en) 2000-04-13 2011-08-16 Broadcom Corporation Authentication engine architecture and method
US7555121B2 (en) 2000-09-25 2009-06-30 Broadcom Corporation Methods and apparatus for implementing a cryptography engine
US7277542B2 (en) 2000-09-25 2007-10-02 Broadcom Corporation Stream cipher encryption application accelerator and methods thereof
US7903813B2 (en) 2000-09-25 2011-03-08 Broadcom Corporation Stream cipher encryption application accelerator and methods thereof
WO2002056538A2 (en) * 2001-01-12 2002-07-18 Broadcom Corporation Implementation of the SHA1 algorithm
WO2002056538A3 (en) * 2001-01-12 2002-12-19 Broadcom Corp Implementation of the SHA1 algorithm
US7299355B2 (en) 2001-01-12 2007-11-20 Broadcom Corporation Fast SHA1 implementation
WO2002101978A3 (en) * 2001-06-13 2003-04-03 Corrent Corp Apparatus and method for a hash processing system using multiple hash storage areas
US7266703B2 (en) 2001-06-13 2007-09-04 Itt Manufacturing Enterprises, Inc. Single-pass cryptographic processor and method
US7249255B2 (en) 2001-06-13 2007-07-24 Corrent Corporation Apparatus and method for a hash processing system using multiple hash storage areas
US7213148B2 (en) 2001-06-13 2007-05-01 Corrent Corporation Apparatus and method for a hash processing system using integrated message digest and secure hash architectures
WO2002101525A3 (en) * 2001-06-13 2003-03-06 Corrent Corp Apparatus and methods for a hash processing system using integrated message digest and secure hash architectures
WO2002101525A2 (en) * 2001-06-13 2002-12-19 Corrent Corporation Apparatus and methods for a hash processing system using integrated message digest and secure hash architectures
WO2002101978A2 (en) * 2001-06-13 2002-12-19 Corrent Corporation Apparatus and method for a hash processing system using multiple hash storage areas
US7861104B2 (en) 2001-08-24 2010-12-28 Broadcom Corporation Methods and apparatus for collapsing interrupts
US7403615B2 (en) 2001-08-24 2008-07-22 Broadcom Corporation Methods and apparatus for accelerating ARC4 processing
US7568110B2 (en) 2002-12-18 2009-07-28 Broadcom Corporation Cryptography accelerator interface decoupling from cryptography processing cores
US7434043B2 (en) 2002-12-18 2008-10-07 Broadcom Corporation Cryptography accelerator data routing unit
US7191341B2 (en) 2002-12-18 2007-03-13 Broadcom Corporation Methods and apparatus for ordering data in a cryptography accelerator
CN100449986C (en) * 2003-01-28 2009-01-07 华为技术有限公司 Method for raising operational speed of key-hashing method
US9264426B2 (en) 2004-12-20 2016-02-16 Broadcom Corporation System and method for authentication via a proximate device
US8295484B2 (en) 2004-12-21 2012-10-23 Broadcom Corporation System and method for securing data from a remote input device
US9288192B2 (en) 2004-12-21 2016-03-15 Broadcom Corporation System and method for securing data from a remote input device
KR100581662B1 (en) 2005-08-31 2006-05-22 주식회사 칩스앤미디어 Common engine for plural hash functions having different algorithms
CN107835071A (en) * 2017-11-03 2018-03-23 中国人民解放军国防科技大学 Method and device for improving operation speed of key-in-hash method
CN107835071B (en) * 2017-11-03 2020-02-21 中国人民解放军国防科技大学 Method and device for improving operation speed of key-in-hash method
CN112564922A (en) * 2020-12-22 2021-03-26 创元网络技术股份有限公司 Multifunctional integrated high-speed HMAC-SHA1 password recovery method based on mimicry calculation

Also Published As

Publication number Publication date
DE60113395D1 (en) 2005-10-20
US8000469B2 (en) 2011-08-16
EP1273129A2 (en) 2003-01-08
US20070110230A1 (en) 2007-05-17
WO2001080483A3 (en) 2002-04-04
AU2001253888A1 (en) 2001-10-30
DE60113395T2 (en) 2006-06-14
EP1273129B1 (en) 2005-09-14
ATE304759T1 (en) 2005-09-15
US7177421B2 (en) 2007-02-13
US20020001384A1 (en) 2002-01-03

Similar Documents

Publication Publication Date Title
US7177421B2 (en) Authentication engine architecture and method
US7299355B2 (en) Fast SHA1 implementation
US20020078342A1 (en) E-commerce security processor alignment logic
EP1271839B1 (en) AES Encryption circuit
US9363078B2 (en) Method and apparatus for hardware-accelerated encryption/decryption
US7249255B2 (en) Apparatus and method for a hash processing system using multiple hash storage areas
US6324286B1 (en) DES cipher processor for full duplex interleaving encryption/decryption service
US6870929B1 (en) High throughput system for encryption and other data operations
US7295671B2 (en) Advanced encryption standard (AES) hardware cryptographic engine
US7213148B2 (en) Apparatus and method for a hash processing system using integrated message digest and secure hash architectures
EP1215842A2 (en) Methods and apparatus for implementing a cryptography engine
EP1215841B1 (en) Methods and apparatus for implementing a cryptography engine
US20020032551A1 (en) Systems and methods for implementing hash algorithms
US7657757B2 (en) Semiconductor device and method utilizing variable mode control with block ciphers
Yang et al. A high speed architecture for galois/counter mode of operation (gcm)
CN112367158A (en) Method for accelerating SM3 algorithm, processor, chip and electronic equipment
EP1215843B1 (en) Methods and apparatus for implementing a cryptography engine
WO2001017152A1 (en) A method for the hardware implementation of the idea cryptographic algorithm - hipcrypto
CN114553424A (en) ZUC-256 stream cipher light-weight hardware system
US7151829B2 (en) System and method for implementing a hash algorithm
JP4395527B2 (en) Information processing device
US20110176673A1 (en) Encrypting apparatus
Lin et al. The Design of a High-Throughput Hardware Architecture for the AES-GCM Algorithm
CN116132018A (en) Method for realizing SHA256 algorithm on P4 programmable switch
SAKURAI SHA

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (PCT application filed before 20040101)
121 EP: the EPO has been informed by WIPO that EP was designated in this application
AK Designated states

Kind code of ref document: A3

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A3

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

WWE WIPO information: entry into national phase

Ref document number: 2001927441

Country of ref document: EP

WWP WIPO information: published in national office

Ref document number: 2001927441

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: JP

WWG WIPO information: grant in national office

Ref document number: 2001927441

Country of ref document: EP