US20010042210A1 - Cryptographic data processing systems, computer program products, and methods of operating same in which a system memory is used to transfer information between a host processor and an adjunct processor - Google Patents

Info

Publication number
US20010042210A1
Authority
US
United States
Prior art keywords
host processor
processor
cryptographic
computer readable
command
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US09/852,562
Inventor
David Blaker
Raymond Savarda
Michael Hanna
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NBMK ENCRYPTION TECHNOLOGIES Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US09/852,562
Assigned to NETOCTAVE, INC.: ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: CELOTEK CORPORATION
Assigned to NETOCTAVE, INC.: ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HANNA, MICHAEL, BLAKER, DAVID M., SAVARDA, RAYMOND
Publication of US20010042210A1
Assigned to INTERSOUTH PARTNERS V, L.P. AS AGENT FOR THE SUCURED PARITIES PURSUANT TO THE SECURITY AGREEMENT: SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: NETOCTAVE, INC.
Assigned to NETOCTAVE, INC.: TERMINATION OF SECURITY INTEREST. Assignors: INTERSOUTH PARTNERS V, L.P. AS AGENT FOR THE SECURED PARTIES PURSUANT TO THE TERMINATION OF SECURITY INTEREST
Assigned to CYBERGUARD CORPORATION: ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: NETOCTAVE, INC.
Assigned to NBMK ENCRYPTION TECHNOLOGIES, INC.: ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: CYBERGUARD CORPORATION
Legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3877 Concurrent instruction execution, e.g. pipeline, look ahead using a slave processor, e.g. coprocessor
    • G06F9/3879 Concurrent instruction execution, e.g. pipeline, look ahead using a slave processor, e.g. coprocessor for non-native instruction execution, e.g. executing a command; for Java instruction set
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12 Protecting executable software
    • G06F21/121 Restricting unauthorised execution of programs
    • G06F21/123 Restricting unauthorised execution of programs by using dedicated hardware, e.g. dongles, smart cards, cryptographic processors, global positioning systems [GPS] devices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information
    • G06F21/72 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information in cryptographic circuits
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/60 Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers
    • G06F7/72 Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic
    • G06F7/728 Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic using Montgomery reduction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08 Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0861 Generation of secret information including derivation or calculation of cryptographic keys or passwords
    • H04L9/0877 Generation of secret information including derivation or calculation of cryptographic keys or passwords using additional device, e.g. trusted platform module [TPM], smartcard, USB or hardware security module [HSM]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2209/00 Additional information or applications relating to cryptographic mechanisms or cryptographic arrangements for secret or secure communication H04L9/00
    • H04L2209/12 Details relating to cryptographic hardware or logic circuitry
    • H04L2209/125 Parallelization or pipelining, e.g. for accelerating processing of cryptographic operations

Definitions

  • the present invention relates generally to the field of data processing systems, and, more particularly, to cryptographic data processing systems, computer program products, and methods of operating same.
  • conventional cryptographic data processing systems generally use two main methods for issuing a command to a cryptographic accelerator:
  • the first method involves the provision of a command register on the cryptographic accelerator that a host processor uses to issue a single command. Once the cryptographic accelerator completes executing a command, the host processor may issue a new command. After completing a command, the cryptographic accelerator is generally idle until the host processor issues a new command. Unfortunately, the host processor may spend much time interacting directly with the cryptographic accelerator to download data and issue commands. This may reduce the amount of time available to the host processor for attending to other tasks.
  • the second method allows the host processor to download one or more command sequences to the cryptographic accelerator and then to instruct the cryptographic accelerator to execute one or more of the downloaded command sequences.
  • the cryptographic accelerator is generally idle until the host processor issues a new command.
  • the size of the command sequences may be limited based on the amount of memory that may be placed on the cryptographic accelerator.
  • the host processor may spend much time interacting directly with the cryptographic accelerator to download data and issue command sequences. This may reduce the amount of time available to the host processor for attending to other tasks.
  • the size of the operands will always be less than or equal to the register size. As a result, some of the memory in the registers may be wasted. This reduces the number of operands that may be stored on a chip in a given amount of space.
  • if the cryptographic accelerator is redesigned to accommodate larger operands, then each of the registers may need to be modified. More registers may be designed into a cryptographic accelerator; however, adding more memory to a cryptographic accelerator may reduce the amount of other functionality that may be included and/or increase the cost.
  • Cryptographic processors and/or other types of signal processors and integrated circuits may use a hardware-based random number generator.
  • Various conventional methods may be used to retrieve random numbers from an integrated circuit incorporating a random number generator.
  • One method is for the random number generator to provide one or more data registers that a host processor may read to obtain random numbers.
  • the host processor may tell the random number generator to provide more random data before or after retrieving random data from the registers.
  • the random number generator may generate the random data in the background so that random data may be available when needed by the host processor.
  • Another method for obtaining random data is for the host processor to request a sample of random data from the random number generator.
  • the host processor may provide the random number generator with a request that specifies an amount of random data and a location in memory where the random data should be placed.
  • the random number generator may then generate the random data and transfer the random data to the requested location in the background.
  • any buffer management that may be desired is generally performed by the host processor.
  • the bus that connects the host processor with the random number generator may be used inefficiently because single data reads are typically used instead of block reads. If a host processor requests a block of random data, however, then the host processor may initiate the data transfers, and any desired buffer management is generally performed by the host processor. The foregoing operations may be performed in the background and/or a fast host processor may be used; however, a faster host processor may increase system costs.
  • Embodiments of the present invention provide cryptographic data processing systems, computer program products, and methods of operating same in which system memory is used to transfer information between a host processor and an adjunct processor.
  • cryptographic data processing systems comprise a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that comprises a local memory.
  • One or more operands are downloaded into the local memory from the system memory, using, for example, a load command, and one or more operations are performed on at least one of the downloaded operands to generate a result in the local memory.
  • the result that is generated in the local memory is then stored in the system memory, using, for example, a store command.
  • interaction between the host processor and the cryptographic processor may be reduced as the host processor need not consume processing time downloading operands to the cryptographic accelerator processor and/or uploading results from the cryptographic processor to system memory.
  • interaction between the host processor and the cryptographic processor may be further reduced and overall system performance improved by providing a command queue in the system memory, loading a command block into the command queue using the host processor, executing the command block using the cryptographic processor, and notifying the host processor that the command block has been executed.
  • the host processor need not spend time interacting directly with the cryptographic processor.
  • the host processor may be notified that the command block has been executed in various ways.
  • the cryptographic processor may invoke an interrupt to notify the host processor that the command block has been executed.
  • the interrupt may be invoked if the host processor has requested notification via an interrupt in an interrupt field of the command block.
  • the cryptographic processor may update a completion field in the command block to notify the host processor that the command block has been executed.
  • a periodic interrupt may be defined such that when the interrupt occurs the host processor reads the completion fields of any command blocks.
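  • A minimal sketch of the two notification paths described above (an interrupt requested through an interrupt field, and a completion field that the host reads on a periodic interrupt). The struct layout, field names, and function names are hypothetical; the text does not give a concrete command block format at this point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command block: field names and widths are assumptions. */
typedef struct {
    uint32_t opcode;
    uint32_t interrupt_requested; /* set by the host to ask for an interrupt */
    volatile uint32_t completed;  /* set by the cryptographic processor      */
} cmd_block_t;

/* Cryptographic processor side: update the completion field and report
 * whether an interrupt should be raised for this command block. */
static bool cmd_mark_complete(cmd_block_t *cb)
{
    assert(cb != NULL);
    cb->completed = 1u;
    return cb->interrupt_requested != 0u;
}

/* Host side, e.g. run from a periodic interrupt: read the completion
 * fields of the outstanding command blocks and count the finished ones. */
static size_t host_count_completed(const cmd_block_t *blocks, size_t n)
{
    size_t done = 0;
    for (size_t i = 0; i < n; i++) {
        if (blocks[i].completed != 0u) {
            done++;
        }
    }
    return done;
}
```

  • In this sketch the completion field is always written, and the interrupt is an optional extra, so a host that polls never depends on interrupt delivery.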
  • improved utilization of the system memory may be attained by re-using at least a portion of a command block that contains input data to store a result or output that is generated by an adjunct processor, such as the cryptographic processor.
  • a command queue may be provided in the system memory and a command block may be loaded into the command queue using the host processor.
  • the command block comprises an input data field that contains input data.
  • the adjunct processor performs an operation based on the input data to generate a result and this result is stored in the input data field such that at least a portion of the input data is overwritten.
  • the memory reserved for the command block in the system memory may be reduced because additional storage space need not be reserved to store the result of executing the command block either in the command block or elsewhere in the system memory.
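  • The re-use of the input data field can be sketched as follows. The layout (an opcode word, a length word, and six data words) and the XOR "operation" are assumptions chosen only to show the result overwriting the input; they are not the operations of an actual cryptographic processor.

```c
#include <assert.h>
#include <stdint.h>

#define DATA_WORDS 6u

/* Hypothetical command block whose data field is input on entry and
 * result on exit, so no separate result buffer is reserved. */
typedef struct {
    uint32_t opcode;
    uint32_t length;            /* number of valid words in data[] */
    uint32_t data[DATA_WORDS];  /* input data, then the result     */
} inplace_cmd_t;

/* Adjunct processor side: compute the result and store it in the input
 * data field, overwriting the input. XOR with a mask is a placeholder
 * operation, not a real cipher. */
static void adjunct_execute_inplace(inplace_cmd_t *cb, uint32_t mask)
{
    assert(cb->length <= DATA_WORDS);
    for (uint32_t i = 0; i < cb->length; i++) {
        cb->data[i] ^= mask;
    }
}
```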
  • Cryptographic processors and/or other types of signal processors and integrated circuits may use a hardware-based random number generator.
  • random number samples may be provided for use by the host processor while reducing interaction between the host processor and the cryptographic processor.
  • a random number data queue in the system memory may be provided that has a read address and a write address associated therewith.
  • the cryptographic processor loads a random number sample into the random number data queue at the write address and the host processor reads the random number sample beginning at the read address.
  • the host processor need only interact with the cryptographic processor to update the read address and to check the value of the write address when the read address approaches the last value the host processor has for the write address.
  • the cryptographic accelerator processor may manage the buffering of the random number samples, which may conserve processor cycles of the host processor.
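  • A simplified model of the random number data queue, assuming a fixed-size circular buffer of 32-bit samples. The names, the queue length, and the convention of keeping one slot empty are assumptions. The host keeps a cached copy of the write address and re-reads it only when its own read address catches up, which models the reduced interaction described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RN_QUEUE_WORDS 8u

/* Queue in system memory plus the two addresses (modelled as indices). */
typedef struct {
    uint32_t samples[RN_QUEUE_WORDS];
    uint32_t read_addr;   /* advanced by the host                    */
    uint32_t write_addr;  /* advanced by the cryptographic processor */
} rn_queue_t;

/* Host-local bookkeeping. */
typedef struct {
    uint32_t cached_write;   /* last write address the host has seen */
    unsigned pointer_checks; /* how often the write address was read */
} rn_host_t;

/* Cryptographic processor: load one sample at the write address.
 * Returns 0 if the queue is full (one slot is kept empty). */
static int rn_device_put(rn_queue_t *q, uint32_t sample)
{
    uint32_t next = (q->write_addr + 1u) % RN_QUEUE_WORDS;
    if (next == q->read_addr) {
        return 0;
    }
    q->samples[q->write_addr] = sample;
    q->write_addr = next;
    return 1;
}

/* Host: read one sample at the read address. The write address is
 * re-read only when the cached copy says no data is left. */
static int rn_host_get(rn_queue_t *q, rn_host_t *h, uint32_t *out)
{
    assert(out != NULL);
    if (q->read_addr == h->cached_write) {
        h->cached_write = q->write_addr; /* models the costly check */
        h->pointer_checks++;
        if (q->read_addr == h->cached_write) {
            return 0; /* queue really is empty */
        }
    }
    *out = q->samples[q->read_addr];
    q->read_addr = (q->read_addr + 1u) % RN_QUEUE_WORDS;
    return 1;
}
```

  • With three samples queued, the host in this model checks the write address once and then reads all three samples without touching it again.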
  • FIG. 1 is a block diagram that illustrates cryptographic data processing systems, computer program products, and methods of operating same in accordance with embodiments of the present invention.
  • FIG. 2 is a flowchart that illustrates operations of cryptographic data processing systems and computer program products in accordance with embodiments of the present invention.
  • FIGS. 3-5 are block diagrams that illustrate functional execution units of a cryptographic accelerator processor in accordance with embodiments of the present invention.
  • FIG. 6 is a flowchart that illustrates operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention.
  • FIGS. 7-8 are block diagrams that illustrate an encryption/authentication command queue and a public key command queue, respectively, in accordance with embodiments of the present invention.
  • FIGS. 9-11 are flowcharts that illustrate operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention.
  • FIGS. 12A-12D are block diagrams that illustrate command blocks in accordance with embodiments of the present invention.
  • FIG. 13 is a flowchart that illustrates operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention.
  • FIGS. 14A, 14B, and 15 are block diagrams that illustrate command blocks in accordance with further embodiments of the present invention.
  • FIGS. 16 and 17 are flowcharts that illustrate operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention.
  • FIG. 18 is a block diagram that illustrates a random number generator data queue in accordance with embodiments of the present invention.
  • FIG. 19 is a flowchart that illustrates operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention.
  • FIG. 20 is a block diagram that illustrates a command interface for a conventional application specific integrated circuit.
  • FIG. 21 is a block diagram that illustrates parallel command interfaces for an application specific integrated circuit in accordance with embodiments of the present invention.
  • FIG. 22 is a block diagram of a cryptographic accelerator processor in which command interface managers are respectively associated with functional execution units in accordance with embodiments of the present invention.
  • FIG. 23 is a flowchart that illustrates operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention.
  • the present invention may be embodied as methods, data processing systems, and/or computer program products. Accordingly, the present invention may be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.). Furthermore, the present invention may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium for use by or in connection with an instruction execution system.
  • a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • the computer-usable or computer-readable medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. More specific examples (a nonexhaustive list) of the computer-readable medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, and a portable compact disc read-only memory (CD-ROM).
  • the computer-usable or computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
  • an exemplary cryptographic data processing system 12 comprises a cryptographic accelerator processor 14 , a host processor 16 , a cache memory 18 , a system memory 22 , and a system bus controller 24 , such as a north-bridge system controller.
  • the system bus controller 24 couples the host processor 16 to the cache memory 18 and the system memory 22 , and also couples the host processor 16 and the system memory 22 to the cryptographic accelerator processor 14 via a system bus 26 , which may be, for example, a peripheral component interconnect (PCI) bus.
  • the host processor 16 may be, for example, a commercially available or custom microprocessor.
  • the system memory 22 is representative of an overall hierarchy of memory devices containing the software and data used to implement the functionality of the cryptographic data processing system 12 .
  • the system memory 22 may include, but is not limited to, the following types of devices: ROM, PROM, EPROM, EEPROM, flash, SRAM, and DRAM.
  • the cryptographic accelerator processor 14 comprises a random number generator (RNG) execution unit 28 , an encryption/authentication (E/A) execution unit 32 , and a public key (PK) engine execution unit 34 , which are coupled to a local memory 36 via a local bus 38 .
  • the system memory 22 contains a random number (RN) data queue 42 , an E/A command queue 44 , a PK command queue 46 , and data buffer(s) 47 .
  • Although FIG. 1 illustrates an exemplary cryptographic data processing system architecture, the present invention is not limited to such a configuration, but is intended to encompass any configuration capable of carrying out the operations described herein.
  • Computer program code for carrying out operations of embodiments of the cryptographic data processing system 12 may be written in a high-level programming language, such as C or C++, for development convenience. Nevertheless, some modules or routines may be written in assembly language or even micro-code to enhance performance and/or memory usage. It will be further appreciated that the functionality of any or all of the program modules may also be implemented using discrete hardware components, a single application specific integrated circuit (ASIC), or a programmed digital signal processor or microcontroller.
  • These computer program instructions may also be stored in a computer usable or computer-readable memory that may direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer usable or computer-readable memory produce an article of manufacture including instructions that implement the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions that execute on the computer or other programmable apparatus provide steps for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • the host processor 16 loads a command block into one of the command queues 44 and 46 at block 52 .
  • the cryptographic accelerator processor 14 may be notified by the host processor 16 that the command block is available for processing or may periodically access the command queues 44 and/or 46 to determine if a command block is available for processing.
  • the cryptographic accelerator processor 14 downloads the command block from one of the command queues 44 and 46 and executes the command block at block 54 .
  • the host processor 16 is notified at block 56 .
  • the host processor 16 need not spend time interacting directly with the cryptographic accelerator processor 14 (e.g., issuing a command to the cryptographic accelerator processor 14, waiting for that command to complete, and then issuing another command). Instead, the host processor 16 may load commands into command queues 44 and 46, which may then be processed in the background by the cryptographic accelerator processor 14. Moreover, the size and number of command block sequences may be less constrained because system memory is generally more abundant than memory on the cryptographic accelerator processor 14.
  • the RNG execution unit 28 , the E/A execution unit 32 , and the PK engine execution unit 34 may use various registers that facilitate communication with the RN data queue 42 and the command queues 44 and 46 .
  • a control/status register 62 , a RN data queue base address register 64 , a RN data queue size register 66 , and a RN data queue pointer register 68 may be defined for use by the RNG execution unit 28 .
  • the control/status register 62 may include a self-test error field, which may be set if the RNG execution unit 28 generates two successive random number samples that are the same, and/or an error flag field, which may be used to notify the host processor 16 of an error on the system bus 26.
  • the RN data queue base address register 64 may be used to hold the base address of the RN data queue 42 in the system memory 22 . If the RN data queue 42 does not have a fixed size, then the RN data queue size register 66 may be used to hold the size of the RN data queue 42 .
  • the RN data queue pointer register 68 may comprise a read pointer 72 portion and a write pointer 74 portion, which may be used by the RNG execution unit 28 and the host processor 16 as will be discussed in more detail hereinafter.
  • a control/status register 82 may be defined for use by the E/A execution unit 32 .
  • the control/status register 82 may include an interrupt flag field, which may be set if the host processor 16 requests an interrupt upon completion of a command block and/or if execution of a command block fails, and/or an error flag field, which may be used to notify the host processor 16 of an error on the system bus 26.
  • the E/A command queue base address register 84 may be used to hold the base address of the E/A command queue 44 in the system memory 22 .
  • the E/A command queue size register 86 may be used to hold the size of the E/A command queue 44 .
  • the E/A command queue pointer register 88 may comprise a read pointer 92 portion and a write pointer 94 portion, which may be used by the E/A execution unit 32 and the host processor 16 , respectively, as will be discussed in more detail hereinafter.
  • a control/status register 102 may be defined for use by the PK engine execution unit 34 .
  • a PK command queue base address register 104 may be defined for use by the PK engine execution unit 34 .
  • a PK command queue size register 106 may be defined for use by the PK engine execution unit 34 .
  • a PK command queue pointer register 108 may be defined for use by the PK engine execution unit 34 .
  • the control/status register 102 may include an interrupt flag field, which may be set if the host processor 16 requests an interrupt upon completion of a command block and/or if execution of a command block fails, and/or an error flag field, which may be used to notify the host processor 16 of an error on the system bus 26.
  • the PK command queue base address register 104 may be used to hold the base address of the PK command queue 46 in the system memory 22 . If the PK command queue 46 does not have a fixed size, then the PK command queue size register 106 may be used to hold the size of the PK command queue 46 .
  • the PK command queue pointer register 108 may comprise a read pointer 112 portion and a write pointer 114 portion, which may be used by the PK engine execution unit 34 and the host processor 16 , respectively, as will be discussed in more detail hereinafter.
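  • One way to picture a queue pointer register with a read pointer portion and a write pointer portion is a single 32-bit word split into two 16-bit fields. The field widths, their positions, and all function names are assumptions; the text says only that the register may comprise the two portions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: read pointer in bits 31..16, write pointer in bits 15..0. */

/* Narrow a slot number to a 16-bit pointer field. */
static uint16_t to_ptr_field(uint32_t slot)
{
    assert(slot <= 0xFFFFu);
    return (uint16_t)slot;
}

static uint32_t ptr_reg_pack(uint16_t read_ptr, uint16_t write_ptr)
{
    return ((uint32_t)read_ptr << 16) | (uint32_t)write_ptr;
}

static uint16_t ptr_reg_read(uint32_t reg)
{
    return (uint16_t)(reg >> 16);
}

static uint16_t ptr_reg_write(uint32_t reg)
{
    return (uint16_t)(reg & 0xFFFFu);
}

/* Host side: update only the write pointer portion, leaving the read
 * pointer portion (owned by the execution unit) unchanged. */
static uint32_t ptr_reg_set_write(uint32_t reg, uint16_t write_ptr)
{
    return (reg & 0xFFFF0000u) | (uint32_t)write_ptr;
}
```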
  • the host processor 16 writes commands into the command queues 44 and 46 beginning at write address locations stored in the write pointers for the respective command queues (e.g., write pointers 94 and 114 ). Before writing a command block into a command queue, however, the host processor determines at block 122 whether the write address plus the command block size equals the read address stored in the corresponding read pointer 92 or 112 .
  • If the result determined at block 122 is “Yes,” then the host processor 16 postpones loading a new command block into the command queue until the cryptographic accelerator processor 14 has incremented the read address. If, however, the result determined at block 122 is “No,” then the host processor 16 loads a command block into the command queue at block 124 at the write address associated with the command queue and then increments the write address at block 126 by an amount corresponding to the size of the loaded command block.
  • the host processor 16 need not check the current read address every time a new command block is loaded. Instead, the host processor 16 may check the read address when the write address is getting close to the last value the host processor 16 has for the read address. Checking the read address may be expensive in terms of processor cycles consumed. By checking the read address only when the read address is getting close to the write address (e.g., within a predefined threshold), host processor 16 cycles may be conserved.
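  • The host-side steps at blocks 122, 124, and 126, together with the deferred read-address check, can be sketched as below. Addresses are modelled as slot indices in a four-slot circular queue; the type names, the queue length, and the choice to refresh the cached read address only when the queue looks full (a threshold of zero) are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CMD_WORDS   8u
#define QUEUE_SLOTS 4u

typedef struct { uint32_t w[CMD_WORDS]; } cmd_t;

/* Command queue in system memory with its read and write addresses. */
typedef struct {
    cmd_t    slots[QUEUE_SLOTS];
    uint32_t read_slot;   /* advanced by the cryptographic accelerator */
    uint32_t write_slot;  /* advanced by the host                      */
} cmd_queue_t;

/* Host-local copy of the read address. */
typedef struct {
    uint32_t cached_read;
    unsigned read_checks; /* how often the real read address was read */
} host_view_t;

/* Returns 1 if the command block was loaded, 0 if loading is postponed. */
static int host_enqueue(cmd_queue_t *q, host_view_t *h, const cmd_t *cmd)
{
    assert(cmd != NULL);
    uint32_t next = (q->write_slot + 1u) % QUEUE_SLOTS;

    /* Block 122: does write address plus block size equal read address? */
    if (next == h->cached_read) {
        h->cached_read = q->read_slot; /* costly check, done only here */
        h->read_checks++;
        if (next == h->cached_read) {
            return 0;                  /* "Yes": postpone loading */
        }
    }
    q->slots[q->write_slot] = *cmd;    /* block 124: load command block  */
    q->write_slot = next;              /* block 126: increment write addr */
    return 1;
}
```

  • In this model three command blocks fit into the four-slot queue before the host has to look at the real read address at all.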
  • FIGS. 7 and 8 show embodiments of the E/A command queue 44 and the PK command queue 46 , respectively.
  • both the E/A command queue 44 and the PK command queue 46 are configured to hold m command blocks, each of which comprises eight thirty-two-bit words.
  • the host processor 16 has written a single command block into the first command block position (i.e., the “ 0 ” position) and the write address has been incremented to point to the next empty command block slot.
  • the addresses used in FIGS. 7 and 8 are based on command block slot numbers for purposes of illustration.
  • These addresses may be converted into absolute addresses by multiplying the command block slot number by 256 and adding the resulting product to the respective base addresses for the command queues, which are stored in the E/A command queue base address register 84 and the PK command queue base address register 104 , respectively.
  • the test used at block 122 of FIG. 6 to determine whether a new command block may be loaded into a command queue implies that if a command queue may hold up to m command blocks, then only m − 1 command blocks may be stored in the command queue at the same time.
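  • The address conversion and the m − 1 limit can be checked with a few lines of arithmetic. The multiplier of 256 is the one stated in the text (eight 32-bit words are 256 bits); the unit of the resulting address is left open here, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_STRIDE 256u /* multiplier given in the text */

/* Absolute address = base address + slot number * 256. */
static uint64_t slot_to_address(uint64_t base, uint32_t slot)
{
    return base + (uint64_t)slot * SLOT_STRIDE;
}

/* Number of command blocks currently held in a circular queue of m slots,
 * given read and write slot numbers in the range 0..m-1. */
static uint32_t queue_occupancy(uint32_t read_slot, uint32_t write_slot,
                                uint32_t m)
{
    assert(m > 0u && read_slot < m && write_slot < m);
    return (write_slot + m - read_slot) % m;
}
```

  • Because the queue is treated as full when the write slot is one behind the read slot, the largest occupancy this function can report for a queue of m slots is m − 1; equal read and write slots always mean an empty queue.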
  • the cryptographic accelerator processor 14 determines at block 132 whether the write address is equal to the read address. Specifically, the E/A execution unit 32 determines whether the write address is equal to the read address for the E/A command queue 44 and the PK engine execution unit 34 determines whether the write address is equal to the read address for the PK command queue 46. If the result determined at block 132 is “Yes,” then the cryptographic accelerator processor 14 waits until the host processor 16 loads a new command block into the command queue.
  • If, however, the result determined at block 132 is “No,” then the cryptographic accelerator processor 14 downloads the command block at the read address associated with the command queue and executes the command block at block 134.
  • multiple command blocks may be downloaded for execution on the cryptographic accelerator processor 14 at the same time, which may further improve performance.
  • the cryptographic accelerator processor 14 then increments the read address at block 136 by an amount corresponding to the size of the executed command block.
  • the read addresses are set to point to the first command block slot, which has been loaded with a command block by the host processor 16 .
  • the E/A execution unit 32 and the PK engine execution unit 34 may read the command blocks loaded in the E/A command queue 44 and the PK command queue 46 , respectively, with only minimal interaction with the host processor 16 , e.g., maintenance of the read pointers 92 and 112 and the write pointers 94 and 114 .
  • the cryptographic accelerator processor 14 may continue to execute commands located in a circular command queue in system memory until the read address equals the write address for that command queue.
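The consumer side of the queue, in which the accelerator keeps executing until the read address catches up with the write address, might look like the following Python sketch. The function name `drain`, the use of slot indices instead of byte addresses, and the stand-in `execute` callback are assumptions made for illustration only.

```python
def drain(slots, read_slot, write_slot, execute):
    """Accelerator side: execute queued blocks until read == write."""
    num_slots = len(slots)
    results = []
    while read_slot != write_slot:                 # equal pointers mean the queue is empty
        results.append(execute(slots[read_slot]))
        read_slot = (read_slot + 1) % num_slots    # advance past the executed block
    return read_slot, results


# Three queued blocks, no wrap-around.
slots = ["a", "b", "c", None]
new_read, results = drain(slots, read_slot=0, write_slot=3, execute=str.upper)
print(new_read, results)    # 3 ['A', 'B', 'C']

# Two queued blocks that wrap past the end of the queue.
wrapped = drain(["x", None, None, "w"], read_slot=3, write_slot=1, execute=str.upper)
print(wrapped)              # (1, ['W', 'X'])
```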
  • interaction between the host processor 16 and the cryptographic accelerator processor 14 may be further reduced and overall system performance improved by including load and store commands in the cryptographic accelerator processor's command set.
  • a load command loads one or more operands from the system memory 22 (e.g., the data buffer(s) 47 ) to the local memory 36 at block 142 .
  • the cryptographic accelerator processor 14 then performs one or more operations on the operand(s) at block 144 to generate a result that is stored in the local memory 36 .
  • a store command then stores the result in the system memory 22 at block 146 .
  • the host processor 16 need not consume processing time downloading operands to the cryptographic accelerator processor 14 and/or uploading results from the cryptographic accelerator processor 14 into the system memory 22 .
  • At least a portion of the operands downloaded from the system memory 22 may be stored in the local memory 36 .
  • an offset is used that identifies the relative position of the operands and results in the local memory 36 .
  • a cryptographic accelerator processor 14 instruction may indicate that “a” is at offset 0 relative to a base address of the local memory 36 , “b” is at offset 8 relative to the base address of the local memory 36 , and the result “c” should be placed at offset 122 relative to the base address of the local memory 36 .
  • the result generated in the local memory 36 may also be stored in a result field of a command block, which is located in one of the command queues 44 and 46 in the system memory 22 .
  • operands and results may be packed together into the local memory 36 , which may conserve storage space. Because there is no wasted space in storing the operands and results in the local memory 36 , memory utilization may be improved. If the cryptographic accelerator processor 14 needs to be redesigned to handle larger operands, then the local memory 36 may be easier to resize than resizing several registers.
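As an illustration of offset-based operand addressing, the sketch below models the local memory as a byte array and reuses the offsets from the example above ("a" at 0, "b" at 8, "c" at 122). The memory size, the operand lengths, the big-endian encoding, and the helper names `load`, `add`, and `store` are all assumptions chosen for the example, not details from the specification.

```python
LOCAL_MEMORY_SIZE = 256   # hypothetical size in bytes


def load(local, offset, value, length):
    """Model a load command: copy an operand into local memory at an offset."""
    local[offset:offset + length] = value.to_bytes(length, "big")


def add(local, a_off, a_len, b_off, b_len, c_off, c_len):
    """Add two operands held in local memory and write the result at c_off."""
    a = int.from_bytes(local[a_off:a_off + a_len], "big")
    b = int.from_bytes(local[b_off:b_off + b_len], "big")
    local[c_off:c_off + c_len] = (a + b).to_bytes(c_len, "big")


def store(local, offset, length):
    """Model a store command: copy a result out of local memory."""
    return int.from_bytes(local[offset:offset + length], "big")


local_memory = bytearray(LOCAL_MEMORY_SIZE)
load(local_memory, 0, 2**63, 8)          # "a": 8 bytes at offset 0
load(local_memory, 8, 2**100, 13)        # "b": 13 bytes at offset 8, packed right after "a"
add(local_memory, 0, 8, 8, 13, 122, 13)  # "c": result placed at offset 122
print(store(local_memory, 122, 13) == 2**63 + 2**100)   # True
```

Because operands of different lengths sit back to back, no bytes are lost to fixed-width register padding, which is the packing benefit the text describes.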
  • interaction between the host processor 16 and the cryptographic accelerator processor 14 may be further reduced and overall system performance improved by allowing the cryptographic accelerator processor 14 to inform the host processor 16 when command blocks have been executed.
  • the host processor 16 loads a command block into one of the command queues 44 and 46 at block 152 .
  • the command block may include an interrupt field, which may be set by the host processor 16 to turn an interrupt request on or off.
  • the cryptographic accelerator processor 14 downloads the command block from one of the command queues 44 and 46 and executes the command block at block 154 .
  • the cryptographic accelerator processor 14 may optionally store error information in the command block as shown in FIG. 12B at block 156 .
  • the error information may comprise information that is associated with downloading the command block to the cryptographic accelerator processor 14 and/or executing the command block on the cryptographic accelerator processor 14 .
  • the cryptographic accelerator processor 14 invokes an interrupt to notify the host processor 16 that the command block has completed.
  • the cryptographic accelerator processor 14 may update a completion field in the command block as shown in FIG. 12C.
  • a periodic interrupt may be defined that upon each occurrence triggers the host processor 16 to check one or more of the command queues 44 and 46 to determine whether any of the command blocks stored therein have been executed by examining their completion fields.
  • the cryptographic accelerator processor 14 may store the results from executing a command block in the command block as shown in FIG. 12D.
  • the host processor 16 may set a timer when storing a command block into a command queue 44 , 46 . Upon expiration of the timer, the host processor 16 may check to determine whether the command block has been executed.
  • the status of a command block may be determined by the host processor 16 without the need to process an interrupt from the cryptographic accelerator processor 14 .
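The notification options described above (an interrupt field, a completion field, and polling on a periodic interrupt or timer) can be modelled with a short Python sketch. The field names, the opcodes, and the error encoding are hypothetical; the actual command block layouts are those of FIGS. 12A-12D, which are not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class CommandBlock:
    opcode: str
    interrupt: bool = False     # interrupt field, set by the host
    complete: bool = False      # completion field, set by the accelerator
    error: int = 0              # optional error information


def accelerator_execute(block, raise_interrupt):
    """Accelerator side: run the block, mark it complete, optionally interrupt."""
    block.error = 1 if block.opcode == "bad" else 0
    block.complete = True
    if block.interrupt:
        raise_interrupt(block)


def host_poll(queue):
    """Host side: called on a periodic interrupt or on timer expiry."""
    return [b.opcode for b in queue if b.complete]


interrupts = []
queue = [CommandBlock("encrypt"), CommandBlock("sign", interrupt=True), CommandBlock("hash")]
for block in queue[:2]:                      # the third block has not run yet
    accelerator_execute(block, interrupts.append)
print(host_poll(queue))                      # ['encrypt', 'sign']
print([b.opcode for b in interrupts])        # ['sign']
```

Only the block that requested an interrupt generates one; the other completed block is discovered by polling its completion field.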
  • improved utilization of the system memory 22 may be attained by re-using at least a portion of a command block that contains input data to store a result or output that is generated by an adjunct processor, such as the cryptographic accelerator processor 14 , upon executing the command block. It is assumed that the size of the result or output is small enough to fit into the portion of the command block containing the input data that is to be overwritten. In addition, the region of the command block in which the result or output is stored should be selected carefully to ensure that the input data that is overwritten is no longer needed by the host processor 16 after the command block has been executed by the adjunct processor.
  • exemplary operations begin at block 162 where the host processor loads a command block that includes input data into one of the command queues 44 or 46 in the system memory 22 .
  • the command block may include pointers to input data that reside, for example, in the data buffer(s) 47 in the system memory 22 .
  • An adjunct processor such as the cryptographic accelerator processor 14 , may download the command block and perform one or more operations on the input data to generate a result at block 164 . If the command block includes pointers to input data, then the data are separately downloaded to the cryptographic accelerator processor 14 using the input data pointers.
  • the result is then stored in the command block in the system memory 22 at block 166 such that at least a portion of the input data is overwritten.
  • the memory reserved for the command block in the system memory 22 may be reduced because additional storage space need not be reserved to store the result of executing the command block either in the command block or elsewhere in the system memory 22 .
  • FIGS. 14A and 14B show an exemplary command block for decrypting an encrypted packet.
  • a command block is shown that comprises a field that contains a hash key for the encrypted packet and another field that contains input information.
  • the cryptographic accelerator processor 14 downloads the command block of FIG. 14A and performs hash operations using the hash key and input information to generate a hash value.
  • this hash value is then stored in the command block in the system memory 22 by overwriting the input information, which is no longer needed once the hash value has been computed.
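The in-place reuse of the input field can be sketched in Python as follows. The field sizes, the flat byte layout, and the choice of HMAC-SHA1 as the "hash operation" are assumptions made for the example; the text does not name a specific algorithm or layout.

```python
import hashlib
import hmac

HASH_KEY_SIZE = 20   # hypothetical size of the hash key field, in bytes
INPUT_SIZE = 32      # hypothetical size of the input information field, in bytes


def execute_in_place(block):
    """Compute a keyed hash and write it over the start of the input field."""
    key = bytes(block[:HASH_KEY_SIZE])
    info = bytes(block[HASH_KEY_SIZE:HASH_KEY_SIZE + INPUT_SIZE])
    digest = hmac.new(key, info, hashlib.sha1).digest()   # 20 bytes
    if len(digest) > INPUT_SIZE:
        raise ValueError("result does not fit in the input field")
    block[HASH_KEY_SIZE:HASH_KEY_SIZE + len(digest)] = digest
    return len(digest)


key = bytes(range(HASH_KEY_SIZE))
info = b"input information".ljust(INPUT_SIZE, b"\0")
block = bytearray(key + info)
size_before = len(block)
result_len = execute_in_place(block)
print(len(block) == size_before, result_len)   # True 20
```

The command block keeps its original size because the 20-byte result occupies space that previously held input data, and the explicit size check reflects the stated assumption that the result must fit within the overwritten region.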
  • the input information may be one or more pointers to input data stored, for example, in the data buffer(s) 47 in the system memory 22 .
  • the command block may include an input pointer field and/or an output pointer field, which are used to identify the location of the encrypted packet in the system memory 22 and the location where the decrypted packet is to be stored in the system memory 22 .
  • the cryptographic accelerator processor 14 may use the input pointer to download the encrypted packet from the system memory 22 and may then decrypt the encrypted packet using the hash key and input information to generate a hash value as discussed hereinabove.
  • the input information may be one or more pointers to input data stored, for example, in the data buffer(s) 47 in the system memory 22 .
  • the hash value may be attached to the decrypted packet and the decrypted packet with the attached hash value may be stored in the system memory 22 at the address identified by the output pointer field in the command block.
  • Cryptographic processors and/or other types of signal processors and integrated circuits may use a hardware-based random number generator.
  • the cryptographic accelerator processor 14 may include a RNG execution unit 28 that may be used to generate random numbers for use by other execution units of the cryptographic accelerator processor 14 and/or the host processor 16 . Exemplary operations that may be used to reduce interaction between the host processor 16 and the cryptographic accelerator processor 14 and to improve overall system performance will be described hereafter. Referring now to FIG. 16, operations begin at block 172 where the cryptographic accelerator processor 14 loads a random number sample into the RN data queue 42 beginning at the write address stored in the write pointer field 74 of the RN data queue pointer register 68 (see FIG. 3).
  • the host processor 16 reads the random number sample in the RN data queue 42 beginning at the read address stored in the read pointer field 72 of the RN data queue pointer register 68 (see FIG. 3).
  • the host processor 16 need not spend time interacting directly with the cryptographic accelerator processor 14 to request blocks of random data and/or reading random data from, for example, one or more registers on the cryptographic accelerator processor 14 chip.
  • the cryptographic accelerator processor 14 determines at block 182 whether the write address plus the random number sample size equals the read address stored in the read pointer field 72 . If the result determined at block 182 is “Yes,” then the cryptographic processor 14 postpones loading a new random number sample into the RN data queue 42 until the host processor 16 has incremented the read address.
  • the cryptographic processor 14 loads a random number sample into the RN data queue 42 at block 184 at the write address stored in the write pointer field 74 and then increments the write address at block 186 by an amount corresponding to the size of the loaded random number sample.
  • the cryptographic processor 14 may include a register and/or may recognize a command block that may be written to the cryptographic processor 14 that allows the host processor 16 to, for example, provide the cryptographic processor 14 with a random number seed and/or instruct the cryptographic processor 14 to begin generating random numbers.
  • FIG. 18 shows an exemplary embodiment of the RN data queue 42 .
  • the RN data queue 42 is configured to hold 512 random number samples, which each comprise 64 bits.
  • the cryptographic processor 14 has written four random number samples into addresses 1 through 4 and the write address has been incremented to point to the next available address, which is empty or contains data that have already been read by the host processor 16 .
  • the addresses shown in FIG. 18 are based on random number sample units for purposes of illustration. These addresses may be converted into absolute addresses by multiplying the random number sample number by 64 and adding the resulting product to the respective base address for the RN data queue 42 , which is stored in the RN data queue base address register 64 .
  • the test used at block 182 of FIG. 17 to determine whether a new random number sample may be loaded into the RN data queue 42 implies that if the RN data queue 42 may hold up to m random number samples, then only m − 1 random number samples may be stored in the RN data queue 42 at the same time. Thus, if the RN data queue 42 is filled to its capacity, then it may hold 32,704 bits (511 64-bit random number samples), which exceeds the 20,000 bits required by the Federal Information Processing Standard (FIPS) 140-1, Security Requirements for Cryptographic Modules, issued Jan. 11, 1994.
  • the host processor 16 determines whether the write address is equal to the read address. If the result determined at block 192 is “Yes,” then the host processor 16 waits until the cryptographic accelerator processor 14 loads a new random number sample into the RN data queue 42 . If, however, the result determined at block 192 is “No,” then the host processor 16 reads the random number sample at the read address stored in the read pointer field 72 at block 194 . The host processor 16 then increments the read address at block 196 by an amount corresponding to the size of the random number sample. The host processor 16 need not check the current write address every time a new random number sample is read. Instead, the host processor 16 may check the write address when the read address is getting close to the last value the host processor 16 has for the write address.
  • a cryptographic accelerator processor 14 may provide random number samples for use by a host processor 16 with reduced interaction between the host processor 16 and the cryptographic accelerator processor 14 .
  • the host processor 16 need only interact with the cryptographic accelerator processor 14 to update the read address and to check the value of the write address when the read address approaches the last value the host processor 16 has for the write address.
  • the cryptographic accelerator processor 14 may manage the buffering of the random number samples, which may conserve processor cycles of the host processor 16 and may reduce transactions on the system bus 26 , which may improve overall system performance.
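The RN data queue reverses the roles of the command queues: the accelerator writes and the host reads. The Python sketch below models that arrangement with sample indices rather than byte addresses. The class and method names are hypothetical, `os.urandom` merely stands in for the hardware random number generator, and the fullness test is assumed to be the "write plus one sample equals read" check described for block 182.

```python
import os

NUM_SAMPLES = 512       # queue capacity in samples, per the text
SAMPLE_BITS = 64        # size of each random number sample, per the text


class RandomNumberQueue:
    def __init__(self):
        self.samples = [None] * NUM_SAMPLES
        self.read_index = 0     # advanced by the host
        self.write_index = 0    # advanced by the accelerator

    def produce(self):
        """Accelerator side: add one sample unless the queue is full."""
        if (self.write_index + 1) % NUM_SAMPLES == self.read_index:
            return False        # postpone until the host advances read_index
        self.samples[self.write_index] = os.urandom(SAMPLE_BITS // 8)
        self.write_index = (self.write_index + 1) % NUM_SAMPLES
        return True

    def consume(self):
        """Host side: take one sample, or None if the queue is empty."""
        if self.read_index == self.write_index:
            return None
        sample = self.samples[self.read_index]
        self.read_index = (self.read_index + 1) % NUM_SAMPLES
        return sample


rng_queue = RandomNumberQueue()
stored = 0
while rng_queue.produce():
    stored += 1
print(stored, stored * SAMPLE_BITS)   # 511 32704
```

Filling the queue stores 511 samples, or 32,704 bits, which is the figure the text compares against the 20,000-bit FIPS 140-1 requirement.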
  • ASICs such as the ASIC 202 shown in FIG. 20.
  • the ASIC 202 comprises a plurality of functional units 204 , 206 , and 208 , which are configured to perform specific operations.
  • input commands are provided to the ASIC 202 serially and then routed to the appropriate functional unit 204 , 206 , and/or 208 .
  • the outputs and/or results of executing the input commands are provided serially as command outputs from the ASIC 202 .
  • the ASIC 202 typically processes commands sequentially such that a first command must finish before a subsequent command may be processed even if the commands are executed by different functional units.
  • an ASIC 212 includes a plurality of functional units 214 , 216 , and 218 , each of which receives command inputs through its own command interface and generates outputs and/or results that may be communicated to another processor through the command interface.
  • the functional units 214 , 216 , and 218 may operate independently and in parallel, thereby improving the performance of a cryptographic data processing system.
  • the functional units 214 , 216 , and 218 may comprise the E/A execution unit 32 , the RNG execution unit 28 , and the PK engine execution unit 34 .
  • the E/A execution unit 32 comprises a command interface manager 222
  • the RNG execution unit 28 comprises a command interface manager 224
  • the PK engine execution unit 34 comprises a command interface manager 226 .
  • These respective command interface managers 222 , 224 , and 226 may be used to receive input command blocks from the E/A command queue 44 , to transmit random number samples to the RN data queue 42 , and to receive input command blocks from the PK command queue 46 , respectively, and to allow the respective execution units 28 , 32 , and 34 to perform operations in parallel.
  • Operations begin at block 232 where one or more command blocks are provided to each of the functional units, such as, for example, by providing command blocks in the E/A command queue 44 and the PK command queue 46 for the E/A execution unit 32 and the PK engine execution unit 34 , respectively.
  • the command blocks are simultaneously executed by the functional units by accessing the command blocks in parallel through, for example, the command interface manager 222 and the command interface manager 226 , which are associated with the E/A execution unit 32 and the PK engine execution unit 34 , respectively.
  • command blocks may be provided to the cryptographic processor 14 in serial fashion over the system bus 24 . Nevertheless, the cryptographic processor 14 may distribute command blocks to the command interface managers 222 , 224 , and 226 associated with the execution units 32 , 28 , and 34 , which may then process the command blocks in parallel.
  • commands may be provided to the command interface managers in a variety of ways.
  • a processor may write commands directly to the command interface managers or, alternatively, commands may be stored in a memory and the command interface managers may be provided with the addresses where they may retrieve the stored commands for execution.
  • the total number of operations that may be performed may be increased and the average latency for completing operations may be reduced.
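The idea of giving each functional unit its own command interface can be illustrated with a small Python sketch in which each unit drains a private queue on its own thread. This is only an analogy under stated assumptions: the unit names are borrowed from the text, the command strings are invented, and Python threads model independent operation rather than true hardware parallelism.

```python
import queue
import threading


def functional_unit(name, commands, results):
    """Drain this unit's own command interface until a None sentinel arrives."""
    while True:
        command = commands.get()
        if command is None:
            break
        results.append((name, command))   # stand-in for executing the command


interfaces = {"E/A": queue.Queue(), "PK": queue.Queue()}
results = {name: [] for name in interfaces}
threads = [
    threading.Thread(target=functional_unit, args=(name, q, results[name]))
    for name, q in interfaces.items()
]
for t in threads:
    t.start()

# Commands for different units are issued without waiting on one another.
for i in range(3):
    interfaces["E/A"].put(f"encrypt-{i}")
    interfaces["PK"].put(f"modexp-{i}")
for q in interfaces.values():
    q.put(None)
for t in threads:
    t.join()

print(results["E/A"])
print(results["PK"])
```

Each unit preserves the order of its own commands, while a slow command on one interface does not block commands queued on the other.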
  • each block may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the blocks may occur out of the order noted in FIGS. 2, 6, 9 - 11 , 13 , 16 , 17 , 19 , and 23 .
  • two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending on the functionality involved.

Abstract

Embodiments of cryptographic data processing systems, computer program products, and methods of operating same are provided in which system memory is used to transfer information between a host processor and an adjunct processor.

Description

    CROSS-REFERENCE TO PROVISIONAL APPLICATIONS
  • This application claims the benefit of Provisional Application Ser. No. 60/203,409, filed May 11, 2000, entitled Cryptographic Acceleration Methods and Apparatus, and Provisional Application Ser. No. 60/203,465, filed May 11, 2000, entitled Methods and Apparatus for Supplying Random Numbers, the disclosures of which are hereby incorporated herein by reference in their entirety as if set forth fully herein.[0001]
  • BACKGROUND OF THE INVENTION
  • The present invention relates generally to the field of data processing systems, and, more particularly, to cryptographic data processing systems, computer program products, and methods of operating same. [0002]
  • Signal processors and integrated circuit chips have been developed to accelerate cryptographic operations, such as public key operations. Examples of such chips include, but are not limited to, the Hifn 6500 available from Hifn, Inc., the SafeNet ADSP 2141 available from SafeNet, Inc., and the Rainbow Mykotronx FastMAP available from Rainbow Mykotronx, Inc. Despite the availability of cryptographic accelerator products, there remains room for improvement in the art. [0003]
  • For example, conventional cryptographic data processing systems generally use two main methods for issuing a command to a cryptographic accelerator: The first method involves the provision of a command register on the cryptographic accelerator that a host processor uses to issue a single command. Once the cryptographic accelerator completes executing a command, the host processor may issue a new command. After completing a command, the cryptographic accelerator is generally idle until the host processor issues a new command. Unfortunately, the host processor may spend much time interacting directly with the cryptographic accelerator to download data and issue commands. This may reduce the amount of time available to the host processor for attending to other tasks. [0004]
  • The second method allows the host processor to download one or more command sequences to the cryptographic accelerator and then to instruct the cryptographic accelerator to execute one or more of the downloaded command sequences. After completing a command sequence, the cryptographic accelerator is generally idle until the host processor issues a new command. The size of the command sequences may be limited based on the amount of memory that may be placed on the cryptographic accelerator. Like the first method, the host processor may spend much time interacting directly with the cryptographic accelerator to download data and issue command sequences. This may reduce the amount of time available to the host processor for attending to other tasks. [0005]
  • Cryptographic accelerators generally perform operations using one or more operands. These devices may include general-purpose operand storage that comprises fixed length registers to store the operands and results. To execute an instruction, a register number is used to indicate which operand should be used for the operation and where the output should be stored. For example, if the operation were “a+b=c,” then part of the instruction would indicate that “a” is in register 7, “b” is in register 1, and “c” should be put into register 2. [0006]
  • Because the registers are fixed in size and the operands and results are variable in size, the size of the operands will always be less than or equal to the register size. As a result, some of the memory in the registers may be wasted. This reduces the number of operands that may be stored on a chip in a given amount of space. In addition, if the cryptographic accelerator is redesigned to accommodate larger operands, then each of the registers may need to be modified. More registers may be designed into a cryptographic accelerator; however, adding more memory to a cryptographic accelerator may reduce the amount of other functionality that may be included and/or increase the cost. [0007]
  • Cryptographic processors and/or other types of signal processors and integrated circuits may use a hardware-based random number generator. Various conventional methods may be used to retrieve random numbers from an integrated circuit incorporating a random number generator. One method is for the random number generator to provide one or more data registers that a host processor may read to obtain random numbers. The host processor may tell the random number generator to provide more random data before or after retrieving random data from the registers. The random number generator may generate the random data in the background so that random data may be available when needed by the host processor. [0008]
  • Another method for obtaining random data is for the host processor to request a sample of random data from the random number generator. The host processor may provide the random number generator with a request that specifies an amount of random data and a location in memory where the random data should be placed. The random number generator may then generate the random data and transfer the random data to the requested location in the background. [0009]
  • Unfortunately, by providing random data through data registers on the random number generator or other integrated circuit chip, any buffer management that may be desired is generally performed by the host processor. Moreover, the bus that connects the host processor with the random number generator may be used inefficiently because single data reads are typically used instead of block reads. If a host processor requests a block of random data, however, then the host processor may initiate the data transfers and any buffer management that may be desired is generally performed by the host processor. The foregoing operations may be performed in the background and/or a fast host processor may be used; however, a faster host processor may increase system costs. [0010]
  • SUMMARY OF THE INVENTION
  • Embodiments of the present invention provide cryptographic data processing systems, computer program products, and methods of operating same in which system memory is used to transfer information between a host processor and an adjunct processor. For example, in accordance with embodiments of the present invention, cryptographic data processing systems comprise a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that comprises a local memory. One or more operands are downloaded into the local memory from the system memory, using, for example, a load command, and one or more operations are performed on at least one of the downloaded operands to generate a result in the local memory. The result that is generated in the local memory is then stored in the system memory, using, for example, a store command. Advantageously, interaction between the host processor and the cryptographic processor may be reduced as the host processor need not consume processing time downloading operands to the cryptographic accelerator processor and/or uploading results from the cryptographic processor to system memory. [0011]
  • In accordance with further embodiments of the present invention, interaction between the host processor and the cryptographic processor may be further reduced and overall system performance improved by providing a command queue in the system memory, loading a command block into the command queue using the host processor, executing the command block using the cryptographic processor, and notifying the host processor that the command block has been executed. By loading commands into a command queue in the system memory where the cryptographic processor may retrieve them for processing, the host processor need not spend time interacting directly with the cryptographic processor. [0012]
  • In accordance with further embodiments of the present invention, the host processor may be notified that the command block has been executed in various ways. For example, the cryptographic processor may invoke an interrupt to notify the host processor that the command block has been executed. In particular embodiments, the interrupt may be invoked if the host processor has requested notification via an interrupt in an interrupt field of the command block. In other embodiments, the cryptographic processor may update a completion field in the command block to notify the host processor that the command block has been executed. In still other embodiments, a periodic interrupt may be defined such that when the interrupt occurs the host processor reads the completion fields of any command blocks. In accordance with still further embodiments of the present invention, improved utilization of the system memory may be attained by re-using at least a portion of a command block that contains input data to store a result or output that is generated by an adjunct processor, such as the cryptographic processor. For example, a command queue may be provided in the system memory and a command block may be loaded into the command queue using the host processor. The command block comprises an input data field that contains input data. The adjunct processor performs an operation based on the input data to generate a result and this result is stored in the input data field such that at least a portion of the input data is overwritten. Advantageously, the memory reserved for the command block in the system memory may be reduced because additional storage space need not be reserved to store the result of executing the command block either in the command block or elsewhere in the system memory. [0013]
  • Cryptographic processors and/or other types of signal processors and integrated circuits may use a hardware-based random number generator. In accordance with further embodiments of the present invention, random number samples may be provided for use by the host processor while reducing interaction between the host processor and the cryptographic processor. For example, a random number data queue in the system memory may be provided that has a read address and a write address associated therewith. The cryptographic processor loads a random number sample into the random number data queue at the write address and the host processor reads the random number sample beginning at the read address. The host processor need only interact with the cryptographic processor to update the read address and to check the value of the write address when the read address approaches the last value the host processor has for the write address. In addition, the cryptographic accelerator processor may manage the buffering of the random number samples, which may conserve processor cycles of the host processor.[0014]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Other features of the present invention will be more readily understood from the following detailed description of specific embodiments thereof when read in conjunction with the accompanying drawings, in which: [0015]
  • FIG. 1 is a block diagram that illustrates cryptographic data processing systems, computer program products, and methods of operating same in accordance with embodiments of the present invention; [0016]
  • FIG. 2 is a flowchart that illustrates operations of cryptographic data processing systems and computer program products in accordance with embodiments of the present invention; [0017]
  • FIGS. 3-5 are block diagrams that illustrate functional execution units of a cryptographic accelerator processor in accordance with embodiments of the present invention; [0018]
  • FIG. 6 is a flowchart that illustrates operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention; [0019]
  • FIGS. 7-8 are block diagrams that illustrate an encryption/authentication command queue and a public key command queue, respectively, in accordance with embodiments of the present invention; [0020]
  • FIGS. 9-11 are flowcharts that illustrate operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention; [0021]
  • FIGS. 12A-12D are block diagrams that illustrate command blocks in accordance with embodiments of the present invention; [0022]
  • FIG. 13 is a flowchart that illustrates operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention; [0023]
  • FIGS. 14A, 14B, and 15 are block diagrams that illustrate command blocks in accordance with further embodiments of the present invention; [0024]
  • FIGS. 16 and 17 are flowcharts that illustrate operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention; [0025]
  • FIG. 18 is a block diagram that illustrates a random number generator data queue in accordance with embodiments of the present invention; [0026]
  • FIG. 19 is a flowchart that illustrates operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention; [0027]
  • FIG. 20 is a block diagram that illustrates a command interface for a conventional application specific integrated circuit; [0028]
  • FIG. 21 is a block diagram that illustrates parallel command interfaces for an application specific integrated circuit in accordance with embodiments of the present invention; [0029]
  • FIG. 22 is a block diagram of a cryptographic accelerator processor in which command interface managers are respectively associated with functional execution units in accordance with embodiments of the present invention; and [0030]
  • FIG. 23 is a flowchart that illustrates operations of cryptographic data processing systems and computer program products in accordance with further embodiments of the present invention. [0031]
  • DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
  • While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that there is no intent to limit the invention to the particular forms disclosed, but on the contrary, the invention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention as defined by the claims. Like reference numbers signify like elements throughout the description of the figures. It will also be understood that when an element is referred to as being “connected” or “coupled” to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. In contrast, when an element is referred to as being “directly connected” or “directly coupled” to another element, there are no intervening elements present. [0032]
  • The present invention may be embodied as methods, data processing systems, and/or computer program products. Accordingly, the present invention may be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.). Furthermore, the present invention may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium for use by or in connection with an instruction execution system. In the context of this document, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. [0033]
  • The computer-usable or computer-readable medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. More specific examples (a nonexhaustive list) of the computer-readable medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, and a portable compact disc read-only memory (CD-ROM). Note that the computer-usable or computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory. [0034]
  • Referring now to FIG. 1, an exemplary cryptographic data processing system 12, in accordance with embodiments of the present invention, comprises a cryptographic accelerator processor 14, a host processor 16, a cache memory 18, a system memory 22, and a system bus controller 24, such as a north-bridge system controller. The system bus controller 24 couples the host processor 16 to the cache memory 18 and the system memory 22, and also couples the host processor 16 and the system memory 22 to the cryptographic accelerator processor 14 via a system bus 26, which may be, for example, a peripheral component interconnect (PCI) bus. The host processor 16 may be, for example, a commercially available or custom microprocessor. The system memory 22 is representative of an overall hierarchy of memory devices containing the software and data used to implement the functionality of the cryptographic data processing system 12. The system memory 22 may include, but is not limited to, the following types of devices: ROM, PROM, EPROM, EEPROM, flash, SRAM, and DRAM. [0035]
  • In accordance with embodiments of the present invention, the cryptographic accelerator processor 14 comprises a random number generator (RNG) execution unit 28, an encryption/authentication (E/A) execution unit 32, and a public key (PK) engine execution unit 34, which are coupled to a local memory 36 via a local bus 38. [0036]
  • In accordance with particular embodiments of the present invention, the system memory 22 contains a random number (RN) data queue 42, an E/A command queue 44, a PK command queue 46, and data buffer(s) 47. [0037]
  • Although FIG. 1 illustrates an exemplary cryptographic data processing system architecture, it will be understood that the present invention is not limited to such a configuration, but is intended to encompass any configuration capable of carrying out operations described herein. Computer program code for carrying out operations of embodiments of the cryptographic data processing system 12 may be written in a high-level programming language, such as C or C++, for development convenience. Nevertheless, some modules or routines may be written in assembly language or even micro-code to enhance performance and/or memory usage. It will be further appreciated that the functionality of any or all of the program modules may also be implemented using discrete hardware components, a single application specific integrated circuit (ASIC), or a programmed digital signal processor or microcontroller. [0038]
  • The present invention is described hereinafter with reference to flowchart and/or block diagram illustrations of methods, data processing systems, and/or computer program products in accordance with exemplary embodiments of the invention. It will be understood that each block of the flowchart and/or block diagram illustrations, and combinations of blocks in the flowchart and/or block diagram illustrations, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, a special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. [0039]
  • These computer program instructions may also be stored in a computer usable or computer-readable memory that may direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer usable or computer-readable memory produce an article of manufacture including instructions that implement the function/act specified in the flowchart and/or block diagram block or blocks. [0040]
  • The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions that execute on the computer or other programmable apparatus provide steps for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. [0041]
  • Exemplary operations of cryptographic data processing systems, computer program products, and methods of operating same, in accordance with embodiments of the present invention, will be described hereafter. Referring now to FIG. 2, the host processor 16 loads a command block into one of the command queues 44 and 46 at block 52. The cryptographic accelerator processor 14 may be notified by the host processor 16 that the command block is available for processing or may periodically access the command queues 44 and/or 46 to determine if a command block is available for processing. The cryptographic accelerator processor 14 downloads the command block from one of the command queues 44 and 46 and executes the command block at block 54. Once the cryptographic accelerator processor 14 completes execution of the command block, the host processor 16 is notified at block 56. Thus, according to embodiments of the present invention, the host processor 16 need not spend time interacting directly with the cryptographic accelerator processor 14 (e.g., issuing a command to the cryptographic accelerator processor 14, waiting for that command to complete, and then issuing another command). Instead, the host processor 16 may load commands into command queues 44 and 46, which may then be processed in the background by the cryptographic accelerator processor 14. Moreover, the size and number of command block sequences may be less constrained because system memory is generally more abundant. [0042]
  • Referring now to FIGS. 3-5, the RNG execution unit 28, the E/A execution unit 32, and the PK engine execution unit 34 may use various registers that facilitate communication with the RN data queue 42 and the command queues 44 and 46. For example, as shown in FIG. 3, a control/status register 62, a RN data queue base address register 64, a RN data queue size register 66, and a RN data queue pointer register 68 may be defined for use by the RNG execution unit 28. The control/status register 62 may include a self-test error field, which may be set if the RNG execution unit 28 generates two successive random number samples that are the same, and/or an error flag field, which may be used to notify the host processor 16 of an error on the system bus 26. The RN data queue base address register 64 may be used to hold the base address of the RN data queue 42 in the system memory 22. If the RN data queue 42 does not have a fixed size, then the RN data queue size register 66 may be used to hold the size of the RN data queue 42. The RN data queue pointer register 68 may comprise a read pointer 72 portion and a write pointer 74 portion, which may be used by the RNG execution unit 28 and the host processor 16 as will be discussed in more detail hereinafter. [0043]
  • As shown in FIG. 4, a control/status register 82, an E/A command queue base address register 84, an E/A command queue size register 86, and an E/A command queue pointer register 88 may be defined for use by the E/A execution unit 32. The control/status register 82 may include an interrupt flag field, which may be set if the host processor 16 requests an interrupt upon completion of a command block and/or if execution of a command block fails, and/or an error flag field, which may be used to notify the host processor 16 of an error on the system bus 26. The E/A command queue base address register 84 may be used to hold the base address of the E/A command queue 44 in the system memory 22. If the E/A command queue 44 does not have a fixed size, then the E/A command queue size register 86 may be used to hold the size of the E/A command queue 44. The E/A command queue pointer register 88 may comprise a read pointer 92 portion and a write pointer 94 portion, which may be used by the E/A execution unit 32 and the host processor 16, respectively, as will be discussed in more detail hereinafter. [0044]
  • As shown in FIG. 5, a control/status register 102, a PK command queue base address register 104, a PK command queue size register 106, and a PK command queue pointer register 108 may be defined for use by the PK engine execution unit 34. [0045]
  • The control/status register 102 may include an interrupt flag field, which may be set if the host processor 16 requests an interrupt upon completion of a command block and/or if execution of a command block fails, and/or an error flag field, which may be used to notify the host processor 16 of an error on the system bus 26. The PK command queue base address register 104 may be used to hold the base address of the PK command queue 46 in the system memory 22. If the PK command queue 46 does not have a fixed size, then the PK command queue size register 106 may be used to hold the size of the PK command queue 46. The PK command queue pointer register 108 may comprise a read pointer 112 portion and a write pointer 114 portion, which may be used by the PK engine execution unit 34 and the host processor 16, respectively, as will be discussed in more detail hereinafter. [0046]
  • Referring now to FIG. 6, operations for loading a command block into the E/A command queue 44 and/or the PK command queue 46, in accordance with embodiments of the present invention, will be described in more detail hereafter. In general, the host processor 16 writes commands into the command queues 44 and 46 beginning at write address locations stored in the write pointers for the respective command queues (e.g., write pointers 94 and 114). Before writing a command block into a command queue, however, the host processor 16 determines at block 122 whether the write address plus the command block size equals the read address stored in the corresponding read pointer 92 or 112. If the result determined at block 122 is “Yes,” then the host processor 16 postpones loading a new command block into the command queue until the cryptographic accelerator processor 14 has incremented the read address. If, however, the result determined at block 122 is “No,” then the host processor 16 loads a command block into the command queue at block 124 at the write address associated with the command queue and then increments the write address at block 126 by an amount corresponding to the size of the loaded command block. The host processor 16 need not check the current read address every time a new command block is loaded. Instead, the host processor 16 may check the read address when the write address is getting close to the last value the host processor 16 has for the read address. Checking the read address may be expensive in terms of processor cycles consumed. By checking the read address only when the write address is getting close to the last known read address (e.g., within a predefined threshold), host processor 16 cycles may be conserved. [0047]
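  • The host-side test of blocks 122-126 can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `CommandQueue`, the method `host_load`, the use of slot numbers as addresses, and the modulo wrap-around are all assumptions made for the example.

```python
class CommandQueue:
    """Toy model of a circular command queue held in system memory.

    Addresses are slot numbers (an assumption for illustration); the
    wrap-around via modulo reflects a circular queue.
    """

    def __init__(self, num_slots):
        self.slots = [None] * num_slots
        self.read_addr = 0    # advanced by the accelerator (consumer)
        self.write_addr = 0   # advanced by the host (producer)

    def is_full(self):
        # Block 122: write address plus one block equals the read address.
        return (self.write_addr + 1) % len(self.slots) == self.read_addr

    def host_load(self, command_block):
        if self.is_full():
            return False                          # "Yes": postpone the load
        self.slots[self.write_addr] = command_block               # block 124
        self.write_addr = (self.write_addr + 1) % len(self.slots)  # block 126
        return True
```

  • With four slots, three loads succeed and the fourth is postponed until the consumer advances the read address.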
  • The foregoing operations are illustrated, for example, in FIGS. 7 and 8, which show embodiments of the E/A command queue 44 and the PK command queue 46, respectively. As shown in FIGS. 7 and 8, both the E/A command queue 44 and the PK command queue 46 are configured to hold m command blocks, each of which comprises eight thirty-two-bit words. The host processor 16 has written a single command block into the first command block position (i.e., the “0” position) and the write address has been incremented to point to the next empty command block slot. The addresses used in FIGS. 7 and 8 are based on command block slot numbers for purposes of illustration. These addresses may be converted into absolute addresses by multiplying the command block slot number by 256 and adding the resulting product to the respective base addresses for the command queues, which are stored in the E/A command queue base address register 84 and the PK command queue base address register 104, respectively. Note that the test used at block 122 of FIG. 6 to determine whether a new command block may be loaded into a command queue implies that if a command queue may hold up to m command blocks, then only m−1 command blocks may be stored in the command queue at the same time. [0048]
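  • The slot-to-address conversion and the m−1 limit can be written out as a short sketch. The function names and the example base address are hypothetical; the multiplier of 256 is taken from the text as given and happens to match the block size in bits (8 × 32).

```python
WORDS_PER_BLOCK = 8
BITS_PER_WORD = 32
BLOCK_STRIDE = WORDS_PER_BLOCK * BITS_PER_WORD  # 256, the multiplier in the text


def slot_to_address(base_address, slot_number):
    """Absolute address = base address + slot number * 256."""
    return base_address + slot_number * BLOCK_STRIDE


def max_outstanding_blocks(m):
    """The full test leaves one slot unused, so at most m - 1 blocks are queued."""
    return m - 1
```

  • For example, with a hypothetical base address of 0x1000, slot 2 maps to 0x1000 + 512.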
  • Referring now to FIG. 9, operations for executing a command block that has been loaded into the E/A command queue 44 and/or the PK command queue 46, in accordance with embodiments of the present invention, will be described in more detail hereafter. At block 132, the cryptographic accelerator processor 14 determines whether the write address is equal to the read address. Specifically, the E/A execution unit 32 determines whether the write address is equal to the read address for the E/A command queue 44 and the PK engine execution unit 34 determines whether the write address is equal to the read address for the PK command queue 46. If the result determined at block 132 is “Yes,” then the cryptographic accelerator processor 14 waits until the host processor 16 loads a new command block into the command queue. If, however, the result determined at block 132 is “No,” then the cryptographic accelerator processor 14 downloads the command block at the read address associated with the command queue and executes the command block at block 134. In particular embodiments of the present invention, multiple command blocks may be downloaded for execution on the cryptographic accelerator processor 14 at the same time, which may further improve performance. The cryptographic accelerator processor 14 then increments the read address at block 136 by an amount corresponding to the size of the executed command block. [0049]
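  • The consumer side of FIG. 9 can be sketched in the same toy model. The function `accelerator_step`, the list-based queue, and the `execute` callback are illustrative assumptions; addresses are again slot numbers with modulo wrap-around.

```python
def accelerator_step(queue, read_addr, write_addr, execute):
    """One pass through blocks 132-136.

    Returns the (possibly advanced) read address and the execution result,
    or None as the result when the queue is empty.
    """
    if read_addr == write_addr:                 # block 132: nothing to execute
        return read_addr, None
    result = execute(queue[read_addr])          # block 134: download and execute
    read_addr = (read_addr + 1) % len(queue)    # block 136: advance read address
    return read_addr, result
```

  • Calling the function repeatedly drains the queue until the read address catches up with the write address, at which point the accelerator would wait for the host.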
  • Returning to FIGS. 7 and 8, the read addresses are set to point to the first command block slot, which has been loaded with a command block by the host processor 16. The E/A execution unit 32 and the PK engine execution unit 34 may read the command blocks loaded in the E/A command queue 44 and the PK command queue 46, respectively, with only minimal interaction with the host processor 16, e.g., maintenance of the read pointers 92 and 112 and the write pointers 94 and 114. In general, the cryptographic accelerator processor 14 may continue to execute commands located in a circular command queue in system memory until the read address equals the write address for that command queue. [0050]
  • In accordance with further embodiments of the present invention, interaction between the host processor 16 and the cryptographic accelerator processor 14 may be further reduced and overall system performance improved by including load and store commands in the cryptographic accelerator processor's command set. Referring now to FIG. 10, a load command loads one or more operands from the system memory 22 (e.g., the data buffer(s) 47) to the local memory 36 at block 142. The cryptographic accelerator processor 14 then performs one or more operations on the operand(s) at block 144 to generate a result that is stored in the local memory 36. A store command then stores the result in the system memory 22 at block 146. Advantageously, the host processor 16 need not consume processing time downloading operands to the cryptographic accelerator processor 14 and/or uploading results from the cryptographic accelerator processor 14 into the system memory 22. [0051]
  • To improve utilization of the chip area used to implement the cryptographic accelerator processor 14, at least a portion of the operands downloaded from the system memory 22 may be stored in the local memory 36. Instead of using a register number to identify the location of operands and results, an offset is used that identifies the relative position of the operands and results in the local memory 36. For example, to perform the operation “a+b=c,” a cryptographic accelerator processor 14 instruction may indicate that “a” is at offset 0 relative to a base address of the local memory 36, “b” is at offset 8 relative to the base address of the local memory 36, and the result “c” should be placed at offset 122 relative to the base address of the local memory 36. In accordance with further embodiments of the present invention, the result generated in the local memory 36 may also be stored in a result field of a command block, which is located in one of the command queues 44 and 46 in the system memory 22. Advantageously, operands and results may be packed together in the local memory 36 with no wasted space between them, which may conserve storage space and improve memory utilization. If the cryptographic accelerator processor 14 needs to be redesigned to handle larger operands, then the local memory 36 may be easier to resize than several registers. [0052]
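  • The “a+b=c” example with offsets 0, 8, and 122 can be sketched as below. The operand width of 8 bytes, the little-endian byte order, the 256-byte local memory, and the function name `add_at_offsets` are assumptions for illustration only; the text does not specify them.

```python
def add_at_offsets(local_memory, a_off, b_off, c_off, width=8):
    """Add two operands addressed by offset and store the sum by offset.

    width (bytes per operand) and little-endian order are assumed.
    """
    a = int.from_bytes(local_memory[a_off:a_off + width], "little")
    b = int.from_bytes(local_memory[b_off:b_off + width], "little")
    total = (a + b) % (1 << (8 * width))        # wrap to the operand width
    local_memory[c_off:c_off + width] = total.to_bytes(width, "little")


# Operands are packed back to back; no register file is involved.
memory = bytearray(256)
memory[0:8] = (5).to_bytes(8, "little")         # "a" at offset 0
memory[8:16] = (7).to_bytes(8, "little")        # "b" at offset 8
add_at_offsets(memory, 0, 8, 122)               # "c" at offset 122
```

  • Supporting wider operands in this model only requires a larger `width` and a larger buffer, which mirrors the resizing argument made above.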
  • In accordance with further embodiments of the present invention, interaction between the host processor 16 and the cryptographic accelerator processor 14 may be further reduced and overall system performance improved by allowing the cryptographic accelerator processor 14 to inform the host processor 16 when command blocks have been executed. Referring now to FIG. 11, the host processor 16 loads a command block into one of the command queues 44 and 46 at block 152. As shown in FIG. 12A, the command block may include an interrupt field, which may be set by the host processor 16 to turn an interrupt request on or off. The cryptographic accelerator processor 14 downloads the command block from one of the command queues 44 and 46 and executes the command block at block 154. At block 156, the cryptographic accelerator processor 14 may optionally store error information in the command block, as shown in FIG. 12B. The error information may comprise information that is associated with downloading the command block to the cryptographic accelerator processor 14 and/or executing the command block on the cryptographic accelerator processor 14. At block 158, if an interrupt has been requested in the interrupt field of the command block, then the cryptographic accelerator processor 14 invokes an interrupt to notify the host processor 16 that the command block has completed. [0053]
  • In other embodiments of the present invention, instead of invoking an interrupt to notify the host processor 16 that a command block has been executed, the cryptographic accelerator processor 14 may update a completion field in the command block as shown in FIG. 12C. In addition, a periodic interrupt may be defined that upon each occurrence triggers the host processor 16 to check one or more of the command queues 44 and 46 to determine whether any of the command blocks stored therein have been executed by examining their completion fields. In still other embodiments of the present invention, the cryptographic accelerator processor 14 may store the results from executing a command block in the command block as shown in FIG. 12D. [0054]
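  • The command block fields of FIGS. 12A-12D and the two notification styles can be modeled together in one sketch. The field names, the error code value, and the functions `accelerator_execute` and `host_poll` are hypothetical, and combining the interrupt and completion-field embodiments in a single structure is a simplification for illustration.

```python
from dataclasses import dataclass


@dataclass
class CommandBlock:
    opcode: str
    interrupt: bool = False   # FIG. 12A: host requests an interrupt on completion
    error: int = 0            # FIG. 12B: error information (0 means none, assumed)
    completed: bool = False   # FIG. 12C: completion field
    result: object = None     # FIG. 12D: result stored in the block


def accelerator_execute(block, operation, raise_interrupt):
    """Execute one block and record the outcome inside the block itself."""
    try:
        block.result = operation()
    except Exception:
        block.error = 1       # hypothetical error code
    block.completed = True
    if block.interrupt:
        raise_interrupt(block)


def host_poll(queue):
    """On a periodic tick, list the slots whose completion field is set."""
    return [i for i, b in enumerate(queue) if b is not None and b.completed]
```

  • A host that sets `interrupt=False` on every block never handles a per-command interrupt and learns of completions only through `host_poll`.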
  • In still other embodiments of the present invention, the host processor 16 may set a timer when storing a command block into a command queue 44, 46. Upon expiration of the timer, the host processor 16 may check to determine whether the command block has been executed. Advantageously, the status of a command block may be determined by the host processor 16 without the need to process an interrupt from the cryptographic accelerator processor 14. [0055]
  • In accordance with further embodiments of the present invention, improved utilization of the system memory 22 may be attained by re-using at least a portion of a command block that contains input data to store a result or output that is generated by an adjunct processor, such as the cryptographic accelerator processor 14, upon executing the command block. It is assumed that the size of the result or output is small enough to fit into the portion of the command block containing the input data that is to be overwritten. In addition, the region of the command block in which the result or output is stored should be selected carefully to ensure that the input data that is overwritten is no longer needed by the host processor 16 after the command block has been executed by the adjunct processor. [0056]
  • Referring now to FIG. 13, exemplary operations begin at block 162 where the host processor 16 loads a command block that includes input data into one of the command queues 44 or 46 in the system memory 22. Note that instead of or in addition to including input data in the command block, the command block may include pointers to input data that reside, for example, in the data buffer(s) 47 in the system memory 22. An adjunct processor, such as the cryptographic accelerator processor 14, may download the command block and perform one or more operations on the input data to generate a result at block 164. If the command block includes pointers to input data, then the data are separately downloaded to the cryptographic accelerator processor 14 using the input data pointers. The result is then stored in the command block in the system memory 22 at block 166 such that at least a portion of the input data is overwritten. Advantageously, the memory reserved for the command block in the system memory 22 may be reduced because additional storage space need not be reserved to store the result of executing the command block either in the command block or elsewhere in the system memory 22. [0057]
  • The foregoing operations are illustrated by way of example in FIGS. 14A and 14B, which show an exemplary command block for decrypting an encrypted packet. Specifically, in FIG. 14A, a command block is shown that comprises a field that contains a hash key for the encrypted packet and another field that contains input information. The cryptographic accelerator processor 14 downloads the command block of FIG. 14A and performs hash operations using the hash key and input information to generate a hash value. As shown in FIG. 14B, this hash value is then stored in the command block in the system memory 22 by overwriting the input information, which is no longer needed once the hash value has been computed. Note that the input information may be one or more pointers to input data stored, for example, in the data buffer(s) 47 in the system memory 22. [0058]
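  • The in-place overwrite of FIGS. 14A and 14B can be sketched with a byte buffer standing in for the command block. The field lengths, the function names, and the choice of HMAC-SHA1 as the keyed hash are assumptions made only so the example runs; the text does not name a specific hash algorithm or block layout.

```python
import hashlib
import hmac

KEY_LEN = 20     # assumed length of the hash key field, in bytes
INPUT_LEN = 32   # assumed length of the input information field, in bytes


def build_block(hash_key, input_info):
    """Lay out the block of FIG. 14A: hash key field, then input information."""
    if len(hash_key) != KEY_LEN or len(input_info) != INPUT_LEN:
        raise ValueError("field has unexpected length")
    return bytearray(hash_key + input_info)


def execute_in_place(block):
    """Compute the keyed hash and overwrite the input field with it (FIG. 14B)."""
    key = bytes(block[:KEY_LEN])
    data = bytes(block[KEY_LEN:KEY_LEN + INPUT_LEN])
    digest = hmac.new(key, data, hashlib.sha1).digest()   # stand-in keyed hash
    if len(digest) > INPUT_LEN:
        raise ValueError("result does not fit in the input field")
    block[KEY_LEN:KEY_LEN + len(digest)] = digest
    return len(digest)
```

  • The block keeps its original size after execution, so no separate result area has to be reserved in system memory.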
  • In accordance with further embodiments of the present invention illustrated in FIG. 15, the command block may include an input pointer field and/or an output pointer field, which are used to identify the location of the encrypted packet in the system memory 22 and the location where the decrypted packet is to be stored in the system memory 22. For example, the cryptographic accelerator processor 14 may use the input pointer to download the encrypted packet from the system memory 22 and may then decrypt the encrypted packet using the hash key and input information to generate a hash value as discussed hereinabove. Note that the input information may be one or more pointers to input data stored, for example, in the data buffer(s) 47 in the system memory 22. The hash value may be attached to the decrypted packet and the decrypted packet with the attached hash value may be stored in the system memory 22 at the address identified by the output pointer field in the command block. [0059]
  • Cryptographic processors and/or other types of signal processors and integrated circuits may use a hardware-based random number generator. The cryptographic accelerator processor 14 may include a RNG execution unit 28 that may be used to generate random numbers for use by other execution units of the cryptographic accelerator processor 14 and/or the host processor 16. Exemplary operations that may be used to reduce interaction between the host processor 16 and the cryptographic accelerator processor 14 and to improve overall system performance will be described hereafter. Referring now to FIG. 16, operations begin at block 172 where the cryptographic accelerator processor 14 loads a random number sample into the RN data queue 42 beginning at the write address stored in the write pointer field 74 of the RN data queue pointer register 68 (see FIG. 3). At block 174 the host processor 16 reads the random number sample in the RN data queue 42 beginning at the read address stored in the read pointer field 72 of the RN data queue pointer register 68 (see FIG. 3). Thus, according to embodiments of the present invention, the host processor 16 need not spend time interacting directly with the cryptographic accelerator processor 14 to request blocks of random data and/or to read random data from, for example, one or more registers on the cryptographic accelerator processor 14 chip. [0060]
  • Referring now to FIG. 17, operations for loading a random number sample into the RN data queue 42, in accordance with embodiments of the present invention, will be described in more detail hereafter. Before writing a random number sample into the RN data queue 42, the cryptographic accelerator processor 14 determines at block 182 whether the write address plus the random number sample size equals the read address stored in the read pointer field 72. If the result determined at block 182 is “Yes,” then the cryptographic accelerator processor 14 postpones loading a new random number sample into the RN data queue 42 until the host processor 16 has incremented the read address. If, however, the result determined at block 182 is “No,” then the cryptographic accelerator processor 14 loads a random number sample into the RN data queue 42 at block 184 at the write address stored in the write pointer field 74 and then increments the write address at block 186 by an amount corresponding to the size of the loaded random number sample. [0061]
  • Note that in accordance with embodiments of the present invention, the cryptographic accelerator processor 14 may include a register and/or may recognize a command block that may be written to the cryptographic accelerator processor 14 that allows the host processor 16 to, for example, provide the cryptographic accelerator processor 14 with a random number seed and/or instruct the cryptographic accelerator processor 14 to begin generating random numbers. [0062]
  • The foregoing operations are illustrated, for example, in FIG. 18, which shows an exemplary embodiment of the RN data queue 42. As shown in FIG. 18, the RN data queue 42 is configured to hold 512 random number samples, each of which comprises 64 bits. The cryptographic accelerator processor 14 has written four random number samples into addresses 1 through 4 and the write address has been incremented to point to the next available address, which is empty or contains data that have already been read by the host processor 16. The addresses shown in FIG. 18 are based on random number sample units for purposes of illustration. These addresses may be converted into absolute addresses by multiplying the random number sample number by 64 and adding the resulting product to the base address for the RN data queue 42, which is stored in the RN data queue base address register 64. Note that the test used at block 182 of FIG. 17 to determine whether a new random number sample may be loaded into the RN data queue 42 implies that if the RN data queue 42 may hold up to m random number samples, then only m−1 random number samples may be stored in the RN data queue 42 at the same time. Thus, if the RN data queue 42 is filled to its capacity, then it may hold 32,704 bits (511 64-bit random number samples), which exceeds the 20,000 bits required by the Federal Information Processing Standard (FIPS) 140-1, Security Requirements for Cryptographic Modules, issued Jan. 11, 1994. [0063]
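  • The capacity arithmetic above can be checked directly. The constant and function names are illustrative; the numbers come from the text.

```python
QUEUE_SLOTS = 512            # samples the RN data queue is configured to hold
SAMPLE_BITS = 64             # bits per random number sample
FIPS_140_1_BITS = 20_000     # bit count cited from FIPS 140-1

usable_samples = QUEUE_SLOTS - 1              # one slot stays empty (m - 1 rule)
usable_bits = usable_samples * SAMPLE_BITS    # 511 * 64


def sample_address(base_address, sample_number):
    """Absolute address = base address + sample number * 64."""
    return base_address + sample_number * SAMPLE_BITS
```

  • Here `usable_bits` evaluates to 32,704, which leaves a margin of 12,704 bits over the cited 20,000-bit figure.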
  • Referring now to FIG. 19, operations for reading a random number sample that has been loaded into the RN data queue 42, in accordance with embodiments of the present invention, will be described in more detail hereafter. At block 192, the host processor 16 determines whether the write address is equal to the read address. If the result determined at block 192 is “Yes,” then the host processor 16 waits until the cryptographic accelerator processor 14 loads a new random number sample into the RN data queue 42. If, however, the result determined at block 192 is “No,” then the host processor 16 reads the random number sample at the read address stored in the read pointer field 72 at block 194. The host processor 16 then increments the read address at block 196 by an amount corresponding to the size of the random number sample. The host processor 16 need not check the current write address every time a new random number sample is read. Instead, the host processor 16 may check the write address when the read address is getting close to the last value the host processor 16 has for the write address. [0064]
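  • The host-side read of FIG. 19, including the deferred check of the write address, can be sketched as follows. The class `RandomQueueReader`, the `fetch_write_addr` callback (standing in for a costly register read across the system bus), and the `threshold` parameter are assumptions for illustration.

```python
class RandomQueueReader:
    """Host-side reader that re-reads the write pointer only when needed."""

    def __init__(self, num_slots, fetch_write_addr, threshold=1):
        self.num_slots = num_slots
        self.fetch_write_addr = fetch_write_addr  # expensive pointer read
        self.threshold = threshold
        self.read_addr = 0
        self.cached_write = 0    # last value the host has for the write address

    def _available(self):
        return (self.cached_write - self.read_addr) % self.num_slots

    def read(self, queue):
        # Refresh the cached write address only when the read address is close.
        if self._available() < self.threshold:
            self.cached_write = self.fetch_write_addr()
        if self._available() == 0:       # block 192: queue empty, host waits
            return None
        sample = queue[self.read_addr]   # block 194
        self.read_addr = (self.read_addr + 1) % self.num_slots  # block 196
        return sample
```

  • In this model, reading three queued samples costs a single pointer fetch; a second fetch happens only when the reader believes the queue has run dry.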
  • Thus, according to embodiments of the present invention, a cryptographic accelerator processor 14 may provide random number samples for use by a host processor 16 with reduced interaction between the host processor 16 and the cryptographic accelerator processor 14. In general, the host processor 16 need only interact with the cryptographic accelerator processor 14 to update the read address and to check the value of the write address when the read address approaches the last value the host processor 16 has for the write address. In addition, the cryptographic accelerator processor 14 may manage the buffering of the random number samples, which may conserve processor cycles of the host processor 16 and may reduce transactions on the system bus 26, which may improve overall system performance. [0065]
  • [0066] The performance of cryptographic data processing systems may be affected by the system architecture and the methodology used to perform operations. For example, conventional cryptographic data processing systems may comprise one or more ASICs, such as the ASIC 202 shown in FIG. 20. The ASIC 202 comprises a plurality of functional units 204, 206, and 208, which are configured to perform specific operations. As shown in FIG. 20, however, input commands are provided to the ASIC 202 serially and then routed to the appropriate functional unit 204, 206, and/or 208. The outputs and/or results of executing the input commands are provided serially as command outputs from the ASIC 202. Thus, the ASIC 202 typically processes commands sequentially such that a first command must finish before a subsequent command may be processed even if the commands are executed by different functional units.
  • [0067] Referring now to FIG. 21, the performance of cryptographic data processing systems may be improved, in accordance with embodiments of the present invention, by providing separate command interfaces that are respectively associated with the functional units such that each functional unit may receive command inputs and may generate command outputs and/or results independently of other functional units. As shown in FIG. 21, an ASIC 212 includes a plurality of functional units 214, 216, and 218, which each receive command inputs through its own command interface and generate outputs and/or results that may be communicated to another processor through the command interface. By associating a separate command interface with each functional unit 214, 216, and 218, the functional units 214, 216, and 218 may operate independently and in parallel, thereby improving the performance of a cryptographic data processing system.
  • [0068] Referring now to FIG. 22, the functional units 214, 216, and 218 may comprise the E/A execution unit 32, the RNG execution unit 28, and the PK engine execution unit 34. The E/A execution unit 32 comprises a command interface manager 222, the RNG execution unit 28 comprises a command interface manager 224, and the PK engine execution unit 34 comprises a command interface manager 226. These respective command interface managers 222, 224, and 226 may be used to receive input command blocks from the E/A command queue 44, to transmit random number samples to the RN data queue 42, and to receive input command blocks from the PK command queue 46, respectively, and to allow the respective execution units 28, 32, and 34 to perform operations in parallel.
  • [0069] Referring now to FIG. 23, operations of cryptographic data processing systems in which command interface managers are respectively associated with a plurality of functional units, in accordance with embodiments of the present invention, will be described hereafter. Operations begin at block 232 where one or more command blocks are provided to each of the functional units, such as, for example, by providing command blocks in the E/A command queue 44 and the PK command queue 46 for the E/A execution unit 32 and the PK engine execution unit 34, respectively. At block 234, the command blocks are simultaneously executed by the functional units by accessing the command blocks in parallel through, for example, the command interface manager 222 and the command interface manager 226, which are associated with the E/A execution unit 32 and the PK engine execution unit 34, respectively.
  • [0070] Note that command blocks may be provided to the cryptographic processor 14 in serial fashion over the system bus 24. Nevertheless, the cryptographic processor 14 may distribute command blocks to the command interface managers 222, 224, and 226 associated with the execution units 32, 28, and 34, which may then process the command blocks in parallel.
  • [0071] For purposes of illustration, exemplary embodiments of the present invention have been discussed hereinabove in which operations related to random number generation, encryption/authentication, and public key generation are performed in parallel based on functional units defined therefor. It will be understood that the operations that may be performed in parallel may be adjusted based on requirements and/or needs. Moreover, commands may be provided to the command interface managers in a variety of ways. A processor may write commands directly to the command interface managers or, alternatively, commands may be stored in a memory and the command interface managers may be provided with the addresses where they may retrieve the stored commands for execution.
  • [0072] In summary, by performing operations in parallel using a plurality of functional units, the total number of operations that may be performed may be increased and the average latency for completing operations may be reduced.
  • [0073] The flowcharts of FIGS. 2, 6, 9-11, 13, 16, 17, 19, and 23 illustrate the architecture, functionality, and operations of possible embodiments of the cryptographic data processing system 12 of FIG. 1. In this regard, each block may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that in some alternative embodiments, the functions noted in the blocks may occur out of the order noted in FIGS. 2, 6, 9-11, 13, 16, 17, 19, and 23. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending on the functionality involved.
  • [0074] In concluding the detailed description, it should be noted that many variations and modifications can be made to the preferred embodiments without substantially departing from the principles of the present invention. All such variations and modifications are intended to be included herein within the scope of the present invention, as set forth in the following claims.
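
For purposes of illustration only, the RN data queue operations described with reference to FIGS. 17 through 19 may be sketched in software as follows. This is a minimal model and not the disclosed implementation: the names `RNDataQueue`, `HostReader`, and `absolute_address` are hypothetical, the wrap-around of the addresses is an assumption, and the host is assumed to refresh its copy of the write address only once that copy is exhausted, which simplifies the "getting close" check described above.

```python
SAMPLE_BITS = 64    # size of one random number sample (from the description)
QUEUE_SLOTS = 512   # number of sample slots shown in FIG. 18


class RNDataQueue:
    """Hypothetical model of the RN data queue as a circular buffer."""

    def __init__(self, slots=QUEUE_SLOTS):
        self.slots = slots
        self.data = [0] * slots
        self.read_addr = 0    # advanced only by the host processor
        self.write_addr = 0   # advanced only by the cryptographic processor

    def try_write(self, sample):
        """Producer side: store a sample unless write + 1 would equal read."""
        next_write = (self.write_addr + 1) % self.slots  # wrap-around is assumed
        if next_write == self.read_addr:
            return False  # full: at most slots - 1 samples are held at once
        self.data[self.write_addr] = sample
        self.write_addr = next_write
        return True


class HostReader:
    """Hypothetical host-side reader that caches the last write address seen."""

    def __init__(self, queue):
        self.queue = queue
        self.cached_write = queue.write_addr
        self.write_addr_checks = 0  # how often the real write address was read

    def try_read(self):
        q = self.queue
        if q.read_addr == self.cached_write:
            # Simplification: refresh only when the cached value is exhausted.
            self.cached_write = q.write_addr
            self.write_addr_checks += 1
            if q.read_addr == self.cached_write:
                return None  # queue is empty; the host would wait here
        sample = q.data[q.read_addr]
        q.read_addr = (q.read_addr + 1) % q.slots
        return sample


def absolute_address(base, sample_index, unit=SAMPLE_BITS):
    """Convert a sample-unit address to an absolute one, as in the description.

    The unit of 64 follows the text; a byte-addressed memory would use 8.
    """
    return base + sample_index * unit
```

In this sketch, filling an empty 512-slot queue accepts 511 samples before `try_write` refuses the next one, which corresponds to the 511 × 64 = 32,704 bits noted above. Reading four buffered samples followed by one empty poll consults the write address only twice, which illustrates the reduced interaction between the two processors.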
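
The separate command interfaces described with reference to FIGS. 21 through 23 may likewise be modeled, purely as a sketch, by giving each functional unit its own input queue and worker thread. The class `FunctionalUnit` and the function `run_demo` are hypothetical names, and Python threads only model the independence of the units rather than true hardware parallelism. The barrier is a demonstration device: each command waits until the other unit is also executing a command, so the run completes only if the two units are in flight at the same time.

```python
import queue
import threading


class FunctionalUnit:
    """Hypothetical functional unit with its own command interface."""

    def __init__(self, name, operation):
        self.name = name
        self.operation = operation
        self.commands = queue.Queue()   # per-unit command interface
        self.results = []
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def submit(self, command):
        self.commands.put(command)

    def finish(self):
        self.commands.put(None)         # sentinel: no more commands
        self._thread.join()

    def _run(self):
        while True:
            command = self.commands.get()
            if command is None:
                break
            self.results.append(self.operation(command))


def run_demo():
    # Both units must be executing at once for the barrier to release.
    both_busy = threading.Barrier(2)

    def make_operation(label):
        def operation(command):
            both_busy.wait(timeout=5)   # raises if the other unit never arrives
            return f"{label}:{command}"
        return operation

    ea_unit = FunctionalUnit("E/A", make_operation("ea"))
    pk_unit = FunctionalUnit("PK", make_operation("pk"))
    for unit in (ea_unit, pk_unit):
        unit.start()
    ea_unit.submit("cmd0")
    pk_unit.submit("cmd0")
    for unit in (ea_unit, pk_unit):
        unit.finish()
    return ea_unit.results, pk_unit.results
```

If both commands were instead fed through a single serial interface, as in the conventional arrangement of FIG. 20, the first command would wait at the barrier until the timeout expired and no results would be produced.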

Claims (86)

We claim:
1. A method of operating a cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that comprises a local memory and is coupled to the host processor and the system memory, the method comprising:
loading at least one operand from the system memory to the local memory;
performing at least one operation on the at least one operand to generate a result in the local memory; and
storing the result generated in the local memory in the system memory.
2. A method as recited in claim 1, wherein performing the at least one operation, and storing the result are performed by the cryptographic processor without interaction with the host processor.
3. A cryptographic data processing system, comprising:
a host processor;
a system memory coupled to the host processor; and
a cryptographic processor that comprises a local memory and is coupled to the host processor and the system memory, the cryptographic processor being programmed to load at least one operand from the system memory to the local memory, perform at least one operation on the at least one operand to generate a result in the local memory, and store the result generated in the local memory in the system memory.
4. A method of operating a cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that comprises a local memory and is coupled to the host processor and the system memory, the method comprising:
providing a command queue in the system memory;
loading a command block into the command queue using the host processor;
executing the command block using the cryptographic processor; and
notifying the host processor that the command block has been executed.
5. A method as recited in claim 4, further comprising:
providing a read address for the command queue and a write address for the command queue;
wherein loading the command block into the command queue using the host processor comprises loading the command block into the command queue using the host processor beginning at the write address, and wherein executing the command block using the cryptographic processor comprises executing the command block using the cryptographic processor beginning at the read address.
6. A method as recited in claim 5, wherein loading the command block into the command queue using the host processor beginning at the write address comprises:
determining if the write address plus an amount corresponding to a size of a single command block equals the read address; and
loading the command block into the command queue using the host processor beginning at the write address if the write address plus the amount corresponding to the size of the single command block does not equal the read address.
7. A method as recited in claim 6, further comprising:
incrementing the write address by the amount corresponding to the size of a single command block using the host processor after loading the command block into the command queue using the host processor beginning at the write address if the write address plus the amount corresponding to the size of the single command block does not equal the read address.
8. A method as recited in claim 5, wherein executing the command block using the cryptographic processor beginning at the read address comprises:
determining whether the read address is equal to the write address; and
executing the command block using the cryptographic processor beginning at the read address if the read address is not equal to the write address.
9. A method as recited in claim 8, further comprising:
incrementing the read address by an amount corresponding to a size of a single command block using the cryptographic processor after executing the command block using the cryptographic processor beginning at the read address.
10. A method as recited in claim 4, wherein notifying the host processor that the command block has been executed comprises invoking an interrupt using the cryptographic processor after executing the command block.
11. A method as recited in claim 4, wherein notifying the host processor that the command block has been executed comprises updating a completion field in the command block using the cryptographic processor.
12. A method as recited in claim 11, further comprising:
providing a periodic interrupt; and
reading the completion field using the host processor upon invocation of the periodic interrupt.
13. A method as recited in claim 4, wherein notifying the host processor that the command block has been executed comprises:
setting a timer after loading the command block into the command queue using the host processor; and
checking whether the command block has been executed after expiration of the timer.
14. A method as recited in claim 4, further comprising:
loading at least one operand from the command queue to the local memory;
performing at least one operation on the at least one operand to generate a result in the local memory; and
storing the result generated in the local memory in the command queue.
15. A method of operating a cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that is coupled to the host processor and the system memory, the method comprising:
providing a command queue in the system memory;
loading a command block into the command queue using the host processor;
setting a value of an interrupt field in the command block to request an interrupt when the command block has been executed;
executing the command block using the cryptographic processor; and
invoking an interrupt using the cryptographic processor after executing the command block if the interrupt field in the command block is set to the value to request the interrupt.
16. A method as recited in claim 15, further comprising:
storing error information in the command block that is associated with executing the command block using the cryptographic processor.
17. A method of operating a cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that is coupled to the host processor and the system memory, the method comprising:
providing a command queue in the system memory;
loading a command block into the command queue using the host processor;
executing the command block using the cryptographic processor; and
updating a completion field in the command block using the cryptographic processor.
18. A method as recited in claim 17, further comprising:
providing a periodic interrupt; and
reading the completion field using the host processor upon invocation of the periodic interrupt.
19. A method as recited in claim 17, further comprising:
storing error information in the command block that is associated with executing the command block using the cryptographic processor.
20. A method of operating a data processing system that comprises a host processor, a system memory coupled to the host processor, and an adjunct processor integrated circuit that is coupled to the host processor and the system memory, the method comprising:
providing a command queue in the system memory;
loading a command block into the command queue using the host processor, the command block comprising an input data field that contains input data;
performing an operation based on the input data using the adjunct processor to generate a result; and
storing the result in the input data field such that at least a portion of the input data is overwritten.
21. A method as recited in claim 20, wherein the data processing system comprises a cryptographic data processing system, the adjunct processor integrated circuit comprises a cryptographic processor integrated circuit, and performing the operation based on the input data comprises:
performing a hash operation based on the input data using the cryptographic processor to generate a hash value.
22. A method as recited in claim 21, wherein storing the result in the input data field comprises:
storing the hash value in the input data field such that the at least a portion of the input data is overwritten.
23. A method as recited in claim 21, wherein the command block further comprises an input pointer field that contains an address in the system memory of an incoming packet and wherein performing the hash operation comprises:
performing the hash operation based on the input data and the incoming packet using the cryptographic processor to generate the hash value.
24. A method as recited in claim 23, wherein the command block further comprises an output pointer field that contains an address in the system memory for storing a decrypted packet, the method further comprising:
decrypting the incoming packet using the cryptographic processor to generate the decrypted packet;
attaching the hash value to the decrypted packet; and
storing the decrypted packet with the attached hash value at the address in the system memory contained in the output pointer field.
25. A method of operating a cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that comprises a local memory and is coupled to the host processor and the system memory, the method comprising:
providing a command queue in the system memory;
providing a read address for the command queue and a write address for the command queue;
loading a random number sample into the command queue using the cryptographic processor beginning at the write address; and
reading the random number sample using the host processor beginning at the read address.
26. A method as recited in claim 25, wherein loading the random number sample into the command queue using the cryptographic processor beginning at the write address comprises:
determining if the write address plus an amount corresponding to a size of a single random number sample equals the read address; and
loading the random number sample into the command queue using the cryptographic processor beginning at the write address if the write address plus the amount corresponding to the size of the single random number sample does not equal the read address.
27. A method as recited in claim 26, further comprising:
incrementing the write address by the amount corresponding to the size of a single random number sample using the cryptographic processor after loading the random number sample into the command queue using the cryptographic processor beginning at the write address if the write address plus the amount corresponding to the size of the single random number sample does not equal the read address.
28. A method as recited in claim 25, wherein reading the random number sample using the host processor beginning at the read address comprises:
determining whether the read address is equal to the write address; and
reading the random number sample using the host processor beginning at the read address if the read address is not equal to the write address.
29. A method as recited in claim 28, further comprising:
incrementing the read address by an amount corresponding to a size of a single random number sample using the host processor after reading the random number sample using the host processor beginning at the read address.
30. A method of operating a data processing system that comprises a host processor, a system memory coupled to the host processor, and an adjunct processor integrated circuit that is coupled to the host processor and the system memory, the method comprising:
transferring information between the host processor and the adjunct processor using the system memory.
31. A cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that comprises a local memory and is coupled to the host processor and the system memory, the system further comprising:
means for loading at least one operand from the system memory to the local memory;
means for performing at least one operation on the at least one operand to generate a result in the local memory; and
means for storing the result generated in the local memory in the system memory.
32. A cryptographic data processing system as recited in claim 31, wherein the means for performing the at least one operation, and the means for storing the result execute without interaction with the host processor.
33. A cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that comprises a local memory and is coupled to the host processor and the system memory, the system further comprising:
means for providing a command queue in the system memory;
means for loading a command block into the command queue using the host processor;
means for executing the command block using the cryptographic processor; and
means for notifying the host processor that the command block has been executed.
34. A cryptographic data processing system as recited in claim 33, further comprising:
means for providing a read address for the command queue and a write address for the command queue;
wherein the means for loading the command block into the command queue using the host processor comprises means for loading the command block into the command queue using the host processor beginning at the write address, and wherein the means for executing the command block using the cryptographic processor comprises means for executing the command block using the cryptographic processor beginning at the read address.
35. A cryptographic data processing system as recited in claim 34, wherein the means for loading the command block into the command queue using the host processor beginning at the write address comprises:
means for determining if the write address plus an amount corresponding to a size of a single command block equals the read address; and
means for loading the command block into the command queue using the host processor beginning at the write address if the write address plus the amount corresponding to the size of the single command block does not equal the read address.
36. A cryptographic data processing system as recited in claim 35, further comprising:
means for incrementing the write address by the amount corresponding to the size of a single command block using the host processor if the write address plus the amount corresponding to the size of the single command block does not equal the read address, the means for incrementing being responsive to the means for loading the command block into the command queue using the host processor beginning at the write address.
37. A cryptographic data processing system as recited in claim 34, wherein the means for executing the command block using the cryptographic processor beginning at the read address comprises:
means for determining whether the read address is equal to the write address; and
means for executing the command block using the cryptographic processor beginning at the read address if the read address is not equal to the write address.
38. A cryptographic data processing system as recited in claim 37, further comprising:
means for incrementing the read address by an amount corresponding to a size of a single command block using the cryptographic processor, the means for incrementing being responsive to the means for executing the command block using the cryptographic processor beginning at the read address.
39. A cryptographic data processing system as recited in claim 33, wherein the means for notifying the host processor that the command block has been executed comprises means for invoking an interrupt using the cryptographic processor after executing the command block.
40. A cryptographic data processing system as recited in claim 33, wherein the means for notifying the host processor that the command block has been executed comprises means for updating a completion field in the command block using the cryptographic processor.
41. A cryptographic data processing system as recited in claim 40, further comprising:
means for providing a periodic interrupt; and
means for reading the completion field using the host processor upon invocation of the periodic interrupt.
42. A cryptographic data processing system as recited in claim 33, wherein the means for notifying the host processor that the command block has been executed comprises:
means for setting a timer after loading the command block into the command queue using the host processor; and
means for checking whether the command block has been executed after expiration of the timer.
43. A cryptographic data processing system as recited in claim 33, further comprising:
means for loading at least one operand from the command queue to the local memory;
means for performing at least one operation on the at least one operand to generate a result in the local memory; and
means for storing the result generated in the local memory in the command queue.
44. A cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that is coupled to the host processor and the system memory, the system further comprising:
means for providing a command queue in the system memory;
means for loading a command block into the command queue using the host processor;
means for setting a value of an interrupt field in the command block to request an interrupt when the command block has been executed;
means for executing the command block using the cryptographic processor; and
means for invoking an interrupt using the cryptographic processor after executing the command block if the interrupt field in the command block is set to the value to request the interrupt.
45. A cryptographic data processing system as recited in claim 44, further comprising:
means for storing error information in the command block that is associated with executing the command block using the cryptographic processor.
46. A cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that is coupled to the host processor and the system memory, the system further comprising:
means for providing a command queue in the system memory;
means for loading a command block into the command queue using the host processor;
means for executing the command block using the cryptographic processor; and
means for updating a completion field in the command block using the cryptographic processor.
47. A cryptographic data processing system as recited in claim 46, further comprising:
means for providing a periodic interrupt; and
means for reading the completion field using the host processor upon invocation of the periodic interrupt.
48. A cryptographic data processing system as recited in claim 46, further comprising:
means for storing error information in the command block that is associated with executing the command block using the cryptographic processor.
49. A data processing system that comprises a host processor, a system memory coupled to the host processor, and an adjunct processor integrated circuit that is coupled to the host processor and the system memory, the system further comprising:
means for providing a command queue in the system memory;
means for loading a command block into the command queue using the host processor, the command block comprising an input data field that contains input data;
means for performing an operation based on the input data using the adjunct processor to generate a result; and
means for storing the result in the input data field such that at least a portion of the input data is overwritten.
50. A data processing system as recited in claim 49, wherein the data processing system comprises a cryptographic data processing system, the adjunct processor integrated circuit comprises a cryptographic processor integrated circuit, and the means for performing the operation based on the input data comprises:
means for performing a hash operation based on the input data using the cryptographic processor to generate a hash value.
51. A data processing system as recited in claim 50, wherein the means for storing the result in the input data field comprises:
means for storing the hash value in the input data field such that the at least a portion of the input data is overwritten.
52. A data processing system as recited in claim 50, wherein the command block further comprises an input pointer field that contains an address in the system memory of an incoming packet and wherein the means for performing the hash operation comprises:
means for performing the hash operation based on the input data and the incoming packet using the cryptographic processor to generate the hash value.
53. A data processing system as recited in claim 52, wherein the command block further comprises an output pointer field that contains an address in the system memory for storing a decrypted packet, the data processing system further comprising:
means for decrypting the incoming packet using the cryptographic processor to generate the decrypted packet;
means for attaching the hash value to the decrypted packet; and
means for storing the decrypted packet with the attached hash value at the address in the system memory contained in the output pointer field.
54. A cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that comprises a local memory and is coupled to the host processor and the system memory, the system further comprising:
means for providing a command queue in the system memory;
means for providing a read address for the command queue and a write address for the command queue;
means for loading a random number sample into the command queue using the cryptographic processor beginning at the write address; and
means for reading the random number sample using the host processor beginning at the read address.
55. A cryptographic data processing system as recited in claim 54, wherein the means for loading the random number sample into the command queue using the cryptographic processor beginning at the write address comprises:
means for determining if the write address plus an amount corresponding to a size of a single random number sample equals the read address; and
means for loading the random number sample into the command queue using the cryptographic processor beginning at the write address if the write address plus the amount corresponding to the size of the single random number sample does not equal the read address.
56. A cryptographic data processing system as recited in claim 55, further comprising:
means for incrementing the write address by the amount corresponding to the size of a single random number sample using the cryptographic processor if the write address plus the amount corresponding to the size of the single random number sample does not equal the read address, the means for incrementing being responsive to the means for loading the random number sample into the command queue using the cryptographic processor beginning at the write address.
57. A cryptographic data processing system as recited in claim 54, wherein the means for reading the random number sample using the host processor beginning at the read address comprises:
means for determining whether the read address is equal to the write address; and
means for reading the random number sample using the host processor beginning at the read address if the read address is not equal to the write address.
58. A cryptographic data processing system as recited in claim 57, further comprising:
means for incrementing the read address by an amount corresponding to a size of a single random number sample using the host processor, the means for incrementing being responsive to the means for reading the random number sample using the host processor beginning at the read address.
59. A computer program product for operating a cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that comprises a local memory and is coupled to the host processor and the system memory, the computer program product comprising:
a computer readable program medium having computer readable program code embodied therein, the computer readable program code comprising:
computer readable program code for loading at least one operand from the system memory to the local memory;
computer readable program code for performing at least one operation on the at least one operand to generate a result in the local memory; and
computer readable program code for storing the result generated in the local memory in the system memory.
60. A computer program product as recited in claim 59, wherein the computer readable program code for performing the at least one operation, and the computer readable program code for storing the result execute without interaction with the host processor.
61. A computer program product for operating a cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that comprises a local memory and is coupled to the host processor and the system memory, the computer program product comprising:
a computer readable program medium having computer readable program code embodied therein, the computer readable program code comprising:
computer readable program code for providing a command queue in the system memory;
computer readable program code for loading a command block into the command queue using the host processor;
computer readable program code for executing the command block using the cryptographic processor; and
computer readable program code for notifying the host processor that the command block has been executed.
62. A computer program product as recited in claim 61, further comprising:
computer readable program code for providing a read address for the command queue and a write address for the command queue;
wherein the computer readable program code for loading the command block into the command queue using the host processor comprises computer readable program code for loading the command block into the command queue using the host processor beginning at the write address, and wherein the computer readable program code for executing the command block using the cryptographic processor comprises computer readable program code for executing the command block using the cryptographic processor beginning at the read address.
63. A computer program product as recited in claim 62, wherein the computer readable program code for loading the command block into the command queue using the host processor beginning at the write address comprises:
computer readable program code for determining if the write address plus an amount corresponding to a size of a single command block equals the read address; and
computer readable program code for loading the command block into the command queue using the host processor beginning at the write address if the write address plus the amount corresponding to the size of the single command block does not equal the read address.
64. A computer program product as recited in claim 63, further comprising:
computer readable program code for incrementing the write address by the amount corresponding to the size of a single command block using the host processor if the write address plus the amount corresponding to the size of the single command block does not equal the read address, the computer readable program code for incrementing being responsive to the computer readable program code for loading the command block into the command queue using the host processor beginning at the write address.
65. A computer program product as recited in claim 62, wherein the computer readable program code for executing the command block using the cryptographic processor beginning at the read address comprises:
computer readable program code for determining whether the read address is equal to the write address; and
computer readable program code for executing the command block using the cryptographic processor beginning at the read address if the read address is not equal to the write address.
66. A computer program product as recited in claim 65, further comprising:
computer readable program code for incrementing the read address by an amount corresponding to a size of a single command block using the cryptographic processor, the computer readable program code for incrementing being responsive to the computer readable program code for executing the command block using the cryptographic processor beginning at the read address.
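The queue discipline recited in claims 62 through 66 can be sketched as a small circular buffer. This is only an illustrative sketch under stated assumptions, not the applicants' implementation: the class name `CommandQueue`, the method names, the byte-array backing store, and the wrap-around of addresses at the end of the queue are all hypothetical (the claims say only that the addresses are incremented by the size of one command block).

```python
class CommandQueue:
    """Fixed-size circular queue of equally sized command blocks (illustrative)."""

    def __init__(self, slots, block_size):
        self.block_size = block_size
        self.capacity = slots * block_size   # queue size in bytes (assumed layout)
        self.memory = bytearray(self.capacity)
        self.read_addr = 0    # advanced only by the cryptographic processor
        self.write_addr = 0   # advanced only by the host processor

    def _advance(self, addr):
        # Wrap-around is an assumption; the claims only recite "incrementing".
        return (addr + self.block_size) % self.capacity

    def host_load(self, block):
        """Host side: load a block at the write address unless the queue is full."""
        if len(block) != self.block_size:
            raise ValueError("command block has the wrong size")
        if self._advance(self.write_addr) == self.read_addr:
            return False  # write address + block size == read address: full
        start = self.write_addr
        self.memory[start:start + self.block_size] = block
        self.write_addr = self._advance(self.write_addr)
        return True

    def crypto_execute(self, execute):
        """Cryptographic processor side: execute the block at the read address."""
        if self.read_addr == self.write_addr:
            return False  # read address == write address: nothing to execute
        start = self.read_addr
        execute(bytes(self.memory[start:start + self.block_size]))
        self.read_addr = self._advance(self.read_addr)
        return True
```

Because the full test compares the write address plus one block against the read address, one slot always stays unused; in this sketch a queue of four slots holds at most three pending command blocks. Each address has a single writer, so the host and the cryptographic processor never update the same pointer.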
67. A computer program product as recited in claim 61, wherein the computer readable program code for notifying the host processor that the command block has been executed comprises computer readable program code for invoking an interrupt using the cryptographic processor after executing the command block.
68. A computer program product as recited in claim 61, wherein the computer readable program code for notifying the host processor that the command block has been executed comprises computer readable program code for updating a completion field in the command block using the cryptographic processor.
69. A computer program product as recited in claim 68, further comprising:
computer readable program code for providing a periodic interrupt; and
computer readable program code for reading the completion field using the host processor upon invocation of the periodic interrupt.
70. A computer program product as recited in claim 61, wherein the computer readable program code for notifying the host processor that the command block has been executed comprises:
computer readable program code for setting a timer after loading the command block into the command queue using the host processor; and
computer readable program code for checking whether the command block has been executed after expiration of the timer.
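The non-interrupt notification paths of claims 68 through 70 (a completion field written by the cryptographic processor and a host-side timer) might look roughly like the following. It is a hedged sketch only: `CommandBlock`, `crypto_execute`, `host_wait`, the boolean `completed` field, and the timer values are hypothetical names and parameters chosen for illustration, and a software thread stands in for the cryptographic processor.

```python
import threading
import time
from dataclasses import dataclass


@dataclass
class CommandBlock:
    opcode: str
    completed: bool = False   # "completion field", written by the cryptographic processor


def crypto_execute(block):
    """Stand-in for the cryptographic processor: do the work, then mark completion."""
    # ... the requested operation would run here ...
    block.completed = True


def host_wait(block, timer_s=0.01, max_checks=100):
    """Host side: set a timer and check the completion field each time it expires."""
    for _ in range(max_checks):
        time.sleep(timer_s)      # timer set after the block was loaded
        if block.completed:      # checked only after the timer expires
            return True
    return False                 # still not executed after max_checks timers
```

A host would typically call `host_wait` right after loading the block into the command queue. Polling this way trades some latency for fewer interrupts, which is presumably why the claims also recite an interrupt-driven alternative.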
71. A computer program product as recited in claim 61, further comprising:
computer readable program code for loading at least one operand from the command queue to the local memory;
computer readable program code for performing at least one operation on the at least one operand to generate a result in the local memory; and
computer readable program code for storing the result generated in the local memory in the command queue.
72. A computer program product for operating a cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that is coupled to the host processor and the system memory, the computer program product comprising:
a computer readable program medium having computer readable program code embodied therein, the computer readable program code comprising:
computer readable program code for providing a command queue in the system memory;
computer readable program code for loading a command block into the command queue using the host processor;
computer readable program code for setting a value of an interrupt field in the command block to request an interrupt when the command block has been executed;
computer readable program code for executing the command block using the cryptographic processor; and
computer readable program code for invoking an interrupt using the cryptographic processor after executing the command block if the interrupt field in the command block is set to the value to request the interrupt.
73. A computer program product as recited in claim 72, further comprising:
computer readable program code for storing error information in the command block that is associated with executing the command block using the cryptographic processor.
74. A computer program product for operating a cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that is coupled to the host processor and the system memory, the computer program product comprising:
a computer readable program medium having computer readable program code embodied therein, the computer readable program code comprising:
computer readable program code for providing a command queue in the system memory;
computer readable program code for loading a command block into the command queue using the host processor;
computer readable program code for executing the command block using the cryptographic processor; and
computer readable program code for updating a completion field in the command block using the cryptographic processor.
75. A computer program product as recited in claim 74, further comprising:
computer readable program code for providing a periodic interrupt; and
computer readable program code for reading the completion field using the host processor upon invocation of the periodic interrupt.
76. A computer program product as recited in claim 74, further comprising:
computer readable program code for storing error information in the command block that is associated with executing the command block using the cryptographic processor.
77. A computer program product for operating a data processing system that comprises a host processor, a system memory coupled to the host processor, and an adjunct processor integrated circuit that is coupled to the host processor and the system memory, the computer program product comprising:
a computer readable program medium having computer readable program code embodied therein, the computer readable program code comprising:
computer readable program code for providing a command queue in the system memory;
computer readable program code for loading a command block into the command queue using the host processor, the command block comprising an input data field that contains input data;
computer readable program code for performing an operation based on the input data using the adjunct processor to generate a result; and
computer readable program code for storing the result in the input data field such that at least a portion of the input data is overwritten.
78. A computer program product as recited in claim 77, wherein the data processing system comprises a cryptographic data processing system, the adjunct processor integrated circuit comprises a cryptographic processor integrated circuit, and the computer readable program code for performing the operation based on the input data comprises:
computer readable program code for performing a hash operation based on the input data using the cryptographic processor to generate a hash value.
79. A computer program product as recited in claim 78, wherein the computer readable program code for storing the result in the input data field comprises:
computer readable program code for storing the hash value in the input data field such that the at least a portion of the input data is overwritten.
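The in-place result storage of claims 77 through 79 can be illustrated with a short sketch in which the hash value overwrites the beginning of the input data field. The function name `execute_hash_in_place`, the use of SHA-1, and the choice to write the digest at offset zero are assumptions made for illustration; the claims do not name a hash algorithm or an offset.

```python
import hashlib


def execute_hash_in_place(input_field: bytearray) -> int:
    """Hash the input data field, then overwrite its start with the digest.

    Returns the digest length so the caller knows how many bytes hold the result.
    """
    digest = hashlib.sha1(bytes(input_field)).digest()  # SHA-1 is an assumption
    if len(input_field) < len(digest):
        raise ValueError("input data field is too small to hold the hash value")
    input_field[:len(digest)] = digest   # overwrites part of the input data
    return len(digest)
```

Reusing the input data field means the command block needs no separate output field for the hash, at the cost of destroying the overwritten portion of the input.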
80. A computer program product as recited in claim 78, wherein the command block further comprises an input pointer field that contains an address in the system memory of an incoming packet and wherein the computer readable program code for performing the hash operation comprises:
computer readable program code for performing the hash operation based on the input data and the incoming packet using the cryptographic processor to generate the hash value.
81. A computer program product as recited in claim 80, wherein the command block further comprises an output pointer field that contains an address in the system memory for storing a decrypted packet, the computer program product further comprising:
computer readable program code for decrypting the incoming packet using the cryptographic processor to generate the decrypted packet;
computer readable program code for attaching the hash value to the decrypted packet; and
computer readable program code for storing the decrypted packet with the attached hash value at the address in the system memory contained in the output pointer field.
82. A computer program product for operating a cryptographic data processing system that comprises a host processor, a system memory coupled to the host processor, and a cryptographic processor integrated circuit that comprises a local memory and is coupled to the host processor and the system memory, the computer program product comprising:
a computer readable program medium having computer readable program code embodied therein, the computer readable program code comprising:
computer readable program code for providing a command queue in the system memory;
computer readable program code for providing a read address for the command queue and a write address for the command queue;
computer readable program code for loading a random number sample into the command queue using the cryptographic processor beginning at the write address; and
computer readable program code for reading the random number sample using the host processor beginning at the read address.
83. A computer program product as recited in claim 82, wherein the computer readable program code for loading the random number sample into the command queue using the cryptographic processor beginning at the write address comprises:
computer readable program code for determining if the write address plus an amount corresponding to a size of a single random number sample equals the read address; and
computer readable program code for loading the random number sample into the command queue using the cryptographic processor beginning at the write address if the write address plus the amount corresponding to the size of the single random number sample does not equal the read address.
84. A computer program product as recited in claim 83, further comprising:
computer readable program code for incrementing the write address by the amount corresponding to the size of a single random number sample using the cryptographic processor if the write address plus the amount corresponding to the size of the single random number sample does not equal the read address, the computer readable program code for incrementing being responsive to the computer readable program code for loading the random number sample into the command queue using the cryptographic processor beginning at the write address.
85. A computer program product as recited in claim 82, wherein the computer readable program code for reading the random number sample using the host processor beginning at the read address comprises:
computer readable program code for determining whether the read address is equal to the write address; and
computer readable program code for reading the random number sample using the host processor beginning at the read address if the read address is not equal to the write address.
86. A computer program product as recited in claim 85, further comprising:
computer readable program code for incrementing the read address by an amount corresponding to a size of a single random number sample using the host processor, the computer readable program code for incrementing being responsive to the computer readable program code for reading the random number sample using the host processor beginning at the read address.
US09/852,562 2000-05-11 2001-05-10 Cryptographic data processing systems, computer program products, and methods of operating same in which a system memory is used to transfer information between a host processor and an adjunct processor Abandoned US20010042210A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US09/852,562 US20010042210A1 (en) 2000-05-11 2001-05-10 Cryptographic data processing systems, computer program products, and methods of operating same in which a system memory is used to transfer information between a host processor and an adjunct processor

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US20340900P 2000-05-11 2000-05-11
US09/852,562 US20010042210A1 (en) 2000-05-11 2001-05-10 Cryptographic data processing systems, computer program products, and methods of operating same in which a system memory is used to transfer information between a host processor and an adjunct processor

Publications (1)

Publication Number Publication Date
US20010042210A1 (en) 2001-11-15

Family

ID=22753864

Family Applications (4)

Application Number Title Priority Date Filing Date
US09/849,853 Expired - Fee Related US6820105B2 (en) 2000-05-11 2001-05-04 Accelerated montgomery exponentiation using plural multipliers
US09/849,667 Expired - Fee Related US6691143B2 (en) 2000-05-11 2001-05-04 Accelerated montgomery multiplication using plural multipliers
US09/852,937 Abandoned US20020004904A1 (en) 2000-05-11 2001-05-10 Cryptographic data processing systems, computer program products, and methods of operating same in which multiple cryptographic execution units execute commands from a host processor in parallel
US09/852,562 Abandoned US20010042210A1 (en) 2000-05-11 2001-05-10 Cryptographic data processing systems, computer program products, and methods of operating same in which a system memory is used to transfer information between a host processor and an adjunct processor

Country Status (4)

Country Link
US (4) US6820105B2 (en)
EP (1) EP1405170A2 (en)
AU (2) AU2001290508A1 (en)
WO (2) WO2001088692A2 (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030072454A1 (en) * 2001-10-11 2003-04-17 Krawetz Neal A. System and method for secure data transmission
EP1738510A2 (en) * 2004-03-23 2007-01-03 Texas Instruments Incorporated Hybrid cryptographic accelerator and method of operation thereof
US20080063183A1 (en) * 2006-09-07 2008-03-13 International Business Machines Corporation Maintaining encryption key integrity
US20080126753A1 (en) * 2006-09-25 2008-05-29 Mediatek Inc. Embedded system and operating method thereof
US20140189332A1 (en) * 2012-12-28 2014-07-03 Oren Ben-Kiki Apparatus and method for low-latency invocation of accelerators
US20150110114A1 (en) * 2013-10-17 2015-04-23 Marvell Israel (M.I.S.L) Ltd. Processing Concurrency in a Network Device
US9417873B2 (en) 2012-12-28 2016-08-16 Intel Corporation Apparatus and method for a hybrid latency-throughput processor
US20160242028A1 (en) * 2012-10-30 2016-08-18 Kt Corporation Security management in m2m area network
US9455907B1 (en) 2012-11-29 2016-09-27 Marvell Israel (M.I.S.L) Ltd. Multithreaded parallel packet processing in network devices
EP3094039A1 (en) * 2015-05-13 2016-11-16 Gemalto Sa Method for optimizing the execution of a function which generates at least one key within an integrated circuit device
US9542193B2 (en) 2012-12-28 2017-01-10 Intel Corporation Memory address collision detection of ordered parallel threads with bloom filters
US20170141911A1 (en) * 2015-11-13 2017-05-18 Nxp B.V. Split-and-merge approach to protect against dfa attacks
US10126952B2 (en) 2015-11-05 2018-11-13 International Business Machines Corporation Memory move instruction sequence targeting a memory-mapped device
US10140129B2 (en) 2012-12-28 2018-11-27 Intel Corporation Processing core having shared front end unit
US10140052B2 (en) 2015-11-05 2018-11-27 International Business Machines Corporation Memory access in a data processing system utilizing copy and paste instructions
US10152322B2 (en) 2015-11-05 2018-12-11 International Business Machines Corporation Memory move instruction sequence including a stream of copy-type and paste-type instructions
US10243937B2 (en) * 2016-07-08 2019-03-26 Nxp B.V. Equality check implemented with secret sharing
US10241945B2 (en) 2015-11-05 2019-03-26 International Business Machines Corporation Memory move supporting speculative acquisition of source and destination data granules including copy-type and paste-type instructions
US10346195B2 (en) 2012-12-29 2019-07-09 Intel Corporation Apparatus and method for invocation of a multi threaded accelerator
US10346164B2 (en) * 2015-11-05 2019-07-09 International Business Machines Corporation Memory move instruction sequence targeting an accelerator switchboard
US11455257B2 (en) * 2019-04-07 2022-09-27 Intel Corporation Ultra-secure accelerators

Families Citing this family (78)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7600131B1 (en) 1999-07-08 2009-10-06 Broadcom Corporation Distributed processing in a cryptography acceleration chip
US7240204B1 (en) * 2000-03-31 2007-07-03 State Of Oregon Acting By And Through The State Board Of Higher Education On Behalf Of Oregon State University Scalable and unified multiplication methods and apparatus
US6820105B2 (en) * 2000-05-11 2004-11-16 Cyberguard Corporation Accelerated montgomery exponentiation using plural multipliers
JP3785044B2 (en) * 2001-01-22 2006-06-14 株式会社東芝 Power residue calculation device, power residue calculation method, and recording medium
US6836839B2 (en) * 2001-03-22 2004-12-28 Quicksilver Technology, Inc. Adaptive integrated circuitry with heterogeneous and reconfigurable matrices of diverse and adaptive computational units having fixed, application specific computational elements
US7752419B1 (en) 2001-03-22 2010-07-06 Qst Holdings, Llc Method and system for managing hardware resources to implement system functions using an adaptive computing architecture
US7400668B2 (en) * 2001-03-22 2008-07-15 Qst Holdings, Llc Method and system for implementing a system acquisition function for use with a communication device
US7489779B2 (en) * 2001-03-22 2009-02-10 Qstholdings, Llc Hardware implementation of the secure hash standard
US7962716B2 (en) 2001-03-22 2011-06-14 Qst Holdings, Inc. Adaptive integrated circuitry with heterogeneous and reconfigurable matrices of diverse and adaptive computational units having fixed, application specific computational elements
US20040133745A1 (en) * 2002-10-28 2004-07-08 Quicksilver Technology, Inc. Adaptable datapath for a digital processing system
US7653710B2 (en) * 2002-06-25 2010-01-26 Qst Holdings, Llc. Hardware task manager
US6577678B2 (en) 2001-05-08 2003-06-10 Quicksilver Technology Method and system for reconfigurable channel coding
JP2002358010A (en) * 2001-05-31 2002-12-13 Mitsubishi Electric Corp Exponentiation remainder computing element
US7194088B2 (en) * 2001-06-08 2007-03-20 Corrent Corporation Method and system for a full-adder post processor for modulo arithmetic
US7240203B2 (en) * 2001-07-24 2007-07-03 Cavium Networks, Inc. Method and apparatus for establishing secure sessions
US20030048908A1 (en) * 2001-08-31 2003-03-13 Hamilton Jon W. System and method for protecting the content of digital cinema products
DE10151129B4 (en) * 2001-10-17 2004-07-29 Infineon Technologies Ag Method and device for calculating a result of an exponentiation in a cryptography circuit
US7046635B2 (en) * 2001-11-28 2006-05-16 Quicksilver Technology, Inc. System for authorizing functionality in adaptable hardware devices
US6986021B2 (en) 2001-11-30 2006-01-10 Quick Silver Technology, Inc. Apparatus, method, system and executable module for configuration and operation of adaptive integrated circuitry having fixed, application specific computational elements
US8412915B2 (en) 2001-11-30 2013-04-02 Altera Corporation Apparatus, system and method for configuration of adaptive integrated circuitry having heterogeneous computational elements
US7602740B2 (en) * 2001-12-10 2009-10-13 Qst Holdings, Inc. System for adapting device standards after manufacture
US7215701B2 (en) 2001-12-12 2007-05-08 Sharad Sambhwani Low I/O bandwidth method and system for implementing detection and identification of scrambling codes
US7088825B2 (en) * 2001-12-12 2006-08-08 Quicksilver Technology, Inc. Low I/O bandwidth method and system for implementing detection and identification of scrambling codes
US7231508B2 (en) * 2001-12-13 2007-06-12 Quicksilver Technologies Configurable finite state machine for operation of microinstruction providing execution enable control value
GB2383435A (en) * 2001-12-18 2003-06-25 Automatic Parallel Designs Ltd Logic circuit for performing modular multiplication and exponentiation
US7403981B2 (en) * 2002-01-04 2008-07-22 Quicksilver Technology, Inc. Apparatus and method for adaptive multimedia reception and transmission in communication environments
JP2003241659A (en) * 2002-02-22 2003-08-29 Hitachi Ltd Information processing method
US7305567B1 (en) 2002-03-01 2007-12-04 Cavium Networks, In. Decoupled architecture for data ciphering operations
US7260217B1 (en) * 2002-03-01 2007-08-21 Cavium Networks, Inc. Speculative execution for data ciphering operations
US7346159B2 (en) * 2002-05-01 2008-03-18 Sun Microsystems, Inc. Generic modular multiplier using partial reduction
US7328414B1 (en) * 2003-05-13 2008-02-05 Qst Holdings, Llc Method and system for creating and programming an adaptive computing engine
US7660984B1 (en) 2003-05-13 2010-02-09 Quicksilver Technology Method and system for achieving individualized protected space in an operating system
US8108656B2 (en) 2002-08-29 2012-01-31 Qst Holdings, Llc Task definition for specifying resource requirements
JP4360792B2 (en) * 2002-09-30 2009-11-11 株式会社ルネサステクノロジ Power-residue calculator
US7937591B1 (en) 2002-10-25 2011-05-03 Qst Holdings, Llc Method and system for providing a device which can be adapted on an ongoing basis
US8276135B2 (en) 2002-11-07 2012-09-25 Qst Holdings Llc Profiling of software and circuit designs utilizing data operation analyses
US7225301B2 (en) 2002-11-22 2007-05-29 Quicksilver Technologies External memory controller node
US7434043B2 (en) * 2002-12-18 2008-10-07 Broadcom Corporation Cryptography accelerator data routing unit
US7191341B2 (en) * 2002-12-18 2007-03-13 Broadcom Corporation Methods and apparatus for ordering data in a cryptography accelerator
US20040123123A1 (en) * 2002-12-18 2004-06-24 Buer Mark L. Methods and apparatus for accessing security association information in a cryptography accelerator
US20040123120A1 (en) * 2002-12-18 2004-06-24 Broadcom Corporation Cryptography accelerator input interface data handling
US7568110B2 (en) * 2002-12-18 2009-07-28 Broadcom Corporation Cryptography accelerator interface decoupling from cryptography processing cores
US7260595B2 (en) * 2002-12-23 2007-08-21 Arithmatica Limited Logic circuit and method for carry and sum generation and method of designing such a logic circuit
US20040230813A1 (en) * 2003-05-12 2004-11-18 International Business Machines Corporation Cryptographic coprocessor on a general purpose microprocessor
GB0314557D0 (en) * 2003-06-21 2003-07-30 Koninkl Philips Electronics Nv Improved reduction calculations
US7609297B2 (en) * 2003-06-25 2009-10-27 Qst Holdings, Inc. Configurable hardware based digital imaging apparatus
US8194855B2 (en) * 2003-06-30 2012-06-05 Oracle America, Inc. Method and apparatus for implementing processor instructions for accelerating public-key cryptography
US7200837B2 (en) * 2003-08-21 2007-04-03 Qst Holdings, Llc System, method and software for static and dynamic programming and configuration of an adaptive computing architecture
US20050157872A1 (en) * 2003-11-12 2005-07-21 Takatoshi Ono RSA public key generation apparatus, RSA decryption apparatus, and RSA signature apparatus
US8526601B2 (en) * 2004-04-05 2013-09-03 Advanced Micro Devices, Inc. Method of improving operational speed of encryption engine
WO2005109221A2 (en) * 2004-05-03 2005-11-17 Silicon Optix A bit serial processing element for a simd array processor
US7519644B2 (en) * 2004-05-27 2009-04-14 King Fahd University Of Petroleum And Minerals Finite field serial-serial multiplication/reduction structure and method
US20060026601A1 (en) * 2004-07-29 2006-02-02 Solt David G Jr Executing commands on a plurality of processes
US7496753B2 (en) * 2004-09-02 2009-02-24 International Business Machines Corporation Data encryption interface for reducing encrypt latency impact on standard traffic
US7668895B2 (en) * 2004-12-01 2010-02-23 Integrated System Solution Corp. Galois field computation
US20060136717A1 (en) 2004-12-20 2006-06-22 Mark Buer System and method for authentication via a proximate device
US8295484B2 (en) 2004-12-21 2012-10-23 Broadcom Corporation System and method for securing data from a remote input device
US7602655B2 (en) * 2006-01-12 2009-10-13 Mediatek Inc. Embedded system
US8364965B2 (en) * 2006-03-15 2013-01-29 Apple Inc. Optimized integrity verification procedures
US8135960B2 (en) * 2007-10-30 2012-03-13 International Business Machines Corporation Multiprocessor electronic circuit including a plurality of processors and electronic data processing system
US20090183161A1 (en) * 2008-01-16 2009-07-16 Pasi Kolinummi Co-processor for stream data processing
US20090234866A1 (en) * 2008-03-17 2009-09-17 Paul Caprioli Floating Point Unit and Cryptographic Unit Having a Shared Multiplier Tree
US20100011047A1 (en) * 2008-07-09 2010-01-14 Viasat, Inc. Hardware-Based Cryptographic Accelerator
US8356185B2 (en) * 2009-10-08 2013-01-15 Oracle America, Inc. Apparatus and method for local operand bypassing for cryptographic instructions
JP5990466B2 (en) 2010-01-21 2016-09-14 スビラル・インコーポレーテッド Method and apparatus for a general purpose multi-core system for implementing stream-based operations
US8560814B2 (en) 2010-05-04 2013-10-15 Oracle International Corporation Thread fairness on a multi-threaded processor with multi-cycle cryptographic operations
US8583902B2 (en) 2010-05-07 2013-11-12 Oracle International Corporation Instruction support for performing montgomery multiplication
WO2013036217A1 (en) * 2011-09-06 2013-03-14 Intel Corporation Number squaring computer-implemented method and apparatus
US8799343B2 (en) * 2011-09-22 2014-08-05 Intel Corporation Modular exponentiation with partitioned and scattered storage of Montgomery Multiplication results
US8856479B2 (en) 2012-04-20 2014-10-07 International Business Machines Corporation Implementing storage adapter performance optimization with hardware operations completion coalescence
US9355068B2 (en) 2012-06-29 2016-05-31 Intel Corporation Vector multiplication with operand base system conversion and re-conversion
US10095516B2 (en) 2012-06-29 2018-10-09 Intel Corporation Vector multiplication with accumulation in large register space
GB2511843B (en) * 2013-03-15 2015-05-13 Eisergy Ltd A power factor correction circuit
JP6102649B2 (en) * 2013-09-13 2017-03-29 株式会社ソシオネクスト Arithmetic circuit and control method of arithmetic circuit
US9531531B2 (en) * 2015-05-06 2016-12-27 Qualcomm Incorporated Methods and devices for fixed execution flow multiplier recoding and scalar multiplication
WO2021217034A1 (en) * 2020-04-23 2021-10-28 University Of Southern California Design of high-performance and scalable montgomery modular multiplier circuits
US20220121424A1 (en) * 2020-10-21 2022-04-21 PUFsecurity Corporation Device and Method of Handling a Modular Multiplication
US11210067B1 (en) 2020-11-27 2021-12-28 Pqsecure Technologies, Llc Architecture for small and efficient modular multiplication using carry-save adders

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4763242A (en) * 1985-10-23 1988-08-09 Hewlett-Packard Company Computer providing flexible processor extension, flexible instruction set extension, and implicit emulation for upward software compatibility
US5329623A (en) * 1992-06-17 1994-07-12 The Trustees Of The University Of Pennsylvania Apparatus for providing cryptographic support in a network
US5706489A (en) * 1995-10-18 1998-01-06 International Business Machines Corporation Method for a CPU to utilize a parallel instruction execution processing facility for assisting in the processing of the accessed data
US5961626A (en) * 1997-10-10 1999-10-05 Motorola, Inc. Method and processing interface for transferring data between host systems and a packetized processing system
US6075546A (en) * 1997-11-10 2000-06-13 Silicon Grahphics, Inc. Packetized command interface to graphics processor
US6081895A (en) * 1997-10-10 2000-06-27 Motorola, Inc. Method and system for managing data unit processing
US6219789B1 (en) * 1995-07-20 2001-04-17 Dallas Semiconductor Corporation Microprocessor with coprocessing capabilities for secure transactions and quick clearing capabilities

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ATE183315T1 (en) 1991-09-05 1999-08-15 Canon Kk METHOD AND DEVICE FOR ENCRYPTING AND DECRYPTING COMMUNICATION DATA
US5274707A (en) 1991-12-06 1993-12-28 Roger Schlafly Modular exponentiation and reduction device and method
US5513133A (en) 1992-11-30 1996-04-30 Fortress U&T Ltd. Compact microelectronic device for performing modular multiplication and exponentiation over large numbers
EP0656709B1 (en) 1993-11-30 2005-07-13 Canon Kabushiki Kaisha Encryption device and apparatus for encryption/decryption based on the Montgomery method using efficient modular multiplication
US5844986A (en) * 1996-09-30 1998-12-01 Intel Corporation Secure BIOS
DE69727796T2 (en) * 1996-10-31 2004-12-30 Atmel Research Coprocessor for performing modular multiplication
WO1998050851A1 (en) 1997-05-04 1998-11-12 Fortress U & T Ltd. Improved apparatus & method for modular multiplication & exponentiation based on montgomery multiplication
US5987131A (en) 1997-08-18 1999-11-16 Picturetel Corporation Cryptographic key exchange using pre-computation
US6061706A (en) 1997-10-10 2000-05-09 United Microelectronics Corp. Systolic linear-array modular multiplier with pipeline processing elements
US6029170A (en) * 1997-11-25 2000-02-22 International Business Machines Corporation Hybrid tree array data structure and method
US6085210A (en) 1998-01-22 2000-07-04 Philips Semiconductor, Inc. High-speed modular exponentiator and multiplier
DE69828150T2 (en) * 1998-03-30 2005-12-15 Rainbow Technologies Inc., Irvine Computationally efficient modular multiplication method and device
US6240436B1 (en) * 1998-03-30 2001-05-29 Rainbow Technologies, Inc. High speed montgomery value calculation
US6567911B1 (en) * 1999-12-06 2003-05-20 Adaptec, Inc. Method of conserving memory resources during execution of system BIOS
US6820105B2 (en) * 2000-05-11 2004-11-16 Cyberguard Corporation Accelerated montgomery exponentiation using plural multipliers
US6763365B2 (en) 2000-12-19 2004-07-13 International Business Machines Corporation Hardware implementation for modular multiplication using a plurality of almost entirely identical processor elements

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8117450B2 (en) * 2001-10-11 2012-02-14 Hewlett-Packard Development Company, L.P. System and method for secure data transmission
US20030072454A1 (en) * 2001-10-11 2003-04-17 Krawetz Neal A. System and method for secure data transmission
EP1738510A2 (en) * 2004-03-23 2007-01-03 Texas Instruments Incorporated Hybrid cryptographic accelerator and method of operation thereof
EP1738510A4 (en) * 2004-03-23 2010-08-11 Texas Instruments Inc Hybrid cryptographic accelerator and method of operation thereof
US20080063183A1 (en) * 2006-09-07 2008-03-13 International Business Machines Corporation Maintaining encryption key integrity
US7817799B2 (en) * 2006-09-07 2010-10-19 International Business Machines Corporation Maintaining encryption key integrity
US20080126753A1 (en) * 2006-09-25 2008-05-29 Mediatek Inc. Embedded system and operating method thereof
US20160242028A1 (en) * 2012-10-30 2016-08-18 Kt Corporation Security management in m2m area network
US9986428B2 (en) * 2012-10-30 2018-05-29 Kt Corporation Security management in M2M area network
US9455907B1 (en) 2012-11-29 2016-09-27 Marvell Israel (M.I.S.L) Ltd. Multithreaded parallel packet processing in network devices
US10140129B2 (en) 2012-12-28 2018-11-27 Intel Corporation Processing core having shared front end unit
US10089113B2 (en) 2012-12-28 2018-10-02 Intel Corporation Apparatus and method for low-latency invocation of accelerators
US9361116B2 (en) * 2012-12-28 2016-06-07 Intel Corporation Apparatus and method for low-latency invocation of accelerators
US10664284B2 (en) 2012-12-28 2020-05-26 Intel Corporation Apparatus and method for a hybrid latency-throughput processor
US10255077B2 (en) 2012-12-28 2019-04-09 Intel Corporation Apparatus and method for a hybrid latency-throughput processor
US20140189332A1 (en) * 2012-12-28 2014-07-03 Oren Ben-Kiki Apparatus and method for low-latency invocation of accelerators
US10101999B2 (en) 2012-12-28 2018-10-16 Intel Corporation Memory address collision detection of ordered parallel threads with bloom filters
US9542193B2 (en) 2012-12-28 2017-01-10 Intel Corporation Memory address collision detection of ordered parallel threads with bloom filters
US10095521B2 (en) 2012-12-28 2018-10-09 Intel Corporation Apparatus and method for low-latency invocation of accelerators
US9417873B2 (en) 2012-12-28 2016-08-16 Intel Corporation Apparatus and method for a hybrid latency-throughput processor
US10083037B2 (en) 2012-12-28 2018-09-25 Intel Corporation Apparatus and method for low-latency invocation of accelerators
US10346195B2 (en) 2012-12-29 2019-07-09 Intel Corporation Apparatus and method for invocation of a multi threaded accelerator
US9467399B2 (en) * 2013-10-17 2016-10-11 Marvell World Trade Ltd. Processing concurrency in a network device
US20150110114A1 (en) * 2013-10-17 2015-04-23 Marvell Israel (M.I.S.L) Ltd. Processing Concurrency in a Network Device
US9461939B2 (en) 2013-10-17 2016-10-04 Marvell World Trade Ltd. Processing concurrency in a network device
WO2016180710A1 (en) * 2015-05-13 2016-11-17 Gemalto Sa Method for optimizing the execution of a function which generates at least one key within an integrated circuit device
EP3094039A1 (en) * 2015-05-13 2016-11-16 Gemalto Sa Method for optimizing the execution of a function which generates at least one key within an integrated circuit device
RU2703347C2 (en) * 2015-05-13 2019-10-16 Жемальто Са Method for optimizing execution of function which generates at least one key in device on integrated circuit
US10126952B2 (en) 2015-11-05 2018-11-13 International Business Machines Corporation Memory move instruction sequence targeting a memory-mapped device
US10140052B2 (en) 2015-11-05 2018-11-27 International Business Machines Corporation Memory access in a data processing system utilizing copy and paste instructions
US10152322B2 (en) 2015-11-05 2018-12-11 International Business Machines Corporation Memory move instruction sequence including a stream of copy-type and paste-type instructions
US10241945B2 (en) 2015-11-05 2019-03-26 International Business Machines Corporation Memory move supporting speculative acquisition of source and destination data granules including copy-type and paste-type instructions
US10346164B2 (en) * 2015-11-05 2019-07-09 International Business Machines Corporation Memory move instruction sequence targeting an accelerator switchboard
US10020932B2 (en) * 2015-11-13 2018-07-10 Nxp B.V. Split-and-merge approach to protect against DFA attacks
US20170141911A1 (en) * 2015-11-13 2017-05-18 Nxp B.V. Split-and-merge approach to protect against dfa attacks
CN106953723A (en) * 2015-11-13 2017-07-14 NXP B.V. (恩智浦有限公司) Split-and-merge method for protecting against DFA attacks
US10243937B2 (en) * 2016-07-08 2019-03-26 Nxp B.V. Equality check implemented with secret sharing
US11455257B2 (en) * 2019-04-07 2022-09-27 Intel Corporation Ultra-secure accelerators

Also Published As

Publication number Publication date
AU2001286382A1 (en) 2001-11-26
US6691143B2 (en) 2004-02-10
WO2001093012A3 (en) 2002-04-25
WO2001088692A3 (en) 2002-04-18
WO2001093012A2 (en) 2001-12-06
EP1405170A2 (en) 2004-04-07
WO2001088692A2 (en) 2001-11-22
US20020010730A1 (en) 2002-01-24
US6820105B2 (en) 2004-11-16
AU2001290508A1 (en) 2001-12-11
US20020013799A1 (en) 2002-01-31
US20020004904A1 (en) 2002-01-10

Similar Documents

Publication Publication Date Title
US20010042210A1 (en) Cryptographic data processing systems, computer program products, and methods of operating same in which a system memory is used to transfer information between a host processor and an adjunct processor
TWI747933B (en) Hardware accelerators and methods for offload operations
US10558580B2 (en) Methods and apparatus for loading firmware on demand
KR101764187B1 (en) Apparatus and method for low-latency invocation of accelerators
US5388237A (en) Method of and apparatus for interleaving multiple-channel DMA operations
US7590774B2 (en) Method and system for efficient context swapping
JP2741594B2 (en) Execution device for I / O processor
JP2021174506A (en) Microprocessor with pipeline control for executing instruction in preset future time
US20070186077A1 (en) System and Method for Executing Instructions Utilizing a Preferred Slot Alignment Mechanism
JP2006338664A (en) System for performing code during operating system initialization
KR100570138B1 (en) System and method for loading software on a plurality of processors
IE990754A1 (en) An apparatus for software initiated prefetch and method therefor
US5805930A (en) System for FIFO informing the availability of stages to store commands which include data and virtual address sent directly from application programs
JP4226085B2 (en) Microprocessor and multiprocessor system
WO2012054159A1 (en) Memories and methods for performing atomic memory operations in accordance with configuration information
US8190794B2 (en) Control function for memory based buffers
CN111381870A (en) Hardware processor and method for extended microcode patching
US6438683B1 (en) Technique using FIFO memory for booting a programmable microprocessor from a host computer
JP2001290706A (en) Prefetch for tlb cache
JP2003521034A (en) Microprocessor system and method of operating the same
JP4130465B2 (en) Technology for executing atomic processing on processors with different memory transfer processing sizes
TW202314497A (en) Circuitry and methods for accelerating streaming data-transformation operations
WO2001038970A2 (en) Buffer memories, methods and systems for buffering having separate buffer memories for each of a plurality of tasks
WO2001086430A2 (en) Cryptographic data processing systems, computer programs, and methods of operating same
JPH10207717A (en) Microcomputer

Legal Events

Date Code Title Description
AS Assignment

Owner name: NETOCTAVE, INC., NORTH CAROLINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:BLAKER, DAVID M.;SAVARDA, RAYMOND;HANNA, MICHAEL;REEL/FRAME:011807/0348;SIGNING DATES FROM 20010504 TO 20010508

Owner name: NETOCTAVE, INC., NORTH CAROLINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CELOTEK CORPORATION;REEL/FRAME:011807/0332

Effective date: 20000809

AS Assignment

Owner name: INTERSOUTH PARTNERS V, L.P. AS AGENT FOR THE SUCUR

Free format text: SECURITY INTEREST;ASSIGNOR:NETOCTAVE, INC.;REEL/FRAME:013268/0282

Effective date: 20020827

AS Assignment

Owner name: NETOCTAVE, INC., NORTH CAROLINA

Free format text: TERMINATION OF SECURITY INTEREST;ASSIGNOR:INTERSOUTH PARTNERS V, L.P. AS AGENT FOR THE SECURED PARTIES PURSUANT TO THE TERMINATION OF SECURITY INTEREST;REEL/FRAME:013335/0175

Effective date: 20020927

AS Assignment

Owner name: CYBERGUARD CORPORATION, FLORIDA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NETOCTAVE, INC.;REEL/FRAME:013495/0063

Effective date: 20030304

AS Assignment

Owner name: NBMK ENCRYPTION TECHNOLOGIES, INC., NORTH CAROLINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:CYBERGUARD CORPORATION;REEL/FRAME:017596/0264

Effective date: 20060421

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO PAY ISSUE FEE