CN101477458A - Hardware thread execution method based on processor and FPGA mixed structure - Google Patents

Publication number
CN101477458A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CNA2008101632107A
Other languages
Chinese (zh)
Inventor
陈天洲
严力科
陈度
王罡
冯德贵
吴斌斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2008-12-15
Filing date
2008-12-15
Publication date
2009-07-08
Application filed by Zhejiang University ZJU
Priority to CNA2008101632107A
Publication of CN101477458A

Abstract

The invention discloses a method for executing hardware threads on a hybrid processor/FPGA architecture. The method comprises the following steps: adding a hardware segment to the executable file; running hardware functions as hardware threads; and creating hardware threads to execute the compute-intensive parts of a program. The invention provides a unified thread control block for hardware threads, unifies the control interface of software and hardware threads, simplifies their control and management, and effectively improves the computing performance of the system.

Description

Hardware thread execution method based on processor and FPGA mixed architecture
Technical field
The present invention relates to the fields of reconfigurable computing, operating system design, and multithreading, and in particular to a method for providing system software support for FPGAs.
Background
In recent years, with the rapid development of microelectronics, microprocessors have steadily advanced toward more efficient architectures, multi-core designs, flexible scalability, and deeper functional integration, and considerable progress has been made. In particular, the recent development of reconfigurable hybrid processor/FPGA system architectures has brought the microprocessor field to a new stage of development.
Reconfigurable computing meets the needs of future computing: it provides a higher level of parallel processing capability more conveniently and quickly. Because functions execute in hardware, the computing performance of applications with high parallelism demands improves greatly, which makes reconfigurable computing well suited to high-speed digital signal processing, multimedia processing, and bioinformatics processing. However, operating system support for reconfigurable computing platforms is still far from complete: developers and users are required to have hardware design experience, development cycles are long, and costs are high. As a result, reconfigurable computing has not yet been widely adopted.
To manage and use the reconfigurable resources in a computer simply and efficiently, several approaches have tackled the problem from the software side, from the hardware side, and through hardware/software co-design. The BORPH operating system targets FPGA-based reconfigurable computers and encapsulates reconfigurable resources in operating system services, in the same way as CPU time and memory; developers need only concern themselves with high-level program design, which simplifies the use of reconfigurable resources. The Warp processor reduces the difficulty of using reconfigurable resources at the hardware design level: it can dynamically and automatically convert the time-consuming regions of a program into circuits implemented on reconfigurable hardware, thereby reducing execution time and energy consumption. HWTI (Hardware Thread Interface) implements hardware/software multithreading on hybrid architectures through a hardware thread interface.
Both BORPH's operating-system support and the Warp processor's hardware support use reconfigurable resources fairly efficiently. However, BORPH requires substantial changes to Linux, and the functionality it implements is quite simple. In the Warp processor, the hardware implementation of the embedded profiler and of the dynamic hardware/software partitioning module is too complex, and is therefore difficult to modify and improve. The HWTI hardware thread interface requires operating system services implemented in hardware, a condition that is hard to satisfy on general-purpose computers.
Summary of the invention
To manage and use the FPGA resources of a hybrid processor/FPGA architecture simply and efficiently, to provide a unified interface for software and hardware threads, and to improve the execution performance of applications, the object of the present invention is to provide a hardware thread execution method based on a hybrid processor and FPGA architecture.
The technical solution adopted by the present invention to solve this technical problem is:
A hardware thread execution method based on a hybrid processor and FPGA architecture, characterized in that it comprises:
1) Executable file generation
When a program is built into an executable file, the following steps are performed:
1. Add hardware sections and hardware segments to the executable file. Each hardware section consists of a section header and section content. The section header defines the type, name, start address, and content length of the hardware section, together with the section's target FPGA model; the section content holds the configuration data for the specified FPGA model. Each hardware segment contains a single hardware section;
2. Synthesize each piece of the program's hardware code separately to generate configuration data that implements the corresponding function on the specified FPGA model, and organize the data into hardware sections/segments in the format described in step 1;
3. Link the program's software sections/segments and hardware sections/segments into a single executable file. The header of the executable file describes, for each program segment, its type, its offset from the start of the file, its start address in the program address space, and its size;
2) Program execution
1. Executable file loading
The executable file contains code segments, hardware segments, and data segments. It is loaded as follows:
I. Allocate space in system memory, and load the data segments, code segments, and hardware segments into the allocated memory;
II. Following the program's execution flow, preconfigure the hardware segment that will be executed first onto the specified FPGA device;
2. Thread creation and execution
At run time, a thread is created as follows:
I. Determine the type of thread to create from the type of the loaded program segment, then generate the corresponding thread control block, i.e. a software thread control block or a hardware thread control block;
II. If a hardware thread is being created and its hardware segment is already configured on the FPGA, only the mapping from the hardware segment to the corresponding hardware thread control block needs to be built;
III. If a hardware thread is being created but its hardware segment is not yet configured on the FPGA, first configure the hardware segment onto an FPGA of the specified type, then build the mapping from the hardware segment to the hardware thread control block;
3. Thread termination
I. A hardware thread enters the finished state after completing its task;
II. If the finished thread is an independent hardware thread, its state is set to completed, and its FPGA computing resources and hardware thread control block are both reclaimed at that point;
III. If the finished thread is a dependent hardware thread, i.e. other threads are waiting for its result, the hardware thread retains the result; its FPGA computing resources and hardware thread control block are reclaimed only after the waiting threads have obtained the result.
The beneficial effects of the present invention are as follows. First, adding hardware segments to the executable file unifies the organization of software and hardware in one file and simplifies the management of software and hardware executable code. Second, running hardware functions as hardware threads provides unified management of software and hardware threads and simplifies their control. Third, following the program's execution flow, the hardware segment that executes first is preconfigured onto the FPGA while the other hardware segments are loaded and cached in system memory, which effectively reduces the transfer time of FPGA configuration data and lowers the cost of FPGA configuration. Finally, creating hardware threads to execute the compute-intensive parts of a program effectively improves the computing performance of the system.
Description of the drawings
Fig. 1 is the overall flowchart of an embodiment of the hardware thread execution method based on a hybrid processor and FPGA architecture of the present invention.
Embodiment
The specific implementation flow of the hardware thread execution method based on a hybrid processor and FPGA architecture of the present invention is as follows:
1) Executable file generation
When a program is built into an executable file, the following steps are performed:
1. Add hardware sections and hardware segments to the executable file
The ELF file format is extended by adding hardware sections and hardware segments to the ELF executable file. A section consists of a section header and section content; the section header structure is defined as follows:
typedef struct {
    Elf32_Word sh_name;
    Elf32_Word sh_type;
    Elf32_Word sh_flags;
    Elf32_Addr sh_addr;
    Elf32_Off  sh_offset;
    Elf32_Word sh_size;
    Elf32_Word sh_link;
    Elf32_Word sh_info;
    Elf32_Word sh_addralign;
    Elf32_Word sh_entsize;
} Elf32_Shdr;
When a hardware section is defined, its sh_type is SHT_HW. The sh_name field gives the section name, which in this method is .hardware. The section occupies the memory range from sh_addr to sh_addr+sh_size and stores the configuration data for the specified FPGA model. The sh_flags field holds flags such as whether the section content is writable or executable; its possible values are SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR, SHF_HARDWARE, and SHF_MASKPROC. For a section of type .hardware, sh_flags is SHF_HARDWARE. sh_offset is the offset of the start of the section content from the executable file header. The other fields are unused and are set to their default values.
Each hardware section forms a hardware segment of its own.
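As an illustration of the Elf32_Shdr field settings described above, the following minimal sketch fills in a header for one block of FPGA configuration data. The numeric values of SHT_HW and SHF_HARDWARE are assumptions (the text names the constants but gives no numbers; they are placed in the ELF OS-specific ranges here), and the helper names make_hw_section and is_hw_section are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Minimal ELF32 types, declared locally so the sketch is self-contained. */
typedef uint32_t Elf32_Word;
typedef uint32_t Elf32_Addr;
typedef uint32_t Elf32_Off;

typedef struct {
    Elf32_Word sh_name;
    Elf32_Word sh_type;
    Elf32_Word sh_flags;
    Elf32_Addr sh_addr;
    Elf32_Off  sh_offset;
    Elf32_Word sh_size;
    Elf32_Word sh_link;
    Elf32_Word sh_info;
    Elf32_Word sh_addralign;
    Elf32_Word sh_entsize;
} Elf32_Shdr;

/* Assumed values: not given in the text, chosen inside the OS-specific ranges. */
#define SHT_HW       0x60000001u
#define SHF_HARDWARE 0x00100000u

/* Build the header for one hardware section.
   name_index is the offset of ".hardware" in the section-name string table. */
Elf32_Shdr make_hw_section(Elf32_Word name_index, Elf32_Addr addr,
                           Elf32_Off offset, Elf32_Word size)
{
    Elf32_Shdr sh;
    memset(&sh, 0, sizeof sh);   /* unused fields keep the default value 0 */
    sh.sh_name   = name_index;
    sh.sh_type   = SHT_HW;
    sh.sh_flags  = SHF_HARDWARE;
    sh.sh_addr   = addr;         /* content occupies [sh_addr, sh_addr + sh_size) */
    sh.sh_offset = offset;       /* offset of the content from the file header */
    sh.sh_size   = size;
    return sh;
}

/* Return nonzero if the header describes a hardware section. */
int is_hw_section(const Elf32_Shdr *sh)
{
    assert(sh != NULL);
    return sh->sh_type == SHT_HW && (sh->sh_flags & SHF_HARDWARE) != 0;
}
```

A linker extension along these lines would call make_hw_section once per synthesized bitstream, and a loader could use is_hw_section to tell configuration data apart from ordinary code and data.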
2. Synthesize each piece of the program's hardware code separately to generate configuration data that implements the corresponding function on the specified FPGA model, and organize the data into hardware sections/segments in the format described in step 1;
3. Link the program's software sections/segments and hardware sections/segments into a single executable file;
The header of the executable file describes, for each program segment, information such as its type, its offset from the start of the file, its start address in the program address space, and its size. The header structure is as follows:
typedef struct {
    Elf32_Word p_type;
    Elf32_Off  p_offset;
    Elf32_Addr p_vaddr;
    Elf32_Addr p_paddr;
    Elf32_Word p_filesz;
    Elf32_Word p_memsz;
    Elf32_Word p_flags;
    Elf32_Word p_align;
} Elf32_Phdr;
A hardware type PT_HW, with value 6, is defined for the p_type field of this header. Because a hardware section and a hardware segment are equivalent in this method, p_offset, the offset from the start of the file to the first byte of the segment, is also the offset from the start of the file to the hardware section. p_paddr gives the address in the program address space at which the segment is to be placed. p_filesz gives the number of bytes the segment (or section) occupies in the executable file, and p_memsz gives the number of bytes it occupies in system memory.
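To show how a loader might use p_type to pick out the FPGA configuration data, the following sketch scans an array of program header entries for PT_HW. The function name next_hw_segment is hypothetical. PT_HW is set to 6 as stated in the text; note that the standard ELF specification already assigns 6 to PT_PHDR, so the sketch simply follows the text rather than a stock toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Elf32_Word;
typedef uint32_t Elf32_Addr;
typedef uint32_t Elf32_Off;

typedef struct {
    Elf32_Word p_type;
    Elf32_Off  p_offset;
    Elf32_Addr p_vaddr;
    Elf32_Addr p_paddr;
    Elf32_Word p_filesz;
    Elf32_Word p_memsz;
    Elf32_Word p_flags;
    Elf32_Word p_align;
} Elf32_Phdr;

#define PT_LOAD 1u   /* standard loadable segment */
#define PT_HW   6u   /* value given in the text */

/* Return the index of the first PT_HW entry at or after `start`,
   or -1 if there is none. */
int next_hw_segment(const Elf32_Phdr *ph, int count, int start)
{
    assert(ph != NULL && start >= 0);
    for (int i = start; i < count; i++) {
        if (ph[i].p_type == PT_HW)
            return i;
    }
    return -1;
}
```

Calling next_hw_segment repeatedly, each time starting one past the previous hit, enumerates every hardware entry; p_offset and p_filesz of each hit then locate the configuration data in the file.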
2) Program execution
1. Executable file loading
The executable file contains code segments, hardware segments, and data segments. It is loaded as follows:
I. Allocate space in system memory, and load the data segments, code segments, and hardware segments into the allocated memory;
II. Following the program's execution flow, select the hardware segment that will be executed first and preconfigure it onto the specified FPGA device.
2. Thread creation and execution
At run time, a thread is created as follows:
I. Determine the type of thread to create from the type of the loaded program segment, then generate the corresponding thread control block, i.e. a software thread control block or a hardware thread control block;
Every thread, whether hardware or software, has a thread control block in the system. Its core data structure is defined as follows:
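The two loading steps above can be sketched as follows, under stated assumptions: every hardware entry is already cached in memory, each carries a hypothetical first_use field giving its position in the execution flow (the text does not say how that order is obtained), and fpga_configure is a stand-in for a real configuration driver call.

```c
#include <assert.h>
#include <stddef.h>

struct hw_segment {
    const unsigned char *bitstream; /* configuration data cached in system memory */
    size_t size;                    /* length of the configuration data */
    int first_use;                  /* assumed: position of first use in the execution flow */
    int configured;                 /* nonzero once the data is on the FPGA */
};

/* Stand-in for a real FPGA configuration call (hypothetical). */
static void fpga_configure(struct hw_segment *seg)
{
    seg->configured = 1;
}

/* Preconfigure the entry that is needed first; all others stay cached
   in memory. Returns the chosen index, or -1 if there are no entries. */
int preconfigure_earliest(struct hw_segment *segs, int count)
{
    assert(segs != NULL || count == 0);
    int best = -1;
    for (int i = 0; i < count; i++) {
        if (best < 0 || segs[i].first_use < segs[best].first_use)
            best = i;
    }
    if (best >= 0)
        fpga_configure(&segs[best]);
    return best;
}
```

Keeping the remaining entries in memory means a later configuration only has to move data from memory to the FPGA, not from the file system.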
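#include <stdbool.h>
#include <sys/types.h>

struct hwthread
{
    pid_t tid;                          /* number of the thread itself */
    pid_t pid;                          /* number of the owning process */
    bool hwsw_flag;                     /* 1: hardware thread, 0: software thread */
    struct hwthread_key_data *specific[KEY_SIZE];
    bool specific_used;
    bool stopped_start;
    struct hwthread *join;
    void *result;
    int schedpolicy;
    union {
        void *(*start_routine)(void *); /* software thread body */
        void *controller;               /* base of the memory-mapped FPGA registers */
    };
    void *arg;
    int status;
};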
In this structure, pid and tid are the number of the process that the thread belongs to and the number of the thread itself, respectively. hwsw_flag indicates whether the thread is a software thread or a hardware thread. If the thread is a hardware thread, i.e. hwsw_flag is 1, the hardware thread execution unit on the FPGA is controlled through memory-mapped register accesses starting at the controller address, and the status register of that execution unit is mapped onto status and stopped_start. If the thread is a software thread, i.e. hwsw_flag is 0, start_routine is called as the thread body function.
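The dispatch on hwsw_flag can be sketched as below. The control block is reduced to the fields needed to start a thread, and the register layout (REG_CTRL, REG_ARG, REG_STATUS, CTRL_START) is an assumption, since the text does not specify one. The sketch also assumes the argument fits in a 32-bit register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced control block: only the fields used when starting a thread. */
struct hwthread {
    bool hwsw_flag;                     /* true: hardware thread */
    union {
        void *(*start_routine)(void *); /* software thread body */
        void *controller;               /* base of memory-mapped registers */
    };
    void *arg;
    void *result;
    int status;
};

/* Assumed register layout, for illustration only. */
enum { REG_CTRL = 0, REG_ARG = 1, REG_STATUS = 2 };
#define CTRL_START 1u

/* Trivial software thread body used in the usage example. */
void *example_body(void *p)
{
    return p;
}

/* Start a thread through the unified control block. */
void hwthread_start(struct hwthread *t)
{
    assert(t != NULL);
    if (t->hwsw_flag) {
        /* Hardware thread: drive the execution unit through its registers. */
        volatile uint32_t *regs = (volatile uint32_t *)t->controller;
        regs[REG_ARG]  = (uint32_t)(uintptr_t)t->arg;
        regs[REG_CTRL] = CTRL_START;
        t->status = (int)regs[REG_STATUS];  /* mirror the unit's status register */
    } else {
        /* Software thread: call the body function directly. */
        t->result = t->start_routine(t->arg);
    }
}
```

Because both kinds of thread share one structure, the caller does not need to know which kind it is starting; only hwthread_start inspects hwsw_flag.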
II. If a hardware thread is being created, first determine whether the hardware segment corresponding to the thread is already configured on the FPGA. If it is, proceed directly to step III. If it is not, first configure the hardware segment onto an FPGA of the specified type, then continue with step III.
III. Build the mapping from the hardware segment's execution unit on the FPGA to the corresponding hardware thread control block; that is, set controller to the memory address at which the status and control registers of the hardware thread execution unit on the FPGA are mapped.
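Steps II and III can be sketched as a single creation routine: configure only when necessary, then point controller at the mapped registers. The struct fields reg_base and configure_count and the function fpga_configure are hypothetical stand-ins; the text does not describe the configuration interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct hw_segment {
    bool configured;      /* already present on the FPGA? */
    void *reg_base;       /* where the execution unit's registers are mapped */
    int configure_count;  /* how many times configuration was performed */
};

/* Reduced control block: only the fields touched during creation. */
struct hwthread {
    bool hwsw_flag;
    void *controller;
};

/* Stand-in for the FPGA configuration step (hypothetical). */
static void fpga_configure(struct hw_segment *seg)
{
    seg->configured = true;
    seg->configure_count++;
}

/* Create a hardware thread for the given configuration data. */
void hwthread_create(struct hwthread *t, struct hw_segment *seg)
{
    assert(t != NULL && seg != NULL);
    if (!seg->configured)           /* step II: configure only if needed */
        fpga_configure(seg);
    t->hwsw_flag = true;
    t->controller = seg->reg_base;  /* step III: map registers into the block */
}
```

Skipping the configuration step when the data is already on the FPGA is what makes the earlier preconfiguration pay off: creation then reduces to a pointer assignment.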
3. Thread termination
I. A hardware thread enters the finished state after completing its task;
II. If the finished thread is an independent hardware thread, its state is set to completed, and its FPGA computing resources and thread control block are both reclaimed at that point;
III. If the finished thread is a dependent hardware thread, i.e. other threads are waiting for its result, the hardware thread retains the result; its FPGA computing resources and thread control block are reclaimed only after the waiting threads have obtained the result.
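The termination rules can be modelled as a small state machine. In this sketch, reclaiming the FPGA resources and the control block is represented by the state HW_RECLAIMED rather than by actually freeing anything; the state names and the functions hwthread_finish and hwthread_collect are hypothetical, and a non-NULL join pointer is assumed to mean that another thread is waiting for the result.

```c
#include <assert.h>
#include <stddef.h>

enum hw_state { HW_RUNNING, HW_FINISHED, HW_RECLAIMED };

struct hwthread {
    struct hwthread *join;  /* thread waiting for our result, or NULL */
    void *result;           /* result kept for the waiting thread */
    enum hw_state state;
};

/* Called when a hardware thread has completed its task. */
void hwthread_finish(struct hwthread *t, void *result)
{
    assert(t != NULL);
    t->result = result;
    /* Independent thread: reclaim at once.
       Dependent thread: keep the result until it has been collected. */
    t->state = (t->join == NULL) ? HW_RECLAIMED : HW_FINISHED;
}

/* Called by the waiting thread to obtain the result; the finished
   thread's resources are reclaimed afterwards. */
void *hwthread_collect(struct hwthread *t)
{
    assert(t != NULL && t->state == HW_FINISHED);
    void *r = t->result;
    t->join = NULL;
    t->state = HW_RECLAIMED;
    return r;
}
```

Deferring reclamation for dependent threads mirrors the way a joinable software thread keeps its return value until it is joined, which is what lets one control interface serve both kinds of thread.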

Claims (1)

1. A hardware thread execution method based on a hybrid processor and FPGA architecture, characterized in that it comprises:
1) Executable file generation
When a program is built into an executable file, the following steps are performed:
1. Add hardware sections and hardware segments to the executable file. Each hardware section consists of a section header and section content. The section header defines the type, name, start address, and content length of the hardware section, together with the section's target FPGA model; the section content holds the configuration data for the specified FPGA model; and each hardware segment contains a single hardware section;
2. Synthesize each piece of the program's hardware code separately to generate configuration data that implements the corresponding function on the specified FPGA model, and organize the data into hardware sections/segments in the format described in step 1;
3. Link the program's software sections/segments and hardware sections/segments into a single executable file. The header of the executable file describes, for each program segment, its type, its offset from the start of the file, its start address in the program address space, and its size;
2) Program execution
1. Executable file loading
The executable file contains code segments, hardware segments, and data segments. It is loaded as follows:
I. Allocate space in system memory, and load the data segments, code segments, and hardware segments into the allocated memory;
II. Following the program's execution flow, preconfigure the hardware segment that will be executed first onto the specified FPGA device;
2. Thread creation and execution
At run time, a thread is created as follows:
I. Determine the type of thread to create from the type of the loaded program segment, then generate the corresponding thread control block, i.e. a software thread control block or a hardware thread control block;
II. If a hardware thread is being created and its hardware segment is already configured on the FPGA, only the mapping from the hardware segment to the corresponding hardware thread control block needs to be built;
III. If a hardware thread is being created but its hardware segment is not yet configured on the FPGA, first configure the hardware segment onto an FPGA of the specified type, then build the mapping from the hardware segment to the hardware thread control block;
3. Thread termination
I. A hardware thread enters the finished state after completing its task;
II. If the finished thread is an independent hardware thread, its state is set to completed, and its FPGA computing resources and hardware thread control block are both reclaimed at that point;
III. If the finished thread is a dependent hardware thread, i.e. other threads are waiting for its result, the hardware thread retains the result; its FPGA computing resources and hardware thread control block are reclaimed only after the waiting threads have obtained the result.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNA2008101632107A CN101477458A (en) 2008-12-15 2008-12-15 Hardware thread execution method based on processor and FPGA mixed structure

Publications (1)

Publication Number Publication Date
CN101477458A true CN101477458A (en) 2009-07-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20090708