Two, Background Art
With the continuous progress of semiconductor technology, the size of the transistors integrated on a chip has shrunk to the tens-of-nanometers range, and the traditional single-processor computer architecture faces several technological barriers: the memory wall (Memory Wall), the instruction-level parallelism wall (ILP Wall), the power wall (Power Wall), and so on. Existing parallel computer architectures solve these problems only in part and cannot address the combined barrier, the red brick wall (Red Brick Wall), that arises under deep sub-micron technology. The object of the present invention is to solve these problems through a novel architecture.
Three, Summary of the Invention
The present invention proposes a novel array structure based on adjacent interconnection, together with a synchronization structure between the array units, as shown in Fig. 1. Each unit in the array is software programmable and is called a programmable processing element (PPE). The structure is explained as follows:
(1) Connection relationship: Fig. 1 shows an N*N array structure. Except for the units on the periphery, each unit is connected to its adjacent units only in the four directions east, west, south, and north. This is a fixed connection relationship.
(2) PPE unit: the PPE unit shown in Fig. 1 is a unit that is programmable through instructions. The units may be homogeneous or heterogeneous. Each unit has the control module and data path found in a present-day single CPU, and also contains a data memory, an instruction memory, or both.
(3) Data synchronization between PPE units: PPE units synchronize data through flag registers. Each PPE unit, referred to as the local PPE, contains four registers for the east, west, south, and north directions, referred to as directional sign registers (DSRs). When the DSR for a given direction is set, it carries an indication (a high or low level) that data is available in that direction. The local PPE unit reads the data in its four directions according to the indications in the direction registers. After the local PPE has read the data, the flag register for that direction is reset (to a high or low level). This mechanism accomplishes data synchronization.
(4) Data transfer between PPE units: each PPE unit contains four register groups, one for each of the east, west, south, and north directions, referred to as data transfer register groups. After the local PPE finishes processing, it writes the data into the corresponding transfer register group according to the data transfer direction specified in the instruction.
(5) Instruction memory (Instruction Memory): the instruction memory in Fig. 1 stores data-level parallel instructions, or stores the Call instructions that start each PPE unit.
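Items (1), (3), and (4) can be illustrated with a minimal software sketch. It is only a behavioral model under stated assumptions, not the circuit of the invention: the names `neighbors`, `PPE`, `Array`, `send`, and `receive` are hypothetical, north is taken to be the row above, the flag and the transfer register are modeled on the receiving side, and refusing a write while the previous data is unread is an added assumption that the text does not state.

```python
from dataclasses import dataclass, field

# Direction offsets as (row, col) deltas; north is taken to be row - 1 (assumption).
OFFSETS = {"N": (-1, 0), "S": (1, 0), "W": (0, -1), "E": (0, 1)}
OPPOSITE = {"N": "S", "S": "N", "W": "E", "E": "W"}


def neighbors(n, row, col):
    """Return {direction: (row, col)} for the fixed links of one unit in an n x n array."""
    result = {}
    for direction, (dr, dc) in OFFSETS.items():
        r, c = row + dr, col + dc
        if 0 <= r < n and 0 <= c < n:  # peripheral units simply have fewer links
            result[direction] = (r, c)
    return result


@dataclass
class PPE:
    # One flag and one transfer register per direction (hypothetical layout).
    flags: dict = field(default_factory=lambda: {d: False for d in OFFSETS})
    regs: dict = field(default_factory=lambda: {d: None for d in OFFSETS})


class Array:
    def __init__(self, n):
        self.n = n
        self.cells = [[PPE() for _ in range(n)] for _ in range(n)]

    def send(self, row, col, direction, value):
        """Write value toward a neighbor and set the flag on the neighbor's incoming side."""
        target = neighbors(self.n, row, col).get(direction)
        if target is None:
            return False  # no link in that direction
        dest = self.cells[target[0]][target[1]]
        incoming = OPPOSITE[direction]
        if dest.flags[incoming]:
            return False  # assumption: earlier data has not been read yet
        dest.regs[incoming] = value
        dest.flags[incoming] = True
        return True

    def receive(self, row, col, direction):
        """Read data from one side if its flag is set, then reset the flag."""
        cell = self.cells[row][col]
        if not cell.flags[direction]:
            return None
        cell.flags[direction] = False
        return cell.regs[direction]
```

In this model, a value sent east from unit (1, 1) of a 3*3 array becomes readable on the west side of unit (1, 2), and a second read returns nothing until new data is written.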
Five, Detailed Description of the Embodiment
The specific working mode of the present invention is described below with reference to the accompanying drawings.
Fig. 2 is a schematic diagram of the internal structure of a PPE and of the synchronization structure. As shown in Fig. 2, the synchronization structure is connected to the ALU of the processing unit in the four directions (up, down, left, and right) through independent data lines. Each output data line has a port register that stores output data for use by the local processing unit or by an adjacent processing unit. Inside the synchronization structure there is a data selector, which lets the instructions inside the processing unit access the port register of each direction. Each port register contains a flag bit; by testing the flag bit, adjacent PPE units exchange data and communicate with each other. The data synchronization between adjacent units depends not only on the flag bits but also on the instructions inside the local and adjacent PPE units. The local PPE determines from its own instruction which direction's port register receives the result of an operation, that is, whether the data is sent in one direction only or in several directions. An adjacent PPE unit must determine from the flag bits not only whether data has arrived, but also whether data has arrived from one direction or from several, and must determine from its instruction which direction's data it needs. Some instructions must decide, based on the direction the data came from and on its value, whether to ignore the value from a given direction.
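The interplay between flag bits and instructions described above can be sketched in software as follows. This is a hedged illustration only: `PortRegister`, `broadcast`, `arrived`, and `select` are hypothetical names, a single dictionary of ports stands in for the port registers visible to one PPE, and the `ignore` predicate is an assumed way of expressing an instruction that discards a value based on its direction and value.

```python
DIRECTIONS = ("N", "E", "S", "W")


class PortRegister:
    """Port register holding a value plus a flag bit (hypothetical model)."""

    def __init__(self):
        self.value = None
        self.flag = False


def broadcast(ports, directions, value):
    """Sender side: the local instruction names one or several target directions."""
    for d in directions:
        ports[d].value = value
        ports[d].flag = True


def arrived(ports):
    """Receiver side: list the directions whose flag bit shows that data has arrived."""
    return [d for d in DIRECTIONS if ports[d].flag]


def select(ports, wanted, ignore=None):
    """Consume the operands that the instruction asks for.

    wanted: directions named by the instruction.
    ignore: optional predicate (direction, value) -> bool; a True result
            means the value from that direction is discarded.
    Returns None, leaving all flags untouched, if any wanted direction
    has not arrived yet.
    """
    if any(not ports[d].flag for d in wanted):
        return None
    operands = {}
    for d in wanted:
        value = ports[d].value
        ports[d].flag = False  # reading resets the flag bit
        if ignore is not None and ignore(d, value):
            continue
        operands[d] = value
    return operands
```

Under these assumptions, an instruction that needs the north and west operands waits until both flag bits are set, and an instruction with an `ignore` rule still clears the flag of the direction it discards so that the sender can write again.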