CN1297887C - Processor and method for trans-boundary aligned multiple transient memory data - Google Patents

Publication number
CN1297887C
Authority
CN
China
Prior art keywords
address
working storage
bit
group
output terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB2003101188147A
Other languages
Chinese (zh)
Other versions
CN1622031A (en)
Inventor
梁伯嵩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sunplus Technology Co Ltd
Original Assignee
Sunplus Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sunplus Technology Co Ltd filed Critical Sunplus Technology Co Ltd
Priority to CNB2003101188147A priority Critical patent/CN1297887C/en
Publication of CN1622031A publication Critical patent/CN1622031A/en
Application granted granted Critical
Publication of CN1297887C publication Critical patent/CN1297887C/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Abstract

The present invention provides a processor, and a corresponding method, for aligning data held in multiple registers across a boundary. A decoding device decodes a multiple shift instruction. A register file has a plurality of registers, each N bits wide. A shifter concatenates the contents of the first and second output terminals of the register file into a 2N-bit word, shifts that word by w bits, and outputs its first N bits. A control device configures the register file according to the decoded multiple shift instruction and reads out the contents of the corresponding registers, so that the shifter shifts them by w bits, and the output of the shifter is written back into the register file.

Description

Processor and method for cross-boundary alignment of data in multiple registers
Technical field
The present invention relates to the technical field of data processing, and in particular to a processor and method for cross-boundary alignment of data held in multiple registers.
Background
When a processor handles data, whether the data is aligned affects the performance of many key operations, such as string and array operations. As shown in Fig. 1, a data item (ABCDEFGHIJKL) to be processed often straddles memory storage boundaries. Before the processor can perform string or array operations on this data, it must first carry out many extra operations to restore the data to an aligned form; only then can the processor use it.
One known method of handling unaligned data is to load the data into the processor and then use ordinary processor instructions to extract the required data. As shown in Fig. 2, the data (ZABC) at address 100h is first loaded into register R16, and R16 is shifted left by 8 bits to remove the unwanted data (Z). The data (DEFG) at address 104h is then loaded into register R17, and R17 is shifted right by 24 bits to remove the unwanted data (EFG). Finally, an OR operation is performed on R16 and R17 and the result is stored in R16, which then holds the required data (ABCD). Following the same steps, the data EFGH and IJKL are loaded into registers R17 and R18 in turn.
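The five-step sequence of Fig. 2 can be sketched as follows. This is only an illustrative model, not the program code of Fig. 2: the helper names `pack`, `unpack` and `load_unaligned_shift_or` are hypothetical, and the sketch assumes that the first character of each word occupies the most significant byte of the register, which is what the "shift left by 8 bits to remove Z" step implies.

```python
MASK32 = 0xFFFFFFFF  # registers are modeled as 32-bit unsigned values


def pack(chars: str) -> int:
    """Pack four ASCII characters into a 32-bit value, first character in the top byte."""
    return int.from_bytes(chars.encode("ascii"), "big")


def unpack(value: int) -> str:
    """Inverse of pack: turn a 32-bit value back into four ASCII characters."""
    return value.to_bytes(4, "big").decode("ascii")


def load_unaligned_shift_or(word_at_100h: int, word_at_104h: int) -> int:
    """Model of the known method: two loads, two shifts and one OR per aligned word."""
    r16 = word_at_100h             # 1. load ZABC into R16
    r16 = (r16 << 8) & MASK32      # 2. shift left 8 bits, dropping Z   -> ABC0
    r17 = word_at_104h             # 3. load DEFG into R17
    r17 = r17 >> 24                # 4. shift right 24 bits, dropping EFG -> 000D
    return r16 | r17               # 5. OR the two halves together      -> ABCD


print(unpack(load_unaligned_shift_or(pack("ZABC"), pack("DEFG"))))  # prints ABCD
```

Each aligned word costs five operations in this model, which is where the 5n instruction count for n words comes from.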
As the above description shows, if the unaligned data to be loaded is n words long (one word being 32 bits), the known method needs 5n instructions to describe the read operation and at least 5n instruction cycles to complete it. This makes the program code lengthy, takes up storage space, and adds to the processor's workload, reducing its efficiency.
To address the lengthy code and low efficiency that result from handling unaligned data with ordinary processor instructions, U.S. Patent No. 4,814,976 aligns the data at the same time as it is loaded, reading a boundary-crossing data item in two steps. As shown in Fig. 3, the data (ABC) at addresses 101h to 103h is first loaded into bytes 0, 1 and 2 of register R16; at this point byte 3 of R16 is X (don't care). The data (D) at address 104h is then loaded into byte 3 of R16, so that R16 holds the required data (ABCD). Following the same steps, the data EFGH and IJKL are loaded into registers R17 and R18 in turn.
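The two-step load described for Fig. 3 can be modeled roughly as below. This is a sketch under stated assumptions rather than the instruction set of the cited patent: the function names `load_leading`, `load_trailing` and `load_unaligned` are hypothetical, memory is modeled as a Python `bytes` object with the data of Fig. 1 placed at 100h, and byte 0 of the register is taken to be its first byte.

```python
def load_leading(mem: bytes, addr: int, reg: bytearray) -> None:
    """First step: copy the bytes from addr to the end of its aligned word
    into the leading bytes of the register."""
    count = 4 - addr % 4
    reg[0:count] = mem[addr:addr + count]


def load_trailing(mem: bytes, addr: int, reg: bytearray) -> None:
    """Second step: copy the bytes from the start of the aligned word that
    contains addr, up to and including addr, into the trailing bytes of the register."""
    base = addr - addr % 4
    count = addr % 4 + 1
    reg[4 - count:4] = mem[base:base + count]


def load_unaligned(mem: bytes, addr: int) -> bytes:
    """Load four bytes starting at an arbitrary address using the two steps above."""
    reg = bytearray(4)                   # byte 3 starts out as a don't-care value
    load_leading(mem, addr, reg)         # e.g. ABC from 101h..103h into bytes 0..2
    load_trailing(mem, addr + 3, reg)    # e.g. D from 104h into byte 3
    return bytes(reg)


# Memory image assumed from Fig. 1: Z at 100h, A at 101h, ..., L at 10Ch.
memory = bytes(0x100) + b"ZABCDEFGHIJKL"
print(load_unaligned(memory, 0x101))  # prints b'ABCD'
```

Both steps write the same register and, when the access is unaligned, touch two different memory words, which is why the text counts two instructions per word for this approach.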
As the above description shows, if the unaligned data to be loaded is n words long, 2n instructions are needed to describe the read operation and at least 2n instruction cycles are needed to complete it. Moreover, because the same memory and register locations are read and written repeatedly, the likelihood of a processor pipeline stall increases. Reading the same memory location repeatedly also wastes bus bandwidth, and the resulting delay is especially noticeable in systems without a cache.
Summary of the invention
The object of the present invention is to provide a processor and method for cross-boundary alignment of data in multiple registers that avoids the lengthy program code and storage overhead of the known techniques, and that also avoids the bus bandwidth wasted by repeatedly reading the same memory.
According to one feature of the present invention, a processor for cross-boundary alignment of data in multiple registers is proposed, which mainly comprises:
A decoding device, which decodes a multiple shift instruction;
A register file having a plurality of registers, each N bits wide, wherein the register file can read registers according to a first address and a second address and output their contents through a first output terminal and a second output terminal respectively, and can write one of the registers through an input terminal according to a third address (N is a positive integer);
A shifter, coupled to the first output terminal and the second output terminal of the register file, which concatenates the contents of the first output terminal and the second output terminal into a 2N-bit word, shifts this 2N-bit word by w bits according to a shift value w (w is a positive integer), and outputs the first N bits of the 2N-bit word; and
A control device, coupled to the decoding device and the register file, which sets the first address, the second address, the third address and the shift value w according to the decoded multiple shift instruction and reads out the contents of the corresponding registers, so that the shifter shifts the contents read out by w bits, and which writes the output of the shifter into the register file according to the third address.
In the described device, N is 32.
In the described device, w is one of 8, 16 and 24.
In the described device, the shifter can shift by w bits to the left or to the right.
In the described device, the third address is set to be the same as the first address.
In the described device, the second address is set to the address that follows the first address.
According to another feature of the present invention, a method for cross-boundary alignment of data in multiple registers is proposed. The registers form a register file, each register being N bits wide. The register file can read registers according to a first address and a second address and output their contents through a first output terminal and a second output terminal respectively, and can write one of the registers through an input terminal according to a third address (N is a positive integer). The method mainly comprises the following steps:
(A) setting the first address, the second address, the third address and a shift value w according to a multiple shift instruction;
(B) reading out the contents of the corresponding registers according to the first address and the second address; and
(C) concatenating the register contents read out in step (B) into a 2N-bit word, shifting this 2N-bit word by w bits, and writing the first N bits of the shifted 2N-bit word into one of the registers according to the third address.
In the described method, steps (A) to (C) are executed repeatedly until a predetermined number of registers have been shifted.
In the described method, N is 32.
In the described method, w is one of 8, 16 and 24.
In the described method, the shift by w bits in step (C) can be to the left or to the right.
In the described method, the third address is set to be the same as the first address.
In the described method, the second address is set to the address that follows the first address.
Description of drawings
Fig. 1 is a schematic diagram of a set of unaligned data arranged in memory.
Fig. 2 shows the program code with which one known technique loads a set of unaligned data.
Fig. 3 shows the program code with which another known technique loads a set of unaligned data, together with a schematic diagram of the registers.
Fig. 4 is a block diagram of the processor of the present invention for cross-boundary alignment of data in multiple registers.
Fig. 5 is a detailed circuit diagram of the control device of the present invention.
Fig. 6 is a schematic diagram of the operation of the present invention.
Fig. 7 shows an example application of the present invention.
Embodiment
Fig. 4 is a block diagram of the processor of the present invention for cross-boundary alignment of data in multiple registers. It comprises a decoding device 100, a control device 200, a register file 300 and a shifter 400. The register file 300 has a plurality of registers 3001, each N bits wide; in this embodiment N is preferably 32. The register file 300 can read registers 3001 according to a first address 301 and a second address 302 and output their contents through a first output terminal 310 and a second output terminal 320 respectively, and can write one of the registers 3001 through an input terminal 330 according to a third address 303 (N is a positive integer).
The decoding device 100 decodes a multiple shift instruction, which may be either a Multiple Left Shift Instruction (MLSI) or a Multiple Right Shift Instruction (MRSI). The multiple left shift instruction has the form MLSI Rx, Ry, w and shifts the contents of the registers in the range x to y, taken as a whole, left by w bits. The multiple right shift instruction has the form MRSI Rx, Ry, w and shifts the contents of the registers in the range x to y, taken as a whole, right by w bits. After decoding a multiple shift instruction, the decoding device 100 produces the signals x, y, L_R* and w and outputs them to the control device 200. The L_R* signal indicates the direction of the shift: when L_R* is 1 the shift is w bits to the left, and when L_R* is 0 the shift is w bits to the right.
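As a rough illustration of the decoded signals, the sketch below turns the assembly-level form of the instruction into the four values x, y, L_R* and w. The binary encoding of the instruction is not given in this document, so the sketch works on the textual form only; the names `Decoded` and `decode` and the regular expression are hypothetical.

```python
import re
from typing import NamedTuple


class Decoded(NamedTuple):
    """Signals passed from the decoding device to the control device."""
    x: int      # first register of the range
    y: int      # last register of the range
    l_r: int    # L_R*: 1 = shift left, 0 = shift right
    w: int      # shift amount in bits


# Accepts the textual forms "MLSI Rx, Ry, w" and "MRSI Rx, Ry, w".
_PATTERN = re.compile(r"^\s*(MLSI|MRSI)\s+R(\d+)\s*,\s*R(\d+)\s*,\s*(\d+)\s*$")


def decode(instruction: str) -> Decoded:
    """Decode the textual form of a multiple shift instruction."""
    match = _PATTERN.match(instruction)
    if match is None:
        raise ValueError(f"not a multiple shift instruction: {instruction!r}")
    opcode, x, y, w = match.groups()
    return Decoded(x=int(x), y=int(y), l_r=1 if opcode == "MLSI" else 0, w=int(w))


print(decode("MLSI R16, R19, 8"))  # prints Decoded(x=16, y=19, l_r=1, w=8)
```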
The shifter 400 is coupled to the first output terminal 310 and the second output terminal 320 of the register file 300. It concatenates the contents of the first output terminal 310 and the second output terminal 320 into a 64-bit word, shifts this 64-bit word left or right by w bits according to the shift value w and the L_R* signal (w is a positive integer), and outputs the first 32 bits of the shifted 64-bit word.
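The behavior of the shifter can be sketched as a small function. This is a minimal model, assuming that "the first 32 bits" means the most significant half of the shifted 64-bit word and that the first output terminal supplies the upper half of the concatenation; the function name `shifter` and its parameters are hypothetical. The right-shift branch follows the wording of the text literally, although only the left-shift case is worked through in this document.

```python
def shifter(first: int, second: int, w: int, left: bool = True, n: int = 32) -> int:
    """Concatenate two n-bit values into a 2n-bit word, shift it by w bits,
    and return the upper n bits of the result."""
    mask_2n = (1 << (2 * n)) - 1
    combined = (first << n) | second          # first output terminal forms the upper half
    if left:
        shifted = (combined << w) & mask_2n   # bits shifted out of the top are discarded
    else:
        shifted = combined >> w               # zeros enter from the top
    return shifted >> n                       # keep the upper n bits


# ZABC = 0x5A414243 and DEFG = 0x44454647 in ASCII; shifting left by 8 gives ABCD.
print(hex(shifter(0x5A414243, 0x44454647, 8)))  # prints 0x41424344
```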
The control device 200 is coupled to the decoding device 100 and the register file 300. According to the decoded x, y, L_R* and w signals, it sets the first address 301, the second address 302 and the third address 303 of the register file 300 and the shift value w, and reads out the contents of register x and register y in the register file 300 through the first output terminal 310 and the second output terminal 320 of the register file 300.
Fig. 5 is a detailed circuit diagram of the control device 200, which mainly comprises a multiplexer 210, a comparator 220, a first address register 230, an adder 240 and a second address register 250. The multiplexer 210 selects either the x signal produced by the decoding device 100 or the contents of the second address register 250. The output of the multiplexer 210 is written into the first address register 230, whose output drives the first address 301 of the register file 300 to access the register 3001 indicated by the first address 301. The adder 240 adds 1 to the contents of the first address register 230 and writes the result into the second address register 250, whose contents are used to access the register 3001 indicated by the second address 302. The comparator 220 compares the contents of the first address register 230 with the y signal produced by the decoding device 100; if the contents of the first address register 230 are greater than or equal to the y signal, it produces a stop signal (stop_signal).
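The address sequencing performed by the control device can be sketched as a generator that yields one set of addresses per execution cycle. This is an illustrative model of the multiplexer, adder and comparator described above, not a description of the circuit of Fig. 5; the name `address_sequence` is hypothetical, and the third address is assumed to equal the first address, as in the optional feature stated in the summary.

```python
from typing import Iterator, Tuple


def address_sequence(x: int, y: int) -> Iterator[Tuple[int, int, int]]:
    """Yield (first_address, second_address, third_address) for each execution cycle."""
    first = x                        # multiplexer selects x in the first cycle
    while True:
        second = first + 1           # adder: second address register = first + 1
        if first >= y:               # comparator raises stop_signal
            return
        yield first, second, first   # third address assumed equal to the first address
        first = second               # multiplexer selects the second address register


print(list(address_sequence(16, 19)))
# prints [(16, 17, 16), (17, 18, 17), (18, 19, 18)]
```

For x = 16 and y = 19 the generator yields three address sets before the comparator condition holds.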
Fig. 6 is a schematic diagram of the operation of the present invention for the instruction MLSI R16, R19, 8, which shifts the contents of registers R16, R17, R18 and R19 left by 8 bits. At the start of the first execution cycle, the decoding device 100 decodes the instruction and produces the signals x=16, y=19, L_R*=1 and w=8. The multiplexer 210 selects the x signal (=16) produced by the decoding device 100, so the control device 200 loads 16 into the first address register 230 and, through the adder 240, loads 17 into the second address register 250. Because the first address register 230 holds 16, which is less than 19, the comparator 220 does not produce the stop signal (stop_signal). The register file 300 therefore reads the contents of R16 (=ZABC) and R17 (=DEFG) according to the first address 301 (=16) and the second address 302 (=17) respectively, and outputs them to the shifter 400 through the first output terminal 310 and the second output terminal 320.
The shifter 400 concatenates the contents of the first output terminal 310 (=ZABC) and the second output terminal 320 (=DEFG) into a 64-bit word (=ZABCDEFG), shifts this 64-bit word left by 8 bits according to the shift value w=8 and the signal L_R*=1 (=ABCDEFG0), and outputs the first 32 bits (=ABCD) of the shifted 64-bit word (=ABCDEFG0). The control device 200 then writes the output of the shifter 400 (=ABCD) into register R16 of the register file 300 according to the third address 303.
At the start of the second execution cycle, the multiplexer 210 selects the contents (=17) of the second address register 250, so the control device 200 loads 17 into the first address register 230 and, through the adder 240, loads 18 into the second address register 250. Execution then proceeds as in the first execution cycle, so that at the end of the second execution cycle register R17 holds EFGH. In the same way, at the end of the third execution cycle register R18 holds IJKL.
At the start of the fourth execution cycle, the multiplexer 210 selects the contents (=19) of the second address register 250, and the control device 200 loads 19 into the first address register 230. Because the first address register 230 now holds 19, the comparator 220 produces the stop signal (stop_signal) and execution of the instruction stops; that is, only three execution cycles are needed.
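The walkthrough of Fig. 6 can be reproduced with a short simulation. This is a sketch under stated assumptions rather than a model of the actual hardware: the names `pack`, `unpack` and `mlsi` are hypothetical, only the left-shift case is modeled, the third address is taken to equal the first address, and the three bytes that follow L in R19 are filled with `?` because their values are not given.

```python
def pack(chars: str) -> int:
    """Pack four ASCII characters into a 32-bit value, first character in the top byte."""
    return int.from_bytes(chars.encode("ascii"), "big")


def unpack(value: int) -> str:
    """Turn a 32-bit value back into four ASCII characters."""
    return value.to_bytes(4, "big").decode("ascii")


def mlsi(regs: list, x: int, y: int, w: int) -> int:
    """Shift registers x..y left by w bits as one unit; return the number of cycles used."""
    cycles = 0
    first = x                                          # multiplexer selects x first
    while first < y:                                   # comparator stops when first >= y
        second = first + 1                             # adder
        combined = (regs[first] << 32) | regs[second]  # 64-bit concatenation
        shifted = (combined << w) & 0xFFFFFFFFFFFFFFFF
        regs[first] = shifted >> 32                    # upper 32 bits written back
        first = second                                 # next cycle starts at the next register
        cycles += 1
    return cycles


regs = [0] * 32
for index, chars in zip(range(16, 20), ["ZABC", "DEFG", "HIJK", "L???"]):
    regs[index] = pack(chars)

cycles = mlsi(regs, 16, 19, 8)
print(cycles, [unpack(regs[i]) for i in range(16, 20)])
# prints 3 ['ABCD', 'EFGH', 'IJKL', 'L???']
```

Each cycle reads a register pair whose second member has not yet been overwritten, so the in-place write to the first address does not disturb later cycles. In this model R19 only supplies the trailing byte and is left unchanged.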
Fig. 7 shows an application of the present invention. To load a set of unaligned data, load instructions (LW) are first used to load the unaligned data into registers R16, R17, R18 and R19, and the multiple left shift instruction (MLSI) of the present invention then completes the alignment. As shown in Fig. 7, the program code occupies only 5 words.
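The instruction counts quoted for the three approaches can be compared with a few lines of arithmetic. The 5n and 2n figures are the ones stated in the background section. The figure for the multiple shift instruction is an assumption generalized from the single example of Fig. 7 (four loads plus one MLSI for three aligned words), namely n + 1 loads plus one shift instruction; the function name `instruction_counts` is hypothetical.

```python
def instruction_counts(n: int) -> dict:
    """Instruction words needed to load n unaligned 32-bit words under each approach."""
    return {
        "shift_and_or": 5 * n,          # known method of Fig. 2
        "two_step_load": 2 * n,         # known method of Fig. 3
        "multiple_shift": (n + 1) + 1,  # assumed: n + 1 loads plus one MLSI (Fig. 7)
    }


print(instruction_counts(3))
# prints {'shift_and_or': 15, 'two_step_load': 6, 'multiple_shift': 5}
```

These are code-size figures only; the sketch makes no claim about cycle counts.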
As the above description shows, the present invention solves the problems of lengthy program code and storage overhead in the known techniques, and also avoids the bus bandwidth wasted by repeatedly reading the same memory.
It should be noted that the embodiments above are examples given for convenience of explanation. The scope of rights claimed by the present invention is defined by the claims and is not limited to the embodiments above.

Claims (12)

1. A processor for cross-boundary alignment of data in multiple registers, mainly comprising:
A decoding device, which decodes a multiple shift instruction;
A register file having a plurality of registers, each N bits wide, wherein the register file can read registers according to a first address and a second address and output their contents through a first output terminal and a second output terminal respectively, and can write one of the registers through an input terminal according to a third address, N being a positive integer;
A shifter, coupled to the first output terminal and the second output terminal of the register file, which concatenates the contents of the first output terminal and the second output terminal into a 2N-bit word, shifts this 2N-bit word by w bits according to a shift value w, w being a positive integer, and outputs the first N bits of the 2N-bit word; and
A control device, coupled to the decoding device and the register file, which sets the first address, the second address, the third address and the shift value w according to the decoded multiple shift instruction and reads out the contents of the corresponding registers, so that the shifter shifts the contents read out by w bits, and which writes the output of the shifter into the register file according to the third address.
2. The device as claimed in claim 1, characterized in that N is 32.
3. The device as claimed in claim 1, characterized in that w is one of 8, 16 and 24.
4. The device as claimed in claim 1, characterized in that the shifter can shift by w bits to the left or to the right.
5. The device as claimed in claim 1, characterized in that the third address is set to be the same as the first address.
6. The device as claimed in claim 1, characterized in that the second address is set to the address that follows the first address.
7. A method for cross-boundary alignment of data in multiple registers, wherein the registers form a register file, each register is N bits wide, and the register file can read registers according to a first address and a second address and output their contents through a first output terminal and a second output terminal respectively, and can write one of the registers through an input terminal according to a third address, N being a positive integer, the method mainly comprising the following steps:
(A) setting the first address, the second address, the third address and a shift value w according to a multiple shift instruction;
(B) reading out the contents of the corresponding registers according to the first address and the second address;
(C) concatenating the register contents read out in step (B) into a 2N-bit word, shifting this 2N-bit word by w bits, and writing the first N bits of the shifted 2N-bit word into one of the registers according to the third address; and
(D) repeating steps (A) to (C) until a predetermined number of registers have been shifted.
8. The method as claimed in claim 7, characterized in that N is 32.
9. The method as claimed in claim 7, characterized in that w is one of 8, 16 and 24.
10. The method as claimed in claim 7, characterized in that the shift by w bits in step (C) can be to the left or to the right.
11. The method as claimed in claim 7, characterized in that the third address is set to be the same as the first address.
12. The method as claimed in claim 7, characterized in that the second address is set to the address that follows the first address.
CNB2003101188147A 2003-11-28 2003-11-28 Processor and method for trans-boundary aligned multiple transient memory data Expired - Fee Related CN1297887C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2003101188147A CN1297887C (en) 2003-11-28 2003-11-28 Processor and method for trans-boundary aligned multiple transient memory data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CNB2003101188147A CN1297887C (en) 2003-11-28 2003-11-28 Processor and method for trans-boundary aligned multiple transient memory data

Publications (2)

Publication Number Publication Date
CN1622031A CN1622031A (en) 2005-06-01
CN1297887C true CN1297887C (en) 2007-01-31

Family

ID=34761217

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2003101188147A Expired - Fee Related CN1297887C (en) 2003-11-28 2003-11-28 Processor and method for trans-boundary aligned multiple transient memory data

Country Status (1)

Country Link
CN (1) CN1297887C (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10394735B2 (en) * 2017-01-09 2019-08-27 Nanya Technology Corporation Comparative forwarding circuit providing first datum and second datum to one of first circuit and second circuit according to target address

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4814976A (en) * 1986-12-23 1989-03-21 Mips Computer Systems, Inc. RISC computer with unaligned reference handling and method for the same
WO2003038601A1 (en) * 2001-10-29 2003-05-08 Intel Corporation Method and apparatus for parallel shift right merge of data

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4814976A (en) * 1986-12-23 1989-03-21 Mips Computer Systems, Inc. RISC computer with unaligned reference handling and method for the same
US4814976C1 (en) * 1986-12-23 2002-06-04 Mips Tech Inc Risc computer with unaligned reference handling and method for the same
WO2003038601A1 (en) * 2001-10-29 2003-05-08 Intel Corporation Method and apparatus for parallel shift right merge of data

Also Published As

Publication number Publication date
CN1622031A (en) 2005-06-01

Similar Documents

Publication Publication Date Title
CN1203420C (en) Linked list DMA descriptor architecture
US9058253B2 (en) Data tree storage methods, systems and computer program products using page structure of flash memory
JP2534465B2 (en) Data compression apparatus and method
JP3229180B2 (en) Data compression system
Moffat Word‐based text compression
JP2610084B2 (en) Data expansion method and apparatus, and data compression / expansion method and apparatus
CN104331269B (en) A kind of embedded system executable code compression method and code decompression compression system
EP0633668B1 (en) Data compression apparatus
CN1297887C (en) Processor and method for trans-boundary aligned multiple transient memory data
CN1319801A (en) Effective calculation method and device for cyclic redundant check
CN1335958A (en) Variable-instruction-length processing
CN100336038C (en) Computer system embedding sequential buffers therein for improving the performance of a digital signal processing data access operation and a method thereof
CN1296815C (en) Marker digit optimizing method in binary system translation
US7676651B2 (en) Micro controller for decompressing and compressing variable length codes via a compressed code dictionary
CN1238788C (en) First-in first-out register quenue arrangement capable of processing variable-length data and its control method
CN100346291C (en) Method and device for coutrolling block transfer instruction for multi address space
CN1078720C (en) First-in first-out memory device for enabling sizes of input/output data to be different from each other and method therefor
CN1190738C (en) Data processing device and its data read method
CN1229718C (en) Digital signal processing apparatus and method for controlling same
CN1126022C (en) Signal processor
CN1087086C (en) Device for controlling display device
CN1293485C (en) Processor unit and method for protecting data by data block confounding processing
JP3792633B2 (en) Microcontroller and microcontroller device
CN1723438A (en) Method and apparatus for encoding design description in reconfigurable multi-processor system
CN111966606A (en) Data storage device and data processing method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20070131

Termination date: 20141128

EXPY Termination of patent right or utility model