CA1293331C

CA1293331C - Risc computer with unaligned reference handling and method for the same

Info

Publication number: CA1293331C
Application number: CA000555343A
Authority: CA
Inventors: Craig C. Hansen; Thomas J. Riordan
Original assignee: MIPS Computer Systems Inc
Current assignee: MIPS Tech LLC
Priority date: 1986-12-23
Filing date: 1987-12-23
Publication date: 1991-12-17
Anticipated expiration: 2008-12-17
Also published as: US4814976C1; KR960003046B1; US4814976A; JP2965206B2; JPH01502700A; AU619734B2; AU1185288A; WO1988004806A1; KR890700244A

Abstract

ABSTRACT OF THE DISCLOSURE
In a RISC device a set of four instructions is provided which allows either the loading or the storage of an unaligned reference. The instructions are overlapped to reduce the overall execution time of the device. A circuit is also provided for executing the instruction set.

Description

RISC COMPUTER WITH UNALIGNED REFERENCE


This invention pertains to a computer with a reduced instruction set capable of handling unaligned references, and more particularly, the reading and writing of data having fractional word length, as well as a method for handling the same.

A new development in computer architecture has been the introduction of so-called RISC (Reduced Instruction Set Computer) devices, in which each instruction is ideally performed in a single operational cycle. Such devices are advantageous over computers having standard architecture and instruction sets in that they are capable of much higher data processing speeds due to their ability to perform frequent operations in shorter periods of time. Frequently, computers and similar data processors must be able to handle data having fractional word length. For example, although many computers are designed to handle words two or four bytes in length (i.e., words of 16 or 32 bits each), certain peripheral devices and applications generate or accept data of only one or two bytes. This is often the case with data processing programs and products. One result of this type of data is that it produces an unaligned reference. Namely, for a machine capable of handling four-byte words (32 bit devices), if incoming data is located sequentially as two bytes of data followed by four bytes of data, the four bytes of data cannot be retrieved or stored in a single cycle because they would overlap a word boundary within the memory. This effect is even more problematical if a word overlaps a page boundary within the memory because, if a virtual memory system is used, only a portion of the word may actually reside in addressable memory. Therefore, prior art RISC devices either do not accept data in this form, in which case special procedures must be followed to ensure that all data is aligned in word boundaries, or very involved programming is required which uses up at least two consecutive instruction cycles. One way to ensure, for example, that all data is aligned in word boundaries would be to add extra bits to data of shorter length, usually known as bit stuffing. Whether bit stuffing is used or the programming is altered, it is obvious that unaligned references seriously degrade the performance of prior art RISC devices.
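To make the word-boundary condition concrete, the following minimal Python sketch checks whether a reference spans two memory words. It assumes the four-byte words of the example above; the names `crosses_word_boundary` and `WORD_BYTES` are illustrative and do not appear in the specification.

```python
WORD_BYTES = 4  # four-byte words, as in the 32-bit example above


def crosses_word_boundary(byte_address: int, length: int = WORD_BYTES) -> bool:
    """Return True if `length` bytes starting at `byte_address`
    occupy more than one memory word."""
    first_word = byte_address // WORD_BYTES
    last_word = (byte_address + length - 1) // WORD_BYTES
    return first_word != last_word


# Two bytes of data at addresses 0-1, followed by four bytes at addresses 2-5.
print(crosses_word_boundary(0, 2))
print(crosses_word_boundary(2, 4))
```

The second call reports an unaligned reference because bytes 2 to 5 lie partly in the word starting at address 0 and partly in the word starting at address 4.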

Also, it should be noted that data is organized in modern computers in either of two formats or in some combination of those formats. The formats are known as "big endian," in which the high order bit, byte, or other unit of information is located in the lower numbered unit address, and "little endian," in which the high order unit of information is located in the higher numbered unit address. Thus, in a true big endian computer architecture, bits of data are thought of as being lined up from left to right, the lowest numbered and most significant bit being on the left. When this string of bits is divided into, for example, 8-bit bytes, 16-bit halfwords, and/or 32-bit words, the lowest numbered and most significant byte, halfword, or word continues to be located on the left. In a true little endian architecture, the scheme is exactly the opposite. Bits, bytes, halfwords, and words are numbered right to left, the least significant bit, byte, halfword or word being located on the right.
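As an illustration only, the byte-level part of the two conventions can be observed with Python's built-in `int.to_bytes`. The sample value 0x11223344 is arbitrary and is not taken from the specification.

```python
value = 0x11223344  # a 32-bit word; 0x11 is the most significant byte

big = value.to_bytes(4, "big")        # high order byte at the lowest address
little = value.to_bytes(4, "little")  # high order byte at the highest address

print("address  big endian  little endian")
for address in range(4):
    print(f"{address:7d}  {big[address]:#04x}        {little[address]:#04x}")
```

In the big endian layout the most significant byte sits at byte address 0, while in the little endian layout it sits at byte address 3.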

The present invention seeks to provide a means and method of handling unaligned references within a RISC device.

The invention also seeks to provide a RISC device which is capable of loading or storing an unaligned reference in a reduced number of instruction cycles, thereby maintaining a high processing speed for the device.

Still further the invention seeks to provide a method and means of handling unaligned references which can be easily implemented, without any major changes in the hardware or the operating system.

In accordance with one aspect of the invention there is provided in a reduced instruction set computer with a memory holding m-bit words separated by word boundaries, a device for retrieving an unaligned reference from said memory comprising: a) a general register; b) means for retrieving a first word containing a first portion of said unaligned reference in response to a nth instruction and a second word containing a second portion of said unaligned reference from said memory in response to an (n+k)th instruction; c) shifting means for shifting said first portion to a first position and a second portion to a second position; and d) combining means for combining said first and second portions in said general register, wherein k and n are positive integers.

In accordance with another aspect of the invention there is provided in reduced instruction set computer, a device for storing an unaligned reference into a memory with m-bit locations comprising: shifting means for shifting said unaligned reference in a first direction in response to an nth instruction and in a second direction in response to (n+k)th instruction, said means generating sequentially a first and second portion each having less than m-bits; and means for storing said first and second portions sequentially into said memory, wherein k and n are positive integers.

In an embodiment of the invention there is provided in a reduced instruction set computer with a memory for holding m-bit words, a device for loading a first unaligned reference having first and second portions of less than m-bits, said first portion being stored into a first section of said memory and said second portion being stored into a second section of said memory, and for storing a second aligned reference into said first and said second sections, comprising: a shift/merge unit having first and second inputs and being provided to shift first data bytes received from said first input, said first input being coupled to said memory unit to receive said first and second portions sequentially, and merge said first data bytes with second data bytes from said second input to form an m-bit word; a first latch means for storing said first and second data bytes, said latch having an output coupled to said second input; an m-bit general register coupled to said first latch means and provided for holding selectively one of said first or second unaligned references; a second latch means coupled to said register for storing said second unaligned reference; shifting means for shifting said second unaligned references; and output means for storing said second unaligned reference after shifting by said shifting means into said memory.

In another aspect of the invention there is provided a method of loading an m-bit unaligned reference from a memory, said memory holding m-bit words separated by word boundaries, said m-bit unaligned reference being divided into a first portion and a second portion by a word boundary, comprising the steps of: a) retrieving a first word from said memory containing said first portion during an (nth) instruction; b) shifting said first portion to a first position; c) retrieving a second word containing said second portion during an (n+k)th instruction; d) shifting said second portion to a second position; and e) merging said first and second portions; wherein said k and n are positive integers and wherein said first and second portions have less than m bits.

In still another aspect of the invention there is provided a method of storing an unaligned reference into a computer memory, said computer memory holding m-bit locations separated by word boundaries, comprising the steps of: a) shifting a first portion of said reference to a first position; b) storing said first portion in one location within an nth instruction; c) shifting a second portion of second portion of said reference to a second position; and d) storing said second portion to a second location within an (n+k)th instruction, wherein n and k are positive integers and wherein said first and second portions have less than m bits.

Briefly, a RISC device for handling unaligned references includes an instruction set which has four instructions: two instructions (Load Word Left and Load Word Right) for loading an n byte unaligned reference from a memory into an n byte general register; and two instructions (Store Word Left and Store Word Right) for storing an unaligned reference from the general register into the memory. The two instructions of each pair are used sequentially in a manner which allows the corresponding instruction sequences to overlap. Therefore, the total time required to store or load an unaligned reference is much shorter than the time required to execute two independent instructions.

The device includes several latches through which data is propagated during the above-mentioned instructions and a multiplexer register used to assemble the different sections of an unaligned reference.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements, and in which:

Figure 1 shows in diagrammatic form elements of an embodiment of the present invention;

Figure 2 shows the general register after a Load Word Left instruction;

Figure 3 shows the general register after a Load Word Right instruction;

Figure 4 shows successive operational intervals for Load Word Left and Load Word Right instructions;

Figure 5 shows the general register and the cache memory before the STORE instructions;

Figure 6 shows the cache memory after the unaligned reference has been stored; and

Figure 7 shows, in block diagram form, a circuit arrangement used for executing the instruction set.
DETAILED DESCRIPTION OF THE INVENTION

Embodiments of the invention shall be described in connection with a 32-bit device, i.e., a RISC device in which four-byte words with eight bit bytes are handled. However, it should be understood that the means and method for handling unaligned references described herein is equally applicable to devices that handle longer or shorter words or bytes.

Further, although this description is with respect to big endian addressing, it is equally applicable to little endian addressing.

With reference to the drawings, Figure 1 shows a RISC device 10 comprising an instruction memory 12 (which is comprised of random access memory ("RAM"), read-only memory ("ROM"), or an instruction cache memory) which holds the instructions which make up the operating system of the device, an arithmetic logic unit ("ALU") 14, a general register 16, and a cache memory 18. The general register 16 is four bytes wide, and cells are identified in Fig. 1 as cells J, K, L, and M, respectively. Similarly, cache memory 18 is organized to hold data in rows, with each row of four bytes being addressable simultaneously. Each row therefore can be identified by the first cell of the row. Thus, the cache memory is made up of rows 0, 4, 8, etc. For example, cache memory may contain a two byte data group X1, X2; a four byte data group D1, D2, D3 and D4; and another two byte group Y1 and Y2. As can be seen from Figure 1, because the first group (X1, X2) is only two bytes long, the full or one word long data group D1-D4 overlaps the boundary between rows 0 and 4 of the cache memory. As a result, if a normal load instruction is used such as LOAD WORD 0 to load the contents of memory row 0 into general register 16, only the first two bytes D1 and D2 are obtained. Special provisions must be made to save these bytes and then LOAD WORD 4 to obtain the remaining bytes D3 and D4. This is accomplished in the present invention by using two special instructions named Load Word Left and Load Word Right, hereinafter called LWL and LWR, respectively. Each of these instructions is followed by two arguments. The two instructions and their arguments are defined more specifically below:

TABLE 1
LOAD INSTRUCTIONS

Instruction        Arguments          Definition

Load Word Left     R, Byte Address    loads the left portion of the register R with data beginning at the specified memory byte address and proceeding rightward to the memory word boundary.

Load Word Right    R, Byte Address    loads the right portion of the register R with data beginning at the memory word boundary and proceeding rightward to the specified memory byte address.

As shown below, at the end of the fourth interval, the data bytes removed from the cache memory are saved in the general register in such a manner that they are not erased by the next load operation (LWR). This allows the bytes obtained by LWL and LWR instructions to be combined properly.

Thus, in order to load word D1-D4 from the cache memory into the general register 16, first the following instruction is used: LWL R,2. This instruction loads bytes D1 and D2 into cells J and K, respectively, as shown in Figure 2. Thereafter, the instruction LWR R,5 is used which loads bytes D3 and D4 into cells L and M, respectively, as shown in Figure 3, thereby completing the loading of the word. In general, for a big endian device and a memory having rows four bytes wide, if the Byte Address for the LWL instruction is X, the Byte Address for the corresponding LWR instruction is X + 3.
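The behavior defined in Table 1 can be sketched in Python as follows. This is a simplified byte-list model of the big endian example, not the circuit of the invention; the function names `lwl` and `lwr`, the list representation, and the `old_*` placeholders for the prior register contents are assumptions made for illustration.

```python
WORD_BYTES = 4  # rows four bytes wide


def lwl(register, memory, byte_address):
    """Load Word Left: fill the left cells of the register with memory bytes
    from byte_address rightward to the end of its row."""
    row_start = byte_address - byte_address % WORD_BYTES
    data = memory[byte_address:row_start + WORD_BYTES]
    return data + register[len(data):]


def lwr(register, memory, byte_address):
    """Load Word Right: fill the right cells of the register with memory bytes
    from the start of the row up to and including byte_address."""
    row_start = byte_address - byte_address % WORD_BYTES
    data = memory[row_start:byte_address + 1]
    return register[:WORD_BYTES - len(data)] + data


# Cache rows 0 and 4 as in Figure 1: X1 X2 D1 D2 | D3 D4 Y1 Y2
memory = ["X1", "X2", "D1", "D2", "D3", "D4", "Y1", "Y2"]
register = ["old_J", "old_K", "old_L", "old_M"]  # hypothetical prior contents

after_lwl = lwl(register, memory, 2)   # LWL R,2
after_lwr = lwr(after_lwl, memory, 5)  # LWR R,5 (address X + 3)
print(after_lwl)
print(after_lwr)
```

Applying the two calls in the opposite order gives the same final register contents in this model, because each instruction writes only its own cells and leaves the others untouched.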

Advantageously, the two instructions described above may be executed in an overlapping manner. Thus, the following five step sequence may be required to perform one of the load operations described above:

1. Fetch instruction from the ROM (step "I");
2. Read Register File (step "R");
3. Compute address (step "A");
4. Access Cache Memory (step "M"); and
5. Write into Register File (step "W").

These steps are taken by the ALU 14 and may be overlapped as shown in Figure 4 as follows. The first instruction -- LWL R,2 -- may start during interval 1 and end in interval 5, with each of the intervals being used for one of the steps I, R, A, M, and W as defined above. However, the second instruction -- LWR R,5 -- can start during interval number 2 as shown in Figure 4. Because the device does not have to wait for the completion of the first instruction before starting the second, the overall speed of operation of the device is increased. Thus, loading the unaligned reference word requires only six intervals, only one interval more than the number of intervals required for a single instruction.
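The interval count can be checked with a short Python sketch. It assumes, as in the description of Figure 4, that each step takes one interval and that the second instruction is issued one interval after the first; the `schedule` helper is a hypothetical name used only for this illustration.

```python
STEPS = ["I", "R", "A", "M", "W"]  # fetch, read, address, memory, write


def schedule(instructions):
    """Issue one instruction per interval, beginning at interval 1, and
    return {instruction: {interval: step}}."""
    timeline = {}
    for first_interval, name in enumerate(instructions, start=1):
        timeline[name] = {first_interval + i: step for i, step in enumerate(STEPS)}
    return timeline


timeline = schedule(["LWL R,2", "LWR R,5"])
total_intervals = max(max(steps) for steps in timeline.values())

# Print one row per instruction; "." marks an interval in which it is idle.
for name, steps in timeline.items():
    row = " ".join(steps.get(i, ".") for i in range(1, total_intervals + 1))
    print(f"{name}: {row}")
print("intervals needed:", total_intervals)
```

Without overlap the same two instructions would occupy ten intervals; with a one-interval offset they finish in six.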

The pair of LOAD instructions LWL and LWR can be executed in either order, however: either LWL or LWR can come first. Furthermore, the LOAD instructions still work when they are not adjacent.

The above-described procedure is readily extendable to the storage of an unaligned reference. In Figure 5, general register 16 contains a four byte word E1, E2, E3 and E4 which is to be stored in the same order in positions P1-P4. In order to perform this operation, the device uses the instructions Store Word Left ("SWL") and Store Word Right ("SWR"), each having two arguments. The two STORE instructions are defined in Table 2 below.

TABLE 2

Instruction         Arguments          Definition

Store Word Left     R, Byte Address    stores data from the left portion of the register R into the specified memory byte address and proceeding rightward to the memory word boundary.

Store Word Right    R, Byte Address    stores data from the right portion of the register into the memory byte just after the memory word boundary, and proceeding rightward to the specified memory byte address.

In general, for a big endian device and a memory having rows four bytes wide, if the Byte Address for the SWL instruction is X, then the Byte Address for the corresponding SWR instruction is X + 3.

At the end of the first STORE instruction, bytes E1 and E2 are stored at addresses 2 and 3, respectively, and at the end of the second STORE instruction, bytes E3 and E4 are stored at addresses 4 and 5, respectively, as shown in Figure 6.

Like the LOAD instructions, the STORE instructions can be executed in either order; either SWL or SWR can come first. Furthermore, the STORE instructions still work when they are not adjacent.
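The two STORE instructions of Table 2 can be sketched in the same style. This is again a byte-list model under the big endian, four-byte-row assumptions of the example; the names `swl` and `swr` and the placeholder labels `m0` to `m7` for the prior memory contents are illustrative only.

```python
WORD_BYTES = 4  # rows four bytes wide


def swl(register, memory, byte_address):
    """Store Word Left: copy the left cells of the register into memory from
    byte_address rightward to the end of its row."""
    row_end = byte_address - byte_address % WORD_BYTES + WORD_BYTES
    count = row_end - byte_address
    memory[byte_address:row_end] = register[:count]


def swr(register, memory, byte_address):
    """Store Word Right: copy the right cells of the register into memory from
    the start of the row up to and including byte_address."""
    row_start = byte_address - byte_address % WORD_BYTES
    count = byte_address - row_start + 1
    memory[row_start:byte_address + 1] = register[WORD_BYTES - count:]


register = ["E1", "E2", "E3", "E4"]
memory = [f"m{i}" for i in range(8)]  # hypothetical prior contents of rows 0 and 4

swl(register, memory, 2)  # SWL R,2
swr(register, memory, 5)  # SWR R,5 (address X + 3)
print(memory)
```

Bytes at addresses 0, 1, 6 and 7 keep their previous values, and swapping the order of the two calls leaves the final memory contents unchanged in this model.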
A circuit for executing the four instructions is shown in block diagram form in Figure 7. This circuit may be implemented directly, or by using software. The circuit comprises a shift/merge unit 20 which receives an input from cache memory 18 and generates an output which is fed (in parallel) to a latch 22. The latch 22 in turn feeds a general register 16 to be designated by the argument R in the appropriate instruction. The contents of general register 16 are propagated during each operational interval through a latch 24, shift unit 26, and latch 28. Latch 28 can feed the cache memory 18. There is also a first feedback path from the output of latch 28 to a first input of bypass multiplexer unit 30. The multiplexer unit 30 has a second input connected to the output of latch 22, which therefore forms a second feedback path. The output of multiplexer unit 30 is also fed to shift/merge unit 20. During the STORE instructions, the multiplexer 30, shift/merge unit 20, and latch 22 are not in operation. During the LOAD instructions, shift unit 26 merely feeds through the data from latch 24 to latch 28 without any appreciable time delay. One of the purposes of latch 24 and latch 28 is to match the delay of the circuit path containing those latches with the number of steps making up an instruction. If the number of steps making up an instruction were increased or decreased, the number of latches in the circuit would change accordingly. The circuit of Fig. 7 operates as follows.

A LWL instruction is received during interval 1 (see Figure 4). Then in interval 4, the four bytes from the row containing the address defined in the argument Byte Address are shifted to the left by the shift/merge unit 20 and merged with what had been the contents of the general register 16 two intervals earlier. (The contents of general register 16 having been fed through latch 24, shift unit 26, latch 28, and bypass multiplexer 30). The results of this operation are stored in latch 22 at the end of interval 4. Thus, if row 0 is read from the cache memory 18, latch 22 will contain the bytes D1, D2, Y, and Z, wherein Y and Z were the earlier contents of general register 16 memory cells L and M. Earlier, during interval 2, instruction LWR R,5 is also received. In interval 5, the contents of latch 22 are fed to general register R. At the same time, the LWR instruction causes the contents of row 4 to be read into shift/merge unit 20. This time these bytes are shifted right until the end of the word boundary. Because the two instructions refer to the same general register and are adjacent, multiplexer 30 is now set to feed the contents of latch 22 to shift/merge unit 20. Thus, during interval 5, the bytes D1, D2, D3, and D4 are assembled within the shift/merge unit 20 and fed to latch 22. During interval 6 these bytes are fed to register 16.
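The shift-and-merge step can be illustrated at the level of 32-bit words. The Python sketch below is one possible model of what shift/merge unit 20 computes, assuming big endian byte numbering; the function names, the mask arithmetic, the `latch_22` variable, and the hexadecimal stand-ins for the bytes of Figure 1 are all assumptions for illustration and are not taken from the specification.

```python
MASK32 = 0xFFFFFFFF


def shift_merge_left(row_word, previous_word, byte_offset):
    """LWL-style merge: shift the row left by byte_offset bytes and keep the
    low byte_offset bytes of the previous register contents."""
    shift = 8 * byte_offset
    keep_mask = (1 << shift) - 1
    return ((row_word << shift) & MASK32) | (previous_word & keep_mask)


def shift_merge_right(row_word, previous_word, byte_offset):
    """LWR-style merge: shift the row right so the addressed byte lands in the
    rightmost cell and keep the remaining high bytes of the previous contents."""
    shift = 8 * (3 - byte_offset)
    keep_mask = (MASK32 << (32 - shift)) & MASK32
    return (row_word >> shift) | (previous_word & keep_mask)


row_0 = 0xA1A2D1D2         # stand-ins for X1 X2 D1 D2
row_4 = 0xD3D4B1B2         # stand-ins for D3 D4 Y1 Y2
old_register = 0x11223344  # arbitrary earlier contents of cells J, K, L, M

# LWL R,2: byte offset 2 within row 0; the result is held in latch 22.
latch_22 = shift_merge_left(row_0, old_register, 2 % 4)
# LWR R,5: byte offset 1 within row 4; the bypass path supplies latch 22.
result = shift_merge_right(row_4, latch_22, 5 % 4)
print(hex(latch_22), hex(result))
```

The intermediate value corresponds to the "D1, D2, Y, Z" state described above, and the final value to the fully assembled word.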

The STORE instructions are executed as follows. The unaligned reference word is fed from the general register 16 (identified as register R) to latch 24. During the first STORE instruction -- SWL R,2 -- the word fed from latch 24 is shifted in shift unit 26 to the right by two bytes so that bytes E1 and E2 are in the right hand position. The contents of the shift unit 26 are then fed to latch 28, which then sends the same to address 2 of the cache memory. More particularly, for SWL R,2, E1 and E2 are stored at locations P1 and P2, respectively, without disturbing the contents at memory addresses 0 and 1 (Figs. 5 and 6). The unaligned reference word is again fed from general register 16 to latch 24. In response to the SWR R,5, the contents of latch 24 are shifted to the left so that bytes E3 and E4 are on the left side of the shift unit 26, and are then fed to row 4 by latch 28. More particularly, during SWR, bytes E3 and E4 are stored in locations P3 and P4 without disturbing the contents at addresses 6 and 7.
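A word-level sketch of the store path is given below in Python. It models the shift performed by shift unit 26 and uses a byte write mask to leave the other bytes of the row undisturbed; the write mask is an assumption about how that could be achieved and is not described in the specification. The function names and the hexadecimal stand-in values are likewise illustrative.

```python
MASK32 = 0xFFFFFFFF


def store_left(register_word, row_word, byte_offset):
    """SWL-style store: shift the register right by byte_offset bytes and write
    only the bytes from byte_offset to the end of the row."""
    shift = 8 * byte_offset
    shifted = register_word >> shift
    write_mask = MASK32 >> shift
    return (row_word & ~write_mask & MASK32) | (shifted & write_mask)


def store_right(register_word, row_word, byte_offset):
    """SWR-style store: shift the register left so its rightmost byte lands at
    byte_offset and write only the bytes from the row start to byte_offset."""
    shift = 8 * (3 - byte_offset)
    shifted = (register_word << shift) & MASK32
    write_mask = (MASK32 << shift) & MASK32
    return (row_word & ~write_mask & MASK32) | shifted


register = 0xE1E2E3E4  # stand-ins for E1 E2 E3 E4
row_0 = 0xA0A1A2A3     # arbitrary prior contents of addresses 0-3
row_4 = 0xA4A5A6A7     # arbitrary prior contents of addresses 4-7

new_row_0 = store_left(register, row_0, 2 % 4)   # SWL R,2
new_row_4 = store_right(register, row_4, 5 % 4)  # SWR R,5
print(hex(new_row_0), hex(new_row_4))
```

After both calls, the bytes at addresses 0, 1, 6 and 7 are unchanged, while addresses 2 to 5 hold E1 to E4 in order.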

In devices in which error correction coding ("ECC") is used, a read modify write cycle is performed so that a new ECC code is calculated after each STORE instruction.

As with the LOAD instructions, the STORE instructions SWR and SWL are overlapped to reduce the overall time required to complete the instructions. Thus, the two instructions required to store the unaligned reference require only six intervals, only one interval more than the number of intervals required for a single instruction. It should be appreciated that since each row of the cache memory is handled separately on an individual basis, the fact that a reference may overlap a page boundary within the memory has no effect on the device.

It should also be noted that the pair of STORE instructions can be executed in either order; either SWL or SWR can come first. Correspondingly, the pair of LOAD instructions can also be executed in either order: either LWL or LWR can come first. Further, the LOAD instructions still work when they are not adjacent, and the same is true with respect to the STORE instructions.

The above set of instructions is suitable for a big endian device, i.e., a device in which the leftmost bit is the most significant bit. However, the same arrangement and procedure may be used for a little endian device, i.e., a device wherein the leftmost bit of a byte is the least significant bit. The only change that needs to be made is to increment the address value of the arguments to the LWL and SWL instructions by 3 rather than to increment the arguments to the LWR and SWR instructions (as is done in the big endian device). Alternatively, a generic set of instructions could be used by changing "left" and "right" in the above instructions to "lower address" and "higher address," wherein the "lower address" instructions would operate as "left" on a little endian machine and "right" on a big endian machine, and the "higher address" instructions would operate as "right" on a little endian machine and "left" on a big endian machine. This set of instructions could also be used for devices which can handle both big endian and little endian data (i.e., dual byte order devices).
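The address rule stated above for the two byte orders can be summarized in a few lines of Python. The function name `left_right_addresses` is hypothetical; the rule itself (add 3 to the "right" instruction's address on a big endian device, and to the "left" instruction's address on a little endian device) follows the preceding paragraph and assumes four-byte words.

```python
def left_right_addresses(x, big_endian=True):
    """Return (left_address, right_address): the Byte Address arguments for the
    LWL/SWL and LWR/SWR instructions when the lowest byte address of the
    four-byte unaligned reference is x."""
    if big_endian:
        return x, x + 3  # increment the LWR/SWR argument
    return x + 3, x      # increment the LWL/SWL argument


print(left_right_addresses(2, big_endian=True))
print(left_right_addresses(2, big_endian=False))
```

For the example reference starting at address 2, the big endian pair matches the LWL R,2 and LWR R,5 sequence used throughout the description.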

In the foregoing specification, the invention has been described with reference to specific exemplary embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the invention as set forth in the appended claims. The specification and drawings are, accordingly, to be regarded in an illustrative rather than a restrictive sense.

Claims

1. In a reduced instruction set computer with a memory holding m-bit words separated by word boundaries, a device for retrieving an unaligned reference from said memory comprising:
a) a general register;
b) means for retrieving a first word containing a first portion of said unaligned reference in response to a nth instruction and a second word containing a second portion of said unaligned reference from said memory in response to an (n+k)th instruction;
c) shifting means for shifting said first portion to a first position and second portion to a second position; and
d) combining means for combining said first and second portions in said general register, wherein k and n are positive integers.

2. In reduced instructions set computer, a device for storing an unaligned reference into a memory with m-bit locations comprising:
shifting means for shifting said unaligned reference in a first direction in response to an nth instruction and in a second direction in response to (n+k)th instruction, said means generating sequentially a first and second portion each having less than m-bits; and
means for storing said first and second portions sequentially into said memory, wherein k and n are positive integers.

3. In a reduced instruction set computer with a memory for holding m-bit words, a device for loading a first unaligned reference having first and second portions of less than m-bits, said first portion being stored into a first section of said memory and said second portion being stored into a second section of said memory, and for storing a second aligned reference into said first and said second sections, comprising:

a shift/merge unit having first and second inputs and being provided to shift first data bytes received from said first input, said first input being coupled to said memory unit to receive said first and second portions sequentially, and merge said first data bytes with second data bytes from said second input to form an m-bit word;
a first latch means for storing said first and second data bytes, said latch having an output coupled to said second input;
an m-bit general register coupled to said first latch means and provided for holding selectively one of said first or second unaligned references;
a second latch means coupled to said register for storing said second unaligned reference;
shifting means for shifting said second unaligned references; and
output means for storing said second unaligned reference after shifting by said shifting means into said memory.

4. The device of claim 3, wherein said shift/merge unit shifts bytes received from said memory in a first direction in response to a first load instruction, and in a second direction in response to a second load instruction.

5. The device of claim 3, wherein said shifting means shifts bytes received from said second latch means in a first direction in response to a first store instruction, and in a second direction in response to a second store instruction.

6. The device of claim 3, further comprising a bypass multiplexer for selectively coupling to said second input one of the outputs of said first and second latching means.

7. The device of claim 4, wherein said first and second load instructions are at least partially overlapped.

8. The device of claim 5, wherein said first and second store instructions are at least partially overlapped.

9. A method of loading an m-bit unaligned reference from a memory, said memory holding m-bit words separated by word boundaries, said m-bit unaligned reference being divided into a first portion and a second portion by a word boundary, comprising the steps of:
a) retrieving a first word from said memory containing said first portion during an (nth) instruction;
b) shifting said first portion to a first position;
c) retrieving a second word containing said second portion during an (n+k)th instruction;
d) shifting said second portion to a second position; and
e) merging said first and second portions;
wherein said k and n are positive integers and wherein said first and second portions have less than m bits.

10. The method of claim 9, wherein said first and second positions are defined by said nth and (n+k)th instruction respectively.

11. The method of claim 9, wherein said nth and (n+k)th instruction are overlapped.

12. A method of storing an unaligned reference into a computer memory, said computer memory holding m-bit locations separated by word boundaries, comprising the steps of:
a) shifting a first portion of said reference to a first position;
b) storing said first portion in one location within a nth instruction;
c) shifting a second portion of second portion of said reference to a second position; and
d) storing said second portion to a second location within an (n+k)th instruction, wherein n and k are positive integers and wherein said first and second portions have less than m bits.

13. The method of claim 12, wherein said first and second position are defined by said nth and (n+k)th instruction respectively.

14. The method of claim 12, wherein said nth and (n+k)th instructions are overlapped.
