WO2008108775A2 - Dynamic partitioning for area-efficient multi-port memory

Info

Publication number: WO2008108775A2
Application number: PCT/US2007/008542
Authority: WO
Inventors: Xinghao Chen; Hassan Bajwar
Original assignee: Xinghao Chen; Hassan Bajwar
Priority date: 2006-04-07
Filing date: 2007-04-06
Publication date: 2008-09-12
Also published as: WO2008108775A3

Abstract

Dynamic partitioning for area-efficient multi-port memory (40). At least one bit-line isolation node (52, 54) is placed on bit lines (46, 48) between word line sections (42, 44) and separates a long physical bit line (46, 48) into two virtually and electronically separate sections (66, 70), with one section (66) electronically connected to an upper port (64) and the other section (70) electronically connected to a lower port (68), to shorten the effective bit lines physically connected to a port (64, 68). At least one isolation control line (50) is provided, each of which controls the bit-line isolation nodes (52, 54) separating the same word line sections (42, 44). At least one dynamic memory partition control circuit (51) has outputs feeding corresponding isolation control lines (50).

Description

DYNAMIC PARTITIONING FOR AREA-EFFICIENT MULTI-PORT MEMORY

Background of the Invention

Field of the Invention: The embodiments of the present invention relate to a multi-port memory, and more particularly, the embodiments of the present invention relate to dynamic partitioning for area-efficient multi-port memory.

Description of the Prior Art:

Advances in integrated circuit technology have led to much faster, higher-performance microprocessors. At the same time, however, latency in memory access has emerged as a serious limitation on overall system performance.¹ Design houses and manufacturers have responded with faster memory systems that use efficient readout and layout algorithms,² hierarchical cache memory,³ high-speed memory clocks,⁴ and multiple ports to access memory partitions in parallel.⁵

As shown in FIGURE 1, which is a diagrammatic representation of a typical SRAM cell in a single-port memory system, when a word line (WL) 12 of a typical SRAM cell 10 in a single-port memory system is selected, the SRAM cell 10 is connected to a pair of complementary bit lines (BL and BL̅) 14, 16 via a transistor (T5) 18 and a transistor (T6) 20.

¹ A. Cristal, O. J. Santana, F. Cazorla, M. Galluzzi, T. Ramirez, M. Pericas, and M. Valero, "Kilo-instruction Processors: Overcoming the Memory Wall." IEEE Micro, Vol. 25, No. 3, pp. 48-57, May-June 2005.

² Y. Choi and T. Kim, "Memory Layout Techniques for Variables Utilizing Efficient DRAM Access Modes in Embedded System Design." In the Proceedings of Design Automation Conference, pp. 881-886, June 2003; and R. D. Adams, "High Performance Memory Testing: Design Principles, Fault Modeling and Self-Test." Kluwer Academic Publishers, Boston, 2003.

³ N. Shibata, M. Watanabe, and Y. Tanabe, "A Current Sensed High Speed and Low Power First-in First-Out Memory Using a Word/Bit Line Swapped Dual-Port SRAM Cell." IEEE Journal of Solid-State Circuits, Vol. 37, No. 6, pp. 735-750, June 2002.

⁴ W. Fang, J. Lijiu, "Design of High Speed 2-Write/6-Read Eight-Port Register File ASIC." In the Proceedings of the 5th International Conference on ASIC, Vol. 1, pp. 498-501, October 2003.

⁵ W. Fang, J. Lijiu, "Design of High Speed 2-Write/6-Read Eight-Port Register File ASIC." In the Proceedings of the 5th International Conference on ASIC, Vol. 1, pp. 498-501, October 2003; and R. D. Adams, "High Performance Memory Testing: Design Principles, Fault Modeling and Self-Test." Kluwer Academic Publishers, Boston, 2003.

As shown in FIGURE 2, which is a diagrammatic representation of a classic hardwired dual-port memory architecture by which each SRAM cell is accessible by two ports with dedicated word and bit lines to each, the addition of the word line (WL) 22, the bit lines 23, 25, the access transistor (T7) 24, and the access transistor (T8) 26 would almost double the silicon area of the SRAM cell 28.

Each of the dual ports 30, 32 of the SRAM cell 28 is hardwired with dedicated word lines (WL) 12, 22 and bit lines (BL and BL̅) 14, 16, 23, 25. The duplicated bit lines 23 and 25 for the second port cause the bit-line leakage current to double. The addition of the word line (WL) 22, the bit lines (BL) 23 and (BL̅) 25, and the access transistors (T7) 24 and (T8) 26 for the second port 32 would almost double the silicon area of the single-port configuration shown in FIGURE 1. Subthreshold leakage current, i.e., the drain-source current of a transistor operating in weak inversion, is a major contributor to SRAM leakage current.⁶ Pre-charging the bit lines (BL and BL̅) 14, 16, 23, 25, as well as keeping these bit lines high, causes significant power dissipation and contributes heavily to the total power dissipation.

The dual-port — as well as multi-port — memory architecture has been applied with instruction and data cache implementations in multi-core processors in recent years. The most important advantage of dual- and multi-port cache is that it can execute multiple cache accesses simultaneously. Therefore, it doubles — in the case of dual-port — or multiplies — in the case of multi-port — the speed of a single-port cache.

The IBM Power5® is a single-die silicon solution that contains two identical processor cores, each supporting two logical threads.⁷ The Power5® uses separate buses for READ and WRITE with its L3 cache. Its L2 cache is a dual-port design and is implemented on the same die as three identical slices with separate controllers. Each processor core can independently access each controller.

⁶ M. Mamidipaka, K. Khouri, N. Dutt, and M. Abadir, "Analytical Models for Leakage Power Estimation of Memory Array Structures." In International Conference on Hardware/Software Co-design and System Synthesis (CODES+ISSS), pp. 146-151, 2004.

⁷ R. Kalla, B. Sinharoy, J. M. Tendler, "IBM Power5® Chip: A Dual-Core Multithreaded Processor." IEEE Micro, Vol. 24, No. 2, pp. 40-47, March-April 2004.

The Intel Itanium®-2 uses a combination of dual-port and four-port caches to support the dual-core processors on the chip.⁸ A four-port data cache allows four M operations per cycle. The Itanium®-2 L1 cache uses separate memory ports for different functions, i.e., Ports 0 and 1 support load operations and Ports 2 and 3 support integer operations. The L2 cache is shared for data and instructions.

The new Intel Montecito dual-core dual-thread Itanium® processor uses 3 levels of cache.⁹ The L1 cache uses two separate caches for instruction (L1I) and data (L1D). The L1D allows two integer loads and stores at the same time with its dual ports, while the L1I uses dual-ported tags and a single data port to support simultaneous demand and prefetch accesses. The L2 cache also has dedicated L2I and L2D caches, with the tag array supporting four-port independent accesses in the same cycle.

With multi-core processor technologies, such as the IBM Power5®, the Intel Itanium®-2, and the Pentium®-D, which are capable of integrating 2 processor cores on the same die, moving into mainstream applications, making cache memory highly accessible by multiple processors becomes a necessity, with many technical challenges.

Multi-port cache memory would provide the needed accessibility to multiple CPUs.

Historically, multi-port memory designs have been implemented with dedicated word and bit lines for each port. Since the silicon area used by word and bit lines covers almost the entire memory area, duplicating the word and bit lines multiplies the silicon area used by a multi-port memory system by the number of ports.

Thus, there exists a need for a multi-port memory system that can be implemented without duplicating the word and bit lines, and that utilizes dynamic memory partitioning technology with isolation nodes to facilitate multi-port accesses to the same memory core without duplicating the word and bit lines for each port, thereby maximizing utilization of silicon space for a multi-port memory.

⁸ S. D. Naffziger, G. Colon-Bonet, T. Fischer, R. Riedlinger, T. J. Sullivan, and T. Grutkowski, "The Implementation of the Itanium®-2 Microprocessor." IEEE Journal of Solid-State Circuits, Vol. 37, No. 11, pp. 1448-1460, Nov. 2002; and T. Lyon, F. Delano, C. McNairy, and D. Mulla, "Data Cache Design Considerations for the Itanium®-2 Processor." In the Proceedings of the IEEE International Conference on Computer Design: VLSI in Computers and Processors, pp. 356-362, September 2002.

⁹ C. McNairy and R. Bhatia, "Montecito: A Dual-Core, Dual-Thread Itanium® Processor." IEEE Micro, Vol. 25, No. 2, pp. 10-20, March-April 2005.

Summary of the Invention

Thus, an object of the present invention is to provide dynamic partitioning for area-efficient multi-port memory that avoids the disadvantages of the prior art.

Briefly stated, another object of the present invention is to provide dynamic partitioning for area-efficient multi-port memory. An isolation control line drives the same number of isolation nodes as the number of access transistors that a word line must drive. Therefore, circuits driving the isolation control lines are similar to those driving the word lines. To reduce the dynamic load of isolation control lines, it is possible to utilize the selected bit lines from the column decoder to dynamically turn ON or OFF only selected isolation nodes of a selected isolation control line.

The novel features considered characteristic of the present invention are set forth in the appended claims. The invention itself, however, both as to its construction and its method of operation and together with additional objects and advantages thereof will be best understood from the following description of the specific embodiments when read and understood in connection with the accompanying drawing.

Brief Description of the Drawing

The figures of the drawing are briefly described as follows: FIGURE 1 is a diagrammatic representation of a typical SRAM cell in a single-port memory system; FIGURE 2 is a diagrammatic representation of a classic hardwired dual-port memory architecture by which each SRAM cell is accessible by two ports with dedicated word and bit lines to each; and FIGURE 3 is a diagrammatic representation of, in the case of SRAM, placement of isolation control line and isolation nodes on bit lines.

List of Reference Numerals Utilized in the Drawing

10 typical SRAM cell in single-port memory system

12 word line (WL) of typical SRAM cell in single-port memory system 10

14 bit line (BL) of typical SRAM cell in single-port memory system 10

16 bit line (BL̅) of typical SRAM cell in single-port memory system 10

18 transistor (T5) of typical SRAM cell in single-port memory system 10

20 transistor (T6) of typical SRAM cell in single-port memory system 10

22 word line (WL) of typical SRAM cell in dual-port memory system 28

23 bit line (BL) of second port of typical SRAM cell in hardwired dual-port memory system 28

24 access transistor (T7) of typical SRAM cell in dual-port memory system 28

25 bit line (BL̅) of second port of typical SRAM cell in hardwired dual-port memory system 28

26 access transistor (T8) of typical SRAM cell in dual-port memory system 28

28 typical SRAM cell in hardwired dual-port memory system

30 port of typical SRAM cell in hardwired dual-port memory system

32 port of typical SRAM cell in hardwired dual-port memory system

40 hardwired dual-port memory using dynamic partitioning

42 word line of hardwired dual-port memory using dynamic partitioning 40

44 word line of hardwired dual-port memory using dynamic partitioning 40

46 bit line (BL) of hardwired dual-port memory using dynamic partitioning 40

48 bit line (BL̅) of hardwired dual-port memory using dynamic partitioning 40

50 isolation control line of hardwired dual-port memory using dynamic partitioning 40

51 at least one dynamic memory partition control circuit

52 isolation node of hardwired dual-port memory using dynamic partitioning 40

54 isolation node of hardwired dual-port memory using dynamic partitioning 40

56 access transistor of hardwired dual-port memory using dynamic partitioning 40, which is same as transistor (T5) 18 of typical SRAM cell in single-port memory system 10 but differentiated for legal clarity

58 access transistor of hardwired dual-port memory using dynamic partitioning 40, which is same as transistor (T6) 20 of typical SRAM cell in single-port memory system 10 but differentiated for legal clarity

60 access transistor of hardwired dual-port memory using dynamic partitioning 40, which is same as access transistor (T7) 24 of typical SRAM cell in dual-port memory system 28 but differentiated for legal clarity

62 access transistor of hardwired dual-port memory using dynamic partitioning 40, which is same as access transistor (T8) 26 of typical SRAM cell in dual-port memory system 28 but differentiated for legal clarity

64 upper port of hardwired dual-port memory using dynamic partitioning 40

66 upper section, resulting from dynamic partitioning, of hardwired dual-port memory using dynamic partitioning 40

68 lower port of hardwired dual-port memory using dynamic partitioning 40

70 lower section, resulting from dynamic partitioning, of hardwired dual-port memory using dynamic partitioning 40

Detailed Description of the Preferred Embodiments

Dual- or Multi-Port Memory

A dual-port (or multi-port) memory 28 can be implemented either with explicit duplication of the word line 12 and the bit lines (BL) 14, (BL̅) 16 of the port 30 for the second port 32 as the word line 22 and the bit lines (BL) 23, (BL̅) 25, hence a hardwired dual-port memory as shown in FIGURE 2, or via software support on top of a single-port memory, hence a soft dual-port memory. With the hardwired dual-port (or multi-port) memory 28, the silicon area is almost doubled (or multiplied). This is largely due to the duplicated word line 22 and bit lines (BL) 23, (BL̅) 25 for the second port. With a hardwired dual-port memory 28, the dedicated word lines 12, 22 and the bit lines 14, 16, 23, 25 allow the two ports 30, 32 to access a memory cell at the same time by providing direct parallel accesses. Direct parallel accesses, however, are utilized for simultaneous READ operations and for multiple READ and WRITE operations accessing different locations.

For two simultaneous READ operations accessing the same memory cells, the direct parallel accesses provided by the two ports 30, 32 connect the two sets of bit lines 14, 16, 23, 25 to the same memory cells. Therefore, each of the accessed memory cells must be able to drive the double load. This is the case when the transistor (T5) 18, the transistor (T6) 20, the transistor (T7) 24, and the transistor (T8) 26 in FIGURE 2 are turned ON at the same time. For two simultaneous READ operations (or one READ and one WRITE) accessing different memory cells, each of the accessed memory cells connects to one of the ports 30, 32 and is therefore in a load condition similar to that of a single-port memory 10. This is the case when the transistor (T5) 18 and the transistor (T6) 20 are turned OFF and the transistor (T7) 24 and the transistor (T8) 26 are turned ON, or vice versa.

For a simultaneous READ from one of the ports 30, 32 and WRITE from the other port (as well as for two simultaneous WRITE operations) accessing the same memory cells, the READ and WRITE operations must be sequenced properly to ensure correct results. In other words, READ and WRITE operations do not access the same memory cells at the same time. In this case, the READ is typically performed first, followed by the WRITE to the same cells, followed by another READ from the same cells. This sequencing can vary from design to design.
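The access cases described above for the classic hardwired dual-port memory can be summarized in a small sketch. This is only an illustration of the case analysis, not circuitry from the specification: the `Access` type, the `classify_hardwired` function, and the flat integer `cell` address are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    op: str    # "READ" or "WRITE"
    cell: int  # hypothetical flat cell address (assumption for illustration)


def classify_hardwired(a: Access, b: Access) -> dict:
    """Classify two simultaneous accesses on a hardwired dual-port memory."""
    if a.cell != b.cell:
        # Each accessed cell connects to only one port, so it sees a load
        # similar to that of a single-port cell.
        return {"mode": "parallel", "bitline_pairs_per_cell": 1}
    if a.op == "READ" and b.op == "READ":
        # T5, T6, T7 and T8 are all ON: the cell drives both sets of bit lines.
        return {"mode": "parallel", "bitline_pairs_per_cell": 2}
    # A WRITE targets the same cell as the other access: the operations are
    # sequenced, so the cell is assumed to see one port at a time.
    return {"mode": "sequenced", "bitline_pairs_per_cell": 1}
```

For example, two READs of cell 3 come back as a parallel access with a double load, while a READ and a WRITE of cell 3 come back as sequenced.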

Dynamically-Partitioned Memory

It has been shown that a hardwired dual-port memory 28 can be implemented without the duplicated word line 22 and bit lines 23, 25 for the second port by using dynamic partitioning. As shown in FIGURE 3, which is a diagrammatic representation of, in the case of SRAM, placement of isolation control line and isolation nodes on bit lines, an isolation control line 50 is placed between the two neighboring word lines 42, 44 and is fed by a corresponding output of the at least one dynamic memory partition control circuit 51. Isolation nodes 52, 54, controlled by the isolation control line 50, are interconnect switches that are placed on the bit lines 46, 48 between access transistors 56, 58, 60, 62 on neighboring word lines 42, 44, respectively. When one isolation control line 50 turns OFF its isolation nodes 52, 54 while the other isolation nodes on the same bit lines 46, 48 remain ON, the memory block is separated into two virtually disjoint sections, with an upper port 64 accessing an upper section 66 and a lower port 68 accessing a lower section 70. When all isolation nodes 52, 54 are ON, both the upper port 64 and the lower port 68 share the same bit lines 46, 48, one port at each end, to access the same memory cells.

By default, all isolation nodes 52, 54 are initially set to ON. To form a partition, only the isolation nodes 52, 54 of the selected isolation control line 50 are turned OFF, and the isolation nodes 52, 54 of a previously selected isolation control line 50 are turned back ON. When a previously selected isolation control line 50 can be used for the current configuration, which is the case in most instances, no repartitioning is needed. If the word lines 42, 44 are chosen in random order, the isolation control line 50 positioned in the middle, between the upper section 66 and the lower section 70, is expected to be used most often.
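The effect of the isolation nodes on one bit line can be modeled with a short sketch. It assumes, for illustration only, that word lines are numbered 0 (nearest the upper port) to N-1 (nearest the lower port) and that `node_on[i]` is the state of the isolation node between word lines i and i + 1. The function name and indexing are hypothetical, not taken from the specification.

```python
from typing import List, Tuple


def reachable_word_lines(node_on: List[bool]) -> Tuple[List[int], List[int]]:
    """Return (word lines reachable from the upper port, from the lower port).

    node_on[i] is True when the isolation node between word line i and
    word line i + 1 is ON (conducting), False when it is OFF (isolating).
    """
    n_word_lines = len(node_on) + 1

    # Walk down from the upper port until the first OFF node.
    upper = [0]
    for i, on in enumerate(node_on):
        if not on:
            break
        upper.append(i + 1)

    # Walk up from the lower port until the first OFF node.
    lower = [n_word_lines - 1]
    for i in range(len(node_on) - 1, -1, -1):
        if not node_on[i]:
            break
        lower.append(i)

    return upper, sorted(lower)
```

With every node ON, both ports reach all word lines. Turning OFF the node between word lines 2 and 3 in a five-word-line column leaves the upper port with word lines 0 to 2 and the lower port with word lines 3 and 4.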

With the new multi-port configuration 40, identification and setting of the dynamic isolation control line 50, i.e., dynamic partitioning (DP), must be carried out before the selected memory cells are actually accessed by a multi-port memory 40 operation. The DP overhead is minimal because it takes place in parallel with setting the memory-access addresses. Assuming the identification of an isolation control line 50 is processed in parallel with the setting of addresses and/or the pre-charge of the bit lines 46, 48, the only DP delay overhead is the setting of the isolation nodes 52, 54 of the selected isolation control line 50. Depending on the actual design of the multi-port memory 40, turning isolation nodes 52, 54 OFF can take place in parallel with normal memory operations, such as bit-line pre-charging. Overall, the DP delay overhead relative to the overall multi-port memory operation cycle is minimal, if not zero.

In the case of simultaneous READ operations accessing the same memory cells via the upper port 64 and the lower port 68, all isolation nodes 52, 54 are set to ON. The upper port 64 and the lower port 68 are connected to the same memory cells with one pair of bit lines 46, 48. In the case of simultaneous READ operations accessing different memory cells, an isolation control line 50 between the two selected — and different — word lines 42, 44 is selected to turn OFF its isolation nodes 52, 54, which separate the upper section 66 containing one selected word line 42 from the lower section 70 containing the other selected word line 44. In both cases, the simultaneous READ operations take one READ cycle.

In the case of one READ and one WRITE operation accessing memory cells on two different word lines 42, 44, the DP configuration process is basically the same as with two simultaneous READ operations accessing memory cells on two different word lines 42, 44. Because the READ and WRITE operations take place in parallel, whichever requires more time to complete determines the length of this READ/WRITE cycle.

In the case of one READ and one WRITE — as well as two simultaneous WRITE — operations accessing the same memory cells, no isolation line 50 is selected and all isolation nodes 52, 54 are ON. Correct operation results are ensured by careful sequencing of the two operations, which is similar to the same case with the classic hardwired dual-port memory architecture 28 as shown in FIGURE 2.
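The timing consequences of the three cases above can be sketched as a first-order estimate. The function below is hypothetical and works at word-line granularity. Its treatment of the sequenced case as simple back-to-back execution is an assumption made here, since the specification only states that the two operations are carefully sequenced.

```python
def dual_access_time(op_a: str, wl_a: int, op_b: str, wl_b: int,
                     t_read: float, t_write: float) -> float:
    """Estimate the time taken by two simultaneous accesses (arbitrary units)."""
    times = {"READ": t_read, "WRITE": t_write}
    if wl_a != wl_b:
        # Different word lines: one isolation control line is OFF and the two
        # operations run in parallel, so the slower one sets the cycle length.
        return max(times[op_a], times[op_b])
    if op_a == "READ" and op_b == "READ":
        # Same cells, two READs: all isolation nodes ON, one READ cycle.
        return t_read
    # Same cells with a WRITE involved: operations are sequenced.
    # ASSUMPTION: back-to-back execution, so the durations add up.
    return times[op_a] + times[op_b]
```

With `t_read=1.0` and `t_write=1.5`, a READ on word line 1 alongside a WRITE on word line 5 takes 1.5 units, whereas the same pair aimed at one word line is estimated at 2.5 units under the back-to-back assumption.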

The isolation control line 50 drives the same number of isolation nodes 52, 54 as the number of access transistors 56, 58 and 60, 62 that the word lines 42, 44 must drive, respectively. Therefore, circuits driving the isolation control lines 50 are similar to those driving the word lines 42, 44. To reduce the dynamic load of isolation control lines 50, it is possible to utilize the selected bit lines 46, 48 from the column decoder to dynamically turn ON or OFF only selected isolation nodes 52, 54 of a selected isolation control line 50.
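The load reduction from column-selected switching can be illustrated with a simple count. This sketch assumes two isolation nodes per column (one on each bit line of the complementary pair, matching the two access transistors per cell). The function name and the column indices in the example are hypothetical.

```python
from typing import Iterable, Optional


def driven_isolation_nodes(n_columns: int,
                           selected_columns: Optional[Iterable[int]] = None) -> int:
    """Count the isolation nodes that one isolation control line must switch."""
    nodes_per_column = 2  # one node on BL and one on its complement (assumption)
    if selected_columns is None:
        # No column gating: the control line drives every node along its row,
        # the same count as the access transistors on a word line.
        return n_columns * nodes_per_column
    # Column gating: only nodes on the selected bit lines are switched.
    return len(set(selected_columns)) * nodes_per_column
```

For a 64-column block this gives 128 switched nodes without gating, and 16 when only an 8-column word is selected.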

The dual-port memory architecture 40 does not require duplicating the word lines 42, 44 and the bit lines 46, 48 for the second port 64 or 68. It employs a technique that dynamically partitions a memory block into two virtually independent sections, an upper section 66 and a lower section 70. The partition is dynamic because it is determined in real time from the actual locations to be accessed.

Compared with the classic hardwired dual-port memory 28 as shown in FIGURE 2, the dynamic partitioning technique eliminates the dedicated word line 22 and bit lines 23, 25 for the second port 32. Since the word lines 12, 22 and the bit lines 14, 16, 23, 25 dominate the silicon area used by a dual-port memory 28, this elimination of the dedicated word line 22 and bit lines 23, 25 for the second port 32 presents a significant saving in silicon area. It also benefits signal quality because there are fewer parallel bit lines. The use of isolation nodes 52, 54 would appear to increase the silicon area. In reality, this increase is insignificant, if not negligible, for the same reason: the silicon area used by a dual-port memory 28 (or a multi-port memory) is dominated by the word lines 12, 22 and the bit lines 14, 16.

One concern with the proposed area-efficient multi-port memory architecture 40 is the addition of the isolation nodes 52, 54 to the bit lines 46, 48. Typically, the bit lines 46, 48 may cross 512 or fewer word lines 42, 44, which may require 511 or fewer isolation nodes 52, 54 on each bit line 46, 48. Therefore, it is important that the isolation nodes 52, 54 be designed with very good switching characteristics, i.e., very little resistance when turned ON and very high resistance when turned OFF.
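The arithmetic behind this concern can be sketched as follows. The helper names, the `section_size` parameter (one isolation control line per group of consecutive word lines, a variant the description also allows), and the example ON resistance are illustrative assumptions. Summing ON resistances in series is only a first-order estimate that ignores wire resistance and capacitance.

```python
import math


def isolation_nodes_per_bit_line(n_word_lines: int, section_size: int = 1) -> int:
    """Isolation nodes on one bit line when a node separates each section."""
    return math.ceil(n_word_lines / section_size) - 1


def worst_case_series_resistance(n_word_lines: int, r_on_ohms: float,
                                 section_size: int = 1) -> float:
    """First-order series resistance of a bit line with all nodes ON."""
    return isolation_nodes_per_bit_line(n_word_lines, section_size) * r_on_ohms


def can_separate(wl_a: int, wl_b: int, section_size: int = 1) -> bool:
    """True if some isolation node lies between the two word lines."""
    return wl_a // section_size != wl_b // section_size
```

For 512 word lines this yields 511 nodes per bit line with a node between every adjacent pair, and 63 nodes with sections of 8 word lines. The cost of larger sections is that word lines 3 and 5, for instance, fall into one section and can no longer be separated.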

It should be noted that implementations of DP are not limited to placing an isolation control line 50 and its directly controlled isolation nodes 52, 54 between every adjacent pair of word lines 42, 44. An isolation control line 50 can instead be placed between adjacent sections of n word lines 42, 44, resulting in less overhead and less timing impact on the bit lines 46, 48. The optimal n should be determined from statistical data on the memory access patterns of specific types of applications.

Impressions

A typical classic memory system includes:

• At least one memory cell.

• At least one bit line.

• At least one word line.

• At least one sense amplifier.

• At least one address decoder.

• At least one local data bus.

• At least one I/O circuit.

• At least one bit-line pre-charge circuit.

The dynamic memory partitioning function is accomplished by integrating into this typical classic memory system:

• At least one bit-line isolation node placed on bit lines between word line sections, with each section comprising at least one word line or several consecutive word lines.

• At least one isolation control line, with each controlling the isolation nodes on bit lines separating the same word line sections.

• At least one dynamic memory partition control circuit having outputs feeding the corresponding isolation control lines.

• Bit line pre-charge circuits, local data bus, and sense amplifiers used for the single-port configuration (now named the lower port) are duplicated at the other end of the bit lines to form the second port (now named the upper port).

These circuit components are added to a typical memory system on silicon according to IC manufacturing processes and sequences of steps.

For a READ operation in a typical classic memory system, a desired access address specifying the memory cells to be read from is first processed by the at least one address decoder, which enables the desired word and bit lines. The selected bit lines, or all of the bit lines, are pre-charged. The selected at least one word line turns on the access devices of the selected memory cells. The memory cells are now connected to the selected bit lines, which are in turn connected to the local data bus feeding the inputs of the at least one sense amplifier, which then sends data to the I/O circuits to complete the READ operation.

For a WRITE operation, a desired access address specifying the memory cells to be written into is first processed by the at least one address decoder, which enables the desired word and bit lines. The selected bit lines are connected to the local data bus and are pre-charged via the at least one pre-charge circuit. The at least one selected word line turns on the access devices of the selected memory cells. The I/O circuits put the data to be written into the selected memory cells on the selected bit lines via the local data bus. The data are transmitted via the local data bus onto the bit lines, arrive at the selected memory cells, and are then written into them.

For a classic hardwired dual- and multi-port memory system, dedicated word and bit lines, as well as local data bus and sense amplifiers, are used for each port.

Simultaneous access conflicts, e.g., READ and WRITE to the same memory cells at the same time, are managed by established protocols.

Integrated with the dynamic memory partitioning capability, the new multi-port memory system places isolation nodes on bit lines between word-line sections, each consisting of one or more consecutive word lines. Isolation nodes separating the same word line sections are controlled by the same isolation control line, which is connected to an output of the at least one dynamic memory partition control circuit. Additionally, separate local data buses and sense amplifiers are deployed at both ends of the bit lines residing in the same physical memory block. Initially, all isolation nodes are turned ON, such that all sections of the bit lines are connected via the isolation nodes. Each physical memory block is then dual-port capable.

For simultaneous (READ and/or WRITE) accesses to this physical memory block via the upper and lower dual ports, a dynamic partition can be established such that one access is facilitated via the upper port and the other access is facilitated via the lower port. This dynamic memory partition is constructed by (1) identifying the isolation control line separating the selected memory cells defined by one access address from those defined by the other access address, and (2) turning off the isolation nodes controlled by the identified isolation control line. Once a dynamic partition is formed, the remaining memory access operations at each port are carried out in the same way as with the classic hardwired multi-port memory system. To minimize the impact on access latency, the dynamic partitioning steps are carried out in parallel with the address line setup and bit line pre-charge phases of the normal memory access operation sequence. Furthermore, a new dynamic partition is formed only when simultaneous accesses cannot be facilitated with the current partition, or to minimize the length of the effective bit lines connected to the ports in real time.
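A minimal sketch of the address comparison performed by the partition control circuit might look as follows. It assumes that isolation control line i sits between word lines i and i + 1, and that word-line indices have already been decoded from the access addresses. The function names are hypothetical, and the midpoint choice for a new partition is an assumed policy, since the text leaves the choice among valid lines open.

```python
from typing import Optional


def lines_between(wl_a: int, wl_b: int) -> range:
    """Isolation control lines that separate word line wl_a from wl_b."""
    lo, hi = sorted((wl_a, wl_b))
    return range(lo, hi)  # line i sits between word lines i and i + 1


def choose_partition(current: Optional[int], wl_a: int, wl_b: int) -> Optional[int]:
    """Pick the isolation control line to hold OFF for this pair of accesses.

    Returns None when both accesses target the same word line, in which
    case all isolation nodes stay (or are turned back) ON.
    """
    candidates = lines_between(wl_a, wl_b)
    if len(candidates) == 0:
        return None
    if current is not None and current in candidates:
        # The current partition already separates the two accesses:
        # no repartitioning is needed.
        return current
    # ASSUMED POLICY: pick the line nearest the midpoint of the two word
    # lines, which roughly balances the two effective bit-line lengths.
    return (candidates.start + candidates.stop - 1) // 2
```

Under these assumptions, an existing partition at line 5 is reused for accesses to word lines 0 and 7, but accesses to word lines 6 and 9 force a new partition at line 7.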

For simultaneous (READ and/or WRITE) accesses to the same memory cells, no isolation control line is established and all isolation nodes on the bit lines are turned or remain ON. The system configuration is the same as with the classic hardwired multi-port memory system, except that the bit lines are shorter.

It is worth noting that the dynamically partitioned dual-port physical memory block can be used to form highly efficient memory systems containing more than two ports by employing multiplexers to channel the upper and lower ports of the same physical memory block to the multiple ports, a technique known in practice as port multiplexing.

The new circuit blocks are added within a memory system and do not require external controls. Therefore, a memory system integrating dynamic memory partitioning would appear the same in operation, except that it takes a smaller footprint on silicon, consumes less power, and incurs less bit line latency.

The function of an isolation node on a bit line includes separating long physical bit lines into two virtually and electronically separate sections, with one section electronically connected to the upper port and the other to the lower port to shorten effective bit lines physically connected to a port.

The function of an isolation control line is to turn ON or OFF its isolation nodes in order to partition a physical memory block into virtually and electronically separated sections.

The function of the dynamic memory partition control circuit is to compare access addresses and identify at least one isolation control line needed for dynamic memory partitioning.

The advantages of the embodiments of the present invention as compared with an explicitly hardwired multi-port memory system include:

• Smaller footprint on silicon, as dynamic memory partitioning eliminates the duplication of word and bit lines for multiple ports.

• Less bit line latency (on average), as effective bit lines are shorter.

• Less power consumption due to shorter effective bit lines.

• Other benefits due to elimination of dedicated word and bit lines for multiple ports.

The general method for dynamic memory partitioning, performed in real time in parallel with the normal address line setup and bit line pre-charge phases, comprises the steps of:

STEP 1: Initialize all of the isolation nodes of selected physical memory blocks by turning them ON.

STEP 2: Identify a best memory partition facilitating simultaneous accesses.

STEP 3: Turn the selected isolation nodes OFF to form a new partition and turn those isolation nodes that were turned OFF in forming an old partition but need to be ON in forming the new partition back ON.

STEP 4: Continue on with remaining memory READ and/or WRITE operation steps at the multiple ports.

Note that STEPS 1 to 3 are carried out in parallel with address setup and bit-line pre-charge phases of normal memory access operations, and STEP 4 is a summary description of remaining steps in normal memory access operations.

The specific method for dynamic memory partitioning of a physical memory block, performed in real time in parallel with the normal address line setup and bit line pre-charge phases, comprises the steps of:

STEP 1: Turn initially, by the at least one isolation control line, all isolation nodes of selected physical memory blocks ON so as to allow bit line sections on both sides of the isolation nodes to be electronically connected.

STEP 2: Compare simultaneous memory access addresses accessing different memory locations, by the dynamic memory partition control circuit, to identify the at least one isolation control line needed to form the best partition facilitating the simultaneous memory accesses.

STEP 3: Determine if simultaneous memory access addresses access the same memory locations.

STEP 4: Go to STEP 6 if answer to STEP 3 is yes, else go to STEP 5.

STEP 5: Turn OFF the isolation nodes connected to and controlled by the selected at least one isolation control line so as to allow bit line sections at the two sides of these isolation nodes to be electronically disconnected, and at the same time, turn back ON the isolation nodes that were turned OFF with the previous partition and need to be ON in the new partition, if different, so as to divide the physical memory block into two virtually isolated sections and thereby form a dynamically partitioned memory block, with an upper section containing one of the two desired memory locations to be accessed via the upper port and the lower section containing the other of the two desired memory locations to be accessed via the lower port.

STEP 6: Make ready the dynamically partitioned memory block for the remaining normal memory READ and/or WRITE operation steps.

STEP 7: Repeat STEPS 2 to 6 for all subsequent simultaneous memory accesses with a multi-port memory system.
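The decision flow of STEPS 2 to 6 can be sketched in software as follows. This is a minimal behavioral model under stated assumptions, not the dynamic memory partition control circuit itself. It assumes that word lines are numbered from the upper port downward, that each word line section holds `lines_per_section` consecutive word lines, and that isolation control line i sits between sections i and i+1. The source does not define how the "best" partition is chosen, so the sketch picks the middle candidate line as a stand-in. The names `select_isolation_line` and `partition_step` are hypothetical.

```python
from typing import List, Optional


def select_isolation_line(addr_a: int, addr_b: int,
                          lines_per_section: int) -> Optional[int]:
    """STEPS 2-3: pick the isolation control line to turn OFF, or None."""
    if addr_a == addr_b:
        return None  # same memory location: STEP 4 skips STEP 5
    sec_a, sec_b = sorted((addr_a // lines_per_section,
                           addr_b // lines_per_section))
    if sec_a == sec_b:
        # Assumption: no isolation node lies between two word lines of
        # the same section, so this sketch selects nothing.
        return None
    # Candidate lines are sec_a .. sec_b - 1; choosing the middle one
    # is an assumed heuristic, not the rule from the source.
    return (sec_a + sec_b - 1) // 2


def partition_step(node_on: List[bool], addr_a: int, addr_b: int,
                   lines_per_section: int) -> List[bool]:
    """STEPS 4-5: return the new ON/OFF state of each isolation line."""
    selected = select_isolation_line(addr_a, addr_b, lines_per_section)
    if selected is None:
        return list(node_on)           # STEP 5 skipped, state unchanged
    new_state = [True] * len(node_on)  # lines from the old partition go back ON
    new_state[selected] = False        # selected line isolates the two sections
    return new_state
```

With 16 word lines arranged as four sections of four, simultaneous accesses to word lines 0 and 15 fall in sections 0 and 3, and the sketch turns OFF isolation control line 1 so that one access is served from the upper port and the other from the lower port. Following STEP 4 literally, the sketch leaves the node states unchanged when both accesses target the same location.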

Note that for simultaneous memory access conflicts, such as READ and WRITE performed to the same memory locations at the same time, well-known protocols have been established to handle such conflicts. Dynamic memory partitioning does not change these protocols.

Other than the added physical circuit components, which are bounded within a memory system, the operation steps of dynamic memory partitioning do not require dedicated external instructions and do not generate new outcomes that require additional external processing other than the normal memory access operations.

Conclusions.

An area-efficient hardwired multi-port memory architecture is proposed that employs a dynamic partitioning technique using isolation control lines and isolation nodes. Compared with the classic hardwired dual-port memory architecture, it has been shown that the new area-efficient hardwired multi-port memory architecture with dynamic memory partitioning provides a compact design with no significant impact on the timing of memory access operations. It reduces the use of silicon area largely by eliminating duplicated bit lines.

Although the examples used illustrate the method with SRAM, the technology can be applied to DRAM as well.

It will be understood that each of the elements described above or two or more together may also find a useful application in other types of constructions differing from the types described above.

While the invention has been illustrated and described as embodied in dynamic partitioning for area-efficient multi-port memory, however, it is not limited to the details shown, since it will be understood that various omissions, modifications, substitutions, and changes in the forms and details of the device illustrated and its operation can be made by those skilled in the art without departing in any way from the spirit of the present invention. Without further analysis the foregoing will so fully reveal the gist of the present invention that others can by applying current knowledge readily adapt it for various applications without omitting features that from the standpoint of prior art fairly constitute characteristics of the generic or specific aspects of the invention.

Claims

The invention claimed is:

1. An improved multi-port memory system of the type having at least one memory cell, at least one bit line, at least one word line, at least one sense amplifier, at least one address decoder, at least one local data bus, at least one I/O circuit, and at least one bit-line pre-charge circuit, wherein said improvement comprises: a) at least one bit-line isolation node placed on bit lines between word line sections and separating long physical bit lines into two virtually and electronically separate sections, with one section electronically connected to an upper port and the other section electronically connected to a lower port to shorten effective bit lines physically connected to a port; b) at least one isolation control line, each of which controlling said bit-line isolation nodes on bit lines separating same word line sections and turning ON or OFF its isolation nodes in order to partition a physical memory block into virtually and electronically separated sections; and c) at least one dynamic memory partition control circuit having outputs feeding corresponding isolation control lines and comparing access addresses and identifying at least one isolation control line needed for dynamic memory partitioning; wherein the bit-line pre-charge circuits, the local data bus, and the sense amplifiers are duplicated at other end of said bit lines to form said upper port.

2. The system of claim 1, wherein said improvement further comprises each word line section comprising at least one consecutive word line.

3. The system of claim 1, wherein said isolation nodes along said same word line sections are controlled by same isolation control lines; and wherein said same isolation control lines are connected to said outputs of said at least one dynamic memory partition control circuit.

4. The system of claim 1, wherein separate local data bus and sense amplifiers are deployed for both ends of said bit lines residing in a same physical block; and wherein all isolation nodes are turned ON initially so as to allow all sections of said bit lines to be connected via said isolation nodes and each physical block is then dual-port capable.

5. The system of claim 1, wherein a dynamic partition is established so as to allow one access to be facilitated via said upper port and another access to be facilitated via said lower port for simultaneous READ and/or WRITE accesses to said physical memory block via said upper port and said lower port.

6. The system of claim 1, wherein a dynamic partitioning is carried out in parallel with address line setup and bit line pre-charge phases in normal memory access operation sequences to minimize impact to access latency.

7. The system of claim 1, wherein no isolation control line is established for simultaneous READ and/or WRITE accesses to same memory cells.

8. A method for dynamic memory partitioning performed in real-time in parallel with a normal address line setup and bit line pre-charge phases, comprising the steps of: a) initializing all isolation nodes of selected physical memory blocks by turning them ON; b) identifying a best memory partition facilitating simultaneous accesses; c) turning the selected isolation nodes OFF to form a new partition and turning those isolation nodes that were turned OFF in forming an old partition and are needed to be ON in order to form the new partition back ON; and d) continuing on with remaining memory READ and/or WRITE operation steps at multiple ports.

9. The method of claim 8, wherein said initializing step, said identifying step, and said turning step are carried out in parallel with address setup and bit-line pre-charge phases of normal memory access operations; and wherein said continuing step is a summary description of remaining steps in normal memory access operations.

10. A method for dynamic memory partitioning of a physical memory block performed in real-time in parallel with a normal address line setup and bit line pre-charge phases, comprising the steps of: a) turning initially, by at least one isolation control line, all selected isolation nodes ON so as to allow bit line sections on both sides of the isolation nodes to be electronically connected; b) comparing simultaneous memory access addresses accessing different memory locations, by a dynamic memory partition control circuit, to identify the at least one isolation control line needed to form a best partition facilitating simultaneous memory accesses; c) determining if simultaneous memory access addresses access same memory locations; d) going to step f) if answer to step c) is yes, else going to step e); e) turning OFF the isolation nodes connected and controlled by the selected at least one isolation control line so as to allow bit line sections at the two sides of these isolation nodes to be electronically disconnected, and at a same time, turning back ON the isolation nodes that were turned OFF with a previous partition and are needed to be ON in order to form the new partition if different so as to divide the physical memory block into two virtually isolated sections so as to form a dynamically partitioned memory block, with an upper section containing one of two desired memory locations to be accessed via an upper port and a lower section containing the other of the two desired memory locations to be accessed via a lower port; f) making ready the dynamically partitioned memory block for remaining normal memory READ and/or WRITE operation steps; and g) repeating steps b) to f) for all subsequent simultaneous memory accesses with a multi-port memory system.