MONITORING OF MEMORY AND EXTERNAL EVENTS
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims the benefit of U.S. Provisional Application Ser. No. 60/681,497, filed May 16, 2005, titled “Emulation/Debugging With Real-Time System Monitoring,” and U.S. Provisional Application Ser. No. 60/681,427, filed May 16, 2005, titled “Debugging Software-Controlled Cache Coherence,” both of which are incorporated by reference herein as if reproduced in full below.
The following applications contain subject matter related to the subject matter of this application, and are incorporated herein by reference:
Ser. Nos. 11/383,361 11/383,389 11/383,464 11/383,465
11/383,466 11/383,472 11/383,438 11/383,433
Integrated circuits are ubiquitous in society and can be found in a wide array of electronic products. Regardless of the type of electronic product, most consumers have come to expect greater functionality when each successive generation of electronic products is made available because successive generations of integrated circuits offer greater functionality such as faster memory or microprocessor speed. Moreover, successive generations of integrated circuits that are capable of offering greater functionality are often available relatively quickly. For example, Moore’s law, which is based on empirical observations, predicts that the speed of these integrated circuits doubles every eighteen months. As a result, integrated circuits with faster microprocessors and memory are often available for use in the latest electronic products every eighteen months.
Although successive generations of integrated circuits with greater functionality and features may be available every eighteen months, this does not mean that they can then be quickly incorporated into the latest electronic products. In fact, one major hurdle in bringing electronic products to market is ensuring that the integrated circuits, with their increased features and functionality, perform as desired. Generally speaking, ensuring that the integrated circuits will perform their intended functions when incorporated into an electronic product is called “debugging” the electronic product. Also, determining the performance, resource utilization, and execution of the integrated circuit is often referred to as “profiling”. Profiling is used to modify code execution on the integrated circuit so as to change the behavior of the integrated circuit as desired. The amount of time that debugging and profiling takes varies based on the complexity of the electronic product. One risk associated with the process of debugging and profiling is that it delays the product from being introduced into the market.
To prevent delaying the electronic product because of delay from debugging and profiling the integrated circuits, software-based simulators that model the behavior of the integrated circuit are often developed so that debugging and profiling can begin before the integrated circuit is actually available. While these simulators may have been adequate in debugging and profiling previous generations of integrated circuits, such simulators are increasingly unable to accurately model the intricacies of newer generations of integrated circuits. Further, attempting to develop a more complex simulator that copes with the intricacies of integrated circuits with cache memory takes time and is usually not an option because of the preferred short time-to-market of electronic products. Unfortunately, a simulator’s inability to effectively model integrated circuits results in the integrated circuits being employed in the electronic products without being debugged and profiled fully to make the integrated circuit behave as desired.
Disclosed herein is a system and method for monitoring various memory events and external events. In at least one embodiment, a system comprises a circuit configured to execute instructions and output event data corresponding to the execution of the instructions. The system also comprises a monitoring device coupled to the circuit. The monitoring device receives information about said event data. The event data comprises event data selected from a group consisting of memory events and external events.
In accordance with another embodiment, a method comprises executing instructions on a circuit, determining event data corresponding to the execution of the instructions, and a monitoring device receiving the event data. The event data comprises event data selected from a group consisting of memory events and external events.
In accordance with yet another embodiment, a circuit comprises a memory subsystem and logic that receives memory event data from the memory subsystem and receives external event data. The circuit also comprises an interface coupled to the logic. The memory event data and the external event data are provided by the interface to an external monitoring system coupled to the circuit. The memory event data provides information about memory events associated with said memory subsystem and the external event data provides information about external events associated with the circuit aside from the memory subsystem.
BRIEF DESCRIPTION OF THE DRAWINGS
For a detailed description of exemplary embodiments of the invention, reference will now be made to the accompanying drawings in which:
FIG. 1 depicts an exemplary debugging and profiling system in accordance with a preferred embodiment of the invention;
FIG. 2 depicts an embodiment of circuitry where code is being debugged and profiled using a trace;
FIG. 3 depicts a preferred embodiment of circuitry where code is being debugged and profiled using a trace;
FIG. 4 depicts an example of an implementation of an event encoder;
FIG. 5A depicts a preferred implementation of alignment blocks;
FIG. 5B depicts the operation of the alignment blocks;
FIG. 6 depicts a preferred implementation of either a priority encoder or a translator;
FIG. 7A depicts an implementation of any of the groups shown in FIG. 6 for prioritizing the input events;
FIG. 7B depicts an example of the operation of FIG. 7A; and
FIG. 7C depicts an example of the operation of FIG. 7A.
FIG. 1 depicts an exemplary debugging and profiling system 100 including a host computer 105 coupled to a target device 110 through a connection 115. A user may debug and profile the operation of the target device 110 by operating the host computer 105. The target device 110 may be debugged and profiled in order for the operation of the target device 110 to perform as desired (for example, in an optimal manner) with circuitry 145. To this end, the host computer 105 may include an input device 120, such as a keyboard or mouse, as well as an output device 125, such as a monitor or printer. Both the input device 120 and the output device 125 couple to a central processing unit 130 (CPU) that is capable of receiving commands from a user and executing software 135 accordingly. Software 135 interacts with the target 110 and may allow the debugging and profiling of applications that are being executed on the target 110.
Connection 115 couples the host computer 105 and the target device 110 and may be a wireless, hard-wired, or optical connection. Interfaces 140A and 140B may be used to interpret data from or communicate data to connection 115 respectively according to any suitable data communication method. Connection 150 provides outputs from the circuitry 145 to interface 140B. As such, software 135 on host computer 105 communicates instructions to be implemented by circuitry 145 through interfaces 140A and 140B across connection 115. The results of how circuitry 145 implements the instructions are output through connection 150 and communicated back to host computer 105. These results are analyzed on host computer 105 and the instructions are modified so as to debug and profile applications to be executed on target 110 by circuitry 145.
Connection 150 may be a wireless, hard-wired, or optical connection. In the case of a hard-wired connection, connection 150 is preferably implemented in accordance with any suitable protocol such as a Joint Test Action Group (JTAG) type of connection. Additionally, hard-wired connections may include a real time data exchange (RTDX) type of connection developed by Texas Instruments, Inc. Briefly put, RTDX gives system developers continuous real-time visibility into the applications that are being implemented on the circuitry 145 instead of having to force the application to stop, via a breakpoint, in order to see the details of the application implementation. Both the circuitry 145 and the interface 140B may include interfacing circuitry to facilitate the implementation of JTAG, RTDX, or other interfacing standards.
The target 110 preferably includes the circuitry 145 executing code that is actively being debugged and profiled. In some embodiments, the target 110 may be a test fixture that accommodates the circuitry 145 when code being executed by the circuitry 145 is being debugged and profiled. The debugging and profiling may be completed prior to widespread deployment of the circuitry 145. For example, if the circuitry 145 is eventually used in cell phones, then the executable code may be designed using the target 110.
The circuitry 145 may include a single integrated circuit or multiple integrated circuits that will be implemented as part of an electronic device. For example, the circuitry 145 may include multi-chip modules comprising multiple separate integrated circuits that are encapsulated within the same packaging. Regardless of whether the circuitry 145 is implemented as a single-chip or multiple-chip module, the circuitry 145 may eventually be incorporated into an electronic device such as a cellular telephone, a portable gaming console, network routing equipment, etc.
Debugging and profiling the executable firmware code on the target 110 using breakpoints to see the details of the code execution is an intrusive process and affects the operation and performance of the code being executed on circuitry 145. As such, a true understanding of the operation and performance of the code execution on circuitry 145 is not gained through the use of breakpoints.
FIG. 2 depicts an embodiment of circuitry 145 where code is being debugged and profiled using a trace on circuitry 145 to monitor events. Circuitry 145 includes a processor 200 which executes the code. Through the operation of the processor 200 many events 205 may occur that are significant for debugging and profiling the code being executed by the processor 200. The term “events” or “event data” is used broadly herein to describe any type of stall in which processor 200 is forced to wait before it can complete executing an instruction, such as a CPU stall or cache stall; any type of memory event, such as a read hit or read miss; and any other occurrences which may be useful for debugging and profiling the code being executed on circuitry 145. The internal trace memory 210 records the events 205 as event data and outputs the event data through connection 150 to computer 105. This enables a user of the computer 105 to see how the execution of the code is being implemented on circuitry 145.
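The relationship between events 205 and internal trace memory 210 can be illustrated with a minimal software sketch. Everything named here (`EventKind`, `Event`, `TraceMemory`, the `cycle` field, and the drop-when-full policy) is a hypothetical stand-in chosen for illustration and is not taken from the disclosed hardware.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import List


class EventKind(Enum):
    """Hypothetical event categories drawn from the examples in the text."""
    CPU_STALL = auto()
    CACHE_STALL = auto()
    READ_HIT = auto()
    READ_MISS = auto()
    EXTERNAL = auto()


@dataclass(frozen=True)
class Event:
    cycle: int       # cycle on which the event occurred (assumed field)
    kind: EventKind


class TraceMemory:
    """Fixed-capacity buffer standing in for internal trace memory 210."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.records: List[Event] = []
        self.dropped = 0  # events lost because the buffer was full

    def record(self, event: Event) -> bool:
        """Store an event; return False and count a drop if the buffer is full."""
        if len(self.records) >= self.capacity:
            self.dropped += 1
            return False
        self.records.append(event)
        return True

    def drain(self, max_events: int) -> List[Event]:
        """Remove up to max_events of the oldest records, as if sent over connection 150."""
        out = self.records[:max_events]
        self.records = self.records[max_events:]
        return out
```

In this sketch a full buffer silently drops new events and counts them; an actual trace unit might instead stall, overwrite the oldest entries, or signal an overflow.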
As successive generations of processors are developed with faster speeds, the number of events occurring on a processor such as processor 200 similarly increases; however, the bandwidth between computer 105 and circuitry 145 through connection 150 is limited. The amount of event data 205 recorded using a trace may exceed the bandwidth of connection 150. As such, for this solution to be implemented a trace may only be run for a very limited amount of time so as to not fill up internal trace memory 210. This situation is analogous to a sink that drains much less water than the faucet is putting into the sink. In order to prevent the sink from overflowing, the faucet may only be turned on for a limited amount of time. This solution of only running the trace for a very short time may not be preferable since it would give a very limited view of the execution of the code on circuitry 145. Alternatively, internal trace memory 210 may be very large so as to accommodate the large amount of event data. This may not be preferable either, since trace memory 210 would then take up a large area on circuitry 145 and consume more power.
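The sink analogy reduces to simple arithmetic: a buffer that fills faster than it drains overflows after its capacity divided by the net fill rate. The following sketch assumes constant production and drain rates and an initially empty buffer; the function name and the numbers in the usage note are illustrative only.

```python
def seconds_until_full(capacity_bytes: float,
                       produce_bytes_per_s: float,
                       drain_bytes_per_s: float) -> float:
    """Time until an initially empty trace buffer overflows.

    Returns float('inf') when the drain keeps up with event production,
    which corresponds to a sink that never overflows.
    """
    net_fill_rate = produce_bytes_per_s - drain_bytes_per_s
    if net_fill_rate <= 0:
        return float("inf")
    return capacity_bytes / net_fill_rate
```

For example, a 1000-byte buffer receiving 300 bytes per second and draining 100 bytes per second fills in 1000 / (300 - 100) = 5 seconds. Doubling the buffer only doubles that time, which is why reducing the volume of event data is the more attractive option.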
As such, intelligent ways of reducing the amount of event data without losing any or much information are desirable. FIG. 3 discloses another embodiment of circuitry 145 where code is being debugged and profiled using a trace on circuitry 145 to monitor events. Circuitry 145 includes a processor core 300 which executes the code. Processor 300 interacts with memory controller 320 in order to input data and instructions from various levels of a memory subsystem and output data manipulated according to the instructions. The memory subsystem may include an L1 cache memory 305, which may be divided into a program portion of L1 cache and a data portion of L1 cache; an L2 cache memory 310, which may be larger and slower than the L1 cache memory; and an external memory 315, which may be a random access memory (RAM), or any other suitable external storage.
The exemplary embodiment of FIG. 3 implements a writeback cache, and any write of data not already within the next lower level of cache after the L1 cache 305 is written to a write buffer 345. Once the data is written to write buffer 345, the core 300 continues processing other instructions while the write buffer 345 is emptied into the L2 cache 310, bypassing the L1 cache 305. Thus, in the embodiment of FIG. 3, core 300 only stalls on write misses to L1 cache 305 when write buffer 345 is full. Write buffer 345 fills up when the rate of writes to write buffer 345 exceeds the rate at which write buffer 345 is being drained. It should be noted that although the example of FIG. 3 shows a write buffer used in conjunction with the L1 cache, such write buffers may also be implemented at any level of a cached memory system, and all such implementations are intended to be within the scope of the present disclosure.
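The stall behavior of write buffer 345 can be approximated with a small cycle-level model. This is a sketch under stated assumptions, not the disclosed circuit: the core issues at most one instruction per cycle, every `True` entry is a write miss to L1 that needs a buffer slot, and the buffer drains one entry every `drain_interval` cycles while it is non-empty. The function name and parameters are hypothetical.

```python
from typing import Sequence, Tuple


def simulate_write_buffer(instructions: Sequence[bool],
                          depth: int,
                          drain_interval: int) -> Tuple[int, int]:
    """Return (stall_cycles, total_cycles) for a simple write-buffer model.

    instructions   : program order; True marks a write miss needing a buffer slot
    depth          : number of entries the write buffer can hold
    drain_interval : cycles needed to drain one entry toward the next cache level
    """
    occupancy = 0     # entries currently held in the buffer
    drain_timer = 0   # cycles spent draining the current entry
    stalls = 0
    cycles = 0
    pc = 0            # index of the next instruction to issue

    while pc < len(instructions):
        # Drain side: one entry leaves every drain_interval cycles while non-empty.
        if occupancy > 0:
            drain_timer += 1
            if drain_timer == drain_interval:
                occupancy -= 1
                drain_timer = 0

        # Core side: a write miss stalls only when the buffer is full.
        if instructions[pc] and occupancy == depth:
            stalls += 1
        else:
            if instructions[pc]:
                occupancy += 1
            pc += 1
        cycles += 1

    return stalls, cycles
```

With four back-to-back write misses, a two-entry buffer, and a drain interval of three cycles, this model reports three stall cycles over seven total cycles; with a drain interval of one cycle the buffer never fills and the core never stalls.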