WO2012069831A1 - Method and arrangement for a multi-core system - Google Patents


Info

Publication number
WO2012069831A1
WO2012069831A1
Authority
WO
WIPO (PCT)
Prior art keywords
shared memory
buffer
memory area
data
processor
Application number
PCT/GB2011/052303
Other languages
French (fr)
Inventor
Keith Athaide
Original Assignee
Tte Systems Ltd
Priority date
Priority claimed from GBGB1019895.0A external-priority patent/GB201019895D0/en
Priority claimed from GBGB1019890.1A external-priority patent/GB201019890D0/en
Application filed by Tte Systems Ltd filed Critical Tte Systems Ltd
Publication of WO2012069831A1 publication Critical patent/WO2012069831A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003Arrangements for executing specific machine instructions
    • G06F9/30076Arrangements for executing specific machine instructions to perform miscellaneous control operations, e.g. NOP
    • G06F9/3009Thread control instructions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/542Event management; Broadcasting; Multicasting; Notifications
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/543Local

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Debugging And Monitoring (AREA)
  • Multi Processors (AREA)

Abstract

A method of communicating data between a plurality of processors in a multi-core system via a plurality of buffers, each buffer comprising memory divided into a plurality of shared memory areas such that each buffer comprises a corresponding version of each of the plurality of shared memory areas. For all of the buffers in the system, for the corresponding version of each same shared memory area of each buffer, a maximum of one shared memory area can be written to at any one time, whilst for all of the buffers associated with each individual processor, for the corresponding version of each same shared memory area a maximum of one shared memory area can be read from at any one time. The method comprises identifying a first write operation scheduled to be performed by a first processor of the plurality of processors to a first shared memory area in at least a first buffer, said first buffer being coupled to said first processor, and in response identifying a second shared memory area corresponding to the first shared memory area in at least a second buffer, said second buffer being coupled to a second processor, wherein the second shared memory area is in a first state where it is available to receive data. The method further comprises writing data associated with the first write operation to the first shared memory area in at least the first buffer, and in response thereto changing the state associated with the second shared memory area of said second buffer to indicate the data in the second shared memory area is most recent data written to the second shared memory area, and writing said data to the second shared memory area of the second buffer, to thereby ensure each corresponding shared memory area contains the most recent version of data.

Description

METHOD AND ARRANGEMENT FOR A MULTI-CORE SYSTEM
TECHNICAL FIELD
The invention relates to a method and system for enabling data to be exchanged between processors in a multi-core system.
BACKGROUND
Microprocessors are typically associated with computing-based applications such as running software on devices like personal computers, smart-phones, games consoles and so on. In such devices, a microprocessor is usually arranged to run complex software enabling the device to perform many different functions.
However, microprocessors are also employed to control systems that are designed and built to perform more specific functions. These systems are sometimes referred to as "embedded" systems. As is known in the art, embedded systems are used to provide control for a very large range of applications. For example, an embedded system employing a simple microprocessor might be used for controlling a domestic appliance such as a washing machine or an oven. On the other hand, in more complex examples a more sophisticated embedded system might be used in an avionics system in an aircraft, or in a control system for a robotic arm used in a factory.
Some embedded systems employ a so-called "time-triggered" interrupt system to provide improved predictability and stability in the behaviour of the microprocessor.
A conventional microprocessor includes a number of interrupt lines. As is known in the art, these interrupts allow a task currently being executed on the processor to be interrupted by an external event. When designing such a system, it is difficult to say with complete certainty that the processor will always behave as intended because it is very difficult to anticipate the effect of every conceivable interrupt occurring at every conceivable point during every possible task that might be run on the microprocessor.
On the other hand, a system using a "time-triggered" technique is arranged such that the number of interrupt lines is reduced, typically to a single interrupt line. Moreover time-triggered systems are typically designed such that the point in time during task execution when an interrupt will occur is determined prior to the interrupt occurring. In some examples, time-triggered systems are designed such that interrupt timing is known only to the extent of when the next interrupt will occur. In other examples, more precise interrupt timing is known, for example the point in time at which every interrupt will occur is known.
In some examples of the time-triggered technique, such as a "time-triggered co-operative" technique, a system is designed such that every task performed by the processor always runs to completion. In other words, the system and software for the system are designed such that a task being performed on a processor is never interrupted by an external event. The time-triggered techniques described above can reduce to some extent the speed at which tasks are executed and also typically require specific time-triggered software to be written, specifically for use in the time-triggered system. However, employing such techniques can significantly ease the process of predicting a microprocessor's behaviour, and that of the system of which it is a part. This is because all that is necessary to model the behaviour of a time-triggered system is to separately model the execution of each task and the effect of the interrupts occurring at the known times. This alleviates the requirement to undertake the complex and time-consuming task of modelling the effect of interrupts occurring at non-predetermined times. Accordingly, time-triggered systems are particularly useful in safety-critical applications such as avionics, where a trade-off in speed is acceptable for an increase in predictability of the behaviour of the system. Another characteristic of time-triggered systems is that they have an inherent "single writer" restriction. In other words, as each task is run to completion, it can be assumed that for a given memory location, at any single point in time there will only ever be one task (i.e. "writer") writing to memory.
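The run-to-completion behaviour described above can be illustrated with a short simulation. The following Python sketch is purely illustrative and is not part of the disclosed system; the task names, periods and offsets are assumptions chosen for the example. The point it demonstrates is that each task runs to completion and the tick at which each task executes is fixed in advance by its period and offset, so the system's behaviour can be modelled one task at a time.

```python
# Minimal sketch of a time-triggered co-operative scheduler.
# Each task runs to completion; the only "interrupt" is the timer tick,
# and the tick at which each task runs is fixed by its period/offset.

def make_task(name, period, offset, log):
    def task():
        log.append(name)   # stands in for the real task body
    return {"period": period, "offset": offset, "run": task}

def run_schedule(tasks, num_ticks):
    for tick in range(num_ticks):          # one iteration per timer tick
        for t in tasks:
            if tick >= t["offset"] and (tick - t["offset"]) % t["period"] == 0:
                t["run"]()                 # runs to completion, never pre-empted

log = []
tasks = [make_task("sample", 2, 0, log),   # every 2nd tick, from tick 0
         make_task("control", 4, 1, log)]  # every 4th tick, from tick 1
run_schedule(tasks, 5)
print(log)  # -> ['sample', 'control', 'sample', 'sample']
```

Because the dispatch table alone determines when each task runs, predicting the system's behaviour reduces to modelling each task body in isolation.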
Returning to microprocessor systems in general, as the demand for more complex systems has grown, it is now common, particularly in computing applications, to employ microprocessor based systems which include more than one microprocessor core. In other words, rather than having a single processor executing all of the tasks required by the system, multiple cores are provided and the various tasks to be performed by the system are divided amongst the cores. This multi-core approach has also begun to be employed in embedded systems.
In a single-core processor, tasks frequently communicate by reading and writing to common memory locations. However, when executed on different cores of a distributed memory multi-core, the task design must change significantly to allow messages to be passed or to use some other mechanism to exchange data between tasks. In other words when designing a multi-core system, techniques have to be employed to enable the tasks operating on the different microprocessors to communicate data with each other in such a way that data integrity is maintained. This must be done even if the tasks are operating asynchronously (e.g. writing and reading data at different times) and running at different rates (e.g. writing and reading data at different rates). Thus, with an increase in the number of cores in a system, a mechanism is necessary for tasks on different cores to communicate. In particular it is desirable to take advantage of the characteristics of time-triggered systems to provide an improved technique for managing multi-core systems.
In the following, the core that is executing a task that produces data is termed a writer and the core executing a task that receives data is termed a reader. The reader and writer may be executing asynchronously, where the read and write tasks run at different rates, as shown schematically in Figure 1.
Since the periodicity of the applications is a part of the application design, it can be assumed that the writer will buffer data appropriately at the application level if it runs faster than the reader. For example, in Figure 1(a), the writer will only use one buffer while in Figure 1(b), the writer will create and use two buffers at the application level.
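The buffering relationship illustrated in Figure 1 can be expressed as a simple calculation. The following Python sketch is illustrative only; the ceiling rule is an assumption consistent with the two cases shown (one buffer when the rates match, two when the writer runs twice as fast as the reader).

```python
import math

def app_buffers_needed(writer_period, reader_period):
    """Application-level buffers the writer must provide so that no data
    is overwritten before the slower reader has consumed it.
    Illustrative rule: one buffer per write that can occur within a
    single read interval (at least one buffer in all cases)."""
    return max(1, math.ceil(reader_period / writer_period))

# Figure 1(a): writer and reader at the same rate -> one buffer
print(app_buffers_needed(10, 10))  # -> 1
# Figure 1(b): writer runs twice as fast as the reader -> two buffers
print(app_buffers_needed(5, 10))   # -> 2
```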
Figure 2 shows a schematic illustration of possible overlaps between a writer and reader (Kopetz et al. 1993). A single-core processor can control concurrent task execution through the option of utilising pre-emption, whereas a multi-core device will always exhibit concurrent task execution. The overlaps can be seen in Figure 2(a) and (b), where tasks occur within each other; Figure 2(c), where the writer is started while the reader is executing; and Figure 2(d), where the reader is started while the writer is executing. Such overlap is permissible but requires special measures to maintain data integrity.
SUMMARY OF THE INVENTION
In accordance with a first aspect of the present invention there is provided a method of communicating data between a plurality of processors in a multi-core system via a plurality of buffers, each buffer comprising memory divided into a plurality of shared memory areas such that each buffer comprises a corresponding version of each of the plurality of shared memory areas. For all of the buffers in the system, for the corresponding version of each same shared memory area of each buffer, a maximum of one shared memory area can be written to at any one time, whilst for all of the buffers associated with each individual processor, for the corresponding version of each same shared memory area a maximum of one shared memory area can be read from at any one time. The method comprises identifying a first write operation scheduled to be performed by a first processor of the plurality of processors to a first shared memory area in at least a first buffer, said first buffer being coupled to said first processor, and in response identifying a second shared memory area corresponding to the first shared memory area in at least a second buffer, said second buffer being coupled to a second processor, wherein the second shared memory area is in a first state where it is available to receive data. The method further comprises writing data associated with the first write operation to the first shared memory area in at least the first buffer, and in response thereto changing the state associated with the second shared memory area of said second buffer to indicate the data in the second shared memory area is most recent data written to the second shared memory area, and writing said data to the second shared memory area of the second buffer, to thereby ensure each corresponding shared memory area contains the most recent version of data.
In accordance with this aspect of the invention, a technique is provided which takes advantage of the characteristic of microprocessor based systems, such as "time-triggered" based systems, in which a "single-writer" condition is imposed such that at any one time, only a single task scheduled to be performed by the system will perform a write operation on any of the versions of a specific memory location held in various buffers within the system. It is recognised that by further imposing a "single-reader" condition such that at any one time, only a single task scheduled to be performed by the system will read from any of the versions of a specific memory location held in the various buffers connected to a specific processor, an advantageous technique for communicating data between tasks operating in a multi-core system can be provided. It should be noted that the "single-writer" condition is applied "globally" i.e. to all the buffers in the system. In other words, across the entire system only one version of a shared memory area is written to at any one time. On the other hand, the "single-reader" condition is applied locally. In other words, for the buffers in one buffer group attached to one processor only one version of a shared memory area is read from at any one time. However corresponding shared memory areas from buffers attached to other cores (i.e. from different buffer groups) may be read from during this time.
As explained above, multi-core systems must be arranged such that the communication of data between tasks run on individual processor cores does not cause conflict. In accordance with the first aspect of the present invention, the single-reader/single-writer constraint is imposed and each processor core is provided with a plurality of corresponding buffers in which a number of versions of the shared memory areas are provided which are updated when shared memory areas are written to by other cores. Further, data that is most recently written to a shared memory area in a particular buffer is flagged as such by virtue of a state associated with that memory area in that buffer. As a result, different tasks are able to run independently on different cores with a reduced risk of the different tasks writing data to shared memory areas that will cause a conflict. In some examples, this risk may be effectively entirely mitigated.
Thus, although a write and a read operation may be performed on the same shared memory area (i.e. different versions of that shared memory area) at the same time by different processors, this can be tolerated because for a given processor core, the shared memory area is replicated in a number of buffers (allowing simultaneous reading and writing).
Further as write operations by one core are replicated at other cores and the most recent (and thus correct) data for a particular memory area is identified, the chances of inter-task conflicts arising are reduced. In some examples, this risk may be effectively entirely mitigated.
Thus, with the use of hardware extensions, the ability for tasks to communicate via common memory locations is extended to tasks running concurrently on a distributed-memory multi-core. As a result it is possible to continue developing applications (i.e. software) as if targeting a single-core system.
In some embodiments the method further includes reading data from the second shared memory area of the second buffer by identifying a first read operation scheduled to be performed by the second processor from the second shared memory area; identifying that the second shared memory area of the second buffer is in the state indicating the data stored therein is most recent data; and performing the first read operation by reading the data from the second memory area of the second buffer.
In some embodiments, after identifying that the second shared memory area of the second buffer is in the state indicating the data stored therein is most recent data, the method includes changing the state of the second shared memory area of the second buffer to indicate the second shared memory area of the second buffer is in use by the second processor.
In some embodiments, identifying the first write operation comprises identifying the first write operation by a scheduler coupled to the first processor.
In some embodiments the plurality of buffers are clocked at a same rate as the plurality of processors.
In some embodiments the plurality of buffers are arranged into a plurality of buffer groups, each processor being coupled to buffers of one buffer group, and each buffer group comprises three buffers.
In some embodiments the read operations performed by the plurality of processors and write operations performed by the plurality of processors are performed at different rates.
In accordance with a second aspect of the invention, there is provided a multi-core system comprising a plurality of processors and a plurality of buffers, each buffer comprising memory divided into a plurality of shared memory areas such that each buffer comprises a corresponding version of each of the plurality of shared memory areas. The system is arranged such that for all of the buffers in the system, for the corresponding version of each same shared memory area of each buffer, a maximum of one shared memory area can be written to at any one time, whilst for all of the buffers associated with each individual processor, for the corresponding version of each same shared memory area a maximum of one shared memory area can be read from at any one time. The system further comprises a first scheduler coupled to a first processor of the plurality of processors arranged to identify a first write operation scheduled to be performed by the first processor of the plurality of processors to a first shared memory area in at least a first buffer, said first buffer being coupled to said first processor. The system also includes a communication controller coupled to a second processor arranged in response to the identification of the first write operation to identify a second shared memory area corresponding to the first shared memory area in at least a second buffer, said second buffer being coupled to the second processor, wherein the second shared memory area is in a first state where it is available to receive data.
The communication controller is also arranged, in response to a writing of data associated with the first write operation to the first shared memory area in at least the first buffer, to change the state associated with the second shared memory area of said second buffer to indicate the data in the second shared memory area is most recent data written to the second shared memory area, and to write said data to the second shared memory area of the second buffer, thereby ensuring each corresponding shared memory area contains the most recent version of data.
In one embodiment of the second aspect of the invention, the system is arranged so that the second processor is coupled to a second scheduler. The second scheduler is arranged to identify a first read operation scheduled to be performed by the second processor from the second shared memory area. In response the communication controller is arranged to identify that the second shared memory area of the second buffer is in the state indicating the data stored therein is most recent data. In response the second processor is arranged to perform the first read operation by reading the data from the second shared memory area of the second buffer.
In another embodiment of the second aspect of the invention, after identifying that the second shared memory area of the second buffer is in the state indicating the data stored therein is most recent data, the communication controller is arranged to change the state associated with the second shared memory area of the second buffer to indicate that the second shared memory area of the second buffer is in use by the second processor.
In another embodiment of the second aspect of the invention the plurality of buffers are clocked at a same rate as the plurality of processors.
In another embodiment of the second aspect of the invention the plurality of buffers are arranged into a plurality of buffer groups, each processor being coupled to buffers of one buffer group, and each buffer group comprises three buffers.
In another embodiment of the second aspect of the invention read operations performed by the plurality of processors and write operations performed by the plurality of processors are performed at different rates.
In accordance with a third aspect of the invention there is provided a scheduler for use in a system arranged in accordance with the second aspect of the invention.
In accordance with a fourth aspect of the invention there is provided a communication controller for use in a system arranged in accordance with the second aspect of the invention.
In accordance with a fifth aspect of the invention there is provided a product comprising a system arranged in accordance with the second aspect of the invention.
Various aspects and embodiments of the invention are defined in the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
Exemplary embodiments of the invention will now be described with reference to the accompanying drawings, in which:
Figure 1 provides a schematic diagram showing a prior art reader and writer operating synchronously in a microprocessor system;
Figure 2 provides a schematic diagram showing a prior art reader and writer operating asynchronously, i.e. such that read operations and write operations overlap;
Figure 3 provides a simplified schematic diagram of an embedded system in which examples of the present invention can be implemented;
Figure 4 provides a schematic diagram of a processing unit arranged in accordance with an example of the present invention;
Figure 5 provides a schematic diagram of a memory unit comprising three buffer memories arranged in accordance with an example of the present invention;
Figure 6 provides a state diagram illustrating a change in states of the shared memory areas in the buffer memories;
Figure 7 provides a schematic diagram of a first and second core module arranged in accordance with an example of the present invention;
Figure 8 provides a schematic diagram of a connection of a communication controller and a number of core modules arranged in accordance with an example of the present invention;
Figure 9 and Figure 10 provide a schematic diagram showing a change in state associated with shared memory areas of three buffers during the execution of a first and second task in accordance with an example of the present invention; and
Figure 11 provides a flow diagram of a process performed in accordance with an example of the present invention.
DETAILED DESCRIPTION
Figure 3 provides a simplified schematic diagram illustrating the basic arrangement of an example of an embedded system 11. The system includes an input/output (I/O) unit 12 for receiving input data into the system and for sending output data out of the system. The I/O unit 12 is connected via a system bus 13 to a processing unit 14. Schematically, the I/O unit 12 receives input from external sources, converts this to a suitable format and sends it to the processing unit 14 via the system bus 13. The processing unit 14 receives the input data and performs processing on it in accordance with a software program loaded on the processing unit 14. The processing unit 14 sends data back to the I/O unit 12 via the system bus as the processing is performed, which is then converted into system output.
In the example shown in Figure 3, the processing unit 14 is a multi-core device. Rather than comprising a single processor core, the processing unit 14 instead comprises a plurality of core modules 10. Each core module typically comprises a discrete microprocessor and associated components as is known in the art.
Processing performed by the processing unit 14 is distributed across the plurality of core modules 10. As is known in the art, by using a number of core modules rather than a single core module, data processing of a greater complexity can be undertaken and/or data processing can be performed more quickly. However, when employing a processing unit with multiple cores, additional considerations must be taken into account.
As is known in the art, in a single core system, when software being run on the system is converted into instructions for the processing core, a scheduling entity groups the instructions into a set of tasks. A single processing core can be constrained to execute one task at a time and therefore inter-task conflicts due to concurrently running tasks can be substantially reduced or mitigated.
However, in a multi-core system, tasks execute concurrently, by choice, on the various processing cores. In order to take full advantage of the benefits provided by distributed processing cores, tasks running concurrently on different cores are arranged to communicate data with each other. Specific pieces of data communicated between tasks are generally referred to as shared variables.
Special measures need to be taken to ensure that conflicts do not arise between concurrently running tasks when they communicate shared variables. Such conflicts can occur, for example, when a section of memory is concurrently being written to by a first task and read from by a second task. In a simple case the likelihood of conflicts can be reduced by buffering data to be written by the first task until the second task has read the data. However, accommodating concurrent write and read operations to the same memory area becomes considerably more complex when the read and write operations are performed at different rates. For example, if more than one write operation is performed during a single read operation, not only does the data to be written need to be buffered, but also what is deemed to be the most "recent" data must be carefully tracked.
It has been found that implementing the following design principles provides a particular advantage in multi-core based embedded systems:
Multi-Core Design Principles
• Data is communicated between cores via a plurality of shared memory areas (SMAs). However, it should be noted that in the examples described below, these SMAs do not exist as one single group of memory areas. Instead each core is attached to a plurality of buffer memories. Each buffer memory contains a version of the plurality of SMAs. Thus if a system includes n buffers, there would be a total of n versions of each SMA.

It should be noted that in the following, unless otherwise stated, reference to an "SMA" refers to a particular version of an SMA in a particular buffer.

• Shared variables are allocated to SMAs that possess globally unique identifiers.
• In some examples each core is attached to three buffer memories. Therefore, each core is connected to three versions of each SMA.
• Each SMA in each buffer memory may be in one of a number of states, including "in use by a local core", "in use by an external core" or marked as containing the "latest data". The SMAs cycle through these states depending on local or external state-switching instructions.
• Each core module maintains a list of SMAs which are relevant to the tasks that are to be run on that core.
• When a core writes to an SMA, a message tagged with an SMA identifier is broadcast over an arbitrary on-chip network to all other cores.
• When a core receives such a message, it processes the message only if the SMA identified therein is in the list of relevant SMAs, whereupon it writes the contents into the corresponding SMA of one of the three buffers which has been placed in an "in use by an external core" state by a separate switch message.
• When a core reads from an SMA, the data is read from the relevant SMA in one of the buffers that is in the "in use by the local core" state.
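The principles above can be exercised with a small software model. The following Python sketch is illustrative and not part of the disclosed hardware: the class layout, the message format and the particular rule used to cycle buffer states are assumptions introduced for the example; only the three states and the filtering on relevant SMA identifiers follow the description above.

```python
# Sketch of the three-buffer SMA mechanism described in the principles
# above. Each core holds three versions of every relevant SMA, each in
# one of the states: in use by the local core ("LOCAL"), in use by an
# external core ("EXTERNAL"), or holding the "LATEST" data.

LOCAL, EXTERNAL, LATEST = "LOCAL", "EXTERNAL", "LATEST"

class CoreBuffers:
    def __init__(self, relevant_smas):
        self.relevant = set(relevant_smas)   # SMAs this core's tasks use
        # one state triple and one value triple of buffers per SMA
        self.state = {s: [LOCAL, EXTERNAL, LATEST] for s in self.relevant}
        self.value = {s: [None, None, None] for s in self.relevant}

    def on_message(self, sma_id, data):
        """Handle a broadcast write: process only relevant SMA ids, and
        write into whichever buffer is in the EXTERNAL state."""
        if sma_id not in self.relevant:
            return
        i = self.state[sma_id].index(EXTERNAL)
        self.value[sma_id][i] = data
        # the freshly written buffer now holds the latest data; the old
        # LATEST buffer becomes available for the next external write
        j = self.state[sma_id].index(LATEST)
        self.state[sma_id][i], self.state[sma_id][j] = LATEST, EXTERNAL

    def read(self, sma_id):
        """A local read claims the buffer most recently marked LATEST
        for the local core, so incoming writes never collide with it."""
        i = self.state[sma_id].index(LATEST)
        j = self.state[sma_id].index(LOCAL)
        self.state[sma_id][i], self.state[sma_id][j] = LOCAL, LATEST
        return self.value[sma_id][i]

core_b = CoreBuffers({"speed"})
core_b.on_message("speed", 42)   # broadcast from the writing core
core_b.on_message("temp", 7)     # not in the relevant list: ignored
print(core_b.read("speed"))      # -> 42
```

Because writes always target the EXTERNAL buffer and reads always claim the LATEST buffer, a read and an incoming write on the same SMA never touch the same version, which is the property the three-buffer arrangement is intended to provide.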
Example implementation
Figure 4 provides an example implementation of a multi-core processing unit arranged in accordance with the design principles set out above.
As will be explained in further detail below, the multi-core processing unit shown in Figure 4 implements a hardware communication controller and scheduler that synchronises task execution such that the fact that the processing unit comprises multiple cores is transparent to the application software running on the processing unit. Moreover, to safeguard access overlaps, a hardware implementation of a three buffer single-writer, single-reader mechanism is used, with the entire data memory being buffered.
In this architecture, at any one time, corresponding SMAs in all the buffers have a single-"writer" condition imposed on them, and corresponding SMAs in each buffer group have a single-"reader" condition imposed on them.
Thus, as described above, the system is arranged such that for corresponding SMAs in any given buffer group (i.e. all the versions of an SMA coupled to a particular processor), a maximum of only one task will ever be reading from one version of an SMA. On the other hand, for all buffers across the whole system (i.e. all versions of an SMA in all of the buffers), a maximum of only one task will ever be writing to one version of an SMA at any given point in time.
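These two constraints can be stated as explicit checks over a set of simultaneous accesses. The following Python sketch is illustrative only; the access-log representation (tuples of core, SMA identifier and operation, all taken to occur at the same instant) is an assumption introduced for the example.

```python
# Sketch: checking the single-writer (global) and single-reader
# (per buffer group) constraints over a set of simultaneous accesses.
# Each access is (core, sma_id, op) where op is "read" or "write".

def constraints_hold(accesses):
    writers = {}   # sma_id -> set of cores writing (global constraint)
    readers = {}   # (core, sma_id) -> read count (local constraint)
    for core, sma, op in accesses:
        if op == "write":
            writers.setdefault(sma, set()).add(core)
        else:
            readers[(core, sma)] = readers.get((core, sma), 0) + 1
    global_ok = all(len(cores) <= 1 for cores in writers.values())
    local_ok = all(count <= 1 for count in readers.values())
    return global_ok and local_ok

# One core writing SMA 0 while two other cores each read their own
# local version of SMA 0 is permitted:
print(constraints_hold([(0, 0, "write"), (1, 0, "read"), (2, 0, "read")]))  # True
# Two cores writing the same SMA at once violates the global constraint:
print(constraints_hold([(0, 0, "write"), (1, 0, "write")]))                 # False
```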
The multi-core processing unit shown in Figure 4 includes a first core module 10a and a second core module 10b. The first and second core modules can be implemented as core modules in a processing unit of an embedded system, such as the processing unit 14 shown in the embedded system 11 of Figure 3.
As can be seen from Figure 4, the structure of the first core module 10a and the second core module 10b correspond. For the sake of brevity only the structure of the first core module 10a will be explained in detail. It will be understood that like parts from the first core module 10a correspond in function and in the nature of their interconnection with like parts of the second core module 10b.
Further, it will be understood that although the multi-core processing unit shown in Figure 4 comprises only first and second core modules, in some implementations, the multi-core processing unit may include more than two core modules.
As can be seen from Figure 4, the first core module 10a includes a core 1a. The core 1a is connected to a communication controller 2a. The communication controller 2a is connected to a scheduler 3a, which is also connected to the core 1a. The core 1a and communication controller 2a are connected to a buffer switch 4a. The buffer switch 4a is connected to three buffer memories 5a, 6a, 7a which together comprise a memory unit (or buffer group). The communication controllers 2a, 2b of each core module 10a, 10b are connected via a common data bus 8. The buffer memories 5a, 6a, 7a are connected to the communication controller 2a.
In summary, the software (i.e. the application) running on the system is divided into tasks which are organised by the schedulers 3a, 3b. Each scheduler 3a, 3b then sequentially sends these tasks for execution to the core 1a or core 1b, depending on which it is attached to. The core then performs the task, which typically involves writing to and/or reading from the buffer memories. Data is shared between cores by writing to and reading from shared memory areas (SMAs) which are replicated in each buffer 5a, 6a, 7a, 5b, 6b and 7b. When a core writes to or reads from an SMA, this is communicated to the other cores of the system on the common data bus 8 by the communication controller attached to that particular core. The buffer from which a core reads is determined by the position of the buffer switch 4a under the control of the communication controller.
The various elements of the core module are explained in more detail below:
Core
The core 1a is typically a central processing unit (CPU) that executes tasks sent to it from the scheduler 3a. The core 1a can read and write data to the buffer memories 5a, 6a, 7a via the buffer switch 4a under the control of the communication controller 2a.
Buffer Memories
As explained above, each buffer memory contains a version of each SMA. The buffer memories are referred to as "buffers" from this point. In the implementation shown in Figure 4, each buffer 5a, 6a, 7a comprises an area of memory which corresponds with the total shared memory in the system.
This is shown in more detail in Figure 5. As can be seen from Figure 5, each of the buffers 5a, 6a, 7a contains a section of memory which is substantially identically divided into a plurality of n SMAs. The size and location of each SMA is typically defined at compile time.
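By way of illustration only, the three-buffer group with an identical SMA layout in each buffer might be modelled in software as follows (the layout values and names here are hypothetical and are not part of the specification):

```python
# Illustrative model of one core's buffer group: three buffers, each
# holding an identical compile-time layout of n shared memory areas.

SMA_LAYOUT = [  # (identifier, offset, size) -- hypothetical values
    (0, 0x00, 16),
    (1, 0x10, 32),
    (2, 0x30, 8),
]

def make_buffer_group():
    """Return three buffers, each a bytearray covering all SMAs."""
    total = max(off + size for _, off, size in SMA_LAYOUT)
    return [bytearray(total) for _ in range(3)]

buffers = make_buffer_group()
assert len(buffers) == 3                      # one group = three buffers
assert all(len(b) == 0x38 for b in buffers)   # identical layouts
```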
As will be described below, the scheduler "switches" the buffers for the SMAs. The "switching" performed by the scheduler is a "switching" of a state associated with the SMA in the buffers used by a task before it executes (switching the state of the SMAs in the buffers is not to be confused with the function of the buffer switch which is explained in more detail below).
The switching performed by the scheduler includes the local buffers ("local switch") and the buffers in other cores ("external switch") sharing these memory areas.
As will be understood, a "local switch" refers to switching the state of SMAs of buffers attached to the same core as the scheduler. An "external switch" is switching the state of SMAs of buffers attached to a different core than the scheduler.
A switch also locks the buffers and so the buffers must be released by the scheduler when the task is finished. A "local switch" sets the local buffer to the latest written buffer; an "external switch" reserves a buffer that is not the latest and which is not being read. An "external switch" uses the last written buffer if a "local switch" has not occurred since the last external switch. This allows tasks working at different rates to function properly. A switch may also be performed locally only (read switch) if an SMA has multiple readers since multiple readers attempting external switches can disrupt each other. It will be understood that the term "buffer" in this context refers generally to the SMA in a particular buffer.
As shown in Figure 6, a buffer (or more specifically an SMA in a buffer) may be in one of several states: available (a), being used locally (l), being used externally (e) and being the last used externally (u), together with several guard conditions. Transitions between states occur when a switch operation occurs: specifically, an external switch (ES) and external release (ER); a local switch (LS) and local release (LR); and a guard indicating that a local switch has happened since the last external switch (LSSLE).
In some examples the transitions from the available state are the least preferred and transitions by a buffer from another state are then performed instead, if possible.
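The switch and release operations described above, together with the LSSLE guard, can be sketched as a simple software model (the state codes follow Figure 6; the class and method names are illustrative assumptions, not part of the specification):

```python
# Hedged sketch of the per-SMA buffer-state machine.
# States: 'a' available, 'l' in local use, 'e' in external use,
# 'u' most recent data (last used externally).

class SmaBufferStates:
    def __init__(self):
        self.state = ['a', 'a', 'a']  # one entry per buffer in the group
        self.ls_since_es = False      # the LSSLE guard

    def external_switch(self):
        # Reuse the last-written buffer if no local switch occurred
        # since the previous external switch; otherwise reserve an
        # available buffer (one that is not the latest and not read).
        if not self.ls_since_es and 'u' in self.state:
            i = self.state.index('u')
        else:
            i = self.state.index('a')
        self.state[i] = 'e'
        self.ls_since_es = False
        return i

    def external_release(self):
        # The externally written buffer now holds the most recent data.
        self.state[self.state.index('e')] = 'u'

    def local_switch(self):
        # Set the local read buffer to the latest written buffer.
        i = self.state.index('u') if 'u' in self.state else self.state.index('a')
        self.state[i] = 'l'
        self.ls_since_es = True
        return i

    def local_release(self):
        self.state[self.state.index('l')] = 'a'
```

In this sketch a write from another core (ES then ER) followed by a local read (LS) lands the reader on the buffer the writer just filled.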
As long as the application (i.e. the software running on the system) buffers data appropriately and the reading task executes (on another core) after the last execution of a write operation in a batch but before the first execution of a write operation in the next batch, then the reader can execute concurrently until the start of the next plus one batch without any data losses or any incoherence.
Multiple buffers can be toggled together by a single register write to the communication controller and there is no variability introduced by tasks using variable numbers of shared memory areas. This prevents the communication controller from increasing a task's release jitter.
Accordingly, during operation of the processing unit, each SMA in the buffers 5a, 6a, 7a, 5b, 6b, 7b may be in one of a number of states. These states can be labelled as follows:
• "IN LOCAL USE" (corresponding to "being used locally (l)" described above). When in this state, an SMA of a buffer is being used by the local core (i.e. the core to which the buffer is attached). No data originating from an external core can be written to an SMA of a buffer in this state.
• "IN EXTERNAL USE" (corresponding to "being used externally (e)" described above). When in this state, an SMA of a buffer is being used by an external core (i.e. a core from a different core module).
• "MOST RECENT DATA" (corresponding to "being the last used externally (u)" described above). In this state, the data in an SMA of a buffer is deemed the "most recent" data, and this will be the SMA which will be read next when a read instruction is issued for that set of corresponding SMAs.
• "AVAILABLE" (corresponding to "available (a)" described above). This is the initial state of a buffer and in this state, data originating from either a local or external core can be written to this buffer by the core without any further consideration.
As is explained in more detail below, the communication controller is arranged to monitor the states in which the SMAs of each of the three buffers are in at any given time. Particularly, the communication controller includes a buffer state register in which the state of the SMAs in each buffer is stored. The buffer state register is updated whenever the state of one of the SMAs in the buffers changes.
Scheduler
The scheduler component of the RTOS (real time operating system) associates SMAs with tasks, requests the controller to switch to the latest buffer for those areas when the task is about to execute and releases the area when the task is finished, i.e. the whole task is considered a critical section. (As is known in the art, a "critical section" is a section of code that accesses some shared data. It is critical since the shared data ideally must not be altered by any other process during execution).
An overview of a write on one core being propagated to another core can be seen schematically in Figure 7 and is explained in further detail below. Figure 7 provides a schematic diagram indicating a flow of data during a write process. A first core module 71 including a first communication controller 72, writes data via a physical link 73 (i.e. the bus) to a second core module 74 including a second communication controller 75. It will be understood that although not shown in Figure 7, the first and second core modules are structurally the same.
As is known in the art, schedulers such as the scheduler 3a are operable to arrange the processor instructions into tasks. The scheduler schedules which tasks a core is to perform and in what order. As explained above, the scheduler 3a is also responsible for associating tasks with particular SMAs. For example, the scheduler 3a identifies when a first task needs to share data with a second task. The scheduler then identifies which SMAs are allocated for shared variables of the first and second tasks and then associates the tasks with the SMAs accordingly.
So-called SMA descriptions, relating to the SMAs used by tasks being performed on the processing unit, are created upon request by the RTOS and are associated with identifiers decided at compile time; this is typically how SMA identifiers are created. When a description is created, the communication controller spends one or more cycles updating the lookup table that converts addresses to SMA identifiers.
In the example implementation shown in Figure 4, for clarity the first and second schedulers 3a, 3b are shown as two discrete units. However, it will be appreciated that multiple discrete schedulers may be part of the same system-wide entity.
Communication Controller
The communication controller is attached to the same bus as the data memory and the rest of the peripherals (i.e. peripherals of the system), allowing it to be directly controlled by software. It receives messages from other cores and writes them directly into data memory as shown schematically in Figure 8. Figure 8 provides a schematic diagram illustrating that in accordance with examples of the present invention a communication controller 81 of a first core module 82 is typically connected via a bus 83 to a plurality of other core modules 84 in the system.
The communication controller also monitors when a core writes data to an SMA in the buffers. As explained in more detail below, the communication controller is then arranged to communicate a message including an identifier associated with the SMA to which the data has been written, along with the data itself to other core modules in the processing unit.
Correspondingly, the communication controller is also arranged to receive messages from communication controllers in other core modules which indicate that external cores (i.e. cores from other core modules) have written data to an SMA. If this SMA is relevant to the local core, the communication controller is arranged to write this data to the corresponding SMAs in the local buffers.
As can be seen from Figure 4, the communication controllers 2a, 2b are connected via data bus 8.
The communication controllers are also connected to their respective cores 1a, 1b via other data buses. These data buses connecting each core to its communication controller are typically attached to other peripherals of the system, thereby allowing each communication controller 2a, 2b to be directly controlled by software during, for example, an initialisation stage. During this initialisation stage the relevant data in the registers described below can be set, indicating, for example, which SMAs are relevant to a particular core.
The communication controller maintains separate registers (the "description") for each SMA: a globally unique identifier, the address and size of the area, an indication of whether the SMA has been read since the last write, and the state of each buffer (latest data, being written, being read). In other words, the communication controller maintains a buffer state register storing, for each SMA, its globally unique identifier, its address and size, an indication of whether it has been read since the last write, and the state of the SMA in each buffer (corresponding to latest data, being written, being read).
The controller also maintains a lookup table that allows for half-cycle conversions from a memory address to an SMA identifier. "Half cycle conversion" means to fetch an SMA identifier from an address which can be done in half the time taken to execute an instruction. Thus, an instruction that initiates a lookup (by accessing memory) can use the lookup result during its execution.
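A software analogue of this address-to-identifier lookup might look like the following (the SMA addresses and sizes are hypothetical); in hardware the equivalent work completes in half a cycle:

```python
# Illustrative lookup table: maps an address to (SMA identifier, offset).
# In hardware this translation is done in half an instruction cycle.

SMAS = {
    # identifier: (base_address, size) -- hypothetical values
    7: (0x100, 64),
    9: (0x140, 32),
}

def lookup(addr):
    """Return (sma_id, offset_from_origin) if addr falls inside an SMA."""
    for sma_id, (base, size) in SMAS.items():
        if base <= addr < base + size:
            return sma_id, addr - base
    return None  # the address is not shared memory; no message is sent

assert lookup(0x142) == (9, 2)   # inside SMA 9, two bytes from its origin
assert lookup(0x090) is None     # not in any SMA
```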
Example Write and Read Operations
A write operation and a read operation will now be described with reference to Figure 4. For clarity, the core 1a of the first core module 10a is referred to as the "local" core, and the core 1b of the second core module 10b is referred to as the "external" core.
As explained in more detail below, write operations from a core are applied to all buffers attached to that core. This is because the half-cycle required to fetch the correct buffer number combined with the additional half-cycle to actually write the data might cause data hazards in the processor pipeline.
Again, as explained in more detail below, if the address being written to is part of an SMA, then after half a cycle when the address has yielded valid SMA information, a notification message is sent to all cores. The message contains the identifier of the SMA, the offset of the write address from the area's origin and the data that was written. This data is sufficient for the other cores to write the data into their own buffers at the proper location.
Since shared memory areas may be of different sizes even if associated with the same identifier, the communication controller typically ignores write requests from other cores that cross defined memory boundaries.
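A minimal sketch of the notification message and the receiving controller's boundary check might look as follows (the field widths, the 4-byte write size and all names are assumptions made for illustration):

```python
import struct

def encode_write_msg(sma_id, offset, data):
    # <H id> <H offset> <I data> -- hypothetical field widths
    return struct.pack('<HHI', sma_id, offset, data)

def apply_write_msg(msg, local_smas, buffer):
    """Apply a received write message to the local buffer, if relevant."""
    sma_id, offset, data = struct.unpack('<HHI', msg)
    if sma_id not in local_smas:
        return False                 # SMA not relevant to this core
    base, size = local_smas[sma_id]
    if offset + 4 > size:
        return False                 # crosses the local SMA boundary: ignore
    buffer[base + offset:base + offset + 4] = struct.pack('<I', data)
    return True

local = {7: (0x00, 16)}              # this core's view of SMA 7: 16 bytes
buf = bytearray(16)
assert apply_write_msg(encode_write_msg(7, 4, 0xDEADBEEF), local, buf)
assert not apply_write_msg(encode_write_msg(7, 14, 1), local, buf)  # 14+4 > 16
```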
Example Write Operation
Firstly, the scheduler 3a identifies that a task (Task 1) is to be executed on the local core 1a. The scheduler 3a then determines if Task 1 will result in the local core 1a writing to memory and if this write to memory is a write to an SMA.
If Task 1 requires a write operation to an SMA (referred to from this point forward as SMA 1), the scheduler sends a first SMA state switch message to the communication controller 2a. The SMA state switch message includes an identifier of SMA 1.
The communication controller 2a, upon receipt of the SMA state switch message, sends an "external switch" instruction to all other core modules on the data bus 8. The "external switch" instruction includes an identifier identifying SMA 1. As explained below, the "external switch" instruction causes a corresponding SMA in at least one buffer of the other core modules to be changed to the IN EXTERNAL USE state.
Once sent on the common data bus 8, the "external switch" instruction is received by the communication controller 2b of the second core module 10b. The communication controller 2b of the second core module then extracts the identifier of SMA 1 from the "external switch" instruction and identifies the current state of SMA 1 in the buffers 5b, 6b, 7b from the buffer state register. The communication controller 2b then identifies the first buffer in which SMA 1 is in the AVAILABLE state by referring to the buffer state register, and changes the state of SMA 1 in this buffer to the IN EXTERNAL USE state. This change of state is recorded in the buffer state register. This completes the process associated with the "external switch" instruction.
Returning to the local core 1a: once the external switch instruction has been sent by the communication controller 2a of the first core module 10a, the scheduler 3a of the first core module 10a instructs the local core 1a to execute Task 1.
Task 1 executes on the local core 1a, causing the local core 1a to write data to SMA 1. The local core 1a performs the write operation by writing the data identically to SMA 1 in all three of the buffers 5a, 6a, 7a.
As mentioned above, the communication controller 2a is arranged to monitor all local write operations performed by the local core 1a.
Accordingly, the communication controller 2a detects that a write operation has been performed by the local core 1a and specifically the memory address to which data has been written (i.e. the address of SMA 1).
The communication controller 2a then compares the memory address associated with the detected write operation (i.e. the address of SMA 1) with the SMAs listed in the buffer state register.
The communication controller 2a then determines that the address of the detected write operation corresponds to an SMA (i.e. SMA 1), and then transmits a write message on the data bus 8. The write message includes the data that has been written, the SMA identifier identifying the SMA to which data has been written (i.e. the identifier of SMA 1), and an address offset of the data from the start of SMA 1. The communication controller 2b of the second core module 10b receives the write message transmitted on the data bus 8 and extracts the SMA identifier of SMA 1.
The communication controller 2b, using the identifier of SMA 1, identifies the state of SMA 1 in each of the buffers 5b, 6b, 7b from the buffer state register. The communication controller 2b then identifies which of the buffers 5b, 6b, 7b contains the SMA in question (i.e. SMA 1) in the IN EXTERNAL USE state. The communication controller 2b then writes the data that was contained in the write message to SMA 1 in this buffer.
When the execution of Task 1 on the local core 1a has completed, the scheduler of the first core module 10a sends a second SMA state switch message to the communication controller 2a. The second SMA state switch message includes the identifier of the SMA modified by Task 1 (i.e. the identifier of SMA 1).
The second SMA state switch message causes the communication controller 2a to transmit an "external release" instruction on the bus 8 which includes the identifier of the SMA modified by Task 1 (i.e. the identifier of SMA 1).
The "external release" instruction is received by the communication controller 2b of the second core module 10b. On receipt of this message, the communication controller 2b extracts the SMA identifier (i.e. the identifier of SMA 1) and retrieves the state of the corresponding SMAs from the buffer state register (i.e. the state of SMA 1 in each of the buffers 5b, 6b, 7b). The communication controller then identifies which one of the buffers 5b, 6b, 7b contains the SMA 1 that was previously changed to the IN EXTERNAL USE state, and changes it to the MOST RECENT DATA state.
Note that the write process described above assumes only one SMA is modified by Task 1. However, as will be understood, in some examples, the task being executed may result in writes to multiple SMAs.
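The write-side sequence above — external switch before the task runs, broadcast writes while it executes, external release when it completes — can be sketched as follows (the "bus" is modelled as a simple list; all names are illustrative):

```python
# Hedged sketch of the bus traffic generated by a writing task.

def run_writing_task(bus, sma_id, writes):
    bus.append(('EXTERNAL_SWITCH', sma_id))          # before Task 1 runs
    for offset, data in writes:                      # each shared write is
        bus.append(('WRITE', sma_id, offset, data))  # broadcast as it happens
    bus.append(('EXTERNAL_RELEASE', sma_id))         # after Task 1 completes

bus = []
run_writing_task(bus, 1, [(0, 0x11), (4, 0x22)])
assert bus[0] == ('EXTERNAL_SWITCH', 1)
assert bus[-1] == ('EXTERNAL_RELEASE', 1)
```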
Example Read Operation
The read operation described below is a read operation performed by the external core, i.e. core 1b shown in Figure 4.
Firstly, the scheduler 3b of the second core module 10b identifies that a task (Task 2) is to be executed on the external core 1b. The scheduler 3b then determines if Task 2 will result in the external core 1b reading from memory and if this read from memory is a read from an SMA.
If the scheduler 3b of the second core module 10b identifies that Task 2 requires a read operation from an SMA (referred to from this point forward as SMA 2), the scheduler 3b sends a first SMA state switch message to the communication controller 2b. The first SMA state switch message includes an SMA identifier identifying SMA 2.
The communication controller 2b then uses the SMA identifier of SMA 2 to identify the state of SMA 2 in the three buffers 5b, 6b, 7b from the buffer state register.
When a buffer is identified in which SMA 2 is in the MOST RECENT DATA state, the communication controller changes SMA 2 in this buffer to the IN LOCAL USE state.
If there is no buffer with SMA 2 in the MOST RECENT DATA state, then the first buffer with SMA 2 in the AVAILABLE state is changed to the IN LOCAL USE state.
Task 2 is then run on the external core 1b. As a result, Task 2 requests the external core 1b to perform a read operation by reading from SMA 2. As mentioned above, the communication controller 2b is arranged to monitor all local read operations performed by the core to which it is attached. Thus the communication controller 2b detects that a read operation is to be performed by the external core 1b and specifically the memory address (i.e. the address corresponding to SMA 2) from which data is to be read.
On detection of the read operation to be performed by the external core 1b, the communication controller 2b refers to the buffer state register to determine the state of SMA 2 in the three buffers.
The communication controller 2b then controls the buffer switch 4b to ensure that the core 1b reads the data in SMA 2 from the buffer in which SMA 2 is in the IN LOCAL USE state.
When the execution of Task 2 on the external core 1b has completed, the scheduler 3b sends a second SMA state switch message to the communication controller 2b. The second SMA state switch message includes an SMA identifier identifying SMA 2 (i.e. the SMA which Task 2 requested the external core 1b to read).
On receipt of the second SMA state switch message, the communication controller 2b extracts the SMA identifier of SMA 2 and retrieves the state of SMA 2 in each buffer from the buffer state register.
The communication controller 2b then performs a "local switch" instruction by identifying in which of the buffers SMA 2 is in the IN LOCAL USE state and changing this to the AVAILABLE state.
Note that the read process described above assumes only one SMA is read by Task 2. However, as will be understood, in some examples, the task being executed may result in reads from multiple SMAs.
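The read-side state handling above — selecting the MOST RECENT DATA buffer, marking it IN LOCAL USE, then releasing it back to AVAILABLE — can be sketched as follows (state codes follow Figure 6; the function names are illustrative):

```python
# Hedged sketch of the local switch/release used by a reading task.
MOST_RECENT, AVAILABLE, LOCAL = 'u', 'a', 'l'

def local_switch(states):
    """Pick the buffer the reader will use and mark it IN LOCAL USE."""
    i = states.index(MOST_RECENT) if MOST_RECENT in states else states.index(AVAILABLE)
    states[i] = LOCAL
    return i                      # the buffer switch is set to buffer i

def local_release(states):
    """When the task finishes, return the read buffer to AVAILABLE."""
    states[states.index(LOCAL)] = AVAILABLE

states = ['a', 'u', 'a']          # buffer 1 holds the most recent data
assert local_switch(states) == 1  # the reader is routed to buffer 1
local_release(states)
assert states == ['a', 'a', 'a']
```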
Clocking Rate
In the example read and write operations discussed above, the buffer memories are clocked at the same rate as the core, with no caches. As a result, after the core places an address on the bus, valid data is expected in the next clock cycle. Translating from a memory address to a shared memory identifier (to fetch the number of the buffer with the latest data) takes half a cycle; so all buffers fetch data concurrently from the same address and the data is multiplexed once the right buffer is known.
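The concurrent-fetch-then-multiplex behaviour might be modelled as follows (illustrative only; in hardware the three fetches occur in parallel during the same cycle):

```python
# Hedged sketch of the concurrent fetch and late multiplex: all three
# buffers are read at the same address, and the correct value is
# selected once the buffer number is known (after the half-cycle lookup).

def read_with_mux(buffers, addr, selected_buffer):
    candidates = [b[addr] for b in buffers]  # concurrent fetch in hardware
    return candidates[selected_buffer]       # multiplex on the known buffer

bufs = [bytearray([1, 2]), bytearray([3, 4]), bytearray([5, 6])]
assert read_with_mux(bufs, 1, 2) == 6        # address 1 of buffer 2
```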
Communication Buffers and Application Buffers
It should be noted that the buffer memories described above are considered to be "communication" buffers as opposed to "application" buffers. At the software level (e.g. the application level) there may be several application buffers in one communication buffer, each pertaining to different tasks. In the context of the present invention, the application buffers from one task typically form one or more shared memory areas (SMAs) in the three communication buffers.
Switching Behaviour
The switching behaviour is examined in more detail in Figure 9 and Figure 10 where the state of the buffers is shown from the point of view of a Task B. The condition of a local switch having happened since the last external switch is also shown. The start times of a Task A are shown as the time when the external switch request reaches the hardware of the core on which Task B executes, and likewise Task A ends when the external release request is recognised.
In other words, Figures 9 and 10 show how the states of a particular SMA in each of the buffers changes during operation due to the execution of Task A on a first core and Task B on a second core. Figures 9 and 10 show this from the "point of view" of Task B in the sense that the state of the SMA of the buffers connected to the core performing Task B are shown.
Figure 9 schematically illustrates a buffer that switches from the view of Task B when it overlaps with a Task A running at the same rate with a combined utilisation less than one.
In Figure 9, both tasks run at the same rate and overlaps from Figure 2 are chosen. The tasks in (a), (b) and (c) can all be scheduled on single processors; (b) is the sort of timeline that can occur on a single processor. It is interesting to note that (b) only ever uses one buffer, and only (d) uses all three buffers. If the precedence constraint of Task B needing to run after Task A were added to (d) and (e), then they would resemble (c) and would also use only two buffers. A great disadvantage in this system is that initial data is lost if the tasks are given non-zero offsets, due to the extra condition imposed by LSSLE (local switch since last external switch). This behaviour is clearer in Figure 10.
In Figure 10, Task A runs at twice the rate of Task B, with the first execution of Task B taking place after two executions of Task A, so that data is valid. As before, various combinations are taken: either Task B runs before the next execution of Task A or not, either Task B starts before the next plus one execution of Task A or not, either Task B finishes before the next execution of Task A or not and either Task B finishes before the next plus one execution of Task A or not. In all cases, executions of Task A which have not seen an execution of Task B after a prior execution of Task A cause no switches.
Figures 10 (a) to (d) illustrate in schematic form a buffer that switches from the view of task B when it overlaps with a Task A running at twice the rate.
In particular, Figure 10 (a) is another example of a single-processor type system and accordingly one buffer is sufficient. Figures 10 (b), (c) and (d) exhibit the long-task problem. However, depending on the data structure used by the application buffers, data losses may occur in (c), (d) and (e) and may be sustained; (c), (d) and (e) could avoid data losses with proper scheduling, but will recover in the next tick (not shown). The example in Figure 10 can be expanded to higher frequency rate mismatches as well. As long as the application buffers data appropriately and the reading task executes (on another core) after the last execution of a writer in a batch but before the first execution of a writer in the next batch, then the reader can execute concurrently until the start of the next plus one batch without any data losses or any incoherence.
Figure 11 provides a schematic diagram of a flow chart illustrating a process according to an example of the invention. The process can be implemented on a multi-core system comprising a plurality of processors and a plurality of buffers, each buffer comprising memory divided into a plurality of shared memory areas such that each buffer comprises a corresponding version of each of the plurality of shared memory areas. For all of the buffers in the system, for the corresponding version of each same shared memory area of each buffer, a maximum of one shared memory area can be written to at any one time, whilst for all of the buffers associated with each individual processor, for the corresponding version of each same shared memory area, a maximum of one shared memory area can be read from at any one time.
Step S101 comprises identifying a first write operation scheduled to be performed by a first processor of the plurality of processors to a first shared memory area in at least a first buffer, said first buffer being coupled to said first processor.
In response, Step S102 comprises identifying a second shared memory area corresponding to the first shared memory area in at least a second buffer, said second buffer being coupled to a second processor, wherein the second shared memory area is in a first state where it is available to receive data.
Step S103 comprises writing data associated with the first write operation to the first shared memory area in at least the first buffer.
In response thereto, Step S104 comprises changing the state associated with the second shared memory area of said second buffer to indicate that the data in the second shared memory area is the most recent data written to the second shared memory area.
Step S105 comprises writing said data to the second shared memory area of the second buffer, to thereby ensure each corresponding shared memory area contains the most recent version of data.
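The steps of the flow chart can be sketched end-to-end for two cores sharing one SMA (a hedged software model; the three-buffer choice, names and byte-wise write are illustrative, not a definitive implementation):

```python
# Hedged sketch of the flow-chart process for one shared memory area:
# the writer writes to all of its own buffers, and the data is copied
# into an available reader-side buffer, which is then marked as holding
# the most recent data.

def propagate_write(writer_buffers, reader_buffers, reader_states, offset, value):
    # Write identified and performed: the writer writes identically
    # to all of its own buffers.
    for b in writer_buffers:
        b[offset] = value
    # A reader-side buffer in the AVAILABLE ('a') state receives the data.
    i = reader_states.index('a')
    reader_buffers[i][offset] = value     # data written across
    reader_states[i] = 'u'                # marked as most recent data
    return i

wb = [bytearray(8) for _ in range(3)]     # writer's buffer group
rb = [bytearray(8) for _ in range(3)]     # reader's buffer group
rs = ['a', 'a', 'a']                      # reader-side SMA states
i = propagate_write(wb, rb, rs, 3, 0x5A)
assert rb[i][3] == 0x5A and rs[i] == 'u'
```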
The skilled person will envisage further embodiments within the scope of the appended claims.

Claims

1. A method of communicating data between a plurality of processors in a multi-core system via a plurality of buffers, each buffer comprising memory divided into a plurality of shared memory areas such that each buffer comprises a corresponding version of each of the plurality of shared memory areas, and
for all of the buffers in the system, for the corresponding version of each same shared memory area of each buffer, a maximum of one shared memory area can be written to at any one time, whilst
for all of the buffers associated with each individual processor, for the corresponding version of each same shared memory area a maximum of one shared memory area can be read from at any one time; said method comprising:
identifying a first write operation scheduled to be performed by a first processor of the plurality of processors to a first shared memory area in at least a first buffer, said first buffer being coupled to said first processor, and in response
identifying a second shared memory area corresponding to the first shared memory area in at least a second buffer, said second buffer being coupled to a second processor, wherein the second shared memory area is in a first state where it is available to receive data;
writing data associated with the first write operation to the first shared memory area in at least the first buffer, and in response thereto changing the state associated with second shared memory area of said second buffer to indicate the data in the second shared memory area is most recent data written to the second shared memory area, and
writing said data to the second shared memory area of the second buffer, to thereby ensure each corresponding shared memory area contains the most recent version of data.
2. A method according to claim 1, further comprising reading data from the second shared memory area of the second buffer by
identifying a first read operation scheduled to be performed by the second processor from the second shared memory area;
identifying that the second shared memory area of the second buffer is in the state indicating the data stored therein is most recent data; and
performing the first read operation by reading the data from the second memory area of the second buffer.
3. A method according to claim 2, wherein after identifying that the second shared memory area of the second buffer is in the state indicating the data stored therein is most recent data, changing the state of the second shared memory area of the second buffer to indicate the second shared memory area of the second buffer is in use by the second processor.
4. A method according to any previous claim, wherein identifying the first write operation comprises identifying the first write operation by a scheduler coupled to the first processor.
5. A method according to any previous claim wherein the plurality of buffers are clocked at a same rate as the plurality of processors.
6. A method according to any previous claim, wherein the plurality of buffers are arranged into a plurality of buffer groups, each processor being coupled to buffers of one buffer group, and each buffer group comprises three buffers.
7. A method according to any previous claim, wherein read operations performed by the plurality of processors and write operations performed by the plurality of processors are performed at different rates.
8. A multi-core system comprising a plurality of processors and a plurality of buffers, each buffer comprising memory divided into a plurality of shared memory areas such that each buffer comprises a corresponding version of each of the plurality of shared memory areas, the system being arranged such that for all of the buffers in the system, for the corresponding version of each same shared memory area of each buffer, a maximum of one shared memory area can be written to at any one time, whilst for all of the buffers associated with each individual processor, for the corresponding version of each same shared memory area a maximum of one shared memory area can be read from at any one time; said system further comprising:
a first scheduler coupled to a first processor of the plurality of processors arranged to identify a first write operation scheduled to be performed by the first processor of the plurality of processors to a first shared memory area in at least a first buffer, said first buffer being coupled to said first processor,
a communication controller coupled to a second processor arranged in response to the identification of the first write operation to identify a second shared memory area corresponding to the first shared memory area in at least a second buffer, said second buffer being coupled to the second processor, wherein the second shared memory area is in a first state where it is available to receive data; wherein
said communication controller is arranged, in response to a writing of data associated with the first write operation to the first shared memory area in at least the first buffer, to change the state associated with the second shared memory area of said second buffer to indicate the data in the second shared memory area is most recent data written to the second shared memory area, and to write said data to the second shared memory area of the second buffer, thereby ensuring each corresponding shared memory area contains the most recent version of data.
9. A system according to claim 8, wherein the second processor is coupled to a second scheduler, said second scheduler arranged to identify a first read operation scheduled to be performed by the second processor from the second shared memory area; and in response the communication controller is arranged to identify that the second shared memory area of the second buffer is in the state indicating the data stored therein is most recent data; and in response the second processor is arranged to perform the first read operation by reading the data from the second shared memory area of the second buffer.
10. A system according to claim 9, wherein after identifying that the second shared memory area of the second buffer is in the state indicating the data stored therein is most recent data, the communication controller is arranged to change the state associated with the second shared memory area of the second buffer to indicate that the second shared memory area of the second buffer is in use by the second processor.
11. A system according to any of claims 8 to 10, wherein the plurality of buffers are clocked at a same rate as the plurality of processors.
12. A system according to any of claims 8 to 11, wherein the plurality of buffers are arranged into a plurality of buffer groups, each processor being coupled to buffers of one buffer group, and each buffer group comprises three buffers.
13. A system according to any of claims 8 to 12, wherein read operations performed by the plurality of processors and write operations performed by the plurality of processors are performed at different rates.
14. A scheduler for use in a system of the type defined in any of claims 8 to 13.
15. A communication controller for use in a system of the type defined in any of claims 8 to 13.
16. A product including a system of the type defined in any of claims 8 to 13.
17. A method, system, scheduler, communication controller or product as generally hereinbefore described with reference to and/or illustrated in Figures 3 to 11 of the accompanying drawings.
PCT/GB2011/052303 2010-11-24 2011-11-23 Method and arrangement for a multi-core system WO2012069831A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
GB1019895.0 2010-11-24
GB1019890.1 2010-11-24
GBGB1019895.0A GB201019895D0 (en) 2010-11-24 2010-11-24 Identifying the end of a task efficiently
GBGB1019890.1A GB201019890D0 (en) 2010-11-24 2010-11-24 Asynchronous and transparent three-buffer communication framework for distributed memory multi-cores

Publications (1)

Publication Number Publication Date
WO2012069831A1 true WO2012069831A1 (en) 2012-05-31

Family

ID=45478354

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/GB2011/052302 WO2012069830A1 (en) 2010-11-24 2011-11-23 A method and system for identifying the end of a task and for notifying a hardware scheduler thereof
PCT/GB2011/052303 WO2012069831A1 (en) 2010-11-24 2011-11-23 Method and arrangement for a multi-core system

Family Applications Before (1)

Application Number Title Priority Date Filing Date
PCT/GB2011/052302 WO2012069830A1 (en) 2010-11-24 2011-11-23 A method and system for identifying the end of a task and for notifying a hardware scheduler thereof

Country Status (1)

Country Link
WO (2) WO2012069830A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111984384B (en) * 2020-08-24 2024-01-05 北京思特奇信息技术股份有限公司 Daemon and timing type job coexistence scheduling mechanism method and related device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6212542B1 (en) * 1996-12-16 2001-04-03 International Business Machines Corporation Method and system for executing a program within a multiscalar processor by processing linked thread descriptors
FR2920557A1 (en) * 2007-12-21 2009-03-06 Thomson Licensing Sas Processor for CPU, has hardware sequencer managing running of tasks and providing instruction for giving control to sequencer at end of tasks, where instruction sets program with base address relative to next task to program counter

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030084269A1 (en) * 2001-06-12 2003-05-01 Drysdale Tracy Garrett Method and apparatus for communicating between processing entities in a multi-processor
EP1956484A1 (en) * 2007-02-07 2008-08-13 Robert Bosch Gmbh Administration module, producer and consumer processor, arrangement thereof and method for inter-processor communication via a shared memory

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHEN J ET AL: "A Three-Slot Asynchronous Reader/Writer Mechanism for Multiprocessor Real-Time Systems", INTERNET CITATION, 16 January 1997 (1997-01-16), XP002448052, Retrieved from the Internet <URL:http://citeseer.ist.psu.edu/cache/papers/cs/1425/ftp:zSzzSzftp.cs.york.ac.ukzSzreportszSzYCS-97-286.pdf/a-three-slot-asynchronous.pdf> [retrieved on 20070101] *
HYEONJOONG CHO ET AL: "A Space-Optimal Wait-Free Real-Time Synchronization Protocol", REAL-TIME SYSTEMS, 2005. (ECRTS 2005). PROCEEDINGS. 17TH EUROMICRO CONFERENCE ON PALMA DE MALLORCA, BALEARIC ISLANDS, SPAIN 06-08 JULY 2005, PISCATAWAY, NJ, USA, IEEE, 6 July 2005 (2005-07-06), pages 79 - 88, XP010835768, ISBN: 978-0-7695-2400-9 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111432899A (en) * 2017-09-19 2020-07-17 BAE Systems Controls Inc. System and method for managing multi-core access to shared ports
US11397560B2 (en) 2017-09-19 2022-07-26 Bae Systems Controls Inc. System and method for managing multi-core accesses to shared ports
CN108829631A (en) * 2018-04-27 2018-11-16 江苏华存电子科技有限公司 An information management method for improving a multi-core processor
CN111796948A (en) * 2020-07-02 2020-10-20 长视科技股份有限公司 Shared memory access method and device, computer equipment and storage medium
CN111796948B (en) * 2020-07-02 2021-11-26 长视科技股份有限公司 Shared memory access method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
WO2012069830A1 (en) 2012-05-31

Similar Documents

Publication Publication Date Title
US9864627B2 (en) Power saving operating system for virtual environment
CN100524223C (en) Method and system for concurrent handler execution in a SMI and PMI-based dispatch-execution framework
US6944850B2 (en) Hop method for stepping parallel hardware threads
EP1856612B1 (en) Method for counting instructions for logging and replay of a deterministic sequence of events
EP1842132B1 (en) Method for optimising the logging and replay of multi-task applications in a mono-processor or multi-processor computer system
JP5295228B2 (en) System including a plurality of processors and method of operating the same
CN109997113B (en) Method and device for data processing
EP1927049A1 (en) Real-time threading service for partitioned multiprocessor systems
CN102375761A (en) Business management method, device and equipment
JP2000029737A (en) Processor having real-time outer instruction insertion for debugging functions
US7565659B2 (en) Light weight context switching
WO2012069831A1 (en) Method and arrangement for a multi-core system
JPH1021094A (en) Real-time control system
US9652299B2 (en) Controlling the state of a process between a running and a stopped state by comparing identification information sent prior to execution
US8732441B2 (en) Multiprocessing system
US20030014558A1 (en) Batch interrupts handling device, virtual shared memory and multiple concurrent processing device
US20080077925A1 (en) Fault Tolerant System for Execution of Parallel Jobs
Vaas et al. Taming Non-Deterministic Low-Level I/O: Predictable Multi-Core Real-Time Systems by SoC Co-Design
Walls Embedded RTOS Design: Insights and Implementation
CN112214277A (en) Operating system partitioning method, device and medium based on virtual machine
US20160034291A1 (en) System on a chip and method for a controller supported virtual machine monitor
Bulusu Asymmetric multiprocessing real time operating system on multicore platforms
KR20190118521A (en) Method and device for error handling in a communication between distributed software components
CN102073551B (en) Self-reset microprocessor and method thereof
JP2018049406A (en) Task coordination device among plural processors

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11808271

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11808271

Country of ref document: EP

Kind code of ref document: A1