Multicore embedded systems present a class of design problems that resist purely analytical solution. When multiple processors share memory, compete for bus bandwidth, and coordinate through complex inter-processor communication schemes, the aggregate behavior that emerges from those interactions is not predictable from component-level specifications alone. Simulation is often the most practical tool for predicting and optimizing multicore system behavior before hardware exists.
This article examines how UML, SysML, and the MARTE profile support multicore system simulation, what each standard contributes to the modeling workflow, and where the practical challenges lie when applying these tools to realistic heterogeneous architectures.
Why Multicore Changes the Simulation Problem
Single-core embedded system performance is hard to predict but tractable. With one processor, one memory hierarchy, and one execution context, worst-case execution time (WCET) analysis techniques can produce useful bounds. The system is complex, but the interactions are bounded.
Multicore systems break this tractability in several ways. Cache coherence introduces non-deterministic memory access times: whether a cache line is present when a core requests it depends on what other cores have been doing. Shared bus or network-on-chip bandwidth creates contention patterns that vary with the combined workload of all active cores. Inter-core synchronization introduces data-dependent blocking that can produce latency spikes invisible to any single-core analysis.
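The contention effect can be illustrated with a deliberately small model. The sketch below is not a cycle-accurate simulator: it assumes a single shared bus with first-come-first-served arbitration, a fixed service time per request, and cores that all issue requests in phase. The function name and parameters are hypothetical and chosen only for this illustration.

```python
def mean_bus_latency(num_cores, period, service, horizon):
    """Mean request latency in cycles on one FIFO-arbitrated shared bus."""
    # Assumption: every core issues a request at t = 0, period, 2*period, ...
    requests = sorted(
        (t, core) for core in range(num_cores) for t in range(0, horizon, period)
    )
    bus_free_at = 0
    total_latency = 0
    for issue_time, _core in requests:
        start = max(issue_time, bus_free_at)  # stall while the bus is busy
        bus_free_at = start + service
        total_latency += bus_free_at - issue_time
    return total_latency / len(requests)


for cores in (1, 2, 4):
    print(cores, mean_bus_latency(cores, period=20, service=4, horizon=200))
```

With these parameters the mean latency is 4.0 cycles for one core, 6.0 for two, and 10.0 for four, even though no individual core's request pattern has changed. The latency each core sees is a property of the combined workload, which is the point the paragraph above makes.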
These interaction effects are exactly what simulation excels at capturing. By running the full multicore system as an executable model under realistic workloads, engineers observe the emergent behavior that cannot be predicted component by component. They can identify the cores and communication paths that are binding constraints, experiment with scheduling policies and memory topology changes, and develop insight into the system's performance characteristics before RTL is written.
UML as the Structural Foundation
UML provides the base notation on which both SysML and MARTE are built, and it contributes several constructs that are directly useful for multicore system modeling.
Component diagrams describe the structural decomposition of the software: which components exist, what interfaces they expose, and how they are connected. Deployment diagrams then record which of those components run on which processing elements. For a multicore SoC, this translates naturally to a mapping of tasks and middleware components onto the available cores and accelerators.
Sequence diagrams and activity diagrams capture the dynamic behavior of inter-component interactions. In a multicore context, these diagrams show how tasks on different cores communicate, synchronize, and exchange data, making the inter-core coordination explicit and auditable.
Class and object diagrams model the data structures shared across cores, which is important for analyzing the memory access patterns that drive cache behavior and bus bandwidth consumption.
UML's extensibility mechanism, profiles, is the hook that makes domain-specific modeling possible without abandoning the standard notation. Both SysML (in its 1.x versions) and MARTE are defined as UML profiles, which means models built with these standards are still valid UML models and can be processed by any UML tool that supports profiles.
SysML for Architecture Definition and Requirements Traceability
SysML was developed to address the needs of systems engineering, a discipline that operates at the intersection of hardware, software, mechanical, and operational concerns. For multicore embedded systems development, SysML contributes three capabilities that UML alone does not provide.
Block Definition Diagrams (BDD) provide a hierarchical decomposition of the system that maps cleanly onto the hardware architecture. Processing elements, memories, interconnects, and peripheral subsystems appear as blocks with defined ports and interfaces. A BDD of a multicore SoC makes the architectural topology explicit and gives all subsequent models a common structure to refer to.
Internal Block Diagrams (IBD) show how blocks connect and interact within a containing assembly. For a multicore system, an IBD of the SoC shows the data paths between cores, the connections to shared caches and memory controllers, and the interfaces to external peripherals. This diagram type is the primary artifact for communicating architecture to engineers working on different subsystems.
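Once exported from a modeling tool, the connectivity an IBD describes is essentially a graph, and simple queries over it are often useful. The sketch below assumes the connectors are available as pairs of block names; the block names and the two helper functions are hypothetical, and the breadth-first search simply finds the shortest data path between two blocks.

```python
from collections import deque

# Hypothetical connectors of a small SoC, as (block, block) pairs.
CONNECTORS = [
    ("cpu0", "l2"), ("cpu1", "l2"),
    ("l2", "noc"), ("dsp", "noc"),
    ("noc", "ddr_ctrl"),
]


def build_adjacency(connectors):
    """Turn undirected connector pairs into an adjacency map."""
    adj = {}
    for a, b in connectors:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def data_path(adj, src, dst):
    """Shortest block-to-block path from src to dst, or None if unconnected."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:      # walk parents back to the source
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adj.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


adj = build_adjacency(CONNECTORS)
print(data_path(adj, "cpu0", "dsp"))
```

For this topology the path from cpu0 to dsp runs through l2 and noc, so any traffic between those two blocks shares both resources with whatever else uses them.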
Requirement Diagrams link system requirements directly to the architectural elements responsible for satisfying them. This traceability is critical for multicore system development because it makes explicit which requirements are at risk when simulation reveals a performance shortfall, and which architectural choices are candidates for modification.
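The traceability query described above can be sketched in a few lines. The example assumes that satisfy and derive relationships have been extracted from the model into plain records; the requirement IDs, texts, block names, and function names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    req_id: str
    text: str
    satisfied_by: list = field(default_factory=list)  # blocks with a satisfy link
    derived_from: list = field(default_factory=list)  # parent requirement ids


# Hypothetical requirement set for illustration only.
REQS = {r.req_id: r for r in [
    Requirement("R1", "End-to-end frame latency <= 5 ms", ["dsp", "noc"]),
    Requirement("R2", "DSP filter stage completes within 2 ms", ["dsp"], ["R1"]),
    Requirement("R3", "NoC frame transfer within 1 ms", ["noc"], ["R1"]),
    Requirement("R4", "Boot time <= 2 s", ["cpu0", "flash_ctrl"]),
]}


def implicated_blocks(reqs, req_id):
    """Blocks responsible for satisfying the given requirement."""
    return sorted(reqs[req_id].satisfied_by)


def downstream(reqs, req_id):
    """Requirements derived, directly or transitively, from req_id."""
    found, frontier = [], [req_id]
    while frontier:
        parent = frontier.pop()
        for r in reqs.values():
            if parent in r.derived_from and r.req_id not in found:
                found.append(r.req_id)
                frontier.append(r.req_id)
    return sorted(found)


print(implicated_blocks(REQS, "R1"), downstream(REQS, "R1"))
```

If simulation shows R1 at risk, the query points at the dsp and noc blocks and flags R2 and R3 as requirements that inherit the risk.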
MARTE: Enabling Quantitative Performance Analysis
MARTE (Modeling and Analysis of Real-Time and Embedded Systems) is the UML profile designed specifically to support timing and performance analysis of embedded systems. It extends UML with stereotypes and tagged values that capture the quantitative information simulation tools need to produce meaningful results.
The MARTE time model provides a formal treatment of time that is more expressive than UML's built-in time constructs. It distinguishes between logical time (the ordering of events) and physical time (wall-clock duration), and it supports multiple concurrent time bases, which is essential for modeling heterogeneous SoCs where different subsystems may operate in different clock domains.
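To make the idea of multiple time bases concrete, the sketch below relates ticks of two clock domains to a common physical timeline. It is an illustration of the concept, not MARTE's own notation: the Clock class, the clock names, and the frequencies are assumptions, and exact fractions are used so that tick-to-time conversion does not accumulate floating-point error.

```python
from fractions import Fraction


class Clock:
    """A time base defined by a frequency and an offset on the physical timeline."""

    def __init__(self, name, freq_hz, offset_ns=0):
        self.name = name
        self.period_ns = Fraction(10**9, freq_hz)
        self.offset_ns = Fraction(offset_ns)

    def to_physical_ns(self, tick):
        return self.offset_ns + tick * self.period_ns


def order_events(events):
    """Sort (clock, tick, label) events by physical time."""
    return sorted(events, key=lambda e: e[0].to_physical_ns(e[1]))


cpu_clk = Clock("cpu_clk", 800_000_000)   # hypothetical 800 MHz domain
dsp_clk = Clock("dsp_clk", 200_000_000)   # hypothetical 200 MHz domain

events = [(dsp_clk, 3, "dsp_done"), (cpu_clk, 8, "cpu_write"), (dsp_clk, 1, "dsp_start")]
print([label for _clk, _tick, label in order_events(events)])
```

Tick 8 of the 800 MHz clock and tick 2 of the 200 MHz clock both fall at 10 ns, so tick counts from different domains are only comparable after conversion to a shared time base.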
Resource modeling stereotypes in MARTE annotate model elements with their resource requirements and capacities. A processing element stereotype carries attributes for execution speed and the scheduling policy used to arbitrate between competing tasks. A communication resource stereotype carries bandwidth and latency parameters. These annotations transform a structural model into a quantitative model that simulation engines can execute.
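How such annotations turn structure into numbers can be shown with a minimal sketch. The classes below are stand-ins for annotated model elements, not MARTE stereotypes themselves; the attribute names, units, and example values are assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingElement:
    name: str
    speed_factor: float   # speed relative to a reference processor
    sched_policy: str     # e.g. "FixedPriority"; unused in the timing math below


@dataclass(frozen=True)
class CommResource:
    name: str
    bandwidth_bytes_per_us: float
    latency_us: float


def exec_time_us(ref_time_us, pe):
    """Execution time on pe for work that takes ref_time_us on the reference."""
    return ref_time_us / pe.speed_factor


def transfer_time_us(num_bytes, link):
    """Fixed latency plus serialization time over the link."""
    return link.latency_us + num_bytes / link.bandwidth_bytes_per_us


dsp = ProcessingElement("dsp", speed_factor=2.0, sched_policy="FixedPriority")
noc = CommResource("noc", bandwidth_bytes_per_us=1024.0, latency_us=0.5)

# A task costing 100 us on the reference core, followed by a 4 KiB transfer.
print(exec_time_us(100.0, dsp) + transfer_time_us(4096, noc))
```

With these values the task takes 50.0 us on the DSP and the transfer 4.5 us. A simulation engine performs the same kind of calculation for every step, with contention layered on top.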
The MARTE Schedulability Analysis Modeling (SAM) sub-profile provides constructs for formal schedulability analysis of real-time task sets. For multicore systems, SAM models capture task periods, deadlines, execution budgets, and the dependencies between tasks running on different cores.
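As an illustration of what a schedulability tool does with such annotations, the sketch below applies the classic fixed-priority response-time iteration to the tasks assigned to one core. It assumes partitioned scheduling, deadlines no larger than periods, and no blocking on shared resources or other cores; those simplifications are exactly where real multicore analysis becomes harder. The task parameters are hypothetical.

```python
def response_time(wcet, higher_prio, deadline):
    """Worst-case response time, or None if the deadline is exceeded.

    higher_prio is a list of (wcet, period) pairs for tasks on the same core
    that have higher priority than the task under analysis.
    """
    r = wcet
    while r <= deadline:
        # Interference: each higher-priority task runs ceil(r / period) times.
        nxt = wcet + sum(-(-r // t_j) * c_j for c_j, t_j in higher_prio)
        if nxt == r:
            return r          # fixed point reached
        r = nxt
    return None


# Hypothetical tasks on one core as (wcet, period, deadline), highest priority first.
core0 = [(1, 4, 4), (2, 6, 6), (3, 12, 12)]

for i, (c, _t, d) in enumerate(core0):
    higher = [(cj, tj) for cj, tj, _dj in core0[:i]]
    print(i, response_time(c, higher, d))
```

For this task set the response times are 1, 3, and 10 time units, all within their deadlines. Adding cross-core dependencies or shared-bus delays invalidates the simple iteration, which is why the article treats simulation as a complement to this kind of analysis.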
The Generic Quantitative Analysis Modeling (GQAM) sub-profile, together with the Performance Analysis Modeling (PAM) sub-profile that specializes it, covers broader performance analysis scenarios: throughput, utilization, queue lengths, and response time distributions. Models annotated this way can be submitted to simulation or analytical solvers to produce performance predictions expressed in units that map directly to product requirements.
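The simplest analytical solver for quantities of this kind is a single-queue model. The sketch below uses the standard M/M/1 formulas as a stand-in; treating a shared resource as an M/M/1 queue (Poisson arrivals, exponential service, one server) is an assumption made for the example, and real fabrics usually need simulation or richer queueing models. The function name and input values are hypothetical.

```python
def mm1_metrics(arrival_rate_per_us, service_time_us):
    """Utilization, mean response time, and mean number in system for M/M/1."""
    utilization = arrival_rate_per_us * service_time_us
    if utilization >= 1.0:
        raise ValueError("queue is unstable: utilization must be below 1")
    response_time_us = service_time_us / (1.0 - utilization)
    mean_in_system = utilization / (1.0 - utilization)
    return utilization, response_time_us, mean_in_system


# Hypothetical memory controller: one request every 16 us on average, 8 us service.
u, r, n = mm1_metrics(arrival_rate_per_us=1 / 16, service_time_us=8.0)
print(u, r, n)
```

At 50% utilization the mean response time is already twice the service time (16 us against 8 us), and it grows without bound as utilization approaches 1.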
Practical Challenges in Multicore Simulation
Building useful multicore simulation models requires navigating several practical difficulties.
Fidelity calibration is the central challenge. The simulation must be accurate enough to predict the interaction effects that matter (cache contention, bus arbitration behavior, inter-core synchronization delays) without being so detailed that it takes days to execute. A common compromise is to use transaction-level models for the communication fabric and instruction-set-accurate models for the application cores.
Workload realism is as important for multicore simulation as for single-core work, and harder to achieve. Multicore contention patterns depend on the simultaneous activity of all cores, which means a workload trace for one core is not sufficient. The full multi-task workload profile must be captured or synthesized. Teams that invest in profiling representative workloads from deployed systems get simulation results that are predictive of hardware behavior.
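The reason a single-core trace is insufficient becomes visible once per-core traces are merged onto one timeline. The sketch below assumes each trace is a time-sorted list of (cycle, resource) accesses; the trace contents, the two-cycle conflict window, and the function names are hypothetical.

```python
import heapq


def merge_traces(traces):
    """Merge per-core traces into one stream of (cycle, core, resource)."""
    tagged = [[(t, core, res) for t, res in events] for core, events in traces.items()]
    return list(heapq.merge(*tagged))


def count_conflicts(merged, window):
    """Count adjacent accesses to the same resource from different cores
    that fall within `window` cycles of each other."""
    conflicts = 0
    for (t0, c0, r0), (t1, c1, r1) in zip(merged, merged[1:]):
        if c0 != c1 and r0 == r1 and t1 - t0 <= window:
            conflicts += 1
    return conflicts


# Hypothetical traces: each core alone shows no conflict at all.
traces = {
    0: [(0, "bankA"), (10, "bankB"), (20, "bankA")],
    1: [(1, "bankA"), (15, "bankA"), (21, "bankA")],
}

merged = merge_traces(traces)
print(count_conflicts(merged, window=2))
```

Either trace on its own contains zero conflicts; the merged stream contains two. The contention exists only in the combined workload, which is why the full multi-task profile has to be captured or synthesized.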
Model composition across fidelity levels is a third challenge. A complete multicore SoC simulation often combines subsystem models built at different abstraction levels by different teams. Ensuring that these models compose correctly (that their interfaces are compatible, their time bases are synchronized, and their assumptions about shared resources are consistent) requires explicit coordination that is easy to neglect.
From Simulation to Architecture Decision
Simulation results are only valuable to the extent they drive architectural decisions. The output of a multicore system simulation (utilization figures, latency distributions, deadline miss rates, bus bandwidth consumption profiles) needs to be reviewed against the original requirements and translated into specific architectural recommendations.
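A review of this kind can be partly automated. The sketch below reduces a list of simulated latencies to a miss rate and a percentile and checks them against targets. The nearest-rank percentile definition, the sample values, and the thresholds are assumptions chosen for the example, not figures from any real system.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    ordered = sorted(samples)
    rank = -(-pct * len(ordered) // 100)   # ceil(pct * n / 100) in integers
    return ordered[rank - 1]


def review(latencies_us, deadline_us, max_miss_rate, p90_target_us):
    """Summarize one simulation run against its requirement targets."""
    misses = sum(1 for x in latencies_us if x > deadline_us)
    miss_rate = misses / len(latencies_us)
    p90 = percentile(latencies_us, 90)
    return {
        "miss_rate": miss_rate,
        "p90_us": p90,
        "miss_rate_ok": miss_rate <= max_miss_rate,
        "p90_ok": p90 <= p90_target_us,
    }


# Hypothetical latencies from one simulation run, in microseconds.
run = [80, 90, 95, 100, 105, 110, 120, 130, 150, 220]
print(review(run, deadline_us=200, max_miss_rate=0.05, p90_target_us=160))
```

In this run the 90th percentile (150 us) meets its target, but one sample in ten misses the 200 us deadline, which exceeds the allowed 5% miss rate. A summary in this form gives a design review a specific shortfall to discuss.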
Teams that establish a regular simulation review cadence during the architecture phase, running updated simulations as architectural alternatives are proposed and reviewing results against requirement targets in structured design reviews, get the most value from their modeling investment. Teams that build models but do not connect simulation results to decision processes end up with interesting data they do not act on.
The SysML traceability constructs are particularly useful for this purpose: when a simulation run shows that a latency requirement is at risk, the requirement diagram identifies which architectural elements are implicated, which alternatives are candidates for investigation, and which downstream requirements depend on the one at risk.
UML, SysML, and MARTE together provide a complete stack for multicore system simulation: structural notation, requirements traceability, and quantitative performance annotation. Teams that use these standards consistently produce models that are auditable, reusable, and directly connected to the design decisions they are meant to inform. For multicore embedded systems, where interaction effects make analytical prediction unreliable, simulation grounded in standard models is the most defensible basis for architectural commitment.