Embedded systems are constrained first and foremost by throughput: what happens inside an embedded device is generally irrelevant, as long as it performs its task with adequate speed and adequate results. Once this constraint is met, the dominant constraint is power consumption.
A fixed energy budget, coupled with increasing demands for performance in new embedded systems, is pushing embedded designs towards multi-core architectures, just as it has for general-purpose processors. The main difference is that the power budget is smaller, so embedded processors can afford even less of the power-hungry instruction-level parallelism techniques that characterise today's general-purpose processors.
In the same vein, embedded systems can hardly afford power-hungry processor interconnects. As with instruction set extensions, specialising the interconnect architecture towards the application can increase performance and reduce the interconnect's power cost.
Our research in this field aims to find methods that automatically design, and physically implement, an interconnect architecture that is near-optimal for a given application. Using statistical and machine learning methods, we expect to be able to solve this problem without resorting to expensive design-space exploration.