8/6/2019 IOmodeling and Refinement 2
http://slidepdf.com/reader/full/iomodeling-and-refinement-2 1/4
I/O Modeling and Refinement for HW/SW Codesign of Embedded Systems
Youngmin Yi
School of Computer Science and Engineering
Seoul National University
Dohyung Kim
Department of Computer Science and Engineering
University of California, San Diego
Soonhoi Ha
School of Computer Science and Engineering
Seoul National University
Abstract - Different levels of abstraction for I/O modeling are used depending on the codesign step. However, manually changing the abstraction level of I/O models at each design step is laborious. Moreover, the designer may want to mix levels of abstraction between the I/O models and the simulation. Thus, it is very desirable to make the I/O modeling retargetable and configurable. In this paper we propose an I/O modeling and refinement technique in a codesign methodology where an I/O device and its interface code with various levels of abstraction are automatically integrated and finally simulated in a unified framework. We demonstrate the viability of the proposed methodology with a videophone example that consists of an H.263 codec accessing a camera and an LCD, a network control task accessing an Ethernet device, and a G.723 codec accessing a microphone and a speaker.
Keywords: codesign, system level modeling, I/O modeling, I/O refinement, simulation, PeaCE.
1 Introduction
I/O devices are important system components in an embedded system that interacts with its environment. Different levels of abstraction for I/O modeling are used depending on the design step. At the algorithm development step or the functional simulation step, I/O accesses are usually replaced with file accesses or memory accesses at the source level, which avoids the need for I/O modeling altogether. For example, an input device access is replaced with an access to a file or a memory region that stores the pre-obtained result of the input device access. One may load the input image into the simulator memory, if the simulator provides such a capability, or build the input data into the executable.
Semihosting [1][2] is a well-known I/O modeling technique for functional simulation using a processor simulator. The processor simulator emulates an I/O operation in the target code on the simulation host. For example, an ARM processor simulator uses the SWI (software interrupt) mechanism to hook an I/O request in the target code and convert it to a host I/O request.
When the designer simulates the entire system to estimate or verify the overall performance, it is essential to model I/O device behavior accurately enough to capture its impact on system performance. Since I/O behavior and its interface affect task scheduling and bus bandwidth, improper modeling of the I/O device and its interface leads to inaccurate performance results. Therefore, the current practice is to port the operating system to the simulator and run the application on top of the OS. However, writing a device driver for a new I/O device is usually difficult, time consuming, and error-prone, since it depends heavily on the target architecture and requires a full understanding of complex I/O device behavior. If the final system architecture, including the OS, has not been determined yet, writing a device driver is not a feasible approach. So we need another I/O modeling technique for the design space exploration step.
Changing the abstraction level of I/O models at each design step is a laborious task if done manually. In addition, the designer may want to mix levels of abstraction between the I/O models and the simulation. For example, before the device driver for an I/O device is written, the I/O device must be modeled at a higher level than the simulation level of the other components. In summary, it is very desirable to make the I/O modeling retargetable and configurable, which is the main concern of this paper.
In this paper we propose an I/O modeling and refinement technique in a codesign methodology where an I/O device and its interface code with various levels of abstraction are automatically integrated and finally simulated in a unified framework.
In the next section, we review related work. In Section 3, we describe the general codesign procedure for embedded systems. The proposed specification and refinement of the I/O device and its interface are described in Sections 4 and 5. Section 6 describes a case study with a videophone application. Finally, conclusions follow in Section 7.
2 Related Work
Bouchhima et al. [3] propose a method to model I/O operations in SystemC for HW/SW cosimulation. Their approach is specific to their simulation environment, where the operating system is also modeled in SystemC to build a compiled simulation environment. In this paper we assume that the system simulation environment is built by integrating simulation models of components, as most commercial cosimulation environments are.
Wang et al. [4] proposed a framework for peripheral device modeling, but their work aims at synthesizing the interface code itself from a formal specification of peripheral devices. It does not deal with the simulation of a peripheral device and its interface for HW/SW cosimulation.
UDI [5] is a set of APIs defined to allow device drivers to be portable across different platforms and OSes. The main idea of an OS- and architecture-independent device interface API is the same as in our methodology. But the purpose of UDI is to make device drivers portable horizontally across different architectures, whereas ours is to provide both horizontal and vertical retargetability for the I/O models.
In summary, previous I/O modeling work mainly focuses on horizontal retargetability at a specific level of abstraction. Our contribution in this paper is to enable vertically retargetable I/O modeling across different levels of abstraction depending on the design step.
3 Methodology and Framework
Figure 1 shows the overall codesign flow of our codesign environment, PeaCE [6], which implements the proposed technique. While the proposed technique is also applicable to other codesign approaches, we use ours to show specific implementation examples. In our codesign flow, architecture (platform) and algorithm are specified separately, which is known as the Y-chart codesign flow [7]. The algorithm specification, given as a block diagram, is executable: functional simulation is performed by generating host code from the specification and running it on the host machine. SW performance estimation is performed for a specified processor using the instruction set simulator of the processor. Based on the profiling information, HW/SW partitioning and mapping of algorithm blocks onto the target architecture are performed. After the partitioning decision, a virtual prototype is automatically built through SW synthesis, HW synthesis, and interface synthesis. Interface synthesis in this step means generating the driver and wrapper code between a SW-mapped block and a HW-mapped one, not between an I/O device and a block. After the final architecture is determined through cosimulation or virtual prototyping, a real prototype is built.
In the proposed technique, we use the same I/O interface code from the algorithm specification down to real prototyping, but the I/O model behind it is refined differently depending on the design step. We illustrate four levels of refinement for the following design steps: functional simulation, performance estimation, virtual prototyping, and real prototyping.
[Figure 1 diagram. Boxes: algorithm specification, architecture specification, functional simulation, performance estimation, HW/SW partition & mapping, SW synthesis, HW synthesis, interface synthesis, cosimulation. Edge labels: time, cost, mapped result, C code, driver/wrapper, VHDL code.]
Figure 1. Overall HW/SW codesign flow of PeaCE
4 I/O Interface Specification
To make the I/O interface code independent of the abstraction level of I/O models, we define generic I/O interface APIs. We propose to specify the I/O device interface using only a predefined set of generic APIs; the device-specific interface definitions behind those APIs are refined according to the I/O device model at each abstraction level.
The generic I/O interface API should not assume any specific implementation, so that it can also be implemented in hardware. It must also be independent of the target architecture and operating system. At the same time, it should be at a level where the essential behavior can be captured for correct validation and accurate performance estimation. Figure 2 shows a subset of the generic I/O interface APIs that we have defined in the current implementation.
Figure 2. Generic I/O interface APIs
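One way to picture how such generic APIs stay independent of the device model is a per-device table of function pointers that a preinit routine fills in, in the spirit of the iodev[fd].read(...) calls in the Figure 2 listing and the preinit routines of Figure 6. The following is a minimal sketch under assumptions: the IODEV layout, MAX_IODEV, the error convention (-1), and the memory-backed "ramdev" model are all hypothetical and are not taken from PeaCE.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical device descriptor: one function pointer per generic operation. */
typedef struct IODEV {
    int (*read)(char *buf, int count);
    int (*write)(char *buf, int count);
} IODEV;

#define MAX_IODEV 4          /* assumed table size */
IODEV iodev[MAX_IODEV];      /* indexed by the descriptor returned by open */

/* Generic APIs: forward to whatever model is registered, or fail with -1. */
int IOdev_read(int fd, char *buf, int count) {
    if (fd < 0 || fd >= MAX_IODEV || iodev[fd].read == NULL) return -1;
    return iodev[fd].read(buf, count);
}

int IOdev_write(int fd, char *buf, int count) {
    if (fd < 0 || fd >= MAX_IODEV || iodev[fd].write == NULL) return -1;
    return iodev[fd].write(buf, count);
}

/* Illustrative functional-level model: a device backed by a memory buffer. */
static char ramdev_data[16];
static int ramdev_len = 0;

static int ramdev_write(char *buf, int count) {
    assert(count >= 0);
    if (count > (int)sizeof ramdev_data) count = (int)sizeof ramdev_data;
    memcpy(ramdev_data, buf, (size_t)count);
    ramdev_len = count;
    return count;
}

static int ramdev_read(char *buf, int count) {
    assert(count >= 0);
    if (count > ramdev_len) count = ramdev_len;
    memcpy(buf, ramdev_data, (size_t)count);
    return count;
}

/* Refinement step: bind the generic operations to one concrete model. */
void ramdev_preinit(IODEV *iodev_p) {
    iodev_p->read = ramdev_read;
    iodev_p->write = ramdev_write;
}
```

After ramdev_preinit(&iodev[0]), a block that calls IOdev_write(0, ...) and IOdev_read(0, ...) never names the model behind descriptor 0, so replacing the model only requires a different preinit routine.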
The standard C library includes a number of I/O-related functions (fopen, fread, fscanf, printf, etc.). We provide two different levels of ports for those functions, as illustrated in Figure 3. The first definition is for performance estimation with an ISS; in this case, the semihosting technique is used as the definition. The second definition is for time-accurate cosimulation and is defined with our I/O interface APIs, which are in turn defined with the interface code for the target I/O device model. We selectively redefine the functions with the same prototype using #define statements for each level of the design flow.
Figure 3. Porting standard library using semihosting technique
#ifndef FUNC_SIM
#define fopen IOmodel_fopen
#ifdef PERF_ESTIMATION
FILE *IOmodel_fopen(const char *path, const char *mode) {
    int fd;
    strcpy((char *)IO_BUF, path);            // pass 1st argument
    strcpy((char *)(IO_BUF + 256), mode);    // pass 2nd argument
    *(volatile int *)IO_CMD = 8;             // inform which interface it is
    fd = *(volatile int *)IO_RET;            // get result of semihosting
    return (FILE *)(intptr_t)fd;             // host handle passed through as an opaque FILE * (needs <stdint.h>)
}
#else // TIME_ACCURATE_COSIM
FILE *IOmodel_fopen(const char *path, const char *mode) {
    int fd = IOdev_open(path, mode);         // call as in the original; Figure 2 declares the 2nd parameter as int flags
    return (FILE *)(intptr_t)fd;
}
#endif
#endif
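Figure 3 shows only the target side of the semihosting port. As a rough sketch of the other half, the code below shows how a simulator might service the request once it observes the store to IO_CMD. The argument offset (256) and the command code (8) come from Figure 3; the Mailbox structure, the buffer size, the host handle table, and the function names are assumptions made for illustration and do not describe any particular ISS.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define IO_BUF_SIZE    512   /* assumed size of the argument buffer */
#define ARG2_OFFSET    256   /* second argument offset, as in Figure 3 */
#define CMD_FOPEN      8     /* command code used for fopen in Figure 3 */
#define MAX_HOST_FILES 8     /* assumed size of the host handle table */

/* Hypothetical image of the target-visible mailbox (IO_BUF, IO_CMD, IO_RET). */
typedef struct {
    char io_buf[IO_BUF_SIZE];
    int  io_cmd;
    int  io_ret;
} Mailbox;

static FILE *host_files[MAX_HOST_FILES];

/* Open a host file and return a small integer handle, or -1 on failure. */
static int host_open(const char *path, const char *mode) {
    for (int i = 0; i < MAX_HOST_FILES; i++) {
        if (host_files[i] == NULL) {
            host_files[i] = fopen(path, mode);
            return host_files[i] ? i : -1;
        }
    }
    return -1; /* handle table full */
}

/* Called by the simulator when it observes a store to IO_CMD. */
void semihost_service(Mailbox *mb) {
    assert(mb != NULL);
    /* Force termination of both argument strings before using them. */
    mb->io_buf[ARG2_OFFSET - 1] = '\0';
    mb->io_buf[IO_BUF_SIZE - 1] = '\0';
    switch (mb->io_cmd) {
    case CMD_FOPEN:
        mb->io_ret = host_open(mb->io_buf, mb->io_buf + ARG2_OFFSET);
        break;
    default:
        mb->io_ret = -1; /* unknown request */
        break;
    }
}
```

Returning a small integer handle rather than a host pointer keeps the value meaningful inside the target address space, which matches the int result read from IO_RET in Figure 3.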
5 I/O Device Refinement
In this section we explain how an I/O device model is refined to different abstraction levels as the design steps are performed. We use a simple illustrative example in Figure 4, where block A accesses I/O device dev1 and sends data to block B, which accesses dev2.
At the functional simulation step, the code runs on the host machine, so the I/O device need not be a simulation model. Instead, it can be an actual I/O device driven by the host OS, as shown in Figure 4(a). The I/O interface definition then becomes nothing but the associated device driver of the host OS.
At the SW performance estimation step, we generate target code for a specific processor and run it on the instruction set simulator (ISS) of the processor (Figure 4(b)). The ISS provides execution profile information such as execution cycles and memory access counts. Since we are only interested in the performance of a function block on the specific processor, bus contention and bus bandwidth are not a main concern. Therefore, the I/O device can be modeled using the semihosting technique, and the estimated performance of the I/O interface is defined as a function of the amount of data exchanged.
As explained in Section 3, we partition the system behavior before building a virtual prototype. Let us assume that block A is mapped to a HW component and block B to a SW component. Then a virtual prototype is built with a HW simulator and an ISS connected to a cosimulation engine. The cosimulation engine schedules the component simulators and manages the interaction between them. Figure 4(c) displays the execution flow of the refined I/O interface code. This code can be an actual device driver or primitive APIs of the cosimulation engine, depending on the type of cosimulation framework used.
int IOdev_open(const char *devname, int flags);
int IOdev_close(int fd);
int IOdev_read(int fd, char *buf, int count) {
    return iodev[fd].read(buf, count); }
int IOdev_write(int fd, char *buf, int count) {
    return iodev[fd].write(buf, count); }
int IOdev_set_config(int fd, int cmd, void *buf, int size) {
    return iodev[fd].set_config(cmd, buf, size); }
int IOdev_get_config(int fd, int cmd, void *buf, int size) {
    return iodev[fd].get_config(cmd, buf, size); }
In either cosimulation framework, two factors affect the accuracy of simulation in this step. The first is the overhead of the I/O interface code itself in the processing element and the associated OS scheduling. The second is the response time of the I/O device access, considering contention on the communication architecture. Both factors are handled straightforwardly in common cosimulation frameworks such as Seamless CVE [8], because the target OS itself is cross-compiled and run on top of the ISS, with the device driver interacting directly with the OS scheduler, and access to the communication architecture is simulated accurately since the bus and memory architecture are modeled in RTL. The situation is no different in common TLM cosimulation frameworks such as ConvergenSC [9], except that the bus and memory architecture are modeled in TLM. While our cosimulation framework [10] also employs an ISS, it differs from these tools in that it neither simulates the OS on top of the ISS nor simulates the bus and memory architecture; instead, it models them inside the cosimulation engine for higher cosimulation performance. In our cosimulation framework, the former factor is accounted for by the OS simulation model in the cosimulation engine, with the I/O device model accessed only through predefined I/O interfaces. The APIs for this level of cosimulation are defined using primitive communication APIs that interoperate correctly with the cosimulation engine. The latter factor is handled by the bus and memory model in the cosimulation engine, since every access to the I/O device model, as well as all other communication between system components, is monitored by the cosimulation engine.
Block A:
use("dev1")
while(1) {
    IOdev_read();
    do_computation();
    write_port();
}
// interface definition
void dev1_preinit();
int dev1_read();
int dev1_write();
Block B:
use("dev2")
while(1) {
    read_port();
    do_computation();
    IOdev_write();
}
// interface definition
void dev2_preinit();
int dev2_read();
int dev2_write();
[Diagram labels: host binary containing Block A and Block B; host OS; host I/O dev1; host I/O dev2]
(a) Host code execution in functional simulation
[Panel (b) repeats the Block A and Block B code and interface definitions of panel (a). Diagram labels: target binary containing Block A and Block B; target ISS; I/O model; host OS; host I/O dev1; host I/O dev2]
(b) Performance estimation using ISS
The I/O device model can be at any level of abstraction as long as the designer provides the corresponding interface definition. When I/O device interface code is retrieved from the interface library, the abstraction level of the I/O device model and the target OS are given as configuration parameters for a given type of I/O device. This enables mixed-level simulation of an I/O device and the other system components.
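The retrieval step described above, in which the abstraction level and the target OS select an interface definition for a device type, might be pictured as a table lookup. This is only a sketch under assumptions: the enum, the entry layout, the "any" wildcard, and the file names are invented for illustration and do not describe the actual PeaCE interface library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Abstraction levels named after the four design steps in the paper. */
typedef enum {
    LEVEL_FUNCTIONAL_SIM,
    LEVEL_PERF_ESTIMATION,
    LEVEL_COSIMULATION,
    LEVEL_REAL_PROTOTYPE
} AbstractionLevel;

/* Hypothetical library entry: which interface source serves which configuration. */
typedef struct {
    const char *device;
    AbstractionLevel level;
    const char *target_os;      /* "any" matches every OS */
    const char *interface_file; /* illustrative file name only */
} InterfaceEntry;

static const InterfaceEntry interface_library[] = {
    { "camera", LEVEL_FUNCTIONAL_SIM,  "linux", "camera_hostdrv.c"  },
    { "camera", LEVEL_PERF_ESTIMATION, "any",   "camera_semihost.c" },
    { "camera", LEVEL_COSIMULATION,    "any",   "camera_port.c"     },
    { "LCD",    LEVEL_COSIMULATION,    "any",   "lcd_port.c"        },
};

/* Return the first entry matching the configuration, or NULL if none exists. */
const InterfaceEntry *lookup_interface(const char *device,
                                       AbstractionLevel level,
                                       const char *target_os) {
    assert(device != NULL && target_os != NULL);
    size_t n = sizeof interface_library / sizeof interface_library[0];
    for (size_t i = 0; i < n; i++) {
        const InterfaceEntry *e = &interface_library[i];
        if (strcmp(e->device, device) == 0 && e->level == level &&
            (strcmp(e->target_os, "any") == 0 ||
             strcmp(e->target_os, target_os) == 0))
            return e;
    }
    return NULL;
}
```

Because each device is looked up separately, one device can be bound at the cosimulation level while another is still bound at the semihosting level, which is the mixed-level configuration the paragraph describes.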
Block A (mapped to HW):
entity A is
    component IOdev_read
    end component;
    component write_port
    end component;
begin
    main: process(clk) begin
    end process
end entity
// interface definition
entity IOdev_read
    component dev1_read
    component dev1_write
end entity
Block B (mapped to SW):
while(1) {
    read_port();
    do_computation();
    IOdev_write();
}
// interface definition
void dev2_preinit();
int dev2_read();
int dev2_write();
[Diagram labels: cosimulation engine; HW simulator with target netlist (Block A); target ISS with target binary (Block B); I/O (dev1) simulator; I/O (dev2) simulator; host OS; host I/O dev1; host I/O dev2]
(c) Cosimulation using ISS and HW simulator
[Panel (d) repeats the Block A entity, the Block B loop, and their interface definitions from panel (c), and adds the following stub:]
// I/O model stub
eth_read_request(id)
[Diagram labels: real prototype board with CPU (target binary, Block B), FPGA (target netlist, Block A), shared memory, and Ethernet device; host OS with Ethernet device, I/O server, I/O dev2, and dev1]
(d) Real prototyping board with I/O modeling server
Figure 4. Various levels of abstraction for the I/O device model and the corresponding interface definitions at different levels of simulation
Figure 5. (a) Algorithm specification (b) Architecture specification
H.263 encoder block:
use("camera")
fd = IOdev_open("camera");
while(1) {
    IOdev_read(fd, buf, size);
    do_computation();
    write_port();
}
H.263 decoder block:
use("LCD")
fd = IOdev_open("LCD");
while(1) {
    read_port();
    do_computation();
    IOdev_write(fd, buf, size);
}
Camera interface code:
void camera_preinit(IODEV *iodev_p) {
    iodev_p->read = camera_read;
    iodev_p->write = NULL;
    iodev_p->set_config = camera_set_config;
    iodev_p->get_config = camera_get_config;
    iodev_p->open = camera_open;
    iodev_p->close = camera_close;
}
int camera_read(char *buf, int size) {
    write_port(CAMERA_CMD_REG, cmd, cmd_size);
    read_port(CAMERA_BUF, buf, size);
}
LCD interface code:
void LCD_preinit(IODEV *iodev_p) {
    iodev_p->read = NULL;
    iodev_p->write = LCD_write;
    iodev_p->set_config = LCD_set_config;
    iodev_p->get_config = LCD_get_config;
    iodev_p->open = LCD_open;
    iodev_p->close = LCD_close;
}
int LCD_write(char *buf, int size) {
    write_port(LCD_CMD_REG, cmd, cmd_size);
    write_port(LCD_BUF, buf, size);
}
[Diagram labels: camera_device with command_reg and data_buffer; LCD_device with command_reg and frame_buffer]
Figure 6. Proposed I/O device modeling and interface refinement for cosimulation step in videophone example
On the real prototype board, it may be the case that the I/O device driver for the target board has not been ported yet, but designers want to execute the image and verify the correctness of the synthesis results. In that case, one solution is to run an I/O model server on the host machine and insert an I/O model stub into the I/O interface definition on the target board. This solution assumes that at least one ported device driver for a communication device (usually an Ethernet device) is available on the target board. Through that channel, the I/O model stub, which is invoked whenever the target code tries to access the I/O device, transmits an I/O request packet to the I/O model server on the host machine. The I/O model server processes the request by accessing the corresponding I/O device of the host machine. This can be regarded as an extended form of semihosting (Figure 4(d)).
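The paper does not specify the format of the I/O request packet. Purely as an illustration of the stub/server split, the sketch below encodes a request on the target side and decodes it on the host side; the header layout (device id, operation, 16-bit big-endian length), the opcodes, and the function names are all assumptions, and the Ethernet transport itself is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { IO_OP_READ = 1, IO_OP_WRITE = 2 };  /* hypothetical opcodes */
#define IO_REQ_HEADER_SIZE 4

/* Hypothetical request header: device id, operation, 16-bit payload length. */
typedef struct {
    uint8_t  dev_id;
    uint8_t  op;
    uint16_t length;
} IORequest;

/* Stub side: serialize a request into `pkt`; returns bytes written, 0 if no room. */
size_t io_stub_encode(const IORequest *req, uint8_t *pkt, size_t cap) {
    assert(req != NULL && pkt != NULL);
    if (cap < IO_REQ_HEADER_SIZE) return 0;
    pkt[0] = req->dev_id;
    pkt[1] = req->op;
    pkt[2] = (uint8_t)(req->length >> 8);    /* big-endian length */
    pkt[3] = (uint8_t)(req->length & 0xFF);
    return IO_REQ_HEADER_SIZE;
}

/* Server side: parse a received packet; returns 0 on success, -1 if malformed. */
int io_server_decode(const uint8_t *pkt, size_t len, IORequest *out) {
    assert(pkt != NULL && out != NULL);
    if (len < IO_REQ_HEADER_SIZE) return -1;
    if (pkt[1] != IO_OP_READ && pkt[1] != IO_OP_WRITE) return -1;
    out->dev_id = pkt[0];
    out->op = pkt[1];
    out->length = (uint16_t)((pkt[2] << 8) | pkt[3]);
    return 0;
}
```

Fixing the byte order in the packet, instead of copying the struct, keeps the exchange independent of the endianness and padding rules of the target CPU and the host.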
In summary, separating the I/O interface from computation in the specification and selectively refining it in later steps of the codesign flow allows mixed levels of abstraction for I/O and computation. It enables concurrent design of the I/O models and the other system components. Since I/O interface design requires thorough knowledge of I/O device behavior and target OS APIs, it often becomes a bottleneck of the overall system design. By applying an interface definition at a high level of abstraction, this bottleneck can be avoided, and the approach is applicable even to the real prototype board.
6 Case Study
Figure 5(a) is the algorithm specification of the videophone example. It consists of several tasks: an H.263 encoder, an H.263 decoder, a G.723 encoder, a G.723 decoder, a network control task that connects to and accepts connections from the given IP addresses, and a task that demultiplexes AV packets into audio and video data.
The architecture specification in Figure 5(b) contains information about which types of processing elements are used and how many of each. It also specifies which type of communication architecture is used for data and control transfer between the processing elements. We added I/O device components such as a camera, an LCD panel with its LCD controller, a microphone, a speaker, and an Ethernet controller.
A simplified version of the H.263 encoder and decoder tasks and their accesses to the camera and LCD controller is depicted in Figure 6 as an example. The block representing the H.263 encoder explicitly indicates its access to a camera device (use("camera")). The code generation module in the framework retrieves the relevant interface code from the interface library and builds the corresponding API definition for the camera device. When the interface code is retrieved, information on how many ports exist and what the semantics of each port are is also retrieved and passed to the modules responsible for the remaining steps of the codesign flow. This information is crucial for automatically building a virtual prototype, for example for assigning the memory map specified in the architecture specification, and also for our cosimulation framework, which exploits task synchronization and module interaction for high simulation performance.
Initially, the interface code for functional simulation of the videophone example was written as Linux device drivers. We replaced the camera and LCD display interface code with the proposed generic I/O interface APIs in order to apply the proposed modeling and refinement across different steps of the codesign flow. Note that the interface code in this example is defined using predefined communication APIs such as read_port() and write_port() that interoperate with our simulation engine. The overall system performance of the videophone example on a processor configured as arm926ej-s was 161 msec when the AC97 device latency was set to 40 msec and the camera latency to 16 msec.
7 Conclusions
In this paper, we proposed an I/O modeling and refinement technique in a codesign methodology where an I/O device and its interface code with various levels of abstraction are automatically integrated and finally simulated in a unified framework. With explicit specification of the I/O interface using predefined generic I/O interface APIs, the I/O model is easily refined to different levels depending on the design step, which makes the I/O modeling retargetable and configurable. The viability of the methodology is confirmed with a videophone example.
8 Acknowledgements
This work was supported by the National Research Laboratory Program (number M1-0104-00-0015), the BK21 Project, the IT Leading R&D Support Project, and the ITSoC Project. The ICT at Seoul National University provided the research facilities for this study.
References
[1] ARM, http://www.arm.com
[2] SimpleScalar, http://www.simplescalar.com/docs/simple_tutorial_v4.pdf/
[3] A. Bouchhima et al., "Fast and Time-Accurate Timed Execution of High Level Embedded Software using HW/SW Interface Simulation Model", In Proc. ASP-DAC, Jan. 2004
[4] S. Wang et al., "Modeling and Integration of Peripheral Devices in Embedded Systems", In Proc. DATE, Mar. 2003
[5] UDI, http://www.projectudi.org
[6] PeaCE, http://peace.snu.ac.kr/research/peace
[7] B. Kienhuis et al., "An approach for quantitative analysis of application specific dataflow architectures", In Proc. Application-Specific Systems, Architectures and Processors, Jul. 1997
[8] Seamless CVE, http://www.mentor.com/products/fv/hwsw_coverification/seamless/
[9] ConvergenSC, http://www.coware.com/products/convergensc.php
[10] D. Kim et al., "Trace-Driven HW/SW Cosimulation Using Virtual Synchronization Technique", In Proc. DAC, June 2005