Presentation title
A RISC-V-based FPGA Overlay to Simplify Accelerator Deployment for Unmanned Vehicles
Authors
Gianluca Bellocchi, Alessandro Capotondi and Andrea Marongiu
Institution(s)
University of Modena and Reggio Emilia
Presentation type
Presentation of a research group from one or more scientific institutions
Abstract
Autonomous systems such as Unmanned Aerial Vehicles (UAVs) face many challenges when they must accomplish a specific task with high levels of reliability and safety. Autonomous navigation remains extremely difficult to implement, especially on small vehicles.
UAVs are increasingly adopting heterogeneous systems-on-chip (HeSoCs) as compute platforms to satisfy the demands of their sophisticated workloads; these devices can reach high performance and energy efficiency, at the cost of increased design complexity. HeSoCs couple a general-purpose high-performance CPU (host processor) with domain-specific acceleration engines.
Hardware flexibility and tight power envelopes make FPGAs ideal candidates for acceleration. However, the design process is hard, and productivity is easily hurt by long compilation times. Maximum performance is usually associated with conventional RTL hardware design techniques, which require low-level device expertise. High-Level Synthesis (HLS) eases the hardware IP design process, but automated tools still lack the maturity required to efficiently tackle system-level integration of the many hardware and software blocks included in a modern UAV system.
The FPGA design and integration processes can be simplified by employing a hardware abstraction layer that overlays the original FPGA fabric, hiding most of the RTL details from non-expert users. This solution is usually referred to as an overlay architecture. It avoids the complex FPGA design flow, which improves design productivity. In the context of UAV system deployment, overlays offer the additional advantage of rapid swapping of architectural blocks, as coarse-grained overlay architectures have smaller configuration data sizes than fine-grained FPGAs.
In this context, our research contributions consist of:
● An Open-Source RISC-V-based FPGA Overlay Architecture: The proposed overlay is built around a silicon-proven open-source RISC-V architecture [1], and its design entries are optimized to operate at a high frequency with low hardware resource overhead. The overlay has been implemented on a commercial off-the-shelf HeSoC platform and extensively characterized, demonstrating little area overhead and high efficiency at a target frequency of 140 MHz. Tests with three sample accelerators show an improvement of up to 30% compared to standard HLS design flows.
● A Semi-automatic Design Flow for the Deployment of Application-Specific Accelerators: The flow simplifies both the design and the programmability of the HW/SW platform. Dedicated logic (the wrapper) offers plug-and-play HW/SW integration of the accelerators; the design stage is not limited to any specific methodology (either HLS or manual RTL can be employed). The wrapper can be specialized with the aid of an abstracted integration methodology and then connected to the overlay interconnect, which provides shared-memory communication with the overlay cores.
● High-Level Programming Model Interface: The overlay integrates RISC-V soft-cores to control the accelerators; the soft-cores can flexibly operate the accelerators and reconfigure their operation without costly host intervention, thus avoiding significant performance degradation. This is achieved through the typical offloading procedure, from the host CPU to the soft-cores, of a standard heterogeneous programming model (e.g., OpenMP v4.0).
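As a rough illustration of the offloading procedure mentioned in the last item, the sketch below shows what a generic OpenMP 4.0 target region looks like in C. It is not taken from the presented overlay: the function name `saxpy_offload` and the kernel itself are hypothetical, and how the overlay's toolchain maps such a region onto the soft-cores and accelerators is not described in this abstract.

```c
#include <stddef.h>

/* Hypothetical kernel (illustration only): y = a * x + y.
 * The target region marks the code that an OpenMP 4.0 runtime offloads
 * from the host CPU to a device. In the setting described above, the
 * device role would be played by the overlay's RISC-V soft-cores; that
 * mapping is an assumption here. Without an offloading-capable
 * toolchain, the region simply executes on the host. */
void saxpy_offload(size_t n, float a, const float *x, float *y)
{
    /* Copy x to the device; copy y to the device and back to the host. */
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    {
        for (size_t i = 0; i < n; ++i) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

The host calls `saxpy_offload(n, a, x, y)` like any other function; the `map` clauses describe the data movement, so the host code needs no explicit knowledge of the device.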