Implementation and Analysis of the Stack-CNN Algorithm on an FPGA Board

This thesis investigates the implementation and analysis of the Stack-CNN algorithm on an FPGA (Field Programmable Gate Array). The goal of the algorithm is online detection and track reconstruction of space debris in the 1-10 cm range with a space-based experiment. Stack-CNN was originally developed as an online trigger operating in Float-32 for an orbiting space debris remediation system. In this thesis it was converted to work with 8-bit quantization to make it suitable for FPGA implementation. A feasibility study on the quantization and FPGA implementation of neural networks was first performed to understand the available techniques and the state of the art. Quantization-aware training of the convolutional neural network of the Stack-CNN algorithm was then carried out both in the Brevitas framework and in a Custom Layer environment, and the results of the two approaches were compared and analyzed. The FPGA implementation was performed with the FINN compiler, an experimental framework from Xilinx Research Labs for exploring deep neural network inference on FPGAs. FINN specifically targets quantized neural networks (QNNs), with emphasis on generating dataflow-style architectures customized for each network. Results from the RTL (register transfer level) synthesis report were analyzed to determine whether the power consumption and time-scale constraints required by a space-based implementation were satisfied. As future work, the CNN is expected to be implemented on a Xilinx Zynq UltraScale+ MPSoC ZCU104 board, and the stacking procedure is expected to be implemented in the Brevitas environment, with the aim of a full system implementation using the FINN compiler.

In this thesis project, the implementation and analysis of the Stack-CNN algorithm on an FPGA (Field Programmable Gate Array) board was studied. The goal of this algorithm is to detect and track, in an online data-taking system, orbital micro-debris in the 1-10 cm range by means of an experiment on a space platform. Stack-CNN, which operates in Float-32, was originally developed as an online trigger in a space remediation system. In this work the algorithm was converted to work with 8-bit quantization so that it can be implemented on an FPGA board. Regarding quantization and the FPGA implementation of neural networks, a feasibility study was carried out to understand the techniques and the state of the art. Quantization Aware Training of the convolutional network was performed within the Brevitas framework and in an environment using Custom Layers. The results of the two models were then compared and analyzed. The FPGA implementation was carried out with the FINN compiler. It is an experimental framework developed by Xilinx Research Labs to explore deep neural network inference on FPGAs. It specifically targets quantized neural networks (QNNs), with particular attention to generating architectures customized for each network. The results of the RTL (Register Transfer Level) synthesis report were analyzed to determine whether the constraints on power consumption and time scale required for a space implementation were met. The implementation of the CNN on the Zynq UltraScale+ MPSoC ZCU104 FPGA board will be carried out as the next step. The stacking procedure will be converted and adapted to the Brevitas environment with the aim of implementing the complete system using the FINN compiler.
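
The conversion from Float-32 to 8-bit values mentioned above can be illustrated with a minimal sketch of symmetric uniform quantization. This is not the thesis code and does not use the Brevitas or FINN APIs: the function names, the per-tensor scale, and the sample weights are assumptions chosen for illustration. It also quantizes values after the fact, whereas quantization-aware training simulates this rounding during training so that the network can adapt to it.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers with one shared scale.

    Hypothetical helper: symmetric, per-tensor quantization to [-127, 127].
    """
    max_abs = max(abs(v) for v in values)
    # Fall back to a unit scale for an all-zero tensor to avoid dividing by 0.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


# Illustrative weights (assumed values, not taken from the Stack-CNN model).
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding error of each value is bounded by half a quantization step.
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_err)
```

Each value is stored as an integer plus a single shared scale, which is what makes the arithmetic cheap to map onto FPGA logic; the price is a rounding error of at most half a quantization step per value.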