Interconnect technologies for very large spiking neural networks

Thommes, Tobias

German Title: Verbindungstechnologien für sehr große spikende Neuronale Netze


Citing this document: Please cite the DOI, URN, or persistent URL rather than the URL shown in your browser's address bar, as only these identifiers are guaranteed to remain accessible in the long term.

DOI: 10.11588/heidok.00034189
URN: urn:nbn:de:bsz:16-heidok-341899

Abstract

This thesis develops a neural event communication architecture for an accelerated neuromorphic computing system that uses a packet-based, high-performance interconnection network. Existing neuromorphic computing systems mostly use highly customised interconnection networks that route single spike events directly to their destination. In contrast, the approach taken here uses a general-purpose packet-based interconnection network and accumulates multiple spike events at the source node into larger network packets bound for a common destination. This aggregation is required to optimise payload efficiency, because packet headers are large relative to a single neural spike event. The efficiency of different event aggregation strategies is examined theoretically. Important factors are the number of distinct network destinations of the events and their relative frequencies, as well as the number of available accumulation buffers. An analytical method based on Markov chains is developed and used to evaluate these aggregation strategies. In addition, some of the strategies are simulated stochastically, both to verify the analytical method and to evaluate the strategies beyond its range of applicability. Based on the results of this analysis, an optimisation strategy is proposed for mapping neural populations onto interconnected neuromorphic chips and for jointly assigning event network destinations to a set of accumulation buffers. Such an event communication architecture has been implemented on the communication FPGAs of the BrainScaleS-2 accelerated neuromorphic computing system, allowing the system to scale beyond single-chip setups. The EXTOLL network technology is used to transport and route the aggregated neural event packets with high bandwidth and low latency.
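The trade-off between header overhead and the number of accumulation buffers can be illustrated with a small stochastic simulation. The following is only a minimal sketch under stated assumptions, not the method used in the thesis: the function names, the byte sizes in the usage note, and the least-recently-used eviction policy are all hypothetical choices made for illustration.

```python
import random
from collections import OrderedDict


def payload_efficiency(events_per_packet, event_bytes, header_bytes):
    """Fraction of a packet occupied by event payload (sizes are assumptions)."""
    payload = events_per_packet * event_bytes
    return payload / (payload + header_bytes)


def mean_packet_fill(dest_probs, num_buffers, capacity, num_events, seed=0):
    """Simulate event aggregation and return the mean number of events per packet.

    Assumed policy (hypothetical, for illustration only):
    - each buffer collects events for exactly one destination,
    - a buffer is sent as a packet once it holds `capacity` events,
    - if no buffer is free for a new destination, the least recently
      used buffer is sent early as a partially filled packet.
    """
    rng = random.Random(seed)
    dests = list(range(len(dest_probs)))
    buffers = OrderedDict()  # destination -> number of buffered events
    packets = 0
    sent_events = 0
    for _ in range(num_events):
        d = rng.choices(dests, weights=dest_probs)[0]
        if d not in buffers:
            if len(buffers) == num_buffers:
                # Evict the least recently used buffer as a partial packet.
                _, fill = buffers.popitem(last=False)
                packets += 1
                sent_events += fill
            buffers[d] = 0
        buffers[d] += 1
        buffers.move_to_end(d)  # mark this buffer as most recently used
        if buffers[d] == capacity:
            # Buffer is full: send it and free the slot.
            packets += 1
            sent_events += capacity
            del buffers[d]
    return sent_events / packets if packets else 0.0
```

With at least as many buffers as destinations, every packet leaves full in this sketch, while a single buffer shared by many equally likely destinations is evicted almost every event. Feeding the resulting mean fill into `payload_efficiency` (for example with an assumed 4-byte event and 16-byte header) shows how quickly header overhead dominates when packets leave nearly empty.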
At the FPGA, a network bandwidth of up to 12 Gbit/s is usable at a maximum payload efficiency of 94 %. The latency across the network between two neuron circuits on separate chips was measured at between 1.6 μs and 2.3 μs. This latency is dominated by the path from the neuromorphic chip across the communication FPGA into the network, and back on the receiving side. Because the EXTOLL network hardware itself is clocked at a much higher frequency than the FPGAs, each additional hop through the network is expected to add only about 75 ns. To allow the arrival timestamps transmitted with every spike event to be interpreted globally, the system time counters on the FPGAs are synchronised across the network. For this purpose, the global interrupt mechanism implemented in the EXTOLL hardware is characterised and used. With it, a synchronisation accuracy of ±40 ns was measured. Finally, the thesis demonstrates the successful emulation of a neural signal propagation model distributed across two BrainScaleS-2 chips and FPGAs, using the implemented event communication architecture and the described synchronisation mechanism.
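The expected latency scaling can be written out as simple arithmetic. This is a rough sketch that assumes the measured 1.6 μs to 2.3 μs range is the baseline for zero additional hops and that each further hop adds a constant 75 ns; the function and constant names are hypothetical, and the linear model is an assumption rather than a result reported above.

```python
# Figures taken from the abstract; the linear model itself is an assumption.
MEASURED_RANGE_NS = (1600, 2300)  # measured chip-to-chip latency range
HOP_LATENCY_NS = 75               # expected cost of each additional hop


def estimated_latency_range_ns(additional_hops):
    """Estimate the (min, max) latency in ns for a path with extra network hops."""
    if additional_hops < 0:
        raise ValueError("additional_hops must be non-negative")
    extra = additional_hops * HOP_LATENCY_NS
    low, high = MEASURED_RANGE_NS
    return (low + extra, high + extra)
```

Under these assumptions, four additional hops would add 300 ns, which is small compared with the FPGA-dominated baseline.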

Translation of abstract (German)

Im Rahmen dieser Arbeit wurde eine Kommunikationsarchitektur für neuronale Spike Events in einem beschleunigten neuromorphen Rechnersystem unter Benutzung eines paketbasierten Verbindungsnetzwerks entwickelt. Bestehende neuromorphe Computersysteme nutzen meist hoch spezialisierte Verbindungsnetzwerke, bei welchen einzelne Spike Events direkt zu ihrem Ziel geroutet werden. Dagegen verwendet der Ansatz dieser Arbeit ein allgemeines paketbasiertes Hochleistungs-Verbindungsnetzwerk und akkumuliert dazu mehrere Events zu größeren Paketen, die dann zum gemeinsamen Ziel der Events gesendet werden. Dies ist notwendig, um die Daten-Nutzlast-Effizienz, bezogen auf den relativ großen Paketheader verglichen mit einem einzelnen Event, sicherzustellen. Es werden theoretische Überlegungen über die Effizienz verschiedener Strategien zur Akkumulation von Spike Events angestellt. Wichtige Faktoren sind dabei die Anzahl von vorkommenden Event-Netzwerkzielen und deren relative Häufigkeit, sowie die Anzahl verfügbarer Pufferspeicher zur Akkumulation. Basierend auf dem Konzept von Markov Ketten, wird eine analytische Methode zur Evaluation dieser Akkumulationsstrategien entwickelt. Zusätzlich werden einige dieser Strategien stochastisch simuliert, um die analytische Methode zu verifizieren und diese Strategien über die Gültigkeit der Methode hinaus zu untersuchen. Basierend auf den Ergebnissen dieser Analyse, wird eine Strategie zur Optimierung der Verteilung neuronaler Populationen über mehrere vernetzte Mikrochips hinweg, sowie der Zuordnung von Event-Netzwerkzielen zu einer Menge von Akkumulations-Pufferspeichern, vorgeschlagen. Während dieser Arbeit wurde eine solche Event-Kommunikationsarchitektur auf den Kommunikations-FPGAs des beschleunigten neuromorphen Computersystems BrainScaleS-2 implementiert. Dadurch kann dessen Nutzbarkeit über die Nutzung von Einzel-Chip-Aufbauten hinaus skaliert werden. 
Hierfür wird die EXTOLL Netzwerk Technologie genutzt, um die aggregierten neuralen Event Pakete mit hoher Bandbreite und niedriger Latenz zu übertragen. An den FPGAs ist dadurch eine Netzwerkbandbreite von bis zu 12 Gbit/s nutzbar bei einer maximalen Effizienz der Datennutzlast von 94 %. Die Latenz wurde im Rahmen dieser Arbeit in einem Bereich zwischen 1.6 μs und 2.3 μs über das Netzwerk zwischen zwei Neuronschaltungen auf separaten Mikrochips gemessen. Diese Latenz ist größtenteils dominiert durch den Pfad vom neuromorphen Chip über das Kommunikations-FPGA in das Netzwerk und zurück auf der Empfangsseite. Da die EXTOLL Netzwerkhardware selbst mit einer viel höheren Taktrate als die FPGAs betrieben wird, ist zu erwarten, dass die Netzwerklatenz in der Größenordnung von lediglich etwa 75 ns pro zusätzlichem Netzwerkschritt skaliert. Um die Ankunftszeitstempel, welche mit jedem Spike Event versendet werden, global interpretieren zu können, werden die Systemzeitzähler der FPGAs über das Netzwerk synchronisiert. Dazu wird im Rahmen dieser Arbeit der globale Interruptmechanismus des EXTOLL Netzwerks charakterisiert und verwendet. Damit konnte eine Synchronisationsgenauigkeit von ±40ns gemessen werden. Am Ende dieser Arbeit wird die erfolgreiche Emulation eines Modells zur neuronalen Signalweiterleitung, verteilt über zwei BrainScaleS-2 Chips und FPGAs hinweg, unter Verwendung der entwickelten und implementierten Event Kommunikationsarchitektur und des beschriebenen Synchronisationsmechanismus, demonstriert.

Document type:	Dissertation
Supervisor:	Schemmel, Dr. habil. Johannes
Place of Publication:	Heidelberg
Date of thesis defense:	11 December 2023
Date Deposited:	20 Dec 2023 12:12
Date:	2023
Faculties / Institutes:	The Faculty of Physics and Astronomy > Kirchhoff Institute for Physics
DDC-classification:	530 Physics