Among its many areas of expertise, our research laboratory also focuses on the efficient implementation of neural networks using advanced compiler techniques. Our work includes optimizing CPU and GPU kernels to improve performance, reduce training time, and increase the scalability of machine learning models. Drawing on systems-programming principles, we also design high-performance machine learning runtimes that ensure efficient resource management and portability across diverse hardware, from CPUs and GPUs to specialized accelerators.
This area complements our broader research pursuits in high-performance computing, compiler optimizations, and systems-level innovations aimed at accelerating machine learning and artificial intelligence applications.
Potential topics:
I. Core Computation Optimization
Automated Vectorization: Automatically accelerate neural network operations on standard processors by exploiting SIMD (parallel) instruction sets.
Data Layout for Cache Efficiency: Optimize how data is stored in memory to maximize performance, especially with large models.
Multi-threading & Parallelism: Efficiently distribute training tasks across multiple processor cores.
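The payoff of vectorization can be sketched in a few lines. The ReLU kernel below is purely illustrative (not code from our projects): it contrasts a per-element scalar loop with a single vectorized call, which NumPy executes in compiled loops that modern builds accelerate with SIMD instructions.

```python
import numpy as np

def relu_scalar(x):
    # Naive per-element loop: one interpreted operation per entry.
    out = np.empty_like(x)
    for i in range(x.size):
        out.flat[i] = x.flat[i] if x.flat[i] > 0 else 0.0
    return out

def relu_vectorized(x):
    # One fused call over the whole array: the work happens in
    # compiled, SIMD-friendly inner loops instead of Python.
    return np.maximum(x, 0.0)

x = np.array([[-1.0, 2.0], [3.0, -4.0]])
assert np.array_equal(relu_scalar(x), relu_vectorized(x))
```

The same contrast holds at the instruction level: auto-vectorizing compilers turn simple, dependence-free loops like the scalar version into SIMD code, which is exactly what this research line aims to automate for neural network operators.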
II. Accelerated Execution
Custom Kernels: Design highly optimized code for common neural network layers.
Memory Access Optimization: Maximize data transfer speeds during training and inference.
Lower Precision: Use reduced-precision numbers to accelerate inference while preserving accuracy.
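As a concrete illustration of reduced precision, the sketch below applies symmetric per-tensor int8 quantization, a standard scheme rather than a specific implementation of ours: weights are mapped into the int8 range, and a single scale factor is kept to recover approximate float values.

```python
import numpy as np

def quantize_int8(w):
    # Symmetric per-tensor quantization: one scale maps the float
    # range [-max|w|, +max|w|] onto the int8 range [-127, 127].
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover approximate float values from the int8 representation.
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.0, 0.25], dtype=np.float32)
q, s = quantize_int8(w)
assert np.max(np.abs(w - dequantize(q, s))) < 0.01
```

Inference engines then run the heavy arithmetic directly on the int8 values (with the scales folded in at the end), trading a small, bounded rounding error for much higher throughput and lower memory traffic.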
III. Performant Runtimes
Automatic Code Generation: Create compilers that automatically generate optimized code for different processing units.
Dynamic Resource Allocation: Efficiently manage computing resources based on task needs.
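A toy illustration of runtime code generation, using hypothetical names (`generate_axpy` is invented for this sketch): the "compiler" emits source text specialized to a constant and compiles it on the fly, the same bake-parameters-into-the-kernel idea that ML code generators apply at much larger scale when targeting different processing units.

```python
def generate_axpy(alpha):
    # Emit source for a kernel with the constant alpha baked in,
    # then compile it at runtime. A real code generator would emit
    # optimized machine code per target device instead of Python.
    src = (
        f"def axpy(x, y):\n"
        f"    return [{alpha} * xi + yi for xi, yi in zip(x, y)]\n"
    )
    namespace = {}
    exec(src, namespace)  # compile the generated source
    return namespace["axpy"]

axpy2 = generate_axpy(2.0)
assert axpy2([1.0, 2.0], [3.0, 4.0]) == [5.0, 8.0]
```

Specializing kernels to known constants (scales, strides, tensor shapes) lets the downstream optimizer fold and simplify work that a generic kernel would redo on every call.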
IV. Compiler Innovations
Automated Loop Optimization: Use advanced compilation techniques to speed up neural network loops.
Graph-Level Optimizations: Optimize the entire structure of a neural network during compilation.
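Loop tiling, one of the classic transformations such compilers apply automatically, can be sketched by hand. The blocked matrix multiply below is illustrative only (`matmul_tiled` is a name invented here): it processes cache-sized sub-blocks so that each operand block is reused while it is still resident in cache.

```python
import numpy as np

def matmul_tiled(a, b, tile=32):
    # Blocked (tiled) matmul: iterate over tile x tile sub-blocks so
    # each block of a and b is reused many times from cache before
    # being evicted. Slicing past the edge is safe in NumPy.
    n, k = a.shape
    _, m = b.shape
    c = np.zeros((n, m), dtype=np.result_type(a, b))
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                c[i0:i0+tile, j0:j0+tile] += (
                    a[i0:i0+tile, p0:p0+tile] @ b[p0:p0+tile, j0:j0+tile]
                )
    return c

rng = np.random.default_rng(0)
a = rng.standard_normal((48, 40))
b = rng.standard_normal((40, 56))
assert np.allclose(matmul_tiled(a, b, tile=16), a @ b)
```

A loop-optimizing compiler performs this restructuring (plus interchange, unrolling, and fusion) automatically, choosing tile sizes to match the target's cache hierarchy.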
V. Novel Hardware Support
Integration with Emerging Accelerators: Adapt software to work efficiently with new specialized hardware.
Hardware-Aware Optimization: Tailor optimizations to the unique characteristics of different processing units.
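As a minimal sketch of hardware-aware tuning, the hypothetical heuristic below (invented for illustration; real autotuners use far richer hardware models) derives a matmul tile size from an assumed cache capacity, so the same kernel can be parameterized differently for each device.

```python
import math

def pick_tile(cache_bytes, dtype_size=4):
    # Largest square tile such that three tiles (an A block, a B
    # block, and a C block) fit simultaneously in the given cache.
    # This is a deliberately simple model of one hardware parameter.
    return math.isqrt(cache_bytes // (3 * dtype_size))

# A device with a 32 KiB L1 data cache gets a larger tile than one
# with 16 KiB, from the identical kernel template.
assert pick_tile(32 * 1024) > pick_tile(16 * 1024)
```

The broader research question is automating exactly this kind of decision, across many more parameters (vector width, shared-memory size, memory bandwidth), for each supported processing unit.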
Associated Researchers:
– Diego Bellani