Research

My research interests lie in analog in-memory computing, as well as optimization and machine learning in distributed or decentralized networks.

Training on Analog In-memory Computing Hardware

Training modern deep models requires processing vast numbers of weights and activations that must be shuttled between memory and processor, creating the "von Neumann bottleneck" that limits both speed and energy efficiency. In this context, analog in-memory computing (AIMC) is a computing paradigm that exploits the physical properties of emerging non-volatile memory (NVM) devices to perform computations directly within the memory array. The core idea is to harness the analog storage and processing capabilities of NVM devices to execute matrix-vector multiplication (MVM) operations in a highly parallel and energy-efficient manner.
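To make the idea concrete, the following minimal NumPy sketch models a crossbar MVM with additive read noise and ADC quantization. The noise level and bit width are illustrative assumptions, not parameters of any specific device.

```python
import numpy as np

def analog_mvm(W, x, noise_std=0.05, adc_bits=8):
    """Toy model of a crossbar matrix-vector multiply.

    W         -- weight matrix stored as device conductances
    x         -- input vector applied as voltages
    noise_std -- relative std of additive read noise (illustrative value)
    adc_bits  -- output ADC resolution (illustrative value)
    """
    ideal = W @ x  # the computation the crossbar performs in one analog step
    noisy = ideal + noise_std * np.abs(ideal).max() * np.random.randn(*ideal.shape)
    # Quantize the noisy analog result as an output ADC would.
    scale = np.abs(noisy).max() + 1e-12
    levels = 2 ** (adc_bits - 1) - 1
    return np.round(noisy / scale * levels) / levels * scale

W = np.random.randn(64, 128)
x = np.random.randn(128)
print(np.linalg.norm(analog_mvm(W, x) - W @ x))  # deviation from the exact MVM
```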

Training on AIMC hardware, while promising in terms of energy efficiency and speed, faces several significant challenges that stem from hardware imperfections. One major difficulty is the inherent variability and noise of analog hardware, which lead to inaccurate computations. In addition, the precision of analog computation is generally lower than that of its digital counterpart, making it harder to reach the accuracy required for training deep neural networks. Another challenge is the limited endurance and retention of analog memory devices, which degrade over time and affect the reliability of the training process. Furthermore, integrating analog components with existing digital systems requires sophisticated design and calibration techniques to ensure compatibility and optimal performance.

To address these challenges, my research focuses on developing novel algorithms and techniques that enable deep neural networks to be trained effectively on AIMC hardware.
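As a rough illustration of one common direction, the PyTorch-style sketch below injects weight noise into the forward pass so that the network learns to tolerate analog non-idealities. The noise model and its magnitude are placeholder assumptions for illustration only; they do not describe a specific device or the actual algorithms under study.

```python
import torch
import torch.nn as nn

class NoisyLinear(nn.Module):
    """Linear layer whose forward pass perturbs the weights,
    mimicking analog read noise during training (illustrative model)."""

    def __init__(self, in_features, out_features, noise_std=0.02):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.noise_std = noise_std

    def forward(self, x):
        w = self.linear.weight
        if self.training:
            # Additive Gaussian perturbation, scaled to the weight magnitude.
            w = w + self.noise_std * w.abs().max() * torch.randn_like(w)
        return nn.functional.linear(x, w, self.linear.bias)

model = nn.Sequential(NoisyLinear(784, 256), nn.ReLU(), NoisyLinear(256, 10))
out = model(torch.randn(32, 784))  # forward pass with noise injected
```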


Distributed Optimization

Distributed or decentralized learning, in which a collection of devices (workers) collaborates to train a machine learning model, serves as a promising solution in the following scenarios:
  • Accelerating large-scale machine learning through parallel computation in data centers.
  • Exploiting the potential value of large-volume, heterogeneous, and privacy-sensitive data located on geographically distributed devices, as in federated learning (FL) or multi-agent reinforcement learning (MARL).
My recent work focuses on robust distributed optimization algorithms, one of the central topics in distributed networks. Despite these well-known advantages, the distributed nature of such networks makes them vulnerable to workers' misbehavior, especially in FL scenarios. Such misbehavior, including malicious misleading, data poisoning, and backdoor injection, can be abstracted into the so-called Byzantine attack model, in which some workers (attackers) send arbitrary malicious messages to others. To meet this robustness requirement, I dedicate myself to designing and analyzing Byzantine-resilient optimization algorithms.
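As a simple illustration of the robust-aggregation idea (not the specific algorithms from my papers), the sketch below replaces the usual mean of workers' gradients with a coordinate-wise median, which tolerates a minority of arbitrarily corrupted messages; the worker counts and gradient values are made up for the example.

```python
import numpy as np

def mean_aggregate(grads):
    """Plain averaging -- a single Byzantine worker can shift it arbitrarily."""
    return np.mean(grads, axis=0)

def median_aggregate(grads):
    """Coordinate-wise median -- robust while honest workers form a majority."""
    return np.median(grads, axis=0)

honest = [np.ones(4) for _ in range(8)]           # honest workers send the true gradient
byzantine = [1e6 * np.ones(4) for _ in range(3)]  # attackers send arbitrary large values
grads = np.stack(honest + byzantine)

print(mean_aggregate(grads))    # pulled far away from the honest gradient
print(median_aggregate(grads))  # stays at the honest value [1. 1. 1. 1.]
```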



Journals

Conferences