Training Deep Networks like Biological Superheroes

Published on March 21, 2022

In the quest to improve self-supervised learning (SSL) in deep networks, researchers have developed biologically plausible training mechanisms that mimic the activity of neurons in the brain. These mechanisms avoid complex computations and rely on asymmetric connections between units (feedback weights need not mirror feedforward weights), just as superheroes each have their own unique powers. SSL with a contrastive loss requires no labeled data and makes the network robust to perturbations of objects, such as movement and changes in lighting. To make this work, the researchers propose a contrastive hinge-based loss that involves only simple local computations rather than ratios and inner products. They also study two alternatives to backpropagation: difference target propagation (DTP) and layer-wise learning. DTP uses target-based local losses and a Hebbian learning rule to avoid the symmetric weight problem of backpropagation. Layer-wise learning updates the layers one at a time, either sequentially or in random order, with each layer connected directly to a layer that computes the loss. Convolutional neural networks (CNNs) trained with SSL and these alternatives achieved performance comparable to standard backpropagation when the learned embeddings were evaluated with a downstream linear classifier.
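To make the idea of a hinge-based contrastive loss concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the loss from the paper: the use of L1 distance, the function names, and both margin values are hypothetical choices made for this example. What it shows is that the loss can be built from absolute differences and thresholds alone, with no ratios, normalization, or inner products.

```python
def l1_distance(u, v):
    """Sum of absolute coordinate differences (no inner products or division)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def contrastive_hinge_loss(views_a, views_b, pos_margin=1.0, neg_margin=3.0):
    """Hinge loss over a batch of embedding pairs.

    views_a[i] and views_b[i] embed two perturbed views of the same image
    (a positive pair); views_a[i] and views_b[j] with i != j form a
    negative pair. Margins are illustrative assumptions.
    """
    loss = 0.0
    for i, a in enumerate(views_a):
        for j, b in enumerate(views_b):
            d = l1_distance(a, b)
            if i == j:
                # Positive pair: penalized only when farther apart than pos_margin.
                loss += max(0.0, d - pos_margin)
            else:
                # Negative pair: penalized only when closer than neg_margin.
                loss += max(0.0, neg_margin - d)
    return loss


# Two images, two views each, embedded in 2-D.
well_separated = contrastive_hinge_loss(
    [[0.0, 0.0], [4.0, 4.0]], [[0.5, 0.0], [4.0, 3.0]]
)
poorly_separated = contrastive_hinge_loss(
    [[0.0, 0.0], [4.0, 4.0]], [[2.0, 0.0], [1.0, 0.0]]
)
```

In the first call every pair already satisfies its margin, so the loss is zero; in the second, the positive pairs are too far apart and one negative pair is too close, so the loss is positive.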

We develop biologically plausible training mechanisms for self-supervised learning (SSL) in deep networks. Specifically, by biologically plausible training we mean that (i) all weight updates are based on the current activity of pre-synaptic units and on the current activity, or activity retrieved from short-term memory, of post-synaptic units, including at the top-most error-computing layer, (ii) complex computations such as normalization, inner products and division are avoided, (iii) connections between units are asymmetric, and (iv) most learning is carried out in an unsupervised manner. SSL with a contrastive loss satisfies the fourth condition, as it does not require labeled data, and it introduces robustness to observed perturbations of objects, which occur naturally as objects or observers move in 3D and as lighting varies over time. We propose a contrastive hinge-based loss whose error involves simple local computations satisfying (ii), as opposed to the standard contrastive losses employed in the literature, which do not lend themselves easily to implementation in a network architecture because they involve ratios and inner products. Furthermore, we show that learning can be performed with one of two more plausible alternatives to backpropagation that satisfy conditions (i) and (ii). The first is difference target propagation (DTP), which trains network parameters using target-based local losses and employs a Hebbian learning rule, thus overcoming the biologically implausible symmetric weight problem in backpropagation. The second is layer-wise learning, where each layer is directly connected to a layer computing the loss error. The layers are either updated sequentially in a greedy fashion (GLL) or in random order (RLL), and each training stage involves a single hidden layer network. The backpropagation through one layer needed for each such network can be replaced either with fixed random feedback weights (RF) or with updated random feedback weights (URF), as in Amit (2019). Both methods avoid the symmetric weight issue of backpropagation. By training convolutional neural networks (CNNs) with SSL and DTP, GLL or RLL, we find that our proposed framework achieves performance comparable to standard BP learning in downstream linear classifier evaluation of the learned embeddings.
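The random feedback idea for a single hidden layer network can be sketched in a few lines of Python. This is a toy illustration, not the authors' implementation: the squared-error loss, the ReLU hidden units, and all names (`rf_step`, `feedback`, `update_feedback`) are assumptions made for the example. In particular, the URF branch assumes that the feedback weights receive the same increment as the matching forward weights, which the abstract does not spell out.

```python
def forward(x, w_hidden, w_out):
    """One hidden ReLU layer followed by a linear output layer."""
    pre = [sum(w * xi for w, xi in zip(row, x)) for row in w_hidden]
    hid = [p if p > 0 else 0.0 for p in pre]
    out = [sum(w * h for w, h in zip(row, hid)) for row in w_out]
    return pre, hid, out


def rf_step(x, target, w_hidden, w_out, feedback, lr=0.5, update_feedback=False):
    """One training step with random feedback weights.

    feedback[k][o] carries output error o back to hidden unit k. Standard
    backpropagation would use w_out[o][k] here instead, which is the
    symmetric weight requirement this scheme avoids.
    """
    pre, hid, out = forward(x, w_hidden, w_out)
    # Error of a squared-error loss (an assumption made for this toy example).
    err = [o - t for o, t in zip(out, target)]
    # Hidden error signal: routed through the feedback weights, gated by ReLU.
    delta = [
        sum(b * e for b, e in zip(feedback[k], err)) * (1.0 if pre[k] > 0 else 0.0)
        for k in range(len(pre))
    ]
    # Each update uses only pre-synaptic activity and a post-synaptic error signal.
    new_w_out = [[w - lr * e * h for w, h in zip(row, hid)] for row, e in zip(w_out, err)]
    new_w_hidden = [[w - lr * d * xi for w, xi in zip(row, x)] for row, d in zip(w_hidden, delta)]
    if update_feedback:
        # URF (assumed form): feedback gets the same increment as the forward weight.
        new_feedback = [[b - lr * e * h for b, e in zip(row, err)] for row, h in zip(feedback, hid)]
    else:
        # RF: feedback weights stay fixed at their random initial values.
        new_feedback = [row[:] for row in feedback]
    return new_w_hidden, new_w_out, new_feedback


x, target = [1.0, 2.0], [0.0]
w_hidden = [[0.5, 0.0], [-1.0, 0.0]]
w_out = [[2.0, 1.0]]
feedback = [[-1.0], [3.0]]  # deliberately unrelated to w_out
rf_result = rf_step(x, target, w_hidden, w_out, feedback)
urf_result = rf_step(x, target, w_hidden, w_out, feedback, update_feedback=True)
```

In this small example the first hidden unit is updated using the feedback weight -1.0 rather than the forward weight 2.0, so its weights move in a different direction than they would under backpropagation. Whether learning still succeeds then depends on how the forward and feedback weights interact over many steps, which is what the RF and URF variants address.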

