https://blog.tensorflow.org/2020/02/speeding-up-neural-networks-using-tensornetwork-in-keras.html

https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWgZrAi6tMBWKzS9lv_7_7sChZLB0c-f_rupWRVMdeThnierScidQ218IT0xBAyGY0dSeO4MxmrjZZqOIrNIda1OQlEZToaWUrU0BKeB_i2J5NKPSS7oEJYkU9xKpVA5g-7GZl-f-hOh8/s1600/tensorNetworkLogobig.jpg

February 12, 2020 —
*Posted by Marina Munkhoeva, PhD student at Skolkovo Institute of Science and Technology and AI Resident at Alphabet's X, Chase Roberts, Research Engineer at Alphabet's X, and Stefan Leichenauer, Research Scientist at Alphabet's X*

Introduction
In this post, we’re going to talk about TensorNetwork, and how it can be used to supercharge a feed-forward neural network in TensorFlow. Tensor…

Speeding up neural networks using TensorNetwork in Keras

Getting started with TensorNetwork is easy. The library can be installed using pip:

`pip install tensornetwork`

The example code we’ll discuss in this post is also available in a Colab.```
Dense = tf.keras.layers.Dense
fc_model = tf.keras.Sequential(
[
tf.keras.Input(shape=(2,)),
Dense(1024, activation=tf.nn.swish),
Dense(1024, activation=tf.nn.swish),
Dense(1, activation=None)])
```

This network has a two-dimensional input, one-dimensional output, and contains two hidden layers with 1024 neurons each. The total number of parameters (including weights and biases) is (2+1)*1024 + (1024+1)*1024 + (1024 +1)*1 = 1,053,697. The bulk of these parameters, (1024+1)*1024 = 1,049,600 of them, are associated with the second hidden layer. We’re going to replace those with a TN layer that is less than 1% of the original size.The 1,049,600 parameters of the hidden layer consist of 1024*1024 = 1,048,576 weights arranged in a 1024-by-1024 matrix, plus 1024 biases. Our savings are going to come from replacing the 1024-by-1024 matrix of weights by a tensor network. To preserve compatibility with the rest of the network, we don’t change the fact that there are 1024 inputs and 1024 outputs. That way we can treat the TN layer as a drop-in replacement for the original layer.

Ordinarily, we think of the 1024 inputs to the layer as a simple array X of shape (1024,). These inputs are multiplied by the weight matrix W to produce a new vector Y. The next step is to apply a nonlinearity to Y, but we won’t focus on that because the TN modifications only affect the linear part of the layer. In tensor network notation, we would write the computation of Y as follows:

Ordinarily, the weights W of a neural network layer act on the inputs X as a matrix multiplication, producing the output Y = WX. Here we illustrate that matrix multiplication diagrammatically. |

Reshaping the input X from a vector of shape (1024,) to an array of shape (32,32) is the first step of the tensorization process. In principle any reshape is allowed, we just chose this particularly simple one for our example. |

Diagrammatic representation of the tensorized weight multiplication. The output Y can be reshaped into a vector if needed. |

The result of using a TN layer is that we’ve replaced the 1,048,576 weights of the fully-connected weight matrix with the 2*(32*32*2) = 4,096 parameters of the tensor network. That’s a tremendous reduction! Even after accounting for the other layers, the total model size is down to 9,217 parameters, compared to the original 1,053,697.

For this simple example, it would not be too hard to translate the two-core tensor network into an einsum expression. However, for a tensor network with more cores and more complex connectivity, debugging or extending the einsum can be quite difficult. Thus, we instead use the TN library for a more object oriented way of constructing the network. In the code examples below we’ll stick to the simple two-core example to make things easy to follow, but at the end we’ll come back to discuss other possibilities.

```
import tensorflow as tf
import tensornetwork as tn
class TNLayer(tf.keras.layers.Layer):
def __init__(self):
super(TNLayer, self).__init__()
# Create the variables for the layer.
self.a_var = tf.Variable(tf.random.normal(
shape=(32, 32, 2), stddev=1.0/32.0),
name="a", trainable=True)
self.b_var = tf.Variable(tf.random.normal(shape=(32, 32, 2), stddev=1.0/32.0),
name="b", trainable=True)
self.bias = tf.Variable(tf.zeros(shape=(32, 32)), name="bias", trainable=True)
def call(self, inputs):
# Define the contraction.
# We break it out so we can parallelize a batch using
# tf.vectorized_map (see below).
def f(input_vec, a_var, b_var, bias_var):
# Reshape to a matrix instead of a vector.
input_vec = tf.reshape(input_vec, (32,32))
# Now we create the network.
a = tn.Node(a_var, backend="tensorflow")
b = tn.Node(b_var, backend="tensorflow")
x_node = tn.Node(input_vec, backend="tensorflow")
a[1] ^ x_node[0]
b[1] ^ x_node[1]
a[2] ^ b[2]
# The TN should now look like this
# | |
# a --- b
# \ /
# x
# Now we begin the contraction.
c = a @ x_node
result = (c @ b).tensor
# To make the code shorter, we also could've used Ncon.
# The above few lines of code is the same as this:
# result = tn.ncon([x, a_var, b_var], [[1, 2], [-1, 1, 3], [-2, 2, 3]])
# Finally, add bias.
return result + bias_var
# To deal with a batch of items, we can use the tf.vectorized_map
# function.
# https://www.tensorflow.org/api_docs/python/tf/vectorized_map
result = tf.vectorized_map(
lambda vec: f(vec, self.a_var, self.b_var, self.bias), inputs)
return tf.nn.relu(tf.reshape(result, (-1, 1024)))
```

In this example, we hard-coded the size of the layer, but that is fairly easy to adjust. Having made this layer, we can use it as part of a Keras model very simply:```
tn_model = tf.keras.Sequential(
[
tf.keras.Input(shape=(2,)),
Dense(1024, activation=tf.nn.relu),
# Here use a TN layer instead of the dense layer.
TNLayer(),
Dense(1, activation=None)
]
)
```

The model can be trained as usual using the Keras `fit`

method.A four-core tensor network that we used to tensorize the fully-connected layers of a Transformer model. |

There are a number of other types of tensor networks and tensor decomposition which are applicable to various types of layers in neural networks, and we won’t make an attempt to give an exhaustive reference list here. But for a start, check out Khrulkov et al. for tensor networks in embedding layers, Ma et al. for attention layers, and Lebedev et al. for convolutional layers.

Tensorization of neural networks is still in its infancy. There are still many unanswered questions, and many experiments to try. All of the tensor networks considered in this post are of the tensor train type, known in physics as an MPO (see also this paper), but other well-studied tensor networks like PEPS and MERA could also be used. The TensorNetwork library was designed to handle any possible tensor network, and specifically to do so in the context of machine learning as we have presented here. We look forward to seeing all of the new and exciting results that you can cook up using it.

Next post

Speeding up neural networks using TensorNetwork in Keras

February 12, 2020
—
*Posted by Marina Munkhoeva, PhD student at Skolkovo Institute of Science and Technology and AI Resident at Alphabet's X, Chase Roberts, Research Engineer at Alphabet's X, and Stefan Leichenauer, Research Scientist at Alphabet's X*

Introduction
In this post, we’re going to talk about TensorNetwork, and how it can be used to supercharge a feed-forward neural network in TensorFlow. Tensor…

Build, deploy, and experiment easily with TensorFlow