Basic usage

In this tutorial we guide you through the basic usage of this package. This tutorial is also available as a Jupyter notebook, and can for example be run on google Colab by clicking this link.

Installation

This library can be installed using pip by running pip install tt-sketch. Alternatively in a Jupyter notebook or on Colab we can run

!pip install --quiet tt-sketch

You can also install the library by cloning the repository as follows:

git clone git@github.com:RikVoorhaar/tt-sketch.git
cd tt-sketch
pip install .

Tensor train decompositions

The purpose of this library is to quickly and easily compute low-rank Tensor Train (TT) decompositions of a given tensor. This is for example useful if we want to compress tensorial data, or to convert from some other low-rank tensor format to a TT format because of some of its attractive properties.

A TT has a shape \((n_1, n_2, \ldots, n_d)\) and a rank \((r_1,\ldots,r_{d-1})\) associated to it. To create a random TT of specified rank and shape we can use the following command:

[1]:

from tt_sketch.tensor import TensorTrain

shape = (10, 5, 6, 8, 5)
rank = (4, 6, 3, 2)
tt = TensorTrain.random(shape, rank)
tt

[1]:

<Tensor train of shape (10, 5, 6, 8, 5) with rank (4, 6, 3, 2) at 0x7f2af04b10d0>

We can convert a TT into a numpy array as shown below. Note that this is usually not something you want to do, since a TT uses far less memory. Below we print the number of floating point numbers required for storing both the TT and the equivalent numpy array.

[2]:

tt_numpy = tt.to_numpy()
print(tt_numpy.shape)
print(tt_numpy.size)
print(tt.size)

(10, 5, 6, 8, 5)
12000
326

A TT is essentially a list of cores which are order-3 tensors of shape \((r_{\mu-1},n_\mu,r_\mu)\). We can access them using the .cores attribute:

[3]:

cores = tt.cores
print([C.shape for C in cores])

[(1, 10, 4), (4, 5, 6), (6, 6, 3), (3, 8, 2), (2, 5, 1)]

We can initialize a TT directly from a list of cores.

[4]:

import numpy as np

C1 = np.random.normal(size=(1, 5, 3))
C2 = np.random.normal(size=(3, 6, 4))
C3 = np.random.normal(size=(4, 8, 1))
TensorTrain([C1, C2, C3])

[4]:

<Tensor train of shape (5, 6, 8) with rank (3, 4) at 0x7f2a20957760>

While there is some basic arithmetic we can do with TTs, the main point of interest for us is to convert tensors to the TT format. Before we consider that problem we first introduce some other tensor formats.

Tensor formats

Other than TTs, the tt_sketch.tensor module has support for some other tensor formats. We list them here.

Sparse tensors

A sparse tensor \(\mathcal T\) consists of a list of indices pointing to the location of non-zero entries, and a list of values of these non-zero entries. Below we construct a sparse tensor with 100 non-zero uniformly distributed elements.

[5]:

from tt_sketch.tensor import SparseTensor

nnz = 100
shape = (10, 5, 6, 8)

# Generate `nnz` random indices and entries
total_size = np.prod(shape)
indices_flat = np.random.choice(total_size, size=nnz, replace=False)
indices = np.unravel_index(indices_flat, shape)
entries = np.random.uniform(size=nnz)

sparse = SparseTensor(shape, indices, entries)

# Alternative method:
# sparse = SparseTensor.random(shape, nnz)

sparse

[5]:

<Sparse tensor of shape (10, 5, 6, 8) with 100 non-zero entries at 0x7f2a20957c10>

It can be converted to a dense tensor as before using .to_numpy(). Again, since this is a compressed tensor format, this can drastically increase the memory usage. Note that in this case we need to store 5 numbers for each non-zero entry of sparse; 4 for the index, and 1 for the value. This is akin to the COO format for sparse matrices.

[6]:

sparse_numpy = sparse.to_numpy()
print(sparse_numpy.shape)
print(sparse_numpy.size)
print(sparse.size)

(10, 5, 6, 8)
2400
500

CP tensors

A CP (a.k.a. CANDECOMP/PARAFAC) tensor is a sum of \(N\) rank-1 tensors. Each of these rank one tensors is represented by a tuple of vectors: \(v_1\otimes v_2\otimes \cdots\otimes v_d\). For each mode of the tensor, all these \(N\) vectors are stored in an \(n_\mu\times N\) matrix \(V_\mu\). Below we create a random rank 100 CP tensor. All matrices \(V_\mu\) are random normal.

[7]:

from tt_sketch.tensor import CPTensor

shape = (7, 5, 6, 20)
rank = 100

cores = [np.random.normal(size=(n, rank)) for n in shape]
cp = CPTensor(cores)

# Alternative method:
# cp = CPTensor.random(shape, rank)

cp

[7]:

<CP tensor of shape (7, 5, 6, 20) and rank 100 at 0x7f2af062fd30>

Tucker tensors

A Tucker tensor consists of a dense core tensor \(\mathcal C\) of shape \(s_1\times\cdots\times s_d\), and a collection of factor matrices \(U_\mu\) of shape \(s_\mu\times n_\mu\). This forms a \(n_1\times\cdots\times n_d\) tensor through the product \(\mathcal T = (U_1\otimes\cdots\otimes U_d)\mathcal C\). Often the matrices \(U_\mu\) are assumed to have orthogonal rows (so that they represent an orthogonal basis of a subspace), but this is not necessary for our purposes. We can create a Tucker tensors as shown below:

[8]:

from tt_sketch.tensor import TuckerTensor

shape = (7, 4, 9, 4, 12)
rank = (3, 4, 2, 2, 3)

factor_matrices = [np.random.normal(size=(r, n)) for r, n in zip(rank, shape)]
core = np.random.normal(size=shape)
tucker = TuckerTensor(factor_matrices, core)

# Alternative method:
# tucker = TuckerTensor.random(shape, rank)

tucker

[8]:

<Tucker tensor of shape (7, 4, 9, 4, 12) and rank (3, 4, 2, 2, 3) at 0x7f2af043a070>

Tensor sums

One of the major advantages of our TT sketching algorithms is that they work very well for sums of arbitrary tensors (with the same shape). Taking sums of tensors is easy; the + operator is overloaded to create a tt_sketch.tensor.TensorSum object when taking the sum of any tensors. This is simply a container with a list of tensors. For example, let’s add together a sparse tensor, a CP, and a TT together:

[9]:

shape = (8, 8, 8, 4, 5)
cp = CPTensor.random(shape, 100)
sparse = SparseTensor.random(shape, 1000)
tt = TensorTrain.random(shape, 10)

tensor_sum = cp + sparse + tt
tensor_sum

[9]:

<Sum of 3 tensors of shape (8, 8, 8, 4, 5) at 0x7f2af076ea90>

Multiplication by scalars is also supported:

[10]:

cp * 0.1 - tt + sparse * 1e-9

[10]:

<Sum of 3 tensors of shape (8, 8, 8, 4, 5) at 0x7f2a20957b80>

HMT sketch

This library implements three similar sketching algorithms for finding approximate TT decompositions; orthogonal_sketch, stream_sketch and hmt_sketch. The former method is unpublished because it offers little advantage over the other two methods, and therefore we will not cover it in this tutorial either. Perhaps the easiest to use of the methods is the TT-HMT sketching procedure tt_sketch.sketch.hmt_sketch. In its most basic usage, we just need to supply a tensor, and a desired approximation rank. As a demonstration, we will approximate the tensor_sum defined above.

[11]:

from tt_sketch.sketch import hmt_sketch

rank = (2, 3, 4, 5)
tt_sketched = hmt_sketch(tensor_sum, rank)
tt_sketched

[11]:

<Tensor train of shape (8, 8, 8, 4, 5) with rank (2, 3, 4, 5) at 0x7f2a20957310>

Note however that this is not a very good approximation, because the tensor we are trying to approximate cannot be accurately represented by a TT of this low rank. We can check the quality of the approximation by computing the relative error:

[12]:

tt_sketched.error(tensor_sum, relative=True)

[12]:

0.9900753801070363

The relative error is quite close to 1, but for a good approximation it should be close to zero.

As a comparison, we have also implemented the (much more expensive) classical TT-SVD algorithm, and it also fails to find a useful approximation of this tensor:

[13]:

from tt_sketch.tt_svd import tt_svd

tt_approx = tt_svd(tensor_sum, rank)
tt_approx.error(tensor_sum, relative=True)

[13]:

0.9754118198082177

A tensor that is very easy to approximate with a TT is, of course, a TT. Reconstructing a TT using the sketching algorithm produces very accurate results:

[14]:

shape = (4, 5, 6, 8, 5)
rank = (4, 6, 3, 2)
tt = TensorTrain.random(shape, rank)

tt_sketched = hmt_sketch(tt, rank)

tt_sketched.error(tt, relative=True)

[14]:

4.358868417360322e-14

It is actually not necessary to supply rank as a tuple. If we provide a single integer then a constant rank is assumed.

[15]:

hmt_sketch(tt, 8)

[15]:

<Tensor train of shape (4, 5, 6, 8, 5) with rank (4, 8, 8, 5) at 0x7f2af51f0100>

Note that in this case the rank of the approximate TT is not (8, 8, 8, 8). This is because the first and last dimensions of the tensor are less than the rank 8, and increasing the rank past the dimension will not improve the quality any further, and the rank is therefore automatically truncated. The maximum value for the second left_rank[1] would in this case be 4*5=20, which is more than 8, so no further truncation happens.

Choosing DRM types

The sketching is done by contracting the tensor with a Dimension Reduction Matrix (DRM). By default we use DRMs obtained through partial contractions of a TT (i.e. the class TensorTrainDRM), but other options are available as well. In particular for sparse tensors it makes sense to use a Gaussian DRM.

For example, below we sketch a sparse tensor using a Gaussian DRM (tt_sketched2). Typically choosing a different DRM will not result in drastically different performance, although it may reduce the variance in the quality of the approximation.

[16]:

from tt_sketch.drm import SparseGaussianDRM

nnz = 100
shape = (10, 10, 10, 10)
sparse_tensor = SparseTensor.random(shape, nnz)
sparse_tensor.entries *= np.logspace(0, -50, nnz)  # make entries decay fast

tt_sketched1 = hmt_sketch(sparse_tensor, rank=10)
print(tt_sketched1.error(sparse_tensor, relative=True))

tt_sketched2 = hmt_sketch(
    sparse_tensor,
    rank=10,
    drm_type=SparseGaussianDRM,
)
print(tt_sketched2.error(sparse_tensor, relative=True))

1.166825938858844e-05
2.0489501226846625e-05

Streaming sketch

The second sketching algorithm is the Streaming TT approximation. While it typically produces errors that are slightly worse than the TT-HMT method, it has the advantage of being a streaming algorithm. This means that we can cheaply update the sketch of a tensor. If we don’t care about this feature, the usage is very similar to hmt_sketch, except for one major difference. Since a two-sided sketch is performed, we need to supply both a left_rank and a right_rank.

[17]:

from tt_sketch.sketch import stream_sketch

tt_sketched = stream_sketch(sparse_tensor, left_rank=10, right_rank=15)
tt_sketched

[17]:

<Sketched tensor train of shape (10, 10, 10, 10) with left-rank (10, 10, 10) and right-rank (15, 15, 15) at 0x7f2af0635a90>

The result of the sketch is not a TensorTrain object, but rather a SketchedTensorTrain object. It can easily be converted to a TensorTrain:

[18]:

tt = tt_sketched.to_tt()
print(tt)
tt.error(sparse_tensor, relative=True)

<Tensor train of shape (10, 10, 10, 10) with rank (10, 10, 10) at 0x7f2a203274f0>

[18]:

4.523995632718617e-05

Updating a SketchedTensorTrain is easy; we simply add a new tensor to it:

[19]:

other_sparse_tensor = SparseTensor.random(shape, 10)*1e-6
sparse_tensor_sum = sparse_tensor + other_sparse_tensor

# Updating an existing sketch
tt_sketched_updated = tt_sketched + other_sparse_tensor
print(tt_sketched_updated.error(sparse_tensor_sum, relative=True))

# Sketching the sum of two tensors directly
tt_sketched2 = stream_sketch(sparse_tensor_sum, left_rank=10, right_rank=15)
print(tt_sketched2.error(sparse_tensor_sum, relative=True))

4.327928793048581e-05
9.479464055761044e-05

This computes a sketch of other_sparse_tensor using the same DRMs and ranks. The resulting sketch is completely equivalent to sketching the sum of tensors directly.

An important restriction is that all entries of left_rank need to be smaller than all entries of right_rank (or vice versa). Just as before we can also use other DRMs. If we use SparseGaussianDRM, we can also adaptively increase the rank of our approximation. This is useful if we are not sure in advance what tt-rank we should choose. For technical reasons this is not possible for TensorTrainDRM.

For example below we first try to approximate sparse_tensor as a rank 5 TT, but then realize this is not good enough, and we increase the rank to (10, 15, 10).

[20]:

tt_sketched = stream_sketch(
    sparse_tensor,
    left_rank=10,
    right_rank=5,
    left_drm_type=SparseGaussianDRM,
    right_drm_type=SparseGaussianDRM,
)

print(tt_sketched)
print("relative error:", tt_sketched.error(sparse_tensor, relative=True))

tt_sketched_updated = tt_sketched.increase_rank(
    sparse_tensor, (20, 30, 20), (10, 15, 10)
)
print("")
print(tt_sketched_updated)
print(
    "relative error:", tt_sketched_updated.error(sparse_tensor, relative=True)
)

<Sketched tensor train of shape (10, 10, 10, 10) with left-rank (10, 10, 10) and right-rank (5, 5, 5) at 0x7f2a2036bfd0>
relative error: 0.011087215612540783

<Sketched tensor train of shape (10, 10, 10, 10) with left-rank (20, 30, 20) and right-rank (10, 15, 10) at 0x7f2af0635910>
relative error: 1.376905732926918e-07