Combining Black-Box Operators

Matlab normally combines matrices by evaluating an expression such as X=[A,B] or X=A*B. This require the matrices to be on a form which can be evaluated, i.e. sparse or full matrix.

In contrast, black-boxes can be implemented as a set of algorithms, which can not be evaluated (executed) until it is used to produce a product with a Matlab matrix. Therefore, BBTools combines black-box operators by storing their evaluation tree rather than computing the product. The only place where the function is evaluated is when the product of a black-box operator and a Matlab-matrix is taken (see function compatibility) or a function is explicitly evaluated (see bbfunc and bbfuncm).

In practice black-box operators acts much like their Matlab counterparts. However, it is often more efficient to use a black-box than a Matlab matrix. For instance, the product of very sparse matrices is normally less sparse than the original matrices:

A=sprandn(5000,1000,.01);
B=sprandn(1000,5000,.01);
C=A*B;
nnz(A)+nnz(B)
ans =
       99533
nnz(C)
ans =
     2358238

Since each non-zero requires a multiplication in a matrix-vector product, it is faster to compute A*(B*X) than C*X, even though they should give the same result (ignoring rounding errors). With BBTools one would instead do:

C=bbprod(A,B)
C =
   prod 5000x5000
   > matrix 5000x1000 (sparse)
   > matrix 1000x5000 (sparse)

This shows that C is a 5000-by-5000 matrix built as the product of two sparse matrices. Computing C*X is essentially equivalent to A*(B*X).

An alternative view of the evaluation-tree of C can be obtained using the function bbshowtree:

bbshowtree(C);
Evaluation Tree

Kronecker products

One combinator that often leads to very large operators is the Kronecker product. Normally Matlab evaluates such a product, leading to a huge amount of memory consumption and a very inefficient matrix-vector product.

In the case at hand, forming the Kronecker product of A and B would require about 18.5GB of memory, just to store the non-zero elements. Since 32-bit machines can only address 4GB, we will not be able to compute this product on a such a machine.

Let us try BBTools instead:

C=bbkron(A,B)
C =
   kron 5000000x5000000
   > matrix 5000x1000 (sparse)
   > matrix 1000x5000 (sparse)

Of course this is only useful if the products can be computed efficiently:

X=randn(5000000,1);
t=cputime; Y=C*X; t1=cputime-t;
t=cputime; Z=B*reshape(X,5000,1000)*A'; t2=cputime-t;
t1/t2
ans =
    2.0606

We see that BBTools is in the same ball-park as an explicit algorithm. However, simply forming the Kronecker product and using it makes it the code much clearer.

The overhead incurred by BBTools results from the fact that Matlab avoids transpositions (in version 6.5 and later). In contrast, BBTools must support any kind of operator, including custom algorithms. The factor grows as large in this example because A and B are very sparse, and transposition takes a relatively large amount of time.

Had we used Z=B*(reshape(X,5000,1000)*A') instead, then BBTools would have been superior.