- Wikipedia says that the model used here is the multitape Turing machine. This does not seem to make much sense to me, since in the analysis of matrix multiplication, scalar multiplication is supposed to take constant time, which is not the case on Turing machines.
- Some texts describe the complexity only vaguely as the number of arithmetic operations used. However, what exactly counts as an arithmetic operation in this context? I suppose addition, multiplication, and probably subtraction. But what about division, integer division, remainder, etc.? And what about bitwise operations – how do they fit into this setting?
- Finally, I have recently discovered an article that uses the BSS machine as the model of computation. However, this also seems a little strange to me, since for integer matrices, for example, it does not make much sense to disallow operations such as integer division.
I would be grateful to anyone who could help me sort these things out.
Asked By : 042
Answered By : Yuval Filmus
- Certain linear combinations $\alpha_i$ of the entries $a_{jk}$ of the first input matrix are calculated.
- Certain linear combinations $\beta_i$ of the entries $b_{jk}$ of the second input matrix are calculated.
- The products $\gamma_i = \alpha_i \times \beta_i$ are computed.
- Each entry in the output matrix is a linear combination of the $\gamma_i$s.
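For concreteness, here is a sketch (my own, not from the answer) of this four-step form instantiated with Strassen's seven products for $2\times 2$ matrices; the names `alpha`, `beta`, and `m` follow the $\alpha_i$, $\beta_i$, $\gamma_i$ of the text:

```python
def strassen_2x2(a, b):
    """Multiply 2x2 matrices using Strassen's seven products,
    written in bilinear normal form."""
    # Step 1: linear combinations alpha_i of the entries of a.
    alpha = [
        a[0][0] + a[1][1],  # for M1
        a[1][0] + a[1][1],  # for M2
        a[0][0],            # for M3
        a[1][1],            # for M4
        a[0][0] + a[0][1],  # for M5
        a[1][0] - a[0][0],  # for M6
        a[0][1] - a[1][1],  # for M7
    ]
    # Step 2: linear combinations beta_i of the entries of b.
    beta = [
        b[0][0] + b[1][1],
        b[0][0],
        b[0][1] - b[1][1],
        b[1][0] - b[0][0],
        b[1][1],
        b[0][0] + b[0][1],
        b[1][0] + b[1][1],
    ]
    # Step 3: the only multiplications, gamma_i = alpha_i * beta_i.
    m = [x * y for x, y in zip(alpha, beta)]
    # Step 4: each output entry is a linear combination of the gamma_i.
    return [
        [m[0] + m[3] - m[4] + m[6], m[2] + m[4]],
        [m[1] + m[3], m[0] - m[1] + m[2] + m[5]],
    ]
```

Note that only step 3 multiplies two quantities that both depend on the inputs; everything else is additions and subtractions, which is exactly what makes the recursive (tensoring) analysis below go through.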
This is known as bilinear normal form. In the matrix multiplication algorithm shown above, the products $x_{jk}y_{jk}$ function as the $\gamma_i$, but in Strassen's algorithm the linear combinations are more interesting; they are the $M_i$'s in Wikipedia's description.

Using a tensoring approach (i.e. recursively applying the same algorithm), similar to the asymptotic analysis of Strassen's algorithm, one can show that given such an algorithm for multiplying $n\times n$ matrices with $r$ products (i.e. $r$ variables $\gamma_i$), arbitrary $N\times N$ matrices can be multiplied in complexity $O(N^{\log_n r})$; thus only the number of products matters asymptotically. In Strassen's algorithm, $n = 2$ and $r = 7$, and so the bound is $O(N^{\log_2 7})$. The problem of finding the minimal number of products needed to compute matrix multiplication can be phrased as finding the rank of a third-order tensor (a "matrix" with three indices rather than two), and this forms the connection to algebraic complexity theory. You can find more information in this book or these lecture notes (continued here).

The reason this model is used is twofold: first, it is very simple and amenable to analysis; second, it is closely related to the more common RAM model. Straight-line programs can be implemented in the RAM model, and the complexity in both models is strongly related: arithmetic operations have unit cost in several variants of the model (for example, the RAM model with real numbers), and are otherwise related polynomially to the size of the numbers. In the case of modular matrix multiplication, therefore, arithmetic complexity provides an upper bound on complexity in the RAM model. In the case of integer or rational matrix multiplication, one can show that for bilinear algorithms resulting from tensorization, the size of the numbers doesn't grow too much, and so arithmetic complexity provides an upper bound for the RAM model, up to logarithmic factors.
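The tensoring step can be made concrete: applying the $2\times 2$ scheme recursively to the quadrants of a power-of-two-sized matrix uses $7$ recursive products per level, giving the $O(N^{\log_2 7})$ bound. A rough sketch (mine, with hypothetical helper names `split`, `add`, `sub`), assuming the matrix side is a power of two:

```python
import math

def split(M):
    # Partition a 2^k x 2^k matrix into four quadrants.
    n = len(M) // 2
    return ([row[:n] for row in M[:n]], [row[n:] for row in M[:n]],
            [row[:n] for row in M[n:]], [row[n:] for row in M[n:]])

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def strassen(A, B):
    # Recursive (tensored) Strassen; side length must be a power of 2.
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    a11, a12, a21, a22 = split(A)
    b11, b12, b21, b22 = split(B)
    # Seven recursive products instead of eight.
    m1 = strassen(add(a11, a22), add(b11, b22))
    m2 = strassen(add(a21, a22), b11)
    m3 = strassen(a11, sub(b12, b22))
    m4 = strassen(a22, sub(b21, b11))
    m5 = strassen(add(a11, a12), b22)
    m6 = strassen(sub(a21, a11), add(b11, b12))
    m7 = strassen(sub(a12, a22), add(b21, b22))
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(sub(add(m1, m3), m2), m6)
    # Reassemble the quadrants.
    top = [r1 + r2 for r1, r2 in zip(c11, c12)]
    bottom = [r1 + r2 for r1, r2 in zip(c21, c22)]
    return top + bottom

# n = 2, r = 7: the resulting exponent is log_2 7.
print(math.log(7, 2))
```

The recursion satisfies $T(N) = 7\,T(N/2) + O(N^2)$, which resolves to $O(N^{\log_2 7})$ by the master theorem.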
It could a priori be the case that a RAM machine can pull some tricks that the arithmetic model is oblivious to. But often we want matrix multiplication algorithms to work for matrices over arbitrary fields (or even rings), and in that case a uniform algorithm should only use the arithmetic operations specified by the model. So this model is a formalization of "field-independent" algorithms.
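To illustrate field-independence (my own example, not from the answer): a bilinear straight-line program uses only $+$, $-$, and $\times$, so the identical code runs unchanged over the rationals or over $\mathbb{Z}/5\mathbb{Z}$ (via a minimal, hypothetical wrapper class `Z5`):

```python
from fractions import Fraction

def mat_mul_bilinear(a, b):
    # Standard 2x2 multiplication as a straight-line program:
    # only +, -, * on the entries, hence field-independent.
    return [
        [a[0][0]*b[0][0] + a[0][1]*b[1][0], a[0][0]*b[0][1] + a[0][1]*b[1][1]],
        [a[1][0]*b[0][0] + a[1][1]*b[1][0], a[1][0]*b[0][1] + a[1][1]*b[1][1]],
    ]

# The same code over the rationals...
q = [[Fraction(1, 2), Fraction(1, 3)], [Fraction(0), Fraction(1)]]
print(mat_mul_bilinear(q, q))

# ...and over Z/5Z, with a toy wrapper supplying ring operations.
class Z5:
    def __init__(self, v): self.v = v % 5
    def __add__(self, o): return Z5(self.v + o.v)
    def __sub__(self, o): return Z5(self.v - o.v)
    def __mul__(self, o): return Z5(self.v * o.v)
    def __repr__(self): return f"Z5({self.v})"

m = [[Z5(2), Z5(3)], [Z5(4), Z5(1)]]
print(mat_mul_bilinear(m, m))
```

An algorithm that used integer division or bitwise operations, by contrast, would not type-check over an abstract field, which is exactly what the arithmetic model rules out.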
Question Source : http://cs.stackexchange.com/questions/20245