November 7th, Wednesday 14:15, Room 303, Jacobs Building
Title: Fast Parallel Matrix Multiplication
Lecturer: Oded Schwartz
Lecturer homepage: http://www.cs.berkeley.edu/~odedsc/
Affiliation: EECS Department, University of California, Berkeley
Faster algorithms can be obtained by minimizing communication, that is,
by reducing the amount of data moved across the memory hierarchy and
between processors.
The communication costs of algorithms, in terms of time or energy, are
typically much higher than the arithmetic costs.
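To make "minimizing communication" concrete, the sketch below (a minimal
illustration, not taken from the talk) shows blocked matrix multiplication:
by working on b-by-b tiles that fit in fast memory, slow-memory traffic
drops from the O(n^3) words of a naive triple loop to O(n^3/b). The tile
size b and the use of NumPy are assumptions made for this example.

    import numpy as np

    def blocked_matmul(A, B, b=64):
        """Multiply two n x n matrices tile by tile.

        If three b x b tiles fit in fast memory together, each
        tile-level product moves O(b^2) words while performing
        O(b^3) arithmetic, so total slow-memory traffic is
        O(n^3 / b) words instead of O(n^3).
        """
        n = A.shape[0]
        C = np.zeros((n, n))
        for i in range(0, n, b):
            for j in range(0, n, b):
                for k in range(0, n, b):
                    # The three operand tiles are the working set in fast memory.
                    C[i:i+b, j:j+b] += A[i:i+b, k:k+b] @ B[k:k+b, j:j+b]
        return C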
We have proved lower bounds on these communication costs by analyzing
geometric and expansion properties of the computation DAGs underlying
the algorithms. These techniques (recognized by the SIAG/LA Prize for
2009-2011 and the SPAA'11 Best Paper Award) have inspired many new
algorithms in cases where the existing ones proved not to be
communication-optimal.
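For reference, the bounds in question (paraphrased here from the
published results, not from this abstract) concern multiplying n-by-n
matrices on P processors, each with a local memory of M words; the
per-processor communication volume satisfies

    \[
      W_{\text{classical}} = \Omega\!\left(\frac{n^3}{P\sqrt{M}}\right),
      \qquad
      W_{\text{Strassen}} = \Omega\!\left(\frac{n^{\omega_0}}{P\,M^{\omega_0/2-1}}\right),
      \qquad \omega_0 = \log_2 7 \approx 2.81.
    \]

The algorithm described next attains the second bound.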
Parallel matrix multiplication is one of the most studied fundamental
problems in parallel computing. We obtain a new parallel algorithm
based on Strassen's fast matrix multiplication that is communication-optimal.
It exhibits perfect strong scaling within the maximum possible range.
The algorithm asymptotically outperforms all other known parallel matrix
multiplication algorithms, classical and Strassen-based alike. It also
demonstrates significant speedups in practice, as benchmarked on
several supercomputers (Cray XT4, Cray XE6, and IBM BG/P).
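For readers unfamiliar with Strassen's method, the sequential recursion
below (a minimal Python sketch; the talk's contribution is the
communication-optimal parallel schedule of this recursion, which is not
reproduced here) shows where the seven half-size products, and hence the
O(n^{log2 7}) arithmetic count, come from. The cutoff parameter and the
power-of-two matrix size are simplifying assumptions.

    import numpy as np

    def strassen(A, B, cutoff=64):
        # Strassen's recursion: 7 half-size products instead of 8,
        # giving O(n^{log2 7}) arithmetic. Sequential sketch only;
        # assumes n is a power of two.
        n = A.shape[0]
        if n <= cutoff:
            return A @ B  # classical base case
        h = n // 2
        A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
        M1 = strassen(A11 + A22, B11 + B22, cutoff)
        M2 = strassen(A21 + A22, B11, cutoff)
        M3 = strassen(A11, B12 - B22, cutoff)
        M4 = strassen(A22, B21 - B11, cutoff)
        M5 = strassen(A11 + A12, B22, cutoff)
        M6 = strassen(A21 - A11, B11 + B12, cutoff)
        M7 = strassen(A12 - A22, B21 + B22, cutoff)
        C = np.empty((n, n))
        C[:h, :h] = M1 + M4 - M5 + M7  # C11
        C[:h, h:] = M3 + M5            # C12
        C[h:, :h] = M2 + M4            # C21
        C[h:, h:] = M1 - M2 + M3 + M6  # C22
        return C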
Our parallelization approach is simple to implement and extends to other
algorithms. Both the lower bounds and the new algorithms have an
immediate impact on saving power and energy at the algorithmic level.
Based on joint work with
Grey Ballard, James Demmel, Olga Holtz, and Ben Lipshitz.