Randomized (Numerical) Linear Algebra
Linear algebra can be thought of as both the study of linear systems of equations and matrices, and the study of vector spaces and linear transformations between them. Randomized linear algebra (RLA), which is still evolving, can be thought of as a sub-discipline of linear algebra in which randomness is present in the specification of linear systems of equations, in a matrix, or, equivalently, in a linear transformation.
There are a number of excellent reviews on the subject, of which I list two here.
- Martinsson and Tropp, "Randomized numerical linear algebra: foundations and algorithms," Acta Numerica, 2020.
- Murray et al., "Randomized numerical linear algebra: a perspective on the field with an eye to software," 2023 (arXiv preprint).
While there are many areas of RLA, as discussed in the above references, we focus primarily on efforts to leverage randomness to solve large-scale linear systems and least squares problems, which often appear as sub-problems in much larger workflows in data science, climate sciences, power systems and AI.
Theory
RLA has two broad approaches to linear systems and least squares problems.
- One approach is to use a random matrix to compress the equations of the system into a more manageable number. This compressed system is then solved in place of the original system. Owing to its smaller size, the compressed system should be faster (or at least feasible) to solve in comparison to the original system. Of course, the solution of the compressed system will not, in general, coincide with the solution of the original problem; fortunately, it can be shown to be a reasonably good approximation. This approach is called sketching.
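As a minimal sketch of this idea, the following NumPy snippet compresses an overdetermined least squares problem with a Gaussian sketching matrix and solves the compressed problem. The embedding dimension `s = 5 * n` is a heuristic chosen purely for illustration; choosing it well is exactly the practical difficulty discussed below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overdetermined least squares problem: minimize ||A x - b||_2.
m, n = 10_000, 50
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Gaussian sketch compressing the m equations down to s << m rows.
# s = 5 * n is an illustrative heuristic, not a recommended value.
s = 5 * n
S = rng.standard_normal((s, m)) / np.sqrt(s)

# Solve the compressed problem min_x ||(S A) x - (S b)||_2 instead.
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# The sketched solution is near-optimal in residual norm.
res_sketch = np.linalg.norm(A @ x_sketch - b)
res_exact = np.linalg.norm(A @ x_exact - b)
```

The compressed solve costs roughly O(s n^2) instead of O(m n^2), which is the source of the speedup when s is much smaller than m.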
- Another approach is to start with a guess of the solution, randomly sample equations from the system, update the guess using the information in the sampled equations, and iterate on this process. As this approach only requires dealing with a subset of the system at a time, it can handle very large systems of equations. In fact, it can even handle the streaming case: that is, when only a subset of the equations is available for a short period of time and must then be discarded. This approach is called the Kaczmarz method.
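The approach above can be sketched in a few lines of NumPy. This is a basic randomized Kaczmarz iteration (with Strohmer-Vershynin-style row sampling proportional to squared row norms) on a consistent system; each update touches only a single sampled equation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Consistent overdetermined system A x = b.
m, n = 500, 20
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true

# Sample rows with probability proportional to their squared norms.
row_norms2 = np.sum(A**2, axis=1)
probs = row_norms2 / row_norms2.sum()

x = np.zeros(n)
for _ in range(20_000):
    i = rng.choice(m, p=probs)
    # Project the iterate onto the hyperplane {y : A[i] @ y = b[i]}.
    x += (b[i] - A[i] @ x) / row_norms2[i] * A[i]

err = np.linalg.norm(x - x_true)
```

Note that each iteration reads one row of `A` and one entry of `b`, which is what makes the method suitable for the streaming setting described above.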
The main (and still persisting) practical difficulty with the sketching approach is determining how much compression is allowable. While this number is known up to a multiplicative constant, the constant matters in practice. In the context of consistent linear systems and least squares problems, we addressed this by implicitly sketching the system and simultaneously solving the problem (PJM21).
Further details of this work and its extension can be found in (PJM21) and (PJM23), respectively.
Importantly, our theoretical work allows for flexibility: a wide variety of projection methods can be developed, tailored to the problem and the computational environment to maximize performance, while retaining convergence guarantees. This theory is the foundation for our randomized linear algebra software.
Methodology
In order to realize the full benefits of RLA for solving linear systems and least squares problems, we need to be able to integrate the resulting solutions into the larger computational workflows where these problems arise. Given our application interests, these workflows often place quality requirements on the solution of the linear system or least squares problem. To ensure this quality, we need to be able to do two tasks.
- Tracking. We need to be able to measure the progress of the randomized solver as it solves the problem of interest.
- Stopping. We need to be able to stop the randomized solver once a sufficiently good solution is determined. Here, sufficiently good depends on the needs of the application.
Tracking
The need to track the progress of a solver of a linear system or a least squares problem is not unique to randomized techniques. Whenever we employ any iterative technique, we would like to keep track of how it is performing. For example, when solving a linear system or least squares problem with an iterative Krylov solver, we track progress by measuring the norm of the residual of the system or the norm of the residual of the normal equations. In the contexts where a randomized iterative solver is appropriate, however, these typical progress measures are too expensive to compute and, in some cases, impossible to compute (say, if the problem data is never available all at once). Hence, we need a new approach to tracking the progress of these solvers that is both computationally efficient and useful for larger computational workflows.
We developed novel, computationally efficient methods for estimating typical measures of progress, along with measures of their accuracy (i.e., uncertainty sets), for randomized solvers for least squares problems (PP23, PP24).
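To make the idea of cheap progress tracking concrete, the snippet below estimates the (squared) residual norm from a small uniform sample of equations, scaled to be unbiased. This is only an illustrative estimator, not the uncertainty-set construction of (PP23, PP24); all sizes here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)

# A large least squares problem and a candidate iterate x.
m, n = 100_000, 30
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x = rng.standard_normal(n)

# Full residual norm: what we would like to know, but may be too
# expensive -- or impossible, in a streaming setting -- to compute.
true_res2 = np.linalg.norm(A @ x - b) ** 2

# Cheap estimate from a uniform sample of k equations: scaling the
# sampled squared residuals by m / k makes the estimator unbiased.
k = 500
idx = rng.choice(m, size=k, replace=False)
est_res2 = (m / k) * np.sum((A[idx] @ x - b[idx]) ** 2)
```

The cost of the estimate scales with `k * n` rather than `m * n`, and its variance shrinks as the sample size `k` grows; quantifying that variability is what an uncertainty set is for.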
Stopping
By their nature, iterative algorithms can (usually) run forever, getting ever closer to solving a problem, so in practice they must be stopped once the solution is good enough. However, just as for tracking, the quantities typically used to decide when to stop are too expensive or even infeasible to compute in the contexts where a randomized iterative solver is appropriate. So, when we use a randomized solver, we need a practical way to stop the algorithm. By using our tracking procedures, we developed a practical way to stop randomized solvers (PP23, PP24).
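The following sketch combines a randomized Kaczmarz solver with a sampled-residual stopping rule: every few hundred iterations, a cheap residual estimate is formed, and the solver halts once it falls below a tolerance. This is an illustrative stand-in for the stopping procedures of (PP23, PP24), not a reproduction of them; the tolerance and sampling sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)

# Consistent system A x = b, solved by randomized Kaczmarz.
m, n = 2_000, 25
A = rng.standard_normal((m, n))
b = A @ rng.standard_normal(n)

row_norms2 = np.sum(A**2, axis=1)
probs = row_norms2 / row_norms2.sum()

x = np.zeros(n)
tol2 = 1e-12          # target (estimated) squared residual norm
check_every = 200     # iterations between progress estimates
k = 100               # equations sampled per estimate

for it in range(200_000):
    i = rng.choice(m, p=probs)
    x += (b[i] - A[i] @ x) / row_norms2[i] * A[i]
    if (it + 1) % check_every == 0:
        # Unbiased sampled estimate of the squared residual norm.
        idx = rng.choice(m, size=k, replace=False)
        est2 = (m / k) * np.sum((A[idx] @ x - b[idx]) ** 2)
        if est2 < tol2:
            break

final_res = np.linalg.norm(A @ x - b)
```

Because the estimate is random, a practical stopping rule must account for its variability (the estimate can undershoot the true residual); that is precisely the role of the accuracy measures developed in the tracking work.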
What's next?
We are currently working to develop the theory and methodology of RLA for solving linear systems and least squares problems in a number of directions. We list a few here.
- Symmetric Problems. A number of important problems in our motivating application areas produce linear systems or least squares problems whose coefficient matrices are symmetric. While such problems can be addressed by our existing theory and methodologies, they do not take full advantage of the symmetry to speed up computation. We are working to develop the theory and methods that can exploit symmetry.
- Orthogonalization. Orthogonalization has played an important role in our work replacing one-off sketching-then-solving with implicitly growing a sketching matrix while simultaneously solving the system (PJM21). One limitation of our earlier work is that we ignored block operations, which can substantially speed up computation. We are working to address this.
References
- PJM21: Patel, Vivak, Jahangoshahi, Mohammad, and Maldonado, D. Adrian. "An implicit representation and iterative solution of randomly sketched linear systems." SIAM Journal on Matrix Analysis and Applications, 2021.