I have a question about the conjugate gradient (CG) algorithm in particular, and possibly about interative solvers in general. My main reference here will be section 5.1 of Numerical Optimization by Nocedal and Wright (hereafter NW).
Let be a symmetric positive definite matrix. All lower-case letters will denote -vectors. The conjugate gradient algorithm solves the linear system
using only a black-box routine to compute the product for any vector . To provide some intuition for how this is possible, note that by defining
the optimization problem has the unique solution (cf NW Equation 5.2). Furthermore, the gradient of is given by
evaluation of which requires only the matrix-vector product. Consequently, one can perfor gradient descent to solve using only matrix-vector products. (In fact, this is nearly what CG does— CG performs gradient descent on with the constraint that the search directions are all mutually conjugate with respect to .)
Now, suppose that you are not actually interested in the full vector , but only in the value of in some subspace expressed by an inner product, , where is a unit vector. (This is typically the case in sensitivity anaylysis, which is the source of my interest in this question.) Is there a way to monitor the progress of CG for the quantity ? No matter how you look at it, it seems to me that CG is not set up to answer such a question, especially when is aligned with a small eigenvalue of .
Typically, one monitors the convergence of CG using the residual . Of course, the value of tells you little about in general. The theoretical guanrantees for CG (as given, for example, in NW Theorem 5.5) are given in terms of the norm . If is very small, then it is possible for to be small but for to be very large. Similarly, it is possible for the optimization objective to be near while is still very large.
It might be pointed out that calculating is equal to , so that the roles of and are symmetric here.
What I believe this comes down to is the obvious-enough fact that an iterative method based on products will not do well for directions in which is small. Of course, in theory and practice we expect our linear solvers to work best when the condition number is not too large. However, I would be very interested to know whether there is a practical variant of CG that provides a guide for terminating the algorithm if the quantity of interest is and not .