## Fourth Edition (2013) of Golub and Van Loan’s Matrix Computations

Back in 1980 there were not many up-to-date books on numerical linear algebra. Stewart’s Introduction to Matrix Computations (1973) was a popular textbook, and was the text for the final-year undergraduate course that I took on the subject. Parlett’s The Symmetric Eigenvalue Problem (1980) was a graduate-level treatment of the symmetric eigenvalue problem. And Wilkinson’s The Algebraic Eigenvalue Problem (1965) was still the bible of numerical linear algebra, albeit already somewhat out of date due to the fast-moving research developments since it was published.

While an MSc student, I heard about the impending publication of a new book on matrix computations by Golub and Van Loan. I pre-ordered a copy and in spring 1983 received one of the first copies in the UK. The book was a revelation. It presented a completely fresh and up-to-date perspective on the subject. Some of the most exciting features were

• extensive use of pseudocode, with MATLAB-style indexing notation, to describe algorithms,
• the use of flops to measure computational cost,
• emphasis on the use of the SVD,
• modern presentation of rounding error analysis, with rounding error bounds given for each algorithm,
• systematic treatment of the conjugate gradient and Lanczos methods,
• coverage of topics not found in earlier books, such as condition estimation, generalized SVD, and total least squares,
• very lively writing style.

I studied the book in great detail and learned a huge amount from it.

Covers of first to fourth editions.

A second edition was published in 1989. It was written while Charlie Van Loan was in the UK on sabbatical and I was spending a year at Cornell (Charlie’s home university). I had the opportunity to read and comment on draft chapters. The second edition retained all the material from the first and added new chapters on matrix multiplication (and the relevant machine architecture considerations) and parallel algorithms, and it was typeset in LaTeX for the first time. The term flop was redefined so that $a + b \cdot c$ represents two flops (as it does today) instead of one as in the first edition. A number of other changes were introduced to address a criticism in some reviews of the first edition that the book was rather terse and fast-paced for use as a course textbook.

A third edition followed in 1996. After a 17-year gap, the fourth edition has just been published. Work on this edition began following the untimely death of Gene Golub in 2007. Some statistics illustrate the development of the book:

| Edition | Year | Number of pages | Pages of master bibliography |
|---------|------|-----------------|------------------------------|
| First   | 1983 | 472             | 25                           |
| Second  | 1989 | 642             | 34                           |
| Third   | 1996 | 694             | 50                           |
| Fourth  | 2013 | 756             | $65^\dagger$                 |

$\dagger$ The master bibliography of the fourth edition is not printed in the book but is downloadable from the book’s web page.

### What is Different About the Fourth Edition?

The new edition is physically larger than its predecessors, with a text width of 13 cm versus 11.5 cm in the previous edition, so the content is increased by more than the page count would suggest. Moreover, the paper is of extremely high quality, which makes the book bigger and heavier than you might expect. I bought the hardback, because I know from experience that the softbacks of all three previous editions did not stand up well to heavy use. The image shows the third and fourth editions along with Horn and Johnson’s Matrix Analysis (second edition, 2013) and my Accuracy and Stability of Numerical Algorithms (second edition, 2002).

A number of new topics are included, of which I would pick out

• fast transforms
• Hamiltonian and product eigenvalue problems
• large-scale SVD
• multigrid
• tensor computations

I like the statement in the preface that “References that are historically important have been retained because old ideas have a way of resurrecting themselves.” This is of course particularly true as regards methods suitable for high-performance computing.

Lists of relevant LAPACK codes at the start of each chapter have been removed, as have many of the small, illustrative numerical examples, which are replaced by MATLAB codes to be made available on the book’s web page.

The fourth edition remains the best general reference on matrix computations and a must-have for any serious researcher in the field. A big difference from 1983, when the first edition appeared, is that now a separate research monograph is available covering almost every topic in the book (and due reference is made to 28 such “Global References”). But Matrix Computations brings together and unifies a wide variety of topics in one place.

2013 has been a good year for books on matrices and approximation, with the publication of a second edition of Horn and Johnson’s Matrix Analysis, Trefethen’s Approximation Theory and Approximation Practice, and now this very welcome fourth edition of Golub and Van Loan. It is available from the usual sources as well as from SIAM. Consider the Kindle edition to save your back. You can still have it signed!

## Workshop on Matrix Functions and Matrix Equations

Last month we (Stefan Guettel, Nick Higham and Lijing Lin) organized a 2.5-day workshop, Advances in Matrix Functions and Matrix Equations. We had 57 attendees from around the world (see group photo): UK (19), Italy (7), USA (7), Germany (6), Canada (2), France (2), Portugal (2), South Africa (2), Saudi Arabia (2), Austria (1), Belgium (1), India (1), Ireland (1), Poland (1), Russia (1), Sweden (1), Switzerland (1).

We last organized a workshop on matrix functions in Manchester in 2008 (MIMS New Directions Workshop Functions of Matrices). The field has advanced significantly since then. Some emerging themes of this year’s workshop were as follows.

Krylov methods: Several speakers presented new results on this class of methods for the approximation of large-scale matrix functions, including a convergence analysis by Grimm of the extended Krylov subspace method taking into account smoothness properties of the starting vector, black-box parameter selection for the rational Krylov approximation of Markov matrix functions by Guettel and Knizhnerman, and an adaptive tangential interpolation strategy for MIMO model order reduction by Simoncini and Druskin.

Matrix exponential: Research continues to focus on this, the most important of all matrix functions (the inverse is excluded as being too special). We were delighted that Charlie Van Loan opened the workshop with a talk “What Isn’t There To Learn from the Matrix Exponential?”. Charlie wrote some of the key early papers on exp(A). Indeed his work on exp(A) began when he was a postdoc at Manchester in the early 1970s, and his 1975 Manchester technical report A Study of the Matrix Exponential contains ideas that later appeared in his papers and his book (with Golub) Matrix Computations. In particular, it makes the case that “anything that the Jordan decomposition can do, the Schur decomposition can do better”, and is still worth reading.

Exotic matrix functions: Two talks focused on newer, more “exotic” matrix functions and had links to Rob Corless, who was in the audience. Bruno Iannazzo discussed how to compute the Lambert W function of a matrix, which is any solution of the matrix equation $X e^X = A$. The scalar Lambert W function was named and popularized in a 1996 paper by Corless, Gonnet, Hare, Jeffrey and Knuth, On the Lambert W Function; it has many applications, including in delay differential equations. Bruno finished with a striking photo of the equation written in sand. Mary Aprahamian presented a new matrix function called the matrix unwinding function, defined as $U(A) = (A - \log e^A )/(2\pi i)$, which arises from the scalar unwinding number introduced by Corless, Hare and Jeffrey in 1996. She showed that it is useful as a means for obtaining correct identities involving multivalued functions at matrix arguments, as well as being useful for argument reduction in evaluating the matrix exponential.
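As a quick illustration (a naive sketch of my own based directly on the definition, not the algorithm Mary presented), the matrix unwinding function of a small matrix can be evaluated in MATLAB:

```matlab
% Naive evaluation of the matrix unwinding function from its
% definition U(A) = (A - logm(expm(A)))/(2*pi*i).  A practical
% algorithm would not form expm and logm explicitly.
A = diag([4i, 0]);  % the eigenvalue 4i lies outside the strip |Im(z)| <= pi
U = (A - logm(expm(A)))/(2i*pi)   % U = diag([1, 0]), up to roundoff
```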

A special afternoon session celebrated the 70th birthday of Krystyna Zietak, who has made many contributions to numerical linear algebra and approximation theory. Krystyna gave the opening talk in which she described some highlights of her international travels and of hosting visitors in Wroclaw, well illustrated by photos.

Happy birthday Krystyna!

Following the session we had a reception in the Living Worlds gallery of the Manchester Museum, followed by a dinner in the Fossil gallery, with Stan the Tyrannosaurus Rex looking over us.

Dinner in the Fossil gallery

Financial support for the workshop came from the European Research Council and book displays were kindly provided by Cambridge University Press, Oxford University Press, Princeton University Press and SIAM.

Most of the talks are available in PDF format from the workshop programme page.

A gallery of photos from the workshop has been produced, combining the efforts of several photographers.

## Emacs: The Ultimate Editor?

I started using Emacs around 1990 but have been using it exclusively for just two years. Prior to that my main editor was TSE Pro – a fast, customizable Windows-only editor that evolved from a 1980s DOS editor called Qedit. The motivation for switching to Emacs was that I wanted to be able to work in the same way on both Windows and Mac machines. After looking at the possibilities I settled on Emacs as the ideal solution.

Although Emacs dates from the 1980s, it seems to have enjoyed renewed popularity in the last few years, with regular new releases (currently version 24.3), many new packages appearing, several excellent Emacs blogs posting regularly, and even an Emacs conference held in London in March 2013.

For me the main advantages of Emacs are

• Excellent LaTeX integration through AucTeX and RefTeX.
• ORG mode, an incredibly powerful mode for working with plain text. Among its uses are
  • making notes and outlines,
  • TODO lists,
  • writing documents in a simple markup language that can be exported to LaTeX, html, and various other formats,
  • writing and managing WordPress blogs, via org2blog (this blog is produced entirely from within Emacs, apart from some tweaking in WordPress).

In all these cases, the ability to narrow the view to certain parts of the buffer and to reorder logical units with simple keypresses provides tremendous usability.

• Complete built-in documentation, with the ability to see the Emacs Lisp source code for all functions except the small number of low-level functions written in compiled C.
• The ability to customize every aspect of Emacs, and in particular to reassign almost any keypress.
• The use of Emacs is essentially system-independent; in particular Emacs has its own file management functions, which bypass the Windows, Mac or Linux file open/save dialog boxes.

I’ll write about some of these aspects in future posts.

And for an excellent perspective on the eternal “Emacs/Vi versus the latest hot editor” debate I recommend the post Good tools by James Bennett, which appeared just as I was about to publish this post.


## How Accurate Are Spreadsheets in the Cloud?

For a vector $x$ with $n$ elements the sample variance is $s_n^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \overline{x})^2$, where the sample mean is $\overline{x} = \frac{1}{n} \sum_{i=1}^n x_i$. An alternative formula often given in textbooks is $s_n^2 = \frac{1}{n-1} \left( \sum_{i=1}^n x_i^2 - \frac{1}{n} \left(\sum_{i=1}^n x_i \right)^2 \, \right)$. This second formula has the advantage that it can be computed with just one pass through the data, whereas the first formula requires two passes. However, the one-pass formula can suffer damaging subtractive cancellation, making it numerically unstable. When I wrote my book Accuracy and Stability of Numerical Algorithms I found that several pocket calculators appeared to use the one-pass formula.

How do spreadsheet apps available in web browsers and hosted in the cloud fare on computations such as this? I used Google Sheets to compute the standard deviation of vectors of the form $x = [m, m+1, m+2]$ (Google Sheets does not seem to have a built-in function for the sample variance; the standard deviation is the square root of the sample variance). Here is what I found. (The spreadsheet that produced these results is available as this xlsx file. Note that if you click on that link it will probably load into Excel and display the correct result.)

| $m$    | Exact standard deviation | Google’s result |
|--------|--------------------------|-----------------|
| $10^7$ | 1                        | 1               |
| $10^8$ | 1                        | 0               |

The incorrect result 0 for $m=10^8$ is what I would expect from the one-pass formula in IEEE double precision arithmetic, which has the equivalent of about 16 significant decimal digits of precision, since $\sum_{i=1}^n x_i^2$ and $\frac{1}{n} \left(\sum_{i=1}^n x_i\right)^2$ are both about $10^{16}$ and so there is not enough precision to retain the difference (which is equal to 2). A computation in MATLAB verifies that the one-pass formula returns 0 in IEEE double precision arithmetic.
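For concreteness, here is a minimal MATLAB sketch of that check (my own code, of course, not whatever Google runs internally), applying both formulas to the two test vectors:

```matlab
% Compare the two-pass and one-pass sample variance formulas on
% x = [m, m+1, m+2], whose exact sample variance is 1.
for m = [1e7 1e8]
    x = [m m+1 m+2];
    n = length(x);
    v_twopass = sum((x - mean(x)).^2)/(n-1)    % returns 1 for both m
    v_onepass = (sum(x.^2) - sum(x)^2/n)/(n-1) % returns 1, then 0: cancellation
end
```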

It seems that Google Sheets is using IEEE double precision arithmetic internally, because the expression $3\times (4/3-1)-1$ evaluates to 2.2E-16. So it appears that Google may be using the one-pass formula.

This use of the unstable formula is deeply unsatisfactory, but it is just the tip of the iceberg. In a recent paper Spreadsheets in the Cloud—Not Ready Yet, Bruce McCullough and Talha Yalta show that Google Sheets, Excel Web App and Zoho Sheet all fail on various members of a set of “sanity tests”. This might not be too surprising if you are aware of McCullough’s earlier work in which he found errors in several versions of Microsoft Excel.

However, spreadsheets in the cloud bring further complications, as noted by McCullough and Yalta:

• These spreadsheet apps do not carry version information, and the software can be changed by the provider at any time without announcement. It is therefore impossible to reproduce results computed previously.
• The hardware and software environment on which the software is running is not specified, which adds another level of irreproducibility.
• McCullough and Yalta found that the Excel Web App could produce different output from Excel 2010. Anyone moving a spreadsheet between the two applications could be in for a surprise.

The conclusion: use spreadsheets in the cloud at your peril! In fact, I avoid spreadsheets altogether. Anything I want to do can be done better in MATLAB, LaTeX or Emacs ORG mode.


## SIAM Conference on Computational Science and Engineering 2013

As predicted in my preview post, this conference, held on the Boston waterfront, proved to be SIAM’s largest ever, with 1378 attendees. Over 1000 presentations were given in up to 20 parallel minisymposia at a time, but this did mean that there was at least one talk (and usually several) of interest to me in almost every time slot.

One thing I learned from the conference is how widely Python is being used in computational science, especially for solving real world problems involving large amounts of data. This is partly due to its ability to act as the glue between codes written in other languages and web applications. The IPython environment, with its notebook interface, was featured in a number of talks, in some of which the slides were displayed using the notebook.

The following highly selective photos will give a flavour of the conference.

The conference venue. Note the residual snow, which fortunately did not fall in any serious amounts during the conference.

The poster session of about 65 posters was preceded by a poster blitz (1-minute presentations) and was accompanied by an excellent dessert. This photo shows Edvin Deadman (University of Manchester and NAG Ltd.) discussing his poster on Matrix Functions and the NAG Library with Cleve Moler and Charlie Van Loan (authors of the classic Nineteen Dubious Ways to Compute the Exponential of a Matrix paper). For some thoughts on poster sessions by one of the conference attendees see Please, no posters! by David Gleich.

Josh Bloom’s (UC Berkeley) invited presentation Automated Astrophysics in the Big Data Era contained a fascinating mix of observational astronomy, machine learning, robotic telescopes, numerical linear algebra, and Python, with a focus on classifying stars.

It was interesting to see MapReduce being used to implement numerical algorithms, notably in the minisymposium Is MapReduce Good for Science and Simulation Data? organized by Paul Constantine (Stanford; standing) and David Gleich (Purdue; sitting, with pointer).

Here is the lunchtime panel Big Data Meets Big Models being videoed. Highlights from this panel and some of the invited plenary talks will be available in due course on the SIAM Presents YouTube channel.

If you weren’t at the conference perhaps you can make it to the next one in two years’ time (date and location to be announced). In the meantime a good way to keep up with events is to join the SIAM Activity Group on Computational Science and Engineering, which organizes the conference.

Posted in conferences | Tagged , | 1 Comment

## Tiny Relative Errors

Let $x\ne0$ and $y$ be distinct floating point numbers. How small can the relative difference $|x-y|/|x|$ between $x$ and $y$ be? For IEEE double precision arithmetic the answer is $u = 2^{-53} \approx 1.1 \times 10^{-16}$, which is called the unit roundoff.

What if we now let $x$ and $y$ be vectors and define the relative difference as $d(x,y) = \|x-y\|_{\infty}/\|x\|_{\infty}$, where the infinity norm is $\|x\|_{\infty} = \max_i |x_i|$? It does not seem to be well known that $d(x,y)$ can be much less than $u$. I have observed this phenomenon numerous times over the years but had not previously stopped to wonder how it could happen. An example is

$x = \begin{bmatrix}1 \\ 10^{-22} \end{bmatrix}, \quad y = \begin{bmatrix}1 \\ 2\times 10^{-22} \end{bmatrix}, \quad d(x,y) = 10^{-22}$.
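This is easy to verify in MATLAB by evaluating the definition directly:

```matlab
% Normwise relative difference d(x,y) = norm(x-y,inf)/norm(x,inf)
% for the example above; the result is about 1e-22, far below u.
x = [1; 1e-22];  y = [1; 2e-22];
d = norm(x - y, inf)/norm(x, inf)   % d = 1.0000e-22
```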

Notice that the largest element(s) in magnitude of $x$ agrees with the corresponding element(s) of $y$. This is, in fact, the only way that $d(x,y) < u$ is possible.

Theorem (Dingle & Higham, 2013).
Let $x\ne0$ and $y$ be $n$-vectors of normalized floating point numbers.
If $d(x,y) < u$ then $x_k = y_k$ for all $k$ such that $|x_k| = \|x\|_{\infty}$.

This result, and the scalar case mentioned above, are proved in

Nicholas J. Dingle and Nicholas J. Higham, Reducing the Influence of Tiny Normwise Relative Errors on Performance Profiles, MIMS EPrint 2011.90; to appear in ACM Trans. Math. Software.

Performance profiles, introduced by Dolan and Moré in 2002, are a popular way to compare competing algorithms according to particular measures of performance, such as relative error or execution time. In the paper above we show that tiny relative errors can result in misleading performance profiles and that a simple transformation of the data leads to much more useful profiles. For more, see the paper or the talk Numerical Issues in Testing Linear Algebra Algorithms that I gave at the SIAM Conference on Computational Science and Engineering in Boston last week.