How Fast is Quadruple Precision Arithmetic?

170724-2024-49-7501.jpg

The 24th IEEE Symposium on Computer Arithmetic included several talks on multiprecision arithmetic.

When I am testing an algorithm running in double precision arithmetic I often want to compare the computed solution with a reference solution: a solution that is fully accurate to double precision. To obtain one I solve the same problem in quadruple precision and round the result back to double precision. If the conditioning of the problem leads me to doubt that this reference solution will be correct to double precision, I compute it at a precision even higher than quadruple. Whether it is feasible to compute a reference solution in this way depends on the size of the problem and the speed of quadruple precision arithmetic. This is just one scenario where quadruple precision arithmetic is needed; others are discussed in my earlier post The Rise of Mixed Precision Arithmetic.
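For a linear system, for example, this workflow is only a few lines of MATLAB with the Advanpix toolbox used later in this post (an illustrative sketch only, not one of the timed tests below):

%QUAD_PREC_REF  Reference solution computed in quadruple precision (sketch).
rng(1), n = 100;
A = randn(n); b = randn(n,1);
x = A\b;                              % double precision solution under test
x_ref = double(mp(A,34)\mp(b,34));    % quadruple precision solve, rounded back
err = norm(x - x_ref,inf)/norm(x_ref,inf)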

Roughly speaking, a quadruple precision number x can be represented by a pair of double precision numbers (x_1,x_2), where x = x_1 + x_2 and x_1 and x_2 represent the leading and trailing significant digits of x, respectively. Then a product x y can be written (x_1 + x_2)(y_1 + y_2) = x_1y_1 + x_1y_2 + x_2y_1 + x_2y_2, which involves four double precision multiplications and three double precision additions. We can therefore expect quadruple precision multiplication to be at least seven times slower than double precision multiplication. Almost all available implementations of quadruple precision arithmetic are in software (an exception is provided by the IBM z13 mainframe system), so we should expect quadruple precision to be slower than the theoretical prediction would suggest. Bailey and Borwein (2015) state that quadruple precision implemented in software typically runs five to ten times slower than hardware double precision arithmetic.
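To make the representation concrete, here is a minimal MATLAB sketch of a double-double product. Since MATLAB has no scalar fused multiply-add, the leading product x_1y_1 and its rounding error are obtained with Dekker's splitting, so the operation count is higher than the naive count above; this is only an illustration of the idea, not how the toolboxes compared below implement quadruple precision.

function [z1,z2] = dd_mul(x1,x2,y1,y2)
%DD_MUL   Sketch of double-double multiplication: z1 + z2 ~ (x1+x2)*(y1+y2).
[p,e] = two_prod(x1,y1);        % p + e = x1*y1 exactly
e = e + (x1*y2 + x2*y1);        % add cross terms; x2*y2 is below working precision
z1 = p + e;                     % renormalize so z1 carries the leading digits
z2 = e - (z1 - p);

function [h,l] = two_prod(a,b)
%TWO_PROD  Error-free transformation: a*b = h + l exactly (Dekker splitting).
h = a*b;
c = 134217729*a;                % 2^27 + 1
a1 = c - (c - a); a2 = a - a1;
c = 134217729*b;
b1 = c - (c - b); b2 = b - b1;
l = ((a1*b1 - h) + a1*b2 + a2*b1) + a2*b2;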

Various aspects need to be taken into consideration when we compare double precision and quadruple precision arithmetics. First, the relative speed of the arithmetics may depend on the type of operations being performed (scalar versus vectorizable operations), memory traffic considerations, and to what extent the implementations exploit multicore processors. Second, are we to compare the same code running in different precisions or computational kernels coded specially for each precision? I will take the second approach, as it is more relevant to my usage of quadruple precision. However, this does mean that I am comparing algorithms and their implementations as well as the underlying arithmetic.

I have done some very simple comparisons in MATLAB using the VPA arithmetic of the Symbolic Math Toolbox and the MP arithmetic of the third-party Multiprecision Computing Toolbox from Advanpix.

The Multiprecision Computing Toolbox supports IEEE-compliant quadruple precision arithmetic, which is invoked by using the function mp.Digits to set the precision to 34 decimal digits (in fact, this is the default). For the VPA arithmetic in the Symbolic Math Toolbox 34 decimal digit precision is specified with the command digits(34) (the default is 32 digits). According to the documentation, VPA arithmetic uses a few guard digits, that is, it computes with a few extra digits of precision. The number of guard digits cannot be set by the user. In the Multiprecision Computing Toolbox the number of guard digits can be set with the function mp.GuardDigits, and the default is to use no guard digits.
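In code, the two setups just described amount to the following (shown purely for reference):

mp.Digits(34);            % Advanpix: 34 digits, i.e., quadruple precision (the default)
digits(34)                % Symbolic Math Toolbox: 34 significant digits for VPA
x_mp  = mp(1)/mp(3)       % a quadruple precision number
x_vpa = vpa(1)/vpa(3)     % a VPA number (computed with guard digits)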

These experiments were performed on an Intel Broadwell-E Core i7-6800K CPU @ 3.40GHz, which has 6 cores. The software details are:

MATLAB Version: 9.2.0.538062 (R2017a)
Operating System: Microsoft Windows 10 Home Version 10.0
                  (Build 15063)
Advanpix Multiprecision Computing Toolbox   Version 4.4.1.12580   
Symbolic Math Toolbox                       Version 7.2  (R2017a)

Further details on the Multiprecision Computing Toolbox can be seen by typing mp.Info, which lists details of open source libraries that are used.

In these experiments I do not check that the computed results are correct. Such checks are done in more extensive tests available in the test scripts at https://www.advanpix.com/2015/02/12/version-3-8-0-release-notes/. I note that the timings vary from run to run, but it is the orders of magnitude of the ratios that are of interest, and these are correctly reflected by the reported results.

Test 1: LU Factorization

This code compares the speed of the arithmetics for LU factorization of a 250-by-250 matrix.

%QUAD_PREC_TEST1  LU factorization.
rng(1), n = 250;
A = randn(n); A_mp = mp(A,34); A_vpa = vpa(A,34);
t = clock; [L,U] = lu(A);     t_dp = etime(clock, t)
t = clock; [L,U] = lu(A_mp);  t_mp = etime(clock, t)
t = clock; [L,U] = lu(A_vpa); t_vpa = etime(clock, t)
fprintf('mp/double: %8.2e, vpa/double: %8.2e, vpa/mp: %8.2e\n',...
         t_mp/t_dp, t_vpa/t_dp, t_vpa/t_mp)

The output is

t_dp =
   1.0000e-03
t_mp =
   9.8000e-02
t_vpa =
   2.4982e+01
mp/double: 9.80e+01, vpa/double: 2.50e+04, vpa/mp: 2.55e+02

Here, the MP quadruple precision is 98 times slower than double precision, but VPA is 25,000 times slower than double precision.

Test 2: Complete Eigensystem of Nonsymmetric Matrix

%QUAD_PREC_TEST2  Complete eigensystem of nonsymmetric matrix.
rng(1), n = 125;
A = randn(n); A_mp = mp(A,34); A_vpa = vpa(A,34);
t = clock; [V,D] = eig(A);     t_dp = etime(clock, t)
t = clock; [V,D] = eig(A_mp);  t_mp = etime(clock, t)
t = clock; [V,D] = eig(A_vpa); t_vpa = etime(clock, t)
fprintf('mp/double: %8.2e, vpa/double: %8.2e, vpa/mp: %8.2e\n',...
         t_mp/t_dp, t_vpa/t_dp, t_vpa/t_mp)

The output is

t_dp =
   1.8000e-02
t_mp =
   1.3430e+00
t_vpa =
   1.0839e+02
mp/double: 7.46e+01, vpa/double: 6.02e+03, vpa/mp: 8.07e+01

Here, MP is 75 times slower than double, and VPA is closer to the speed of MP.

Test 3: Complete Eigensystem of Symmetric Matrix

%QUAD_PREC_TEST3  Complete eigensystem of symmetric matrix.
rng(1), n = 200;
A = randn(n); A = A + A'; A_mp = mp(A,34); A_vpa = vpa(A,34);
t = clock; [V,D] = eig(A);     t_dp = etime(clock, t)
t = clock; [V,D] = eig(A_mp);  t_mp = etime(clock, t)
t = clock; [V,D] = eig(A_vpa); t_vpa = etime(clock, t)
fprintf('mp/double: %8.2e, vpa/double: %8.2e, vpa/mp: %8.2e\n',...
         t_mp/t_dp, t_vpa/t_dp, t_vpa/t_mp)

The output is

t_dp =
   1.1000e-02
t_mp =
   3.5800e-01
t_vpa =
   1.2246e+02
mp/double: 3.25e+01, vpa/double: 1.11e+04, vpa/mp: 3.42e+02

Note that there are at least three different algorithms that could be used here (the QR algorithm, divide and conquer, and multiple relatively robust representations), so the three eig invocations may be using different algorithms.

Test 4: Componentwise Exponentiation

%QUAD_PREC_TEST4  Componentwise exponentiation.
rng(1), n = 1000;
A = randn(n); A_mp = mp(A,34); A_vpa = vpa(A,34);
t = clock; X = exp(A);     t_dp = etime(clock, t)
t = clock; X = exp(A_mp);  t_mp = etime(clock, t)
t = clock; X = exp(A_vpa); t_vpa = etime(clock, t)
fprintf('mp/double: %8.2e, vpa/double: %8.2e, vpa/mp: %8.2e\n',...
         t_mp/t_dp, t_vpa/t_dp, t_vpa/t_mp)

The output is

t_dp =
   7.0000e-03
t_mp =
   8.5000e-02
t_vpa =
   3.4852e+01
mp/double: 1.21e+01, vpa/double: 4.98e+03, vpa/mp: 4.10e+02

Both MP and VPA come closer to double precision on this problem of computing the scalar exponential at each matrix element.

Summary

Not too much emphasis should be put on the precise timings, which vary with the value of the dimension n. The main conclusions are that 34-digit MP arithmetic is 1 to 2 orders of magnitude slower than double precision and 34-digit VPA arithmetic is 3 to 4 orders of magnitude slower than double precision.

It is worth noting that the documentation for the Multiprecision Computing Toolbox states that the toolbox is optimized for quadruple precision.

It is also interesting to note that the authors of the GNU MPFR library (for multiple-precision floating-point computations with correct rounding) are working on optimizing the library for double precision and quadruple precision computations, as explained in a talk given by Paul Zimmermann at the 24th IEEE Symposium on Computer Arithmetic in July 2017; see the slides and the corresponding paper.

[Updated September 6, 2017 to clarify mp.Info.]

Posted in software | Tagged | Leave a comment

How and How Not to Compute a Relative Error

The relative error in a scalar y as an approximation to a scalar x is the absolute value of e = (x-y)/x. I recently came across a program in which e had been computed as 1 - y/x. It had never occurred to me to compute it this way. The second version is slightly easier to type, requiring no parentheses, and it has the same cost of evaluation: one division and one subtraction. Is there any reason not to use this parenthesis-free expression?

Consider the accuracy of the evaluation, using the standard model of floating point arithmetic, which says that fl(x \mathbin{\mathrm{op}} y) = (x \mathbin{\mathrm{op}} y)(1+\delta) with |\delta| \le u, where \mathrm{op} is any one of the four elementary arithmetic operations and u is the unit roundoff. For the expression e_1 = (x-y)/x we obtain, with a hat denoting a computed quantity,

\widehat{e_1} = \displaystyle\left(\frac{x-y}{x}\right) (1+\delta_1)(1+\delta_2),  \qquad |\delta_i| \le u, \quad i = 1, 2.

It follows that

\left| \displaystyle\frac{e - \widehat{e_1}}{e} \right| \le 2u + u^2.

Hence e_1 is computed very accurately.

For the alternative expression, e_2 = 1 - y/x, we have

\widehat{e_2} = \left(1 - \displaystyle\frac{y}{x}(1+\delta_1)\right) (1+\delta_2),  \qquad |\delta_i| \le u, \quad i = 1, 2.
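Since y/x = 1 - e, this can be written

\widehat{e_2} = \bigl(e - (1-e)\delta_1\bigr)(1+\delta_2), \qquad \widehat{e_2} - e = e\delta_2 - (1-e)\delta_1(1+\delta_2).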

Taking absolute values and dividing by e, we obtain the bound

\left| \displaystyle\frac{e - \widehat{e_2}}{e} \right| \le          u + \left|\displaystyle\frac{1-e}{e}\right|(u + u^2).

The bound on the relative error in \widehat{e_2} is of order |(1-e)/e|u, and hence is very large when |e| \ll 1.

To check these bounds we carried out a MATLAB experiment. For 500 single precision floating point numbers y centered on x = 3, we evaluated the relative error of y as an approximation to x using the two formulas. The results are shown in this figure, where an ideal error is of order u \approx 6\times 10^{-8}. (The MATLAB script that generates the figure is available as this gist.)
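The gist contains the definitive script; the following fragment sketches the computation (the spacing of the y values and the plotting details here are assumptions):

x = single(3);
y = x + single(linspace(-250,250,500))*eps(x);    % 500 single precision values centered on x
e_ref = abs((double(x) - double(y))./double(x));   % reference error, computed in double
e1 = abs((x - y)./x);                              % standard formula
e2 = abs(1 - y./x);                                % parenthesis-free alternative
semilogy(y, abs(e1 - e_ref)./e_ref, '.', y, abs(e2 - e_ref)./e_ref, 'x')
legend('(x-y)/x','1 - y/x'), xlabel('y')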

rel-err-formula.jpg

As expected from the error bounds, the formula 1-y/x is very inaccurate when y is close to x, whereas (x-y)/x retains its accuracy as y approaches x.

Does this inaccuracy matter? Usually, we are concerned only with the order of magnitude of an error and do not require an approximation with many correct significant figures. However, as the figure shows, for the formula |1-y/x| even the order of magnitude is incorrect for y very close to x. The standard formula |x-y|/|x| should be preferred.

Posted in research | Tagged | 2 Comments

SIAM Annual Meeting 2017 Highlights

170712-1821-50-7337.jpg

Emily Shuckburgh delivering the I. E. Block Community Lecture “From Flatland to Our Land: A Mathematician’s Journey through Our Changing Planet”. A recording of her lecture is available at https://www.pathlms.com/siam/courses/4988/sections/7425.

It’s a couple of weeks since the 2017 SIAM Annual Meeting, which I previewed in an earlier post. The meeting was held at the David Lawrence Convention Center in Pittsburgh and was co-chaired by Des Higham (University of Strathclyde) and Jennifer Mueller (Colorado State University).

The approximately 1150 attendees enjoyed five days packed from morning to evening with lectures, panel sessions, careers events, professional development sessions, and other activities.

You can catch up with what went on at the meeting in several ways.

Other Links

Here is a Storify that captures many of the Tweets from the meeting.

Here is a blog post about the meeting by Scott Morgan, the president of the SIAM Student Chapter.

SIAM Presents contains recordings of selected talks delivered in the main ballroom. These include all the invited lectures and prize lectures.

Join us in Portland, Oregon, July 9-13 next year, for the 2018 SIAM Annual Meeting!

Posted in conferences | 1 Comment

Accelerating the Solution of Linear Systems by Iterative Refinement in Three Precisions

by Erin Carson and Nick Higham

lu-ir-hsd.jpg

Half precision LU factorization (H,S,D: Algorithm New) can deliver full single precision accuracy, just like traditional iterative refinement (S,S,D: Algorithm Trad).

With the growing availability of half precision arithmetic in hardware and quadruple precision arithmetic in software, it is natural to ask whether we can harness these different precisions, along with the standard single and double precisions, to solve problems faster or more accurately.

We have been investigating this question for linear systems Ax = b with a nonsingular matrix A, for which the standard solution process is by LU factorization. By making use of iterative refinement, we are able to harness an LU factorization computed in lower precision to solve the problem up to twice as fast and with greater accuracy than with the standard approach.

Iterative refinement is an old technique for improving the accuracy of an approximate solution to a linear system Ax = b. It was used by Wilkinson and his colleagues in the 1940s in the early days of digital computing. The traditional usage employs two precisions, and fixed precision refinement has also been in use since the late 1970s.

We have found that using three different precisions, rather than the usual two, can bring major benefits. A scenario of particular interest is a mix of half precision, single precision, and double precision, with single precision the working precision in which A, b, and the iterates x_i are stored. Here is the traditional algorithm followed by the new algorithm. All computations are in single precision (unit roundoff 6.0 \times 10^{-8}) except where stated.

Algorithm Trad: two-precision refinement (single, double).
Factorize $PA = LU$.
Solve $Ax_0 = b$ using the LU factors.
for $i=0:\infty$
   Form $r_i = b - Ax_i$ in *double precision*
        and round $r_i$ to *single precision*.
   Solve $Ad_i = r_i$ using the LU factors.
   $x_{i+1} = x_i + d_i$.
end

Algorithm New: three-precision refinement (half, single, double).
Factorize $PA = LU$ in *half precision*.
Solve $Ax_0 = b$ using the LU factors at *half precision*.
for $i=0:\infty$
   Form $r_i = b - Ax_i$ in *double precision*
        and round $r_i$ to *half precision*.
   Solve $Ad_i = r_i$ at *half precision*
        using the LU factors; store $d_i$ in *single*.
   $x_{i+1} = x_i + d_i$.
end
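As a concrete, simplified illustration, here is a MATLAB sketch of Algorithm Trad; half precision is not natively available in MATLAB, so Algorithm New is not shown, and the fixed iteration count and names are choices made for this sketch only.

function x = ir_trad(A,b,niter)
%IR_TRAD   Sketch of two-precision iterative refinement (single working
%          precision, double precision residuals). Illustrative only.
As = single(A); bs = single(b);
[L,U,p] = lu(As,'vector');                % LU factorization in single precision
x = U\(L\bs(p));                          % initial solve using the LU factors
for i = 1:niter
    r = single(b - double(A)*double(x));  % residual in double, rounded to single
    d = U\(L\r(p));                       % correction solve using the LU factors
    x = x + d;
end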

Speed

Algorithm Trad does O(n^3) flops at single precision and O(n^2) flops at double precision. Algorithm New, however, does O(n^3) flops at half precision and O(n^2) flops at single and double precision. Both these statements assume, of course, that iterative refinement converges in a small number of iterations. There is therefore a potential two times speedup of Algorithm New over Algorithm Trad, since half precision runs at twice the speed of single precision on (for example) NVIDIA GPUs and AMD GPUs.

Accuracy

Algorithm Trad converges as long as \kappa_{\infty}(A) \le 10^8 and it yields a forward error (defined by \|x-\widehat{x}\|_{\infty}/\|x\|_{\infty}, where \widehat{x} is the computed solution) and a backward error both of order 10^{-8} (as shown by standard analysis). Our new rounding error analysis shows that Algorithm New has the same error bounds, but has the more stringent requirement \kappa_{\infty}(A) \le 10^4 for convergence.
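In MATLAB notation, with x_exact a reference solution and x_hat the computed one (both names illustrative), the two quantities are

ferr = norm(x_exact - x_hat,inf)/norm(x_exact,inf);
berr = norm(b - A*x_hat,inf)/(norm(A,inf)*norm(x_hat,inf) + norm(b,inf));

where the second line is the usual normwise backward error formula.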

GMRES-IR

Now we replace the solve step in the loop of Algorithm New by an application of GMRES to the preconditioned system

\widetilde{A} d_i \equiv \widehat{U}^{-1}\widehat{L}^{-1}Ad_i = \widehat{U}^{-1}\widehat{L}^{-1}r_i,

where matrix–vector products with \widetilde{A} are done at double precision and all other computations are done at single precision. Algorithm New now converges as long as \kappa_{\infty}(A) \le 10^8 and it yields forward and backward errors of order 10^{-8}. In other words, it has the same numerical properties as Algorithm Trad but potentially does half the work (depending on the number of GMRES iterations needed to converge).
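A rough MATLAB sketch of this correction solve, with the preconditioned operator supplied to gmres as a function handle (the tolerance and iteration limit are placeholders, and the handling of precisions is simplified):

Atilde = @(v) U\(L\(double(A)*double(v)));   % matrix-vector product with A in double
d = gmres(Atilde, U\(L\r), [], 1e-4, 10);    % GMRES on the preconditioned system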

Other Choices of Precision

Let H, S, D, and Q denote half precision, single precision, double precision, and quadruple precision, respectively. Algorithm New can be described as “HSD”, where the three letters indicate the precision of the factorization, the working precision, and the precision with which residuals are computed, respectively. Various combinations of letters produce feasible algorithms (20 in all, if we include fixed precision refinement algorithms, such as “SSS”), of which HSD, HSQ, HDQ and SDQ use three different precisions. Similar results to those above apply to the latter three combinations.

Outlook

Our MATLAB experiments confirm the predictions of the error analysis regarding the behavior of Algorithm New and its GMRES-IR variant. They also show that the number of GMRES iterations in GMRES-IR can indeed be small.

Iterative refinement in three precisions therefore offers great promise for speeding up the solution of Ax = b. Instead of solving the system by an LU factorization at the working precision, we can factorize A at half the working precision and apply iterative refinement in three precisions, thereby obtaining a more accurate solution at potentially half the cost.

Full details of this work can be found in

Erin Carson and Nicholas J. Higham, Accelerating the Solution of Linear Systems by Iterative Refinement in Three Precisions, MIMS EPrint 2017.24, Manchester Institute for Mathematical Sciences, The University of Manchester, UK, July 2017.

Posted in research | Leave a comment

Five Examples of Proofreading

proofs-authors-red.jpg

Every writer has also to be a proofreader, whether it be of his or her own drafts or of proofs sent by a publisher. In this post I will give some real-life examples of corrections to proofs. The problems to be corrected are not all errors: some are subtle aspects of the typesetting that need improvement. These examples should give you some ideas on what to look out for the next time you have a set of proofs to inspect.

Example 1

The first example is from proofs of one of my recent papers:

hist16-proof-line191.jpg

The article had been submitted as LaTeX source and it was reasonable to assume that the only differences between the proofs and what we submitted would be in places where a copy editor had imposed the journal style or had spotted a grammatical error. Fortunately, I know from experience not to make that assumption. These two sentences contain two errors introduced during copy editing: the term “Anderson acceleration” has been deleted after “To apply”, and “We denote by unvec” has been changed to “We denote by vec” (making the sentence nonsensical). The moral is never to assume that egregious errors have not been introduced: check everything in journal proofs.

In a similar vein, consider this extract from another set of proofs:

hino16-proof-line154.jpg

There is nothing wrong with the words or equations. The problem is that an unwanted paragraph break has been inserted after equation (2.6), and indeed also before “Only”. This set of proofs contained numerous unwanted added new paragraphs.

Example 2

Here is an extract from the proofs of my recent SIAM Review paper (with Natasa Strabic and Vedran Sego) Restoring Definiteness via Shrinking, with an Application to Correlation Matrices with a Fixed Block:

hss16-p4-proof.jpg

We noticed that the word “how” appears at the end of a line four times within seven lines—an unfortunate coincidence. We suggested that the production editor insert a hard space in the LaTeX source between one or more of the hows and the following word in order to force different line breaks. Here is the result as published:

hss16-p4-final.jpg

Example 3

What’s wrong with this example, from a paper in the 1980s?

besa89-p193.jpg

The phrase “best unknown” should be “best known”!

Example 4

The next example is from a book:

boyd14-p324.jpg

At first sight there is nothing wrong. But the 9z is suspicious: why 9, and why is this term that depends only on z inside the integral? It turns out that the equation should read

k(z) \equiv \frac{2}{z} \int_0^1 \tanh\bigl( z \sin(2\pi t) \bigr) \sin(2\pi t) \,dt.

When you realize that the left parenthesis and the digit 9 share the same key on the keyboard you can start to see how the error might have been made at the typing stage.

Example 5

The final example (from a 2013 issue of Private Eye) is completely different and illustrates a rare phenomenon:

private_eye_river_text1.jpg

If you cannot see anything wrong after a minute or so, click here. This phenomenon, whereby white spaces in successive lines join up to make a snake, is known as rivers of white. The fix, as in Example 2, is to force different line breaks.

Posted in writing | Leave a comment

SIAM Annual Meeting 2017 Preview

It’s a month to the 2017 SIAM Annual Meeting at the David Lawrence Convention Center in Pittsburgh. We’re returning to the location of the 2010 meeting. The meeting is co-chaired by Des Higham (University of Strathclyde) and Jennifer Mueller (Colorado State University).

Here are a few highlights and things it’s useful to know. If you haven’t already made plans to attend it’s not too late to register. Be sure to take in the view from the roof of the convention center, as shown here.

100716-2047-03-0774.jpg

Block Lecture by Emily Shuckburgh

The I. E. Block Community Lecture on Wednesday evening will be given by Emily Shuckburgh on From Flatland to Our Land: A Mathematician’s Journey through Our Changing Planet. Emily, from the British Antarctic Survey, is a co-author of the recent book Climate Change, which she wrote with HRH Prince Charles and Tony Juniper.

Prize Lectures

As always, a number of prize lectures will be given at the meeting. These include the four-yearly James H. Wilkinson Prize in Numerical Analysis and Scientific Computing, which will be awarded to Lek-Heng Lim. His lecture is titled Tensors in Computational Mathematics. See this article about Lek-Heng.

Joint with Activity Group Conferences and Workshops

The meeting is held jointly with the SIAM Conference on Industrial and Applied Geometry (GD17) and the SIAM Conference on Control and Its Applications (CT17), in the same location. One registration fee gains you access to all three meetings!

In addition, the SIAM Workshop on Parameter Space Dimension Reduction (DR17) and the SIAM Workshop on Network Science (NS17) are taking place just before and just after the conference, respectively.

Funding

Funding of mathematics, and other subjects, is in a state of uncertainty under the current US administration. In the minisymposium How Changing Implementations of National Priorities Might Affect Mathematical Funding a panel of representatives from funding agencies will describe the current situation and future opportunities. This is a great chance to hear the latest news from Washington from those in the know.

Students

SIAM provides a host of activities for students, beginning with an orientation session on Sunday evening and including a career fair, a session on career opportunities in business, industry and government (BIG), and the chance to meet and talk to invited speakers and co-chairs.

Hidden Figures

An evening session will include Christine Darden, who was one of the human computers included in the book “Hidden Figures” by Margot Lee Shetterly, on which the recent Hollywood movie of the same title was based.

SIAM Business Meeting

The Business Meeting (Tuesday at 6.15pm) provides an opportunity to hear the president (that’s me!) and SIAM staff report on SIAM’s activities over the past year and to ask questions. The 2017 SIAM Fellows will be recognized, and a reception in their honor follows the Business meeting.

Website

SIAM is developing a new website. A preliminary version will be available on laptops in the exhibit hall for participants to try. Feedback will be much appreciated and SIAM staff will be on hand to receive your comments.

Baseball Match

If you are staying in Pittsburgh on the Friday night, consider attending a baseball match. The Pittsburgh Pirates play the St Louis Cardinals at home at PNC Park on Friday July 14. I went to the Friday match after SIAM AN10 and really enjoyed it; the views from the ground are spectacular.

100716-2057-10-0196.jpg

Twitter

If you are not able to attend you can get a feel for what’s going on by following the hashtag #SIAMAN17 on Twitter.

Pittsburgh

There’s plenty to do and see in Pittsburgh, as the following images illustrate. As well as the impressive bridges over the Allegheny and Monongahela rivers, and some interesting downtown architecture and murals, there’s the Andy Warhol Museum (a short walk from the convention center). Here are some images I took in 2010.

100716-2057-19-0797.jpg 100716-1953-56-0749.jpg 100716-2013-14-0769.jpg 100716-1620-53-0547-Edit.jpg 100716-1800-58-0606.jpg 100716-1525-36-0533-Edit.jpg 100716-1524-16-0529.jpg 100716-1624-56_0551_Edit.jpg

Posted in conferences | Tagged | 2 Comments

A Second Course in Linear Algebra, by Garcia and Horn (2017)

gaho17-cover.jpg

The publication of a new linear algebra textbook is not normally a cause for excitement. However, Roger Horn is co-author of two of the most highly regarded and widely used books on matrix analysis: Matrix Analysis (2nd edition, 2013) and Topics in Matrix Analysis (1991), both co-authored with Charles Johnson. It is therefore to be expected that this new book by Garcia and Horn will offer something special.

Chapter 0 (Preliminaries) summarizes basic concepts and definitions, often stating results without proof (for example, properties of determinants). Chapters 1 (Vector Spaces) and 2 (Bases and Similarity) are described as reviews, but give results with proofs and examples. The second course proper starts with Chapter 3 (Block Matrices). As the chapter title suggests, the book makes systematic use of block matrices to simplify the treatment, and it is very much based on matrices rather than linear transformations.

Two things stand out about this book. First, it lies part-way between a traditional linear algebra text and texts with a numerical linear algebra focus. Thus it includes Householder matrices (but not Givens matrices), QR factorization, and Cholesky factorization. The construction given for QR factorization is essentially the Householder QR factorization, but the existence proof for Cholesky goes via the QR factor of the Hermitian positive definite square root, rather than by constructing the Cholesky factor explicitly via the usual recurrences. The existence of square roots of Hermitian positive definite matrices is proved via the spectral decomposition. It is possible to prove the existence of square roots without using the spectral theorem, and it would have been nice to mention this, at least in an exercise.

The second impressive aspect of the book is the wide, and often quite advanced, range of topics covered, which includes polar decomposition, interlacing results for the eigenvalues of Hermitian matrices, and circulant matrices. Not covered are, for example, Perron–Frobenius theory, the power method, and functions of nonsymmetric matrices (though various special cases are covered, such as the square root of Jordan block, often in the problems). New to me are the QS decomposition of a unitary matrix, Shoda’s theorem on commutators, and the Fuglede–Putnam theorem on normal matrices.

The 16-page index occupies 3.7 percent of the book, which, according to the length criteria discussed in my article A Call for Better Indexes, is unusually thorough. However, there is some over-indexing. For example, the entry permutation consists of 7 subentries all referring to page 10, but “permutation, 10” would have sufficed. An index entry “Cecil Sagehen” puzzled me. It has two page locators: one on which that term does not appear and one for a problem beginning “Cecil Sagehen is either happy or sad”. A little investigation revealed that “Cecil the Sagehen” is the mascot of Pomona College, which is the home institution of the first author.

There is a large collection of problems that go well beyond simple illustration and computation, and it is good to see that the problems are indexed.

Here are some other observations.

  • The singular value decomposition (SVD) is proved via the eigensystem of A^*A. Personally, I prefer the more elegant, if less intuitively obvious, proof in Golub and Van Loan’s Matrix Computations.
  • The treatment of Gershgorin’s theorem occupies six pages, but it omits the practically important result that if k discs form a connected region that is isolated from the other discs then that region contains precisely k eigenvalues.
  • The Cayley-Hamilton theorem is proved by using the Schur form. I would do it either via the minimal polynomial or the Jordan form, but these concepts are introduced only in later chapters.
  • Correlation matrices are mentioned in the preface, but do not appear in the book. They can make excellent examples.
  • The real Schur decomposition is not included, but rather just the special case for a real matrix having only real eigenvalues.
  • Matrix norms are not treated. The Frobenius norm is defined as an inner product norm and, unusually, the 2-norm is defined as the largest singular value of a matrix. There are no index entries for “matrix norm”, “norm, matrix”, “vector norm”, or “norm, vector”.
  • The pseudoinverse of a matrix is defined via the SVD. The Moore-Penrose conditions are not explicitly mentioned.
  • Three pages at the front summarize the notation and point to where terms are defined. Ironically, the oft-used notation M_n for an n \times n matrix is not included.
  • Applications are mentioned only in passing. However, this does keep the book down to a relatively slim 426 pages.

Just as for numerical analysis texts, there will probably never exist a perfect linear algebra text.

The book is very well written and typeset. With its original presentation and choice of content it must be a strong contender for use on any second (or third) course on linear algebra. It can also serve as a reference on matrix theory: look here first and turn to Horn and Johnson if you don’t find what you want. Indeed a surprising amount of material from Horn and Johnson’s books is actually covered, albeit usually in less general form.

Posted in books | 1 Comment

SIAM Books Available Worldwide from Eurospan

eurospan.jpg

SIAM books are now available to individuals, bookstores, and other retailers outside North America from Eurospan, who have taken over the role previously carried out by Cambridge University Press.

This is significant news for those of us outside North America for two reasons.

  • Shipping is free, anywhere in the world.
  • SIAM members get a 30 percent discount on entering a special code at checkout. This code was emailed to all SIAM members in March 2017. If you are a SIAM member and do not have the code, you can get it by contacting SIAM customer service at siambooks@siam.org.

Check out the landing page for SIAM books at the Eurospan website.

SIAM members outside North America now have three options for ordering SIAM books.

  • Eurospan: 30 percent member discount; pay in GBP in the UK, euros in Europe, Australian dollars in Australian territories, US dollars everywhere else; free shipping.
  • SIAM Bookstore: 30 percent member discount; pay in US dollars; free shipping (at least for the near future).
  • Amazon: unpredictable, sometimes very small, discount; pay in currency of the particular site.
Posted in books | Tagged | Leave a comment

Dot Grid Paper for Writing Mathematics

As I discussed in Writing Mathematics in Pencil, and Why Analogue is Not Dead, there is a lot to be said for writing mathematics on paper, at least for early drafts before the material is typed into LaTeX.

There are essentially four types of paper that you might use.

  • Plain paper. Readily available: you can always raid the printer or photocopier. A plain sheet of paper places no constraints on your writing, but it can make it hard to maintain straight lines and a consistent letter height.
  • Ruled paper. The most popular choice. A drawback is that it may be hard to find the perfect line spacing, which can depend on what you are writing or even what pen you are using.
  • Graph, or quadrille, paper. Although aimed at those needing to draw graphs or designs, this paper can be used for general writing, as long as the lines are not so prominent as to be distracting.

There is a fourth type of paper that is less well known, but is becoming more popular: dot grid paper. This paper contains a rectangular array of dots. It is particularly popular with bullet journal enthusiasts.

Could dot grid paper be the perfect choice for writing mathematics? The dots are sufficient to keep your writing straight, but there is less ink on the page to distract you in the way that rules or a graph pattern can. If you need to draw a diagram or graph then the dots are most likely all the guidance you need. And you can draw boxes through groups of four dots to make a to-do list. As explained on the Baron Fig website, dot grid is “there when you need structure, quiet when you don’t”.

A popular supplier of dot grid paper is Rhodia, whose Dot Pads have lightly printed dots at 5mm intervals. The pads are stapled at the top, with the cover pre-scored in order to help it fold around the back of the pad. They also have micro-perforations that make it very easy to tear a page off. Their paper is much-loved by users of fountain pens for its smooth quality and resistance to bleed-through.

For general comments on dot grid paper from the online stationer Bureau Direct, some great photos, and even a flowchart (written on dot grid of course), see 3 Reasons To Switch To Dot Paper.

Here is a sample of mathematics written on Rhodia dot grid paper, using a Tombow Mono 100 4B pencil.

dotgrid-notes.jpg

Of course, you can generate your own dot grid paper with suitable LaTeX code. The following code is adapted from this post on Reddit; it produces this A4 sheet.

\documentclass{article}
\pagenumbering{gobble}
\usepackage[a4paper,hmargin={0mm,3mm},vmargin=5mm]{geometry}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}[scale=.5]
  \foreach \x in {0,...,41}
    \foreach \y in {0,...,57}
    {
      \fill[gray!75] (\x,\y) circle (0.06cm);
    }
\end{tikzpicture}
\end{document}

Give dot grid paper a try. It could be just what you need to unleash your mathematical (or other) creativity.

Posted in writing | Tagged | 2 Comments

Elements of MATLAB Style

140415-1443-12-4032-cropped.jpg

Style is an important aspect of writing, and also of programming. While MATLAB is a quick and easy language in which to program, style should not be neglected. Good style aids readability, which in turn makes it easier to debug and maintain code. It also fosters confidence in the code. Here are six style tips, in which the examples given are all inspired by real-life code that I have come across.

For more on MATLAB style see the book MATLAB Guide, and in particular section 16.1, “Elements of Coding Style”.

1. Omit Unnecessary Parentheses

% Bad style.
x = (A')\b;
y = [1];
D = (diag(x));

The parentheses around A' are unnecessary. Without them, the statement is legal and unambiguous. But of course the parentheses would be necessary in, for example, x = (A*B)'\c;

y is a scalar. There is no need to enter it as a 1-by-1 matrix by putting it in square brackets.

There is no need for parentheses around the diag function.

% Good style.
x = A'\b;
y = 1;
D = diag(x);

2. Use Spaces Consistently

% Bad style.
q=sqrt(2);
a = 1;
Z =zeros(3);
for i=1:10, q=q+1/q; end

Stick to a convention about spacing around equals signs and plus and minus signs. I think spaces make the code more readable.

% Good style.
q = sqrt(2);
a = 1;
Z = zeros(3);
for i = 1:10, q = q + 1/q; end

3. Omit Unnecessary Semicolons and Commas

% Bad style.
figure;
for i = 2:n,
    x(i) = x(i-1)^2*y(i);
end;

Semicolons suppress output, but in this example there is no need for them, since neither the figure command nor end produces any output. The comma would only be necessary if the for loop were collapsed onto one line.

% Good style.
figure
for i = 2:n
    x(i) = x(i-1)^2*y(i);
end

4. Use the Appropriate Construct

The following if statement is perfectly correct.

% Bad style.
if flag == 1
   fprintf('Failed to converge.\n')
elseif flag == 2
   fprintf('Overflow encountered.\n')
elseif flag == 3
   fprintf('Division by zero.\n')
elseif flag == 4
   fprintf('Tolerance too small.\n')
end

But given its regular structure, it is better written as a switch statement, which is shorter, brings out the parallelism in the logic, and is easier to check.

% Good style.
switch flag
   case 1, fprintf('Failed to converge.\n')
   case 2, fprintf('Overflow encountered.\n')
   case 3, fprintf('Division by zero.\n')
   case 4, fprintf('Tolerance too small.\n')
end

5. Omit Trailing Tildes

In the list of output arguments in a function call a tilde signifies that the output argument in that position is not wanted and so can be discarded (and hence need not be computed). This notation was introduced in MATLAB R2009b. Generally, there is no point in providing a tilde as the last output argument, as a well written function will check the number of requested outputs and not compute any trailing outputs that are not requested.

The singular value decomposition (SVD) of a matrix A is computed by the call [U,S,V] = svd(A). In the following example we wish to compute the left singular vectors of A but not the singular values or right singular vectors.

% Bad style.
[U,~,~] = svd(A);

% Good style
[U,~] = svd(A);

An obvious question is why the second example was not shortened to

U = svd(A);

The reason is that with one output argument the svd function has a special behavior: it returns a vector of singular values, not the matrix U of left singular vectors.

6. Include an H1 line in Program Files

The H1 line in a MATLAB program file is a comment line that is the second line in the file and contains the program name followed by a one-line description of the program. It is good practice to provide every program with an H1 line, as MATLAB itself does. Several MATLAB functions make use of H1 lines. For example, when the help function is invoked with a directory name and the directory in question does not contain a contents.m file, a list of contents is created from the H1 lines of the program files in the directory and displayed.

Here is an example of a function with an H1-line (this is an updated version of a function from The Matrix Computation Toolbox).

function show(x)
%SHOW   Display signs of matrix elements.
%   SHOW(X) displays X in `FORMAT +' form, that is,
%   with `+', `-' and  blank representing positive, negative
%   and zero elements respectively.

f = get(0,'Format'); % Get current numerical format.
format +
disp(x)
format(f)            % Restore original numeric format.
Posted in software | Tagged | 2 Comments