## Advances in Numerical Linear Algebra Conference and James Hardy Wilkinson Centenary

This year marks the 100th anniversary of the birth of James Hardy Wilkinson, FRS. Wilkinson developed the theory and practice of backward error analysis for floating-point computation, and developed, analyzed, and implemented in software many algorithms in numerical linear algebra. While much has changed since his untimely passing in 1986, we still rely on the analytic and algorithmic foundations laid by Wilkinson.

This micro website about Wilkinson, set up by Sven Hammarling and me, contains links to all kinds of information about Wilkinson, including audio and video recordings of him.

With Sven Hammarling and Françoise Tisseur, I am organizing a conference Advances in Numerical Linear Algebra: Celebrating the Centenary of the Birth of James H. Wilkinson at the University of Manchester, May 29-30, 2019. Among the 13 invited speakers are several who knew or worked with Wilkinson. As well as focusing on recent developments and future challenges for numerical linear algebra, the talks will include reminiscences about Wilkinson and discuss his legacy.

Contributed talks and posters are welcome (deadline April 1, 2019) and some funding is available to support the attendance of early career researchers and PhD students.

## Who Invented the Matrix Condition Number?

The condition number of a matrix is a well known measure of ill conditioning that has been in use for many years. For an $n\times n$ matrix $A$ it is $\kappa(A) = \|A\| \|A^{-1}\|$, where $\|\cdot\|$ is any matrix norm. If $A$ is singular we usually regard the condition number as infinite.

The first occurrences of the term “condition number” and of the formula $\kappa(A) = \|A\| \|A^{-1}\|$ that I am aware of are in Turing’s 1948 paper Rounding-Off Errors in Matrix Processes. He defines the $M$-condition number $n\|A\|_M \|A^{-1}\|_M$ and the $N$-condition number $n^{-1}\|A\|_F \|A^{-1}\|_F$, where $\|A\|_M = \max_{i,j}|a_{ij}|$ and $\|A\|_N = (\sum_{i,j}|a_{ij}|^2)^{1/2}$, the latter N-norm being what we now call the Frobenius norm. He suggests using these condition numbers to measure the ill conditioning of a matrix with respect to linear systems, using a statistical argument to make the connection. He also notes that “the best conditioned matrices are the orthogonal ones”.

In his 1963 book Rounding Errors in Algebraic Processes, Wilkinson credits the first use of “condition number” to Turing and notes that “the term ill-condition’ had been in common use among numerical analysts for some considerable time before this”. An early mention of linear equations being ill conditioned is in the 1933 paper An Electrical Calculating Machine by Mallock. According to Croarken, Mallock’s machine “could not adequately deal with ill conditioned equations, letting out a very sharp whistle when equilibrium could not be reached”.

As noted by Todd (The Condition of a Certain Matrix, 1950), von Neumann and Goldstine (in their monumental 1947 paper Numerical Inverting of Matrices of High Order) and Wittmeyer (1936) used the ratio of largest to smallest eigenvalue of a positive definite matrix in their analyses, which amounts to the 2-norm condition number $\kappa_2(A) = \|A\|_2 \|A^{-1}\|_2$, though this formula is not used by these authors. Todd called this the $P$ condition number. None of the $M$, $N$ or $P$ names have stuck.

Nowadays we know that $\kappa(A)$ can be thought of both as a measure of the sensitivity of the solution of a linear system to perturbations in the data and as a measure of the sensitivity of the matrix inverse to perturbations in the matrix (see, for example, Condition Numbers and Their Condition Numbers by D. J. Higham). How to formulate the definition of condition number for a wide class of problems was worked out by John Rice in his 1966 paper A Theory of Condition.

## Reflections on a SIAM Presidency

The Drexel Dragon, on the Drexel University campus a couple of blocks from SIAM.

Face Fragment (1975) sculpture, a block from SIAM.

My two-year term as SIAM President ended on December 31, 2018. It’s been an exciting and enjoyable two years, not least because of the excellent SIAM staff, leadership and other volunteers I’ve worked with.

My blog post Taking Up the SIAM Presidency and my SIAM News article Evolving and Innovating set out some ideas that I wanted to pursue during my two years as president. I will not attempt to review these here, but just list five highlights from the last two years.

• We held the SIAM ADVANCE in April 2018: a two-day strategic planning workshop attended by 25 officers, staff, and other members of the SIAM community. The many ideas that emerged from the event are summarized in an 80-page report provided to the SIAM Council and Board of Trustees. Many of these have already been acted upon, others are in progress, and yet more will be considered in the future. My SIAM News article Advancing SIAM gives more details of the workshop.
• A new journal SIAM Journal on Mathematics of Data Science was created. The first issue will be published in the first few months of 2019.
• A new SIAM book series Data Science was created.
• A new SIAG, the SIAM Activity Group on Applied and Computational Discrete Algorithms was approved and will begin operation in 2019.
• The new SIAM website was launched (in June 2018).

Here is a summary of my presidency in numbers:

• 12 trips to the USA (with 0 upgrades from economy class to business class).
• 8 visits to SIAM headquarters and 1 SIAM staff meeting attended.
• 20 “From the SIAM President” columns written for SIAM News: they are listed here.
• 2 SIAM Council Meetings chaired and 4 SIAM Board meetings attended.
• 1 ICIAM board meeting attended and 1 ICIAM board meeting and workshop hosted by SIAM in Philadelphia.
• 2 meetings of the Joint Policy Board for Mathematics in Washington chaired and 2 attended.
• Over 230 appointments made to committees and of candidates for elections (with the advice of various SIAM committees).

## Top Ten Posts of 2018

According to the WordPress statistics, this blog received over 42,000 visitors and 73,000 views in 2018. These are the ten most-viewed posts published during the year.

In addition, the article Differentiation With(out) a Difference in my SIAM News “From the SIAM President” column was the most viewed SIAM News article of 2018.

## Half Precision Arithmetic: fp16 Versus bfloat16

The 2008 revision of the IEEE Standard for Floating-Point Arithmetic introduced a half precision 16-bit floating point format, known as fp16, as a storage format. Various manufacturers have adopted fp16 for computation, using the obvious extension of the rules for the fp32 (single precision) and fp64 (double precision) formats. For example, fp16 is supported by the NVIDIA P100 and V100 GPUs and the AMD Radeon Instinct MI25 GPU, as well as the A64FX Arm processor that will power the Fujitsu Post-K exascale computer.

## Bfloat16

Fp16 has the drawback for scientific computing of having a limited range, its largest positive number being $6.55 \times 10^4$. This has led to the development of an alternative 16-bit format that trades precision for range. The bfloat16 format is used by Google in its tensor processing units. Intel, which plans to support bfloat16 in its forthcoming Nervana Neural Network Processor, has recently (November 2018) published a white paper that gives a precise definition of the format.

The allocation of bits to the exponent and significand for bfloat16, fp16, and fp32 is shown in this table, where the implicit leading bit of a normalized number is counted in the significand.

Format Significand Exponent
bfloat16 8 bits 8 bits
fp16 11 bits 5 bits
fp32 24 bits 8 bits

Bfloat16 has three fewer bits in the significand than fp16, but three more in the exponent. And it has the same exponent size as fp32. Consequently, converting from fp32 to bfloat16 is easy: the exponent is kept the same and the significand is rounded or truncated from 24 bits to 8; hence overflow and underflow are not possible in the conversion.

On the other hand, when we convert from fp32 to the much narrower fp16 format overflow and underflow can readily happen, necessitating the development of techniques for rescaling before conversion—see the recent EPrint Squeezing a Matrix Into Half Precision, with an Application to Solving Linear Systems by me and Sri Pranesh.

The drawback of bfloat16 is its lesser precision: essentially 3 significant decimal digits versus 4 for fp16. The next table shows the unit roundoff $u$, smallest positive (subnormal) number xmins, smallest normalized positive number xmin, and largest finite number xmax for the three formats.

$u$ xmins xmin xmax
bfloat16 3.91e-03 (*) 1.18e-38 3.39e+38
fp16 4.88e-04 5.96e-08 6.10e-05 6.55e+04
fp32 5.96e-08 1.40e-45 1.18e-38 3.40e+38

(*) Unlike the fp16 format, Intel’s bfloat16 does not support subnormal numbers. If subnormal numbers were supported in the same way as in IEEE arithmetic, xmins would be 9.18e-41.

The values in this table (and those for fp64 and fp128) are generated by the MATLAB function float_params that I have made available on GitHub and at MathWorks File Exchange.

## Harmonic Series

An interesting way to compare these different precisions is in summation of the harmonic series $1 + 1/2 + 1/3 + \cdots$. The series diverges, but when summed in the natural order in floating-point arithmetic it converges, because the partial sums grow while the addends decrease and eventually the addend is small enough that it does not change the partial sum. Here is a table showing the computed sum of the harmonic series for different precisions, along with how many terms are added before the sum becomes constant.

Arithmetic Computed Sum Number of terms
bfloat16 $5.0625$ $65$
fp16 $7.0859$ $513$
fp32 $15.404$ $2097152$
fp64 $34.122$ $2.81\dots\times 10^{14}$

The differences are striking! I determined the first three values in MATLAB. The fp64 value is reported by Malone based on a computation that took 24 days, and he also gives analysis to estimate the limiting sum and corresponding number of terms for fp64.

The NVIDIA V100 has tensor cores that can carry out the computation D = C + A*B in one clock cycle for 4-by-4 matrices A, B, and C with only one rounding error; this is a 4-by-4 fused multiply-add (FMA) operation. Moreover, C and D can be in fp32. The benefits that the speed and accuracy of the tensor cores can bring over plain fp16 is demonstrated in Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers.

Intel’s bfloat16 format supports a scalar FMA d = c + a*b, where c and d are in fp32.

## Conclusion

A few years ago we had just single precision and double precision arithmetic. With the introduction of fp16 and fp128 in the IEEE standard in 2008, and now bfloat16 by Google and Intel, the floating-point landscape is becoming much more interesting.

Posted in research | Tagged , , | 8 Comments

## What’s New in MATLAB R2018b?

The MATLAB R2018b release notes report a number of performance improvements, including faster startup and faster calls to built-in functions. I pick out here a few other highlights from the release (excluding the toolboxes) that are of interest from the numerical analysis point of view.

The new xline and yline functions add vertical or horizontal lines to a plot—something that I have needed from time to time and have previously had to produce with a line or a plot command, together with hold on. For example, xline(pi) plots a vertical line at x = pi.

The stackedplot function is mainly of interest for plotting multiple variables from a table in a single plot, but it can also plot the columns of a matrix. In this example, A is the symmetric eigenvector matrix for the second difference matrix:

A = gallery('orthog',10,1); stackedplot(A);


The resulting plot clearly shows that the number of sign changes increases with the column index.

String arrays, introduced in R2016b, are now supported throughout MATLAB. In the previous example I could have typed A = gallery("orthog",10,1).

A new option 'all' to the functions all, any, max, min, prod, sum (and a few others) makes them operate on all the dimensions of the array. For example:

>> A = pascal(3)
A =
1     1     1
1     2     3
1     3     6
>> max(A,[],'all')
ans =
6
>> [prod(A,'all'), sum(A,'all')]
ans =
108    19


The empty second argument in the max call is needed because max(x,y) is also a supported usage. This is a useful addition. I have often written norm(A(:),inf) to avoid max(max(abs(A))) (which does not work for arrays with more than two dimensions), but now I can write max(abs(A),[],'all') without incurring the overhead of forming A(:).

New functions sinpi and cospi plot the sine and cosine functions at the specified multiple of $\pi$. Thus sinpi(x) is the same as sin(pi*x) except that it does not explicitly compute pi*x and so returns exact answers for integer arguments:

>> [sin(pi)         sinpi(1)]
ans =
1.2246e-16            0


## Bohemian Matrices in Manchester

Bohemian matrices are families of matrices in which the entries are drawn from a fixed discrete set of small integers (or some other discrete set). The term is a contraction of BOunded HEight Matrix of Integers and was coined by Rob Corless and Steven Thornton of the University of Western Ontario. Such matrices arise in many situations:

• adjacency matrices of graphs have entries from $\{0, 1\}$;
• Bernoulli matrices, which occur in compressed sensing, have entries from $\{-1,1\}$;
• Hadamard matrices have entries from $\{-1,1\}$ and orthogonal columns; and
• matrices with elements from $\{-1, 0, 1\}$ provide worst case growth factors for Gaussian elimination with partial pivoting and yield the most ill conditioned triangular matrices with elements bounded by $1$.

Rob’s group have done extensive computations of eigenvalues and characteristic polynomials of Bohemian matrices, which have led to interesting observations and conjectures. Many beautiful visualizations are collected on the website http://www.bohemianmatrices.com as well as on the Bohemian Matrices Twitter feed.

In June 2018, Rob and I organized a 3-day workshop Bohemian Matrices and Applications, bringing together 16 people with an interest in the subject from a variety of backgrounds. The introductory talks by Rob, Steven, and I were videod (and are embedded below), and the slides from most of the talks are available on the conference website.

We scheduled plenty of time for discussion and working together. New collaborations were started, several open problems were solved and numerous new questions were posed.

The workshop has led to various strands of ongoing work. Steven has created the Characteristic Polynomial Database, which contains more than $10^9$ polynomials from more than $10^{12}$ Bohemian matrices and has led to a number of conjectures concerning matches of properties to sequences at the On-Line Encyclopedia of Integer Sequences. Three recent outputs are

Sponsorship of the workshop by the Heilbronn Institute for Mathematical Research, the Royal Society and the School of Mathematics at the University of Manchester, as well as support from Canada for some of the Canadian participants, is gratefully acknowledged.

Eunice Chan (University of Western Ontario)

## Conversations with Gil Strang

At JuliaCon 2018 in London, one of the keynote presentations was a conversation with Gil Strang led by Alan Edelman and Pontus Stenetorp. Gil is well known for his many books on linear algebra, applied mathematics and numerical analysis, as well as his research contributions.

Gil talked about his famous 18.06 linear algebra course at MIT, which in its YouTube form has had almost 3 million views. A number of people in the audience commented that they had learned linear algebra from Gil.

Gil also talked about his book Linear Algebra and Learning from Data, due to be published at the end of 2018, which includes chapters on deep neural networks and back propagation. Many people will want to read Gil’s insightful slant on these topics (see also my SIAM News article The World’s Most Fundamental Matrix Equation).

As well as the video of Gil’s interview embedded below, two written interviews will be of interest to Gil’s fans:

## Tricks and Tips in Numerical Computing

In a keynote talk at JuliaCon 2018 I described a variety of tricks, tips and techniques that I’ve found useful in my work in numerical computing.

The first part of the talk was about two aspects of complex arithmetic: the complex step method for approximating derivatives, and understanding multivalued functions with the help of the unwinding number. Then I talked about the role of the associativity of matrix multiplication, which turns out to be the key property that makes the Sherman-Morrison formula work (this formula gives the inverse of a matrix after a rank 1 update). I pointed out the role of associativity in backpropagation in neural networks and deep learning.

After giving an example of using randomization to avoid pathological cases, I discussed why low precision (specifically half precision) arithmetic is of growing interest and identified some issues that need to be overcome in order to make the best use of it.

Almost every talk at JuliaCon was livecast on YouTube, and these talks are available to watch on the Julia Language channel. The slides for my talk are available here.

Also at the conference, my PhD student Weijian Zhang spoke about the work he has been doing on evolving graphs in his PhD.

Posted in conferences | Tagged | 2 Comments

## What’s New in MATLAB R2018a?

MATLAB R2018a was released in March 2018. With each biannual release I try to give a brief overview of the changes in MATLAB (not the toolboxes) that are of most interest to me. These are not comprehensive summaries of what’s new and you should check the release notes for full details.

Complex empty arrays, such as that generated with complex([]) now have an (empty) imaginary part instead of being real.

% R2017b
>> complex([])
ans =
[]

% R2018a
>> complex([])
ans =
0×0 empty complex double matrix


This is a consequence of a change in the way MATLAB stores complex arrays. Prior to R2018a it stored the real parts together and the imaginary parts together. In R2081 it uses an interleaved format in which the real and imaginary parts of a number are stored together; this is the storage format used in the C and C++ languages. For more details see MATLAB Support for Interleaved Complex API in C MEX Functions. Unless you write MEX files or create complex empty arrays, this change should have no effect on you.

The Live Editor continues to gain improved functionality, including embedded sliders and drop-down menus.

Legends can now have multiple columns, specified with the NumColumns property for Legend objects. I must admit that I had not realized that is was already possible to arrange legends in a horizontal (1-by-n) rather than vertical orientation (n-by-1) using the Orientation property.

Tall arrays have several new capabilities, including the ability to compute a least squares solution to an overdetermined linear system Ax = b with a tall A, which is done by QR factorization.

MATLAB starts up faster and has further optimizations in the execution engine.

The GraphPlot Object has some additional options for the 'force', 'force3', and 'circle'` layouts.

I noticed that the recent Windows 10 Spring Creators Update broke the Symbolic Toolbox. An update to fix this (and various other issues) is available at this page (MathWorks login required).

For a concise but wide-ranging introduction and reference to MATLAB, see MATLAB Guide (third edition, 2017)

Posted in software | Tagged | 3 Comments