Today is the 80th birthday of Cleve Moler. Most readers will know Cleve as the inventor of MATLAB. MATLAB was originally a Fortran program that Cleve wrote to give students easier access to the EISPACK and LINPACK libraries. Cleve and John Little later founded The MathWorks and turned MATLAB into a commercial product.

Cleve was a speaker at the conference Advances in Numerical Linear Algebra: Celebrating the Centenary of the Birth of James H. Wilkinson held in Manchester last May. We took the opportunity there to celebrate Cleve’s birthday in advance. A video of Cleve’s talk is included below.

Most recently I saw Cleve at the International Congress on Industrial and Applied Mathematics (ICIAM) conference in Valencia, where he spoke in the minisymposium *Bohemian Matrices and Applications* that I organized with Rob Corless. See Cleve’s blog post about his talk.

Happy birthday, Cleve!

(Photo credits: Gian Maria Negri Porzio.)

The tapes contain four talks by Wilkinson and one by Cleve Moler from an Eigensystem Workshop held at Argonne National Laboratory, Illinois, USA, in June 1973.

The tapes are Scotch UC60 High Energy color Videocassettes, in U-matic format, one of which is shown to the right. Chris was able to have the tapes digitized and the results are pretty good given the age of the tapes. We have put the videos on the Numerical Linear Algebra Group YouTube channel and link to them below.

The videos were announced at the recent conference Advances in Numerical Linear Algebra: Celebrating the Centenary of the Birth of James H. Wilkinson, held in Manchester May 29-30, 2019.

- Elementary Transformations Used in Practical Linear Algebra – James H. Wilkinson, Eigensystem Workshop, Argonne National Laboratory, Argonne, Illinois, USA, June 1973.

- Algorithms for Non-Sparse Real Symmetric Matrices – James H. Wilkinson, Eigensystem Workshop, Argonne National Laboratory, Argonne, Illinois, USA, June 1973.

- Inverse Iteration – James H. Wilkinson, Eigensystem Workshop, Argonne National Laboratory, Argonne, Illinois, USA, June 1973.

- Detailed Study of the QR, QL Algorithms – James H. Wilkinson, Eigensystem Workshop, Argonne National Laboratory, Argonne, Illinois, USA, June 1973.

- Fortran Implementation of EISPACK – Cleve Moler, Eigensystem Workshop, Argonne National Laboratory, Argonne, Illinois, USA, June 1973.

The Royal Society discussion meeting Numerical Algorithms for High-Performance Computational Science, which I organized with Jack Dongarra and Laura Grigori, was held in London, April 8-9, 2019. The meeting had 16 invited speakers and 23 contributed posters, with a poster blitz preceding the lively poster session.

The meeting brought together around 150 people who develop and implement numerical algorithms and software libraries—numerical analysts, computer scientists, and high-performance computing researchers—with those who use them in some of today’s most challenging applications. The aim was to discuss the challenges brought about by evolving computer architectures and the characteristics and requirements of applications, both to produce greater awareness of algorithms that are currently available or under development and to steer research on future algorithms.

Several key themes emerged across multiple talks, all in the context of today’s high performance computing landscape in which processor clock speeds have stagnated (with the end of Moore’s law) and exascale machine are just two or three years away.

- An important way of accelerating computations is through the use of low precision floating-point arithmetic—in particular by exploiting a hierarchy of precisions.
- We must exploit low rank matrix structure where it exists, for example in hierarchical (H-matrix) form, combining it with randomized approximations.
- Minimizing data movement (communication) is crucial, because of its increasing costs relative to the costs of floating-point arithmetic.
- Co-design (the collaborative and concurrent development of hardware, software, and numerical algorithms, with knowledge of applications) is increasingly important for numerical computing.

Applications treated included deep learning, climate modeling, computational biology, and the Square Kilometre Array (SKA) radio telescope project. We learned that the latter project needs faster FFTs in order to deal with the petabytes of data it will generate each year.

Papers from the meeting will be published in a future issue of *Philosophical Transactions of the Royal Society A*.

Here are the talks (in order of presentation), with links to the slides.

- Hierarchical algorithms on hierarchical architectures, David E. Keyes
- Antisocial parallelism: avoiding, hiding and managing communication, Katherine Yelick
- Big telescope, big data: towards exa-scale with the SKA, Anna Scaife
- Algorithms for in situ data analytics in next generation molecular dynamics workflows, Michela Taufer
- Numerical computing challenges for large-scale deep learning, Rick Stevens
- High-performance sampling of determinantal point processes, Jack Poulson
- Computing beyond the end of Moore’s Law, John Shalf
- Exascale applications: skin in the game, Douglas B. Kothe
- Machine learning and big scientific data, Tony Hey
- Iterative linear algebra in the exascale era, Erin Carson
- Stochastic rounding and reduced-precision fixed-point arithmetic for solving neural ODEs, Steve Furber
- Rethinking deep learning: architectures and algorithms, George Constantinides
- Reduced numerical precision and imprecise computing for ultra-accurate next-generation weather and climate models, Tim Palmer
- Memory-aware algorithms for automatic differentiation and backpropagation, Guillaume Aupy
- Post-K: the first ‘exascale’ supercomputer for convergence of HPC and big data/AI, Satoshi Matsuoka
- Accelerated sparse linear algebra: emerging challenges and capabilities for numerical algorithms and software, Michael A. Heroux

For a brief summary of the talks see George Constantinides’s insightful thoughts on the meeting.

Great turnout for @royalsociety discussion meeting. Kicking off with @nhigham introducing. pic.twitter.com/lNeQld2RN7

— G A Constantinides (@gconstantinides) April 8, 2019

Here are some photos from the meeting.

This year marks the 100th anniversary of the birth of James Hardy Wilkinson, FRS. Wilkinson developed the theory and practice of backward error analysis for floating-point computation, and developed, analyzed, and implemented in software many algorithms in numerical linear algebra. While much has changed since his untimely passing in 1986, we still rely on the analytic and algorithmic foundations laid by Wilkinson.

This micro website about Wilkinson, set up by Sven Hammarling and me, contains links to all kinds of information about Wilkinson, including audio and video recordings of him.

With Sven Hammarling and Françoise Tisseur, I am organizing a conference Advances in Numerical Linear Algebra: Celebrating the Centenary of the Birth of James H. Wilkinson at the University of Manchester, May 29-30, 2019. Among the 13 invited speakers are several who knew or worked with Wilkinson. As well as focusing on recent developments and future challenges for numerical linear algebra, the talks will include reminiscences about Wilkinson and discuss his legacy.

Contributed talks and posters are welcome (deadline April 1, 2019) and some funding is available to support the attendance of early career researchers and PhD students.

]]>The condition number of a matrix is a well known measure of ill conditioning that has been in use for many years. For an matrix it is , where is any matrix norm. If is singular we usually regard the condition number as infinite.

The first occurrences of the term “condition number” and of the formula that I am aware of are in Turing’s 1948 paper Rounding-Off Errors in Matrix Processes. He defines the -condition number and the -condition number , where and , the latter N-norm being what we now call the Frobenius norm. He suggests using these condition numbers to measure the ill conditioning of a matrix with respect to linear systems, using a statistical argument to make the connection. He also notes that “the best conditioned matrices are the orthogonal ones”.

In his 1963 book *Rounding Errors in Algebraic Processes*, Wilkinson credits the first use of “condition number” to Turing and notes that “the term `ill-condition’ had been in common use among numerical analysts for some considerable time before this”. An early mention of linear equations being ill conditioned is in the 1933 paper An Electrical Calculating Machine by Mallock. According to Croarken, Mallock’s machine “could not adequately deal with ill conditioned equations, letting out a very sharp whistle when equilibrium could not be reached”.

As noted by Todd (The Condition of a Certain Matrix, 1950), von Neumann and Goldstine (in their monumental 1947 paper Numerical Inverting of Matrices of High Order) and Wittmeyer (1936) used the ratio of largest to smallest eigenvalue of a positive definite matrix in their analyses, which amounts to the 2-norm condition number , though this formula is not used by these authors. Todd called this the condition number. None of the , or names have stuck.

Nowadays we know that can be thought of both as a measure of the sensitivity of the solution of a linear system to perturbations in the data and as a measure of the sensitivity of the matrix inverse to perturbations in the matrix (see, for example, Condition Numbers and Their Condition Numbers by D. J. Higham). How to formulate the definition of condition number for a wide class of problems was worked out by John Rice in his 1966 paper A Theory of Condition.

]]>My two-year term as SIAM President ended on December 31, 2018. It’s been an exciting and enjoyable two years, not least because of the excellent SIAM staff, leadership and other volunteers I’ve worked with.

My blog post Taking Up the SIAM Presidency and my SIAM News article Evolving and Innovating set out some ideas that I wanted to pursue during my two years as president. I will not attempt to review these here, but just list five highlights from the last two years.

- We held the SIAM ADVANCE in April 2018: a two-day strategic planning workshop attended by 25 officers, staff, and other members of the SIAM community. The many ideas that emerged from the event are summarized in an 80-page report provided to the SIAM Council and Board of Trustees. Many of these have already been acted upon, others are in progress, and yet more will be considered in the future. My SIAM News article Advancing SIAM gives more details of the workshop.
- A new journal SIAM Journal on Mathematics of Data Science was created. The first issue will be published in the first few months of 2019.
- A new SIAM book series Data Science was created.
- A new SIAG, the SIAM Activity Group on Applied and Computational Discrete Algorithms was approved and will begin operation in 2019.
- The new SIAM website was launched (in June 2018).

Here is a summary of my presidency in numbers:

- 12 trips to the USA (with 0 upgrades from economy class to business class).
- 8 visits to SIAM headquarters and 1 SIAM staff meeting attended.
- 20 “From the SIAM President” columns written for SIAM News: they are listed here.
- 2 SIAM Council Meetings chaired and 4 SIAM Board meetings attended.
- 1 ICIAM board meeting attended and 1 ICIAM board meeting and workshop hosted by SIAM in Philadelphia.
- 2 meetings of the Joint Policy Board for Mathematics in Washington chaired and 2 attended.
- Over 230 appointments made to committees and of candidates for elections (with the advice of various SIAM committees).

According to the WordPress statistics, this blog received over 42,000 visitors and 73,000 views in 2018. These are the ten most-viewed posts published during the year.

- How to Program log z
- Tricks and Tips in Numerical Computing
- Conversations with Gil Strang
- What’s New in MATLAB R2018a?
- Numerical Linear Algebra Group 2017
- Bohemian Matrices in Manchester
- Half Precision Arithmetic: fp16 Versus bfloat16
- Palomino Blackwing Pencil Tribute to Ada Lovelace
- Joan E. Walsh (1932–2017)
- Lectures on Multiprecision Algorithms in Kácov

In addition, the article Differentiation With(out) a Difference in my SIAM News “From the SIAM President” column was the most viewed SIAM News article of 2018.

]]>The 2008 revision of the IEEE Standard for Floating-Point Arithmetic introduced a half precision 16-bit floating point format, known as fp16, as a storage format. Various manufacturers have adopted fp16 for computation, using the obvious extension of the rules for the fp32 (single precision) and fp64 (double precision) formats. For example, fp16 is supported by the NVIDIA P100 and V100 GPUs and the AMD Radeon Instinct MI25 GPU, as well as the A64FX Arm processor that will power the Fujitsu Post-K exascale computer.

Fp16 has the drawback for scientific computing of having a limited range, its largest positive number being . This has led to the development of an alternative 16-bit format that trades precision for range. The bfloat16 format is used by Google in its tensor processing units. Intel, which plans to support bfloat16 in its forthcoming Nervana Neural Network Processor, has recently (November 2018) published a white paper that gives a precise definition of the format.

The allocation of bits to the exponent and significand for bfloat16, fp16, and fp32 is shown in this table, where the implicit leading bit of a normalized number is counted in the significand.

Format | Significand | Exponent |
---|---|---|

bfloat16 | 8 bits | 8 bits |

fp16 | 11 bits | 5 bits |

fp32 | 24 bits | 8 bits |

Bfloat16 has three fewer bits in the significand than fp16, but three more in the exponent. And it has the same exponent size as fp32. Consequently, converting from fp32 to bfloat16 is easy: the exponent is kept the same and the significand is rounded or truncated from 24 bits to 8; hence overflow and underflow are not possible in the conversion.

On the other hand, when we convert from fp32 to the much narrower fp16 format overflow and underflow can readily happen, necessitating the development of techniques for rescaling before conversion—see the recent EPrint Squeezing a Matrix Into Half Precision, with an Application to Solving Linear Systems by me and Sri Pranesh.

The drawback of bfloat16 is its lesser precision: essentially 3 significant decimal digits versus 4 for fp16. The next table shows the unit roundoff , smallest positive (subnormal) number xmins, smallest normalized positive number xmin, and largest finite number xmax for the three formats.

xmins | xmin | xmax | ||
---|---|---|---|---|

bfloat16 | 3.91e-03 | (*) | 1.18e-38 | 3.39e+38 |

fp16 | 4.88e-04 | 5.96e-08 | 6.10e-05 | 6.55e+04 |

fp32 | 5.96e-08 | 1.40e-45 | 1.18e-38 | 3.40e+38 |

(*) Unlike the fp16 format, Intel’s bfloat16 does not support subnormal numbers. If subnormal numbers were supported in the same way as in IEEE arithmetic, xmins would be 9.18e-41.

The values in this table (and those for fp64 and fp128) are generated by the MATLAB function `float_params`

that I have made available on GitHub and at MathWorks File Exchange.

An interesting way to compare these different precisions is in summation of the harmonic series . The series diverges, but when summed in the natural order in floating-point arithmetic it converges, because the partial sums grow while the addends decrease and eventually the addend is small enough that it does not change the partial sum. Here is a table showing the computed sum of the harmonic series for different precisions, along with how many terms are added before the sum becomes constant.

Arithmetic | Computed Sum | Number of terms |
---|---|---|

bfloat16 | ||

fp16 | ||

fp32 | ||

fp64 |

The differences are striking! I determined the first three values in MATLAB. The fp64 value is reported by Malone based on a computation that took 24 days, and he also gives analysis to estimate the limiting sum and corresponding number of terms for fp64.

The NVIDIA V100 has tensor cores that can carry out the computation D = C + A*B in one clock cycle for 4-by-4 matrices A, B, and C with only one rounding error; this is a 4-by-4 fused multiply-add (FMA) operation. Moreover, C and D can be in fp32. The benefits that the speed and accuracy of the tensor cores can bring over plain fp16 is demonstrated in Harnessing GPU Tensor Cores for Fast FP16 Arithmetic to Speed up Mixed-Precision Iterative Refinement Solvers.

Intel’s bfloat16 format supports a scalar FMA d = c + a*b, where c and d are in fp32.

A few years ago we had just single precision and double precision arithmetic. With the introduction of fp16 and fp128 in the IEEE standard in 2008, and now bfloat16 by Google and Intel, the floating-point landscape is becoming much more interesting.

The MATLAB R2018b release notes report a number of performance improvements, including faster startup and faster calls to built-in functions. I pick out here a few other highlights from the release (excluding the toolboxes) that are of interest from the numerical analysis point of view.

The new `xline`

and `yline`

functions add vertical or horizontal lines to a plot—something that I have needed from time to time and have previously had to produce with a `line`

or a `plot`

command, together with `hold on`

. For example, `xline(pi)`

plots a vertical line at `x = pi`

.

The `stackedplot`

function is mainly of interest for plotting multiple variables from a table in a single plot, but it can also plot the columns of a matrix. In this example, `A`

is the symmetric eigenvector matrix for the second difference matrix:

A = gallery('orthog',10,1); stackedplot(A);

The resulting plot clearly shows that the number of sign changes increases with the column index.

String arrays, introduced in R2016b, are now supported throughout MATLAB. In the previous example I could have typed `A = gallery("orthog",10,1)`

.

A new option `'all'`

to the functions `all`

, `any`

, `max`

, `min`

, `prod`

, `sum`

(and a few others) makes them operate on all the dimensions of the array. For example:

>> A = pascal(3) A = 1 1 1 1 2 3 1 3 6 >> max(A,[],'all') ans = 6 >> [prod(A,'all'), sum(A,'all')] ans = 108 19

The empty second argument in the `max`

call is needed because `max(x,y)`

is also a supported usage. This is a useful addition. I have often written `norm(A(:),inf)`

to avoid `max(max(abs(A)))`

(which does not work for arrays with more than two dimensions), but now I can write `max(abs(A),[],'all')`

without incurring the overhead of forming `A(:)`

.

New functions `sinpi`

and `cospi`

plot the sine and cosine functions at the specified multiple of . Thus `sinpi(x)`

is the same as `sin(pi*x)`

except that it does not explicitly compute `pi*x`

and so returns exact answers for integer arguments:

>> [sin(pi) sinpi(1)] ans = 1.2246e-16 0

- For earlier “What’s New in MATLAB” posts click here.
- See also MATLAB Guide, Third Edition (2017).

- adjacency matrices of graphs have entries from ;
- Bernoulli matrices, which occur in compressed sensing, have entries from ;
- Hadamard matrices have entries from and orthogonal columns; and
- matrices with elements from provide worst case growth factors for Gaussian elimination with partial pivoting and yield the most ill conditioned triangular matrices with elements bounded by .

Rob’s group have done extensive computations of eigenvalues and characteristic polynomials of Bohemian matrices, which have led to interesting observations and conjectures. Many beautiful visualizations are collected on the website http://www.bohemianmatrices.com as well as on the Bohemian Matrices Twitter feed.

In June 2018, Rob and I organized a 3-day workshop Bohemian Matrices and Applications, bringing together 16 people with an interest in the subject from a variety of backgrounds. The introductory talks by Rob, Steven, and I were videod (and are embedded below), and the slides from most of the talks are available on the conference website.

We scheduled plenty of time for discussion and working together. New collaborations were started, several open problems were solved and numerous new questions were posed.

The workshop has led to various strands of ongoing work. Steven has created the Characteristic Polynomial Database, which contains more than polynomials from more than Bohemian matrices and has led to a number of conjectures concerning matches of properties to sequences at the On-Line Encyclopedia of Integer Sequences. Three recent outputs are

- E. Y. S. Chan, R. M. Corless, L. Gonzalez-Vega, J. R. Sendra, J. Sendra and S. E. Thornton, Bohemian Upper Hessenberg Matrices, ArXiv preprint arXiv:1809.10653, September 2018.
- E. Y. S. Chan, R. M. Corless, L. Gonzalez-Vega, J. R. Sendra, J. Sendra and S. E. Thornton, Bohemian Upper Hessenberg Toeplitz Matrices, ArXiv preprint arXiv:1809.10664, September 2018.
- S. L. Hill, M. N. Huxley, M. C. Lettington and K. M. Schmidt, Some Properties and Applications of Non-trivial Divisor Functions, ArXiv preprint arXiv:1806.00651, June 2018; to appear in The Ramanujan Journal [added to this post October 26, 2018].

Sponsorship of the workshop by the Heilbronn Institute for Mathematical Research, the Royal Society and the School of Mathematics at the University of Manchester, as well as support from Canada for some of the Canadian participants, is gratefully acknowledged.