RegularizeData2D, the Excel Spreadsheet Function for Regularizing 2D Data – Part 2 of 2

In Part 1 of RegularizeData2D, the Excel Spreadsheet Function for Regularizing 2D Data we saw how the program is organized, reviewed some of the differences between theory and implementation, and listed the important programming shortcuts that become available when we depart from rigorous mathematical theory in favor of practical implementation. In this part we will see the inner workings of the Excel spreadsheet function by looking at an example in detail. In the example we will take five “input points” in x and y, and we will regularize them to ten “output points.” The spreadsheet and source code are available for download.

RegularizeData2D Example of Use in Excel

Multiplying Matrices without Actually Multiplying Matrices

To keep the notation easy to read, let’s divide A into two parts. We’ll call F the fidelity component and G the derivative component. Both F and G have the same number of columns and all of the math puts the transpose operation first (FTF instead of FFT). This means that we can work with F and G separately and then add the results at the end because all we’re really doing is multiplying columns by each other.
When F and G have the same number of columns, they can be added separately like this.

So let’s do that. In sub ApplyInputDatapoints() we loop through the input points because they are what dictate the fidelity equations F. (We’re ignoring G for now.)

There are two ways this can go. Suppose the input and output coordinates are an exact match (like the first input point x=0 represented in the first row in F and circled in blue). Only one element is affected in ATA, in this case Element 2,2. An exact match will always impact an element on the diagonal and add 1 to its value. (I say add 1 because if there are two or more input points that have the same x coordinate, they all contribute to that diagonal. Those contributions can accumulate with more and more input points.)

The other scenario is where the input point lies between two output x coordinates. In that case the corresponding row of F will have two nonzero columns. For example, the second row of F has nonzero values 0.9 and 0.1 in columns 3 and 4 respectively. Four elements in ATA will be affected, and those are the four permutations of 3 and 4, hence that part of the function has four assignments to A (circled in pink).

Consider each row of F.  Nonzero values in columns of F have an impact on the rows and columns in FTF

FTF In Compressed FormOf course all of this is done without using full arrays. F is shown in the above diagram, but it is not actually stored anywhere; the code only loops over the input points, knowing that they represent individual rows of F. The result FTF is stored in memory, but only in the compressed format shown on the right.

ATb is also affected only by the fidelity equations F, so we can use the same loop to calculate the elements of ATb. Each value in b is multiplied by the values in the corresponding row of F. Whatever columns of F had the nonzero values, those are the rows of ATb where each product is added.

For example, the first row of F has a 1 in the second column. We multiply that by the value in the first row of b, which is also 1. That gives us 1*1. We add that to the second row of ATb. This is circled in blue below.

The second row of b has a value of 1.1 and the second row of F has values of 0.9 and 0.1 in its third and fourth columns. When we multiply them, we have 0.99 and 0.11 in the third and fourth rows of ATb, respectively. This is circled in pink below. Those values are added to whatever is already in ATb. This process is repeated for each input point.

Calculating A Transpose b

After all five input points are accounted for, ATb looks like this. Because it is just a vector, there is no need to compress it, so it is just stored as a one-dimensional array.
A Transpose b After the Projection

Multiplying More Matrices without Actually Multiplying Matrices

The next step is to process the second-derivative projections, GTG. This is done in the sub ApplySecondDerivatives(). Recall that the vector b had zeros in the rows corresponding to second derivatives G. The projection GTb involves multiplying by the rows of b – in other words, multiplying by zero. That means we can leave b alone because the derivatives only impact ATA.

To get an idea of what happens to A, consider a small matrix where only the first row has the three values that form the coefficients of a second derivative. For example, +0.1, -0.2, and +0.3. The projection ATA entails multiplying each column by itself and by the other two columns. The first column multiplied by itself is 0.1*0.1 = 0.01, and that goes into the 1,1 element of ATA. The second column multiplied by itself is -0.2 * -0.2 = 0.04 and that goes into the 2,2 element of ATA. The same goes for the third column multiplied by itself, giving a value of 0.09.

Example Single Second Derivative.  A second derivative occupies one row and three columns, but after projection it occupies three rows and three columns.

The same thing happens for the six off-diagonal elements. The first column multiplied by the second column is 0.1 * -0.2 = -0.02, which goes into the 1,2 element of ATA. However, the first column times the second column is the same as the second column times the first column. In other words, we can take advantage of that symmetry and reuse the calculated values. We calculate the three values for 1,2; 1,3; and 2,3. Then we also apply those values to 2,1; 3,1; and 3,2, respectively. These six values are circled in purple. As before, we only add these values to whatever is already in those elements.

We needed 9 values: 3 diagonal elements and 6 off-diagonal elements. However, symmetry let us get away with doing only 6 calculations instead of all 9. We repeat this for each derivative, as usual adding the nine results to whatever is already in A. Most of the 9 values will overlap with others, which is fine.

FTF and GTG and A in Compressed Format

Solving the System of Linear Equations to Regularize the Input Data

Now that we have all of the projected matrices that we need, it is time to solve the system for y. This will be done by Gaussian elimination and back-substitution. Sub UpperTriangularForm() operates on A and b to transform A into upper triangular form. The animation below shows the Gaussian elimination, which proceeds from top to bottom. The goal of Gaussian elimination is to arrange the equations so that there are zeros to the left of the diagonal elements. Note that the diagonal elements are in the third column, so in this compressed format the “upper triangular form” looks more like a rectangle than a triangle. The algorithm is set up for a pentadiagonal matrix, so each time it adds one row to another, it only performs four operations on A (and one operation on b). The nonzero row values are staggered, so for any two adjacent rows of A, no more than four columns mutually contain nonzero elements. One side effect of this is that the values in the fifth column don’t change during the Gaussian elimination, and this is plainly visible in the animation.
Animated Gaussian Elimination on the Compressed Matrix ATA and b
Once A is in upper triangular form, the sub SolveCompressedPentadiagonalUpperTriangularMatrixEquation() solves it for y using back-substitution. This time the algorithm starts at the bottom of A and works its way up. The last row of A has only one element, and it is in the diagonal. That tells us that 0.058 * y10 = 0.079 (circled in blue below). We solve this for y10 and get a value of 1.363983.

The second-to-last row has two nonzero values, one in the diagonal (y9) and one to the right of the diagonal (y10). We know the value of y10 because we just found it. At that point we construct the equation: 1.826 * y9 – 0.96 * y10 = 1.741. (This is circled in pink below.) Solving for y9 gives us a value of 1.67029.

Back-Substitution to solve for the last two elements of y

After those last two elements are solved for, the sub loops through the remaining rows above. Each time it plugs in the two known values from the rows below and solves for the value of y corresponding to each diagonal element of A. When the loop is finished, y is populated with the answer. At that point y is returned to RegularizeData2D() where it is formatted for assignment to the selected spreadsheet range. At this point the user sees the result in the spreadsheet. Done.

RegularizeData2D VBA Solution

The Answer

This is the answer, which can be found on the worksheet 10-Point Excel Example. The bottom two values for y match the y9 and y10 values we calculated above. All of the calculations in this article, as well as additional examples, are in the spreadsheet download. The spreadsheet VBA modules contain full source code with additional comments and explanations.

Download the Excel Spreadsheet: RegularizeData2D Excel Function with Examples and Source Code.xlsm

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s