# Introduction to Regularizing 3D Data – Part 2 of 2

## An Entire 3D Surface from a Few Measured Points

• An investment banker needs to create an entire volatility surface from a handful of derivatives trades that took place today.
• A chemist needs to know the temperature near the center of a reaction chamber based on only a few temperature measurements scattered around its perimeter.
• An engineer needs to model something with a smooth function of two variables, but he doesn’t know the function and all he has are a few dozen noisy measurement points.

How do they do it? By regularizing their 3D data of course!

The first part of this article showed how we use bilinear interpolation to relate arbitrary input points to locations on an output grid. Now we will look at the fidelity equations, set up second-derivative equations for the 3D data, and find the least-squares solution. The result is a regularized surface in 3D.

### Implementing Bilinear Interpolation in the Spreadsheet

In the spreadsheet the bilinear interpolation looks like this. The input points on the left are listed as seven x, y, and z values. On the right are the logistics for bilinear interpolation.

In practice the input points may lie directly over or precisely between points in the output grid. When that happens, the weights in x or y (or both) take on values of 0 or 1 and there are fewer than four nonzero numbers in the equality condition. In this example rows 5 and 7 have more than 1 nonzero value. For example, Row 5 corresponds to the coordinate (2.5, 1), which is halfway between (2, 1) and (3, 1). For that reason Columns 8 and 9 share 50% of the weight for that point.

Using the sign conventions we used when we were regularizing 2D data, we call these AFidelity and bFidelity. Note that the equation is Az=b, not Ay=b like in the 2D case, because the output values are on the z axis instead of the y axis.

These interpolation numbers comprise the fidelity equations. The next step is to set up the second derivatives.

### Second Derivatives – Vertical and Horizontal

The second derivatives are calculated the same way as the 2D case, except that they are calculated separately for both axes, x and y. Just by looking at the diagram you can see that the horizontal derivatives are arranged in five clusters of three derivatives. With five adjacent points in our 5×5 output grid there are only three places where you can take a discrete second derivative, so each row in the output grid has three second derivatives.

For example, Columns 1-5 are five columns, yet between those five columns there are only three second derivatives. The fourth second derivative begins in Column 6.

The vertical derivatives have similar properties, but they show up at different locations in the matrix. Columns 1-5 have only one nonzero value (the value 1). Columns 6-10 have two nonzero values (1 and -2). Columns 11-15 have three nonzero values (1, -2, and 1 again). Then Columns 16-20 go back to two nonzero values and finally Columns 21-25 have only one nonzero value in them. This represents the way the vertical derivatives reuse rows in the output grid. There are five rows but only three ways to take second derivatives between those rows.

This Tetris-like animation shows how the vertical derivatives stack up based on the way the matrix columns map to points in the output grid. The center (representing the middle row of the output grid) has the highest stack of nonzero values because three derivative coefficients overlap there.

The two sets of derivatives have some interesting properties that are consistent with their role in this calculation process, and we can see these properties in the matrices visually. These details are not essential to know, but they are a good excuse to make a reference to Tetris!

### Smoothness Constant for 3D

In the 2D case the smoothness constant KSmoothness and the numbers of derivative and fidelity equations determine the derivative scaling factor S. As before, to give equal importance to smoothness and fitting, use KSmoothness = 1 (resulting in noticeable smoothing). If you want to give high importance to fitting your input points (as in 100 times as important), use KSmoothness = 0.01. If you want the result to be planar, use a high number like 5,000.

S is derived from the sum of squared errors just like in the 2D case. For that reason it uses exactly the same equations.

The only real difference is that there are more derivative equations because of the two directions x and y. If the output grid has the size MxN, then the number of derivative equations is related to that size. For example, a 25×4 grid has 142 derivative equations because there are 23 second derivatives to take within each column and there are 4 columns. There are 2 second derivatives that you can take in each row with 25 rows. That gives a total of 23 * 4 + 2 * 25 = 142. A 10×10 grid needs 80 + 80 = 160 derivative equations.

In this example we have 7 input points (fidelity equations) and a 5×5 output grid, which gives us 30 second derivatives. Plugging those numbers into the equations with a smoothness constant KSmoothness of 0.5, we get…

If we put it all together, we get this matrix A and vector b.

At this point we have the overdetermined matrix equation with fidelity equations and scaled derivative equations. We can solve all of this for the least-squares solution z.

### Normal Equations

As in the 2D case, we use the transpose of A to give the normal equations: (ATA)z = ATb. At this point it is just a matter of solving for z and mapping the result to the output grid.

### Solution to the Normal Equations

The solution is the vector of z values that satisfy the normal equations. Of course we need these numbers on a grid of x and y values so that the data are in 3D, so we map each z value to its location in the output grid. This animation shows how the 25×1 least-squares solution for z maps to our 5×5 output grid.

### Another Example of Regularizing 3D Data in the Excel Spreadsheet

The spreadsheet also includes an example of a surface on a 5×4 output grid that is calculated using the same process. This kind of surface might represent the shape of a mountain or the electrical charge of an impurity on a flat, crystalline substrate or a microscopic particle…

1. Amit Mohan says:|

Hello,
how to determine ‘Second Derivatives – Vertical and Horizontal’ is not clear to me . could you give me some more explanation about this. do we have a Macro for 3D regularization.
Best Regards,
Amit Mohan

• Thank you for your question, Amit.

The second derivatives are a way to measure the linearity locally on the surface. The article about regularizing 2D data explains why we use second derivatives with respect to x. In 3D we do exactly the same thing, but in two directions in order to account for both possible directions that could have nonlinear variation. These are second derivatives of z with respect to x and with respect to y. The pattern 1, -2, 1 arises because the grid spacing is constant in this particular example, but the numerical derivatives article gives the detailed derivation and shows how that pattern of numbers represents a second derivative. Most of the posts are intended to build upon each other. For example, second derivatives followed by 2D methods followed by 3D followed by scripts. We encourage you to look through our different posts and to ask questions if something is not clear.

There are two scripts that automate these calculations in 3D, but they aren’t posted yet. One is an Excel function in VBA and the other is for Matlab.

• Amit Mohan says:|

Hi,
I’m not able to understand the process of getting 2nd order derivatives for x and y separately 😦
I am going through the linked topics again and again…may be im missing something and this would help. If you could give me some more clues, it would be helpful. Or I would wait for the vba posting.
This is interesting. Thanks for your help again.
Best Regards,
Amit Mohan

• The second derivatives are represented by the nonzero values in the derivative portions of the matrix A. Each column represents a point in the 5×5 output grid and each row is a single instance of calculating a second derivative. There are 15 rows of horizontal second derivatives for a 5×5 grid because we are calculating that second derivative in 15 separate locations on the output surface. The same is true for the vertical derivatives.

Look at which columns are nonzero within a particular row. For example, columns 1, 2, and 3. Those all have the same y coordinate but different x coordinates, so that row must represent a second derivative in the horizontal direction. (Only x and z values are changing, so the equation is unaware of changes in z with respect to y.)

I’m about to board an airplane now, but I’ll respond again tomorrow if you need more details.

2. Amit says:|

Hi,