In Part 2 of Introduction to Regularizing with 2D Data we set up an overdetermined system of 13 equations with 10 variables and showed how to find the best solution using the normal equations. In this part we will look at the solution to the example and calculate its residual errors. This and several other examples at higher resolution (higher numbers of output points) are demonstrated in an Excel spreadsheet.
Getting the Answer
Below left is the answer for y. On the right is a plot showing the original five data points in blue and the ten regularized output points from this example in red. There is also a curve generated by the same calculation process but with 10,001 points, shown in pink. The 10-point curve and the 10,001-point curve have vastly different resolution. Nevertheless, they nearly overlap because of the derivative scaling factor S that was used earlier. The small misalignment between them is attributable to quantization error.
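The calculation behind that plot can be sketched in a few lines of NumPy. Everything specific here is a stand-in: the five input points, the value of KSmoothness, the linear-interpolation fidelity rows, and the exact form of the scaling factor are illustrative assumptions, not the article's actual numbers (those are in the spreadsheet).

```python
import numpy as np

# Hypothetical stand-ins for the article's five input points.
x_in = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y_in = np.array([0.2, 0.9, 0.3, 0.8, 0.4])

n_out = 10                                  # number of regularized output points
x_out = np.linspace(0.0, 1.0, n_out)
h = x_out[1] - x_out[0]                     # output grid spacing
k_smooth = 0.01                             # smoothness constant (assumed value)

# Fidelity rows: linear interpolation from the output grid to each input x.
A_fid = np.zeros((len(x_in), n_out))
for i, xv in enumerate(x_in):
    j = min(int(xv / h), n_out - 2)
    t = (xv - x_out[j]) / h
    A_fid[i, j] = 1.0 - t
    A_fid[i, j + 1] = t

# Stiffness rows: second differences at the 8 interior points, scaled by
# sqrt(h)/h^2 so the stiffness SSE does not grow with n_out (one way to
# realize the 1/sqrt(count) scaling discussed later in the article).
A_stiff = np.zeros((n_out - 2, n_out))
for j in range(n_out - 2):
    A_stiff[j, j:j + 3] = [1.0, -2.0, 1.0]
A_stiff *= k_smooth * np.sqrt(h) / h**2

A = np.vstack([A_fid, A_stiff])             # 13 equations, 10 unknowns
b = np.concatenate([y_in, np.zeros(n_out - 2)])

# Normal equations: (A^T A) y = A^T b
y = np.linalg.solve(A.T @ A, A.T @ b)
```

Raising n_out only adds more stiffness rows; the fidelity rows and the overall shape of the solution stay essentially the same.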
Looking at the Residuals
Now that we have a solution for y, we can plug y back into the original system of equations to see what the residuals look like. The first five rows correspond to the five input points. The remaining rows depend on how many output points there are. If there are thousands of output points, there will be thousands of second derivatives (and therefore thousands of rows below the first five).
No matter how many output points or derivatives there are, the fidelity and stiffness sums of squared errors (SSE) should each stay the same, and so should their ratio. The ratio is a characteristic of the collinearity of the input points and the smoothness constant KSmoothness. That means the only way to change the ratio is with different input points or a different smoothness constant. Quantization (10 output points vs. 21 or 10,001 output points) will have a small effect, as it does here. (But in fairness, ten points is extremely coarse.)
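A quick way to see this resolution independence is to evaluate the scaled stiffness SSE of one fixed curve at two very different resolutions. This is a sketch: the sqrt(h) factor below is one common way to implement a resolution-compensating scale, not necessarily the article's exact S, and sin(2*pi*x) is just a convenient test curve.

```python
import numpy as np

def stiffness_sse(n_out):
    """Sum of squared, scaled second-difference residuals for the fixed
    curve y = sin(2*pi*x) sampled on n_out evenly spaced points in [0, 1]."""
    x = np.linspace(0.0, 1.0, n_out)
    y = np.sin(2.0 * np.pi * x)
    h = x[1] - x[0]
    d2 = (y[:-2] - 2.0 * y[1:-1] + y[2:]) / h**2   # numerical second derivative
    r = np.sqrt(h) * d2                            # resolution-compensating scale
    return np.sum(r**2)

# Both resolutions approximate the same integral of (y'')^2, about 8*pi^4,
# so the stiffness SSE barely moves even though the row count grows 1000-fold.
print(stiffness_sse(10))       # coarse: small quantization error
print(stiffness_sse(10001))    # fine: essentially the continuum value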
This shows that the scheme for scaling the derivative equations is working. All of the residuals affect the result, but the number of fidelity residuals is regarded as fixed, while the number of output points (and therefore the number of second derivatives) is regarded as variable. After all, you may want to fit more and more points to the same measured data, and it makes sense that you should see the same result each time.
Maintaining a Consistent Fit via the Sum of Squared Errors
Suppose you have 10 points and the derivative/stiffness residuals are r1 … r8. When you square them and add them together, suppose you get the number 0.4. Now you add another 8 output points, which means there will be another 8 second derivatives between them. The numerical second derivatives are the same (because the output curve hasn’t changed), but now there are twice as many of them (r1 … r16), giving a sum of 0.8 instead of 0.4. This is a problem because the stiffness sum of squared residuals changed while the fidelity sum of squared residuals stayed the same, causing the stiffness to have greater importance.
The way to fix this is to cut the squared stiffness residuals in half. Do this by multiplying the residuals by 1/sqrt(2). Each squared residual is then half its former value, but there are twice as many of them, so there is no net change in the sum of the squared residuals. That is how the square-root scaling balances the input points against the smoothness for a given set of output points. Increasing the number of output points will not change the character of the output curve. The benefit of this is that the output is always consistent with a set of input points and a smoothness/stiffness constant.
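The bookkeeping in this argument is easy to verify numerically. The residual values below are made up so that their squares sum to exactly 0.4; only the sums matter.

```python
import numpy as np

# Toy stiffness residuals for a 10-point output curve (8 interior points),
# chosen so that the sum of their squares is 0.4.
r_coarse = np.full(8, np.sqrt(0.4 / 8))

# Doubling the resolution gives 16 residuals of the same magnitude...
r_fine_unscaled = np.full(16, np.sqrt(0.4 / 8))
print(np.sum(r_fine_unscaled**2))       # about 0.8 -- stiffness over-weighted

# ...so multiply them by 1/sqrt(2) to restore the original sum of squares.
r_fine = r_fine_unscaled / np.sqrt(2.0)
print(np.sum(r_fine**2))                # about 0.4 again
```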