Excel Spreadsheet Function to Optimize Cells (without Using the Excel Solver or Goal Seek)

The Excel Solver is a powerful tool for solving complicated problems. But what if you want something smaller and more compact like… an Excel spreadsheet function? Minimize() combines the power of a nonlinear optimizer with the convenience of a spreadsheet function. Unlike the Excel Solver or Goal Seek, it responds to changes in the values of its arguments and recalculates automatically like any other formula in the spreadsheet. As an added bonus, Minimize() can optimize some things that the Excel Solver cannot.

Minimize Function Syntax

 

 

Minimize Function Finding the Square Root of 100

Download Excel Optimization Function (Minimize and MinimizeNonnegative) v5

 

How to use Minimize() in a Spreadsheet

Minimize() takes three arguments. The first argument is the cell whose value “y” you want to be minimized. In this case that cell is the merit function. The second argument is the cell whose value “x” will change. (In other words: “Minimize y by changing x.”) The third argument is the starting value to use. For example, Minimize(B1, A1, 9) means “Bring Cell B1 to a minimum by changing Cell A1. Start with A1 equal to 9.” If you then type another number (like 5 or 11 or -99) into cell A1, it will change back to the optimum value automatically.

Minimize() doesn’t really return a value like most spreadsheet functions; it’s more like a “babysitter” for the cells in its arguments. It watches those cells and runs an optimization when it detects changes. The only catch is that the first two arguments must be references to individual cells, not other functions like SUMSQ(A1:A3).

So this won’t work: Minimize(SUMSQ(A1:A3),B1,1) but this will: Minimize(A1,B1,C1+D3).

Minimize? Can I maximize? What about a goal seek?

Sometimes you want to maximize instead of minimize. In that case (in the context of a spreadsheet) you insert another cell we’ll call z. In that cell you put in a formula like z = -y. Then use Minimize() to minimize z (instead of y) by changing x. Sometimes you want to reach a specific value of y instead of just a minimum. To do that we use a different formula for z that has a minimum where we want. For example, suppose we want to know what value of x brings y to a value of 3.5. To do that you minimize z =(y – 3.5)2 by changing x.

The important thing about a nonlinear optimizer is that it has no knowledge of what it’s optimizing. It changes a variable and looks at the effect on the merit function, essentially by trial and error (but not randomly because there is a method to the madness). From the optimizer’s perspective it is like turning a knob on a black box. The benefit of this is that the optimization algorithm doesn’t care how complicated the merit function is or what it means or where it comes from. It can handle nearly anything you throw at it, which makes it a powerful tool.

Example Merit Functions for Maximizing and Goal Seek

Why Optimize in a Spreadsheet

The point of numerical optimization is to make it easy to solve very complicated functions. For example, actuaries use the Black-Scholes formula to price financial derivatives. That formula is very complicated, but you can use a numerical optimizer like Minimize() to find, for example, the implied volatility for a put option. In these cases there is no closed-form solution. Another example is: sin(x) + x = 0.5. (This simple-looking math problem has no analytical solution. It’s so impossible that, like many authors, we leave it as an exercise for the reader.) The spreadsheet has an example of implied volatility and the trigonometric function with no closed-form solution.

A common demonstration of numerical optimization is to find the square root of a number without using a square root function, and most of the examples in the spreadsheet do that.

To find the square root of 2:

  • Error = (x2-2)
  • Squared Error = (x2-2)2
  • We get zero squared error when x equals the square root of 2.

If I square x and then subtract 2, that tells me the difference between what I have (x2) and what I want (2). That quantity is the “error.“ When x = 0, the error is -2 and the squared error is 4. When x = 1.5, we are getting closer and the error is 0.25 while the squared error is 0.0625. The useful property of squared error is that it has a minimum when the error is zero. Unlike error, squared error cannot go negative. That’s why it is good to minimize squared error, as is done in most optimization routines. When x = sqrt(2), then the error is zero and the squared error is likewise zero. That’s how we know when x is at its optimum value.

It should be noted that minimizing something is not the same as bringing it to zero. This minimizer minimizes to… a minimum, which can be something other than zero, such as 9.3 or -1 or -5 or -999, etc. When we’re setting up the problem, we usually minimize the squared error, not the error. This is because the squared error bottoms out at zero; it can’t go any lower, and that makes it easier to work with. Also bear in mind that the starting position can affect the outcome of the optimization. For example, the square root of 100 can be 10 or -10 and the optimizer will begin at the initial guess and then move x in whatever direction has a negative derivative. If the starting point is -9, the optimum value will be x = -10; if x starts at +11.3, then the optimum will be x = +10.

How the Numerical Optimization Algorithm Works

This section shows one iteration of the algorithm. The worksheet “Optimization Algorithm Details” demonstrates all but the last iteration and explains why the last iteration can’t be completed by a spreadsheet.

The gods proclaim from on high that Minimize() is an unconstrained nonlinear optimizer that uses an exponential line search, followed by a false position method that updates its interval by quadratic approximations. For us mere mortals, that means it follows the merit function, point by point, as it decreases. The fact that the merit function is decreasing means its derivative is negative. At some point the merit function starts to go up again, indicating that the derivative is now positive. If the derivative started out negative and then became positive, then the derivative must have changed sign (because of the intermediate value theorem). If it changed sign, the derivative must have passed through zero where the function was at a local minimum. That’s the point we’re looking for and the parabolas help the optimizer zero in on that minimum very quickly.

Example Minimization Problem

PerformOptimization() starts with a line search using ExponentialLineSearch(). During the initial line search the optimizer has no idea how close it is to a local minimum. It could be right next to the minimum, or it could be 1,000,000 units away. The line search starts by finding the direction of the negative gradient (the direction where the function is decreasing). Suppose that direction is to the right. The algorithm recognizes this and starts by checking a point 0.05 to the right of the starting point. Then 0.1 to the right of that. Then 0.2 further than that. Then 0.4, 0.8, 1.6, etc. The idea is that the more steps it takes to find where the derivative changes sign, the further away the local minimum must be. By exponentially increasing (i.e. by doubling) the search distance, we are covering more ground in less time. The Excel Solver uses a similar strategy for its line search.

False Position Optimization Algorithm with a Detailed Example

Algorithm 01 - Initial Line Search

The line search always remembers the three most recently searched points, which we can call A, B, and C. When the points are aligned from left to right and B is lower than A and C, the line search stops. It then calls EnforceDerivativeRequirements(), whose purpose is to narrow the three points down to two points. This algorithm, in a nutshell, always has two points Left and Right which have to be on opposite sides of a local minimum. A “false position algorithm” is just a way of bringing the Left and Right points closer and closer together until they reveal the optimum value with sufficient precision.

Algorithm 02 - Finding A, B, and C

 

The algorithm only works if the local minimum is between them, so EnforceDerivativeRequirements() checks the derivative of B. If the derivative is positive, then the minimum is between A and B. If the derivative is negative, the minimum is between B and C. Let’s suppose it will search between B and C. We know that the derivative at B is negative and that C has a higher function value than B. However, the derivative at C could be anything. We need the derivative at C to have the opposite sign as the derivative at B. EnforceDerivativeRequirements() checks this and moves C closer to B if necessary via a bisection algorithm. Long story short: EnforceDerivativeRequirements() gives us two points whose derivatives point toward a local minimum somewhere between the points.

Algorithm 03 - Revising C

Algorithm 04 - New A, B, and C

Algorithm 05 - Ignore A

 

At this point we have two points L and R (formerly called B and C), and we need to adjust them to be closer together in the x direction. In other words, we have good starting conditions and now we can begin using parabolic approximations in the false position method.

Algorithm 06 - Zoom in on Left and Right

 

The next step is QAssembleM(), which uses two quadratic approximations (i.e. parabolas) to find new “midpoint” values between L and R. The first parabola is fitted to L, the derivative at L, and R. The second parabola is fitted to L, R, and the derivative at R. These two parabolas each have a vertex somewhere between L and R. We solve for the vertex locations x and y, which we call M1 and M2. This series of diagrams shows how we come up with M1 and M2.

Algorithm 07 - Labeled Left and Right

Algorithm 08 - Calculate the Slope at Both Points

Algorithm 09 - Fit a Parabola to Left

Algorithm 10 - Fit a Parabola to Right

Algorithm 11 - Find Parabola Vertexes

Algorithm 12 - Calculate y for M1 and M2

 

Now we have four points: L, M1, M2, and R. We also have their derivatives.

Algorithm 13 - Four Points to Choose from

 

The next step is to choose the two best points. QUpdateLR() does that. A numerical optimization algorithm has a duty to use the lowest possible (valid) y value it finds at any step in the algorithm. This algorithm fulfills that duty by taking the point with the lowest y value among the four of them. It then looks at the derivative. If the derivative is negative, that point becomes the new L. If the derivative is positive, that lowest point becomes the new R. Once that point is assigned, the algorithm has to choose the complementary point. If it just assigned the new point to L, it looks at the several points to the right for the nearest one with a positive derivative. If it assigned the new point to R, it looks at the several points to the left and uses the nearest one with a negative derivative. The points Left and Right are now closer together, and this is the essence of the false position method.

Algorithm 14 - Pick the Lowest Point

 

By doing this it has chosen new L and R points that are guaranteed to be valid. It updates the merit function (of the optimization algorithm, which is the average y values of L and R) and calls QAssembleM() again. It continues doing this until 1) the merit function hasn’t decreased by more than StoppingCondition (which is 1E-15; i.e. nearly nothing) or 2) the points L and R are so close together that fitting parabolas is likely to be numerically unstable.

The exit condition relating to numerical stability is an examination of the two x values of points L and R. It takes their ratio and subtracts 1. If the result is within ±5E-9, the points are deemed too close together and therefore unsuitable for the fitting process. The numerical instability would manifest itself as a divide-by-zero error in Solve3x3AugmentedMatrix(), and you can see this for yourself by forcing the IF statement to evaluate to False. Long story short: the exit condition is that the optimization has run as far as it can practically go. Usually this is less than 10 iterations.

Algorithm 15 - Left and Right Are Closer Together

Algorithm 16 - Repeat It All Over Again

The optimization exits by returning the x value halfway between L and R. That in itself is an approximation, but it is a very good one!

How Minimize() Interacts with Excel

Excel has a rule that a user-defined function can return a value but cannot change anything anywhere else on a spreadsheet. Because Minimize() is blatantly circumventing that rule, this program has to take on a lot of the responsibilities that Excel normally takes care of. It maintains a list of all of the cells that contain calls to Minimize(). It remembers the merit function values of each of them (to determine whether something has changed and therefore needs to be optimized again). It has to do its own error checking.

  • What happens if one Minimize() function depends on the result of another Minimize() function?
  • What happens if Excel is busy doing other things and doesn’t have time to run Optimize()?
  • What happens if the user deletes the row containing the cell?
  • What happens if the formula has an error, such as a syntax error or a bad reference?
  • What happens if the cell is still there but contains an entirely different formula?

Here’s a fun bit of trivia. When Excel calls Minimize() once, the VBA code checks (and possibly recalculates) all cells with Minimize() functions. This is necessary because some of them may be precedents in each other’s formulas. Also bear in mind that Minimize() has the unique ability to change the values of its own precedent cells! Once that happens, Excel detects that the cells changed – which triggers Minimize() again and again, which results in recalculation of all cells on all spreadsheets whose formulas are =Minimize(). The program has to prevent the first call to Minimize() from cascading into an infinite number of redundant calls to Minimize().

The most important ways that Minimize() handles these potential “gotchas” are the list of Minimize() cells and the system timer. Upon the first call to Minimize(), it checks the calling cell’s syntax, and if everything looks ok, it calls SetTimer() with a 1 millisecond delay. (That miniscule delay is important because if something goes wrong and the timer triggers repeatedly, that delay is just enough time for Excel’s GUI thread to close the program.) The Minimize() function immediately returns, and Excel is none the wiser. One millisecond later Windows calls the VBA function ScheduleOptimizationServiceProc(), which has this line: Call Application.OnTime(Now, "OptimizationService")

That tells Excel to wait until it’s idle, and then to call OptimizationService(). OptimizationService() then loops through the worksheet cells that contain Minimize(). Each time it detects a cell whose merit function or formula changed, it runs PerformOptimization(), which does the actual optimization.

If you set the global variable ShowDebugText to True, the debug window will show a log of all of the calculations it does. It shows what it’s optimizing, how many times it evaluated a merit function, how many potentially redundant Minimize() calls it ignored, etc. This gives some insight into the interactions between Minimize() and Excel.

Acknowledgements

Math for Mere Mortals would like to thank Mike Zigmont and Carlos Chiquete for their feedback in preparing this post.

Advertisements

15 comments

  1. Pingback: Constrained Nonlinear Optimization in Excel without the Excel Solver « Math for Mere Mortals

  2. Great post, thank you. Have an issue – I’m trying to goal seek a dividend (y) by adjusting cash retained (x)using the z = (y-3)^2 method, and I’ve used the equation MinimizeNonnegative(z,x,0). However, I find that rather than reduce the squared error, it tends to maximize it by simply returning x as 0 (or any starting value provided). The number I need in cell x is actually the “maximized error”.

    Any ideas what I’m doing wrong here would be appreciated.

    • Hi, Steiniteer. Thanks for your comment. Let’s see if we can get it working. It’s returning the initial guess, so scripting must be working. That means that the optimizer is having trouble calculating a numerical derivative. Usually this is because either the function y is returning an error or it’s returning a constant no matter what value you use for x. Can you verify that y does change with different values of the cell x that is being optimized? In other words, when you use different starting values, do those produce different values in y and z?

      On Mon, Mar 24, 2014 at 4:58 AM, Math for Mere Mortals wrote:

      >

  3. Hello Kintaar,

    Many thanks. Yes, I can confirm that y does change as values of x change – also changes as I change the initial guess.

    I studied the Black – Scholes example in your spreadsheet and it looks to me like it does the same thing: returns the initial guess for the sigma value. Is there something I don’t understand here?

    • I think I have figured out what the problem is. The macro evaluates the merit function by looking at the precedent cells of the merit function cell. Rather than calculate the entire worksheet, the macro tries to calculate only the few cells it needs. This is much faster than recalculating everything. That’s fine if the merit function’s precedent cells are all on the same worksheet, but it doesn’t work when they reference more than one worksheet. The problem is that Cell.Precedents returns a Range object, and that object suffers from a Microsoft design flaw that a Range cannot reference cells in more than one worksheet. Your optimization references cells whose formulas reference cells on other worksheets, and the macro can’t see those precedent cells and therefore can’t calculate the derivatives properly. This is a bug in the macro that stems from a long-standing design flaw in the Microsoft Excel object model; you are not doing anything wrong.

      It will take me some time to fix it, but I will work on correcting this. There are several workarounds, and I just need to choose one that I can integrate into the macro. It will probably involve checking to see if the precedents are on more than one sheet and then handling them specially as needed. A brute force workaround is to replace PrecedentCells.Calculate() and Precedents.Calculate() with Application.Calculate(), but that is *really* slow. In your worksheet that approach would add about 10 seconds per optimization, which is simply not good enough. It may take a few weeks to fix, but we’ll post the updated spreadsheet as soon as it’s ready.

      On Mon, Mar 24, 2014 at 8:49 AM, Math for Mere Mortals wrote:

      >

      • Kintaar,

        Many thanks for your reply and the time on this. Will look out for the updated spreadsheet.

  4. hi, i am trying to use the minimize as described here to do basic goal seek – i want to find the value which causes two cells to be equal.

    so if A1 = 5 and B1 = 4 and C1 = A1 – B1 (so 5-4 = 1) , i would think minimize (c1,b1,1) would cause the Value in B1 to change to 5

    this doesn’t seem to be happening. am i missing something? thanks

  5. Hi, Rich. Thanks for your question. It looks like the problem is the merit function C = A – B. Minimize() is trying to minimize C, not trying to set it equal to zero. When the cells are both 5, C is 0. When A = 5 and B = 6, C = -1. That’s lower than 0. Minimize() will keep increasing B to make C go lower and lower (and more negative) until it can’t go any further. Try C = (A – B)^2. And for fun try C = (A – B)^2 – 9. The -9 doesn’t make it go any faster, but it shows that nothing special happens at zero. I hope this helps.

    • Hi Kintaar – thanks for the help. That works quite well. Only issue i am having now is when i try to export the modules and import them to another sheet i’m working on, i get an error subscript out of range which traces back to

      ReDim Preserve RangeQueue(2, 1)

      thoughts?

      • It looks like the ReDim operation is trying to change the number of elements in the array’s first dimension, which isn’t allowed in VBA. Does your code have an Option Base 0 or Option Base 1 statement anywhere? If you copy/paste the code into another workbook that uses a different base, it will cause this kind of error when you redimension any of the arrays. Try this too: ReDim Preserve RangeQueue(1 to 2, 1) Let me know if those ideas don’t work, and I’ll take a closer look at it.

        On Tue, Nov 11, 2014 at 1:04 PM, Math for Mere Mortals wrote:

        >

      • one of the first statements on the Query Performance Timer is Option Base 1 – i tried removing that and it still throws the same error. Also Changing the ReDim statement didn’t work either… I’m trying to import only the 3 sample modules into a blank workbook so there shouldn’t be any vba in there already (right)?
        appreciate it

  6. Hello all,

    I saw Your Internetpage and I was wondering if someone of You could help me with my project about Zero-Beta portfolios in Markowitz Portfolioselection and the use of those “rolling window” for getting my results via Excel but not by using Excel solver. I would be very glad if you could give me some feedback!

    Thank You in advance!
    Best regards,
    Melina

    • Hi Melina. It depends on what you are optimizing. If it’s just one parameter (e.g. you parameterize your efficient frontier), that is easily done. I’m not sure about a rolling window, as I’m not trained in financial math, but this optimization function can optimize anything with a smooth merit function modeled as a function of a single variable. If you are changing more than one variable, you’ll need the Excel solver for that.

      On Mon, May 23, 2016 at 5:08 AM, Math for Mere Mortals wrote:

      >

  7. If in an equation multiple values are unknown but the target value and some other values of the equation are known, can we guess those multiple unknown values using any spreadsheet or excel function or using any other method?

    • Hi, Swarnima. It sounds like you have an equation with several parameters of unknown values and you want the expression to equal a constant. The goal is to find the values of the parameters that satisfy the equation. There are various ways to go about this, but without knowing the specific problem, I can only suggest something general. Subtract the target value from both sides of the equation to make the expression nominally equal to zero. The square of the expression (or its absolute value) will equal zero when the parameters have the correct values; i.e. use the expression itself as the merit function. Suppose you have three parameters. Set the last two parameters to a constant number and optimize over the first one. Then hold the first and third parameters constant and optimize over the second one. Do this for each parameter, then use some kind of IF statement to choose the best combination of parameters. You can optimize all parameters at the same time using the Excel Solver. However, none of this guarantees that you will find a solution (or even the same solution) every time. There are other techniques for things like polynomial curve fitting or rational curve fitting, but that’s another topic for another day. If you have a specific problem, we’d be happy to offer advice about it.

      On Sat, Mar 4, 2017 at 9:13 AM, Math for Mere Mortals wrote:

      >


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s