Optimization Models and Algorithms using Microsoft Excel and VBA

Is Microsoft Excel (including VBA) a viable platform for the development of complex mathematical optimization models and algorithms?

© Rainer Sturm / pixelio.de

Whenever it comes to implement business calculation models, probably everyone immediately considers using Microsoft Excel. Excel is widely spread in corporate environments and most people are familiar with using it to a greater or lesser extent. The most exciting thing about Excel, however, is its apparently unlimited flexibility. Excel does the math, imports, stores and exports data, creates visualizations, provides a predefined tabular structure, includes a built-in programming language, etc.

Sometimes this flexibility is boon and bane. The fact that it seems as if you could do anything with Microsoft Excel does not mean you should. I have seen people misusing Microsoft Excel as a word processor, a database, a project planning software or a graphical application. Most of those workbooks reminded me of the following quote by Abraham Maslow (The Psychology of Science, 1966):

“If the only tool you have is a hammer,
you tend to see every problem as a nail.”

Having said this, it is definitely appropriate to always question in advance whether the tool you are intending to use is the right one for the task you are facing.

Besides financial planning, simulation models, visualization and project management, optimization models and algorithms always played an important role in my professional life. Thus, the question whether Microsoft Excel is an appropriate option for developing optimization models and algorithms for complex combinatorial problems suggests itself. Today’s post discusses this question, including the pros and cons as well as the possible use cases.

The background

Every version of Microsoft Excel comes with a feature providing algorithms to solve linear and non-linear problems (the Simplex-algorithm and the Generalized Reduced Gradient Method): The Solver add-in.

But what if you want to solve other combinatorial problems using other algorithms and heuristics? Is Microsoft Excel a viable platform for this as well?

Here are the pros and cons:

The pros

  • A ready-to-use and familiar environment

You have Microsoft Excel installed on your computer, don’t you? And you are familiar with using it, aren’t you? Well, then you are ready to go. You don’t have to select, install and learn other applications or programming environments and languages. Simply open Microsoft Excel and start developing.

  • An integrated programming language

VBA is as easy (or hard) to learn as any other programming languages. There are tons of websites, blogs and forums on the internet where you can find answers to almost every question you may have. Moreover Excel’s macro recorder can give you a smooth start into creating and understanding VBA code. Although there are some things one could wish for, the Visual Basic Editor (VBE) is already a rather convenient environment for developing code.

  • The “all-in-one” approach

Microsoft Excel provides an all-in-one solution. You have your input data, the code, all calculation results and the visualizations / the dashboard in one single file and application. Your Excel workbook is all you have to take care of: no external data files, no additional libraries, etc. If you want to transfer your work to another computer, all you have to do is to copy your Excel workbook.

  • Built-in data import and export features

Excel provides different ways of importing and exporting data. The range is from simple cut and paste to using Microsoft Query for building more sophisticated connections to external databases. Wherever your data come from: Microsoft Excel will assist you and minimize the effort and time needed.

  • Built-in data analysis features

You need further analysis of your input data or the results of your model? There is no need for additional coding. Excel provides a variety of built-in features and functions that will do the job. The function FREQUENCY, for one, will be of great help if you want to calculate and visualize the distribution of data. The functions RAND and RANDBETWEEN or Excel’s Goal Seek will assist you with creating dummy data, etc.

  • Built-in visualization features

You do not have to take care of visualizing the input data, your algorithm and the results. Microsoft Excel’s built-in chart engine and the option to combine charts, tabular data and input cells on one single dashboard should meet every requirement you might have for visualizing your algorithm.

Developing algorithms requires comparisons all over: compare the results of the algorithm for different problems, compare the results of different parameter constellations, compare the results of your algorithm with the results of another algorithm, etc. When it comes to compare data, Microsoft Excel is unbeatable: write the results to adjacent cell ranges and add simple Excel formulas to calculate the deviation as absolute numbers, as percentages, etc.

  • A widely spread application

If you want to distribute your optimization model to others, a Microsoft Excel workbook is probably the easiest way of doing this. Others don’t have to install a new application or get used to a new user interface. It’s good old Excel. It is installed on almost every computer and most people are familiar with it. Enabling macros when opening the workbook is the only thing your users have to take care of.

The bottom line of the pros:

Microsoft Excel and VBA provide a flexible and easy-to-use environment including countless options of creating meaningful visualizations, convenient user interfaces and performing additional analyses within the workbook. The result can easily be distributed and almost everyone is able to use the model without the need to install new software.

The cons

Optimization models and algorithms implemented in Microsoft Excel and VBA have limitations regarding the scalability. Firstly there is the well-known maximum size of worksheet ranges (e.g. 65,536 rows in Excel 2003 and earlier). Secondly changing the dimensions of the underlying problem (like adopting the objective function, adding new decision variables or adding new constraints) usually requires changes in the worksheet structures and the VBA code. Adopting declaration statements would take a few seconds in C# or C++ source code. Changing the according structure of a Microsoft Excel workbook and the according VBA code, however, could turn into laborious work, is highly error-prone and requires additional efforts for retesting.

VBA macros included in Microsoft Excel workbooks are interpreted, i.e. VBA is a line-by-line approach. Each line is read, checked for syntax errors, compiled and then executed. Although some of this work is done in advance, this approach leads to considerably longer runtimes than using an application developed in a compiler-based programming language like C#. Especially the interaction of the VBA code with the Microsoft Excel workbook, like reading and writing data from and to worksheet cell ranges or updating the screen (i.e. Application.ScreenUpdating = True), significantly increases the runtime.

Some performance improvement can be achieved by transforming the VBA macros into Excel add-ins. However, creating add-ins slows down the process of development and takes away parts of the flexibility of using Excel as a development environment. And – even worse – at the end of the day, a VBA based add-in will still be slower than let’s say C# compiled applications. If you have to solve really large-scale problems, you probably have to transfer your code into a compiler programming language.

  • Lack of stability and high error-proneness

As long as you do not protect the worksheets or the entire workbook, a spreadsheet is always a very fragile construct. Changes to the worksheets (intended or unintended) can either make the whole model unusable or – even worse – lead to wrong results. Inserting or deleting rows or columns, deleting or renaming cell range names, changing formulas, etc. can and usually will have unpredictable consequences.

My favorite example: array formulas. Let’s assume someone enters the formula bar of a cell including an array formula and leaves the bar using Enter (instead of Ctrl-Shift-Enter). If he’s lucky, this will lead to an error and he will at least notice that something is wrong. If not, the model will still work, but the results will be plain wrong.

  • Suboptimal options for documentation and maintenance

Let’s call a spade a spade: a lot of spreadsheet models and applications are not or insufficiently documented and therefore hard to maintain. The more complex a spreadsheet is, the harder it is to understand what it does and how it is working. For sure, Microsoft Excel isn’t the cream of the crop with regards to its documentation capabilities.

However, there are some ways of how to mitigate this weakness. It is a question of discipline for the developer and a question of know-how for the one trying to understand the model. The developer can improve the documentation by e.g. commenting the VBA code, using range names, describing cells and cell ranges on the worksheet, inserting cell comments, etc. If you are the one who tries to understand someone else’s model, you can take advantage of a lot of Excel’s built-in functionality, like the tracing features, the formula evaluator, the formula view, the Go To Special feature, etc.

The bottom line of the cons:

Excel optimization models don’t scale well. Slow performance, lacking stability and high error-proneness are the prices you pay for a convenient and flexible development environment. Last but not least, you need a lot of discipline during the development in order to come to a well documented and maintainable model.

The use cases

  • Develop and test your own algorithm

You have a new idea of how to solve a complex mathematical optimization problem (either an exact algorithm or a heuristic)? During the development of an algorithm, the performance comes second. A convenient, flexible and easy-to-use development environment is more important than the runtime of the algorithm.

And this is where Microsoft Excel comes into play: Excel’s worksheets in combination with VBA and the VBE can serve both as your developer environment and your user interface.

Basically you need three different types of worksheets in your workbook:

  1. The data worksheet: store the example problem(s), either copied from other sources or generated by using Microsoft Excel’s functions RAND or RANDBETWEEN
  2. The dashboard worksheet: a one page overview, including meaningful visualizations of parameters and results, input cells for changing the algorithm’s parameters, output cells for monitoring the results and some interactive features to e.g. switch between different problems
  3. Calculation / control worksheet(s): one or several worksheets to store temporary results of the algorithm and to set up the charts on the dashboard

After setting-up the workbook and developing your algorithm in VBA, you can conduct several runs for different problems and different parameter constellations, watch the algorithm work and find out its strengths and weaknesses as well as the opportunities for improvements.

  • Understand how an existing algorithm works

Rebuild an already existing algorithm with Microsoft Excel and VBA. Implementing an algorithm from scratch is quite an effort, but it definitely pays off. It guarantees that you fully understand the principle, the strengths and the weaknesses of the algorithm.

  • Compare different algorithms

Using Microsoft Excel as your integrated development environment makes it easy to implement different algorithms for the same type of problem and to directly compare them regarding the way they are working, their results and their performance.

  • Optimize existing algorithms by rebuilding and tuning the original

This one is a combination of the three use cases mentioned above: implement the exiting algorithm, build your own version within the same workbook and compare them.

  • Perform scenario analyses

A well-designed dashboard including all relevant parameters of your algorithms helps you to conduct various what-if analyses, to identify the most relevant drivers and to find the optimal parameter constellation for a given problem.

Build a user-friendly dashboard and present the principles, the function, the results and the sensitivities of the algorithm to your management or client.

Use Microsoft Excel in your Operations Research trainings and lectures. Meaningful step-by-step visualizations combined with the flexibility of doing further analysis and own modifications will help your students to better understand the algorithms.

  • Enhance existing Microsoft Excel decision support models with new algorithms

Add intelligent optimization technologies to your existing Microsoft Excel based decision support systems and improve your results.

This list is non-exhaustive. I am sure there are some more fields of application where using Microsoft Excel and VBA is a possible option for implementing and using optimization algorithms.

My conclusion

Microsoft Excel and VBA can be a viable option for developing, evaluating, comparing, presenting and teaching optimization models and algorithms as long as the complexity of the problem(s) is limited and as long as minimizing the runtime is not the primary objective.

References

The use of spreadsheet-based approaches for mathematical optimization has already been discussed in the Operations Research community for a long time. Here are a couple of links to resources you may find interesting for further reading on this topic:

What’s next?

The next post will present an example of such a Microsoft Excel based optimization model: a self-organizing map to solve Travelling Salesman Problems, including the example workbook for free download.

Stay tuned.

Add a Comment

Your email address will not be published. Required fields are marked *