Demystifying Monte Carlo

Alan Nicol, Executive Member, AlanNicolSolutions

Nov 30, 2012

A very powerful, but little used, statistical modeling method is Monte Carlo Simulation. Read on to see how practical and simple it can be to predict the outcome of a design before it is complete.

One of my favorite tools, garnered from practicing Design for Six Sigma, is the Monte Carlo Simulation method. Yes, it gets its name from the famous international gambling city and, like so many statistical methods, was developed as a means of predicting probabilities and outcomes of various gambling scenarios.

I’m not a gambler when I can help it. In fact, when it comes to product design I hate to gamble with results. So, a gambler’s cheat to predict the outcome of a scenario proves to be a staple tool in my repertoire. Yet, although my aversion to risk is widely shared, the use of the Monte Carlo Simulation tool is not.

I perceive that few people have learned how or when to apply it, and many of those who have been introduced to the method perceive it to be complicated or difficult. I would like to explain it to this audience and try to show how practical and reasonable it can be. I’d also like to share why it is so powerful.

A Monte Carlo Simulation is a mathematical model of a phenomenon. To make it work we need three pieces of information:

1. A mathematical formula that represents how inputs turn into an output.
2. A reasonable estimation of the variation of each input.
3. An idea of what output performance is acceptable.

It may seem obvious or moot to include the third piece, but I did so because we can, and people often do, waste time running simulations without knowing what we are trying to accomplish. There should always be a need, target, or limit we are trying to meet, exceed, or maintain before we run a simulation to see what our performance might be.

If we know what we are trying to accomplish, we can build a mathematical representation of our system’s behavior, collect some data that reflects the performance of the system’s inputs, and do a Monte Carlo Simulation. That doesn’t sound so hard I hope.
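To show how little machinery is involved, here is a minimal sketch of those three pieces in Python. The function name monte_carlo, the toy model, and every number in it are my own illustrative assumptions, not part of any particular tool.

```python
import random


def monte_carlo(model, samplers, lower, upper, trials=1000, seed=0):
    """Estimate the fraction of outputs that fall outside [lower, upper]."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    failures = 0
    for _ in range(trials):
        # Piece 2: draw one random value for every input.
        inputs = {name: draw(rng) for name, draw in samplers.items()}
        # Piece 1: turn the inputs into an output.
        output = model(**inputs)
        # Piece 3: compare the output with what is acceptable.
        if not (lower <= output <= upper):
            failures += 1
    return failures / trials


# Toy system (hypothetical): output = x + y, acceptable between 14.8 and 15.2.
toy_samplers = {
    "x": lambda rng: rng.gauss(10.0, 0.1),   # normal: mean 10.0, std dev 0.1
    "y": lambda rng: rng.uniform(4.9, 5.1),  # uniform between 4.9 and 5.1
}
failure_rate = monte_carlo(lambda x, y: x + y, toy_samplers, 14.8, 15.2)
print(f"Estimated failure rate: {failure_rate:.1%}")
```

Swapping in a different formula, different input distributions, or different limits changes nothing about the loop itself.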

Why would we want to do a Monte Carlo Simulation? There could be hundreds of reasons and thousands of examples, but they all reduce down to one thing: predicting performance without conducting hundreds of experiments or building thousands of samples.

Here are a few examples for context. In a mechanical design we might have several components that must stack together and fit inside some fixture or enclosure. It might be a couple of circuit boards, some spacers, and two halves of a shell or case. We all know that not every piece/part will be exactly the same as another. Our design tolerances on our drawings reflect this understanding. Given some reasonable variation of component dimensions, how often will the parts not fit together, or have so much gap between them that something rattles?

A common practice is to design for the maximum material condition. That would ensure that the case always closes without a gap where the shells meet, but what about the minimum condition? Will something rattle? Might that lead to vibration that causes reliability issues? Might it simply be a tactile “dissatisfier” for a customer? Would we like to know how often it might occur?

How about a chemical reaction? Consider the common problem of using two-part adhesives in our designs. In a case where adhesive strength is a critical design element, mixing the two parts precisely can mean the difference between success and failure. Would it be nice to know just how much we can allow our mixing process to vary before we invite adhesive failure? Is our process capable of the precision we need? Can we mix them such that it gives us either a shorter or longer working time or assembly window as needed?

Even programmable electronics can be opportunities for prediction simulations, though I confess these opportunities are rare. Imagine a circuit where response time of a switch is critical. Voltage may vary from one battery to another, or over battery life, or between power sources. That variance in voltage can affect charge times for capacitors and trigger or hold times for solid-state switches. If a variance in microseconds between one switching event and another could negatively affect the output of the circuit or device, it could be important to predict. I know it’s not a likely scenario; most circuits’ variance would never be noticeable.

It’s more common in electro-mechanical scenarios for the circuit to turn on or off a device such as an electric motor that drives some mechanism. If the motor runs a mechanism out and back repeatedly it might, over time, end or start in a different place. It can lead to performance failures, reliability issues, or intermittent faults.

With Monte Carlo Simulations we can model our design, before we finalize the design and release drawings, and predict if or how often assembly, mixtures, or time-based elements might end up with unacceptable outputs.

With the model’s predictions we can make intelligent adjustments to the design and we can make intelligent decisions as to whether the probability of a defect is acceptable, or if we have some more design work to do. When we invest tens-of-thousands to millions of dollars developing a product, the ability to calculate our defect rate is incredibly powerful.

Suppose we want to try conducting a Monte Carlo Simulation. How does it work? We must go back to the list above. We must have a mathematical formula for how inputs to our system turn into outputs. These formulas come from a variety of possible sources. For many mechanical designs, they can simply be taken from the drawings and the tolerance stacks. Otherwise they can be calculated using well established engineering principles and formulas.

For electronic or electromechanical designs, the outcome can likewise be predicted directly from the design schematics. For chemical reactions we often need to run experiments and record the outcomes. As a general rule, we want to try the simplest, least costly, most reliable method possible to get our formulas. Particularly when conducting experiments, it can be important to eliminate factors (inputs) that have no or negligible influence on the outcome. It saves us time and effort to reduce the model as much as practical.

Let’s examine a very simple example to explain further. Let’s imagine the scenario described above, a pair of circuit boards, three spacers, and two halves of a shell-case that will be sandwiched together in assembly. Giving each component a letter designation, our output (the gap between the two shells) can be modeled with the following equation.

Gap = (A + G) – (B + C + D + E + F)

Let A and G represent the internal space dimension of each shell, C and E represent the thicknesses of the circuit boards, and B, D, and F the thicknesses of the spacers. So, the total space inside the shells minus the total thickness of the circuit boards and spacers stacked together will reveal a gap. A Gap of zero is best. For this example the acceptable window runs from zero down to -0.02; outside that window the case either will not close properly or the components will rattle.
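For readers who would rather see the equation as code, here is a small sketch of the Gap formula as a Python function. The nominal dimensions are hypothetical values I picked so that the stack lands in the middle of the acceptable window; they do not come from a real drawing.

```python
def gap(A, B, C, D, E, F, G):
    """Gap = (A + G) - (B + C + D + E + F), as in the equation above."""
    shell_space = A + G               # internal space of the two shells
    stack_height = B + C + D + E + F  # spacers B, D, F and boards C, E
    return shell_space - stack_height


# Hypothetical nominal dimensions (units are whatever the drawings use).
nominal = dict(A=5.00, G=5.00, B=2.27, D=2.27, F=2.27, C=1.60, E=1.60)
nominal_gap = gap(**nominal)
print(f"Nominal Gap: {nominal_gap:.3f}")  # prints "Nominal Gap: -0.010"
```

With every part exactly at nominal the design looks fine. The simulation exists to tell us what happens when the parts are not at nominal.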

The formula for the model in this case is simple. Now we need part #2: some data about the performance of each input. From our drawings we can dictate the acceptable range of each part with our tolerances. However, just because we dictate a range that doesn’t mean it is exactly what we get. We want to know what the true performance is likely to be.

We’ll say for the sake of the example that the spacers are a standard part. We might even have quality control inspection data on them already and all we need to do is get it out of the database. The shells are a new design, but we will order them from a reliable supplier. That supplier has data from similar parts of injection-molded plastic of the same material, and we will use that data to assume the same tolerance range for our new part.

Finally, we need data on the circuit boards. We make those ourselves so we need to collect data on board thickness variation. We’ve not done it before, so we just need to pull a bunch and take some measurements. It takes some time, and it’s a little inconvenient, but we do it anyway.

The spacers are cut from stock and vary with a normal distribution. That means that most of the spacer lengths are very close to the desired length, with fewer appearing toward the limits of the tolerance range. The curve looks like a bell shape. After examining our quality control measurements for the last couple of years, we can characterize that normal distribution with a mathematical representation.

We discover the same for the circuit boards. Though they are very consistent in size within each batch, the different batches vary and we can characterize the long-term variation with another normal distribution.
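Characterizing a normal distribution usually comes down to two numbers: the mean and the standard deviation of the measurements. Here is a sketch of that step in Python, assuming a short, made-up list of spacer measurements; real inspection data would have far more points.

```python
import random
import statistics

# Hypothetical spacer measurements pulled from inspection records.
spacer_measurements = [
    2.268, 2.271, 2.270, 2.273, 2.269,
    2.272, 2.270, 2.267, 2.271, 2.269,
]

spacer_mean = statistics.mean(spacer_measurements)
spacer_std = statistics.stdev(spacer_measurements)  # sample standard deviation

# Generate simulated spacers from the fitted normal distribution.
rng = random.Random(42)  # fixed seed so the run is repeatable
simulated_spacers = [rng.gauss(spacer_mean, spacer_std) for _ in range(1000)]

print(f"mean = {spacer_mean:.4f}, std dev = {spacer_std:.4f}")
```

The same two lines of arithmetic would apply to the board thickness data, using the long-term, batch-to-batch measurements rather than a single batch.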

The shells are a bit different. They are injection molded and the tool wears rather uniformly over time. Therefore the distribution profile is more like a uniform distribution.

In other words, there is as much likelihood that the true dimension will be at the center of the tolerance range as at either extreme. The tool is deliberately constructed for the minimal dimensional limit, and creeps toward the maximum limit as the tool wears. When it reaches the limit, the tool is retired and the process starts over. The distribution looks like a rectangle or a straight line.
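A uniform input is even simpler to generate, because all we need are the two limits. The sketch below assumes hypothetical shell limits of 4.995 and 5.005 and checks that roughly a third of the simulated parts land in each third of the tolerance range.

```python
import random

# Hypothetical tolerance limits for a shell's internal dimension.
SHELL_MIN, SHELL_MAX = 4.995, 5.005

rng = random.Random(7)  # fixed seed so the run is repeatable
simulated_shells = [rng.uniform(SHELL_MIN, SHELL_MAX) for _ in range(1000)]

# Split the tolerance range into thirds and count how many values land in each.
third = (SHELL_MAX - SHELL_MIN) / 3
counts = [0, 0, 0]
for value in simulated_shells:
    index = min(int((value - SHELL_MIN) / third), 2)  # clamp the top edge
    counts[index] += 1
print("low / middle / high thirds:", counts)
```

Contrast that with a normal distribution, where the middle third would hold far more of the parts than either end.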

We have everything we need because we also know what we want in terms of an output (no more Gap than zero, no less than -0.02). We just break out our handy spreadsheet software and do some math.

Make a column for each input or factor, and one more for the output. In each input column (A through G) we will instruct the spreadsheet to randomly generate numbers according to the formula that models each input. A thousand of each should make a very good simulation and still take only a few seconds for our software to calculate. In the output column (Gap) we insert our model formula Gap = (A + G) – (B + C + D + E + F) and copy it down 1,000 rows, one for each randomly generated input number.

The spreadsheet calculates the output for each row, which of course is different in each case because the inputs were randomly generated and are different combinations.
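If a spreadsheet is not handy, the same table can be built in a few lines of Python. This is only a sketch: the means, standard deviations, and limits below are assumptions I made up for illustration, and in practice they would come from the inspection and supplier data described earlier.

```python
import random

TRIALS = 1000
rng = random.Random(2012)  # fixed seed so the table can be regenerated


def simulate_row(rng):
    """Generate one row: random inputs A..G plus the computed Gap."""
    row = {
        "A": rng.uniform(4.995, 5.005),  # shell, uniform (assumed limits)
        "G": rng.uniform(4.995, 5.005),  # shell, uniform (assumed limits)
        "B": rng.gauss(2.27, 0.002),     # spacer, normal (assumed mean, std dev)
        "D": rng.gauss(2.27, 0.002),     # spacer
        "F": rng.gauss(2.27, 0.002),     # spacer
        "C": rng.gauss(1.60, 0.003),     # board, normal (assumed mean, std dev)
        "E": rng.gauss(1.60, 0.003),     # board
    }
    row["Gap"] = (row["A"] + row["G"]) - (
        row["B"] + row["C"] + row["D"] + row["E"] + row["F"]
    )
    return row


table = [simulate_row(rng) for _ in range(TRIALS)]

# Show the first three rows, much as they would appear in the spreadsheet.
columns = ["A", "B", "C", "D", "E", "F", "G", "Gap"]
print("  ".join(f"{name:>7}" for name in columns))
for row in table[:3]:
    print("  ".join(f"{row[name]:7.4f}" for name in columns))
```

Each dictionary plays the role of one spreadsheet row, and the list plays the role of the 1,000 copied-down rows.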

Do you see what we have done? Within the constraints of each input’s actual or assumed performance we have randomly generated data. We generated 1,000 possible combinations of inputs and calculated the expected output 1,000 times. Now we can use the spreadsheet’s various screening tools to tell us what we need to know.

We can generate a histogram of the output data and see how much of the histogram falls outside of our limits of zero and -0.02. Alternatively, we can have the spreadsheet tell us a count of how many outputs are above or below our desired limits. If the result is less than a percent, for example, we might accept that failure rate and go into production. If the result is higher than we can accept, we can look for ways to adjust the design. The model gives us a prediction of actual performance before we have even built a prototype. Why would we not want that?
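Here is a sketch of that screening step in Python. The function names summarize and text_histogram are hypothetical helpers of my own, and the Gap values are a stand-in drawn from an assumed normal spread, not results from a real design.

```python
import random

LOWER, UPPER = -0.02, 0.0  # acceptance limits from the example


def summarize(gaps, lower=LOWER, upper=UPPER):
    """Count outputs below and above the limits and return the failure rate."""
    below = sum(1 for g in gaps if g < lower)
    above = sum(1 for g in gaps if g > upper)
    return below, above, (below + above) / len(gaps)


def text_histogram(gaps, bins=10, width=40):
    """Return a rough text histogram, one line per bin."""
    low, high = min(gaps), max(gaps)
    step = (high - low) / bins or 1.0  # avoid a zero step if all values match
    counts = [0] * bins
    for g in gaps:
        counts[min(int((g - low) / step), bins - 1)] += 1
    peak = max(counts)
    return [
        f"{low + i * step:8.4f} | {'#' * round(width * c / peak)} {c}"
        for i, c in enumerate(counts)
    ]


# Stand-in for the Gap column: 1,000 values from an assumed normal spread.
rng = random.Random(1)
gaps = [rng.gauss(-0.01, 0.007) for _ in range(1000)]

below, above, rate = summarize(gaps)
print("\n".join(text_histogram(gaps)))
print(f"below {LOWER}: {below}, above {UPPER}: {above}, failure rate: {rate:.1%}")
```

Keeping the counts above and below the limits separate is deliberate, because the two failures call for different fixes.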

Now we touch on one of the most valuable aspects of tools like Monte Carlo Simulation. Not only can we predict performance, the tool also helps us see where we can make a better design. We begin to tease designs for ways to eliminate the variation, reduce the influence of certain processes, reduce the influence of certain inputs, eliminate opportunities for things to vary, or just plain simplify. It promotes and stimulates better and smarter engineering. People don’t talk about that but it is, in my opinion, the most powerful result of incorporating the tool into standard practice.
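One simple way to see where the variation comes from is to compare each input's variance. When a model only adds and subtracts inputs, as the Gap formula does, and the inputs vary independently of one another (an assumption on my part), the variance of the output is the sum of the input variances. The sketch below uses made-up spreads to rank the inputs by their share.

```python
# Assumed input spreads (hypothetical numbers, for illustration only).
normal_std = {"B": 0.002, "D": 0.002, "F": 0.002, "C": 0.003, "E": 0.003}
uniform_limits = {"A": (4.995, 5.005), "G": (4.995, 5.005)}

# Variance of a normal input is its standard deviation squared.
variances = {name: std ** 2 for name, std in normal_std.items()}
# Variance of a uniform input is (high - low)^2 / 12.
for name, (low, high) in uniform_limits.items():
    variances[name] = (high - low) ** 2 / 12

total_variance = sum(variances.values())
shares = {name: v / total_variance for name, v in variances.items()}

# List the inputs from largest to smallest contribution.
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.0%} of Gap variance")
```

The inputs at the top of that list are where a tighter tolerance, a different process, or a design change buys the most improvement.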

When Monte Carlo Simulation becomes a regular practice, design teams begin thinking proactively about how to design such that variable inputs have little or no influence on the output. That is a very good thing. Management teams become much more intelligent about launching products, and tend to agree more than disagree with design teams about whether the design is ready to prototype, test, or ready to launch. This tool can single-handedly (with some leadership of course) improve the success rate of designs and processes, and reduce the long-term costs of products.

Monte Carlo Simulation is a very potent tool. Yes, it requires some homework, but it’s not a difficult thing to do. There are some fancy tools out there that make it even easier, some that are specifically designed to perform Monte Carlo Simulations, with a wide variety of price tags (no, I don’t have a distribution model for that), but a professional-quality spreadsheet program is all that is really required.

I strongly encourage us all to give the Monte Carlo Simulation method a try with our designs. It can be very versatile and it gets easier to do the more we practice. It’s not as difficult as it might seem at first and the influence it can have over our designs, our design methods, and our important design and business decisions is enormous.

Take some time this week and look at your ongoing designs. Is there an output you would like to predict before testing? Can you put together a mathematical formula for the system? Can you gather or assume a performance profile for each input? If so, you can indeed predict the outcome. Do so and be the hero this week.

Stay wise, friends.

If you like what you just read, find more of Alan’s thoughts at www.bizwizwithin.com