It has been almost a year since I got involved in a project building a global marketing mix optimization solution for a large consumer packaged goods company. Conceptually, the problem is simple: given a fitted model of the company’s revenue as a function of promotion campaigns for its products, and using the past year’s promotion allocation scenario as a starting point, find a revenue-maximizing scenario subject to promotion expenditure constraints.
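To fix ideas, the setup can be sketched as a toy constrained maximization. Everything below is illustrative, not the client’s actual model: the quadratic-root revenue curve, the `lift` coefficients, and the three-channel allocation are invented stand-ins for a fitted model with hundreds of millions of inputs.

```python
import numpy as np
from scipy.optimize import minimize

# Toy revenue model with diminishing returns per promotion channel.
# (Illustrative only; the real model is fitted from the client's data.)
def revenue(spend, lift=np.array([3.0, 2.0, 1.5])):
    return np.sum(lift * np.sqrt(spend))

last_year = np.array([4.0, 3.0, 2.0])   # starting allocation scenario
budget = last_year.sum()                # total expenditure constraint

result = minimize(
    lambda s: -revenue(s),              # maximize revenue
    x0=last_year,
    bounds=[(0, None)] * 3,             # spend cannot go negative
    constraints=[{"type": "ineq",
                  "fun": lambda s: budget - s.sum()}],  # stay within budget
)
optimal = result.x
```

The optimizer reallocates spend toward the higher-lift channels while keeping total expenditure at last year’s level, which is exactly the *reallocation* idea described below, just at a trivially small scale.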

The problem becomes more interesting when we go into the details. The company is large, so the number of products (brands) is large (up to 100), and every product has multiple (up to 300) SKUs associated with it. In addition, there are around 30 markets where products are sold (e.g., Target, Walmart, Kroger), and products sold through different markets are considered distinct. There are also 7 promotion types (e.g., TV, discounts), and promotion campaigns vary over the 52 weeks of the calendar year. This brings the dimensionality of the problem up to around 100 * 300 * 30 * 7 * 52 = 327,600,000. The dimensionality is large because the whole idea is in *reallocating* resources: we would like to include everything in the optimization so that, for example, the underperforming Duane Reade chain gets fewer promotions and the booming Target gets more. Overall, there is about 2GB of input to the optimizer in the form of tables, each row of which corresponds to a value that the optimizer has the power to change.
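The back-of-the-envelope dimension count is just the product of the factor levels, using the upper bounds quoted above (the 30-market figure is the one implied by the total):

```python
# Worst-case count of decision variables: every brand-SKU-market-
# promotion-week combination is a value the optimizer may change.
brands, skus_per_brand, markets, promo_types, weeks = 100, 300, 30, 7, 52
n_variables = brands * skus_per_brand * markets * promo_types * weeks
print(n_variables)  # 327600000
```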

Setting aside the challenges with the dimensionality of the problem, the most striking and exciting thing is that these large companies are *actually interested in doing this*. A snapshot of the optimization procedure I have developed (with the help of a clever friend) is shown in Figure 1, and a more complete dynamic version is now the logo at theory.info. The procedure works, and it works rather well.

What has puzzled me from the very beginning of the project is how we can first fit a model and then manipulate the predictors, possibly even applying inference algorithms like MCMC, as if the predictors were random variables. In fact, the predictors are not random: they are determined by the finance and planning departments of the company (if they can get them to where they want them through business mechanisms). The problem we are considering here can be conceptualized as an application of experiment design, namely the Response Surface Methodology. I wonder if George Box ever thought dimensionality this high would be explored in an industrial setting. Excitingly, it is.
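The Response Surface Methodology idea can be sketched in one factor: fit a low-order polynomial to observed (setting, response) pairs and move toward the stationary point of the fitted surface. The data here is simulated and the one-factor setup is a deliberate simplification; the project’s setting has many factors, but the mechanics are the same.

```python
import numpy as np

# Simulated observations around an unknown optimum at x = 6
# (in the project, x would be a promotion setting and y revenue).
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 20)
y = -0.5 * (x - 6.0) ** 2 + 40 + rng.normal(0, 0.5, x.size)

# Second-order response surface: y ~ b0 + b1*x + b2*x^2.
b2, b1, b0 = np.polyfit(x, y, 2)

# Stationary point of the fitted surface; a maximum when b2 < 0.
x_star = -b1 / (2 * b2)
```

The diagnostics mentioned later in the post amount to checking that such a fitted surface is still trustworthy at the settings the optimizer wants to move to.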

This raises a question: is the scientific community ready to approach these problems? That is, can it provide the answer-hungry industry with guidelines for correct methodology? A version of this question was addressed to Sir David Cox during his keynote lecture at JSM 2011 yesterday (the conference is going great). The lecture hall was huge, with several hundred attendees. One of them asked: “Given that the experiment design methodology is evolving so rapidly, do you think that the current literature on experiment design is outdated?” The response was that the newer techniques include response surface designs and should be actively developed. Sir David was born in 1924 and is still capable of giving this extremely sharp answer.

Coming back to the project described in this post, I am convinced that optimization alone won’t allow the client to achieve the postulated revenue levels. A schedule of factor adjustments and response surface diagnostics must be in place so that the optimization is performed on the appropriate objective function. At the margin, however, the optimized promotion campaign allocations are likely pointing in the right directions, which is a good start.

The project was done working with In4mation Insights. Client privacy is preserved in this post. I built the dynamic visualization on theory.info using an exciting JavaScript library, D3, developed by Mike Bostock of Stanford CS. It is a great tool, and I will be using it more (hopefully, the asynchronous transitions will eventually become more flexible).

Tags: conferences, d3, experiment design, in4ins, optimization, Theory, visualization

