Our previous blog post discussed the gap between data analysts and big data. In this post, we'll look at how those data analysts go about building models and leveraging big data to drive insights.

Marketing mix modeling is the science of establishing mathematical relationships between marketing tactics and sales. Statistical modeling is an important means of discovering these relationships. Modeling isn’t new; it has been around for many decades.

From Excel to SAS, there are many tools an analyst can use to build models. However, it isn’t exactly a walk in the park to do marketing mix modeling. Why is it so difficult to put models to work for marketing analytics? For that, we have to first understand the many different steps before and after the model is built by the analyst:

1. The analyst loads the customer’s data into a database.
2. She cleans up the data by removing statistical anomalies.
3. She analyzes the data, perhaps running a few regressions.
4. She looks at the results, refining and augmenting the input. She might have to iterate over steps 3 and 4 multiple times. After multiple iterations, the analyst selects a system of equations that describes the relationships between marketing tactics and sales.
5. The data analyst deploys the model in a software package for the end user to build “what if” scenarios.
6. A solver might be deployed in an application to find optimized scenarios.

Now, the business user can provide new inputs about marketing activities. The software application gives the business user predictions about financial metrics or can even suggest improvements to the original inputs.

Simple enough, right? As they say in Inception, we must go deeper.

A very simple modeling equation might look something like this:

Weekly Unit Sales = Base Demand * (Paid Search Spend ^ Paid Search Lift Factor) * (TV Spend ^ TV Lift Factor) * …
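To make the shape of that equation concrete, here is a minimal Python sketch. It assumes each lift factor acts as an exponent on its channel's spend; the function name, channel names and numbers are all hypothetical and chosen only for illustration.

```python
def weekly_unit_sales(base_demand, spend, lift):
    """Base demand multiplied by each channel's spend raised to its lift factor."""
    sales = base_demand
    for channel, amount in spend.items():
        sales *= amount ** lift[channel]
    return sales


# Hypothetical inputs, not taken from any real model.
spend = {"paid_search": 10_000.0, "tv": 50_000.0}
lift = {"paid_search": 0.10, "tv": 0.05}
baseline = weekly_unit_sales(1_000.0, spend, lift)

# Doubling paid search spend scales sales by 2 ** 0.10, roughly a 7% gain.
doubled = weekly_unit_sales(1_000.0, {**spend, "paid_search": 20_000.0}, lift)
print(doubled / baseline)
```

With a lift factor below 1, each extra dollar buys less than the one before it, which is why this form is a popular way to express diminishing returns.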

Real-life marketing mix models are rarely that simple. They often consist of systems of equations representing marketing funnels and decision journeys. They contain data transformations that better describe consumer behavior. But the fundamental challenges remain the same.

Data Management

The journey from marketing touch points to sales is much more complicated in the real world. Data has to be collected and stored from hundreds of sources. For a midsize enterprise, a single year of impression-level digital data could run to roughly 30 terabytes. Hundreds of variables have to be transformed to remove statistical anomalies.


Next comes the building of model equations. The modeling process uses the available historical data to establish significant relationships between marketing targets and marketing drivers. With a simple model, the relationship may be derived by running linear regressions. In real-world applications, however, analysts usually have to explore and apply far more sophisticated statistical algorithms.
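For the simple case, here is a rough sketch of what "running a linear regression" can mean. It assumes a single driver and a power-law response, so taking logs turns the fit into ordinary least squares on a straight line. The function name and the synthetic data are hypothetical, and a real model would involve many drivers and noisy data.

```python
import math


def fit_log_log(spend, sales):
    """Least-squares fit of log(sales) = log(base) + lift * log(spend)."""
    xs = [math.log(s) for s in spend]
    ys = [math.log(u) for u in sales]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    lift = sxy / sxx                          # slope in log space
    base = math.exp(mean_y - lift * mean_x)   # intercept, mapped back from logs
    return base, lift


# Synthetic, noise-free data generated with base = 500 and lift = 0.2.
spend = [1_000.0, 2_000.0, 4_000.0, 8_000.0, 16_000.0]
sales = [500.0 * s ** 0.2 for s in spend]
base, lift = fit_log_log(spend, sales)
print(base, lift)
```

Because the data here has no noise, the fit recovers the generating parameters almost exactly. Real data will not be so kind.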

In general, the typical modeling process includes the following elements:

  • Data exploration, variable generation and transformation - explore potentially significant marketing drivers, drawing on both past experience and business domain knowledge.
  • Statistical algorithms and techniques - apply methods suited to the complexity of the data, such as logistic regression, time series analysis, principal component analysis, and Bayesian analysis (to name a few), to expose the relationship between spend and sales.
  • Interaction exploration - understand how combinations of variables affect the dependent outcome.
  • Direct and indirect impact from drivers - account for effects that mimic customer behavior, such as time-lagged reactions to specific campaigns.
  • Model validation - ensure the validity and robustness of the models. In real life, a single equation is never enough to represent the relationship between marketing tactics and eventual sales.
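One common way to capture the time-lagged reactions mentioned above is a geometric "adstock" transformation, in which part of each week's spend carries over into the following weeks. This is a sketch of that general technique, not necessarily the transformation used in any particular model. The function name and the decay rate are assumptions for illustration.

```python
def adstock(spend, decay):
    """Geometric carryover: each week keeps `decay` times last week's adstocked value."""
    carried = 0.0
    transformed = []
    for amount in spend:
        carried = amount + decay * carried
        transformed.append(carried)
    return transformed


# A single burst of TV spend keeps "working" in later weeks.
weekly_tv = [100.0, 0.0, 0.0, 0.0]
print(adstock(weekly_tv, 0.5))  # [100.0, 50.0, 25.0, 12.5]
```

The transformed series, rather than the raw spend, would then be fed into the regression as the driver.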

"What if" Scenarios

If the business user put these models in an Excel spreadsheet, what would they do with them? The dizzying number of coefficients and equations would simply confuse the business user. Instead, the models should be embedded in an application. The application uses the models to meet critical business requirements such as response curves, “what if” scenarios, attribution and pricing.

Converting bare-bones equations into an application requires a sophisticated metadata management workflow. Depending on the business user’s definition, the application treats each variable differently. For instance, the customer might be constrained in how much the TV budget can be varied for a particular product. The application has to reflect this restriction to prevent the user from getting lost in the maze of equations. Another example is a variable such as the unemployment index, which the user should not be able to change. The application is architected so that these business definitions are reflected in its user interfaces.
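As a rough illustration of what such variable metadata could look like, here is a small Python sketch. The `VariableSpec` class, its fields and the example bounds are all hypothetical. They are not drawn from any actual product, and only show how a business rule can be attached to a model variable and enforced before a scenario runs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VariableSpec:
    """Hypothetical metadata describing how the application treats one variable."""
    name: str
    editable: bool
    min_value: Optional[float] = None
    max_value: Optional[float] = None


def validate_input(spec, value):
    """Return the value if it obeys the variable's business rules, else raise ValueError."""
    if not spec.editable:
        raise ValueError(f"{spec.name} cannot be changed by the user")
    if spec.min_value is not None and value < spec.min_value:
        raise ValueError(f"{spec.name} must be at least {spec.min_value}")
    if spec.max_value is not None and value > spec.max_value:
        raise ValueError(f"{spec.name} must be at most {spec.max_value}")
    return value


# Illustrative definitions: a bounded TV budget and a read-only economic index.
tv_budget = VariableSpec("tv_budget", editable=True, min_value=100_000.0, max_value=200_000.0)
unemployment = VariableSpec("unemployment_index", editable=False)

print(validate_input(tv_budget, 150_000.0))
```

A user interface could read the same metadata to decide which fields to show as sliders, which to grey out, and what range each slider should allow.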


Marketing analytics applications provide quantitative guidance to marketing executives and decision makers. This guidance leads to intelligent decisions that maximize the value of the marketing investment. With a deep understanding of their marketing dynamics, gained through modeling and “what if” analysis, marketers can get answers and insights for their campaigns within their business constraints. For example:

  • What is the maximum revenue or profit for a given total marketing spend?
  • How do you spend the least amount to achieve a target revenue or profit?
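To give a feel for the first question, here is a sketch under a strong simplifying assumption: sales follow a single multiplicative equation in which each channel's spend is raised to a positive lift factor. In that special case the best split of a fixed budget has a closed form, with each channel receiving a share proportional to its lift factor. Real models are systems of equations with business constraints and need a numerical solver. The function names and numbers here are hypothetical.

```python
import math


def allocate_budget(total_budget, lift):
    """Split maximizing prod(spend ** lift) subject to the spends summing to total_budget."""
    if any(factor <= 0 for factor in lift.values()):
        raise ValueError("the closed form assumes positive lift factors")
    total_lift = sum(lift.values())
    return {ch: total_budget * factor / total_lift for ch, factor in lift.items()}


def predicted_sales(base_demand, spend, lift):
    """Simple multiplicative response used to compare allocations."""
    return base_demand * math.prod(spend[ch] ** lift[ch] for ch in lift)


lift = {"paid_search": 0.10, "tv": 0.05}
best = allocate_budget(60_000.0, lift)
even = {"paid_search": 30_000.0, "tv": 30_000.0}

print(best)
print(predicted_sales(1_000.0, best, lift) > predicted_sales(1_000.0, even, lift))
```

The second question, reaching a target at the lowest spend, is the mirror image of the same problem and is usually handled by the same solver with the objective and the constraint swapped.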

The picture below shows the different stages of the analytic workflow. In the next few posts, we will dive deeper into the challenges and possible solutions for each stage of the analytic workflow.

Satya Ramachandran

Satya is the VP of Engineering at MarketShare and is responsible for all product development. He has more than 15 years of experience in distributed computing and real-time analytics at Sybase, Cognos and 3PARData.