Predictive analytics is the branch of analytics that looks at past data and tries to predict what will happen in the future. Therein lie both the opportunities and the problems...
Looking into the past to know what will happen in the future hinges on a key assumption: that the past is a good reflection of the future. To use an analogy we can all relate to, it's like driving a car forward on a highway while looking into the rear-view mirror. If the road ahead is just like the road you have already covered, there are no issues.
But what happens if the road ahead is very different, say bumpy, curvy, zig-zagging or full of diversions? Everyone can guess the answer.
The same applies to using past data to make predictions about the future. For example, if your customers' past behavior is a strong indicator of what they will do in the future, you can reap great benefits from predictive modeling.
Say an eCommerce company is running an online campaign during the holiday season to increase sales. The company's customer base, and how those customers behave with the company, has not changed much since the last holiday season. The company can build predictive models using the last campaign's data to find out who will respond, how large the incremental sales will be, and so on.
On the other hand, let's say the company's customer base has shifted a lot since the previous campaign. Earlier it was 95% female and only 5% male. This year the mix is 65% female and 35% male. Do you think we can still use the previous campaign's data to build prediction models?
The answer is probably not. Here is why: let's say the company promoted women's dresses in the last campaign and got a great response thanks to its predominantly female customer base. If it does the same thing in the current campaign, it will most likely not get as big a boost in sales, because 35% of customers are now male and they may not be as responsive.
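The shift in customer mix described above can be put into a single number. One common way to do that is the population stability index (PSI), which is a swapped-in technique and not something the campaign example itself prescribes. The sketch below is minimal and uses only the 95/5 and 65/35 splits from the example; the function name `population_stability_index` and the dictionary layout are illustrative assumptions.

```python
import math

def population_stability_index(expected, actual):
    """PSI between two categorical distributions given as {category: share}.

    Assumes both dictionaries have the same categories and that every
    share is strictly greater than zero (the logarithm needs that).
    """
    psi = 0.0
    for category, expected_share in expected.items():
        actual_share = actual[category]
        # Each category contributes (actual - expected) * ln(actual / expected),
        # which is never negative and is zero only when the shares match.
        psi += (actual_share - expected_share) * math.log(actual_share / expected_share)
    return psi

# Customer mix from the example: last campaign vs. this year.
last_campaign = {"female": 0.95, "male": 0.05}
this_campaign = {"female": 0.65, "male": 0.35}

psi = population_stability_index(last_campaign, this_campaign)
print(f"PSI = {psi:.3f}")
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a major shift. The value for this example comes out at about 0.70, far beyond that threshold, which matches the intuition that last year's data no longer describes this year's customers.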
You may ask, what is the solution? Here are a couple of suggestions:
If you are building a prediction model, ensure that the data on which the model is trained is similar (statistically and business-wise) to the data on which the model is going to be applied. Generally, if you pick recent data to train your model, it should be indicative of your current population.
If that is not the case, you are better off running an A/B test (design of experiments) to find out the best strategy for the existing population. This means that rather than trying to predict what a customer will or will not like, you ask the customer directly. Doesn't that make more sense?
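To make the A/B testing suggestion concrete, here is a minimal sketch of how the results of such a test might be compared, using a two-proportion z-test. The visitor and response counts are made up purely for illustration, and the function name `two_proportion_z_test` is an assumption, not part of any particular library.

```python
import math

def two_proportion_z_test(responses_a, size_a, responses_b, size_b):
    """Two-sided z-test for the difference between two response rates.

    Returns (z, p_value). Assumes both groups are large enough for the
    normal approximation to be reasonable.
    """
    rate_a = responses_a / size_a
    rate_b = responses_b / size_b
    # Pooled response rate under the null hypothesis of no difference.
    pooled = (responses_a + responses_b) / (size_a + size_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / size_a + 1 / size_b))
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: offer A is last year's campaign, offer B is a new one.
z, p_value = two_proportion_z_test(responses_a=120, size_a=2000,
                                   responses_b=165, size_b=2000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

With these made-up numbers, offer B responds at 8.25% against 6% for offer A, and the small p-value suggests the gap is unlikely to be chance alone. In practice the sample sizes and the significance threshold should be decided before the test starts.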
Disclaimer: The views expressed here are solely those of the author in his private capacity.