This is a user generated content for MyStory, a YourStory initiative to enable its community to contribute and have their voices heard. The views and writings here reflect that of the author and not of YourStory.

Intelligent Marketing Funnel: Game of Numbers

Wednesday, June 26, 2019


Analytics, data science, machine learning (ML) and artificial intelligence (AI) now take up a major share of the presentations in any company's strategic planning. Be it marketing, finance, supply chain or HR, every department consumes analytics in the form of AI and ML. Even after so much research work, analytics is still a cost center for many of these departments. Among them all, marketing is one of the functions where ML is actually turning the department into a profit center.

There are countless articles and reports about how companies plan to leverage AI/ML, but most of that talk stays in PPTs and annual reports. For a non-data scientist or a fresher, it is still a grey area. Almost all of you have heard the buzz that data science is helping marketing, but how, and which algorithms are actually being used? In this article, I will largely touch upon the data science techniques and ML methods that are implemented through R/Python/SPSS/SAS on a day-to-day basis. But before going into the details of the use cases, let's understand some basic metrics that data science is expected to improve.

Before that, let's understand the focus areas of marketing in a few lines.

  1. First of all, we need to understand our products, competitors and markets, i.e. SWOT analysis: what, how much and where to sell?
  2. Decide your target customer base: whom to sell to?
  3. Create a communication strategy to reach customers via various communication channels: how to sell?
  4. Analyze the responses and convert them into purchases.
  5. Now what? Don't worry, next time target them for another product (cross-sell/up-sell) :)

This whole journey of a customer is called a funnel. The funnel starts at customer identification and ends at the final purchase, i.e.

Product Awareness (Communication) ----- User Response ----- Lead Generation ----- Lead Evaluation ----- Sale

Coming back to the basics, the performance/KPI of the overall funnel (i.e. the overall funnel ratio) can be defined as:

In terms of cost: (Final Sales) / (Cost of Acquisition)

In terms of conversion ratio: (Final converted numbers) / (Total customers who entered the funnel)

If I were the analytics head of a marketing department, I would definitely decompose the funnel, focus on the components of this funnel ratio and understand the performance KPIs of each stage. As per my understanding, this can be simplified as:

Overall Funnel Ratio / Overall Performance = (Performance at stage_1) * (Performance at stage_2) * (Performance at stage_3) * (Performance at stage_4)

Funnel Ratio = (User Response / Sent Communication Items) * (Total Leads / Total Response) * (Qualifying Leads / Total Leads) * (Conversions / Total Qualifying Leads)
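The decomposition above can be sketched in a few lines of Python. The stage counts below are made-up numbers used only for illustration, and the names `funnel` and `stage_ratios` are my own, not part of any library. Because each stage's denominator cancels against the previous stage's numerator, the product of the four stage ratios equals conversions divided by communication items sent.

```python
from math import prod

# Hypothetical campaign counts, invented for illustration only.
funnel = {
    "sent": 100_000,      # communication items sent
    "responses": 5_000,   # views or clicks
    "leads": 1_000,       # submitted lead forms
    "qualified": 400,     # qualifying leads
    "conversions": 100,   # final purchases
}


def stage_ratios(counts):
    """Return the ratio of each stage to the stage just before it."""
    names = list(counts)
    return {
        f"{prev}->{cur}": counts[cur] / counts[prev]
        for prev, cur in zip(names, names[1:])
    }


ratios = stage_ratios(funnel)
overall = prod(ratios.values())  # product of the four stage ratios

for stage, value in ratios.items():
    print(f"{stage}: {value:.2%}")
print(f"overall funnel ratio: {overall:.4%}")
```

Looking at the per-stage ratios side by side shows which stage is the weakest link, which is where the analytics effort is likely to pay off first.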

To introduce data science and analytics into each stage, we need to understand the ratio of each stage in detail, because the KPIs of each stage depend heavily on the activities in that stage.

STAGE 1: Total user response out of Total Advertisement Units.

The very first step of marketing after target base identification is campaigning. The success of a campaign lies in how many users respond to it. A response is simply whether a customer has seen our advertisement or has responded to our communication through a view or a click.

Further, advertisement units vary by communication channel: total number of emailers, total SMSs, total Facebook impressions, total views of browser push notifications, etc. Hence the KPIs for each channel also differ.

For email, we measure open rate and click rate; for browser notifications, view rate and click rate; for a Facebook campaign, clicks per impression; for YouTube, total clicks per view.
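As a rough sketch of these channel KPIs, each rate is just an event count divided by the advertisement units of that channel. The counts and the helper names `rate` and `channel_kpis` below are hypothetical. I also assume that every rate is measured against units sent or shown, although some teams measure click rate against opens instead.

```python
def rate(events, units):
    """Share of advertisement units that produced the event (0.0 if no units)."""
    return events / units if units else 0.0


# Hypothetical counts per channel; "units" means emails sent, push
# notifications delivered, or impressions, depending on the channel.
channels = {
    "email": {"units": 20_000, "open": 4_000, "click": 500},
    "browser_push": {"units": 50_000, "view": 10_000, "click": 750},
    "facebook": {"units": 80_000, "click": 400},
}


def channel_kpis(counts):
    """Turn raw event counts into rates such as open_rate and click_rate."""
    units = counts["units"]
    return {
        f"{event}_rate": rate(n, units)
        for event, n in counts.items()
        if event != "units"
    }


kpis = {name: channel_kpis(counts) for name, counts in channels.items()}
for name, values in kpis.items():
    print(name, {k: f"{v:.2%}" for k, v in values.items()})
```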

In 2018, the average click-through rate in Google Ads across all industries was 3.17% on the search network and 0.46% on the display network. For email, the open rate across all industries varies from 15% to 24% and the CTR varies from 1.5% to 3.25%.

Here is how analytics and ML help improve these KPIs:

Effective Audience Selection:

Product Propensity: It is applied once we have identified "what to sell". An ML classification model (bagging, boosting or a linear classifier) helps identify hot customers for any product. We can develop an individual binary classification model for each product, or a single multiclass classification model for all products. The output of this model is a probability score for each product against each customer.
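To make the idea of a propensity score concrete, here is a minimal sketch of a linear classifier (logistic regression) for a single product, fitted with plain gradient descent in standard-library Python. The two features, the toy data and the function names are all assumptions made for illustration. In practice one would more likely reach for a bagging or boosting model from an ML library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=3000):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = sigmoid(z) - yi
            grad[0] += err
            for j, xj in enumerate(xi, start=1):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity(w, x):
    """Probability score that a customer with features x buys the product."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


# Hypothetical features: [site visits last month, products already held].
X = [[1, 0], [2, 0], [2, 1], [3, 1], [6, 1], [7, 2], [8, 2], [9, 3]]
y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = bought the product

w = fit_logistic(X, y)
hot = propensity(w, [8, 2])   # an active, engaged customer
cold = propensity(w, [1, 0])  # a barely active customer
print(f"hot: {hot:.2f}, cold: {cold:.2f}")
```

Training one such model per product and stacking the scores gives the customer-by-product probability table described above.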

Recommendation Engine: It is mostly applied to our existing customer base to promote cross-sell and up-sell. We can use association rule mining as well as collaborative filtering to rank products for customers. Ex: Customer A bought products 1 and 2, hence we can also pitch product 3 to him.
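The "bought 1 and 2, pitch 3" example can be sketched with the two basic quantities of association rule mining, support and confidence. The purchase histories below are invented, and `recommend` is a hypothetical helper, not a full Apriori implementation.

```python
# Hypothetical purchase histories, one set of products per customer.
baskets = [
    {"product_1", "product_2", "product_3"},
    {"product_1", "product_2", "product_3"},
    {"product_1", "product_2"},
    {"product_2", "product_3"},
    {"product_1", "product_4"},
]


def support(itemset):
    """Fraction of baskets that contain every item in the itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent); 0.0 if the antecedent never occurs."""
    base = support(antecedent)
    return support(antecedent | consequent) / base if base else 0.0


def recommend(owned, min_confidence=0.5):
    """Rank products not yet owned by the confidence of the rule owned -> product."""
    candidates = set().union(*baskets) - owned
    scored = [(confidence(owned, {p}), p) for p in candidates]
    return sorted([pair for pair in scored if pair[0] >= min_confidence], reverse=True)


print(recommend({"product_1", "product_2"}))
```

Here the rule {product_1, product_2} -> product_3 holds in two of the three baskets that contain both products, so product 3 is the only item that clears the confidence threshold.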

CLTV analysis: Customer lifetime value analysis estimates the total value that a customer can bring to the company. It helps a marketer decide whom to send promotional offers to. This, in turn, reduces the cost of promotions. We can either use the average revenue per month, or derive the probability of a customer being alive using a beta-geometric model (for example BG/NBD or BG/BB), with a gamma-gamma model estimating the monetary value per transaction.
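As a minimal sketch of the simpler, average-revenue route, the snippet below discounts a constant monthly revenue by the chance that the customer is still active in each month. The constant retention rate, the 36-month horizon, the promotion cost and the customer figures are all assumptions for illustration. This is a stand-in for, not an implementation of, the probabilistic models named above.

```python
def simple_clv(avg_monthly_revenue, monthly_retention, horizon_months=36):
    """Sum of expected monthly revenue, assuming a constant retention rate."""
    return sum(
        avg_monthly_revenue * monthly_retention ** month
        for month in range(horizon_months)
    )


# Hypothetical customers: (average monthly revenue, monthly retention rate).
customers = {
    "customer_A": (500.0, 0.95),
    "customer_B": (50.0, 0.60),
}
promo_cost = 300.0  # assumed cost of the promotional offer per customer

clv = {name: simple_clv(rev, ret) for name, (rev, ret) in customers.items()}
targets = [name for name, value in clv.items() if value > promo_cost]

for name, value in clv.items():
    print(f"{name}: {value:.0f}")
print("send promotion to:", targets)
```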

Effective Channel Selection: It comes into the picture once we have identified "whom to sell to" and "what to sell". An ML classification model helps derive probability scores (probability to respond) for each customer across all channels. Thus, we can prioritize channels for every customer. Say, for customer A, the probability of a response to email is higher than on Facebook.
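Once such scores exist, channel prioritization is a simple ranking. The scores below are placeholders standing in for the output of a response model, and `prioritize` is a hypothetical helper.

```python
# Hypothetical response probabilities per customer and channel,
# standing in for the output of a trained classification model.
response_scores = {
    "customer_A": {"email": 0.12, "sms": 0.05, "facebook": 0.03},
    "customer_B": {"email": 0.02, "sms": 0.04, "facebook": 0.09},
}


def prioritize(channel_scores):
    """Return channels ordered from most to least likely to get a response."""
    return sorted(channel_scores, key=channel_scores.get, reverse=True)


plan = {customer: prioritize(scores) for customer, scores in response_scores.items()}
for customer, channels in plan.items():
    print(customer, "->", channels)
```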

Effective Content Design: Natural language processing (NLP) serves best when it comes to content analysis and content design.

Analysis of Key Components in the Communication Message: Collect historical data on responding and non-responding users along with the communication content. Create a classification ML model (preferably logistic regression) that identifies the parts of the communication content which influence users to respond. Ex: Should we mention the interest rate in the subject line or not? Sometimes a basic term-frequency analysis can give us the root cause, e.g. how often long subject lines have increased or decreased the open rate.
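Before building a model, the basic analysis mentioned above can be sketched by grouping past campaigns on a property of the subject line and comparing open rates. The campaign data, the six-word cutoff for a "long" subject line and the helper `open_rate_by` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical past campaigns: (subject line, emails sent, emails opened).
campaigns = [
    ("Personal loan at 10.5% interest rate", 1000, 220),
    ("Your pre-approved loan offer is waiting for you today", 1000, 150),
    ("Low interest rate this week", 1000, 240),
    ("Check out everything new in our latest loan products now", 1000, 140),
]


def open_rate_by(group_of):
    """Aggregate the open rate per group, where group_of maps a subject to a group."""
    sent = defaultdict(int)
    opened = defaultdict(int)
    for subject, n_sent, n_opened in campaigns:
        group = group_of(subject)
        sent[group] += n_sent
        opened[group] += n_opened
    return {group: opened[group] / sent[group] for group in sent}


# Does mentioning the interest rate in the subject line help?
by_term = open_rate_by(lambda s: "interest rate" in s.lower())
# Do long subject lines (more than six words, an arbitrary cutoff) hurt?
by_length = open_rate_by(lambda s: "long" if len(s.split()) > 6 else "short")

print(by_term)
print(by_length)
```

A gap like this in real data is only a hint. The same flags can then be fed as features into the logistic regression to check whether the effect holds once other factors are controlled for.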

Placement of Content in Mailers: A basic exploratory analysis of email performance data is sufficient here. The data must contain the email responses and the properties of the emails.

STAGE 2: Total leads out of Total response.

Leads are mostly captured through "Fill the forms", "Click or Know more", "Apply through Chatbots" or "Direct calling", based on the customer's footprint. In some cases, a response (say, a click on a notification or a view) is counted as a lead, but most of the time this view/click is not enough, and a lead is counted only after a lead/application form has been filled. In many such cases, people click on an ad but back out when they are asked to submit a form. In an email campaign, for instance, a customer often opens the mailer and clicks on its content, but after being redirected to the company page decides not to go further. This is the loss of a lead. In the marketing world, it may be referred to as "Lead Leakage" or "Lead Drop".

Analysis of lead drop is crucial, as it reflects the look and feel of our application form or the reliability of the content on the arrival page.

How a data scientist can help here:

Lead Propensity Model: A classification ML model can identify potential customers who are likely to submit the lead form. A lead propensity model differs from a product propensity model in the selection of independent variables: it also takes the look-and-feel attributes of the arrival page into model building.

Leads through Chatbot: A chatbot might be a smart agent, but to date it is not smart enough to understand the mood of the customer. This mood should be captured to make the chatbot smarter and more interactive. Currently, the deep learning models most commonly used for this are seq2seq recurrent neural networks.

Seq2seq networks usually contain two LSTM (long short-term memory) networks: an encoder and a decoder. As the name suggests, they take a sequence (the context sentence) and produce another sequence (the output sentence/machine response). The previous chat history is fed to the LSTM as input.

Form Analytics: A form requires visitors to fill in several fields (name, age, city, mobile, salary). The structure of the form, or a mismatch between campaign content and form content, provokes visitors to drop off. With form analytics, we can figure out how many leads are lost between the start and the end of the form, i.e. the drop in leads at each field with respect to the previous field. Although predictive modeling has not gained much ground in form analytics, A/B testing around it works well.
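The field-by-field drop described above can be sketched as follows. The counts of visitors completing each field are invented, and `field_drop_off` is a hypothetical helper.

```python
# Hypothetical number of visitors who completed each field, in form order.
completed = [
    ("name", 1000),
    ("age", 900),
    ("city", 850),
    ("mobile", 600),
    ("salary", 450),
]


def field_drop_off(completed):
    """Share of visitors lost at each field relative to the previous field."""
    return {
        field: 1 - count / prev_count
        for (_, prev_count), (field, count) in zip(completed, completed[1:])
    }


drops = field_drop_off(completed)
worst_field = max(drops, key=drops.get)

for field, drop in drops.items():
    print(f"{field}: {drop:.1%} dropped")
print("largest drop at:", worst_field)
```

The field with the largest drop is a natural candidate for an A/B test, for example making it optional or moving it later in the form.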

STAGE 3: Total qualifying leads out of Total Submitted leads.

A Marketing Qualified Lead (MQL) is one who has voluntarily shown interest in purchasing your product. The concept came to light when unwanted leads were being captured just to justify the effectiveness of a campaign. The criteria for an MQL vary from business to business as well as from product to product. For e-commerce, an MQL may simply be a non-duplicate customer intent for the same product, but for loan products the MQL criteria also cover CIBIL score, income, etc.
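A rule-based MQL filter of this kind might look like the sketch below, here for a loan product. The field names and the cutoffs (a CIBIL score of 700 and a monthly income of 30,000) are assumptions for illustration, since every business sets its own criteria.

```python
# Hypothetical submitted leads for a loan product.
leads = [
    {"customer_id": 1, "product": "loan", "cibil": 780, "income": 60_000},
    {"customer_id": 1, "product": "loan", "cibil": 780, "income": 60_000},  # duplicate
    {"customer_id": 2, "product": "loan", "cibil": 610, "income": 80_000},
    {"customer_id": 3, "product": "loan", "cibil": 750, "income": 20_000},
]


def qualify(leads, min_cibil=700, min_income=30_000):
    """Keep the first lead per (customer, product) that meets both cutoffs."""
    seen = set()
    qualified = []
    for lead in leads:
        key = (lead["customer_id"], lead["product"])
        if key in seen:
            continue  # same customer intent for the same product
        seen.add(key)
        if lead["cibil"] >= min_cibil and lead["income"] >= min_income:
            qualified.append(lead)
    return qualified


mqls = qualify(leads)
print(f"{len(mqls)} of {len(leads)} submitted leads qualify")
```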

Companies usually define the MQL criteria themselves, but sometimes they too are unaware of which leads can be turned into a final purchase and which leads are not worth focusing on. Here data science helps businesses identify the hard-to-measure behavior of good leads.

Lead Potential: A classification ML model (logistic regression) can take attributes such as website activity, RFM (recency, frequency, monetary) parameters, etc. and predict whether a lead has the potential to convert or not.
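The RFM parameters that feed such a model can be derived from a transaction log, as in this sketch. The transactions, the reference date and the helper `rfm` are hypothetical.

```python
from datetime import date

# Hypothetical transaction log: (customer, purchase date, amount).
transactions = [
    ("cust_1", date(2019, 6, 1), 1200.0),
    ("cust_1", date(2019, 6, 20), 800.0),
    ("cust_2", date(2019, 3, 15), 300.0),
]


def rfm(transactions, as_of):
    """Compute recency (days), frequency (count) and monetary (total) per customer."""
    summary = {}
    for customer, day, amount in transactions:
        row = summary.setdefault(
            customer, {"last": day, "frequency": 0, "monetary": 0.0}
        )
        row["last"] = max(row["last"], day)
        row["frequency"] += 1
        row["monetary"] += amount
    return {
        customer: {
            "recency_days": (as_of - row["last"]).days,
            "frequency": row["frequency"],
            "monetary": row["monetary"],
        }
        for customer, row in summary.items()
    }


features = rfm(transactions, as_of=date(2019, 6, 26))
for customer, values in features.items():
    print(customer, values)
```

Each customer's three values then become input columns for the lead potential classifier, next to website activity and other attributes.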

Fraudulent Leads: Another application of logistic regression here is detecting fraudulent leads. The fraud detection system can run offline, or it can be integrated into a live production system.

Recommendation Engine: Online commercial sites now use recommendation engines to turn non-potential leads into hot leads. Apart from product recommendation, natural language processing on social media data provides brand-customer sentiment, which helps influence visitors to become customers.

STAGE 4: Total conversions out of Total Qualifying Leads

The processes after MQL identification are generally taken care of by the operations team. For loan providers, a field representative visits the qualified leads to complete the remaining formalities; the same is the case for real estate and automobiles. For online stores, it is the web developer who is responsible for a smooth checkout procedure.

Intelligent Document Verification: Document NLP is a strong tool, but it is still at an early stage. Its use is very niche, though some companies use deep learning-based image processing for online document verification.

Conclusion: A data scientist surely plays a necessary role in making this complete funnel intelligent and sharp, but number crunching alone is still not sufficient. A data scientist also needs an in-depth understanding of "how to do marketing".
