Customer Churn and Retention
Introduction to Churn and Retention Analysis
Churn and retention analysis aims to identify the reasons why customers discontinue their use of a product or service. Leveraging tools like Infer, you can even forecast which users are at risk of churning in the near future.
Churn manifests in various ways, including:
- Contract termination
- Account deletion
- App uninstallation
- Lack of repeated website visits
- A single purchase without follow-up in a 6+ month period
To effectively analyze churn, begin by defining the churn behavior relevant to your product. Then, choose the variables or features most insightful for your churn analysis. These usually encompass demographic data (e.g., age, country) and product usage metrics (e.g., overall activity, number of orders, value of orders).
Basic Example Using Kaggle's Bank Customer Churn Dataset
In the following example, we draw from the Bank Customer Churn Dataset on Kaggle. The dataset provides demographic information and high-level bank details for both active and churned customers.
To fit with our analysis, you can download the corresponding raw dataset here.
SQL-inf Query for Churn Prediction
SELECT * FROM Bank_Customer_Churn_Prediction PREDICT(churn, ignore=customer_id)
This query selects all columns and constructs a predictive model targeting the churn
column. The model learns from existing churn patterns while ignoring the customer_id
, which is irrelevant for prediction.
Key Findings
Our results can be broken down into four main categories:
Model Accuracy: With an approximate accuracy of 80%, the model significantly outperforms random guessing (50%), making its insights reliable.
Feature Importance: The four most influential factors contributing to 75% of the predictions are product count, age, membership activity, and geographic location.
Highlights: The analysis yielded the following insights:
- Customers with two products exhibit the lowest churn rates, whereas those with three or more products are at a higher risk.
- Young customers, especially those below 30, have a churn rate between 7-10%. Churn increases steadily beyond age 30, peaking at nearly 50% between ages 46-53.
- Active members are substantially less prone to churn (23%) compared to inactive ones (38%).
- German customers are twice as likely to churn (~50%) as those from France and Spain (~25%).
Additional Features: This section enables deeper exploration into other, less influential factors like gender's effect on churn.
Strategic Conclusions and Recommendations
- Target Demographics: The current product offering appears less appealing to older customers and those located in Germany. Consider tailoring features to meet the specific needs of these demographic groups, or alternatively, realign marketing strategies to target them more effectively.
- Optimal Product Ownership: Our data clearly shows that customers with two products are least likely to churn. Marketing and product strategies should, therefore, aim to encourage customers to own exactly two products for higher retention.
By implementing these targeted strategies, we can likely improve both customer satisfaction and retention rates, thereby enhancing the overall customer experience and contributing to long-term success.
Advanced View Creation for Churn
A user will usually have to define churn if it is not already defined in their data source. Here we will give examples of views that define churn for different scenarios.
Contract Termination with Usage Metrics
Assumes tables: contracts
, usage_stats
, billing
SELECT c.customer_id,
u.average_usage,
u.peak_usage,
b.total_amount_paid,
b.number_of_late_payments,
CASE WHEN c.termination_date IS NOT NULL THEN 1 ELSE 0 END AS churn
FROM contracts c
JOIN usage_stats u ON c.customer_id = u.customer_id
JOIN billing b ON c.customer_id = b.customer_id;
Account Deletion with Purchase History
Assumes tables: accounts
, purchase_history
, payment_methods
SELECT a.customer_id,
ph.total_purchases,
ph.total_refunds,
pm.payment_type,
CASE WHEN a.deletion_date IS NOT NULL THEN 1 ELSE 0 END AS churn
FROM accounts a
JOIN purchase_history ph ON a.customer_id = ph.customer_id
JOIN payment_methods pm ON a.customer_id = pm.customer_id;
App Uninstallation with User Engagement
Assumes tables: app_usage
, user_engagement
, feature_usage
SELECT au.customer_id,
ue.login_frequency,
ue.in_app_time,
fu.most_used_feature,
CASE WHEN au.uninstall_date IS NOT NULL THEN 1 ELSE 0 END AS churn
FROM app_usage au
JOIN user_engagement ue ON au.customer_id = ue.customer_id
JOIN feature_usage fu ON au.customer_id = fu.customer_id;
Lack of Repeated Website Visits with Interaction Metrics
Assumes tables: website_visits
, click_behavior
, search_history
SELECT wv.customer_id,
cb.average_clicks,
cb.average_pageviews,
sh.search_queries,
CASE WHEN COUNT(wv.visit_date) <= 1 THEN 1 ELSE 0 END AS churn
FROM website_visits wv
JOIN click_behavior cb ON wv.customer_id = cb.customer_id
JOIN search_history sh ON wv.customer_id = sh.customer_id
GROUP BY wv.customer_id, cb.average_clicks, cb.average_pageviews, sh.search_queries;
A Single Purchase Without Follow-up in a 6+ Month Period with Customer Feedback
Assumes tables: purchases
, customer_feedback
, delivery_details
SELECT p.customer_id,
cf.feedback_score,
cf.feedback_comments,
dd.delivery_time_deviation,
CASE WHEN MAX(p.purchase_date) < CURRENT_DATE - INTERVAL '6 months' AND COUNT(p.purchase_id) = 1 THEN 1 ELSE 0 END AS churn
FROM purchases p
JOIN customer_feedback cf ON p.customer_id = cf.customer_id
JOIN delivery_details dd ON p.purchase_id = dd.purchase_id
GROUP BY p.customer_id, cf.feedback_score, cf.feedback_comments, dd.delivery_time_deviation;
Each of these queries tries to capture a mix of behavior and product-usage metrics that might correlate with churn.