Skip to main content

Customer Churn and Retention

Introduction to Churn and Retention Analysis

Churn and retention analysis aims to identify the reasons why customers discontinue their use of a product or service. Leveraging tools like Infer, you can even forecast which users are at risk of churning in the near future.

Churn manifests in various ways, including:

  • Contract termination
  • Account deletion
  • App uninstallation
  • Lack of repeated website visits
  • A single purchase without follow-up in a 6+ month period

To effectively analyze churn, begin by defining the churn behavior relevant to your product. Then, choose the variables or features most insightful for your churn analysis. These usually encompass demographic data (e.g., age, country) and product usage metrics (e.g., overall activity, number of orders, value of orders).

Basic Example Using Kaggle's Bank Customer Churn Dataset

In the following example, we draw from the Bank Customer Churn Dataset on Kaggle. The dataset provides demographic information and high-level bank details for both active and churned customers.

To fit with our analysis, you can download the corresponding raw dataset here.

SQL-inf Query for Churn Prediction

SELECT * FROM Bank_Customer_Churn_Prediction PREDICT(churn, ignore=customer_id)

This query selects all columns and constructs a predictive model targeting the churn column. The model learns from existing churn patterns while ignoring the customer_id, which is irrelevant for prediction.

Key Findings

Our results can be broken down into four main categories:

  • Model Accuracy: With an approximate accuracy of 80%, the model significantly outperforms random guessing (50%), making its insights reliable.

  • Feature Importance: The four most influential factors contributing to 75% of the predictions are product count, age, membership activity, and geographic location.

  • Highlights: The analysis yielded the following insights:

    1. Customers with two products exhibit the lowest churn rates, whereas those with three or more products are at a higher risk.
    2. Young customers, especially those below 30, have a churn rate between 7-10%. Churn increases steadily beyond age 30, peaking at nearly 50% between ages 46-53.
    3. Active members are substantially less prone to churn (23%) compared to inactive ones (38%).
    4. German customers are twice as likely to churn (~50%) as those from France and Spain (~25%).
  • Additional Features: This section enables deeper exploration into other, less influential factors like gender's effect on churn.

Strategic Conclusions and Recommendations

  • Target Demographics: The current product offering appears less appealing to older customers and those located in Germany. Consider tailoring features to meet the specific needs of these demographic groups, or alternatively, realign marketing strategies to target them more effectively.
  • Optimal Product Ownership: Our data clearly shows that customers with two products are least likely to churn. Marketing and product strategies should, therefore, aim to encourage customers to own exactly two products for higher retention.

By implementing these targeted strategies, we can likely improve both customer satisfaction and retention rates, thereby enhancing the overall customer experience and contributing to long-term success.

Advanced View Creation for Churn

A user will usually have to define churn if it is not already defined in their data source. Here we will give examples of views that define churn for different scenarios.

Contract Termination with Usage Metrics

Assumes tables: contracts, usage_stats, billing

SELECT c.customer_id, 
u.average_usage,
u.peak_usage,
b.total_amount_paid,
b.number_of_late_payments,
CASE WHEN c.termination_date IS NOT NULL THEN 1 ELSE 0 END AS churn
FROM contracts c
JOIN usage_stats u ON c.customer_id = u.customer_id
JOIN billing b ON c.customer_id = b.customer_id;

Account Deletion with Purchase History

Assumes tables: accounts, purchase_history, payment_methods

SELECT a.customer_id,
ph.total_purchases,
ph.total_refunds,
pm.payment_type,
CASE WHEN a.deletion_date IS NOT NULL THEN 1 ELSE 0 END AS churn
FROM accounts a
JOIN purchase_history ph ON a.customer_id = ph.customer_id
JOIN payment_methods pm ON a.customer_id = pm.customer_id;

App Uninstallation with User Engagement

Assumes tables: app_usage, user_engagement, feature_usage

SELECT au.customer_id,
ue.login_frequency,
ue.in_app_time,
fu.most_used_feature,
CASE WHEN au.uninstall_date IS NOT NULL THEN 1 ELSE 0 END AS churn
FROM app_usage au
JOIN user_engagement ue ON au.customer_id = ue.customer_id
JOIN feature_usage fu ON au.customer_id = fu.customer_id;

Lack of Repeated Website Visits with Interaction Metrics

Assumes tables: website_visits, click_behavior, search_history

SELECT wv.customer_id,
cb.average_clicks,
cb.average_pageviews,
sh.search_queries,
CASE WHEN COUNT(wv.visit_date) <= 1 THEN 1 ELSE 0 END AS churn
FROM website_visits wv
JOIN click_behavior cb ON wv.customer_id = cb.customer_id
JOIN search_history sh ON wv.customer_id = sh.customer_id
GROUP BY wv.customer_id, cb.average_clicks, cb.average_pageviews, sh.search_queries;

A Single Purchase Without Follow-up in a 6+ Month Period with Customer Feedback

Assumes tables: purchases, customer_feedback, delivery_details

SELECT p.customer_id,
cf.feedback_score,
cf.feedback_comments,
dd.delivery_time_deviation,
CASE WHEN MAX(p.purchase_date) < CURRENT_DATE - INTERVAL '6 months' AND COUNT(p.purchase_id) = 1 THEN 1 ELSE 0 END AS churn
FROM purchases p
JOIN customer_feedback cf ON p.customer_id = cf.customer_id
JOIN delivery_details dd ON p.purchase_id = dd.purchase_id
GROUP BY p.customer_id, cf.feedback_score, cf.feedback_comments, dd.delivery_time_deviation;

Each of these queries tries to capture a mix of behavior and product-usage metrics that might correlate with churn.