Skip to main content

Product Metrics

The motivation for product metrics is to better understand the dynamics and behaviors of users of ones product through the behavioral data captured about users within the product, as opposed to the non-behavioral or external data like account details, payment info, feedback etc.

What you can do with product metrics is often quite specific to a particular domain and context, since it involves the types and formats of data captured within and around a particular product, but some common types of analysis are:

  • measuring effects of product changes, either analysing before versus after or across cohorts
  • drivers of observed behaviors
  • forecasting of behavioral metrics

We will look at two examples of product metrics here: understanding a binary conversion metric as well as a continuous engagement metric.

Example Data

In the following examples we will use data from the Kaggle Dataset eCommerce events history in electronics store, which includes user events captured on an electronics ecommerce site. For each entry we have the type of the event, the user, the session and meta data about the product they interacted with.

You can download the raw dataset that fits with below queries directly from here.

Conversion

Defining conversion

The first product metric we will look at is a binary conversion metric, meaning an metric that for each user can either be true or false: they either converted/did something or not

For this particular dataset we define our conversion event for a given user as whether that user purchased a product or not, which we can define in SQL terms as

CASE WHEN event_type='purchase' THEN 1 ELSE 0 END as purchased

Understanding conversion from product perspective

The first thing we want to do with this data is analyse and understand what other behavioral traits drive conversion.

We look at it from the perspective of a given product, meaning we look at what features of a product predicts, or drives, a high conversion rate, ie a large amount of units sold.

SELECT product_id, price, brand, category_code,
SUM(CASE WHEN event_type='purchase' THEN 1 ELSE 0 END) as purchases
FROM Electronics_Store_Events
GROUP BY product_id
EXPLAIN(PREDICT(purchases, model='reg', ignore=product_id))

Note that we set the model to be a regression, since we want the sum of purchases to be viewed as a continuous variable and not as a discrete classification.

The conclusion we can derive from quite simplistic feature set, is that it is the price that the primary driver of units sold - which makes sense, you would expect low-cost items to sell more units than more expensive ones.

Understanding conversion from session perspective

We now look at conversion but from the perspective of a single session within the site. Meaning we look at what drives the successful conversion of a single user session.

To analyse this we create a few new feature, specific to a single session, that we believe might be important. These are

  • views, how many pages where views in the user session
  • carts, how many items where put in the cart in the user session
  • duration, how long did the session last
  • price, what was the average price of all products viewed in the session

We then use these to predict whether the session results in a purchase or not and then explain this prediction to understand what drives conversion from a session perspective.

SELECT AVG(price) as price, SUM(is_view) as views, 
SUM(is_cart) as carts, MAX(is_purchase) as is_purchase,
MAX(julianday(event_time)) - MIN(julianday(event_time)) as duration
FROM (
SELECT user_session, price as price,
(CASE WHEN event_type='view' THEN 1 ELSE 0 END) as is_view,
(CASE WHEN event_type='cart' THEN 1 ELSE 0 END) as is_cart,
(CASE WHEN event_type='purchase' THEN 1 ELSE 0 END) as is_purchase,
substr(event_time, 1, 19) as event_time
FROM Electronics_Store_Events
)
GROUP BY user_session
EXPLAIN(PREDICT(is_purchase))

We find that the duration of the sessions is the main driver, which is an interesting insight. This is followed by carts, which is less interesting - the more you put in your cart the more likely you are to purchase. However, it is interesting that views has almost no effect on purchase, ie how many products a user views in a session does not predict whether that session ends in purchase.

Finding patterns

Finally we try to understand whether there are patterns in the user sessions, ie whether they group together by their features.

This will enable us to understand what the patterns are, how they look like and how prevalent they are compared to other patterns.

SELECT AVG(price) as price, SUM(is_view) as views, SUM(is_cart) as carts,
MAX(is_purchase) as is_purchase,
MAX(julianday(event_time)) - MIN(julianday(event_time)) as duration
FROM (SELECT
user_session, price as price,
(CASE WHEN event_type='view' THEN 1 ELSE 0 END) as is_view,
(CASE WHEN event_type='cart' THEN 1 ELSE 0 END) as is_cart,
(CASE WHEN event_type='cart' THEN 1 ELSE 0 END) as is_purchase,
substr(event_time, 1, 19) as event_time
FROM Electronics_Store_Events)
GROUP BY user_session
CLUSTER()

Product Engagement

Define engagement metric

In the next example we define a continuous engagement score to try to understand what drives user engagement on the site.

There are many was that we could define an engagement score on a per-user basis, a few are:

  • Point based model where certain actions earn certain points, a users engagement is total amount of points
  • Duration of user sessions, how long has the user spend on the site.
  • Number of times the user has logged in. This captures if we are recurring users.

In this example, we use the number of user sessions to capture engagement. In SQL terms this can be defined as:

SELECT user_id, COUNT(user_session) as engagement
FROM (SELECT user_id, user_session FROM Electronics_Store_Events GROUP BY user_session)
GROUP BY user_id
ORDER BY engagement DESC

Understanding engagement

Using the engagement metric we can now try to understand what drives engagement by predicting it and explaining the prediction, using the same features that we constructed in the previous example, namely views, price, carts, duration, purchase, which we calculate on a session basis and then aggregate across sessions to a per-user basis by averaging.

SELECT user_id, COUNT(user_session) as engagement,
AVG(price) as price, AVG(views) as views, AVG(carts) as carts,
AVG(is_purchase) as purchases, AVG(duration) as durations
FROM (SELECT user_id, user_session,
AVG(price) as price, SUM(is_view) as views,
SUM(is_cart) as carts, MAX(is_purchase) as is_purchase,
MAX(julianday(event_time)) - MIN(julianday(event_time)) as duration
FROM (SELECT user_id, user_session, price as price,
(CASE WHEN event_type='view' THEN 1 ELSE 0 END) as is_view,
(CASE WHEN event_type='cart' THEN 1 ELSE 0 END) as is_cart,
(CASE WHEN event_type='cart' THEN 1 ELSE 0 END) as is_purchase,
substr(event_time, 1, 19) as event_time
FROM Electronics_Store_Events)
GROUP BY user_session)
GROUP BY user_id
EXPLAIN(PREDICT(engagement, model='reg', ignore=user_id))

We find that, as opposed to conversion, engagement is not driving by the average duration of sessions but by the average page views per session. So the more page views users have per session the more likely they are to be recurring users.