AB_TEST
The AB_TEST command calculates a number of useful statistics for comparing groups, including a p-value and a statistical-significance flag, to determine whether two or more groups are significantly different.
This method is most commonly applied to A/B testing, but it can be used for other group comparisons as well. This includes tests with more than two groups (A/B/C testing) and comparisons of either numerical or categorical outcomes.
The AB_TEST command takes two inputs: the target column (the column we wish to compare, e.g. conversion, churn, click-through rate) and the treatment column (i.e. the group each row belongs to: A/B, gender, location, or whatever you wish).
By default, the treatment column is assumed to be named treatment; any other column name must be specified explicitly.
The output will contain several new columns, which are explained in the Returns section below.
Technical details:
- numeric: t-test for the means of two independent samples of scores.
- categorical: Chi-squared test of independence of variables.
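As a rough illustration of the two tests named above, here is a minimal Python sketch that uses only the standard library. It is not the AB_TEST implementation: the function names t_statistic and chi_squared_2x2 are hypothetical, the t statistic assumes the pooled-variance (Student) form, and the chi-squared example is limited to a 2x2 table without a continuity correction, because that case has a closed-form p-value. Whether AB_TEST makes the same choices is not stated in this page.

```python
import math
from statistics import mean, variance


def t_statistic(a, b):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    na, nb = len(a), len(b)
    # Pooled sample variance (assumes both groups share a common variance).
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def chi_squared_2x2(table):
    """Pearson chi-squared statistic and p-value for a 2x2 contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows are treatments (A, B); columns are outcomes (converted, not converted).
chi2, p_value = chi_squared_2x2([[20, 30], [30, 20]])
stat_sig = 1 if p_value < 0.05 else 0
```

Turning the t statistic into a p-value requires the cumulative distribution of Student's t, which the standard library does not provide, so the sketch stops at the statistic and its degrees of freedom.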
Syntax
AB_TEST(<column_name>, [treatment=<column_name>])
Options
treatment can be used to specify the column defining the group (A/B, gender, etc.). The default is treatment.
Returns
Returns the statistical outputs listed below, which let you quickly determine whether the groups are significantly different from one another:
numeric:
- count: total number of valid (non-NULL) values in the column
- mean: mean value of the outcome for each treatment
- mean_upper: upper bound (95th percentile) of the confidence interval on the mean value of the outcome for each treatment
- mean_lower: lower bound (5th percentile) of the confidence interval on the mean value of the outcome for each treatment
- indexing: relative index of the treatment mean compared to all treatments
- p_value: the p-value calculated by the statistical test (independent t-test)
- stat_sig: 1 if statistically significant (p_value < 0.05), else 0
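To make the numeric outputs above more concrete, the following Python sketch computes comparable per-treatment values from raw data. It rests on several assumptions that this page does not confirm: the interval bounds use a normal approximation with z = 1.645 (matching a 5th to 95th percentile range), indexing is taken to be the group mean divided by the mean over all treatments, and numeric_summary is a hypothetical helper, not part of AB_TEST.

```python
import math
from statistics import mean, stdev

Z_90 = 1.645  # assumed: normal quantile for a 5th-95th percentile interval


def numeric_summary(groups):
    """groups maps a treatment label to a list of values; None marks a NULL."""
    # Drop NULLs so that count only reflects valid values.
    cleaned = {k: [v for v in vals if v is not None] for k, vals in groups.items()}
    overall = mean(v for vals in cleaned.values() for v in vals)
    summary = {}
    for label, vals in cleaned.items():
        m = mean(vals)
        half_width = Z_90 * stdev(vals) / math.sqrt(len(vals))
        summary[label] = {
            "count": len(vals),
            "mean": m,
            "mean_lower": m - half_width,
            "mean_upper": m + half_width,
            "indexing": m / overall,  # assumed definition of the relative index
        }
    return summary


result = numeric_summary({"A": [1, 2, 3, None], "B": [4, 5, 6]})
```

Under these assumptions, an indexing value above 1 means the treatment's mean is higher than the mean across all treatments, and a value below 1 means it is lower.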
categorical:
- count: total number of valid (non-NULL) values in the column
- percentage: percentage of the outcome for each treatment
- percentage_upper: upper bound (95th percentile) of the confidence interval on the percentage of the outcome for each treatment
- percentage_lower: lower bound (5th percentile) of the confidence interval on the percentage of the outcome for each treatment
- indexing: relative index of the treatment percentage compared to all treatments
- p_value: the p-value calculated by the statistical test (chi-squared)
- stat_sig: 1 if statistically significant (p_value < 0.05), else 0
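The categorical outputs above can be approximated in a similar way. The sketch below is illustrative only and makes assumptions that this page does not state: one result per (treatment, outcome) pair, a normal-approximation interval on the proportion with z = 1.645, count taken as the number of valid rows in the treatment, and indexing taken as the treatment's share of an outcome divided by that outcome's share across all treatments. The helper name categorical_summary is hypothetical.

```python
import math
from collections import Counter


def categorical_summary(rows, z=1.645):
    """rows is a list of (treatment, outcome) pairs; an outcome of None is a NULL."""
    valid = [(t, o) for t, o in rows if o is not None]
    overall = Counter(o for _, o in valid)
    per_group = {}
    for t, o in valid:
        per_group.setdefault(t, Counter())[o] += 1

    summary = {}
    for t, counts in per_group.items():
        n = sum(counts.values())
        for o, k in counts.items():
            p = k / n
            half_width = z * math.sqrt(p * (1 - p) / n)
            summary[(t, o)] = {
                "count": n,  # assumed: valid rows in this treatment
                "percentage": 100 * p,
                # Clamp the approximate interval to the valid 0-100 range.
                "percentage_lower": 100 * max(0.0, p - half_width),
                "percentage_upper": 100 * min(1.0, p + half_width),
                "indexing": p / (overall[o] / len(valid)),  # assumed definition
            }
    return summary


rows = [("A", "x")] * 3 + [("A", "y"), ("A", None)] + [("B", "x")] + [("B", "y")] * 3
result = categorical_summary(rows)
```

With small groups like these, the normal approximation is crude and the clamped bounds can reach 0 or 100; it is used here only to keep the example short.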
Examples
A/B test to see if a new feature increases usage.
SELECT * FROM user AB_TEST(total_hours_active_per_week, treatment=feature_group)
Statistical test to see whether gender affects the types of products users buy.
SELECT * FROM user AB_TEST(product_category, treatment=gender)