DESCRIBE
The DESCRIBE command summarises your dataset using a set of standard metrics, so you can get a quick overview without deep analysis.
The DESCRIBE command takes no inputs, since it uses all input data. Use the ignore option to exclude specific columns.
The output contains a new field, description, which names the summary metric reported in each row.
It also contains summaries of either the numeric columns in your dataset (data_type=numeric, which is the default) or the categorical columns (data_type=categorical).
Syntax
DESCRIBE([data_type=<data_type>, ignore=<column_names>])
Options
ignore - specifies columns (as a comma separated list) that are returned by the SELECT statement but that DESCRIBE should ignore.
data_type - specifies which data type to analyse. Must be one of categorical or numeric. numeric is the default.
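The way the two options narrow down the summarised columns can be illustrated outside the query language. The following is a minimal Python sketch, not the command's actual implementation: the function select_columns, the schema dictionary and its column names are all hypothetical, and it assumes each column is already labelled as numeric or categorical.

```python
def select_columns(schema, data_type="numeric", ignore=""):
    """Return the columns a DESCRIBE-style summary would cover.

    schema maps column name -> "numeric" or "categorical" (hypothetical input).
    ignore is a comma separated list of column names, as in the option above.
    """
    if data_type not in ("numeric", "categorical"):
        raise ValueError("data_type must be 'numeric' or 'categorical'")
    ignored = {name.strip() for name in ignore.split(",") if name.strip()}
    # Keep columns of the requested type that were not explicitly ignored.
    return [name for name, kind in schema.items()
            if kind == data_type and name not in ignored]


# Hypothetical customer table schema.
schema = {"customer_id": "numeric", "age": "numeric", "city": "categorical"}
print(select_columns(schema, ignore="customer_id"))  # ['age']
```

With no options, the sketch returns every numeric column, mirroring the documented default of data_type=numeric.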
Returns
Appends a new column named description to the input dataset, which describes the summary metric being used for each row.
A column for each input feature of type data_type (numeric or categorical), holding the summary statistic named by description for each row.
The output columns are named after the input columns.
The description metrics are listed below:
numeric:
count - total number of valid (non-NULL) values in the column
mean - mean
std - standard deviation
min - minimum value
max - maximum value
10% - 10th percentile value
25% - 25th percentile value
50% - median value
75% - 75th percentile value
90% - 90th percentile value
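To make the numeric metrics concrete, here is a minimal Python sketch that computes the same rows for a single column. It is an illustration only, not how DESCRIBE is implemented: the function names are hypothetical, and two details are assumptions because the text above does not specify them, namely that std is the sample standard deviation and that percentiles use linear interpolation between ranks.

```python
import math
import statistics


def percentile(sorted_values, q):
    """q-th percentile (0-100) of a sorted list, linearly interpolated (assumed method)."""
    position = (len(sorted_values) - 1) * q / 100
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def describe_numeric(values):
    """Summarise one numeric column; None stands in for NULL and is skipped."""
    valid = sorted(v for v in values if v is not None)
    summary = {
        "count": len(valid),
        "mean": statistics.mean(valid),
        "std": statistics.stdev(valid),  # sample standard deviation (assumption)
        "min": valid[0],
        "max": valid[-1],
    }
    for q in (10, 25, 50, 75, 90):
        summary[f"{q}%"] = percentile(valid, q)
    return summary


# Hypothetical column with one NULL value.
ages = [1, 2, None, 3, 4, 5]
for metric, value in describe_numeric(ages).items():
    print(metric, value)
```

The sketch needs at least two non-NULL values, since a sample standard deviation is undefined for fewer.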
categorical:
count - total number of valid (non-NULL) values in the column
unique - number of unique values
top - the most frequent value
freq - frequency of the top value
first - (timestamps only) first timestamp
last - (timestamps only) last timestamp
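The categorical metrics can be sketched the same way. This Python example is an illustration under stated assumptions rather than the command's implementation: describe_categorical is a hypothetical name, ties for the most frequent value are broken by first occurrence (the text above does not say how ties are handled), and the timestamp-only first and last rows are left out.

```python
from collections import Counter


def describe_categorical(values):
    """Summarise one categorical column; None stands in for NULL and is skipped."""
    valid = [v for v in values if v is not None]
    counts = Counter(valid)
    # most_common(1) returns the value with the highest count;
    # ties go to the value seen first (assumption about tie handling).
    top, freq = counts.most_common(1)[0]
    return {
        "count": len(valid),
        "unique": len(counts),
        "top": top,
        "freq": freq,
    }


# Hypothetical column with one NULL value.
cities = ["Oslo", "Paris", None, "Oslo", "Rome"]
print(describe_categorical(cities))
# {'count': 4, 'unique': 3, 'top': 'Oslo', 'freq': 2}
```

The sketch expects at least one non-NULL value; an empty column has no top value to report.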
Examples
DESCRIBE the numeric values in a customer table.
SELECT * FROM customer DESCRIBE
DESCRIBE the numeric values explicitly in a customer table.
SELECT * FROM customer DESCRIBE(data_type='numeric')
DESCRIBE the categorical values in a customer table.
SELECT * FROM customer DESCRIBE(data_type='categorical')