SIMILAR_TO
The SIMILAR_TO command lets you compute a similarity score between a specific row and all other rows in the data set.
The command appends a new column, similarity, which holds a score between 0 and 1 for each row, where 1 indicates that the two rows are identical.
Syntax
SIMILAR_TO(<column_name>=<value>)
column_name
defines the column used to identify the row to compute similarities for.
value
defines the value to match in that column. The condition must specify a unique row in the data set.
ignore
can be used to specify columns (as a comma-separated list) that are returned by the SELECT statement but that SIMILAR_TO should ignore.
Returns
Appends one new column to the input data set: similarity.
The column holds a value between 0 and 1 for each row, where 1 is the highest similarity score.
Examples
Computes the similarity between all rows and the row where the column user_id has the value 50.
SELECT * FROM companies SIMILAR_TO(user_id=50)
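To make the shape of the result concrete, here is a minimal Python sketch of what appending a similarity column could look like. It is not the actual implementation: this page does not say which similarity measure SIMILAR_TO uses, so the sketch assumes a scaled Euclidean distance over numeric columns, and it also assumes the lookup column is left out of the comparison. The function name append_similarity and the sample columns revenue and employees are hypothetical.

```python
import math


def append_similarity(rows, column_name, value, ignore=()):
    """Return copies of `rows` with an added 'similarity' key in [0, 1].

    Assumption: similarity is 1 minus a scaled Euclidean distance.
    The real SIMILAR_TO command may use a different measure.
    """
    # The condition must identify exactly one reference row.
    matches = [r for r in rows if r[column_name] == value]
    if len(matches) != 1:
        raise ValueError("condition must identify exactly one row")
    ref = matches[0]

    # Compare numeric columns only, skipping the lookup column
    # (an assumption) and any columns the caller asked to ignore.
    features = [
        c for c in ref
        if c != column_name
        and c not in ignore
        and isinstance(ref[c], (int, float))
    ]

    # Scale each column by its range so no single column dominates.
    spans = {}
    for c in features:
        vals = [r[c] for r in rows]
        spans[c] = (max(vals) - min(vals)) or 1.0

    result = []
    for r in rows:
        if features:
            squared = sum(((r[c] - ref[c]) / spans[c]) ** 2 for c in features)
            dist = math.sqrt(squared / len(features))  # always within [0, 1]
        else:
            dist = 0.0
        result.append({**r, "similarity": 1.0 - dist})
    return result


# Hypothetical sample data standing in for the companies table.
companies = [
    {"user_id": 50, "revenue": 10, "employees": 5},
    {"user_id": 51, "revenue": 10, "employees": 5},
    {"user_id": 52, "revenue": 20, "employees": 15},
]
scored = append_similarity(companies, "user_id", 50)
```

In this sketch the reference row always scores 1, and a row whose compared columns match it exactly also scores 1, which mirrors the description above. Passing ignore=("employees",) would restrict the comparison to revenue.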