SENTIMENT
The `SENTIMENT` command lets you predict the sentiment of text, given a text column as input.
Only the specified text column is analysed.
The command appends four columns to the output: `Positive`, `Neutral`, `Negative` and `prediction`.
NOTE: `SENTIMENT` currently works only for English. You can use our `TRANSLATE` function to convert from 120 different languages into English.
Syntax
SENTIMENT(<column_name> [, version=<model_name>, n_fast_samples=<integer>])
column_name
the input column name to run sentiment analysis on. This must be a text column.
Options
`version` can be used to specify a particular model override.
- The current options are `amazon_reviews` [default] and `twitter`.
  - `amazon_reviews` is trained on ~6 million Amazon reviews and is best for review-like text.
  - `twitter` is trained on ~45k tweets and is best for tweet-like text.
`n_fast_samples` can be used to specify a special 'fast mode' for Sentiment.
- The slow-but-accurate sentiment model calculates sentiment for a random selection of data points, up to `n_fast_samples` of them, e.g. 2000 data points if `n_fast_samples=2000`.
- After this, it uses TF-IDF + an XGBoost classifier to predict the rest of the data points. This is usually extremely fast, on the order of a few seconds even for hundreds of thousands of data points.
- This is especially useful on very large datasets (>100k), but beware that it comes at a cost to accuracy. If the results aren't satisfactory, you can choose a larger value, e.g. `n_fast_samples=10000`.
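As a sketch of how fast mode might be requested, the queries below reuse the `reviews` table and `Review Text` column from the examples on this page. The sample sizes are illustrative. Combining `n_fast_samples` with `version` in one call is an assumption based on the Syntax line, not a confirmed example, and the `--` comments are standard SQL comment syntax that this dialect is assumed to accept. Run each query separately.

```sql
-- Fast mode: the slow-but-accurate model scores up to 2000 randomly selected rows;
-- the remaining rows are predicted by the faster TF-IDF + XGBoost classifier.
SELECT * FROM reviews SENTIMENT("Review Text", n_fast_samples=2000)

-- Assumed combination of both options: Twitter model with a larger sample.
SELECT * FROM reviews SENTIMENT("Review Text", version='twitter', n_fast_samples=10000)
```

If the fast-mode results look unreliable, raising `n_fast_samples` trades speed for accuracy.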
Returns
Appends four new columns to the input data set: `Positive`, `Neutral`, `Negative` and `prediction`.
The first three fields hold the probability that each sentiment is the one expressed in the text field. The `prediction` field is a text field that is either `Positive`, `Neutral` or `Negative`, depending on which is most likely.
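Because the three probability columns are returned alongside `prediction`, they could be used to keep only confident predictions. The query below is a sketch under several assumptions: that the appended `Negative` column can be referenced in a `WHERE` clause the same way `prediction` is, that conditions can be joined with `AND`, and that probabilities are on a 0 to 1 scale. The 0.8 threshold is purely illustrative.

```sql
-- Assumption: the appended Negative column is filterable like prediction
-- and holds a probability between 0 and 1.
SELECT * FROM reviews SENTIMENT("Review Text") WHERE prediction='Negative' AND Negative > 0.8
```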
Examples
Predicts the sentiment of the text in the column `Review Text`.
SELECT * FROM reviews SENTIMENT("Review Text")
Predicts the sentiment of the text in the column `Review Text` and filters the output rows to return only `Negative` reviews.
SELECT * FROM reviews SENTIMENT("Review Text") WHERE prediction='Negative'
Predicts the sentiment of the text in the column `Review Text` using the Twitter sentiment model.
SELECT * FROM reviews SENTIMENT("Review Text", version='twitter')