SENTIMENT

The SENTIMENT command predicts the sentiment of the text in a given text column. Only the specified text column is analysed.

The command will append four columns to the output: Positive, Neutral, Negative and prediction.

NOTE: SENTIMENT currently works only for English. You can use our TRANSLATE function to convert from 120 different languages into English.

Syntax

SENTIMENT(<column_name> [, version=<model_name>, n_fast_samples=<integer>])
  • column_name is the name of the input column to run sentiment analysis on. This must be a text column.

Options

  • version can be used to specify a particular model override.
    • The current options are amazon_reviews [default] and twitter.
    • amazon_reviews is trained on ~6 million Amazon Reviews, best for review-like text.
    • twitter is trained on ~45k tweets, best for tweet-like text.
  • n_fast_samples can be used to specify a special 'fast mode' for Sentiment.
    • The slow-but-accurate sentiment model calculates sentiment for a random selection of up to n_fast_samples data points, e.g. 2000 data points if n_fast_samples=2000.
    • After this, a TF-IDF + XGBoost classifier predicts the sentiment of the remaining data points. This is usually extremely fast, on the order of a few seconds even for hundreds of thousands of data points.
    • This is especially useful on very large datasets (>100k data points), but beware that it comes at a cost to accuracy. If the results aren't satisfactory, you can choose a larger value, e.g. n_fast_samples=10000.
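
The two-stage flow described above can be sketched in Python. This is only an illustration of the idea, not the actual SENTIMENT implementation: the function name fast_mode_sentiment, the slow_model and train_fast_model callables and the seed parameter are all hypothetical, and the sketch assumes the fast classifier is fitted on the labels produced by the slow model. In the real command the fast stage is TF-IDF + XGBoost; here it is left as a pluggable function so the example stays self-contained.

```python
import random
from typing import Callable, List, Sequence

Label = str
Classifier = Callable[[str], Label]


def fast_mode_sentiment(
    texts: Sequence[str],
    slow_model: Classifier,
    train_fast_model: Callable[[List[str], List[Label]], Classifier],
    n_fast_samples: int,
    seed: int = 0,
) -> List[Label]:
    """Label a random sample with the slow model, then let a fast model label the rest.

    All names here are hypothetical; this only mirrors the flow described in the docs.
    """
    if n_fast_samples < 1:
        raise ValueError("n_fast_samples must be a positive integer")

    rng = random.Random(seed)
    n_sampled = min(n_fast_samples, len(texts))
    sampled = sorted(rng.sample(range(len(texts)), n_sampled))

    # Stage 1: the slow-but-accurate model labels only the random sample.
    labels = {i: slow_model(texts[i]) for i in sampled}

    # If the sample covers every row, no fast model is needed.
    if n_sampled == len(texts):
        return [labels[i] for i in range(len(texts))]

    # Stage 2 (assumption): fit the fast model on the sampled texts and
    # the labels the slow model gave them, then use it for all other rows.
    fast_model = train_fast_model(
        [texts[i] for i in sampled],
        [labels[i] for i in sampled],
    )
    return [
        labels[i] if i in labels else fast_model(texts[i])
        for i in range(len(texts))
    ]
```

With n_fast_samples=2000 on a 300,000-row table, only 2000 rows would pass through the slow model in this sketch, which is where the speed-up comes from; a larger value trades that speed for more slow-model labels.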

Returns

Appends four new columns to the input data set: Positive, Neutral, Negative and prediction. The first three columns hold the probability that the text expresses each sentiment. The prediction column is a text field containing Positive, Neutral or Negative, whichever is most likely.
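
As a rough illustration of how the prediction column relates to the three probability columns, the following Python sketch picks the label with the highest probability. The function name predict_label and the probability values are made up for the example, and how SENTIMENT itself breaks ties is not documented here.

```python
from typing import Dict


def predict_label(probabilities: Dict[str, float]) -> str:
    """Return the sentiment label with the highest probability.

    On a tie, Python's max() returns the first such key; this is a property
    of the sketch, not a documented behaviour of SENTIMENT.
    """
    return max(probabilities, key=probabilities.get)


# Hypothetical output row, with made-up probabilities.
row = {"Positive": 0.07, "Neutral": 0.13, "Negative": 0.80}
print(predict_label(row))  # prints Negative
```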

Examples

Predicts the sentiment of the text in the column Review Text.

SELECT * FROM reviews SENTIMENT("Review Text")

Predicts the sentiment of the text in the column Review Text and filters the output rows to only return Negative reviews.

SELECT * FROM reviews SENTIMENT("Review Text") WHERE prediction='Negative'

Predicts the sentiment of the text in the column Review Text using the Twitter sentiment model.

SELECT * FROM reviews SENTIMENT("Review Text", version='twitter')