Week 8: LightSide Bazaar Activity, Part One

Bazaar Activity: Hands on Experimentation With Advanced Feature Extraction


I began by exploring the false positive and false negative examples, as described in the assignment [1]. As I learned the application, my attention shifted from language to statistics. At first I was curious to discover why the model miscategorized some sentences. What mistakes is the model making?

This process soon gave way to quantifying the LightSide data. This is due in no small part to the pattern of exploration I learned from using Gephi export tables with Tableau, and now continued with export tables from LightSide. Grammar and sentence sense-making went on the back burner. Now I was thinking about quantification and seeking the scores that supported my observations.

If you wish to read about my imaginative investigations of sentences, grammar and human behaviors, read “Looking closer at the sentence level”. For a more quantified exploration, skip to “Back to the Features Table”.

Exploring the Features Table

I started by looking for features with both high frequency and high weight, watching for anything that struck me as unusual or odd. Many high-frequency words don’t appear to have any effect on sentiment detection: words like 'a' (.1258) or 'of' (.0788) are simply common (although they’re important insofar as they differentiate parts of speech). Then some word would just pop out at me. I started asking myself questions like, Does 'what' (.375) have a positive connotation? Do I expect ‘what’ to be used in positive or negative ways in natural language? (The misclassification ratio for ‘what’ shows a negative tendency, 58:54.)

I picked a few words to trace through the LightSide interface: but, you, and be. I expected 'but' to have a greater negative tendency, 'you' to have a positive tendency, and 'be' to exhibit a neutral tendency. I set out to explore these lexemes to discover whether my expectations were accurate.

Feature    Frequency    Feature Weight

Looking closer at the sentence level

Naturally, I was curious to learn how these words were used. I started drilling down to the sentence level. Each lexeme became the central figure in the mystery that was the false classification. ‘But’ has a negative feature weight, so I skipped this. Now, what’s going on with ‘you’?


94   +  -  dignified ceo's meet at a rustic retreat and pee against a tree . can you bear the laughter ?
143  +  -  this is one of those rare pictures that you root for throughout...
249  +  -  this is the kind of movie during which you want to bang your head on the seat
284  +  -  ...perhaps it's just impossible not to feel nostalgia for movies you grew up with .
336  +  -  and if you appreciate the one-sided theme

Notice that in #94 ‘you’ refers to the auditor, while in #143 it refers to anyone.

I asked myself, How is this word used? Can I define its use? If the answer was Yes, I looked for trends. With a high frequency and feature weight in this false positive category I asked myself, Is there something going on with how the actor was using the pronoun? Was something misleading about addressing the second person?

My first thought was to look for a trend for ‘you’ found outside the negative sentiment as I noticed in the first example. The phrase, ‘can you bear the laughter’, is borderline negative. It’s really pushed over the edge by the sarcasm of the first sentence; however, ‘you root for throughout’ seems completely positive. But the third instance shows an example of ‘you’ directly expressing the negative sentiment. Of course, with natural language analysis, you’re not likely to find simple consistency. The lexeme ‘you’ could be involved in any sort of address to the auditor–positive or negative.

My second thought was to consider a film or movie seen in a movie theater as a communal experience, and the context of someone asking your opinion. The first person reaction (I loved the film) helped me to consider the contrary reaction. From this I hypothesized an actor’s social awareness of his/her fellow moviegoers. I imagined a prototypical respondent acting with deference–not presuming to speak for others. Film and movies are, after all, big budget events [2]. (If I’m not thinking about myself, but instead thinking of you…).

On the other hand, going to see a film/movie in a theater is a form of entertainment which we pay to enjoy. If a moviegoer is not entertained, it’s permissible in American culture to say so. Thus, when a moviegoer is asked their opinion, they may feel entitled to use the stronger language of ‘I’, possibly to exert their assumed right, and at the same time lay claim.

I formulated two tentative hypotheses based on a first-pass examination:

  • socially sensitive moviegoer who uses ‘you’, before making a negative comment
  • self-confident critic who uses ‘I’ when making a negative comment

Examining the false positives, there does seem to be a pattern where ‘you’ refers to the larger social group before a negative comment is made. A positive comment together with ‘you’ is followed by a negative comment, reversing the expectation and representing the individual’s actual self-expressed opinion.

Of 135 records, I manually inspected 76 (about 56% of the total) to determine whether the lexeme ‘you’ was correlated with a positive sentiment in a complex or compound sentence where the second clause contained the negative component. 14 of the records showed this pattern, 14 I couldn’t decide, and 48 were not of this type. Comparing just the positive examples of this pattern (14) to the negative examples (48), we arrive at a ratio of .29. Is this statistically significant? (I don’t know, I’m really asking.)
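One hedged way to approach the significance question: a ratio alone cannot be significant or not; it has to be compared against some baseline rate, which this inspection does not supply. What the counts can show is how much uncertainty surrounds the observed share. The sketch below uses only the 14 records that showed the pattern and the 48 that did not (leaving out the 14 undecided ones) and computes an approximate 95% Wilson score interval. The function name `wilson_interval` and the choice of that interval are my own assumptions, not part of the LightSide workflow.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

pattern, no_pattern = 14, 48      # counts from the manual inspection
decided = pattern + no_pattern    # undecided records are left out (an assumption)

ratio = pattern / no_pattern      # the ratio quoted in the text
share = pattern / decided         # the same counts expressed as a proportion
low, high = wilson_interval(pattern, decided)

print(f"ratio {ratio:.2f}, share {share:.3f}, 95% interval {low:.3f} to {high:.3f}")
```

With a baseline in hand (for example, how often the same pattern appears among correctly classified sentences), the interval could be compared against it, or the two groups could be compared with a test of two proportions.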

Back to the Features Table

Once I started navigating the confusion matrix (radio buttons) [3] on just a single word, I started to understand what the model was doing and how the confusion matrix is organized. Comparing the matrix’s expected outcome (the actual outcome from the training set) with the model’s performance on the frequency of ‘you’ and ‘i’ across the classified sentences, notice that ‘you’ is favored as a positive indicator in both the model and the training set. The model, on the other hand, agreed with the hypothesis, predicting ‘i’ as a negative sentiment nearly 2:1. Unfortunately, the actual data says the pronoun has the opposite effect.


It appears the intuition behind the hypothesis was correct. The pronoun ‘i’ shows a positive correlation with decisive expressions of sentiment. The confident American moviegoer has no problem expressing their opinion. The pronoun ‘you’ was more evenly distributed. We arrived at the characterization for the use of ‘you’ as part of an awareness of social context. The actor expressing negative sentiments would include a socially courteous statement, followed by a passive assertion of personal opinion. Thus, each sentence, from the machine learning perspective, exhibits a balance of sentiment. Only the culturally aware auditor can tell for sure. [4]

While I expected this to be a behavior of the negative critic (and so did the model), the distribution shows the opposite. Our moviegoer is more likely to use the pronoun ‘i’ in a positive sentiment. What’s most interesting to me is our model fails to capture the effect of ‘i’, despite the pronoun’s association with more definitive sentiments.


[1] This reminded me of the Type I error associated with the statistical concept of the Null Hypothesis which I understand as the inquiry asking if the default position of no correlation is true. However, my understanding of the Null Hypothesis and Type I & Type II errors comes from observational experiments. I’m not sure how it applies to the Machine Learning activity.

[2] Spending one hundred million dollars doesn’t happen without a great amount of human thought and effort by many people interested in producing a success. “In 2007, for example, the average cost to produce a major studio movie was around $65 million. But the production costs don’t cover distribution and marketing, which was another $35 million or so, on average, in 2007, bringing the total cost to produce and market a major movie right at $100 million.” [Google search].

Movies and film affect us on many levels. Modern movie makers work to appeal to and stimulate audiences on many levels creatively, but also as a hedge-bet for greater box office returns. Tropes such as the tag line from the 1980s come to mind. Certainly a rousing sound track helps. Tarantino is famous for not using film scores, but assembling a play list to accompany the film.

[3] I didn’t expect the Feature Weight for false positives would be the mirror of true negatives. Maybe I expected the weight value to sum to one, or maybe I expected the weights to be different. I’m not sure what I expected.

[4] These types of expressions could be described as ironic. Irony involves a gap in expectation, or cognitive dissonance. Where the auditor experiences a strong (imbalanced) reversal of expectation, we may discover these sentiments are more decisive as opposed to the characteristics of confusion or indecisiveness generated by balanced contrariness or opposition.

Experiment Notes

1. Extract
Basic features
track feature hit loc.: y y y y y
stretchy patterns: y y y y y
  category files: pos, neg.: y y y y y
  include surface words in p.: y y y y y
  require >=1 cat/pattern: y y y y y
  don't include surf/pos w/cm: y y y y y
3. Build Models
Logistic Regression      y       y       y       y       y       y       y
L2 Regularization        y       y       y       y       y       y       y
Num folds                10      10      10      10      10      10      10
Feature Selection        all     all     all     3500    3500    3500    3500
4. Exploration
Confusion Matrix
t neg                    4079    4069    4082    4131    5085    4131    4129
f neg                    1252    1262    1249    1200    246     1200    1202
f pos                    1318    1327    1326    1280    4764    1280    1310
t pos                    4013    4004    4005    4051    567     4051    4021
Model Eval Metrics ***
accuracy (.759, .7651)   0.759   0.7572  0.7585  0.7674  0.5301  0.7674  0.7644
kappa (.52, .53)         0.5179  0.5144  0.517   0.5348  0.0602  0.5348  0.5288
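
The accuracy and kappa rows can be recomputed from the four confusion matrix counts, which is a useful check on how the metrics relate to the matrix. The sketch below is a minimal illustration, not LightSide's own code: the function name `accuracy_and_kappa` is hypothetical, and the assignment of the "f neg" and "f pos" cells to actual versus predicted classes is an assumption (accuracy does not depend on it, and for these counts kappa comes out the same either way). It uses the counts from the first and fifth runs.

```python
def accuracy_and_kappa(t_neg, f_neg, f_pos, t_pos):
    """Accuracy and Cohen's kappa for a 2x2 confusion matrix."""
    total = t_neg + f_neg + f_pos + t_pos
    observed = (t_neg + t_pos) / total          # observed agreement = accuracy

    # Assumed layout: f_pos = actual negative predicted positive,
    #                 f_neg = actual positive predicted negative.
    actual_neg, actual_pos = t_neg + f_pos, t_pos + f_neg
    pred_neg, pred_pos = t_neg + f_neg, t_pos + f_pos

    # Agreement expected by chance, from the row and column totals.
    expected = (actual_neg * pred_neg + actual_pos * pred_pos) / total**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Counts copied from the first and fifth runs in the notes above.
first_acc, first_kappa = accuracy_and_kappa(4079, 1252, 1318, 4013)
fifth_acc, fifth_kappa = accuracy_and_kappa(5085, 246, 4764, 567)

print(f"run 1: accuracy {first_acc:.4f}, kappa {first_kappa:.4f}")
print(f"run 5: accuracy {fifth_acc:.4f}, kappa {fifth_kappa:.4f}")
```

Kappa rescales accuracy against chance agreement, which is why the fifth run, with accuracy just above one half on a two-class problem, has a kappa near zero.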