Using sketchy sentiment to pump up your post count

Online 7 April 2010 | 0 Comments

Finally, a post topic that combines both sentiment analysis and the meta-world of professional blogging!

I like TechCrunch for the most part, but these two articles have annoyed me: ‘Sentiment is split on the iPad’ and ‘More iPad Sentiment Analysis’. Both use poor, crude methods of sentiment analysis to produce posts full of fluff and pretty graphs. Result? Whatever point the blogger wanted to make. (You know what they say about statistics.)

A quick rundown of the problems: spurious classification algorithms, small sample sizes, and non-credible results. An algorithm that analyses every piece of traffic on Twitter and comes up with “51% positive, 49% negative” is Just Plain Wrong. There’s going to be a ton of stuff in the middle: unclassifiable, undecided, even just retweets of blog posts with the word in the title, and any graph should reflect that as well. Even with the neutral stripped out, a result of 51/49 seems completely nonsensical to me, and I’ve been working with Twitter sentiment for a long time now.

It’d also be very interesting to know what methods the classifiers use. That’s probably available with some digging, but I fear it’s manual keyword lists that some poor sod had to draw up: “hmm, I think if someone says the iPad is ‘stupid’ that’s probably negative, yah?”.
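To make the complaint concrete, here is a minimal sketch of the kind of keyword-list classifier I’m afraid of, with one improvement: a neutral bucket. The word lists and the `classify` and `tally` names are made up for illustration; I have no idea what the tools behind those posts actually do.

```python
from collections import Counter

# Hypothetical keyword lists, invented purely for illustration.
POSITIVE = {"love", "great", "awesome", "want"}
NEGATIVE = {"stupid", "hate", "sucks", "disappointing"}


def classify(tweet: str) -> str:
    """Label a tweet positive, negative or neutral by counting keyword hits."""
    words = {w.strip(".,!?\"'").lower() for w in tweet.split()}
    pos = len(words & POSITIVE)
    neg = len(words & NEGATIVE)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    # No keywords, or a tie: refuse to guess rather than force a side.
    return "neutral"


def tally(tweets):
    """Count labels across a batch, keeping the neutral bucket visible."""
    return Counter(classify(t) for t in tweets)


sample = [
    "I love the iPad",
    "The iPad is stupid",
    "Apple announces the iPad",
    "RT Apple announces the iPad",
]
print(tally(sample))
```

Even on four toy tweets, half land in neutral. A graph that only shows positive versus negative hides that.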

Attensity does better, but what on earth does “not thrilled” mean (weak negative?) and again, where’s the neutral or noise aspect? It’s valuable to know just how many tweets were about the iPad, and how many of those were about sentiment. What if a TechCrunch headline with a negative word got retweeted 2000 times? That’s what we in the trade call “skew”. Plus, classifying on a small sample is just crazy. Why? Surely it can’t be computational limits; were these the only tweets with sentiment information? That’s useful data! Why throw it all away…
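Checking for retweet skew doesn’t take much. The sketch below assumes old-style retweets that start with “RT @user”, and the `normalise` and `dedupe_report` helpers are hypothetical names of my own. It strips the retweet prefix and compares the raw tweet count with the number of distinct messages.

```python
import re
from collections import Counter

# Assumes retweets look like "RT @user: text"; other conventions are not handled.
RT_PREFIX = re.compile(r"^(?:RT\s+@\w+:?\s*)+", re.IGNORECASE)


def normalise(tweet: str) -> str:
    """Strip leading 'RT @user:' chains and lowercase the remaining text."""
    return RT_PREFIX.sub("", tweet).strip().lower()


def dedupe_report(tweets):
    """Compare the raw tweet count with the number of distinct messages."""
    counts = Counter(normalise(t) for t in tweets)
    return {
        "total": len(tweets),
        "unique": len(counts),
        "top": counts.most_common(1),  # the most repeated message and its count
    }


sample = [
    "iPad is a disappointing tablet",
    "RT @a: iPad is a disappointing tablet",
    "rt @b RT @a: iPad is a disappointing tablet",
    "I want an iPad",
]
print(dedupe_report(sample))
```

If “total” is much bigger than “unique”, one headline is doing most of the talking, and the sentiment split mostly reflects that headline.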

It also looks like there are some great leaps in logic in terms of distinguishing between “Like the iPad because it might replace iPhone” and “Don’t like the iPad because it won’t replace my iPhone”. How do you automatically extract the difference between “Can’t replace battery” and praise for the battery life? Sigh.

Plus, there’s the key mistake of not showing error, accuracy bounds, or mistakes. Both posts assume the algorithms are 100% correct. While that makes for some pretty graphs, it just isn’t true, and with no idea of sample size or result size (e.g. for the battery category above) then a result of 5% could just mean one out of a total of twenty tweets with the word battery in was negative. It’s the same for intent to purchase. Not every tweet will have any kind of intent, so if you just took the tweets containing “will” “buy” “iPad” or “won’t” “buy” “iPad”,
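To put a number on the “one out of twenty” case above, here is a minimal sketch of a Wilson score interval for a proportion. The choice of the Wilson interval and the `wilson_interval` name are mine, not anything the posts used, and it assumes the tweets are an independent random sample, which they almost certainly aren’t.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for the proportion successes/n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# One negative tweet out of twenty mentioning the battery.
low, high = wilson_interval(1, 20)
print(f"{low:.1%} to {high:.1%}")
```

For one in twenty, the interval runs from roughly 1% to 24%, so a bare “5%” on a pretty graph is telling you very little.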

Of course, the reason I’m most annoyed at these posts is that I could have helped put together a custom dataset and classifier to provide much more detailed data, and didn’t. But while I can’t go back in time and change things, I can at least point out the flaws in using off-the-shelf graphs to meet your daily post quota as a pro-blogger.


The Opposition: Twendz

Startups 12 March 2009 | 2 Comments


It’s every CEO’s nightmare: waking up to find mails and tweets pointing out a new competitor, someone who — at first glance — appears to have pipped you to a particularly juicy post. Thus began my morning when I was alerted to Twendz, a new sentiment-based Twitter monitoring app.

However, I actually think Twendz is a good sign. To be boisterously arrogant, the app itself can’t hope to compete with what I’m working on in terms of technology, as it’s using the most naive of sentiment classification techniques. (OK, so it could incorporate more advanced classification, which might get worrying.) It also doesn’t really show any useful information: sure, you can get a barometer based on the last three hours or so, and there’s a jolly pretty waterfall of tweets, and the tag cloud is a nice touch, but beyond that? It’s candy, not something you can sit and digest and act on.

The fact it’s been launched by a PR firm is nothing but good, though. It’s a clear sign that at least one firm acknowledges the value of sentiment and opinions in social media, and sees a need for better ways to search and track them. It’ll be interesting to see what sort of press the app gets, and it’s certainly alerted me to something that hit home quite squarely yesterday: think small, not monolithic; release early; and see what happens. If I’d coded this up months ago, which I certainly could have done, who’d be getting the press now? Lesson learned.


Why the iPhone sucks

Games & Gadgets 28 February 2009 | 1 Comment

Desert island - elvis_payne on flickr

Idyllic retreat, or boredom incarnate? Perfection is in the eye of the beholder.

You would easily be forgiven for thinking the iPhone was a paragon of technical perfection, the answer to all of our prayers and so forth. Certainly I would warrant that a quick Internet trawl would throw up many articles praising the iPhone as Steve Jobs’ Second Coming, and more or less establishing it as the de facto web 2.0 geek’s mobile phone of choice. But in amongst such positivity, how do we find the negative? You guessed it, that’s one of the problems I’m trying to solve.

Sometimes it’s as easy as adding the word ‘sucks’ to your Googling. And yet an article like this MobileCrunch rundown of ‘8 things that we still can’t stand about the iPhone’ is full of negative language without using too many explicitly laden adjectives, while also being very specific, constructive and useful. The comments thread is a goldmine for anyone looking to make a better iPhone, so it’s not just Apple that should be paying attention, but its competitors too.

My point here is that although things seem black-and-white when you’re trying to pull out the negativity surrounding a product, often really valuable content can be hard to find manually, whereas a sophisticated natural-language algorithm that weighted several factors would identify the above article as being fairly key to the negative sentiment around the iPhone yesterday and today. Such as, I don’t know, the one I’m developing.

As a side note, most of the poster’s concerns about the iPhone are pretty valid, and as commenters immediately identify, lack of copy and paste is a big problem too. To be frank, though, only two of the problems really affect me: no SMS counter, and no email search. Since I’m Twitter-trained, 160-character messages are a luxury, and Gmail offers a web interface for when I need to search. Sometimes we train ourselves to work around the device’s faults, rather than expecting the device to work for us.


On likes and liking

Headline, Online 7 October 2008 | 1 Comment


Encouraging users to ‘thumbs up’ or ‘thumbs down’ items is a great way to get some sentiment-based feedback on what can be an unmanageably large amount of data. But how reliable is it?

Both FriendFeed and Socialmedian have a binary way of saying you found a particular news item or post interesting – a quiet nod of approval, if you will. I like this. I don’t like this. As commenters have pointed out, the word ‘like’ isn’t always appropriate (I “like” the story about a celebrity suicide?) but that’s purely semantics.

What’s the point, though? By ‘liking’ items on FriendFeed you can help populate ‘best of’ lists, and aid users in seeing at a glance what’s worth looking at. On the other hand, why do I care if Joe Bloggs, friend of Robert Scoble, likes an item? He might find entirely different things interesting than I do. When I only know one of the people who likes a story, is there real value in pulling out ‘most liked’ items?

[...]
