
Professional blogging in practice: part 2

Headline, Online 13 February 2009 | 2 Comments

Following on from last week’s post about finding sources, today I’m looking at the rest of the professional blogger’s daily pipeline.

Once you’ve found something to write about, it’s time to sit back, relax and let your blogger instincts do the rest. Right? Perhaps. Once you get into the habit of posting multiple times a day on the same site, a lot of the following stages in a post’s lifecycle do become second nature, but when you’re starting out it’s useful to run through the checklist in your head.

[...]


Encapsulating chatter (or why we don’t trust pixies)

Headline, Online 30 January 2009 | 0 Comments

I’ve been spending a lot of time playing about with the Twitter API recently, putting together a couple of quick projects which are partially powered by sentiment AI and partially just fun(tm). Along my wanders I appear to be accruing Twitter followers at a rate that, while modest by comparison to most people, means my time spent monitoring the platform and simply engaging in meta activity (checking out new followers etc) is growing.

It gets to the point where, with an unfiltered stream-of-twitter-consciousness, I spend far too long first thing in the morning just catching up on what people have said while I was asleep (a side effect of following Americans and insomniacs). It’s a bit like reading vast numbers of RSS feeds (another habit of mine); my eyes glaze over halfway through and I barely skim read most of the messages, although fortunately due to years of top-notch academic training I’m able to skim read like a pro. Oh yes.

Oh, but a service like filttr would be perfect for me, I hear you cry. A secret, intelligent algorithm that figures out what I want to read so all I have to do in the morning is sit back and trust the magic pixies. Well, that’s the sticking point for me and a lot of people, I guess — and possibly a reason why sentiment classification will never work as a means to cut down personal information overload. I’m an algorithm designer, I have entire farms of pixies at my beck and call, and can I encapsulate an algorithm that will show me exactly what I want to read? Most likely not. My inherent distrust of generalisations means there will always be one tweet that isn’t shown that I would have liked to have read — or even if there isn’t, I’ll think there is, and not want to filter.

Instead, I climb the laborious yet enlightening slope of trying to figure out a system that will more or less get me the information I want without any of that pesky “you might have missed something interesting” feeling nagging at me. I use TweetDeck, as many others do, to get a great visualisation of @replies and DMs as well as the follow firehose. Yet my follows are split into logical (ish) groups, and I can make a couple of friend groups that will streamline the chatter quite effectively. The only problem is keeping said groups up to date — I’m constantly adding new people that seem interesting, and if every follow results in altering two TweetDeck installations’ group settings, urgh. Yet if all new follows end up in the firehose, I haven’t really solved the problem at all.

The other approach which I think will be useful is twofold: links and conversations. As someone (I forget who) recently noted, as you follow more people on Twitter, you end up being privy to more and more @conversations (assuming you don’t already have ‘show all @replies to anyone’ on, which is horrendous to stay on top of and mostly pointless as well). Sometimes it’s useful to join in on these conversations, but at other times, you’re 12 hours late to the party and they’re just not that important. I’ve seen mention of people working on Twitter conversation threading, so I guess this belongs there, but a UI that collapsed conversations (perhaps older than a set time) would save a lot of screen and brain real estate.
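To make the collapsing idea concrete, here is a rough sketch of the filtering step only, not a UI. It assumes each tweet is a plain dict with hypothetical "text" and "created_at" keys (nothing here comes from the real Twitter API), and it treats anything starting with "@" as part of a conversation, which is a crude stand-in for proper threading.

```python
from datetime import datetime, timedelta


def split_stream(tweets, now, max_age=timedelta(hours=12)):
    """Return (visible, collapsed): @replies older than max_age get collapsed.

    Each tweet is assumed to be a dict with "text" and "created_at" keys;
    both names are made up for this sketch.
    """
    visible, collapsed = [], []
    for tweet in tweets:
        is_reply = tweet["text"].startswith("@")
        is_old = now - tweet["created_at"] > max_age
        if is_reply and is_old:
            collapsed.append(tweet)
        else:
            visible.append(tweet)
    return visible, collapsed
```

The 12-hour default is just the "late to the party" figure from above; in practice it would want to be a user setting.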

Links is the other half of this. It’s quite fun in the morning, while catching up on Twitter, to click on interesting looking links and open them all in tabs then browse through them. Two problems with this. Sometimes people RT the same link and most of the time I forget the link’s origin, so if I want to RT it I can’t give credit. Would pulling out all the links from last night’s twitterstream be useful? I think so, especially if there were some way to associate opened links with the original tweet. Trying to scroll back and see which tinyurl or bit.ly post is actually pointing at the informative Guardian article I have open is a nightmare, but if I had a clean display that pulled out all the links from friends, gave them extra props if linked multiple times, and gathered all relevant comments (possibly including non-friend @replies) while also longurling them for easy back-reference — then hid all the pure-link tweets from last night, I suppose — then URL management might be a little easier.
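A minimal sketch of the link-gathering part might look like the following. It assumes tweets are dicts with hypothetical "user" and "text" keys, finds URLs with a deliberately simple regular expression, and ranks links by how many tweets mention them. It does not expand tinyurl or bit.ly links, since that needs a network call to each shortener, so two different short URLs pointing at the same article would still be counted separately.

```python
import re
from collections import defaultdict

# Deliberately simple pattern: anything from http(s):// up to the next space.
URL_RE = re.compile(r"https?://\S+")


def collect_links(tweets):
    """Group tweets by the URLs they contain, most-shared links first."""
    by_url = defaultdict(list)
    for tweet in tweets:
        # A set, so a tweet repeating one URL is only counted once for it.
        urls = {u.rstrip(".,)") for u in URL_RE.findall(tweet["text"])}
        for url in urls:
            by_url[url].append(tweet)
    return sorted(by_url.items(), key=lambda item: len(item[1]), reverse=True)


def is_pure_link(tweet):
    """True when the tweet contains nothing except URLs."""
    return URL_RE.sub("", tweet["text"]).strip() == ""
```

Because each link keeps the list of tweets that mentioned it, the original poster is still there when it comes to giving RT credit, and `is_pure_link` is one possible test for which tweets to hide once their links have been pulled out.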

Of course, a different approach is to manage who I follow. The big question from an NLP/AI point of view here is “can a classifier learn the set {worth following, oh my god no way}?” i.e. if I show an algorithm a twitter account, will it make the same following decision I do?

It seems fairly simple at first pass to jot down a feature set that’s computationally feasible and probably relevant to how I make following decisions:

  • userpic present? (more subjective: userpic interesting?)
  • twitterer location (Edinburgh, Cambridge or London = likely to be a yes. Rest of world = no preference)
  • twitterer language (only follow English tweets, sorry)
  • number of followers and following (don’t follow followbots; generally follow people with a healthy ratio or people with lots of following i.e. net-celebs)
  • volume of updates (don’t follow the super spammy; unlikely to follow people with 1 or 2 updates especially if they plug their product)
  • quality of updates (decent number of @replies – person is active participant; lots of URLs to same site – account is a blog bot, unlikely to follow unless I like the blog – ah the irony)
  • subject matter, both from bio and from content (if person’s bio matches my interests, likely to be a yes; if person’s recent tweets overlap with areas I’m interested in, similar. maybe they’re using the same hashtag as me, i.e. at the same event.)
  • interestingness factor (hardest part to quantify. do they post funny photos? do a lot of people I follow follow them? are they witty? will I benefit from hearing what they had for breakfast?)
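As a rough illustration, most of the list above can be turned into numbers fairly directly. The sketch below assumes a profile is a plain dict; every field name in it (has_userpic, location, language, followers, following, update_count, bio, recent_tweets) is made up for the example rather than taken from the Twitter API. The interestingness factor is left out entirely, since that is the part I don’t know how to quantify.

```python
PREFERRED_PLACES = ("edinburgh", "cambridge", "london")


def follow_features(profile, my_interests):
    """Turn a (hypothetical) profile dict into a dict of numeric features."""
    tweets = profile["recent_tweets"]
    n_tweets = max(len(tweets), 1)  # avoid dividing by zero for empty accounts

    # Crude bag of words from the bio and recent tweets, for interest overlap.
    words = set(profile["bio"].lower().split())
    for text in tweets:
        words.update(text.lower().split())

    location = profile["location"].lower()
    return {
        "has_userpic": float(profile["has_userpic"]),
        "preferred_location": float(any(p in location for p in PREFERRED_PLACES)),
        "is_english": float(profile["language"] == "en"),
        "follower_ratio": profile["followers"] / max(profile["following"], 1),
        "update_count": float(profile["update_count"]),
        "reply_share": sum(t.startswith("@") for t in tweets) / n_tweets,
        "link_share": sum("http://" in t for t in tweets) / n_tweets,
        "interest_overlap": len(words & my_interests) / max(len(my_interests), 1),
    }
```

Splitting on whitespace is a very blunt way to find subject matter; proper term extraction would do better, but this is enough to show the shape of the feature set.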

Of course, I don’t run through all these factors every time I see a new profile. Unfollowing someone has a low penalty (they might stop following you, oh no!) so it’s very easy to just go “no” at the obvious bots/promoters and “yes” to anyone who seems remotely human and Twitter-savvy, then unfollow them later. But I reckon it would be pretty easy to train up a classifier to decide whether people were worth following, and (a la MrTweet) use shared interests (hello NLP) and friend/follow networks to recommend new people. In fact one of my Google Apps sandbox projects is working on this but it’s not a commercial venture, so I feel quite happy to ramble about it in the hope people will tell me I’m dead wrong and how to fix it.
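For what “train up a classifier” could mean at its very simplest, here is a toy perceptron in plain Python. It is a stand-in for illustration, not what my sandbox project does: it assumes each account has already been reduced to a list of numbers (ideally scaled to similar ranges) and that I have labelled some accounts 1 for follow and 0 for don’t.

```python
def score(weights, bias, features):
    """Weighted sum of the features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def train_perceptron(examples, epochs=20, lr=1.0):
    """Learn weights from (features, label) pairs, where label is 0 or 1."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            predicted = 1 if score(weights, bias, features) > 0 else 0
            error = label - predicted
            if error:
                # Nudge the weights towards the correct answer.
                weights = [w + lr * error * x for w, x in zip(weights, features)]
                bias += lr * error
    return weights, bias


def worth_following(model, features):
    """Apply a trained (weights, bias) model to one account."""
    weights, bias = model
    return score(weights, bias, features) > 0
```

A perceptron only finds a dividing line if one exists, and my own follow decisions are probably not that tidy, so anything real would want a more forgiving learner and far more labelled examples than I have patience to produce.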

Another thought occurs to me when I think about term extraction, frequency and classifying twitterers into subject buckets. Given the total sum of knowledge available to a casual observer (my update history, my network and extended network, my network’s update history) can I use simple clustering techniques to segregate my follow cloud into distinct groups for easier update browsing? I think with some gentle nudging, I could do so for my own network, but establishing an algorithm that could do so for anyone might be more difficult. Perhaps it needs a little Facebook or Friendfeed integration to pull out some more information (I have some seemingly isolated Twitterfriends from university who I guess it’d be hard to cluster at first pass). The question from a commercial point of view is, of course, would anyone use it and would they pay? Generally, as we saw with the magic pixies above, people trust their own judgement better than a computer’s. And yet I think it’d be quite fun to play around with clustering, visualisations and the magic of the Tweetcloud. Especially if we could change the parameters at will to cluster people by similarity (these guys all post loads of links, these all have conversations with each other) as well as overlap in topic/geography/network. Is anyone doing this? Someone must be, surely.
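One very simple version of this, sketched under a pile of assumptions: represent each account by the word counts of its recent tweets, compare accounts with cosine similarity, and greedily drop each account into the first cluster that looks similar enough. The 0.3 threshold and the “ignore words of three letters or fewer” rule are arbitrary choices for the example, and the input is a hypothetical dict mapping account names to lists of tweet text.

```python
import math
from collections import Counter


def term_vector(tweets):
    """Word counts across an account's tweets, skipping very short words."""
    return Counter(w for t in tweets for w in t.lower().split() if len(w) > 3)


def cosine(a, b):
    """Cosine similarity between two Counters of term frequencies."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_accounts(accounts, threshold=0.3):
    """Greedy single-pass clustering of {name: [tweet text, ...]}."""
    clusters = []  # each entry is (summed term counts, [account names])
    for name, tweets in accounts.items():
        vec = term_vector(tweets)
        for centroid, members in clusters:
            if cosine(vec, centroid) >= threshold:
                centroid.update(vec)  # fold this account's terms into the cluster
                members.append(name)
                break
        else:
            clusters.append((vec, [name]))
    return [members for _, members in clusters]
```

The result depends on the order accounts arrive in, which is the price of doing it in a single pass. Swapping `term_vector` for a different feature (share of tweets with links, who talks to whom) is how the “change the parameters at will” idea would slot in.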

If not, I will.

[Aside: No image, since for some godforsaken reason T-Mobile's mobile broadband service blocks Flickr and then asks for a credit card number over an unsecured connection.]

[Update: Image from dawn_perry]


LeWeb – Day 1

Headline, Startups 9 December 2008 | 1 Comment

Will be updating this until I run out of wifi or battery, either of which will likely happen before lunchtime…

The first half of the morning’s been interesting. Three big hitters on stage: Microsoft, Google and MySpace. If you’re following the twitterverse, you’ll learn quickly that the MySpace Toolbar announcement was pretty flat, that Microsoft is attempting to be vaguely fluffy and that Google… is. The most interesting part so far’s been David Weinberger’s talk, basically saying that the age of the lonely, heroic leader is over and crowdsourcing is the way things will go. The vision of an Obama govt where the US is connected via a social network & community leaders engage directly with government over key issues was pretty appealing.

There’s a theme of optimism despite the downturn, and a general sense of ‘wow, so many entrepreneurs are here despite the climate’, which is meant to be cheerful but really only adds to the general feeling that everything must be pretty bad if people keep telling us how good things are ‘despite’ it. Sure, I wouldn’t be here at all if it wasn’t for TechCrunch, so I’m amazed there are plenty of companies apparently paying over 1000 euros for a 2-day event. I can see why the price was so high – very professional production quality, reminds me of American events I’ve been to far more than anything else. But I still maintain it’s silly.

On another note the Twitter backchannel is so, so noisy. Sitting in the audience nearly every laptop screen I squinted at (nosy, moi?) had Twitter open and some people were literally live-tweeting the talks. Giving me some pause for thought about how to implement Project X for an event like this where you want to filter out a ton of this realtime information. It’s great to follow as it happens but when I’m just catching up during a coffee break, I’m overwhelmed and mostly only read the latest tweets. How do I filter out the interesting ones that were tweeted during the talks? Well, watch this space.


Web 2.0 Expo: What makes success?

Headline, Online 21 October 2008 | 0 Comments

This morning, Dion Hinchcliffe spoke about Web 2.0 and its place in the online world, a presentation which (despite its overly-long jam-packed-with-slides delivery) had plenty of useful things to say. Here are a few of the key messages.

Firstly, what does ‘Web 2.0’ mean? A Web 2.0 application takes advantage of network effects, i.e. the more that other people have or use a service, the more value it has. Social network effects increase this value due to their immense potential for rapid growth and large reach. The small percentage of sites that manage to hit critical mass and use network effects experience astronomical growth, but others trying to compete in the space have a lot of trouble fighting established network effects.

[...]


On likes and liking

Headline, Online 7 October 2008 | 1 Comment


Encouraging users to ‘thumbs up’ or ‘thumbs down’ items is a great way to get some sentiment-based feedback on what can be an unmanageably large amount of data. But how reliable is it?

Both FriendFeed and Socialmedian have a binary way of saying you found a particular news item or post interesting – a quiet nod of approval, if you will. I like this. I don’t like this. As commenters have pointed out, the word ‘like’ isn’t always appropriate (I “like” the story about a celebrity suicide?) but that’s purely semantics.

What’s the point, though? By ‘liking’ items on FriendFeed you can help populate ‘best of’ lists, and aid users in seeing at a glance what’s worth looking at. On the other hand, why do I care if Joe Bloggs, friend of Robert Scoble, likes an item? He might find entirely different things interesting to me. When I only know one of the people who likes a story, is there real value in pulling out ‘most liked’ items?

[...]
