Blog CS - Repository of things - Gephi, SNA and Facebook

My Facebook friend network...

As you may have noticed if you know me, I've scrambled all the names on this graph to protect people's identities, because my girlfriend was understandably alarmed by having her affiliation with me available for all to see on the internet. Which is ironic when you consider Facebook in general.


Over the past 6 months I've been exploring the use of social network analysis (SNA) techniques on various types of network data, including communication networks and networks within organisations. In this context you will also sometimes hear SNA referred to as ONA, or 'organisational network analysis'.

For a quick blog example to demonstrate the power of some of these techniques, I've used 'netvizz' on Facebook to pull out my 'Facebook friend' network, then applied the modularity and Force Atlas algorithms in Gephi to cluster and lay out my network - before finally using sigma.js to display the network above (in fact I cheated and used the Gephi SigmaExporter plugin and some hacking).

The results are plain to see: the layout and modularity algorithms produce very clear clusters, segmenting my Facebook friends into school, home, university college friends, university sports friends and work friends. I could also have tuned the modularity settings a little more to bring out the difference between uni friends in my year and those in my girlfriend's year... So for clustering network data (both algorithmically and visually), this is clearly a powerful technique.
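To give a rough feel for what a modularity algorithm is optimising, here is a minimal Python sketch (not Gephi's actual implementation) that simply scores a given grouping of nodes using the standard modularity formula for an undirected, unweighted graph. The toy network and the 'school' / 'uni' labels are made up for illustration; a clustering algorithm searches for a grouping that maximises this kind of score rather than being handed one.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of (L_c / m) - (d_c / 2m)^2, where m is the
    number of edges, L_c the number of edges inside c, and d_c the total
    degree of the nodes in c.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the same community
    degree_sum = defaultdict(int)  # summed degree of each community's nodes
    for u, v in edges:
        degree_sum[community_of[u]] += 1
        degree_sum[community_of[v]] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical toy network: two triangles of friends joined by one edge (c-d).
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]

# A sensible split into two cliques, and a useless "everyone together" split.
two_groups = {"a": "school", "b": "school", "c": "school",
              "d": "uni", "e": "uni", "f": "uni"}
one_group = {node: "all" for node in two_groups}

print("two groups:", modularity(edges, two_groups))
print("one group: ", modularity(edges, one_group))
```

On this toy network the two-clique split scores roughly 0.36, while lumping everyone into one group scores 0, which is the sense in which the first grouping is 'more modular'.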

Unfortunately, Facebook friend data isn't perfect for demonstrating some of the other features SNA gives you easily, because all the edges (links) have a static weight of 1. Specifically, measures of centrality, which you can calculate easily in Gephi (or using the igraph package in R), can make use of weighted edges for a more thorough analysis. However, to give you a flavour: in the graph above, nodes are sized by their 'degree'; that is, the total number of friends each node (or person) has that are visible to me. In SNA we can also calculate more interesting 'measures of centrality' for each node. One such example is betweenness. Roughly speaking, to calculate betweenness, you consider the shortest paths between every pair of nodes on the graph. A particular node's 'betweenness' is higher if it sits on more of those 'shortest paths' than other nodes.
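As a sketch of the difference between degree and betweenness, here is a small pure-Python example on a made-up network (two triangles joined by a single link). It uses Brandes' algorithm for betweenness, which is one standard way to do the shortest-path counting; I'm not claiming this is what Gephi or igraph do internally, and the node names and helper functions are hypothetical.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Adjacency sets for an undirected, unweighted graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def betweenness(adj):
    """Unnormalised betweenness centrality (Brandes' algorithm, BFS version)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths to each node.
        dist = {s: 0}
        n_paths = defaultdict(int)
        n_paths[s] = 1
        preds = defaultdict(list)  # predecessors on shortest paths from s
        order = []                 # nodes in order of increasing distance
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, sharing out path "credit".
        dependency = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != s:
                score[w] += dependency[w]
    # Each unordered pair was counted once from each end, so halve the totals.
    return {v: total / 2 for v, total in score.items()}


# Hypothetical toy network: two triangles of friends joined by one edge (c-d).
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
adj = build_adjacency(edges)
degree = {v: len(neighbours) for v, neighbours in adj.items()}
between = betweenness(adj)

for v in sorted(adj):
    print(v, "degree:", degree[v], "betweenness:", between[v])
```

In this toy case the two bridging nodes (c and d) only have one more friend than everyone else, but they sit on every shortest path between the two triangles, so betweenness separates them from the rest far more sharply than degree does.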

Let's take a look at the same graph as above, but this time with the nodes sized by betweenness:

Now we can see some different nodes (those with high betweenness) being brought out in the visualisation. Specifically, let's consider the two big nodes in the middle: Nick and Akshay. The fact these guys have been highlighted is unsurprising; being close friends of mine from different cliques (school and uni), over the years they have got to know each other as well. Hence the shortest link between a lot of my uni friends and my school friends is through this bridge, and Nick and Akshay reap all the sweet sweet betweenness. In fact, all the large nodes in this view are my close friends or family - and that makes sense - because in general, your closest friends will know each other, regardless of which clique they are from.

So from the application of a cold hard analytical technique to a simple set of 'x knows y' data, we can infer my inner circle of friends. But what if we applied the same technique to a collection of businesses? While merely interesting on a friendship network, betweenness matters more there: betweenness centrality is also known as 'brokerage centrality'. This is because sitting on the shortest path between lots of nodes tends to happen when you sit between two big clusters of nodes that are otherwise unconnected (as per the Akshay and Nick situation). In a business setting, this puts the broker node in the potentially financially superior position of being the link between two groups of people or businesses. In basic terms, this means the broker node is free to say whatever it wants to either cluster, without worrying about being too consistent...
