I give a brief introduction to social network analysis.
Social network analysis is hot these days. Part of the reason is that the notion of a social network is very general.
What is a social network?
A social network is a combination of nodes and links. The nodes can be any sort of object and the links can be any sort of connection between them. That is deliberately abstract; the abstraction is part of the power of social network analysis. But some examples will help.
Examples of social networks
In the “Kevin Bacon game” the nodes are actors and the links are “being in a movie together”. Mathematicians may like to know their Erdős number – in that case, the nodes are mathematicians (and people in related fields) and the links are “being co-authors on a paper”. In a study of concentration of corporate power, the nodes could be corporations and the links could be “number of board members shared”. In a study of 15th-century Florence, researchers used families as nodes and marriage as the link. As is evident, social networks cover a wide range of fields. The same ideas are also used to rank pages on the web, where the nodes are pages and the links are hyperlinks.
Some notions regarding social networks
A network can be connected or not. It is connected if every node can be reached from every other node by a series (however long) of links. If a social network is not connected, then it is composed of two or more connected components. A shortest path between two nodes is called a geodesic. The diameter of a component is the length of its longest geodesic.
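The notions above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes an undirected network stored as a dictionary mapping each node to the set of its neighbours, and the names (`adj`, `geodesic_lengths`, and the six people) are made up for the example. Geodesic lengths are found with a breadth-first search.

```python
from collections import deque

def geodesic_lengths(adj, source):
    """Breadth-first search: geodesic length from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def connected_components(adj):
    """Group the nodes into sets that are mutually reachable."""
    seen = set()
    components = []
    for node in adj:
        if node not in seen:
            component = set(geodesic_lengths(adj, node))
            seen |= component
            components.append(component)
    return components

def diameter(adj, component):
    """Length of the longest geodesic inside one component."""
    return max(max(geodesic_lengths(adj, node).values()) for node in component)

# Hypothetical network: a chain Ann-Bob-Cat-Dan plus a separate pair Eve-Fay.
adj = {
    "Ann": {"Bob"},
    "Bob": {"Ann", "Cat"},
    "Cat": {"Bob", "Dan"},
    "Dan": {"Cat"},
    "Eve": {"Fay"},
    "Fay": {"Eve"},
}

components = connected_components(adj)
is_connected = len(components) == 1             # False: there are two components
chain_diameter = diameter(adj, components[0])   # 3: the geodesic from Ann to Dan
```

Running a search from every node is fine for small examples like this one; very large networks usually call for a dedicated graph library.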
Social networks can be directed or not. A network is directed if the links go in one direction. For example, if the link is “parent of” then the network is directed (because if Joe is the parent of Bob, Bob is not the parent of Joe), but if the link is “sibling” then the network is undirected because if Joe is Bob’s sibling then Bob is Joe’s sibling.
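One way to make the distinction concrete is to store each link as an ordered pair and check whether every link also appears reversed. The sketch below assumes that representation; the function name `is_undirected` is invented for the example.

```python
def is_undirected(links):
    """True if every link (a, b) is matched by the reverse link (b, a)."""
    return all((b, a) in links for a, b in links)

# "parent of" goes one way only; "sibling" holds in both directions.
parent_of = {("Joe", "Bob")}
sibling_of = {("Joe", "Bob"), ("Bob", "Joe")}

print(is_undirected(parent_of))   # False
print(is_undirected(sibling_of))  # True
```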
The small world phenomenon is that, in many networks of different types, all (or nearly all) pairs of nodes are connected by paths of relatively short length. For example, in the Kevin Bacon game, it is claimed that the longest geodesic from Kevin to any other actor who has been in a movie has length 6. However, it was later found that almost any pair of actors (whether it included Mr. Bacon or not) was connected by a path of length 6 or less.
Nodes of a network may be more or less central. Although there are many definitions of centrality, one notion is that a node is central if it lies on a lot of geodesics (this is called betweenness centrality). Another, much simpler notion is that a node is central if it has a lot of links connected to it (this is called degree centrality). In addition, an entire network may be more or less centralized – the key idea here is the relationship between the centrality of the most central node and the centrality of the other nodes. The most extreme case is a star network, in which one node is connected to all the other nodes, while each of them is connected only to it: one node is highly central and the others are very low on centrality.
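To make these ideas concrete, here is a small sketch that computes degree centrality, betweenness centrality and a network-level centralization score for a star and, for contrast, a ring. It assumes undirected, unweighted networks stored as a dictionary of neighbour sets. The betweenness is the raw, unnormalised count over unordered pairs of nodes, computed by brute force, which only suits small networks. The centralization measure is Freeman's degree centralization, chosen here as one common way to formalise the idea; all function and variable names are made up for the example.

```python
from collections import deque
from itertools import combinations

def bfs_counts(adj, source):
    """Return (dist, sigma): geodesic lengths and numbers of geodesics from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                sigma[neighbour] = 0
                queue.append(neighbour)
            if dist[neighbour] == dist[node] + 1:
                # Every geodesic to `node` extends to a geodesic to `neighbour`.
                sigma[neighbour] += sigma[node]
    return dist, sigma

def betweenness(adj):
    """For each node v, sum over pairs (s, t) the share of s-t geodesics through v."""
    info = {s: bfs_counts(adj, s) for s in adj}
    score = {v: 0.0 for v in adj}
    for s, t in combinations(adj, 2):
        dist_s, sigma_s = info[s]
        if t not in dist_s:
            continue  # s and t are in different components
        for v in adj:
            if v == s or v == t or v not in dist_s:
                continue
            dist_v, sigma_v = info[v]
            if dist_s[v] + dist_v[t] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return score

def degree_centralization(adj):
    """Freeman's degree centralization: 1 for a star, 0 when all degrees are equal.
    Needs at least three nodes."""
    degrees = [len(neighbours) for neighbours in adj.values()]
    n = len(degrees)
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))

# Hypothetical star (one hub, four leaves) and a ring of five nodes.
leaves = ["A", "B", "C", "D"]
star = {"Hub": set(leaves), **{leaf: {"Hub"} for leaf in leaves}}
ring = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}

degree = {node: len(neighbours) for node, neighbours in star.items()}
between = betweenness(star)

print(degree["Hub"], degree["A"])    # 4 1
print(between["Hub"], between["A"])  # 6.0 0.0
print(degree_centralization(star))   # 1.0
print(degree_centralization(ring))   # 0.0
```

In the star, all six geodesics between pairs of leaves pass through the hub, so it scores highest on both measures; in the ring every node looks the same, so the network is not centralized at all.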
Cliques and clans are notions describing groups of nodes that are tightly connected – that is, they have many links among themselves and relatively few links to other nodes.
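In its strictest graph-theoretic form, a clique is a set of nodes in which every pair is directly linked. A minimal check of that strict definition, assuming an undirected network stored as a dictionary of neighbour sets (the network and the name `is_clique` are invented for the example), might look like this:

```python
from itertools import combinations

def is_clique(adj, nodes):
    """True if every pair of the given nodes is directly linked."""
    return all(b in adj[a] for a, b in combinations(nodes, 2))

# Hypothetical network: a triangle A-B-C, with D attached to C only.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

print(is_clique(adj, ["A", "B", "C"]))  # True
print(is_clique(adj, ["B", "C", "D"]))  # False: B and D are not linked
```

Looser notions, such as clans, relax the requirement that every pair be directly linked; they are not covered by this sketch.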
The density of a network is the ratio of the actual number of links to the number of possible links.
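As a worked example: with n nodes and no self-links, an undirected network has n(n-1)/2 possible links and a directed network has n(n-1). The function below is a small sketch under those assumptions; its name and parameters are chosen for illustration.

```python
def density(n_nodes, n_links, directed=False):
    """Ratio of actual links to possible links, assuming no self-links."""
    possible = n_nodes * (n_nodes - 1)
    if not directed:
        possible //= 2  # each undirected link would otherwise be counted twice
    return n_links / possible if possible else 0.0

print(density(4, 3))                 # 0.5  (3 of 6 possible links)
print(density(4, 3, directed=True))  # 0.25 (3 of 12 possible links)
```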
This only scratches the surface of the subject, but it should be clear that social network analysis can be very useful in a lot of areas.
A good short introduction is Social Network Analysis by David Knoke and John Scott.
A seminal text is Social Network Analysis: Methods and Applications by Stanley Wasserman and Katherine Faust.
One of many good recent books is The Sage Handbook of Social Network Analysis, edited by John Scott and Peter J. Carrington.