Networks and graphs

Networks (or graphs) are visualisations of the connections or interactions between elements of a dataset. It’s often possible to reframe your data as a network and extract useful information from analysis of the graph. Here are some examples of where graphs may be interesting:

  • Social interactions: Which members of a social network interact with one another?
  • Social media data: What content is accessed by which members of a social network?
  • Purchasing decisions: Which items are bought together?
  • Genetic data: Which genes are common amongst individuals who express characteristic X?
  • Technical dependencies: Which components of a machine are critically dependent on which other components?
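
As a rough illustration of reframing data as a network, the sketch below turns the "which items are bought together?" question into a weighted edge list. The basket data and the `co_purchase_edges` helper are hypothetical and exist only for this example; each distinct item becomes a node and each pair of items sharing a basket becomes an edge, weighted by how often that happens.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets; each inner list is one purchase.
baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["tea", "milk"],
    ["bread", "jam"],
]

def co_purchase_edges(baskets):
    """Count how often each unordered pair of items shares a basket."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicates and gives each pair one canonical order
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts

edges = co_purchase_edges(baskets)                           # edge -> weight
nodes = sorted({item for pair in edges for item in pair})    # unique items

for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b} (bought together {weight}x)")
```

The resulting edge list is undirected, since "bought together" has no natural direction.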

It’s extremely important to understand that you must not interpret features of a graph based on how close nodes are to one another. In general, the absolute positions of nodes are unimportant, as most layout algorithms are non-deterministic; i.e. final node positions are computed by running a computer simulation, so they can differ from one run to the next.

Nodes

Nodes are the unique elements in your data (the individuals). They might well have a number of different properties which are important when visualising the data:

  • Node appearance: nodes are often styled to convey information about the thing the node represents. Typical channels of communication include size, colour and shape.
  • Intrinsic properties: nodes may have intrinsic properties that are important if one were constructing a journey (or path) through the network, and these might be displayed to the user when nodes are clicked or hovered over.

Edges

Edges communicate how two nodes are connected to one another. They have the following properties:

  • Directed/Undirected: An undirected edge indicates there’s no direction in the relationship between two nodes, whereas a directed edge indicates information can only pass between these two nodes in the direction specified. Note that directed edges might well be bi-directional.
  • Edge appearance: edges are often styled to convey information about the relationship between the nodes. Typical channels of communication include width, line style (dashed or dotted) and colour.
  • Intrinsic properties: edges may have intrinsic properties that are important if one were constructing a journey (or path) through the network.
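
One possible way to hold these properties in code is sketched below. The `Node` and `Edge` classes, their field names and the `path_cost` helper are all assumptions made for this illustration rather than part of any particular library. Appearance fields (size, colour, width) would only affect drawing, while the intrinsic `cost` field is what matters when constructing a journey through the network.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    size: float = 1.0        # appearance
    colour: str = "grey"     # appearance

@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    directed: bool = False
    width: float = 1.0       # appearance
    cost: float = 1.0        # intrinsic property used when building a path

def path_cost(path, edges):
    """Sum the cost of travelling along `path`, respecting edge direction."""
    total = 0.0
    for a, b in zip(path, path[1:]):
        for edge in edges:
            forwards = edge.source == a and edge.target == b
            # An undirected edge may also be travelled from target to source.
            backwards = (not edge.directed
                         and edge.source == b and edge.target == a)
            if forwards or backwards:
                total += edge.cost
                break
        else:
            raise ValueError(f"no edge allows travel from {a} to {b}")
    return total

nodes = [Node("A", size=2.0, colour="red"), Node("B"), Node("C")]
edges = [
    Edge("A", "B", cost=2.0),                  # undirected
    Edge("B", "C", directed=True, cost=3.0),   # only B -> C is allowed
]

print(path_cost(["A", "B", "C"], edges))
```

In this sketch a journey from C to B raises an error, because the only edge between those nodes is directed from B to C.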

The networks below are undirected and directed, respectively. Select a node to see its first-degree neighbours; note that in the second graph the direction of the edges affects which nodes are highlighted.
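
The effect of direction on first-degree neighbours can be sketched with a small adjacency structure. The edge list and helper names here are hypothetical, and the sketch assumes that in a directed graph a node's neighbours are those reached by following an edge out of it; an interactive visualisation may use a different convention (for example, also highlighting incoming edges).

```python
from collections import defaultdict

# Hypothetical edge list, read as (source, target) when directed.
edge_list = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "A")]

def build_adjacency(edge_list, directed):
    """Map each node to the set of nodes reachable in one step."""
    adjacency = defaultdict(set)
    for source, target in edge_list:
        adjacency[source].add(target)
        if not directed:
            # Undirected edges can be followed in both directions.
            adjacency[target].add(source)
    return adjacency

def neighbours(adjacency, node):
    """Return the first-degree neighbours of `node`, sorted for display."""
    return sorted(adjacency.get(node, set()))

undirected = build_adjacency(edge_list, directed=False)
directed = build_adjacency(edge_list, directed=True)

print(neighbours(undirected, "A"))  # ['B', 'C', 'D']
print(neighbours(directed, "A"))    # ['B']
```

The same edge list gives node A three neighbours when treated as undirected but only one when the direction of each edge is respected.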

Layout algorithms

As mentioned above, in general the absolute positions of nodes are unimportant. Most layout algorithms are non-deterministic, i.e. final node positions are computed by running a computer simulation. Deciding which layout algorithm (or graph embedding) to use for your visualisations is more of an art than a science. The three graphs below display the exact same dataset using three very different layout algorithms from the excellent igraph library. In the future this website will include a page dedicated to choosing layout algorithms.
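
To see why node positions should not be over-interpreted, here is a deliberately simplified force-directed layout written in plain Python. It is not one of igraph's algorithms; the `spring_layout` function, its parameters and the force constants are assumptions chosen for illustration. Nodes start at random positions, every pair of nodes repels, and connected nodes attract, so the final coordinates depend on the random starting point.

```python
import math
import random

def spring_layout(nodes, edges, seed, iterations=200, step=0.05, max_move=0.1):
    """Toy force-directed layout; `nodes` is a list, `edges` a list of pairs."""
    rng = random.Random(seed)
    # Random starting positions are the source of run-to-run variation.
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = 0.1 / dist ** 2
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist
        # Attraction along each edge.
        for a, b in edges:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            force[a][0] += dx
            force[a][1] += dy
            force[b][0] -= dx
            force[b][1] -= dy
        # Move each node a small, capped distance along its net force.
        for n in nodes:
            mx = step * force[n][0]
            my = step * force[n][1]
            length = math.hypot(mx, my)
            if length > max_move:
                mx, my = mx * max_move / length, my * max_move / length
            pos[n][0] += mx
            pos[n][1] += my
    return {n: (round(x, 6), round(y, 6)) for n, (x, y) in pos.items()}

nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]

layout_one = spring_layout(nodes, edges, seed=1)
layout_two = spring_layout(nodes, edges, seed=2)
print(layout_one)
print(layout_two)
```

Both layouts describe exactly the same nodes and edges, yet the coordinates differ between seeds. Fixing the seed is one way to make a layout reproducible when that matters.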