# Clustering coefficient

### From Wikipedia, the free encyclopedia

The **clustering coefficient** of a vertex in a graph quantifies how close the vertex and its neighbors are to being a clique (complete graph). Duncan J. Watts and Steven Strogatz introduced the measure in 1998 ^{[1]} to determine whether a graph is a small-world network.

## [edit] Formal Definition

A graph *G* = (*V*,*E*) formally consists of a set of vertices *V* and a set of edges *E* between them. An edge *e*_{ij} connects vertex *i* with vertex *j*.

The neighbourhood *N* for a vertex *v*_{i} is defined as its immediately connected neighbours as follows:

The degree *k*_{i} of a vertex is defined as the number of vertices, | *N*_{i} | , in its neighbourhood *N*_{i}.

The clustering coefficient *C*_{i} for a vertex *v*_{i} is then given by the proportion of links between the vertices within its neighbourhood divided by the number of links that could possibly exist between them. For a directed graph, *e*_{ij} is distinct from *e*_{ji}, and therefore for each neighbourhood *N*_{i} there are *k*_{i}(*k*_{i} − 1) links that could exist among the vertices within the neighbourhood (*k*_{i} is the total (in + out) degree of the vertex). Thus, the **clustering coefficient for directed graphs** is given as

An undirected graph has the property that *e*_{ij} and *e*_{ji} are considered identical. Therefore, if a vertex *v*_{i} has *k*_{i} neighbours, edges could exist among the vertices within the neighbourhood. Thus, the **clustering coefficient for undirected graphs** can be defined as

Let λ_{G}(*v*) be the number of triangles on for undirected graph *G*. That is, λ_{G}(*v*) is the number of subgraphs of *G* with 3 edges and 3 vertices, one of which is *v*. Let τ_{G}(*v*) be the number of triples on . That is, τ_{G}(*v*) is the number of subgraphs (not necessarily induced) with 2 edges and 3 vertices, one of which is *v* and such that *v* is incident to both edges. Then we can also define the clustering coefficient as

It is simple to show that the two preceding definitions are the same, since

These measures are 1 if every neighbour connected to *v*_{i} is also connected to every other vertex within the neighbourhood, and 0 if no vertex that is connected to *v*_{i} connects to any other vertex that is connected to *v*_{i}.

The **clustering coefficient for the whole system** is given by Watts and Strogatz as the average of the clustering coefficient for each vertex:

A graph is considered small-world, if its average clustering coefficient is significantly higher than a random graph constructed on the same vertex set, and if the graph has a short mean-shortest path length.