This "generalized Louvain" MATLAB code for community detection allows the user to define a quality function in terms of a generalized-modularity null model framework and then follows a two-phase iterative procedure similar to the "Louvain" method, with the important distinction that the Louvain passes in the codes here work directly with the modularity matrix, not the adjacency matrix. That is, the main `genlouvain.m`

code can be used with any quality function specified in terms of a modularity matrix; but as such it does not take advantage of any particular structure to those matrices (cf. the excellent findcommunities code).

The most recent package file is GenLouvain2.1.zip. Please see the `README`

file there. For ongoing development and pre-release versions see the GitHub page.

Previous versions:

The functions `genlouvain.m`

with a default parameter choice are essentially similar to (but have been optimized to be faster than) the code used in our paper "Community structure in time-dependent, multiscale, and multiplex networks", P. J. Mucha, T. Richardson, K. Macon, M. A. Porter and J.-P. Onnela, *Science* **328**, 876-878 (2010). (The definitive version of this paper is freely available via referring links on Mucha's "networks" page.) However, some of the GenLouvain versions contain additional important features:

- Version 1.2: only speed improvement
- Version 2.0: speed improvement over 1.2 and randomisation option 'moverand' in
`genlouvain.m`

(helpful in a multilayer setting)
- Version 2.1: speed improvement over 2.0, additional randomisation option 'moverandw' in
`genlouvain.m`

(helpful in a multilayer setting), additional main function `iterated_genlouvain.m`

that produces better output partitions in a multilayer setting when used with the recommended input options, and a "HelperFunctions" folder with many useful functions for a multilayer setting

The development of these codes was supported at different times by NSF DMS-0645369 (in particular in the form of supporting Jutla's Research Experiences for Undergraduates project), NIGMS R21-GM099493 (supporting Mucha's contributions of improving such codes as a specific aim of this project), and two EPSRC CASE studentships (Jeub and Bazzi). Additional thanks go to our users whose experiences have helped improve this code, especially Dani Bassett, Jesse Blocher, Mason Porter and Simi Wang.

Please also note that Vincent Traag has released his Louvain Community Detection software with a variety of options, including multislice contributions. I am very hopeful that new and efficient codes will soon be released by Giuseppe Mangioni and his collaborators.

Please note that the examples below do not represent all possible uses of this generalized Louvain code. The intention of the "generalized" part here is that the code itself can in principle be used for any community detection problem that can be cast as a quality function summing over B(i,j) elements for i and j in the same community. The examples below are just some simple, clean situations wherein that might happen. As an example of a more general setting with nodes coming and going in the temporal network, consider the temporal roll call proof-of-concept in our paper.

Please pay attention to the different options below, particularly the use of function handles for larger networks. For very large networks with many layers, it may be possible to even make the outputs of these B(i) function calls as sparse; but such sparse B(i) do not appear to be compatible with the latest combinations of our code and the way we have it implemented in MATLAB.

### Example 1: Small Network

Define the NxN adjacency matrix A representing an undirected network with N small enough that you can expect a fully dense NxN B matrix to fit in memory, the following commands will run `genlouvain`

for modularity at the default resolution (`gamma = 1`

) and rescale modularity appropriately:

gamma = 1;
k = full(sum(A));
twom = sum(k);
B = full(A - gamma*k'*k/twom);
[S,Q] = genlouvain(B);
Q = Q/twom

To force deterministic (cf. pseudorandom) results, see the documentation (`help genlouvain`

) about how to set the 4th input appropriately, or directly reset the random stream yourself (e.g., `reset(RandStream.getGlobalStream)`

or `reset(RandStream.getDefaultStream)`

, depending on your version of MATLAB).

Again, the primary point of this code is if you have a different null model than that used in modularity for the standard network cases; but it nevertheless will work (albeit less efficiently) for such standard cases if you define B appropriately.

#### Example 1a: Larger Network

If your adjacency matrix A is large enough that you cannot expect to fit a fully dense NxN B matrix into memory, then you need to define B as a function handle such that B(i) returns the ith column of the B matrix. For modularity of an undirected network with resolution parameter `gamma`

, replace the B call in Example 1 above with the command

B = @(i) A(:,i) - gamma*k'*k(i)/twom;

### Example 2: Ordered Multislice Network

Define the cell array A of square symmetric NxN matrices of equal size each representing one of the T ordered, undirected network "slices". After setting values for the parameters `gamma`

and `omega`

, the multislice modularity with nearest-slice identity arcs of equal strength `omega`

is calculated by

N=length(A{1});
T=length(A);
B=spalloc(N*T,N*T,N*N*T+2*N*T);
twomu=0;
for s=1:T
k=sum(A{s});
twom=sum(k);
twomu=twomu+twom;
indx=[1:N]+(s-1)*N;
B(indx,indx)=A{s}-gamma*k'*k/twom;
end
twomu=twomu+2*omega*N*(T-1);
B = B + omega*spdiags(ones(N*T,2),[-N,N],N*T,N*T);
[S,Q] = genlouvain(B);
Q = Q/twomu
S = reshape(S,N,T);

Note that the last command conveniently reshapes the community assignment vector into an N (nodes) by T (slices) matrix so that the movement of identified nodes between communities can be easily tracked through the slices.

*It is important to stress here that multislice modularity is not limited to undirected network slices, nor does it require the same N nodes to be present in all slices; rather, these restrictions placed in this example are merely for the simplicity of the resulting function calls. For that matter, it does not require that ordered slices be connected by equal-valued nearest-slice couplings; but this is a simple way to treat ordered slices in the absence of other guidance. The multislice modularity methodology is as flexible as one's ability to cast the null models in terms of a quality function. No matter how many indices that quality function might have in its easiest-to-express form, it should be recast in terms of an equivalent modularity matrix B for input into *`genlouvain`

codes.

#### Example 2a: Larger Networks

N=length(A{1});
T=length(A);
ii=[]; jj=[]; vv=[];
twomu=0;
for s=1:T
indx=[1:N]'+(s-1)*N;
[i,j,v]=find(A{s});
ii=[ii;indx(i)]; jj=[jj;indx(j)]; vv=[vv;v];
k=sum(A{s});
kv=zeros(N*T,1);
twom=sum(k);
twomu=twomu+twom;
kv(indx)=k/twom;
kcell{s}=kv;
end
AA = sparse(ii,jj,vv,N*T,N*T);
clear ii jj vv
kvec = full(sum(AA));
AA = AA + omega*spdiags(ones(N*T,2),[-N,N],N*T,N*T);
twomu=twomu+2*omega*N*(T-1);
B = @(i) AA(:,i) - gamma*kcell{ceil(i/(N+eps))}*kvec(i);
[S,Q] = genlouvain(B);
Q = Q/twomu
S = reshape(S,N,T);

### Example 3: Categorical Multislice Network

The distinction between ordered slices and categorical slices manifests in the call below by the presence of all-to-all identity arcs between slices (cf. the nearest-slice couplings in the above example):

N=length(A{1});
T=length(A);
B=spalloc(N*T,N*T,(N+T)*N*T);
twomu=0;
for s=1:T
k=sum(A{s});
twom=sum(k);
twomu=twomu+twom;
indx=[1:N]+(s-1)*N;
B(indx,indx)=A{s}-gamma*k'*k/twom;
end
twomu=twomu+T*omega*N*(T-1);
all2all = N*[(-T+1):-1,1:(T-1)];
B = B + omega*spdiags(ones(N*T,2*T-2),all2all,N*T,N*T);
[S,Q] = genlouvain(B);
Q = Q/twomu
S = reshape(S,N,T);

#### Example 3a: Larger Networks

N=length(A{1});
T=length(A);
ii=[]; jj=[]; vv=[];
twomu=0;
for s=1:T
indx=[1:N]'+(s-1)*N;
[i,j,v]=find(A{s});
ii=[ii;indx(i)]; jj=[jj;indx(j)]; vv=[vv;v];
k=sum(A{s});
kv=zeros(N*T,1);
twom=sum(k);
twomu=twomu+twom;
kv(indx)=k/twom;
kcell{s}=kv;
end
AA = sparse(ii,jj,vv,N*T,N*T);
clear ii jj vv
kvec = full(sum(AA));
all2all = N*[(-T+1):-1,1:(T-1)];
AA = AA + omega*spdiags(ones(N*T,2*T-2),all2all,N*T,N*T);
twomu=twomu+T*omega*N*(T-1);
B = @(i) AA(:,i) - gamma*kcell{ceil(i/(N+eps))}*kvec(i);
[S,Q] = genlouvain(B);
S = reshape(S,N,T);

### Example 4: Bipartite Network

Represent the network by an m-by-n adjacency matrix A between the two types (of m and n nodes, respectively).

[m,n]=size(A);
N=m+n;
k=sum(A,2);
d=sum(A,1);
mm=sum(k);
B1=A-gamma*k*d/mm;
B=zeros(N,N);
B(1:m,m+1:N)=B1;
B(m+1:N,1:m)=B1';

## Related Codes

zrand.m calculates the z-Rand score and Variation of Information distance between a pair of partitions. If you use this code, please cite Amanda L. Traud, Eric D. Kelsic, Peter J. Mucha, and Mason A. Porter, "Comparing Community Structure to Characteristics in Online Collegiate Social Networks," *SIAM Review* **53**, 526-543 (2011).

spectral23.m provides various options for spectral bipartitioning and tripartitioning to optimize modularity. If you use this code, please cite T. Richardson, P. J. Mucha, and M. A. Porter, "Spectral tripartitioning of networks," *Physical Review E* **80**, 036111 (2009). Please also be sure that you understand that nothing about this code has been optimized for speed; rather, it was designed for proof of concept.

klnb.m is our implementation of Newman's description of Kernighan-Lin node-swapping steps for improving modularity.

### License

All codes included here are provided for broad use under the same terms as a "FreeBSD License", with copyright holders and dates as indicated in individual files or directories.