|
Table of Contents
|
1. Converting the Gene Ontology graph into igraph
Rajarshi Guha wrote a nice Python script that converts the Gene Ontology graph into an igraph graph. Here is an R version that is much shorter, because we take advantage of the Bioconductor project, which has the data bundled into an R package.
Note that you only need the first two lines if you don't yet have the GO.db package installed. This package is updated twice a year, just like Bioconductor itself.
source("http://bioconductor.org/biocLite.R")
biocLite("GO.db")
library(GO.db)
library(igraph)
BP <- toTable(GOBPPARENTS)
CC <- toTable(GOCCPARENTS)
MF <- toTable(GOMFPARENTS)
g <- graph.data.frame( rbind(BP,CC,MF) )
Here is another version that keeps (some of) the metadata of the vertices (=terms) as well: the name of the term, the ontology and the textual definition. These will be added as vertex attributes.
source("http://bioconductor.org/biocLite.R")
biocLite("GO.db")
library(GO.db)
library(igraph)
BP <- toTable(GOBPPARENTS)
CC <- toTable(GOCCPARENTS)
MF <- toTable(GOMFPARENTS)
terms <- toTable(GOTERM)[,2:5]
terms <- terms[ !duplicated(terms[,1]), ]
g <- graph.data.frame( rbind(BP,CC,MF), vertices=terms )
For the terms, the table in the GO.db package is a one-to-many mapping, because it contains the (potentially many) synonyms and secondary IDS of the category. But we throw away these columns anyway, so we throw away the rows that appear many time in the table as well.
2. Centrality(-zation) function
This is quiet basic, but may help cut corners. All the three 'classic' centrality measures are normalized.
centrality.norm<-function(graph,type=c("degree","closeness","betweenness"),centralization=FALSE)
{
result<-NA
g<-graph
cent<-centralization
if (!is.igraph(g)) {stop("Not a graph object")}
if (type[[1]] == "degree") {
if (!cent) result <- degree(g)/(vcount(g)-1)
else result <- (sum(max(degree(g))-degree(g)))/((vcount(g)-1)*(vcount(g)-2))}
else if (type[[1]] == "betweenness") {
temp <- 2*betweenness(g)/((vcount(g)-1)*(vcount(g)-2))
if (!cent) result <- temp
else result <- sum(max(temp)-temp)/(vcount(g)-1)}
else if (type[[1]] == "closeness") {
if (!cent) result <- closeness(g)
else result <- (2*vcount(g)-3)*(sum(max(closeness(g))-closeness(g)))/((vcount(g)-1)*(vcount(g)-2))}
else {stop("this type is unavailable or mispelled")}
return(result)
}
For example, if you want to calculate the closeness centrality of the vertices of a graph:
g<-graph.atlas(1000);
centrality.norm(g,type="closeness",centralization=FALSE)
To obtain the closeness centralization:
centrality.norm(g,type="closeness",centralization=TRUE)
3. Exporting graphs to LaTeX, using igraph and TikZ
Jérémie Knüsel wrote a very useful function to export graphs in LaTeX in order to use the powerful TikZ package, allowing then to use its options. You'll be able to create very large graphs, which was hard and time-consuming to create in LaTeX before…
3.1 The function
- The two "igraph" parameters are 1. the graph (igraph format) 2. the layout, which can be either a matrix (dim: vcount(g) x vcount(g)) or a function (e.g. layout.random, …)
- The "TikZ" parameters, that are situated inside the "igraph.to.tikz" function may by modified according to the pgf/TikZ manual.
node is related to the nodes' aspects
nodirectional is used in the case of undirected graph
unidirectional and bidirectional are used with directed graphs. tbidirectional is used in the case of bidectional (!) edges and draws them one next to the other (and not superimposed like with the dafault R output).
3.2 The code
tikz <- function (graph, layout) {
## Here we get the matrix layout
if (class(layout) == "function")
layout <- layout(graph)
layout <- layout / max(abs(layout))
clos <- closeness(graph)
##TikZ initialisation and default options (see pgf/TikZ manual)
cat("\\tikzset{\n")
cat("\tnode/.style={circle,inner sep=1mm,minimum size=0.8cm,draw,very thick,black,fill=red!20,text=black},\n")
cat("\tnondirectional/.style={very thick,black},\n")
cat("\tunidirectional/.style={nondirectional,shorten >=2pt,-stealth},\n")
cat("\tbidirectional/.style={unidirectional,bend right=10}\n")
cat("}\n")
cat("\n")
##Size
cat("\\begin{tikzpicture}[scale=5]\n")
for (vertex in V(graph)) {
label <- V(graph)[vertex]$label
if (is.null(label))
label <- ""
##drawing vertices
cat (sprintf ("\t\\node [node] (v%d) at (%f, %f)\t{%s};\n", vertex, layout[vertex+1,1], layout[vertex+1,2], label))
}
cat("\n")
adj = get.adjacency(graph)
bidirectional = adj & t(adj)
if (!is.directed(graph)) ##undirected case
for (line in 1:nrow(adj)) {
for (col in line:ncol(adj)) {
if (adj[line,col]&col>line) {
cat (sprintf ("\t\\path [nondirectional] (v%d) edge (v%d);\n", line-1, col-1)) ##edges drawing
}
}
}
else ##directed case
for (line in 1:nrow(adj)) {
for (col in 1:ncol(adj)) {
if (bidirectional[line,col]&line > col)
cat (sprintf ("\t\\path [bidirectional] (v%d) edge (v%d);\n", line-1, col-1),
sprintf ("\t\\path [bidirectional] (v%d) edge (v%d);\n", col-1, line-1)) ##edges drawing
else if (!bidirectional[line,col]&adj[line,col])
cat (sprintf ("\t\\path [unidirectional] (v%d) edge (v%d);\n", line-1, col-1)) ##edges drawing
}
}
cat("\\end{tikzpicture}\n")
}
3.3 An example
After the installation of the pgf/TikZ package, create a LaTeX document with this template:
\documentclass{article}
\usepackage{tikz}
\usetikzlibrary{arrows,decorations.pathmorphing,backgrounds,positioning,fit}
\begin{document}
\end{document}
Then evaluate the function "igraph.to.tikz" in R. At that point, create a graph and translate it:
g<-as.directed(graph.atlas(400))
igraph.to.tikz(g,layout.fruchterman.reingold)
Copy the result in the LaTeX document and compose it!
4. Bonacich's power centrality for big graphs, using sparse matrices
bonpow.sparse <- function(graph, nodes=V(graph), loops=FALSE,
exponent=1, rescale=FALSE, tol=1e-07) {
## remove loops if requested
if (!loops) {
graph <- simplify(graph, remove.multiple=FALSE, remove.loops=TRUE)
}
vg <- vcount(graph)
## sparse adjacency matrix
d <- get.adjacency(graph, sparse=TRUE)
## sparse identity matrix
id <- Diagonal(vg)
## solve it
ev <- solve(id - exponent * d, degree(graph, mode="out"), tol=tol)
if (rescale) {
ev <- ev/sum(ev)
} else {
ev <- ev * sqrt(vcount(graph)/sum((ev)^2))
}
ev[as.numeric(nodes) + 1]
}
## test graph
test.g <- simplify(ba.game(1000,m=2))
## test run
system.time(bp1 <- bonpow(test.g))
system.time(bp2 <- bonpow.sparse(test.g))
## check that they are the same
max(abs(bp1-bp2))
5. Assortativity coefficient in R
Code translated from the python recipes. References? Same coefficient as Newman's ?
assortativity <- function(graph){
deg <- degree(graph)
deg.sq <- deg^2
m <- ecount(graph)
num1 <- 0; num2 <- 0; den <- 0
edges <- get.edgelist(graph)+1
num1 <- sum(deg[edges[,1]] * deg[edges[,2]]) / m
num2 <- (sum(deg[edges[,1]] + deg[edges[,2]]) / (2 * m))^2
den <- sum(deg.sq[edges[,1]] + deg.sq[edges[,2]]) / (2 * m)
return((num1-num2)/(den-num2))
}





