El recuento de triángulos es realmente difícil y costoso desde el punto de vista informático. Tal vez este sea un buen punto de partida para entender por qué: Efficient Triangle Counting in Large Graphs via Degree-based Vertex Partitioning.
El bucle apropiado debe verificar para cada uno de los n nodos contra cada uno de sus vecinos (n * (n-1)) y continuar el ciclo para ver si n vecinos del vecino n es n: (n * (n-1)) (n-1) (n-1), que es casi 10^16 para 10000 n. Con un millón de nodos estos bucles se vuelven tontos, pero para tu 10000 no deberías tener ningún problema si quieres usar la fuerza bruta :)
Mencionaste que codificas en C# y en el gráfico (que está disponible para C) tiene un excelente algoritmo para contar triángulos escritos por Gabor Csardi. Contó 1.3 millones de triángulos en mi gráfico aleatorio de 10000 nodos y un millón de bordes en 1.3 segundos en una computadora portátil de cinco años :) Gabor Csardi sería el tipo a preguntar :)
En términos de diferentes enfoques programáticos, tal vez debería ver los datos en los que almacena su red. Si se almacena en una matriz de adyacencia, el número de bucles es fijo, pero en una lista de bordes de una red de tres aristas, el número de bucles es un múltiplo de tres, independientemente del número de bucles. Puede preguntar a la lista de bordes de los vecinos de un nodo sin tener que probar cada combinación de i-> j.
Escribí una secuencia de comandos pedagógica en R para ilustrar los enfoques y medir la velocidad de los diferentes algoritmos de una manera muy básica. Hay muchos problemas de velocidad inherentes al uso de R aquí (la versión de la lista de bordes está completamente saturada por demasiados bordes), pero creo que el ejemplo del código debería hacer que fluyan algunas ideas sobre cómo pensar acerca de la velocidad de la velocidad bruta. forzando recuentos de triángulos. Esto está en R, y no extremadamente limpio, pero bien comentado. Espero que puedas romper la barrera del idioma.
Todo lo mejor.
# Counting triangles in a random graph using igraph and two different
# and almost equally stupid approaches looping through the 1) adjacency
# matrix and 2) the edge-list in R.
# Use igraph and these configs
library(igraph)
V <- 100
E <- 1700
# This is the random graph that we will use
g <- erdos.renyi.game(type="gnm", n=V, p=E, directed=FALSE, loops=FALSE)
# igraph has such a fast algorythm. Long live Gabor Csardi!
benchmark <- proc.time()["elapsed"]
triangle.count <- sum(count_triangles(g)/3)
gabor.Csardi.benchmark <- proc.time()["elapsed"] - benchmark
# For not to big networks we can plot them to get a basic feel
if(length(V(g)) < 100){
V(g)$size <- 5
plot(g)
}
# The adjacency matrix approach will have to deal with a crazy
# ammount of looping through pairs of matrix combinations to
# find links:
# We'll loop through each node to check it's participation in triangles
# notice that a triangle ijk will be participated in by nodes
# i, j, and k, and that each of those nodes have two triangular counts.
# All in all the structures ijk, ikj, jik, jki, kij, kji are each counted
# but shall be returned as 1 triangle. We will therefore devide our
# search-result by 6 later on.
# Store our progess in this matrix to look at how we did later on
progress <- matrix(0, nrow=length(V(g)), ncol=8)
# Loop through all nodes to find triangles in an adjacency matrix
benchmark <- proc.time()["elapsed"] # Measure time for this loop
for(i in 1:length(V(g))){
# Node i has connections to these nodes:
i.neighbors <- as.vector(neighborhood(g, 1, nodes=i)[[1]])
i.neighbors <- setdiff(i.neighbors, c(i)) # i should not be part of its own neighborhood
# for each i, tri is the number of triangles that i is involved in
# using any j or any k. For a triangle {1,2,3}, tri will be 2 for
# i==1, since i is part of both triangles {1,2,3} and {1,3,2}:
tri <- 0
for(j in i.neighbors)
{
# Node j has connections to these nodes:
j.neighbors <- as.vector(neighborhood(g, 1, nodes=j)[[1]])
j.neighbors <- setdiff(j.neighbors, c(j)) # j should not be part of its own neighborhood
# Were any of j's neighbors also a neighbor of i?
k <- intersect(i.neighbors, j.neighbors)
tri <- tri + length(k)
}
# Save our findings to the progress matrix
progress[i,1] <- tri
progress[i,7] <- proc.time()["elapsed"] - benchmark
}
progress[,2] <- sapply(1:length(progress[,1]), function(x) sum(progress[,1][1:x]))
progress[,3] <- round(progress[,2]/6, digits=2)
# The edge-list approach uses a list of all edges in the network to loop through instead
# Here, I suppose, a lot of the extra speed could arise from R being better at looping
# with lapply() and at finding data in a data.frame that the brute-force loop above is.
el <- as.data.frame(as.matrix(get.edgelist(g,)))
# This is ugly. Make the edgelist contain all edges as both i->j and j->i. In
# the igraph object, they are only given as low i to high j by get.edgelist()
el.rev <- data.frame(el[,2], el[,1])
names(el) <- names(el.rev) <- c("i","j")
el <- rbind(el, el.rev)
# these nodes are connected (we'd only need to bother abouth non isolates)
nodes <- sort(unique(c(el$i, el$j)))
tri <- 0
# Loop through connected nodes to find triangles in edge-list
benchmark <- proc.time()["elapsed"] # Measure time for this loop
for(i in nodes){
i.neighbors <- el[el$i==i,]$j
# i's neighbors are the $j:s of the edgelist where $i:s are i.
k.list <- unlist(lapply(i.neighbors, function(x) intersect(i.neighbors,el[el$i==x, ]$j)))
# lists nodes that can be a k in an ijk-triangle for each of i's neighboring j:s
# If 1 has neighbors 2 and 3, el[el$i==x, ]$j) will be first, the neighbors of 2 and then
# the neighbors of 3. When intersected with the neighbors of i, k:s will be found. If
# {1,2,3} is a triangle one k will be 3 for {i=1, j=2}, and another k will be 2 for {i=1, j=3}
# k.list might be NULL
tri.for.this.i <- (as.numeric(length(k.list))/2)
# Here we devide by two since i can be in a triangle with j and k lik {ijk} and {ikj}
# We will later have to devide by 3 more, since each triangle will be counted for
# each node i that we loop through
# Save the counting to the progress
tri <- tri.for.this.i + tri
progress[i,4] <- as.numeric(tri.for.this.i)
mm <- c(mm, i)
progress[i,8] <- proc.time()["elapsed"] - benchmark
}
progress[,5] <- sapply(1:length(progress[,4]), function(x) sum(progress[,4][1:x]))
progress[,6] <- round(progress[,5]/3, digits=2)
# Fix the results into a usable format
results <- data.frame(c("igraph", "adjacency-loop", "edge-loop"),
c(triangle.count, max(progress[,3]), max(progress[,6])),
c(gabor.Csardi.benchmark, (max(progress[,7]) - min(progress[,7])), (max(progress[,8]) - min(progress[,8]))))
row.names(results) <- c("igraph", "Adjacensy-loop", "Edge-loop")
names(results) <- c("Routine", "Triangle count", "Execution-time")
# Now we have run two approaches of more or less the same thing.
# Not only should the igraph triangle.count, and the two loops
# be identical, but the progress of the two methods should too.
progress[,3] == progress[,6]
plot(progress[,6], type="l", col="blue")
lines(progress[,7], col="green")
# Look at the result:
View(results)
Comprobar esto para C programa implimentation [número de triángulo en el gráfico dado] (http://www.msccomputerscience.com/2014/04/wap-to-find-number-of-triangle-in- given.html) – ARJUN