Clustering coefficient with parallel edges

classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Clustering coefficient with parallel edges

Tanguy Fardet-2

Hi Tiago,

I was recently comparing the behavior of the clustering coefficient in different graph libraries and I was surprised by the results given by graph-tool with parallel edges:

import graph_tool as gt
from graph_tool.clustering import local_clustering
from graph_tool.stats import remove_parallel_edges

num_nodes = 5
edge_list = [(0, 3), (3, 0), (1, 0), (0, 1), (1, 2), (2, 1), (2, 4), (4, 2), (4, 1), (1, 4), (4, 3), (3, 4)]

g = gt.Graph()

g.add_vertex(num_nodes)

g.add_edge_list(edge_list)

print(local_clustering(g, undirected=False).a)
print(local_clustering(g, undirected=True).a)

g.set_directed(False)
remove_parallel_edges(g)
print(local_clustering(g, undirected=True).a)

Gives:

[0.         0.33333333 1.         0.         0.33333333]
[0.         0.26666667 0.66666667 0.         0.26666667]
[0.         0.33333333 1.         0.         0.33333333]

And I must say I was expecting the same result for all three ^^

The graph:

From what I understand, then, it seems that, when using undirected=True, graph-tool considers reciprocal edges as two different ones and considers the possible triangles between parallel edges as distinct.

Is this desired behavior, and if so, why?
I know it stems from the mathematical formula, but I think this one was derived from simple graphs and I'm not sure I see the physical meaning of the "extension" to parallel edges, besides the fact that it would perhaps give the same result as weighting the edges by their number.

If this is indeed desired, is there a quick way to do the equivalent of setting the graph to undirected and removing the parallel edges in "view-mode" (i.e. making the removal temporary, like a filter)?

Thanks in advance for your help and for your work on graph-tool.

Best,
Tanguy


_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Clustering coefficient with parallel edges

Tiago Peixoto
Administrator
Am 14.04.20 um 09:32 schrieb Tanguy Fardet:
> From what I understand, then, it seems that, when using undirected=True,
> graph-tool considers reciprocal edges as two different ones and
> considers the possible triangles between parallel edges as distinct.
>
> Is this desired behavior, and if so, why?> I know it stems from the mathematical formula, but I think this one was
> derived from simple graphs and I'm not sure I see the physical meaning
> of the "extension" to parallel edges, besides the fact that it would
> perhaps give the same result as weighting the edges by their number.

Yes, it's desired behavior.

Ignoring parallel edges would have complicated and slowed down the code,
and in any case can be easily implemented by the user (see below), who
should have something specific in mind when attempting to compute the
value for multigraphs.

> If this is indeed desired, is there a quick way to do the equivalent of
> setting the graph to undirected and removing the parallel edges in
> "view-mode" (i.e. making the removal temporary, like a filter)?

Yes, just use:

  u = GraphView(g, directed=False)
  GraphView(u, efilt=label_parallel_edges(u).fa == 0)

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

signature.asc (849 bytes) Download Attachment
--
Tiago de Paula Peixoto <tiago@skewed.de>
Reply | Threaded
Open this post in threaded view
|

Re: Clustering coefficient with parallel edges

Tanguy Fardet-2
Hi Tiago,

thanks for the quick and precise answer!


> Yes, it's desired behavior.
>
> Ignoring parallel edges would have complicated and slowed down the code,
> and in any case can be easily implemented by the user (see below), who
> should have something specific in mind when attempting to compute the
> value for multigraphs.

OK, do you think it would be relevant to include it in the documentation?
It was not very obvious to me...


>> If this is indeed desired, is there a quick way to do the equivalent of
>> setting the graph to undirected and removing the parallel edges in
>> "view-mode" (i.e. making the removal temporary, like a filter)?
> Yes, just use:
>
>    u = GraphView(g, directed=False)
>    GraphView(u, efilt=label_parallel_edges(u).fa == 0)

Works like a charm, thank you!

Best,
Tanguy

_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Clustering coefficient with parallel edges

Tiago Peixoto
Administrator
Am 14.04.20 um 10:50 schrieb Tanguy Fardet:
>> Yes, it's desired behavior.
>>
>> Ignoring parallel edges would have complicated and slowed down the code,
>> and in any case can be easily implemented by the user (see below), who
>> should have something specific in mind when attempting to compute the
>> value for multigraphs.
>
> OK, do you think it would be relevant to include it in the documentation?
> It was not very obvious to me...

But doesn't the function compute exactly what it says in the documentation?

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

signature.asc (849 bytes) Download Attachment
--
Tiago de Paula Peixoto <tiago@skewed.de>
Reply | Threaded
Open this post in threaded view
|

Re: Clustering coefficient with parallel edges

Tanguy Fardet-2
> But doesn't the function compute exactly what it says in the documentation?

It does, but one needs to know what the undirected=True implies for the
graph, otherwise the result is surprising (or just not what one's think
one is computing, if not working on a super simple test graph where it
is easy to check).

I think some people might end up not noticing and therefore would not
compute what they think they are computing...

Best,
Tanguy

_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Clustering coefficient with parallel edges

Tiago Peixoto
Administrator
Am 14.04.20 um 11:26 schrieb Tanguy Fardet:
>> But doesn't the function compute exactly what it says in the
>> documentation?
>
> It does, but one needs to know what the undirected=True implies for the
> graph, otherwise the result is surprising (or just not what one's think
> one is computing, if not working on a super simple test graph where it
> is easy to check).

It is consistent with how directionality is implement in graph-tool,
which preserves the total number of edges. Furthermore, the concept of
multigraph vs simple graph is orthogonal to directed vs undirected graphs.

So this is not really about the clustering coefficient computation.

> I think some people might end up not noticing and therefore would not
> compute what they think they are computing...

I don't like second-guessing misinterpretations like this. What is
important is to be consistent.

The surprise came from not understanding how the undirected/directed
filter works. This kind of surprise will appear in almost every function
of this kind, and I don't think that putting a warning everywhere is the
right approach.

Besides, I think the way the undirected filter works is perfectly
intuitive. What you seemed to be expecting was for a simple directed
graph to become a simple undirected graph. But in this case how would
property maps be handled? Suppose the incoming edge had a weight value
and the outgoing edge had another, which one should be kept in the
simple graph? How would a directed multigraph be handled when converted
to undirected? Should it magically become a simple graph? Wouldn't that
be a lot more surprising?

In graph-tool all graphs are multigraphs per default, which makes
decisions such as these easier.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

signature.asc (849 bytes) Download Attachment
--
Tiago de Paula Peixoto <tiago@skewed.de>
Reply | Threaded
Open this post in threaded view
|

Re: Clustering coefficient with parallel edges

Tanguy Fardet-2
Hi Tiago,

> It is consistent with how directionality is implement in graph-tool,
> which preserves the total number of edges. Furthermore, the concept of
> multigraph vs simple graph is orthogonal to directed vs undirected graphs.
>
> So this is not really about the clustering coefficient computation.

I understand.


> Besides, I think the way the undirected filter works is perfectly
> intuitive. What you seemed to be expecting was for a simple directed
> graph to become a simple undirected graph. But in this case how would
> property maps be handled? Suppose the incoming edge had a weight value
> and the outgoing edge had another, which one should be kept in the
> simple graph? How would a directed multigraph be handled when converted
> to undirected? Should it magically become a simple graph? Wouldn't that
> be a lot more surprising?

Yes, I was thinking that way because there were no attributes in the
current graph, but you are right, of course.
For instance igraph provides a keyword for edge coalescence to sum/take
the max/other of the edge attributes; is there a similar way of doing
this that is already available in graph-tool?

Best,
Tanguy

_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Clustering coefficient with parallel edges

Tiago Peixoto
Administrator
Am 14.04.20 um 14:34 schrieb Tanguy Fardet:
>
> Yes, I was thinking that way because there were no attributes in the
> current graph, but you are right, of course.
> For instance igraph provides a keyword for edge coalescence to sum/take
> the max/other of the edge attributes; is there a similar way of doing
> this that is already available in graph-tool?

Yes, take a look at the condensation_graph() function.

--
Tiago de Paula Peixoto <[hidden email]>
_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool
--
Tiago de Paula Peixoto <tiago@skewed.de>
Reply | Threaded
Open this post in threaded view
|

Re: Clustering coefficient with parallel edges

Tanguy Fardet-2
Ah, I did not think I could use it that way!

I'll try it out, thank you.

Best,
Tanguy

PS: (sorry for the spam, I hit reply again)

On 14/04/2020 14:36, Tiago de Paula Peixoto wrote:
> Am 14.04.20 um 14:34 schrieb Tanguy Fardet:
>> Yes, I was thinking that way because there were no attributes in the
>> current graph, but you are right, of course.
>> For instance igraph provides a keyword for edge coalescence to sum/take
>> the max/other of the edge attributes; is there a similar way of doing
>> this that is already available in graph-tool?
> Yes, take a look at the condensation_graph() function.
>
_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Clustering coefficient with parallel edges

Rolf Sander
Hello Tiago and Tanguy,

I'm following your discussion with great interest as I have a similar
task. I want to convert a multigraph into a simple graph. This works
very nicely with

GraphView(u, efilt=label_parallel_edges(u).fa == 0)

However, removing the parallel edges this way means that the edge
properties of the parallel edges are lost. I have an edge property
called "eqntag" which is a string and I want to concatenate all eqntags
from the parallel edges into the simple graph. I think this is what
Tanguy meant by "edge coalescence". I did not understand how to use
condensation_graph() for this purpose, so I've written my own function:

def remove_parallel_edges_merge_eqntags(g):
    # create simple graph g2:
    g2 = gt.GraphView(g, efilt=gt.label_parallel_edges(g).fa == 0)
    # loop over the remaining edges in the simple graph:
    for e in g2.edges():
        # create a list with all edges from g that connect the same
        # vertices as the current edge in g2:
        edgelist = g.edge(e.source(), e.target(), all_edges=True)
        # replace eqntag by semicolon-separated merged eqntag:
        g2.ep.eqntag[e] = ';'.join([g.ep.eqntag[edge] for edge in edgelist])
    return g2

Best regards
Rolf




--
Sent from: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool