Effect of hubs in WSBM

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Effect of hubs in WSBM

Dominik Schlechtweg
Hi Tiago,

we noticed that with certain weighted graphs minimize_blockmodel_dl() tends to put hubs (vertices with many edges) into the same cluster. Please find a minimal example below, which produces the clustered graph in the attached plot. This happens even if edge weights are distributed uniformly over edges. Is this intended behavior?

We wonder if a possible explanation could be that the WSBM is fit to predict edge weights *as well as edge probabilities*. (Compare to formulas (1) and (4) in [1].) Hence, vertices with similar degrees tend to end up in the same cluster, if the edge weights do not contradict this. Is this correct?

In case the above makes any sense, is there a way to suppress the likelihood of the edge probabilities as in [2] where the alpha-parameter can be used to fit "only to the weight information"? (Compare to formula (4) in [2].)

This is also related to the question we asked here:

https://git.skewed.de/count0/graph-tool/-/issues/664

Best,
Dominik (and Enrique)

[1] Tiago Peixoto. 2017. Nonparametric weighted stochastic block models. Physical Review E, 97.
[2] C. Aicher, A. Z. Jacobs, and A. Clauset. 2014.  Learning  latent  block  structure  in  weighted  networks. Journal of Complex Networks, 3(2):221–248.

-----
import graph_tool.all as gt

h = gt.Graph(directed=False)

h.add_edge(0,3)
h.add_edge(1,3)
h.add_edge(2,3)
h.add_edge(8,3)
h.add_edge(9,3)
h.add_edge(12,3)
h.add_edge(13,3)
h.add_edge(14,3)
h.add_edge(15,3)

h.add_edge(3,4)
h.add_edge(5,4)
h.add_edge(6,4)
h.add_edge(7,4)
h.add_edge(10,4)
h.add_edge(11,4)
h.add_edge(16,4)
h.add_edge(17,4)
h.add_edge(18,4)
h.add_edge(19,4)
h.add_edge(20,4)

h.add_edge(22,21)
h.add_edge(23,21)
h.add_edge(24,21)
h.add_edge(25,21)
h.add_edge(26,21)
h.add_edge(27,21)
h.add_edge(28,21)
h.add_edge(29,21)
h.add_edge(30,21)
h.add_edge(31,21)
h.add_edge(32,21)

h.add_edge(5, 21)

ew = h.new_edge_property("double")
ew.a = [4] * len(h.get_edges())
h.ep['weight'] = ew


state = gt.minimize_blockmodel_dl(
    h,
    state_args=dict(recs=[h.ep.weight],rec_types=["discrete-binomial"])
)

b = state.get_blocks()
b = gt.perfect_prop_hash([b])[0]


gt.graph_draw(
  h,
  edge_text=h.ep.weight,
  vertex_size=20,
  vertex_fill_color=b,
  ink_scale=0.9,
  bg_color=[1,1,1,1]
)

_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

hub_wsbm_1.png (143K) Download Attachment
0x67336E43281F46CA.asc (693 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Effect of hubs in WSBM

Tiago Peixoto
Administrator
Am 16.07.20 um 00:49 schrieb Dominik Schlechtweg:
> Hi Tiago,
>
> we noticed that with certain weighted graphs minimize_blockmodel_dl() tends to put hubs (vertices with many edges) into the same cluster. Please find a minimal example below, which produces the clustered graph in the attached plot. This happens even if edge weights are distributed uniformly over edges. Is this intended behavior?
>
> We wonder if a possible explanation could be that the WSBM is fit to predict edge weights *as well as edge probabilities*. (Compare to formulas (1) and (4) in [1].) Hence, vertices with similar degrees tend to end up in the same cluster, if the edge weights do not contradict this. Is this correct?
>

This has nothing to do with having weights or not; if you use an
unweighted SBM you get the same behavior.

This clustering makes sense under the model, because a random multigraph
model with the same degree sequence would yield a larger number of
connections between the hubs, and between the nodes with smaller degree.

See an explanation for this in this paper: https://arxiv.org/abs/2002.07803

Best,
Tiago


--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

signature.asc (849 bytes) Download Attachment
--
Tiago de Paula Peixoto <tiago@skewed.de>
Reply | Threaded
Open this post in threaded view
|

Re: Effect of hubs in WSBM

Dominik Schlechtweg
thanks, short follow-up:

Am 17.07.20 um 12:36 schrieb Tiago de Paula Peixoto:

> Am 16.07.20 um 00:49 schrieb Dominik Schlechtweg:
>> Hi Tiago,
>>
>> we noticed that with certain weighted graphs minimize_blockmodel_dl() tends to put hubs (vertices with many edges) into the same cluster. Please find a minimal example below, which produces the clustered graph in the attached plot. This happens even if edge weights are distributed uniformly over edges. Is this intended behavior?
>>
>> We wonder if a possible explanation could be that the WSBM is fit to predict edge weights *as well as edge probabilities*. (Compare to formulas (1) and (4) in [1].) Hence, vertices with similar degrees tend to end up in the same cluster, if the edge weights do not contradict this. Is this correct?
>>
>
> This has nothing to do with having weights or not; if you use an
> unweighted SBM you get the same behavior.
I see that we are probably mixing up two things here. Regarding this point:

> is there a way to suppress the likelihood of the edge probabilities as in [2] where the alpha-parameter can be used to fit "only to the weight information"? (Compare to formula (4) in [2].)
> [...]
> [2] C. Aicher, A. Z. Jacobs, and A. Clauset. 2014.  Learning  latent  block  structure  in  weighted  networks. Journal of Complex Networks, 3(2):221–248.

How does the graph-tools implementation relate to the alpha-parameter in formula (4)? Is it equivalent to giving equal weight to edge probabilities and weights (alpha = 0.5)?

>
> This clustering makes sense under the model, because a random multigraph
> model with the same degree sequence would yield a larger number of
> connections between the hubs, and between the nodes with smaller degree.
>
> See an explanation for this in this paper: https://arxiv.org/abs/2002.07803

Is it possible to use LatentMultigraphBlockState() with a weighted graph?

>
> Best,
> Tiago
>
>
>
> _______________________________________________
> graph-tool mailing list
> [hidden email]
> https://lists.skewed.de/mailman/listinfo/graph-tool
>

_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

0x67336E43281F46CA.asc (693 bytes) Download Attachment
signature.asc (235 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Effect of hubs in WSBM

Tiago Peixoto
Administrator
Am 17.07.20 um 14:19 schrieb Dominik Schlechtweg:
>> is there a way to suppress the likelihood of the edge probabilities as in [2] where the alpha-parameter can be used to fit "only to the weight information"? (Compare to formula (4) in [2].)
>> [...]
>> [2] C. Aicher, A. Z. Jacobs, and A. Clauset. 2014.  Learning  latent  block  structure  in  weighted  networks. Journal of Complex Networks, 3(2):221–248.
>
> How does the graph-tools implementation relate to the alpha-parameter in formula (4)? Is it equivalent to giving equal weight to edge probabilities and weights (alpha = 0.5)?

This parameter is not implemented in graph-tool.

Note that such a parameter does not have an obvious interpretation from
a generative modelling point of view, specially in a Bayesian way. We
cannot just introduce ad-hoc parameters to cancel certain parts of the
likelihood, without paying proper attention to issues of normalization,
etc, and expect things to behave consistently.

In other words, I do not fully agree with the alpha parameter of Aicher
et al.

> Is it possible to use LatentMultigraphBlockState() with a weighted graph?

Not yet.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

signature.asc (849 bytes) Download Attachment
--
Tiago de Paula Peixoto <tiago@skewed.de>
Reply | Threaded
Open this post in threaded view
|

Re: Effect of hubs in WSBM

Dominik Schlechtweg

Am 17.07.20 um 19:44 schrieb Tiago de Paula Peixoto:

> Am 17.07.20 um 14:19 schrieb Dominik Schlechtweg:
>>> is there a way to suppress the likelihood of the edge probabilities as in [2] where the alpha-parameter can be used to fit "only to the weight information"? (Compare to formula (4) in [2].)
>>> [...]
>>> [2] C. Aicher, A. Z. Jacobs, and A. Clauset. 2014.  Learning  latent  block  structure  in  weighted  networks. Journal of Complex Networks, 3(2):221–248.
>>
>> How does the graph-tools implementation relate to the alpha-parameter in formula (4)? Is it equivalent to giving equal weight to edge probabilities and weights (alpha = 0.5)?
>
> This parameter is not implemented in graph-tool.
>
> Note that such a parameter does not have an obvious interpretation from
> a generative modelling point of view, specially in a Bayesian way. We
> cannot just introduce ad-hoc parameters to cancel certain parts of the
> likelihood, without paying proper attention to issues of normalization,
> etc, and expect things to behave consistently.
>
> In other words, I do not fully agree with the alpha parameter of Aicher
> et al.
Thanks for clarifying this. Last question: Does your doubt also concern the special case where alpha = 0, i.e., ignoring edge probabilities completely? (This is the actually interesting case for us. We are not interested in tuning this parameter in any way.)

>
>> Is it possible to use LatentMultigraphBlockState() with a weighted graph?
>
> Not yet.

We will open a feature request then, in case there is none yet.

>
> Best,
> Tiago
>
>
> _______________________________________________
> graph-tool mailing list
> [hidden email]
> https://lists.skewed.de/mailman/listinfo/graph-tool
>

_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

0x67336E43281F46CA.asc (693 bytes) Download Attachment
signature.asc (235 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Effect of hubs in WSBM

Tiago Peixoto
Administrator
Am 17.07.20 um 20:35 schrieb Dominik Schlechtweg:> Thanks for clarifying
this. Last question: Does your doubt also
> concern the special case where alpha = 0, i.e., ignoring edge
> probabilities completely? (This is the actually interesting case for
> us. We are not interested in tuning this parameter in any way.)

I'd have to look at this more closely, but I think this is strange since
the edges define the support of the distribution. The more correct thing
to do, in my opinion, would be to marginalize over the edges. This might
end up being equivalent o setting alpha=0, but I would need to check.

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

signature.asc (849 bytes) Download Attachment
--
Tiago de Paula Peixoto <tiago@skewed.de>
Reply | Threaded
Open this post in threaded view
|

Re: Effect of hubs in WSBM

Dominik Schlechtweg

Am 17.07.20 um 20:44 schrieb Tiago de Paula Peixoto:

> Am 17.07.20 um 20:35 schrieb Dominik Schlechtweg:> Thanks for clarifying
> this. Last question: Does your doubt also
>> concern the special case where alpha = 0, i.e., ignoring edge
>> probabilities completely? (This is the actually interesting case for
>> us. We are not interested in tuning this parameter in any way.)
>
> I'd have to look at this more closely, but I think this is strange since
> the edges define the support of the distribution. The more correct thing
> to do, in my opinion, would be to marginalize over the edges. This might
> end up being equivalent o setting alpha=0, but I would need to check.
You're probably quite busy, but as we would really profit from this, I would then open a feature request, if you don't stop me right now. :-)

>
>
> _______________________________________________
> graph-tool mailing list
> [hidden email]
> https://lists.skewed.de/mailman/listinfo/graph-tool
>

_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

0x67336E43281F46CA.asc (693 bytes) Download Attachment
signature.asc (235 bytes) Download Attachment