# Effect of hubs in WSBM Classic List Threaded 7 messages Open this post in threaded view
|

## Effect of hubs in WSBM

 Hi Tiago, we noticed that with certain weighted graphs minimize_blockmodel_dl() tends to put hubs (vertices with many edges) into the same cluster. Please find a minimal example below, which produces the clustered graph in the attached plot. This happens even if edge weights are distributed uniformly over edges. Is this intended behavior? We wonder if a possible explanation could be that the WSBM is fit to predict edge weights *as well as edge probabilities*. (Compare to formulas (1) and (4) in .) Hence, vertices with similar degrees tend to end up in the same cluster, if the edge weights do not contradict this. Is this correct? In case the above makes any sense, is there a way to suppress the likelihood of the edge probabilities as in  where the alpha-parameter can be used to fit "only to the weight information"? (Compare to formula (4) in .) This is also related to the question we asked here: https://git.skewed.de/count0/graph-tool/-/issues/664Best, Dominik (and Enrique)  Tiago Peixoto. 2017. Nonparametric weighted stochastic block models. Physical Review E, 97.  C. Aicher, A. Z. Jacobs, and A. Clauset. 2014.  Learning  latent  block  structure  in  weighted  networks. Journal of Complex Networks, 3(2):221–248. ----- import graph_tool.all as gt h = gt.Graph(directed=False) h.add_edge(0,3) h.add_edge(1,3) h.add_edge(2,3) h.add_edge(8,3) h.add_edge(9,3) h.add_edge(12,3) h.add_edge(13,3) h.add_edge(14,3) h.add_edge(15,3) h.add_edge(3,4) h.add_edge(5,4) h.add_edge(6,4) h.add_edge(7,4) h.add_edge(10,4) h.add_edge(11,4) h.add_edge(16,4) h.add_edge(17,4) h.add_edge(18,4) h.add_edge(19,4) h.add_edge(20,4) h.add_edge(22,21) h.add_edge(23,21) h.add_edge(24,21) h.add_edge(25,21) h.add_edge(26,21) h.add_edge(27,21) h.add_edge(28,21) h.add_edge(29,21) h.add_edge(30,21) h.add_edge(31,21) h.add_edge(32,21) h.add_edge(5, 21) ew = h.new_edge_property("double") ew.a =  * len(h.get_edges()) h.ep['weight'] = ew state = gt.minimize_blockmodel_dl(     h,     state_args=dict(recs=[h.ep.weight],rec_types=["discrete-binomial"]) ) b = state.get_blocks() b = gt.perfect_prop_hash([b]) gt.graph_draw(   h,   edge_text=h.ep.weight,   vertex_size=20,   vertex_fill_color=b,   ink_scale=0.9,   bg_color=[1,1,1,1] ) _______________________________________________ graph-tool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graph-tool hub_wsbm_1.png (143K) Download Attachment 0x67336E43281F46CA.asc (693 bytes) Download Attachment
Open this post in threaded view
|

## Re: Effect of hubs in WSBM

 Administrator Am 16.07.20 um 00:49 schrieb Dominik Schlechtweg: > Hi Tiago, > > we noticed that with certain weighted graphs minimize_blockmodel_dl() tends to put hubs (vertices with many edges) into the same cluster. Please find a minimal example below, which produces the clustered graph in the attached plot. This happens even if edge weights are distributed uniformly over edges. Is this intended behavior? > > We wonder if a possible explanation could be that the WSBM is fit to predict edge weights *as well as edge probabilities*. (Compare to formulas (1) and (4) in .) Hence, vertices with similar degrees tend to end up in the same cluster, if the edge weights do not contradict this. Is this correct? > This has nothing to do with having weights or not; if you use an unweighted SBM you get the same behavior. This clustering makes sense under the model, because a random multigraph model with the same degree sequence would yield a larger number of connections between the hubs, and between the nodes with smaller degree. See an explanation for this in this paper: https://arxiv.org/abs/2002.07803Best, Tiago -- Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graph-tool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graph-tool signature.asc (849 bytes) Download Attachment -- Tiago de Paula Peixoto
Open this post in threaded view
|

## Re: Effect of hubs in WSBM

 thanks, short follow-up: Am 17.07.20 um 12:36 schrieb Tiago de Paula Peixoto: > Am 16.07.20 um 00:49 schrieb Dominik Schlechtweg: >> Hi Tiago, >> >> we noticed that with certain weighted graphs minimize_blockmodel_dl() tends to put hubs (vertices with many edges) into the same cluster. Please find a minimal example below, which produces the clustered graph in the attached plot. This happens even if edge weights are distributed uniformly over edges. Is this intended behavior? >> >> We wonder if a possible explanation could be that the WSBM is fit to predict edge weights *as well as edge probabilities*. (Compare to formulas (1) and (4) in .) Hence, vertices with similar degrees tend to end up in the same cluster, if the edge weights do not contradict this. Is this correct? >> > > This has nothing to do with having weights or not; if you use an > unweighted SBM you get the same behavior. I see that we are probably mixing up two things here. Regarding this point: > is there a way to suppress the likelihood of the edge probabilities as in  where the alpha-parameter can be used to fit "only to the weight information"? (Compare to formula (4) in .) > [...] >  C. Aicher, A. Z. Jacobs, and A. Clauset. 2014.  Learning  latent  block  structure  in  weighted  networks. Journal of Complex Networks, 3(2):221–248. How does the graph-tools implementation relate to the alpha-parameter in formula (4)? Is it equivalent to giving equal weight to edge probabilities and weights (alpha = 0.5)? > > This clustering makes sense under the model, because a random multigraph > model with the same degree sequence would yield a larger number of > connections between the hubs, and between the nodes with smaller degree. > > See an explanation for this in this paper: https://arxiv.org/abs/2002.07803Is it possible to use LatentMultigraphBlockState() with a weighted graph? > > Best, > Tiago > > > > _______________________________________________ > graph-tool mailing list > [hidden email] > https://lists.skewed.de/mailman/listinfo/graph-tool> _______________________________________________ graph-tool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graph-tool 0x67336E43281F46CA.asc (693 bytes) Download Attachment signature.asc (235 bytes) Download Attachment
Open this post in threaded view
|

## Re: Effect of hubs in WSBM

 Administrator Am 17.07.20 um 14:19 schrieb Dominik Schlechtweg: >> is there a way to suppress the likelihood of the edge probabilities as in  where the alpha-parameter can be used to fit "only to the weight information"? (Compare to formula (4) in .) >> [...] >>  C. Aicher, A. Z. Jacobs, and A. Clauset. 2014.  Learning  latent  block  structure  in  weighted  networks. Journal of Complex Networks, 3(2):221–248. > > How does the graph-tools implementation relate to the alpha-parameter in formula (4)? Is it equivalent to giving equal weight to edge probabilities and weights (alpha = 0.5)? This parameter is not implemented in graph-tool. Note that such a parameter does not have an obvious interpretation from a generative modelling point of view, specially in a Bayesian way. We cannot just introduce ad-hoc parameters to cancel certain parts of the likelihood, without paying proper attention to issues of normalization, etc, and expect things to behave consistently. In other words, I do not fully agree with the alpha parameter of Aicher et al. > Is it possible to use LatentMultigraphBlockState() with a weighted graph? Not yet. Best, Tiago -- Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graph-tool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graph-tool signature.asc (849 bytes) Download Attachment -- Tiago de Paula Peixoto
Open this post in threaded view
|

## Re: Effect of hubs in WSBM

 Am 17.07.20 um 19:44 schrieb Tiago de Paula Peixoto: > Am 17.07.20 um 14:19 schrieb Dominik Schlechtweg: >>> is there a way to suppress the likelihood of the edge probabilities as in  where the alpha-parameter can be used to fit "only to the weight information"? (Compare to formula (4) in .) >>> [...] >>>  C. Aicher, A. Z. Jacobs, and A. Clauset. 2014.  Learning  latent  block  structure  in  weighted  networks. Journal of Complex Networks, 3(2):221–248. >> >> How does the graph-tools implementation relate to the alpha-parameter in formula (4)? Is it equivalent to giving equal weight to edge probabilities and weights (alpha = 0.5)? > > This parameter is not implemented in graph-tool. > > Note that such a parameter does not have an obvious interpretation from > a generative modelling point of view, specially in a Bayesian way. We > cannot just introduce ad-hoc parameters to cancel certain parts of the > likelihood, without paying proper attention to issues of normalization, > etc, and expect things to behave consistently. > > In other words, I do not fully agree with the alpha parameter of Aicher > et al. Thanks for clarifying this. Last question: Does your doubt also concern the special case where alpha = 0, i.e., ignoring edge probabilities completely? (This is the actually interesting case for us. We are not interested in tuning this parameter in any way.) > >> Is it possible to use LatentMultigraphBlockState() with a weighted graph? > > Not yet. We will open a feature request then, in case there is none yet. > > Best, > Tiago > > > _______________________________________________ > graph-tool mailing list > [hidden email] > https://lists.skewed.de/mailman/listinfo/graph-tool> _______________________________________________ graph-tool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graph-tool 0x67336E43281F46CA.asc (693 bytes) Download Attachment signature.asc (235 bytes) Download Attachment
 Administrator Am 17.07.20 um 20:35 schrieb Dominik Schlechtweg:> Thanks for clarifying this. Last question: Does your doubt also > concern the special case where alpha = 0, i.e., ignoring edge > probabilities completely? (This is the actually interesting case for > us. We are not interested in tuning this parameter in any way.) I'd have to look at this more closely, but I think this is strange since the edges define the support of the distribution. The more correct thing to do, in my opinion, would be to marginalize over the edges. This might end up being equivalent o setting alpha=0, but I would need to check. -- Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graph-tool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graph-tool signature.asc (849 bytes) Download Attachment -- Tiago de Paula Peixoto
 Am 17.07.20 um 20:44 schrieb Tiago de Paula Peixoto: > Am 17.07.20 um 20:35 schrieb Dominik Schlechtweg:> Thanks for clarifying > this. Last question: Does your doubt also >> concern the special case where alpha = 0, i.e., ignoring edge >> probabilities completely? (This is the actually interesting case for >> us. We are not interested in tuning this parameter in any way.) > > I'd have to look at this more closely, but I think this is strange since > the edges define the support of the distribution. The more correct thing > to do, in my opinion, would be to marginalize over the edges. This might > end up being equivalent o setting alpha=0, but I would need to check. You're probably quite busy, but as we would really profit from this, I would then open a feature request, if you don't stop me right now. :-) > > > _______________________________________________ > graph-tool mailing list > [hidden email] > https://lists.skewed.de/mailman/listinfo/graph-tool> _______________________________________________ graph-tool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graph-tool 0x67336E43281F46CA.asc (693 bytes) Download Attachment signature.asc (235 bytes) Download Attachment