Hi Tiago,
we noticed that with certain weighted graphs minimize_blockmodel_dl() tends to put hubs (vertices with many edges) into the same cluster. Please find a minimal example below, which produces the clustered graph in the attached plot. This happens even if edge weights are distributed uniformly over edges. Is this intended behavior? We wonder if a possible explanation could be that the WSBM is fit to predict edge weights *as well as edge probabilities*. (Compare to formulas (1) and (4) in [1].) Hence, vertices with similar degrees tend to end up in the same cluster, if the edge weights do not contradict this. Is this correct? In case the above makes any sense, is there a way to suppress the likelihood of the edge probabilities as in [2] where the alphaparameter can be used to fit "only to the weight information"? (Compare to formula (4) in [2].) This is also related to the question we asked here: https://git.skewed.de/count0/graphtool//issues/664 Best, Dominik (and Enrique) [1] Tiago Peixoto. 2017. Nonparametric weighted stochastic block models. Physical Review E, 97. [2] C. Aicher, A. Z. Jacobs, and A. Clauset. 2014. Learning latent block structure in weighted networks. Journal of Complex Networks, 3(2):221–248.  import graph_tool.all as gt h = gt.Graph(directed=False) h.add_edge(0,3) h.add_edge(1,3) h.add_edge(2,3) h.add_edge(8,3) h.add_edge(9,3) h.add_edge(12,3) h.add_edge(13,3) h.add_edge(14,3) h.add_edge(15,3) h.add_edge(3,4) h.add_edge(5,4) h.add_edge(6,4) h.add_edge(7,4) h.add_edge(10,4) h.add_edge(11,4) h.add_edge(16,4) h.add_edge(17,4) h.add_edge(18,4) h.add_edge(19,4) h.add_edge(20,4) h.add_edge(22,21) h.add_edge(23,21) h.add_edge(24,21) h.add_edge(25,21) h.add_edge(26,21) h.add_edge(27,21) h.add_edge(28,21) h.add_edge(29,21) h.add_edge(30,21) h.add_edge(31,21) h.add_edge(32,21) h.add_edge(5, 21) ew = h.new_edge_property("double") ew.a = [4] * len(h.get_edges()) h.ep['weight'] = ew state = gt.minimize_blockmodel_dl( h, state_args=dict(recs=[h.ep.weight],rec_types=["discretebinomial"]) ) b = state.get_blocks() b = gt.perfect_prop_hash([b])[0] gt.graph_draw( h, edge_text=h.ep.weight, vertex_size=20, vertex_fill_color=b, ink_scale=0.9, bg_color=[1,1,1,1] ) _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool 
Administrator

Am 16.07.20 um 00:49 schrieb Dominik Schlechtweg:
> Hi Tiago, > > we noticed that with certain weighted graphs minimize_blockmodel_dl() tends to put hubs (vertices with many edges) into the same cluster. Please find a minimal example below, which produces the clustered graph in the attached plot. This happens even if edge weights are distributed uniformly over edges. Is this intended behavior? > > We wonder if a possible explanation could be that the WSBM is fit to predict edge weights *as well as edge probabilities*. (Compare to formulas (1) and (4) in [1].) Hence, vertices with similar degrees tend to end up in the same cluster, if the edge weights do not contradict this. Is this correct? > This has nothing to do with having weights or not; if you use an unweighted SBM you get the same behavior. This clustering makes sense under the model, because a random multigraph model with the same degree sequence would yield a larger number of connections between the hubs, and between the nodes with smaller degree. See an explanation for this in this paper: https://arxiv.org/abs/2002.07803 Best, Tiago  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (849 bytes) Download Attachment

Tiago de Paula Peixoto <tiago@skewed.de> 
thanks, short followup:
Am 17.07.20 um 12:36 schrieb Tiago de Paula Peixoto: > Am 16.07.20 um 00:49 schrieb Dominik Schlechtweg: >> Hi Tiago, >> >> we noticed that with certain weighted graphs minimize_blockmodel_dl() tends to put hubs (vertices with many edges) into the same cluster. Please find a minimal example below, which produces the clustered graph in the attached plot. This happens even if edge weights are distributed uniformly over edges. Is this intended behavior? >> >> We wonder if a possible explanation could be that the WSBM is fit to predict edge weights *as well as edge probabilities*. (Compare to formulas (1) and (4) in [1].) Hence, vertices with similar degrees tend to end up in the same cluster, if the edge weights do not contradict this. Is this correct? >> > > This has nothing to do with having weights or not; if you use an > unweighted SBM you get the same behavior. > is there a way to suppress the likelihood of the edge probabilities as in [2] where the alphaparameter can be used to fit "only to the weight information"? (Compare to formula (4) in [2].) > [...] > [2] C. Aicher, A. Z. Jacobs, and A. Clauset. 2014. Learning latent block structure in weighted networks. Journal of Complex Networks, 3(2):221–248. How does the graphtools implementation relate to the alphaparameter in formula (4)? Is it equivalent to giving equal weight to edge probabilities and weights (alpha = 0.5)? > > This clustering makes sense under the model, because a random multigraph > model with the same degree sequence would yield a larger number of > connections between the hubs, and between the nodes with smaller degree. > > See an explanation for this in this paper: https://arxiv.org/abs/2002.07803 Is it possible to use LatentMultigraphBlockState() with a weighted graph? > > Best, > Tiago > > > > _______________________________________________ > graphtool mailing list > [hidden email] > https://lists.skewed.de/mailman/listinfo/graphtool > _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool 0x67336E43281F46CA.asc (693 bytes) Download Attachment signature.asc (235 bytes) Download Attachment 
Administrator

Am 17.07.20 um 14:19 schrieb Dominik Schlechtweg:
>> is there a way to suppress the likelihood of the edge probabilities as in [2] where the alphaparameter can be used to fit "only to the weight information"? (Compare to formula (4) in [2].) >> [...] >> [2] C. Aicher, A. Z. Jacobs, and A. Clauset. 2014. Learning latent block structure in weighted networks. Journal of Complex Networks, 3(2):221–248. > > How does the graphtools implementation relate to the alphaparameter in formula (4)? Is it equivalent to giving equal weight to edge probabilities and weights (alpha = 0.5)? This parameter is not implemented in graphtool. Note that such a parameter does not have an obvious interpretation from a generative modelling point of view, specially in a Bayesian way. We cannot just introduce adhoc parameters to cancel certain parts of the likelihood, without paying proper attention to issues of normalization, etc, and expect things to behave consistently. In other words, I do not fully agree with the alpha parameter of Aicher et al. > Is it possible to use LatentMultigraphBlockState() with a weighted graph? Not yet. Best, Tiago  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (849 bytes) Download Attachment

Tiago de Paula Peixoto <tiago@skewed.de> 
Am 17.07.20 um 19:44 schrieb Tiago de Paula Peixoto: > Am 17.07.20 um 14:19 schrieb Dominik Schlechtweg: >>> is there a way to suppress the likelihood of the edge probabilities as in [2] where the alphaparameter can be used to fit "only to the weight information"? (Compare to formula (4) in [2].) >>> [...] >>> [2] C. Aicher, A. Z. Jacobs, and A. Clauset. 2014. Learning latent block structure in weighted networks. Journal of Complex Networks, 3(2):221–248. >> >> How does the graphtools implementation relate to the alphaparameter in formula (4)? Is it equivalent to giving equal weight to edge probabilities and weights (alpha = 0.5)? > > This parameter is not implemented in graphtool. > > Note that such a parameter does not have an obvious interpretation from > a generative modelling point of view, specially in a Bayesian way. We > cannot just introduce adhoc parameters to cancel certain parts of the > likelihood, without paying proper attention to issues of normalization, > etc, and expect things to behave consistently. > > In other words, I do not fully agree with the alpha parameter of Aicher > et al. > >> Is it possible to use LatentMultigraphBlockState() with a weighted graph? > > Not yet. We will open a feature request then, in case there is none yet. > > Best, > Tiago > > > _______________________________________________ > graphtool mailing list > [hidden email] > https://lists.skewed.de/mailman/listinfo/graphtool > _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool 0x67336E43281F46CA.asc (693 bytes) Download Attachment signature.asc (235 bytes) Download Attachment 
Administrator

Am 17.07.20 um 20:35 schrieb Dominik Schlechtweg:> Thanks for clarifying
this. Last question: Does your doubt also > concern the special case where alpha = 0, i.e., ignoring edge > probabilities completely? (This is the actually interesting case for > us. We are not interested in tuning this parameter in any way.) I'd have to look at this more closely, but I think this is strange since the edges define the support of the distribution. The more correct thing to do, in my opinion, would be to marginalize over the edges. This might end up being equivalent o setting alpha=0, but I would need to check.  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (849 bytes) Download Attachment

Tiago de Paula Peixoto <tiago@skewed.de> 
Am 17.07.20 um 20:44 schrieb Tiago de Paula Peixoto: > Am 17.07.20 um 20:35 schrieb Dominik Schlechtweg:> Thanks for clarifying > this. Last question: Does your doubt also >> concern the special case where alpha = 0, i.e., ignoring edge >> probabilities completely? (This is the actually interesting case for >> us. We are not interested in tuning this parameter in any way.) > > I'd have to look at this more closely, but I think this is strange since > the edges define the support of the distribution. The more correct thing > to do, in my opinion, would be to marginalize over the edges. This might > end up being equivalent o setting alpha=0, but I would need to check. > > > _______________________________________________ > graphtool mailing list > [hidden email] > https://lists.skewed.de/mailman/listinfo/graphtool > _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool 0x67336E43281F46CA.asc (693 bytes) Download Attachment signature.asc (235 bytes) Download Attachment 
Free forum by Nabble  Edit this page 