I'm attempting to use get_edges_prob to find the most likely missing edges out of every possible non-edge. I know the number of possible edges is O(n^2).

Currently I'm sampling like this:

    non_edges_probs = [[] for _ in range(len(non_edges))]

    def collect_edge_probs(s):
        s = s.levels[0]
        for i, non_edge in enumerate(non_edges):
            p = s.get_edges_prob([non_edge], [],
                                 entropy_args=dict(partition_dl=False))
            non_edges_probs[i].append(p)

    gt.mcmc_equilibrate(nested_state,
                        force_niter=100,
                        mcmc_args=dict(niter=10),
                        callback=collect_edge_probs,
                        verbose=True)

Is there a way to speed this up at all? If not, is there a heuristic I can use to reduce the number of possibilities? Currently I'm using vertex similarity measures to cut down the possibilities, but I'm wondering if there is a heuristic involving just the NestedBlockState.

Any help appreciated!

_______________________________________________
graphtool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graphtool

Am 19.03.20 um 18:33 schrieb Deklan Webster:
> I'm attempting to use get_edges_prob to find the most likely missing
> edges out of every possible non-edge. [...] Is there a way to speed
> this up at all? If not, is there a heuristic I can use to reduce the
> number of possibilities?

Pass the actual list at once, instead of iterating through it and
passing lists of size 1. The reason get_edges_prob() exists and accepts
lists is precisely to speed things up in this case.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>

Tiago de Paula Peixoto <tiago@skewed.de> 
I was under the impression that passing a list corresponds to getting the probability that *all* the edges are missing.
Indeed, when I try it out I get back a scalar, not an np array.
I want to collect the probability that each individual edge is missing.

Also, with respect to the heuristics I mentioned, I just saw the paper
"Evaluating Overfit and Underfit in Models of Network Community
Structure" use the similarity score s_ij = θ_i θ_j λ_{g_i,g_j}. If
sampling is not computationally feasible, this is what I had in mind:

1) Is there a way built into graphtool to compute this similarity
function efficiently? (i.e., without Python slowing me down)

2) Is there a hierarchical analog, e.g. summing this similarity at each
level?

Thanks as always

On Mon, Mar 23, 2020 at 10:13 AM Tiago de Paula Peixoto
<[hidden email]> wrote:
> Am 19.03.20 um 18:33 schrieb Deklan Webster:
> [...]
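[For readers of the archive: the θ_i θ_j λ score mentioned above can be sketched outside graph-tool with plain NumPy, using the standard maximum-likelihood degree-corrected SBM estimates (θ_i = k_i / κ_{g_i}, and λ_rs taken as the edge count between groups r and s). The function name and the estimation choices here are illustrative assumptions, not graph-tool API or the paper's exact normalization, so treat the result only as a ranking heuristic.]

    import numpy as np

    def dcsbm_similarity(A, b):
        """Score s_ij = theta_i * theta_j * lam[b[i], b[j]] for all pairs,
        using maximum-likelihood DC-SBM estimates from a symmetric
        adjacency matrix A and an array of block labels b."""
        A = np.asarray(A, dtype=float)
        b = np.asarray(b)
        n = A.shape[0]
        B = b.max() + 1
        k = A.sum(axis=1)                  # node degrees
        M = np.zeros((n, B))               # one-hot block membership (n x B)
        M[np.arange(n), b] = 1.0
        lam = M.T @ A @ M                  # edge counts between group pairs
        kappa = M.T @ k                    # total degree of each group
        kappa[kappa == 0] = 1.0            # avoid division by zero
        theta = k / kappa[b]               # theta_i = k_i / kappa_{g_i}
        return np.outer(theta, theta) * lam[np.ix_(b, b)]

Ranking all non-edges by this score is O(n^2) arithmetic but vectorized, so Python overhead is negligible.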

Am 01.04.20 um 02:40 schrieb Deklan Webster:
> I was under the impression that passing a list corresponds to getting
> the probability that *all* the edges are missing. Indeed, when I try it
> out I get back a scalar, not an np array. I want to collect the
> probability that each individual edge is missing.

Yes, this is true.

> Also, with respect to the heuristics I mentioned, I just saw the paper
> "Evaluating Overfit and Underfit in Models of Network Community
> Structure" use the similarity score s_ij = θ_i θ_j λ_{g_i,g_j}

This is not substantially faster than what is actually computed in
graphtool, it is just less accurate.

> If sampling is not computationally feasible, this is what I had in mind.
>
> 1) Is there a way built into graphtool to compute this similarity
> function efficiently? (i.e., without Python slowing me down)

You should switch to using MeasuredBlockState.get_edge_prob(), which is
implemented entirely in C++. The whole approach described in

   https://graphtool.skewed.de/static/doc/demos/inference/inference.html#measurednetworks

should be preferred over the older binary classification scheme, as it
completely subsumes it.

> 2) Is there a hierarchical analog, like just summing this similarity at
> each level?

Yes, any of the above approaches work in exactly the same way with the
hierarchical model.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>
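[For readers of the archive: a minimal sketch of the measured-network approach recommended above, following the linked documentation. The dataset choice and the uniform measurement counts (n=2, x=2) are illustrative assumptions, not details from this thread.]

    # Sketch: posterior edge probabilities via MeasuredBlockState,
    # assuming graph-tool is installed.
    import graph_tool.all as gt

    g = gt.collection.data["football"]

    # Each observed edge was measured n times and seen x times.
    n = g.new_ep("int", 2)   # number of measurements per edge
    x = g.new_ep("int", 2)   # number of positive observations
    state = gt.MeasuredBlockState(g, n=n, n_default=1, x=x, x_default=0)

    # Equilibrate the joint posterior over partitions and the true network.
    gt.mcmc_equilibrate(state, force_niter=100, mcmc_args=dict(niter=10))

    # Conditional posterior log-probability that a given pair (u, v) is a
    # true edge, computed entirely in C++.
    print(state.get_edge_prob(0, 1))

Unlike the binary-classification scheme, the missing/spurious edge probabilities here come out of a single equilibrated chain, with no per-non-edge Python loop.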
