Hi,
the documentation describes that GraphView can be given a unary function to filter vertices or edges. I have tried that and it seems to fail. My GraphView has not the expected vertices and edges. However, my assumption is that the filter is only evaluated one time (at initialization). Let me make an example: ``` g = graph_tool.Graph() a = graph_tool.GraphView(g, vfilt=...) fill_the_graph(g) do_stuff_with(a) # < here a does not contain any data ``` From the documentation, it seems that graph_tool constructs a property from the filter function and uses this for filtering (therefore also needing O(N)), but fill this property only on construction. Can you mention this in the documentation as a hint or warning? Maybe also a recalculate function for GraphView is meaningful that evaluates the lambda function again. Best, Gerion _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (673 bytes) Download Attachment 
Administrator

Am 06.01.20 um 13:53 schrieb Gerion Entrup:
> Hi, > > the documentation describes that GraphView can be given a unary function > to filter vertices or edges. > > I have tried that and it seems to fail. My GraphView has not the > expected vertices and edges. > > However, my assumption is that the filter is only evaluated one time > (at initialization). > > Let me make an example: > > ``` > g = graph_tool.Graph() > a = graph_tool.GraphView(g, vfilt=...) > > fill_the_graph(g) > do_stuff_with(a) # < here a does not contain any data > ``` > > From the documentation, it seems that graph_tool constructs a property > from the filter function and uses this for filtering (therefore also > needing O(N)), but fill this property only on construction. Can you > mention this in the documentation as a hint or warning? Right, this is entirely expected behavior. It seems obvious to me in the documentation, but I will make it more explicit. Note that it would be completely unreasonable performancewise to populate the filter property map lazily on demand. Note also that if you had modified `a` instead of `g` in your example, the filtering would behave as expected (i.e. new vertices or edges would appear in the graph view). > Maybe also a recalculate function for GraphView is meaningful that > evaluates the lambda function again. I don't think this is good design. GraphViews are supposed to be cheap objects that can be constructed on demand. If the filtering needs to be redone, then a new GraphView should be constructed, maybe even composed from the older one. I.e. in your example you would recreate `a` after you had modified `g`. Best, Tiago  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool

Tiago de Paula Peixoto <tiago@skewed.de> 
Am Montag, 6. Januar 2020, 15:11:53 CET schrieb Tiago de Paula Peixoto:
> Am 06.01.20 um 13:53 schrieb Gerion Entrup: > > Hi, > > > > the documentation describes that GraphView can be given a unary function > > to filter vertices or edges. > > > > I have tried that and it seems to fail. My GraphView has not the > > expected vertices and edges. > > > > However, my assumption is that the filter is only evaluated one time > > (at initialization). > > > > Let me make an example: > > > > ``` > > g = graph_tool.Graph() > > a = graph_tool.GraphView(g, vfilt=...) > > > > fill_the_graph(g) > > do_stuff_with(a) # < here a does not contain any data > > ``` > > > > From the documentation, it seems that graph_tool constructs a property > > from the filter function and uses this for filtering (therefore also > > needing O(N)), but fill this property only on construction. Can you > > mention this in the documentation as a hint or warning? > > Right, this is entirely expected behavior. It seems obvious to me in the > documentation, but I will make it more explicit. > Note that it would be completely unreasonable performancewise to > populate the filter property map lazily on demand. Only kind of. It should be feasible to populate the property on demand (only for the nodes/edges requested), but cache them and only recalculate them if a graph change is done and only for the changed vertices/edges. Then overall, it should be an O(N) operation again (with N = amount of all vertices/edges, even the deleted ones). > Note also that if you had modified `a` instead of `g` in your example, > the filtering would behave as expected (i.e. new vertices or edges would > appear in the graph view). > > > Maybe also a recalculate function for GraphView is meaningful that > > evaluates the lambda function again. > > I don't think this is good design. GraphViews are supposed to be cheap > objects that can be constructed on demand. If the filtering needs to be > redone, then a new GraphView should be constructed, maybe even composed > from the older one. I.e. in your example you would recreate `a` after > you had modified `g`. A somewhat related but other question. Currently, I use lambdas only to match for enum (int) values of properties, because my property can have three variants instead of two, e.g.: ``` from enum import IntEnum class TypeEnum(IntEnum): Type_A = 1 Type_B = 2 Type_C = 3 g = graph_tool.Graph() g.vertex_properties['type'] = g.new_vp('int') v = g.add_vertex() g.vp.type[v] = TypeEnum.Type_C g_view = graph_tool.GraphView(g, vfilt=lambda x: g.vp.type[x] == TypeEnum.Type_C) ``` This works with the behavior described above. I guess, the same filter directly in C++ would be really efficient. What do you think of adding C++Filters? One possible syntax could be: ``` from graph_tool.filter import Filter, Equal, Lesser g_view1 = graph_tool.GraphView(vfilt=Filter(Equal(g.vp.type, TypeEnum.Type_C))) g_view2 = graph_tool.GraphView(vfilt=Filter(Equal(g.vp.type, 2))) g_view3 = graph_tool.GraphView(vfilt=Filter(Lesser(g.vp.type, 3))) ``` Of course they need some constraints: 1. The comparison can only done between two properties or a constant and a property 2. Only basic operations (<, >, <=, >=, ==, !=) are possible. Maybe also boolean operations (and, or). Best, Gerion _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (673 bytes) Download Attachment 
Administrator

Am 08.01.20 um 15:33 schrieb Gerion Entrup:
> >> Note that it would be completely unreasonable performancewise to >> populate the filter property map lazily on demand. > Only kind of. It should be feasible to populate the property on demand > (only for the nodes/edges requested), but cache them and only recalculate > them if a graph change is done and only for the changed vertices/edges. > Then overall, it should be an O(N) operation again (with N = amount of > all vertices/edges, even the deleted ones). The point is that this would require the GraphView to know and be updated when the underlying graph changes, and it would tie _access_ to the filtered graph (even from C++) to function calls to the Pythonside filter function. > A somewhat related but other question. Currently, I use lambdas only to > match for enum (int) values of properties, because my property can have > three variants instead of two, e.g.: > ``` > from enum import IntEnum > > class TypeEnum(IntEnum): > Type_A = 1 > Type_B = 2 > Type_C = 3 > > g = graph_tool.Graph() > g.vertex_properties['type'] = g.new_vp('int') > > v = g.add_vertex() > > g.vp.type[v] = TypeEnum.Type_C > > g_view = graph_tool.GraphView(g, vfilt=lambda x: g.vp.type[x] == TypeEnum.Type_C) > ``` A much more efficient approach would be to use the numpy array interface to property maps, instead of a lambda function: g_view = GraphView(g, vfilt=g.vp.type.fa == TypeEnum.Type_C) The equal comparison is done in C, and hence is much faster. > This works with the behavior described above. I guess, the same filter directly > in C++ would be really efficient. What do you think of adding C++Filters? > > One possible syntax could be: > ``` > from graph_tool.filter import Filter, Equal, Lesser > g_view1 = graph_tool.GraphView(vfilt=Filter(Equal(g.vp.type, TypeEnum.Type_C))) > g_view2 = graph_tool.GraphView(vfilt=Filter(Equal(g.vp.type, 2))) > g_view3 = graph_tool.GraphView(vfilt=Filter(Lesser(g.vp.type, 3))) > ``` > Of course they need some constraints: > 1. The comparison can only done between two properties or a constant and a property > 2. Only basic operations (<, >, <=, >=, ==, !=) are possible. Maybe also boolean > operations (and, or). All of this is completely unnecessary once you remember that the numpy array interface exists, which already implements all of this and more. Best, Tiago  Tiago de Paula Peixoto <[hidden email]> _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool

Tiago de Paula Peixoto <tiago@skewed.de> 
Am Mittwoch, 8. Januar 2020, 18:49:47 CET schrieb Tiago de Paula Peixoto:
> Am 08.01.20 um 15:33 schrieb Gerion Entrup: > > > >> Note that it would be completely unreasonable performancewise to > >> populate the filter property map lazily on demand. > > Only kind of. It should be feasible to populate the property on demand > > (only for the nodes/edges requested), but cache them and only recalculate > > them if a graph change is done and only for the changed vertices/edges. > > Then overall, it should be an O(N) operation again (with N = amount of > > all vertices/edges, even the deleted ones). > > The point is that this would require the GraphView to know and be > updated when the underlying graph changes, and it would tie _access_ to > the filtered graph (even from C++) to function calls to the Pythonside > filter function. > > > A somewhat related but other question. Currently, I use lambdas only to > > match for enum (int) values of properties, because my property can have > > three variants instead of two, e.g.: > > ``` > > from enum import IntEnum > > > > class TypeEnum(IntEnum): > > Type_A = 1 > > Type_B = 2 > > Type_C = 3 > > > > g = graph_tool.Graph() > > g.vertex_properties['type'] = g.new_vp('int') > > > > v = g.add_vertex() > > > > g.vp.type[v] = TypeEnum.Type_C > > > > g_view = graph_tool.GraphView(g, vfilt=lambda x: g.vp.type[x] == TypeEnum.Type_C) > > ``` > > A much more efficient approach would be to use the numpy array interface > to property maps, instead of a lambda function: > > g_view = GraphView(g, vfilt=g.vp.type.fa == TypeEnum.Type_C) > > The equal comparison is done in C, and hence is much faster. > > > This works with the behavior described above. I guess, the same filter directly > > in C++ would be really efficient. What do you think of adding C++Filters? > > > > One possible syntax could be: > > ``` > > from graph_tool.filter import Filter, Equal, Lesser > > g_view1 = graph_tool.GraphView(vfilt=Filter(Equal(g.vp.type, TypeEnum.Type_C))) > > g_view2 = graph_tool.GraphView(vfilt=Filter(Equal(g.vp.type, 2))) > > g_view3 = graph_tool.GraphView(vfilt=Filter(Lesser(g.vp.type, 3))) > > ``` > > Of course they need some constraints: > > 1. The comparison can only done between two properties or a constant and a property > > 2. Only basic operations (<, >, <=, >=, ==, !=) are possible. Maybe also boolean > > operations (and, or). > > All of this is completely unnecessary once you remember that the numpy > array interface exists, which already implements all of this and more. Best, Gerion _______________________________________________ graphtool mailing list [hidden email] https://lists.skewed.de/mailman/listinfo/graphtool signature.asc (673 bytes) Download Attachment 
Free forum by Nabble  Edit this page 