using internal properties with string type to filter graph

mvdnheuv
Hi,

I am working on a very large graph of companies and wanted to make some
functions to easily filter out certain subgraphs I would need for some
calculations.

So I made a graph G and populated it with nodes and edges and some internal
property maps because I don't want to always remake this graph. The point is
to just write the complete graph out to a file once I get through all the
data cleaning and have a final graph in a file.

So this gives me:
G, a <Graph object, directed, with 11944189 vertices and 7828750 edges at
0x7f49254e90f0>
with
G.list_properties()
    ID                  (vertex)  (type: string)
    company_country     (edge)    (type: string)
    shareholder_country (edge)    (type: string)
    shareholderdirect   (edge)    (type: double)

Now I just want to do a filtering based on these properties as suggested
earlier in this forum:
g_AT = GraphView(G, efilt=G.ep.company_country.a == 'AT')

But as mentioned in the docs for internal properties: "Internal graph
property maps behave slightly differently. Instead of returning the property
map object, the value itself is returned from the dictionaries"

Which I guess is why running G.ep.company_country.a gives me None, and
running G.ep['company_country'][G.edges().next()] gives me 'AT'.

So for filtering I now do:

ep_filter = G.new_ep('bool')
for e in G.edges():
    ep_filter[e] = G.ep['company_country'][e] == 'AT'

But I was wondering whether there is a way to avoid going through this edge
by edge and instead get the whole PropertyArray returned, which would be
more elegant and avoid having to constantly make new boolean properties.

PS: for the double type, G.ep.shareholderdirect.a does give me a nice
PropertyArray which I can directly use in the form of
G.ep.shareholderdirect.a > .5, which gives me an easy to use filtering array
to input into GraphViews.

Thanks in advance,
Milan




--
Sent from: http://main-discussion-list-for-the-graph-tool-project.982480.n3.nabble.com/
_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

Re: using internal properties with string type to filter graph

Tiago Peixoto
Administrator
On 12.05.20 at 14:38, mvdnheuv wrote:

> Now I just want to do a filtering based on these properties as suggested
> earlier in this forum:
> g_AT = GraphView(G, efilt=G.ep.company_country.a == 'AT')
>
> But as mentioned in the docs for internal properties: "Internal graph
> property maps behave slightly differently. Instead of returning the property
> map object, the value itself is returned from the dictionaries"
>
> Which I guess is why running G.ep.company_country.a gives me None, and
> running G.ep['company_country'][G.edges().next()] gives me 'AT'.

This has nothing to do with whether the property maps are internal. It is
not possible to obtain an array of a string-type property map (internal
or not) because its storage is not contiguous in memory. The same is true
for vector and Python object types.

I think the best alternative for you is to convert the strings to numeric
codes that you store in a dictionary, so that you can do:

   g.ep.company_country_code.a == code["AT"]
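
A minimal sketch of how such a conversion might look, assuming the graph has
an internal string edge property as in the original post. The helper names
build_codes and add_code_property are hypothetical and not part of graph-tool;
only g.ep, g.edges() and g.new_ep() are graph-tool calls.

```python
def build_codes(values):
    """Map each distinct string to an integer code, in first-seen order."""
    codes = {}
    for value in values:
        codes.setdefault(value, len(codes))
    return codes


def add_code_property(g, src, dst):
    """Store integer codes for the string edge property `src` under `dst`.

    Assumption: `g` is a graph_tool.Graph with an internal string edge
    property named `src`. Returns the string -> code dictionary.
    """
    strings = g.ep[src]
    # First pass: collect the distinct strings and assign codes.
    codes = build_codes(strings[e] for e in g.edges())
    # Second pass: write the codes into a new integer edge property.
    code_prop = g.new_ep("int")
    for e in g.edges():
        code_prop[e] = codes[strings[e]]
    # Make the property internal so it is saved together with the graph.
    g.ep[dst] = code_prop
    return codes
```

With the graph from the original post this might be used as
code = add_code_property(G, "company_country", "company_country_code"),
followed by GraphView(G, efilt=G.ep.company_country_code.a == code["AT"]).
The edge loop still runs once, but the integer property is stored with the
graph and can be reused for every later filter. The code dictionary is a
plain Python dict and would need to be kept separately.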

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>

