Memory requirement estimate for a large graph

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Memory requirement estimate for a large graph

linello
Dear Tiago,

How is it possible to get an estimate for the memory requirement of a graph in graph-tool?
I know that graph-tool is built upon C++ and Boost, and the adjacency list is stored via a hash-map.
Apart from the cost of storing the values of vertices indices and edges indices as `unsigned long`, what is the memory overhead of the structures used in storing the graph?

For example, for a network of 1M vertices and 100M links without attributes, how much real memory should I plan to use, excluding temporaries?

Sorry if the question is repeated, but I could not find it in the previous mailing list posts.

Regards,
Carlo


_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Memory requirement estimate for a large graph

Tiago Peixoto
Administrator
Am 06.07.20 um 13:00 schrieb Carlo Nicolini:
> Dear Tiago,
>
> How is it possible to get an estimate for the memory requirement of a
> graph in graph-tool?

Yes, it is, and I should put this in the documentation somewhere.

> I know that graph-tool is built upon C++ and Boost, and the adjacency
> list is stored via a hash-map.

Not quite, we use an adjacency list using std::vector<>.

> Apart from the cost of storing the values of vertices indices and edges
> indices as `unsigned long`, what is the memory overhead of the
> structures used in storing the graph?

We use a vector-based bidirectional adjacency list, so each edge appears
twice. Each edge is comprised of two size_t (uint64_t) values, for the
target/source and the edge index, so we need 32 bytes per edge.

For each vertex we need a std::vector<> which is 24 bytes and a uint64_t
to separate the in-/out-lists, so we also need 32 bytes per node.

Therefore we need in total:

    (N + E) * 32 bytes

> For example, for a network of 1M vertices and 100M links without
> attributes, how much real memory should I plan to use, excluding
> temporaries?

That would be:

   3232000000 bytes = 3.01 GB

In practice you will need a little more, since std::vector<> tends to
over-commit.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

signature.asc (849 bytes) Download Attachment
--
Tiago de Paula Peixoto <tiago@skewed.de>
Reply | Threaded
Open this post in threaded view
|

Re: Memory requirement estimate for a large graph

linello
Many thanks Tiago for the quick answer.

I've tried (using `htop` command) to measure the RES memory requirement for such a graph with 1M nodes and 100M links, but the results is almost twice the size, a total of 6.2 GB.
Is there a reason why I get that figure?
I am on MacOS 10.15.6 Catalina



Il giorno lun 6 lug 2020 alle ore 14:41 Tiago de Paula Peixoto <[hidden email]> ha scritto:
Am 06.07.20 um 13:00 schrieb Carlo Nicolini:
> Dear Tiago,
>
> How is it possible to get an estimate for the memory requirement of a
> graph in graph-tool?

Yes, it is, and I should put this in the documentation somewhere.

> I know that graph-tool is built upon C++ and Boost, and the adjacency
> list is stored via a hash-map.

Not quite, we use an adjacency list using std::vector<>.

> Apart from the cost of storing the values of vertices indices and edges
> indices as `unsigned long`, what is the memory overhead of the
> structures used in storing the graph?

We use a vector-based bidirectional adjacency list, so each edge appears
twice. Each edge is comprised of two size_t (uint64_t) values, for the
target/source and the edge index, so we need 32 bytes per edge.

For each vertex we need a std::vector<> which is 24 bytes and a uint64_t
to separate the in-/out-lists, so we also need 32 bytes per node.

Therefore we need in total:

    (N + E) * 32 bytes

> For example, for a network of 1M vertices and 100M links without
> attributes, how much real memory should I plan to use, excluding
> temporaries?

That would be:

   3232000000 bytes = 3.01 GB

In practice you will need a little more, since std::vector<> tends to
over-commit.

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>

_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool
Reply | Threaded
Open this post in threaded view
|

Re: Memory requirement estimate for a large graph

Tiago Peixoto
Administrator
Am 06.07.20 um 15:53 schrieb Carlo Nicolini:
> Many thanks Tiago for the quick answer.
>
> I've tried (using `htop` command) to measure the RES memory requirement
> for such a graph with 1M nodes and 100M links, but the results is almost
> twice the size, a total of 6.2 GB.
> Is there a reason why I get that figure?

Hard to say without a minimal working example. Try saving the network to
disk, and loading it again from a newly started interpreter.

I tried on my machine, and I got 4.5 GB. As I explained, std::vector<>
may over-commit the allocations.

--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

signature.asc (849 bytes) Download Attachment
--
Tiago de Paula Peixoto <tiago@skewed.de>