What formula is used for std in vertex_average and edge_average?


VaSa
I am curious which formula is used to calculate the standard deviation of the average in gt.vertex_average and gt.edge_average.

>>> t2=gt.Graph()
>>> t2.add_vertex(2)
>>> t2.add_edge(t2.vertex(0), t2.vertex(1))
>>> gt.vertex_average(t2, "in")
(0.5, 0.35355339059327373)

Now, shouldn't the std be σ(n)=sqrt(((0-0.5)^2+(1-0.5)^2)/2)=0.5?
Also, σ(n-1)=sqrt((0.5^2+0.5^2)/(2-1))~=0.70710

0.3535 is sqrt(2)/4, which happens to be σ(n-1)/2, so there seems to be some relation to that.

A little bigger graph.
>>> t3=gt.Graph()
>>> t3.add_vertex(5)
>>> t3.add_edge(t3.vertex(0), t3.vertex(1))
>>> gt.vertex_average(t3, "in")
(0.2, 0.17888543819998318)

Now, we should have the series 0,1,0,0,0 for the vertex in-degrees.
Windows Calculator gives σ(n)=0.4 and σ(n-1)~=0.44721, so where does 0.1788854 come from?

The reason I am asking is that I have a large graph where the average looks fine but the std makes no sense: going by the histogram, the degree values are spread out quite a bit more than the std would indicate.






Re: What formula is used for std in vertex_average and edge_average?

Tiago Peixoto
Administrator
Hi there,

On 05/21/2013 01:37 PM, VaSa wrote:
> I am curious what is being used to calculate the standard deviation of the
> average in gt.vertex_average and gt.edge_average

These functions return the standard deviation of *the mean*, not the
standard deviation of the distribution. The former is given by

    \sigma_a = \sigma / sqrt(N)

where \sigma is the standard deviation of the distribution (the σ(n)
value in your examples), and N is the number of samples.
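
As a rough check of this formula, the sketch below reproduces the two values reported in this thread from the raw in-degree sequences, using only the Python standard library rather than graph-tool. It assumes \sigma is the population (divide-by-N) standard deviation, since that is what matches the reported numbers; `std_of_mean` is a hypothetical helper name, not a graph-tool function.

```python
import math
import statistics


def std_of_mean(samples):
    """Return sigma / sqrt(N), where sigma is the population std (divide by N)."""
    n = len(samples)
    sigma = statistics.pstdev(samples)  # assumption: population form, sigma(n)
    return sigma / math.sqrt(n)


# In-degree sequences of the two example graphs from this thread.
in_degrees_t2 = [0, 1]
in_degrees_t3 = [0, 1, 0, 0, 0]

print(statistics.mean(in_degrees_t2), std_of_mean(in_degrees_t2))  # std of mean is about 0.3536
print(statistics.mean(in_degrees_t3), std_of_mean(in_degrees_t3))  # std of mean is about 0.1789
```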

> >>> t2=gt.Graph()
> >>> t2.add_vertex(2)
> >>> t2.add_edge(t2.vertex(0), t2.vertex(1))
> >>> gt.vertex_average(t2, "in")
> (0.5, 0.35355339059327373)
>
> Now, shouldn't std be σ(n)=sqrt(((0-0.5)^2+(1-0.5)^2)/2)=0.5 ?
> also σ(n-1)=sqrt((0.5^2+0.5^2)/(2-1))~=0.70710

The standard deviation of the mean is therefore:

    0.5 / sqrt(2) = 0.35355339059327373...

which is what you see.

> A little bigger graph.
> >>> t3=gt.Graph()
> >>> t3.add_vertex(5)
> >>> t3.add_edge(t3.vertex(0), t3.vertex(1))
> >>> gt.vertex_average(t3, "in")
> (0.2, 0.17888543819998318)
>
> Now, we should have 0,1,0,0,0 series for vertex incoming degree.
> So Windows calc gives σ(n)=0.4 and σ(n-1)~=0.44721, so where does 0.1788854
> come from ?

Again, 0.4 / sqrt(5) = 0.17888543819998318...

> Reason, I am asking because, I have a large graph, where the average looks
> quite alright but the std makes no sense, as going by the histogram, degree
> values are quite a bit more distributed than the std would indicate.

If you want the deviation of the distribution to compare with the
histogram, just multiply by sqrt(N).
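
A minimal sketch of that conversion, applied to the numbers already quoted in this thread. `distribution_std` is a hypothetical helper, and taking N to be the number of vertices (or the number of edges for gt.edge_average) is an assumption, although it matches both examples above.

```python
import math


def distribution_std(std_of_mean, n_samples):
    """Undo the division by sqrt(N) to recover the std of the distribution."""
    return std_of_mean * math.sqrt(n_samples)


# Values reported by gt.vertex_average(t3, "in") for the 5-vertex graph.
mean, std_mean = 0.2, 0.17888543819998318

# With a real graph g, n_samples would be g.num_vertices()
# (assumption: every vertex counts as one sample).
sigma = distribution_std(std_mean, 5)
print(sigma)  # about 0.4, the sigma(n) value computed by hand
```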

Cheers,
Tiago


--
Tiago de Paula Peixoto <[hidden email]>


_______________________________________________
graph-tool mailing list
[hidden email]
http://lists.skewed.de/mailman/listinfo/graph-tool


Re: What formula is used for std in vertex_average and edge_average?

Éverton Fernandes da Cunha
Hi there,

I had the same problem. This thread answered my question, but I have one
more: why is this quantity more important, or reported more often, than
the plain standard deviation of the distribution?
I am just curious, because I had never seen that measure before :)

Thanks,
Éverton




Re: What formula is used for std in vertex_average and edge_average?

Tiago Peixoto
Administrator
On 16.07.20 at 21:45, Éverton Fernandes da Cunha wrote:
> Hi, there
>
> I had the same problem. This thread answered my question, but I have one
> more: why is this quantity more important, or reported more often, than
> the plain standard deviation of the distribution?

Because we want to express the uncertainty of the mean, not of the
distribution.

> I am just curious, because I had never seen that measure before :)

https://en.wikipedia.org/wiki/Standard_deviation#Standard_deviation_of_the_mean
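
To illustrate the difference, here is a small sketch in plain Python (not graph-tool). It repeats the 0,1,0,0,0 in-degree sequence from earlier in the thread several times, so the shape of the distribution stays the same while the number of samples grows. The population (divide-by-N) standard deviation is assumed, and `spread_and_uncertainty` is a hypothetical helper name.

```python
import math
import statistics

base = [0, 1, 0, 0, 0]  # in-degree sequence from the 5-vertex example


def spread_and_uncertainty(samples):
    """Return (std of the distribution, std of the mean) for the samples."""
    sigma = statistics.pstdev(samples)  # assumption: population form
    return sigma, sigma / math.sqrt(len(samples))


for copies in (1, 4, 100):
    samples = base * copies  # same distribution, more samples
    sigma, sigma_mean = spread_and_uncertainty(samples)
    print(len(samples), sigma, sigma_mean)
```

The spread of the distribution stays at about 0.4 in every row, while the uncertainty of the mean shrinks as 1/sqrt(N). This is why, on a large graph, the reported value can look much smaller than the width of the degree histogram.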
