Model Selection across Distributions

kicasta
Hi all,

I have a question regarding model selection with different distributions.
When we want to decide which partition best describes the data for a
given distribution, we choose the one that gives the smallest entropy. However,
say we want to compare two different distributions d1 and d2, where the best fit
for d1 gives an entropy value of e1 and the best fit for d2 gives e2. If e1 < e2,
can we say that d1 describes our data better than d2?

Best Regards,
Enrique Castaneda



--
Sent from: https://nabble.skewed.de/
_______________________________________________
graph-tool mailing list
[hidden email]
https://lists.skewed.de/mailman/listinfo/graph-tool

Re: Model Selection across Distributions

Tiago Peixoto
Administrator
On 30.11.20 at 10:29, kicasta wrote:
> Hi all,
>
> I have a question regarding model selection with different distributions.
> When we want to decide which partition best describes the data for a
> given distribution, we choose the one that gives the smallest entropy. However,
> say we want to compare two different distributions d1 and d2, where the best fit
> for d1 gives an entropy value of e1 and the best fit for d2 gives e2. If e1 < e2,
> can we say that d1 describes our data better than d2?

Could you be more specific about which "distributions" you are
referring to? Are you talking about edge covariates?

If so, model selection is explained here:

https://graph-tool.skewed.de/static/doc/demos/inference/inference.html#id28

In this case, the entropy* by itself is not enough; you also have to
consider the derivative terms, as explained at the link above.

(* The term "entropy" is actually misleading in this context, since the
value refers to a log-density rather than a log-probability.)
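
As an illustration of where such a derivative term comes from, here is a minimal sketch in plain Python (no graph-tool calls). It assumes the weights x are modelled by fitting a normal distribution to y = ln(x); the weights, mu and sigma are hypothetical values chosen only for the example. Because dy/dx = 1/x, turning the density over y back into a density over x costs ln(x) per edge, which is the same form as the log(g.ep.weight.a).sum() correction in the linked documentation.

```python
import math

def normal_logpdf(y, mu, sigma):
    """Log-density of Normal(mu, sigma) evaluated at y."""
    return (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (y - mu) ** 2 / (2.0 * sigma ** 2))

def lognormal_logpdf(x, mu, sigma):
    """Log-density of a log-normal variable x, written out explicitly."""
    return (-math.log(x * sigma * math.sqrt(2.0 * math.pi))
            - (math.log(x) - mu) ** 2 / (2.0 * sigma ** 2))

# Hypothetical positive edge weights and parameters, for illustration only.
weights = [0.5, 1.2, 3.4, 10.0]
mu, sigma = 0.3, 1.1

# Log-likelihood of the transformed values y = ln(x) under a normal model.
ll_transformed = sum(normal_logpdf(math.log(x), mu, sigma) for x in weights)

# Derivative (Jacobian) term: dy/dx = 1/x, so each weight contributes ln(x).
derivative_term = sum(math.log(x) for x in weights)

# Log-likelihood expressed as a density over the original weights x.
ll_original = ll_transformed - derivative_term

# Cross-check against the explicit log-normal density.
ll_direct = sum(lognormal_logpdf(x, mu, sigma) for x in weights)
print(ll_original, ll_direct)
```

Without the correction, the two models would be scored as densities over different variables (x for one, ln x for the other), so their values would not be comparable.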

Best,
Tiago

--
Tiago de Paula Peixoto <[hidden email]>

Re: Model Selection across Distributions

kicasta
Hi Tiago,

Yes, I mean edge covariates. In the example you referenced, you compare
state.entropy() for two distributions, namely exponential and
log-normal. For the log-normal model the covariates were scaled,
which is handled by subtracting log(g.ep.weight.a).sum().

Suppose I simply want to compare two models with unscaled discrete
covariates, one using a geometric distribution and one using a
binomial distribution. Can I perform model selection by simply
comparing their state.entropy() values?

Best Regards,
Enrique Castaneda


Re: Model Selection across Distributions

Tiago Peixoto
Administrator
On 30.11.20 at 22:39, Enrique Castaneda wrote:
> Suppose I simply want to compare two models with unscaled discrete
> covariates, one using a geometric distribution and one using a
> binomial distribution. Can I perform model selection by simply
> comparing their state.entropy() values?

Yes. In the case of discrete distributions, the derivative term is not
applicable, so the state.entropy() values can be compared directly.
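
A minimal sketch of that comparison, assuming the two numbers are the state.entropy() values (in nats) of the best fits obtained with each discrete covariate model, and assuming equal prior odds for the two models. The values below are hypothetical placeholders, not results from any real network, and log_posterior_odds is an illustrative helper, not a graph-tool function.

```python
def log_posterior_odds(entropy_a, entropy_b):
    """Log posterior odds in favour of model A over model B.

    Assumes each entropy is the negative log of the joint probability
    (in nats) and that both models have equal prior probability.
    """
    return entropy_b - entropy_a

# Hypothetical entropy values for the two fitted models.
entropy_geometric = 1234.5
entropy_binomial = 1250.0

log_odds = log_posterior_odds(entropy_geometric, entropy_binomial)
preferred = "geometric" if log_odds > 0 else "binomial"
print(log_odds, preferred)
```

The model with the smaller entropy is preferred, and the size of the difference indicates how strongly the data favour it.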

Best,
Tiago
