mirror of https://github.com/KatolaZ/NetBunch
You can not select more than 25 topics
Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
136 lines
4.6 KiB
136 lines
4.6 KiB
.\" generated with Ronn/v0.7.3
|
|
.\" http://github.com/rtomayko/ronn/tree/0.7.3
|
|
.
|
|
.TH "FITMLE" "1" "September 2017" "www.complex-networks.net" "www.complex-networks.net"
|
|
.
|
|
.SH "NAME"
|
|
\fBfitmle\fR \- Fit a set of values with a power\-law distribution
|
|
.
|
|
.SH "SYNOPSIS"
|
|
\fBfitmle\fR \fIdata_in\fR [\fItol\fR [TEST [\fInum_test\fR]]]
|
|
.
|
|
.SH "DESCRIPTION"
|
|
\fBfitmle\fR fits the data points contained in the file \fIdata_in\fR with a power\-law function P(k) ~ k, using the Maximum\-Likelihood Estimator (MLE)\. In particular, \fBfitmle\fR finds the exponent \fBgamma\fR and the minimum of the values provided on input for which the power\-law behaviour holds\. The second (optional) argument \fItol\fR sets the acceptable statistical error on the estimate of the exponent\.
|
|
.
|
|
.P
|
|
If \fBTEST\fR is provided, the program associates a p\-value to the goodness of the fit, based on the Kolmogorov\-Smirnov statistics computed on \fInum_test\fR sampled distributions from the theoretical power\-law function\. If \fInum_test\fR is not provided, the test is based on 100 sampled distributions\.
|
|
.
|
|
.SH "PARAMETERS"
|
|
.
|
|
.TP
|
|
\fIdata_in\fR
|
|
Set of values to fit\. If is equal to \fB\-\fR (dash), read the set from STDIN\.
|
|
.
|
|
.TP
|
|
\fItol\fR
|
|
The acceptable statistical error on the estimation of the exponent\. If omitted, it is set to 0\.1\.
|
|
.
|
|
.TP
|
|
TEST
|
|
If the third parameter is \fBTEST\fR, the program computes an estimate of the p\-value associated to the best\-fit, based on \fInum_test\fR synthetic samples of the same size of the input set\.
|
|
.
|
|
.TP
|
|
\fInum_test\fR
|
|
Number of synthetic samples to use for the estimation of the p\-value of the best fit\.
|
|
.
|
|
.SH "OUTPUT"
|
|
If \fBfitmle\fR is given less than three parameters (i\.e\., if \fBTEST\fR is not specified), the output is a line in the format:
|
|
.
|
|
.IP "" 4
|
|
.
|
|
.nf
|
|
|
|
gamma k_min ks
|
|
.
|
|
.fi
|
|
.
|
|
.IP "" 0
|
|
.
|
|
.P
|
|
where \fBgamma\fR is the estimate for the exponent, \fBk_min\fR is the smallest of the input values for which the power\-law behaviour holds, and \fBks\fR is the value of the Kolmogorov\-Smirnov statistics of the best\-fit\.
|
|
.
|
|
.P
|
|
If \fBTEST\fR is specified, the output line contains also the estimate of the p\-value of the best fit:
|
|
.
|
|
.IP "" 4
|
|
.
|
|
.nf
|
|
|
|
gamma k_min ks p\-value
|
|
.
|
|
.fi
|
|
.
|
|
.IP "" 0
|
|
.
|
|
.P
|
|
where \fBp\-value\fR is based on \fInum_test\fR samples (or just 100, if \fInum_test\fR is not specified) of the same size of the input, obtained from the theoretical power\-law function computed as a best fit\.
|
|
.
|
|
.SH "EXAMPLES"
|
|
Let us assume that the file \fBAS\-20010316\.net_degs\fR contains the degree sequence of the data set \fBAS\-20010316\.net\fR (the graph of the Internet at the AS level in March 2001)\. The exponent of the best\-fit power\-law distribution can be obtained by using:
|
|
.
|
|
.IP "" 4
|
|
.
|
|
.nf
|
|
|
|
$ fitmle AS\-20010316\.net_degs
|
|
Using discrete fit
|
|
2\.06165 6 0\.031626 0\.17
|
|
$
|
|
.
|
|
.fi
|
|
.
|
|
.IP "" 0
|
|
.
|
|
.P
|
|
where \fB2\.06165\fR is the estimated value of the exponent \fBgamma\fR, \fB6\fR is the minimum degree value for which the power\-law behaviour holds, and \fB0\.031626\fR is the value of the Kolmogorov\-Smirnov statistics of the best\-fit\. The program is also telling us that it decided to use the discrete fitting procedure, since all the values in \fBAS\-20010316\.net_degs\fR are integers\. The latter information is printed to STDERR\.
|
|
.
|
|
.P
|
|
It is possible to compute the p\-value of the estimate by running:
|
|
.
|
|
.IP "" 4
|
|
.
|
|
.nf
|
|
|
|
$ fitmle AS\-20010316\.net_degs 0\.1 TEST
|
|
Using discrete fit
|
|
2\.06165 6 0\.031626 0\.17
|
|
$
|
|
.
|
|
.fi
|
|
.
|
|
.IP "" 0
|
|
.
|
|
.P
|
|
which provides a p\-value equal to 0\.17, meaning that 17% of the synthetic samples showed a value of the KS statistics larger than that of the best\-fit\. The estimation of the p\-value here is based on 100 synthetic samples, since \fInum_test\fR was not provided\. If we allow a slightly larger value of the statistical error on the estimate of the exponent \fBgamma\fR, we obtain different values of \fBgamma\fR and \fBk_min\fR, and a much higher p\-value:
|
|
.
|
|
.IP "" 4
|
|
.
|
|
.nf
|
|
|
|
$ fitmle AS\-20010316\.net_degs 0\.15 TEST 1000
|
|
Using discrete fit
|
|
2\.0585 19 0\.0253754 0\.924
|
|
$
|
|
.
|
|
.fi
|
|
.
|
|
.IP "" 0
|
|
.
|
|
.P
|
|
Notice that in this case, the p\-value of the estimate is equal to 0\.924, and is based on 1000 synthetic samples\.
|
|
.
|
|
.SH "SEE ALSO"
|
|
deg_seq(1), power_law(1)
|
|
.
|
|
.SH "REFERENCES"
|
|
.
|
|
.IP "\(bu" 4
|
|
A\. Clauset, C\. R\. Shalizi, and M\. E\. J\. Newman\. "Power\-law distributions in empirical data"\. SIAM Rev\. 51, (2007), 661\-703\.
|
|
.
|
|
.IP "\(bu" 4
|
|
V\. Latora, V\. Nicosia, G\. Russo, "Complex Networks: Principles, Methods and Applications", Chapter 5, Cambridge University Press (2017)
|
|
.
|
|
.IP "" 0
|
|
.
|
|
.SH "AUTHORS"
|
|
(c) Vincenzo \'KatolaZ\' Nicosia 2009\-2017 \fB<v\.nicosia@qmul\.ac\.uk>\fR\.
|
|
|