Hard Light Productions Forums
Off-Topic Discussion => General Discussion => Topic started by: SF-Junky on December 25, 2015, 05:05:48 am
-
I suck so hard when it comes to statistics and econometrics, I should just give it up. But can the smart people around here explain one thing to me, please?
The probability density function can be used to calculate the probability that a randomly drawn value x lies within a particular interval by integrating the density over that interval. So, the smaller the integral, the smaller the probability that x lies within that interval, yes? Consequently, the p.d.f. should be platykurtic when the variance is higher and leptokurtic when it is lower than for a standard normal distribution.
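A minimal sketch of the "probability as an integral of the p.d.f." idea, assuming a zero-mean normal density. The function names are made up for illustration. It integrates the density over [-1, 1] with the trapezoidal rule for three standard deviations and compares the result with the closed form from the error function. It only shows that a wider normal puts less probability into a fixed central interval; it says nothing about kurtosis by itself.

```python
import math

def normal_pdf(x, sigma=1.0):
    """Density of N(0, sigma^2) at x."""
    z = x / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def prob_in_interval(a, b, sigma=1.0, steps=10_000):
    """Trapezoidal-rule integral of the N(0, sigma^2) density over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (normal_pdf(a, sigma) + normal_pdf(b, sigma))
    for i in range(1, steps):
        total += normal_pdf(a + i * h, sigma)
    return total * h

def prob_exact(a, b, sigma=1.0):
    """Same probability via the normal CDF, written with math.erf."""
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))
    return cdf(b) - cdf(a)

for sigma in (0.5, 1.0, 2.0):
    numeric = prob_in_interval(-1.0, 1.0, sigma)
    exact = prob_exact(-1.0, 1.0, sigma)
    print(f"sigma={sigma}: integral={numeric:.4f}, exact={exact:.4f}")
```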
But now all the textbooks tell me that the more outliers there are in the sample, the bigger the kurtosis, and therefore leptokurtosis implies a larger number of outliers. That fully makes sense if you just look at the formula for kurtosis. But: doesn't a larger share of outliers also imply a bigger variance?
:confused:
[attachment DELETED!! by Strong Bad]
-
There seem to be a couple of competing definitions of what kurtosis actually measures, but this image from Wikipedia (usual grain of salt applies) seems to explain it: https://en.wikipedia.org/wiki/File:Standard_symmetric_pdfs.png
The description states that these distributions all have a variance of 1 and different kurtosis. For two distributions with equal variance, the one with higher kurtosis will have more outliers - that is, more of the variance is from the "tail" of the distribution rather than the "shoulder" (for instance, look at the uniform distribution - it has the same variance as all the others, but no outliers/no tail).
From what I've gathered, it seems like kurtosis is mostly useful for measuring distributions with the same variance to get more info on how they're actually shaped, rather than just their size.
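To make the "same variance, different kurtosis" point concrete, here is a rough simulation sketch. It assumes three zero-mean distributions scaled to variance 1 (uniform, normal, Laplace), which are my own pick and not necessarily the ones in the Wikipedia image. The helper names are hypothetical, and the kurtosis computed here is the plain fourth standardized moment, not the excess version.

```python
import math
import random

def moments(xs):
    """Return (variance, kurtosis) using population-style moments."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m2, m4 / m2 ** 2

def draw_samples(n=200_000, seed=0):
    """Draw three samples that all have theoretical variance 1."""
    rng = random.Random(seed)
    half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has variance 1
    rate = math.sqrt(2.0)        # Laplace with scale 1/sqrt(2) has variance 1
    return {
        "uniform": [rng.uniform(-half_width, half_width) for _ in range(n)],
        "normal": [rng.gauss(0.0, 1.0) for _ in range(n)],
        # The difference of two i.i.d. exponentials is Laplace distributed.
        "laplace": [rng.expovariate(rate) - rng.expovariate(rate) for _ in range(n)],
    }

results = {name: moments(xs) for name, xs in draw_samples().items()}
for name, (var, kurt) in results.items():
    print(f"{name:8s} variance={var:.3f} kurtosis={kurt:.2f}")
```

The theoretical kurtosis values are 1.8 for the uniform, 3 for the normal, and 6 for the Laplace distribution, so the sample values should land near those while every sample variance stays close to 1.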
[EDIT] I'm not sure where the image you posted came from, but (again, just from what I've gathered) it appears to actually be wrong - in fact it looks like it's varying the variance, not the kurtosis of its distribution.
-
Quote: [EDIT] I'm not sure where the image you posted came from, but (again, just from what I've gathered) it appears to actually be wrong - in fact it looks like it's varying the variance, not the kurtosis of its distribution.
I made this myself using R. And yes, it only changes the variance. But obviously a change in the variance implies a change in the kurtosis - which makes sense analytically.
My intuitive problem here is this: how can one increase the number of outliers while the overall variance of the variable remains the same?
...
No, wait, it's pretty obvious, no? Then the variance in the sample without outliers has to decrease.
Maaaan, I'm just too dumb for this. :D
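The idea that the non-outlier part has to tighten up can be sketched with a two-component normal mixture. This is an assumed toy model, not anything from the textbooks mentioned above: with probability p_out a value comes from a wide "outlier" normal, otherwise from a narrow "core" normal, and the core variance is chosen so the total variance stays at 1. The function name and parameters are hypothetical. The moments are computed analytically, using the fact that the fourth moment of N(0, s^2) is 3*s^4.

```python
def mixture_moments(p_out, sigma_out, total_var=1.0):
    """Zero-mean mixture of a core normal and an outlier normal.

    Returns (core_variance, total_variance, kurtosis), where the core
    variance is solved so that the mixture variance equals total_var.
    """
    var_out = sigma_out ** 2
    var_in = (total_var - p_out * var_out) / (1.0 - p_out)
    if var_in <= 0.0:
        raise ValueError("outliers alone already exceed the target variance")
    variance = (1.0 - p_out) * var_in + p_out * var_out
    # Fourth moment of a zero-mean normal with variance v is 3 * v**2.
    fourth = 3.0 * ((1.0 - p_out) * var_in ** 2 + p_out * var_out ** 2)
    return var_in, variance, fourth / variance ** 2

for p_out in (0.0, 0.02, 0.05, 0.10):
    var_in, variance, kurt = mixture_moments(p_out, sigma_out=3.0)
    print(f"p_out={p_out:.2f} core_var={var_in:.3f} "
          f"total_var={variance:.3f} kurtosis={kurt:.2f}")
```

As the outlier share goes up, the core variance drops and the kurtosis climbs, while the total variance stays pinned at 1.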
-
Quote: My intuitive problem here is this: how can one increase the number of outliers while the overall variance of the variable remains the same?
I'd say the Wikipedia image answers that - if it's correct, the variance of all the distributions shown is the same. But some of them, for instance the uniform distribution, get all their variance from moderate deviations from the mean (the shoulder), whereas others get much of theirs from a few extreme deviations (outliers/the tail).
I'm still not sure that comparing the kurtosis of distributions with different variance is even that useful in the first place. Some of the online lessons I've found suggest that it's not.
Anyway, if the distributions in your image are all normal (which again, it looks like they are), then they all have a kurtosis of 0 (the normal dist is mesokurtic). You can't change the kurtosis by changing the variance of a normal distribution - you have to use a different distribution, since the shape of the distribution (not its variance) is what kurtosis seems to measure.
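A quick way to check that rescaling doesn't touch the kurtosis, sketched under the assumption that we simply multiply one simulated normal sample by different standard deviations. The helper names are made up. The script prints both the plain kurtosis (fourth standardized moment) and the excess kurtosis (that value minus 3), since both conventions are in use.

```python
import random

def variance(xs):
    """Population-style variance of a sample."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def kurtosis(xs):
    """Fourth central moment divided by the squared variance."""
    mean = sum(xs) / len(xs)
    m2 = sum((x - mean) ** 2 for x in xs) / len(xs)
    m4 = sum((x - mean) ** 4 for x in xs) / len(xs)
    return m4 / m2 ** 2

rng = random.Random(1)
base = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

for sigma in (0.5, 1.0, 3.0):
    scaled = [sigma * x for x in base]  # sample from N(0, sigma^2)
    k = kurtosis(scaled)
    print(f"sigma={sigma}: variance={variance(scaled):.3f} "
          f"kurtosis={k:.3f} excess={k - 3.0:.3f}")
```

The variance changes by a factor of sigma squared, but the kurtosis column is the same in every row up to floating-point rounding, because the scale factor cancels between the fourth moment and the squared variance.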
/me hopes this is all correct and he hasn't made a fool of himself for when a real stats nerd finds the thread
-
The WP article mentions that all normal distributions have kurtosis equal to 3, and a normal distribution can have any variance you like.
-
Ah yes, the image I linked is using the "excess kurtosis" (which, it seems, is just kurtosis - 3).