Saturday, 17 May 2014 00:00

The Maxwell Distribution





Suppose we are given a certain quantity of an ideal gas at some fixed temperature, and we want to know what sort of distribution of velocities to associate with this gas.
That is, given a range of velocities, \(\Delta v = v_\beta - v_\alpha\), what is the number of molecules, \(\Delta n\), with velocities in the region of phase space \(\Delta v = \Delta v_x \Delta v_y \Delta v_z\)?

Sunday, 12 February 2012 22:57

Standard Deviations and Cuttlefish



What's the difference between sample standard deviation and population standard deviation? When do we use N-1 and when N in the denominator?

Saturday, 23 January 2010 22:09

Population Distribution vs. Sample Distribution





Saturday, 23 January 2010 22:05

Sample Size: Proportion


Saturday, 23 January 2010 00:00

Sample Standard Deviation


The sample standard deviation uses almost the same formula as the population standard deviation (lowercase sigma), except that the denominator calls for N - 1 (one less than the number of values given) instead of N.
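
To see what the N - 1 denominator does in practice, here is a minimal sketch in Python. The function name std_dev and its sample flag are made up for this illustration, and the data is borrowed from the five basketball player heights used elsewhere on this page.

```python
import math

# Heights of the five basketball players from the example on this page.
heights = [76, 78, 79, 81, 86]

def std_dev(values, sample=True):
    """Standard deviation using N - 1 (sample) or N (population) in the denominator.

    The name and the 'sample' flag are illustrative choices, not a standard API.
    """
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    denominator = n - 1 if sample else n
    return math.sqrt(squared_deviations / denominator)

sample_sd = std_dev(heights, sample=True)       # divides by N - 1 = 4
population_sd = std_dev(heights, sample=False)  # divides by N = 5
print(round(sample_sd, 4), round(population_sd, 4))
```

The sample version always comes out a little larger than the population version for the same numbers, because it divides the same sum by a smaller denominator. Python's built-in statistics module offers the same pair as stdev (N - 1) and pstdev (N).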



Saturday, 23 January 2010 03:01

Sampling Distribution of the Sample Mean


I admit: "sampling distribution of the sample mean" sounds a little creepy, not only because the term is too long-winded for its own good, but also because it feels like you're running in an endless loop. 


The best way to explain this one is to give an example:


Suppose you have a population of 5 basketball players:

A, B, C, D and E. 

Let us suppose that their respective heights are:

76, 78, 79, 81 and 86


If we had a sample size of 2, then we would be able to derive the following combinations of these players and their heights:

SAMPLE (size 2)    HEIGHTS    X-Bar Values
A, B               76, 78     77.0
A, C               76, 79     77.5
A, D               76, 81     78.5
A, E               76, 86     81.0
B, C               78, 79     78.5
B, D               78, 81     79.5
B, E               78, 86     82.0
C, D               79, 81     80.0
C, E               79, 86     82.5
D, E               81, 86     83.5

The X-bar column values represent the Sampling Distribution of the Sample Mean, because they are the MEAN of the values for each SAMPLE.


Now let's try a different sample size.  Let's try a sample size of 4.

SAMPLE (size 4)    HEIGHTS           X-Bar Values
A, B, C, D         76, 78, 79, 81    78.50
A, B, C, E         76, 78, 79, 86    79.75
A, B, D, E         76, 78, 81, 86    80.25
A, C, D, E         76, 79, 81, 86    80.50
B, C, D, E         78, 79, 81, 86    81.00

The X-bar column values represent the Sampling Distribution of the Sample Mean, because they are the MEAN of the values for each SAMPLE.


And that's all it is!
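
If you would rather let a computer do the listing, the two tables can be rebuilt with a short Python sketch. The function name sampling_distribution_of_mean is made up for this illustration; the players and heights are the ones from the example.

```python
from itertools import combinations
from statistics import mean

# The population: five players and their heights.
heights = {"A": 76, "B": 78, "C": 79, "D": 81, "E": 86}

def sampling_distribution_of_mean(population, sample_size):
    """Map every possible sample (a tuple of player names) to its sample mean."""
    return {
        names: mean(population[name] for name in names)
        for names in combinations(population, sample_size)
    }

for size in (2, 4):
    print(f"SAMPLE (size {size})")
    for names, x_bar in sampling_distribution_of_mean(heights, size).items():
        print(", ".join(names), f"{x_bar:.2f}")
```

One thing worth noticing: if you average all the X-bar values for either sample size, you get 80, which is exactly the mean of the whole population.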

Saturday, 23 January 2010 02:57

The Distribution of a Statistic


The distribution of a statistic is officially called the sampling distribution of the statistic.

Broken down a little bit further, the distribution of a statistic is all possible values of the statistic across every possible sample of a given size. Try not to get too crazed by all the fancy lingo when first starting out in a Stat course. Check out our section on "What's with the Greek?" for more definitions broken down.
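
The same idea works for any statistic, not just the mean. As a rough sketch, here is the sampling distribution of the median for samples of size 3, using the five basketball player heights from elsewhere on this page. The function name sampling_distribution is made up for this illustration.

```python
from itertools import combinations
from statistics import median

# Heights of the five basketball players from the example on this page.
heights = [76, 78, 79, 81, 86]

def sampling_distribution(values, sample_size, statistic):
    """Compute the statistic for every possible sample of the given size."""
    return sorted(statistic(sample) for sample in combinations(values, sample_size))

medians = sampling_distribution(heights, 3, median)
print(medians)  # one median for each of the 10 possible samples
```

Swap median for any other function of a sample (max, or a range function of your own) and you get the sampling distribution of that statistic instead.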

Saturday, 23 January 2010 00:00

Z-Score Chart



This is the standard type of table you will see in most statistics textbooks.

If you are allowed to use a calculator for calculating Z-scores and areas under the curve, I suggest you glance at this to get familiar with what it is, and MOVE ON. 

If you are NOT allowed to use a calculator, it would be a good idea to get friendly with this table - and FAST.  During an exam, the last thing you want to be worrying about is figuring out how to find your way around this thing!
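
If you want to check your table lookups while you practice, a few lines of Python will give you the same areas. This is only a sketch: the helper names z_score and area_to_the_left are made up here, and the numbers in the example (a value of 86 with a mean of 80 and a standard deviation of 4) are hypothetical.

```python
from statistics import NormalDist

# The standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def z_score(x, mean, std_dev):
    """How many standard deviations x sits above (or below) the mean."""
    return (x - mean) / std_dev

def area_to_the_left(z):
    """Area under the standard normal curve to the left of z."""
    return standard_normal.cdf(z)

# Hypothetical example: x = 86, mean = 80, standard deviation = 4.
z = z_score(86, 80, 4)
print(z, round(area_to_the_left(z), 4))
```

Most textbook tables list the area to the left of z, rounded to four decimal places, so the rounded result should match the table entry for z = 1.50. If your table lists a different area (for example, from 0 to z), adjust accordingly.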


Friday, 22 January 2010 23:54

Mean vs. Median - Which is Better?



Both the mean and the median are measures of center. 

If you have a symmetrical set of data (for example, IF THE NUMBERS IN THE SET ARE EVENLY SPACED), the mean and the median will be EXACTLY THE SAME.

Here is WHY:

If you have a data set: 25, 50, 75

MEAN = (25 + 50 + 75) / 3 = 150 / 3 = 50

MEDIAN = 50 (the number bang in the center)

Both values are the same. 

When dealing with skewed data sets (when the numbers bunch up on one side and a few extreme values stretch out the other), it is better to use the median to express the center. The median is RESISTANT to extreme values.


Here is WHY:

If you have a data set: 20, 50, 100

MEAN = (20 + 50 + 100) / 3 = 170 / 3 = 56.67 (rounded)

MEDIAN = 50

If we make this set even more extreme: 10, 50, 150

MEAN = (10 + 50 + 150) / 3 = 210 / 3 = 70

MEDIAN = 50

No matter how we change the values in this set, if the middle number is 50, the MEDIAN will be 50.  ALWAYS.  

The mean is SENSITIVE to every value in the set, and therefore works best where the data is roughly symmetrical with no extreme values (a normal distribution, for example).

I always remembered this by memorizing that we are all "sensitive to mean [people]" - but whatever works for you!
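
To watch the mean move while the median stays put, here is a quick sketch in Python using the three data sets from this article.

```python
from statistics import mean, median

# The three data sets from the article, from evenly spaced to most extreme.
data_sets = [
    [25, 50, 75],
    [20, 50, 100],
    [10, 50, 150],
]

results = [(data, round(mean(data), 2), median(data)) for data in data_sets]

for data, data_mean, data_median in results:
    print(data, "mean:", data_mean, "median:", data_median)
```

The mean climbs from 50 to 56.67 to 70 as the largest value gets more extreme, while the median is 50 every time.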
