Both the mean and the median are measures of center.

If you have a symmetrical set of data (for example, IF THE NUMBERS IN THE SET ARE EVENLY SPACED), the mean and the median will be EXACTLY THE SAME.

**Here is WHY:**

If you have a data set: 25, 50, 75

MEAN = (25 + 50 + 75) / 3 = 150 / 3 = 50

**MEDIAN = 50 (the number bang in the center)**

**Both values are the same.**

When dealing with skewed data sets (when the values stretch much farther from the center on one side than on the other), it is better to use the median to express the center. The median is RESISTANT to extreme values.

**Here is WHY:**

If you have a data set: 20, 50, 100

MEAN = (20 + 50 + 100) / 3 = 170 / 3 ≈ 56.67

**MEDIAN = 50**

If we make this set even more extreme: 10, 50, 150

MEAN = (10 + 50 + 150) / 3 = 210 / 3 = 70

**MEDIAN = 50**

**No matter how we change the smallest and largest values in this set, if the middle number is 50, the MEDIAN will be 50. ALWAYS.**

**The mean is SENSITIVE to every value in the set, so it works best where the data is roughly symmetric (for example, normally distributed) with no extreme values.**
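If you want to check this for yourself, here is a minimal Python sketch that runs the three example sets from above through the standard `statistics` module. The helper name `summarize` and the labels are just made up for this illustration.

```python
from statistics import mean, median

# The three example sets from the text; each has 50 as its middle value.
data_sets = {
    "evenly spaced": [25, 50, 75],
    "skewed": [20, 50, 100],
    "more extreme": [10, 50, 150],
}


def summarize(values):
    """Return (mean, median) for a list of numbers."""
    return mean(values), median(values)


for label, values in data_sets.items():
    m, med = summarize(values)
    # The median stays at 50 every time; the mean moves with the outer values.
    print(f"{label}: mean = {m:.2f}, median = {med}")
```

The median column should read 50 on every line, while the mean drifts as the end values change.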

I always remembered this by memorizing that we are all "__sensitive__ to __mean__ [people]" - but whatever works for you!

**The Greek lowercase letter for "M" (μ, pictured above on the right) is called "mu" and is pronounced "mew."**

**This symbol represents the mean of a data set (strictly speaking, the mean of a POPULATION).**

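The EMPIRICAL RULE, otherwise known as the 68.26-95.44-99.74 RULE (often shortened to the 68-95-99.7 rule), applies to data that is NORMALLY DISTRIBUTED (bell-shaped). It says the following: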

**1) About 68.26% of all observed data values will fall within ONE standard deviation of the mean (to the RIGHT or LEFT).**

**2) About 95.44% of all observed data values will fall within TWO standard deviations of the mean (to the RIGHT or LEFT).**

**3) About 99.74% of all observed data values will fall within THREE standard deviations of the mean (to the RIGHT or LEFT).**

This is what the illustrated version of the Empirical Rule looks like:

EXAMPLE:

**If we are told that the mean of our data is 100, and the standard deviation is 10, then we know the following:**

**1) 68.26% of our data will fall between 90 and 110.**

**2) 95.44% of our data will fall between 80 and 120.**

**3) 99.74% of our data will fall between 70 and 130.**
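The ranges in this example can be reproduced with a short Python sketch. It assumes the data is exactly normally distributed and uses `NormalDist` from the standard `statistics` module (Python 3.8 or later); the function name `empirical_ranges` is only for illustration. The exact percentages it prints differ in the last digit from the 68.26-95.44-99.74 figures, which come from a rounded z-table.

```python
from statistics import NormalDist


def empirical_ranges(mu, sigma):
    """Return {k: (low, high, share)} for k = 1, 2, 3 standard deviations.

    share is the fraction of a normal distribution lying between low and high.
    """
    dist = NormalDist(mu, sigma)
    ranges = {}
    for k in (1, 2, 3):
        low, high = mu - k * sigma, mu + k * sigma
        share = dist.cdf(high) - dist.cdf(low)
        ranges[k] = (low, high, share)
    return ranges


# Mean of 100 and standard deviation of 10, as in the example above.
for k, (low, high, share) in empirical_ranges(100, 10).items():
    print(f"within {k} SD: {low} to {high} -> {share:.2%}")
```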

Disclaimer: I did not create nor do I own these videos. I have simply embedded them, courtesy of YouTube.

**NOTE: If a sample size is greater than 30, it is USUALLY (though not always) large enough for the Central Limit Theorem to apply, meaning the distribution of the sample means will be approximately normal.**
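To see the idea in action, here is a rough simulation sketch in Python. It is not a proof, just an illustration under assumptions I picked myself: the population is exponential (heavily skewed) with mean 1, each sample has 30 values, and the function name `sample_means` is made up for this example.

```python
import random
from statistics import mean, stdev


def sample_means(sample_size, repetitions, seed=0):
    """Draw repeated samples from a skewed (exponential) population
    and return a list holding the mean of each sample."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        mean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(repetitions)
    ]


means = sample_means(sample_size=30, repetitions=2000)

# The population mean is 1, so the sample means should cluster around 1.
print(f"average of sample means: {mean(means):.3f}")
# Theory says their spread should be close to 1 / sqrt(30), roughly 0.18.
print(f"spread of sample means:  {stdev(means):.3f}")
```

Even though the individual values are skewed, the sample means pile up around the population mean in a much more bell-shaped way.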

**A SAMPLE is a sub-set of the POPULATION.**

**A SAMPLE is drawn to represent the population, negating the need to conduct an extensive census.**

**An example of a sample would be:**

You decide you want to take a survey of the student body at your school. Without a team of helpers, it will be nearly impossible to survey EVERYONE in a short period of time. So instead, you decide to draw a SIMPLE RANDOM SAMPLE, which you determine is representative of the population.

**Studying and drawing CONCLUSIONS from a sample would be a heck of a lot easier than trying to survey (and study) every person in the POPULATION.**
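Here is a small Python sketch of what drawing a SIMPLE RANDOM SAMPLE could look like. Everything in it is made up for illustration: the population is 1,000 randomly generated student ages, and the sample size of 50 is an arbitrary choice.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the example is repeatable

# Hypothetical population: the ages of 1,000 students (made-up data).
population = [rng.randint(14, 18) for _ in range(1000)]

# Simple random sample: every student has the same chance of being picked,
# and no student is picked twice.
sample = rng.sample(population, k=50)

print(f"population mean age: {mean(population):.2f}")
print(f"sample mean age:     {mean(sample):.2f}")
```

The two means will usually land close together, which is the whole point of studying 50 students instead of all 1,000.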

**OK, so "population" doesn't exactly merit a "wordy definition" on its own. But when we think of "population" we often think of the U.S. population, such as is recorded by the U.S. Census.**

**This is not too far off-the-mark. According to Wikipedia: "A population can be defined as including all people or items with the characteristic one wishes to understand."**

**More simply put, a statistical POPULATION is the POOL from which a SAMPLE can be drawn.**

**POPULATIONS can often be large, making studies overly complex, time-consuming and expensive. This is why we draw a SAMPLE and go to great lengths to find a SAMPLE that is REPRESENTATIVE of the POPULATION. Studying the SAMPLE instead of the entire POPULATION saves a great deal of time.**

This is a great video because it gives walk-throughs of z-score calculations from homework problems. You may not have these exact problems, but the same concepts can be applied to your own work!

These examples rely on the Z-Score Formula: z = (x - μ) / σ

MEMORIZE this formula, make sure you know it COLD!

If you do not know what the "m-like" symbol (μ, the mean) or the "o" with a tail (σ, the standard deviation) are, check out "What's with the Greek?"

**Sometimes we need a standardized scale to measure a value's distance from the center.**

**A Z-score indicates how many STANDARD DEVIATIONS a value is from the mean.**

**The official formula is: z = (x - μ) / σ, where x is the value, μ is the mean, and σ is the standard deviation.**

**So let's say the MEAN is 100 and the Standard Deviation is 15.**

**If you are given a value of 132, you just plug that into the formula above.**

**132 - 100 = 32**

**32 / 15 ≈ 2.133**

**VOILA - Your Z-Score is 2.133**
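The same calculation as a tiny Python sketch, in case you want to check your homework answers. The function name `z_score` and its parameter names are just my own choices for illustration.

```python
def z_score(value, mean, std_dev):
    """Return how many standard deviations `value` lies from `mean`."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (value - mean) / std_dev


# Mean of 100, standard deviation of 15, value of 132 (the example above).
z = z_score(132, 100, 15)
print(f"z = {z:.3f}")  # z = 2.133
```

A negative result simply means the value sits below the mean.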

**X Bar (pictured below under Sample Mean) is simply the mean of a given set of sample values.**

**As you will notice, "X Bar" is calculated the same way as the POPULATION MEAN; it is simply computed from a SAMPLE instead of from the whole POPULATION.**

**Read "Sample Mean vs. Population Mean" for more information.**

**The MODE of a data set is simply the number that appears the most often.**

**For example, in this set: [1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17] - The mode is 6. This is a UNIMODAL set, and looks like this:**

In the set: [1, 1, 2, 4, 4] - There are TWO modes (1 and 4), making this set BIMODAL, which looks like this:

**For sets where there are more than TWO modes, the set is called MULTIMODAL.**
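A quick Python sketch for finding the modes of the two example sets, using `multimode` from the standard `statistics` module (Python 3.8 or later). The helper `describe_modes` and its labels are made up for this illustration.

```python
from statistics import multimode


def describe_modes(values):
    """Return (modes, label) where label is unimodal, bimodal or multimodal.

    Note: if every value appears the same number of times, multimode returns
    all of them, so this simple labeling treats that case as multimodal.
    """
    modes = multimode(values)  # every value tied for the highest count
    if len(modes) == 1:
        label = "unimodal"
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label


print(describe_modes([1, 3, 6, 6, 6, 6, 7, 7, 12, 12, 17]))  # ([6], 'unimodal')
print(describe_modes([1, 1, 2, 4, 4]))                       # ([1, 4], 'bimodal')
```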