Mode, Median, & Mean
Univariate statistics are not designed to describe relationships among
multiple variables. Instead, they focus on a single variable, hence the name
univariate. They are used to give information about the way the scores in a
variable are distributed. Among univariate statistics, the most common are
those referred to as "averages," which are more correctly labeled measures of
central tendency. As the name suggests, measures of central tendency focus on
the middle of the data. The three main measures are the mode, the median, and
the mean.
The first measure of central tendency to discuss is the mode. It is the
simplest measure to find, and it can be very useful.
The term mode basically stands for "the most": it is defined as the
category with the greatest number of cases.
Because one can find the mode just by looking at the data and counting, this
average makes no arithmetic requirements.
No matter how many numbers there are, if one value appears even one more time
than any other value, it is the mode of that set of data. While there is
usually only one mode for a set of data and it is simple to find, things get
somewhat more complicated when two or more categories are tied for the largest
number of cases. In that case, there is more than one mode. If two categories
are tied, the distribution is labeled bimodal. If three are tied, it is labeled
trimodal. If four or more categories are tied, it is labeled multimodal. The
mode is very useful because it requires only nominal-level data, which means
it can be used with any level of data. One example of its usefulness is that it
identifies the most frequent value, such as the most frequent crime. A search
for the modal category of index crimes in the UCR would give us the answer
(which is theft, by the way). You could also use the mode to predict the result
of rolling a pair of dice (as in the game of craps or even Monopoly). If you
list all of the combinations that produce each possible sum, seven is the mode.
Seven can be rolled in six different ways, which is more than any
other sum. With that being said, the mode
can be useful in predicting categories, or even outcomes. That is pretty useful
for such a simple statistic.
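To make the counting concrete, here is a minimal Python sketch of finding the mode, including ties. The helper name `modes` and the short list of offenses are made up for illustration (they are not real UCR figures); the dice part simply lists all 36 outcomes of two six-sided dice.

```python
from collections import Counter
from itertools import product


def modes(values):
    """Return every value tied for the highest count, in first-seen order."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


# Hypothetical nominal-level data, invented for this example.
offenses = ["theft", "burglary", "theft", "assault", "theft", "burglary"]
print(modes(offenses))  # ['theft']

# Two values tied for the most cases: a bimodal distribution.
print(modes([1, 1, 2, 2, 3]))  # [1, 2]

# Every possible sum from rolling two six-sided dice (36 outcomes).
sums = [a + b for a, b in product(range(1, 7), repeat=2)]
print(modes(sums))       # [7]
print(Counter(sums)[7])  # 6 ways to roll a seven
```

Notice that nothing here adds or averages the values themselves; the code only counts how often each one appears, which is why the mode works even for categories like crime types.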
The next measure of central tendency is the median, which marks the
middle of a set of data. The median is
defined as the midpoint case in an ordered distribution. To obtain the median,
one must place all of the values in a set of data in order from smallest to
largest. Once this is done, one simply finds the middle value by counting. It
is simplest to find the median when the number of values in the set is
odd, because there is one value in the exact middle. For example, if the set
has the numbers 0, 3, 5, 7, and 9, there are 5 numbers. The third
number is then the median, which is 5.
However, finding the median becomes a little more
complicated when the number of values in a set is even, because two values
then share the middle of the data. If a set listed
the numbers 1 through 12, then 6 and 7 would share the middle. When this
happens, the median is referred to as the artificial median, meaning that it is
not an actual value in the data, but the halfway point between the two middle
numbers. The median of the list
containing the numbers 1 through 12 would then be 6.5, because that is halfway
between 6 and 7. If one wanted to locate
the median using a formula, they could use (n + 1) / 2, where n represents the
number of cases. The result is the position of the median in the ordered list,
not the median itself. For the 12 numbers above, (12 + 1) / 2 = 6.5, which
points halfway between the 6th and 7th values.
This formula works well with large data sets; if the data set is
small, it is easier to determine the median visually.
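The steps for finding the median can be sketched in a few lines of Python. This is only an illustration of the procedure described above: the function name `median` is my own, and it treats (n + 1) / 2 as a 1-based position in the sorted list.

```python
def median(values):
    """Return the midpoint case of the values after sorting them."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")

    position = (n + 1) / 2  # 1-based position of the median
    if n % 2 == 1:
        # Odd count: the position is a whole number, so one value sits there.
        return ordered[int(position) - 1]

    # Even count: the position falls between two values, so take the
    # halfway point between them (the "artificial" median).
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2


print(median([0, 3, 5, 7, 9]))  # 5
print(median(range(1, 13)))     # 6.5
```

The sorting step is the part that needs ordinal-level data: without a meaningful order, there is no middle to find.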
Unlike the mode, the median requires at least ordinal-level data, which
is why the values must be placed in order.
This is also why the median is referred to as the ordinal-level
average. One benefit of the median is that it is not influenced by extreme
cases in the data set. If a data set
contains the numbers 7, 11, 14, 16, 20, 25, and 50, the number 50 is considered
an extreme. The median here is 16, which makes it a useful average because it
fits the rest of the data set and represents it well. Any time an extreme is
present, the median is usually the wiser average to use.
The last measure of central tendency to discuss is the mean. The mean is the
most commonly used average. It is defined as the arithmetic
average of a set of scores. Basically,
the mean is a calculated score that requires at least interval-level data
because it uses “real” values or magnitudes.
That is the main thing to keep in mind about the mean: it cannot be
calculated with only nominal or ordinal-level data, because it is an arithmetic
average and you cannot meaningfully perform arithmetic (add, subtract,
multiply, or divide) without interval-level data. Any
time there is interval or ratio data, the mean is an excellent measure of
central tendency. The mean is taught in
school and is widely known, which is why it is most commonly used. Students and
teachers both use the mean to average a final grade in a class over a certain
period of time. The mean is calculated by summing all of
the scores (or numbers) in a set and dividing the total by the number of
scores. For example, if a student made
an 80, 96, 93, and 89 on four tests, the student or teacher would find the
final grade by adding the four scores to get a total of 358, and dividing
by four because there are four tests graded.
The final grade would be 89.5. The
way the mean is calculated gives it one very important property because of its
arithmetic base. It is the one point that
is closest to not one, or two, but all of the scores in a set of data taken
together: the distances of the scores above the mean exactly balance the
distances of the scores below it.
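The test-score example can be checked with a short Python sketch. The second half is my own reading of "closest to all of the scores": I am assuming it means that the deviations from the mean cancel out and that the mean gives the smallest sum of squared distances. The helper `sum_squared_distance` is a name made up for this illustration.

```python
scores = [80, 96, 93, 89]

# Sum the scores, then divide by how many there are.
total = sum(scores)
mean = total / len(scores)
print(total, mean)  # 358 89.5

# Deviations above and below the mean cancel each other out.
deviations = [score - mean for score in scores]
print(sum(deviations))  # 0.0


def sum_squared_distance(point, values):
    """Total squared distance from one candidate point to every value."""
    return sum((value - point) ** 2 for value in values)


# Nearby candidate points are farther from the scores overall than the mean.
for candidate in (89, mean, 90):
    print(candidate, sum_squared_distance(candidate, scores))
# 89 146
# 89.5 145.0
# 90 146
```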
The mean is the most common average used, although it can be a problem. Unlike
the median, the mean can be greatly affected
by an extreme in the data set. If a data
set contains the scores 1, 5, 7, 10, 2, 1, 4, 11, 8, 127, and 3, the total is
179. After dividing by 11, the mean
equals about 16.3, which is higher than every score in the data set except the
extreme itself. An extreme throws the average way off, and
can create a poor representation of a data set.
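To see how much one extreme score matters, the same data set can be run through Python's built-in `statistics` module. Dropping the score of 127 for the comparison is my own addition, purely for illustration.

```python
import statistics

scores = [1, 5, 7, 10, 2, 1, 4, 11, 8, 127, 3]

mean_with = statistics.mean(scores)
median_with = statistics.median(scores)
print(sum(scores), round(mean_with, 2), median_with)  # 179 16.27 5

# Same data with the extreme score removed, for comparison only.
without_extreme = [score for score in scores if score != 127]
mean_without = statistics.mean(without_extreme)
median_without = statistics.median(without_extreme)
print(round(mean_without, 2), median_without)  # 5.2 4.5
```

The single extreme moves the mean from about 5.2 to about 16.3, while the median only shifts from 4.5 to 5.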
This shows once more that when a data set contains an extreme, it is wiser to
use the median instead of the mean. Other than needing interval or
ratio-level data and the problem with extremes, the mean is
the most popular average and is also very easily interpreted. It requires no
ordering, just direct calculation. More importantly, the mean
is said to be the foundation of the more sophisticated and most powerful
statistics.
While the mean is a measure of central tendency used in univariate statistics, it can also be used to compare two or more groups of individuals and tell whether they are alike or different. Because this comparison is so simple and most people understand the mean, it is commonly used. The mean compares two or more groups by using their central points. For example, one study used the mean to determine whether males or females were more likely to be victimized by drunk drivers. The study showed that males had a mean occurrence of lifetime DWI victimization of 0.16, and females had a mean occurrence of 0.14. From this information, it appears that males are slightly more likely than females to be a DWI victim in their lifetime.
All three of the main measures of central tendency, mode, median, and mean, are easily calculated and simple to understand. It is important that one understands these terms when looking for an average or a proper representation of any set of data. While measures of central tendency can be used to compare two or more groups of individuals to see if they are alike or different, as univariate statistics they describe a single variable at a time. It is important to know which measure of central tendency to use with which set of data, so that poor representations do not appear in research or when presenting studies to someone else, or even the public.
My information came from Chapter 4: Measures of Central Tendency. I feel that the author did an excellent job explaining each of the measurements throughout the chapter. Each type was explained in great detail. The author gave thorough examples, and also listed the problems or "glitches" of each different average. Being someone who already knew about these measures of central tendency, I thought I would be bored throughout this chapter; however, the author wrote in a way that made it simple yet interesting, and it kept me captivated. I can also see that if someone had no previous knowledge of the subject matter, they would be able to follow along easily, stay captivated, and learn the material to a decent understanding. I can honestly say that I did not have a problem with this author's writing, and I hope that when people read mine in this blog, they feel the same way.
Sounds like you're off to a great start, I hope to see some pictures in the future!