Sometimes Statistics Lie

Lately, I’ve been noticing that people are more and more often using statistics to mislead instead of using common sense.

For instance, I work with a friend who was very upset that nobody recognized she had increased survey participation by 50% over the prior month. Since I’m a little more mathematically astute, she came to me to check whether her calculations were correct (and to complain). I quickly ran the numbers in my head, and she was correct: she did increase survey participation by 50%.

Now most of us would agree that a 50% increase is fairly impressive, at least from a mathematical perspective!

Statistics Don’t Tell The Entire Story

The catch with my friend is that the number of participants involved with the survey the prior month was only two.  The month that she became involved with the survey results, the number increased to three participants.  So while technically she can claim that her involvement increased the participation rate by 50% versus the previous month… it’s still just one additional person!

To make matters worse, her manager was upset because two months prior the survey participation number was four! So from her manager’s viewpoint, the numbers actually decreased by 25%. While I wholeheartedly agree with my friend’s math, when you’re talking about one or two people, the statistics really don’t matter much. It could be just luck that one more person filled out a survey in her month than in the previous month, or one fewer than two months before.
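Both claims come from the same formula applied to different starting months. Here is a minimal sketch of the arithmetic using the counts from the story (the function name `percent_change` is just my own label for illustration):

```python
def percent_change(old, new):
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100

# My friend's view: last month (2 people) versus her month (3 people).
friend_view = percent_change(2, 3)

# Her manager's view: two months ago (4 people) versus her month (3 people).
manager_view = percent_change(4, 3)

print(friend_view)   # 50.0
print(manager_view)  # -25.0
```

Same month, same three participants, and two opposite headlines, depending only on which month you pick as the starting point.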

Obviously, she was right in her calculations, but a difference of one person doesn’t really matter, especially when the goal is to have hundreds of responses per month instead of single-digit numbers.

I personally find that a floating average, or some other kind of baseline, is best for comparing performance.

For example, with my friend, if the average participation for the surveys is one person per month, then realistically 3 is a phenomenal number and she should be acknowledged for her efforts.
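As a rough sketch of what comparing against a baseline could look like, the snippet below measures the current month against the plain average of earlier months. The helper name `versus_baseline` is hypothetical. The first history uses the two earlier counts from the story (4, then 2), and the second is the made-up "one person per month" scenario.

```python
from statistics import mean

def versus_baseline(history, current):
    """Return (baseline, difference), where baseline is the average of past months."""
    baseline = mean(history)
    return baseline, current - baseline

# The two months we know about from the story: 4 participants, then 2.
story_baseline, story_diff = versus_baseline([4, 2], 3)

# Hypothetical history where one person per month is normal.
quiet_baseline, quiet_diff = versus_baseline([1, 1, 1], 3)

print(story_baseline, story_diff)  # 3 0
print(quiet_baseline, quiet_diff)  # 1 2
```

Against the two known months, three participants is exactly average, so there is neither a win nor a loss to report. Against a history of one per month, the same three would be triple the usual turnout.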

I’m writing this because all too often I see people on all sides of arguments using mathematically correct statistics in a way that promotes their argument without taking into account the historic average of the numbers those statistics represent. This is a common tactic with politicians in general.

Statistics that Use Bad Sampling Sets

Poor Statistical Sample Set

Okay, I’m going to get nerdy on everybody here, so just bear with me… If the sample set isn’t representative of the general population or of the target population, then the statistics drawn from that non-representative sample will be inaccurate. For example, if you count the eye color of 20 Swedish people as your sample set, you’ll derive a number that says a large percentage of everybody has blue eyes. Obviously, we know this isn’t true, but such sampling occasionally happens, especially in politics. The above picture declares Dewey is the new president, but we know he wasn’t elected. This was due to an error in the sample set or the size of the sample set.
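To see the effect in numbers, here is a small simulation. Every figure in it is invented for illustration: a made-up population of 10,000 people in which a small group of 1,000 is 80% blue-eyed and everyone else is 10% blue-eyed. None of these are real eye-color statistics.

```python
import random

def blue_share(people):
    """Fraction of people in the list whose eye color is 'blue'."""
    return sum(1 for eye in people if eye == "blue") / len(people)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population (all numbers are assumptions, not real data).
small_group = ["blue"] * 800 + ["other"] * 200        # 1,000 people, 80% blue
everyone_else = ["blue"] * 900 + ["other"] * 8100     # 9,000 people, 10% blue
population = small_group + everyone_else

# Biased sample: 20 people drawn only from the small group.
biased_sample = rng.sample(small_group, 20)
# Fairer sample: 20 people drawn at random from the whole population.
random_sample = rng.sample(population, 20)

true_share = blue_share(population)
biased_estimate = blue_share(biased_sample)
random_estimate = blue_share(random_sample)

print(f"true share of blue eyes:   {true_share:.0%}")
print(f"estimate from biased 20:   {biased_estimate:.0%}")
print(f"estimate from random 20:   {random_estimate:.0%}")
```

The biased sample badly overstates the true share because it only ever looks at the blue-eyed group. The random sample of 20 is drawn fairly but is still small, so its estimate can bounce around quite a bit from run to run, which is the sample size half of the problem.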

What to Use When Statistics Lie or Are Abused

Try to find unbiased results. This isn’t easy though, since usually the statistics that a person or group presents were collected by them for the presentation. This alone should set off red flags! How accurate can the statistics be if they were gathered specifically for the presentation? Wouldn’t it be kind of silly of them to present statistics that would undermine their cause?

Personally, I try to come to a conclusion based on common sense in such matters. This isn’t easy and is highly subjective, but without a fair and representative sample, it’s just not possible to know what the real statistics on a topic are.

Statistics that Don’t Account For All Variables and Time

There is a concept called a spurious relationship, which exists when two variables seem to be related in a way that they actually aren’t. For instance, one of my favorite examples is that ice cream consumption increases in summer and so does the temperature in that given area. So does eating ice cream raise the temperature in summer? No, and yet the two variables seem to be highly correlated…
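Here is a minimal sketch of why the correlation number alone can’t settle the question. The monthly temperatures and ice cream sales below are made up for illustration, and `pearson` is a small hand-written helper for the standard Pearson correlation coefficient.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical monthly figures, January through December (not real data).
temperature = [2, 4, 9, 14, 19, 23, 26, 25, 20, 14, 8, 3]
ice_cream_sales = [10, 12, 20, 31, 45, 58, 66, 63, 47, 30, 18, 11]

r_forward = pearson(temperature, ice_cream_sales)
r_backward = pearson(ice_cream_sales, temperature)

print(round(r_forward, 3))
print(round(r_backward, 3))
```

The two printed values are identical, because correlation is symmetric: it says the two series move together, but nothing about which one drives the other. Deciding that heat sells ice cream, rather than ice cream heating up the summer, takes common sense and not the statistic.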

So basically, what I’m trying to say is that just because a correlation looks like a statistic that makes sense, that doesn’t mean it does; oftentimes it doesn’t. Using common sense, you can debunk such poor statistical usage, and you should question the presenters and their motives. Don’t be fooled by numbers. Just because someone uses statistics doesn’t mean that they are right.

Don’t believe the hype, think things through!

MR

19 thoughts on “Sometimes Statistics Lie”

  1. There is a reason that Mark Twain popularized the saying, “Lies, damn lies, and statistics.” It is possible to make the numbers say anything you want and emphasize a certain portion of a data set.

  2. I love when ‘mathematgicians’ point to statistics to prove their points. You can make just about anything skew in your favor.

    This is a great post, MR. I know people love solid evidence to prove a point, but stats are not solid by any means. Sure, data has its place in research and such, but so many people use stats in the same manner your coworker did: to try to get people to align with their beliefs or to make themselves look good.

    • I agree, there is a fallacy that numbers don’t lie. They don’t, but the people presenting them do, or they twist the stats by changing the range the stats are drawn from to present a favorable number that would back up their argument.

      I’m not saying all stats are bad! But much like TV media, you can’t blindly believe them either unless the presentation is unbiased and you make sure the correlation isn’t spurious…

      This was a 3:00am post, so it was late, and this was the first thing that came to my mind…

  3. Or how about the result of the final home game of the Washington Redskins and how it relates to Presidential elections? Post hoc ergo propter hoc – correlation does not imply causation, so you better have a justification for why one can cause the other – and get the dependent/independent variable correct!

  4. I work with statistics in my job and I can say they can lie. Numbers can be spun any way you like to send a certain message. Plus you can adjust where you get your data from to have a more favourable result. You really have to be careful when reading reports. They may not be as objective as you think.

  5. Your friend can’t make a case with those kinds of numbers. One person is not statistically significant and I’m sure she knows that.
    Statistics can be misleading.

  6. I look for trends versus what the numbers may tell me. If the trend is negative, I need to do something. If the trend is positive, I want to do more. Statistics over time create a trend, which is more meaningful!

  7. Very good point. Statistics don’t tell the whole truth and often are biased based on what they are representing or depending on the variables. Though I love using stats myself to make a point, I know it’s only a part of the equation and other factors need to be presented to show a larger picture. The vagueness of my comment is very similar to the vagueness of stats! 😉

    • Actually, I love stats myself… 🙂

      I use stats for my investing and plenty of other things. As Well Heeled said, they are a great tool…

      I think if a baseline is established, then stats are a little better. Of course if the stats producer is unbiased, that’s wonderful too!

    • lol, honestly… it was hard for me not to laugh at her when she told me her calculations. But I could tell she was hurt by her manager being disappointed in her results, so I somehow managed to control my laughter and smile…

      I have to admit, even with her mathematical 50% increase, I sympathised with her manager. Of course I didn’t spin it that way to her though!

  8. I love watching the debates (regardless of whether they are R or D) and the stats that are just thrown out there… then someone afterward will just go point by point and basically expose each one as a freaking lie or twist.

  9. Most people aren’t knowledgeable enough about statistics to question the data integrity aspect of it. I see statistics being used erroneously in science experiments all the time and half the crazed assumptions about vaccinations and chemicals are based on whacked out data or correlating random things together.
