## Chi Squared Notes


Chi-Squared

Chi-Squared Basics (notes for those learners who like to read).
Chi-Squared is our last statistical value that will help us quantify our outcomes when dealing with data from categories, like how many plants have a certain phenotype. It is similar to SEM (Standard Error of the Mean) in that we generate a probability value that helps us measure whether our hypothesis is supported or not supported.
A probability of 95% is generally the line in the sand between accepting or not accepting an outcome. Remember that the 95% interval in SEM is the sample mean +/- 2 SEM, the range in which we are about 95% confident that the true mean of the population lies. If the 95% intervals of two means overlap, we treat the 2 means as not really different: the data do not show a statistically significant difference at the 5% level.
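As a reminder of how that SEM check works, here is a minimal sketch. The plant heights are hypothetical numbers made up for illustration, and the helper names `interval_95` and `intervals_overlap` are not from the lecture.

```python
import math
import statistics


def interval_95(values):
    """Return (low, high) for the sample mean +/- 2 SEM."""
    mean = statistics.mean(values)
    # SEM = sample standard deviation / square root of the sample size
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean - 2 * sem, mean + 2 * sem


def intervals_overlap(a, b):
    """True if the two (low, high) ranges share any values."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical plant heights in cm (made up for this sketch)
group_a = [10.1, 9.8, 10.4, 10.0, 9.9]
group_b = [10.3, 10.0, 10.6, 9.7, 10.2]
group_c = [14.9, 15.2, 15.0, 15.3, 14.8]

print(intervals_overlap(interval_95(group_a), interval_95(group_b)))
print(intervals_overlap(interval_95(group_a), interval_95(group_c)))
```

Groups A and B have overlapping intervals, so we would not call them different. Groups A and C do not overlap, so we would.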

SEM is really a measure of variation (the spread of measured values) around the True Mean of the population.

Chi-Squared is really a measure of variation around the “NULL HYPOTHESIS”, which is a NULL statement that says “There is no difference between the observed results and the expected results.”  The line in the sand here is: how much difference is there between what we observe and what we expect?  To measure how much difference there is between the observed and expected values in a research investigation, we must measure from a “NULL”, a starting point of ZERO difference.  If we measure how tall I am, aren’t we measuring how many inches or centimeters I am from zero or “NULL”?  The key in Chi-Squared is that we ALWAYS measure from an expectation that there will be no difference between the observed and expected values, and THUS we always run a Chi-Squared AGAINST a NULL HYPOTHESIS.
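For reference, the distance from the NULL is usually computed with the standard Chi-Squared formula. The symbols are the usual shorthand rather than something defined in these notes: $o$ is the observed count in a category, $e$ is the expected count, and the sum runs over all categories.

$$
\chi^2 = \sum \frac{(o - e)^2}{e}
$$

If every observed count equals its expected count, each term is zero and $\chi^2 = 0$. That is the “NULL” starting point, and the value grows as the observed counts move away from the expected ones.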

Now you can set up your NULL Hypothesis so that you are testing a certain outcome. For instance, in the example in the lecture above we set up our Null Hypothesis (line in the sand) so that THERE WILL BE NO DIFFERENCE between the observed phenotypes and the 9:3:3:1 ratio of phenotypes expected in Mendelian Genetics.  If our Chi-Squared value is small, that means that there is not much variance between the observed and the expected values and OUR NULL HYPOTHESIS will be accepted, meaning any small differences are due to sampling errors or chance events.

Sampling errors are errors in which a random sample was not generated and there may have been a bias.  Example: measuring the heights of a basketball team will not reflect the true mean of the population.

Chance events are those events in the experiment that cause outcomes to differ, in the same way that “flipping a coin” does not always give us the same number of heads as tails.  In order to achieve a 9:3:3:1 outcome, the probability of each chromosome segregating into a gamete must be 50% (heads or tails).  But by chance the observed proportions will often not be exactly 50%, especially in small samples.  How many families do you know that have more boys than girls, even though there is a 50-50 chance of having a girl or a boy? In the entire world the numbers of males and females are about 50-50, but in a single family there is usually a deviation from a 50% split of males and females.
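The effect of chance events in small samples can be seen with a quick simulation. This is only a sketch: it assumes every birth is an independent 50-50 event, and the family size of 4 and the function name `fraction_girls` are choices made for illustration.

```python
import random


def fraction_girls(n_children, rng):
    """Simulate n_children independent 50-50 births; return the fraction of girls."""
    girls = sum(1 for _ in range(n_children) if rng.random() < 0.5)
    return girls / n_children


rng = random.Random(42)  # fixed seed so reruns give the same numbers

# Ten small "families" of 4 children each (family size is an assumption)
small_families = [fraction_girls(4, rng) for _ in range(10)]

# One very large sample, standing in for a whole population
large_sample = fraction_girls(100_000, rng)

print(small_families)  # individual families often stray from 0.5
print(large_sample)    # the large sample sits very close to 0.5
```

Small families can land on 0.25 or 0.75 purely by chance, while the large sample stays near 0.5. This is why small deviations from 9:3:3:1 are expected even when the NULL is true.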

If our Chi-Squared value is LARGE, that means that there is so much variance between the observed and the expected values that OUR NULL HYPOTHESIS will be rejected, meaning the difference is so large that THERE MUST BE AN ALTERNATIVE REASON that is DRIVING the difference (beyond sampling errors and chance events)!  We will often provide an Alternative Hypothesis as the possible reason that the NULL is rejected.  In the case of the example in the worksheet that is used in the lecture above, the Alternative Hypothesis could be that the alleles do not follow the Law of Independent Assortment (for example, because the genes are linked and are separated only by crossing over).
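To make "small" and "large" concrete, here is a minimal sketch that computes a Chi-Squared value against a 9:3:3:1 expectation using the standard formula, sum of (observed - expected)^2 / expected. Both sets of observed counts are hypothetical numbers invented for this sketch. They are not the worksheet data.

```python
def expected_counts(total, ratio):
    """Split a total count according to a ratio such as (9, 3, 3, 1)."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


RATIO = (9, 3, 3, 1)  # Mendelian dihybrid expectation (the NULL)

# Hypothetical observed phenotype counts, 160 offspring each
close_to_ratio = [92, 28, 31, 9]
far_from_ratio = [110, 10, 12, 28]

expected = expected_counts(160, RATIO)  # 90, 30, 30, 10

chi_small = chi_squared(close_to_ratio, expected)
chi_large = chi_squared(far_from_ratio, expected)

print(round(chi_small, 2), round(chi_large, 2))
```

The first data set gives a Chi-Squared value of about 0.31, so the NULL is accepted. The second gives about 60.98, far too large to blame on chance, so the NULL is rejected.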

What determines if we accept or reject the Null Hypothesis?  The Chi-Squared critical values table! This is given to you in your AP Biology reference table. Once we determine the degrees of freedom (subtract 1 from the number of categories used), we see if the Chi-Squared value IS LARGE ENOUGH to fit in the box that has a p value of 0.05 (5%) or a p value of 0.01 (1%), that is, whether it is greater than or equal to the critical value listed in that box.

If it is LARGE enough to “fit” into those boxes, then THERE IS TOO SMALL OF A PROBABILITY TO SUPPORT THE NULL HYPOTHESIS and we REJECT the NULL, which means that there is something (the alternative hypothesis) DRIVING or causing the change. Remember the significance of 5%?  If the change or variance from the NULL is OUTSIDE the 95% range, then THERE MUST BE A SIGNIFICANT DIFFERENCE (REJECT the NULL).

*Remember that Chi-Squared is ALWAYS A TEST on the NULL, and the p values determined from the Chi-Squared critical values table are ALWAYS probabilities FOR THE NULL HYPOTHESIS!

The table reveals how much support there is for the NULL. If the Chi-Squared value is BIG ENOUGH to fit in the box for a p value of 0.05 or 0.01, there is TOO little support for the NULL, meaning there is TOO small of a probability that the NULL is supported.

If the Chi-Squared value DOES NOT fit in the box for a p value of 0.05 or 0.01, there is enough support for the NULL, meaning there is a LARGE ENOUGH probability that the NULL is supported, and we accept the NULL.  Is the NULL within the 95% probability range (high probability that the NULL is TRUE) or is it outside the 95% probability range (low probability that the NULL is TRUE)?  Chi-Squared is always based on the Null, and the line in the sand is the 95% probability range.

So from the homework you can see that the Chi-Squared value of 2.04 is not BIGGER than or EQUAL to 7.82 for 3 degrees of freedom, so the p value is large (greater than 0.05 or 5%). This means there is enough probability (within the 0.95 or 95% range) to accept the NULL as true. WE set up the Null so that the expected values would equal the 9:3:3:1 ratio expected if Mendelian Genetics is supported.  OUR statistical analysis of Chi-Squared supports the Null and thus supports the outcome of Mendelian Genetics.  The reason there are some differences from EXACTLY a 9:3:3:1 ratio is only sampling errors and chance events.

Here is a table that could help: Statistics Presentation: For Chi-Squared Statistic, slides 60-66
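The homework decision can be written as a one-step comparison. This is a minimal sketch using the two numbers from the notes (2.04 and 7.82). The function name `decide` and the wording of its return values are made up for illustration.

```python
def decide(chi_squared_value, critical_value):
    """Compare a Chi-Squared value with the critical value from the table."""
    if chi_squared_value >= critical_value:
        # Big enough to "fit in the box": too little support for the NULL
        return "reject the NULL"
    # Not big enough: differences are put down to sampling errors and chance
    return "accept the NULL"


# Homework numbers from the notes: 2.04 against 7.82 (3 degrees of freedom, p = 0.05)
homework = decide(2.04, 7.82)
print(homework)
```

Because 2.04 is smaller than 7.82, the sketch lands on accepting the NULL, the same conclusion reached above.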