

Writing chi, I'm going to write capital X squared. Let me just show it to you, and instead of Now to calculateĬhi-square statistic, we essentially just take. On Thursday, we would haveĮxpected 20% of 200 customers, so that would haveīeen 40 customers. So this would have been 20Ĭustomers, 10% times 200. Number on Monday? Well, on Monday, we would
CHI SQUARE TEST DEGREES OF FREEDOM PLUS
So we have 30 plus 14 plusģ4 plus 45 plus 57 plus 20. The actual number is, we need to figure out the So we would have expectedġ0% of the total customers in that week toĬustomers of that week to come in on Tuesday, 15% The expected observed? So let me write this right here. Percentage here, but what would have been Owner's distribution was correct, what would haveīeen the expected observed? So we have expected Statistic, what I'm going to do is- so here we're assuming Is greater than my alpha, than my significance level, That, if I say, hey, the probability of gettingĪ chi-square statistic that is this extreme or more The null hypothesis, which is essentially just rejecting This is less than 5%, then I'm going to reject If the probability of gettingĪ result like this or something less likely than This or a result more extreme less than 5%. Getting this result, or getting a result like That, what I want to see is the probability of

Of degrees of freedom and we're going to calculate And given that it does haveĪ chi-square distribution with a certain number It is it that statistic that I'm going toĬalculate has approximately a chi-square distribution. And it's going to beĬhi-square statistic. Thinking about it, I'm going to calculate a And I want to do this withĪ significance level of 5%. I should reject the owner's distribution. That the owner's distribution- so that's this thingĪlternative hypothesis is going to be thatĬorrect distribution, that I should not feel I want to accept or reject his hypothesis right Of customers, when they come in during the week,Īnd this is what I get from my observed data. This distribution that he's describing actuallyįits observed data. Little bit suspicious, so I decide to see how good Tuesday, 15% on Wednesday, so forth, and so on. Says 10% of his customers come in on Monday, 10% on This distribution over here, which essentially Of the number of customers you get each day? And he says, oh, I've The current owner, what is the distribution It depends on consequences, risk, what I already know and many other things! Then I would ask for another test! But if I was in the line for a super discount offer on black friday, and a clever person had calculated that there were only 5% chance that I would get the item before it was sold out, then I would step out of the line immediately. Say, the docter tells me: there is only a 5% chance that you have that life threatening disease, given the test result, so you can go home. Sometimes it might be life changing - if it was a test for some disease, I would never be satisfied with a 5% risk. That is your result! What significance level to chose depends on the situation. However, before I confronted him, I think I would observe another week to get more certain knowledge.Ī lot of talking, sorry! My point here: you get the probability from the test. On the other hand, if I knew the shop owner was sloppy with numbers, and had a tendency to lie, etc., then I would be more likely to reject the hypothesis on basis of my observations. I mean, after all, 5% corresponds to about 1/20 - it is not a veeery rare observation. If I generally trusted the shop owner, and new that he had kept track of customers for a long period, and was a clever guy, then I would still believe his hypothesis. The 5% significance criteria is a subjective choice. So, if the hypothesis is right and you make observations for for a weak, then there is almost 5% chance that you see what you see or something even less likely. This is the question you answer with the test, and you can calculate that probability exactly (or you can use tables). what percentage of customers come each day), what is then the probability to see the given observations (30 on monday, 14 on Tuesday, etc) or something more unlikely?" The question you answer with the test can be rephrased like this: "if the shop owner's theory is right (i.e.
