A low p value means that the sample result would be unlikely if the null hypothesis were true and leads to the rejection of the null hypothesis. A high p value means that the sample result would be likely if the null hypothesis were true and leads to the retention of the null hypothesis.
But how low must the p value be before the sample result is considered unlikely enough to reject the null hypothesis? By convention, this criterion, called α (alpha), is usually set to .05. When the p value falls below the criterion, the null hypothesis is rejected and the result is said to be statistically significant. When it does not, the null hypothesis is retained. Retaining the null hypothesis does not necessarily mean that the researcher accepts it as true, only that there is not currently enough evidence to reject it.
The p value is one of the most misunderstood quantities in psychological research (Cohen, [1]). Even professional researchers misinterpret it, and it is not unusual for such misinterpretations to appear in statistics textbooks!
The most common misinterpretation is that the p value is the probability that the null hypothesis is true, that is, the probability that the sample result occurred by chance. For example, a misguided researcher might treat a small p value as the chance that the result is due to chance, and the remainder as the chance that it reflects a real relationship in the population. But this is incorrect. The p value is really the probability of a result at least as extreme as the sample result if the null hypothesis were true. So a p value describes how often a sample result this extreme would occur if the null hypothesis were true. You can avoid this misunderstanding by remembering that the p value is not the probability that any particular hypothesis is true or false.
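The definition above can be made concrete with a small sketch. The coin-flip scenario, the numbers (16 heads in 20 flips), and the function names are all assumptions chosen for illustration; they do not come from the text. The null hypothesis is that the coin is fair, and the p value is computed both exactly and by simulating many experiments in which the null hypothesis is true.

```python
import random
from math import comb

def exact_two_sided_p(heads: int, flips: int) -> float:
    """Probability, if the coin is fair, of a head count at least as far
    from flips/2 as the observed count (a two-sided p value)."""
    observed_gap = abs(heads - flips / 2)
    extreme_outcomes = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if abs(k - flips / 2) >= observed_gap
    )
    return extreme_outcomes / 2 ** flips

def simulated_two_sided_p(heads: int, flips: int,
                          runs: int = 20_000, seed: int = 0) -> float:
    """Estimate the same probability by repeating the experiment many
    times in a world where the null hypothesis (a fair coin) is true."""
    rng = random.Random(seed)
    observed_gap = abs(heads - flips / 2)
    hits = 0
    for _ in range(runs):
        k = sum(rng.random() < 0.5 for _ in range(flips))
        if abs(k - flips / 2) >= observed_gap:
            hits += 1
    return hits / runs

# Hypothetical sample result: 16 heads in 20 flips.
p_exact = exact_two_sided_p(16, 20)
p_sim = simulated_two_sided_p(16, 20)
print(f"exact p value:     {p_exact:.4f}")
print(f"simulated p value: {p_sim:.4f}")
```

Both numbers answer the question "how often would a result this extreme occur if the null hypothesis were true?" Neither says anything about the probability that the coin is actually fair.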
Instead, it is the probability of obtaining a result at least as extreme as the sample result if the null hypothesis were true. Two factors determine this probability: the strength of the sample relationship and the size of the sample. Specifically, the stronger the sample relationship and the larger the sample, the less likely the result would be if the null hypothesis were true, that is, the lower the p value. This should make sense. Consider two studies of a sex difference. If there were really no sex difference in the population, then a strong result based on a large sample should seem highly unlikely. If there were no sex difference in the population, then a weak relationship based on a small sample should seem likely. And this is precisely why the null hypothesis would be rejected in the first case and retained in the second.
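The effect of relationship strength and sample size on the p value can be sketched numerically. This is only a rough illustration under stated assumptions: two equal-sized groups, a standardized mean difference d, and a normal approximation to the test statistic (a real analysis of small samples would typically use a t test). The function name and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_two_sided_p(d: float, n_per_group: int) -> float:
    """Rough two-sided p value for a standardized mean difference d
    between two equal-sized groups, using a normal approximation."""
    z = d * sqrt(n_per_group / 2)          # approximate test statistic
    return 2 * NormalDist().cdf(-abs(z))   # area in both tails

# Hypothetical studies (values chosen only for illustration).
p_strong_large = approx_two_sided_p(d=0.8, n_per_group=200)
p_weak_small = approx_two_sided_p(d=0.2, n_per_group=10)

print(f"strong result, large sample: p = {p_strong_large:.2g}")
print(f"weak result, small sample:   p = {p_weak_small:.2g}")
```

Because the test statistic grows with both d and the square root of the sample size, increasing either one pushes the p value down.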
Of course, sometimes the result can be weak and the sample large, or the result can be strong and the sample small. In these cases, the two considerations trade off against each other so that a weak result can be statistically significant if the sample is large enough and a strong relationship can be statistically significant even if the sample is small. In the table, the columns represent the three levels of relationship strength: weak, medium, and strong. The rows represent four sample sizes that can be considered small, medium, large, and extra large in the context of psychological research.
Thus each cell in the table represents a combination of relationship strength and sample size. If a cell contains the word No, then that combination would not be statistically significant for either d or r.
There is one cell where the decision for d and r would be different and another where it might be different depending on some additional considerations, which are discussed in a separate section.
If you keep this lesson in mind, you will often know whether a result is statistically significant based on the descriptive statistics alone. It is extremely useful to be able to develop this kind of intuitive judgment.
One reason is that it allows you to develop expectations about how your formal null hypothesis tests are going to come out, which in turn allows you to detect problems in your analyses.
For example, if your sample relationship is strong and your sample is medium, then you would expect to reject the null hypothesis. If for some reason your formal null hypothesis test indicates otherwise, then you need to double-check your computations and interpretations. A second reason is that the ability to make this kind of intuitive judgment is an indication that you understand the basic logic of this approach in addition to being able to do the computations.
A statistically significant result is not necessarily a strong one. Even a very weak result can be statistically significant if it is based on a large enough sample.
The P-value approach involves determining "likely" or "unlikely" by determining the probability, assuming the null hypothesis were true, of observing a more extreme test statistic in the direction of the alternative hypothesis than the one observed.
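Specifically, the four steps involved in using the P-value approach to conducting any hypothesis test are: specify the null and alternative hypotheses; using the sample data and assuming the null hypothesis is true, calculate the value of the test statistic; using the known distribution of the test statistic, calculate the P-value; and set the significance level, compare the P-value with it, and reject the null hypothesis if the P-value is no larger. Recall that probability equals the area under the probability curve, so the P-value is the area beyond the observed test statistic in the direction of the alternative hypothesis. It can be computed using statistical software and depicted on a graph of the curve.
So, you might get a particular p-value from your test. However, you want to know whether this is "statistically significant". You reject the null hypothesis when the p-value is no larger than the chosen significance level, because a result that extreme would then be too unlikely if the null hypothesis were true. Whilst there is relatively little justification for why a significance level of 0.05 is used rather than some other value, it is widely used in research. However, if you want to be particularly confident in your results, you can set a more stringent level, such as 0.01.
When considering whether we reject the null hypothesis and accept the alternative hypothesis, we need to consider the direction of the alternative hypothesis statement.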
For example, the alternative hypothesis that was stated earlier is that Sarah's teaching method has a positive effect on students' performance. The alternative hypothesis tells us two things. First, what predictions did we make about the effect of the independent variable(s) on the dependent variable(s)? Second, what was the predicted direction of this effect? Let's use our example to highlight these two points. Sarah predicted that her teaching method (independent variable: teaching method), whereby she not only required her students to attend lectures, but also seminars, would have a positive effect on (that is, increase) students' performance (dependent variable: exam marks).
If an alternative hypothesis has a direction (and this is how you want to test it), the hypothesis is one-tailed. That is, it predicts the direction of the effect.
If the alternative hypothesis had stated that the effect was expected to be negative, this would also be a one-tailed hypothesis. Alternatively, a two-tailed prediction means that we do not make a choice over the direction that the effect of the experiment takes.
Rather, it simply implies that the effect could be negative or positive. If Sarah had made a two-tailed prediction, the alternative hypothesis might have been that her teaching method has an effect on students' performance. In other words, we simply take out the word "positive", which implies the direction of our effect. In our example, making a two-tailed prediction may seem strange.
After all, it would be logical to expect that "extra" tuition (going to seminar classes as well as lectures) would either have a positive effect on students' performance or no effect at all, but certainly not a negative effect. However, this is just our opinion and hope and certainly does not mean that we will get the effect we expect.
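The difference between one-tailed and two-tailed predictions shows up directly in how the p-value is computed from the same test statistic. The sketch below assumes a test statistic that follows a standard normal distribution if the null hypothesis is true. The observed value `z_observed = 1.8`, the significance level of 0.05, and the function name are hypothetical choices for illustration, not values from Sarah's study.

```python
from statistics import NormalDist

def p_value_from_z(z: float, alternative: str) -> float:
    """p-value for a z statistic under a standard normal null distribution.

    alternative: "greater"   (one-tailed, positive effect predicted),
                 "less"      (one-tailed, negative effect predicted),
                 "two-sided" (two-tailed, no direction predicted).
    """
    std_normal = NormalDist()
    if alternative == "greater":
        return 1 - std_normal.cdf(z)          # area in the upper tail
    if alternative == "less":
        return std_normal.cdf(z)              # area in the lower tail
    if alternative == "two-sided":
        return 2 * std_normal.cdf(-abs(z))    # area in both tails
    raise ValueError(f"unknown alternative: {alternative!r}")

z_observed = 1.8   # hypothetical test statistic
alpha = 0.05       # assumed significance level

p_positive = p_value_from_z(z_observed, "greater")
p_negative = p_value_from_z(z_observed, "less")
p_two_sided = p_value_from_z(z_observed, "two-sided")

for label, p in [("one-tailed (positive)", p_positive),
                 ("one-tailed (negative)", p_negative),
                 ("two-tailed", p_two_sided)]:
    decision = "reject" if p <= alpha else "retain"
    print(f"{label:22s} p = {p:.4f} -> {decision} the null hypothesis")
```

With these assumed values, the one-tailed test in the predicted (positive) direction rejects the null hypothesis while the two-tailed test does not, because the two-tailed p-value counts extreme results in both directions and is therefore twice as large.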