For which of the following is a chi-square goodness-of-fit test most appropriate?
Test for distributional adequacy

The chi-square test (Snedecor and Cochran, 1989) is used to test whether a sample of data came from a population with a specific distribution.

An attractive feature of the chi-square goodness-of-fit test is that it can be applied to any univariate distribution for which you can calculate the cumulative distribution function. The test is applied to binned data (i.e., data put into classes). This is not really a restriction, since for non-binned data you can simply calculate a histogram or frequency table before computing the chi-square test. However, the value of the chi-square test statistic depends on how the data are binned. Another disadvantage is that the test requires a sufficient sample size for the chi-square approximation to be valid.

The chi-square test is an alternative to the Anderson-Darling and Kolmogorov-Smirnov goodness-of-fit tests. The chi-square goodness-of-fit test can be applied to discrete distributions such as the binomial and the Poisson. The Kolmogorov-Smirnov and Anderson-Darling tests are restricted to continuous distributions.

Additional discussion of the chi-square goodness-of-fit test is contained in the product and process comparisons chapter (chapter 7).

Definition

The chi-square test is defined for the hypothesis:
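In standard textbook notation, the hypotheses and test statistic for a chi-square goodness-of-fit test are commonly written as below. The symbols $O_i$, $E_i$, $N$, $F$, $Y_l$ and $Y_u$ are notation introduced here for illustration: $O_i$ and $E_i$ are the observed and expected counts in bin $i$ of $k$ bins, $N$ is the sample size, $F$ is the cumulative distribution function of the hypothesized distribution, and $Y_l$, $Y_u$ are the lower and upper limits of bin $i$.

```latex
\begin{aligned}
H_0 &: \text{the data follow the specified distribution} \\
H_a &: \text{the data do not follow the specified distribution} \\[4pt]
\chi^2 &= \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad E_i = N \bigl( F(Y_u) - F(Y_l) \bigr)
\end{aligned}
```

Under this formulation, $H_0$ is rejected at significance level $\alpha$ when $\chi^2 > \chi^2_{1-\alpha,\,k-c}$, where $c$ is the number of parameters estimated from the data plus one, so the reference distribution has $k - c$ degrees of freedom.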
Chi-Square Test Example

We generated 1,000 random numbers from each of four distributions: normal, double exponential, t with 3 degrees of freedom, and lognormal. In all cases, a chi-square test with k = 32 bins was applied to test for normally distributed data. Because the normal distribution has two parameters, c = 2 + 1 = 3 (the number of estimated parameters plus one). The normal random numbers were stored in the variable Y1, the double exponential random numbers in Y2, the t random numbers in Y3, and the lognormal random numbers in Y4.

$H_0$: the data are normally distributed
$H_a$: the data are not normally distributed

Y1 test statistic: $\chi^2 = 32.256$
Y2 test statistic: $\chi^2 = 91.776$
Y3 test statistic: $\chi^2 = 101.488$
Y4 test statistic: $\chi^2 = 1085.104$

Significance level: $\alpha = 0.05$
Degrees of freedom: $k - c = 32 - 3 = 29$
Critical value: $\chi^2_{1-\alpha,\,k-c} = 42.557$
Critical region: Reject $H_0$ if $\chi^2 > 42.557$

As we would hope, the chi-square test fails to reject the null hypothesis for the normally distributed data set and rejects the null hypothesis for the three non-normal data sets.

Questions

The chi-square test can be used to answer questions such as whether the data come from a particular distribution (for example, a normal distribution).
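Returning to the worked example, the following is a minimal sketch of how such a normality test might be computed using only the Python standard library. Several choices here are assumptions rather than details from the text: the function name `chi_square_normality` is hypothetical, the bins are taken to be equal-probability bins of the fitted normal, the random seed is arbitrary, and only two of the four distributions are simulated. The critical value 42.557 is the one quoted in the example, not recomputed.

```python
import bisect
import random
from statistics import NormalDist, fmean, stdev


def chi_square_normality(data, k=32):
    """Return (statistic, degrees_of_freedom, observed_counts) for a
    chi-square goodness-of-fit test against a fitted normal distribution."""
    n = len(data)
    # Estimate the two normal parameters (mean, standard deviation) from the sample.
    fitted = NormalDist(fmean(data), stdev(data))
    # Assumption: k equal-probability bins, so every bin expects n / k points.
    edges = [fitted.inv_cdf(i / k) for i in range(1, k)]
    observed = [0] * k
    for y in data:
        # bisect_right returns an index in 0..k-1 given k-1 interior edges.
        observed[bisect.bisect_right(edges, y)] += 1
    expected = n / k
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    c = 2 + 1  # two estimated parameters plus one, as in the example
    return statistic, k - c, observed


# Chi-square critical value for alpha = 0.05 and 29 degrees of freedom,
# quoted from the example above.
CRITICAL_VALUE = 42.557

rng = random.Random(0)  # arbitrary seed, chosen only for repeatability
samples = {
    "normal": [rng.gauss(0.0, 1.0) for _ in range(1000)],
    "lognormal": [rng.lognormvariate(0.0, 1.0) for _ in range(1000)],
}
for name, data in samples.items():
    stat, df, _ = chi_square_normality(data)
    verdict = "reject H0" if stat > CRITICAL_VALUE else "fail to reject H0"
    print(f"{name}: chi-square = {stat:.3f}, df = {df}, {verdict}")
```

The statistics will not match the values in the example, since the random numbers differ. Changing `k` or the bin edges changes the statistic, which illustrates the dependence on binning noted earlier.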
Importance

Many statistical tests and procedures are based on specific distributional assumptions. The assumption of normality is particularly common in classical statistical tests. Much reliability modeling is based on the assumption that the distribution of the data follows a Weibull distribution.

There are many non-parametric and robust techniques that are not based on strong distributional assumptions. By non-parametric, we mean a technique, such as the sign test, that is not based on a specific distributional assumption. By robust, we mean a statistical technique that performs well under a wide range of distributional assumptions. However, techniques based on specific distributional assumptions are in general more powerful than these non-parametric and robust techniques. By power, we mean the ability to detect a difference when that difference actually exists. Therefore, if the distributional assumption can be confirmed, the parametric techniques are generally preferred.

If you are using a technique that makes a normality (or some other type of distributional) assumption, it is important to confirm that this assumption is in fact justified. If it is, the more powerful parametric techniques can be used. If the distributional assumption is not justified, a non-parametric or robust technique may be required.

Related Techniques

Anderson-Darling Goodness-of-Fit Test
Kolmogorov-Smirnov Test
Shapiro-Wilk Normality Test
Probability Plots
Probability Plot Correlation Coefficient Plot

Software

Some general purpose statistical software programs provide a chi-square goodness-of-fit test for at least some of the common distributions. Both Dataplot code and R code can be used to generate the analyses in this section.

What is a chi...

The chi-square goodness of fit test is a hypothesis test. It allows you to draw conclusions about the distribution of a population based on a sample.
Using the chi-square goodness of fit test, you can test whether the goodness of fit is “good enough” to conclude that the population follows the distribution.
What is an example of a chi...

There are six different colors: red, orange, yellow, green, blue and brown. Suppose that we are curious about the distribution of these colors and ask, do all six colors occur in equal proportion? This is the type of question that can be answered with a goodness of fit test.
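As a rough illustration of the six-color question, the sketch below computes the chi-square statistic for the null hypothesis that all colors are equally likely. The counts are hypothetical, invented purely for this illustration, and the helper name `chi_square_statistic` is likewise an assumption. The critical value 11.070 is the standard chi-square table value for a 0.05 significance level with 5 degrees of freedom.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for a sample of 120 items (not real data).
counts = {"red": 18, "orange": 22, "yellow": 20,
          "green": 15, "blue": 25, "brown": 20}

n = sum(counts.values())
# Under H0 (equal proportions), each color is expected n / 6 times.
expected = [n / len(counts)] * len(counts)

stat = chi_square_statistic(counts.values(), expected)
# No parameters are estimated from the data, so df = categories - 1.
df = len(counts) - 1

CRITICAL_VALUE = 11.070  # chi-square table value, alpha = 0.05, df = 5
verdict = "reject H0" if stat > CRITICAL_VALUE else "fail to reject H0"
print(f"chi-square = {stat:.3f}, df = {df}, {verdict}")
```

With these made-up counts the statistic is well below the critical value, so the sample would give no evidence against equal proportions.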
In which situation is a chi...

A chi-square test is used to help determine whether observed results are in line with expected results, or whether the discrepancy between them is larger than chance alone would plausibly produce. It is appropriate when the data being analyzed come from a random sample and the variable in question is categorical.
Which of the following is the test for goodness...

Goodness of fit of a distribution is tested with the chi-square test. It is a widely used non-parametric statistical test that measures the magnitude of the discrepancy between the observed data and the data expected under the hypothesized distribution.