Math 445—T Confidence Interval and Prediction Interval Example

 

Consider the lengths (in minutes) of the 58 nine-inning games from the first week of the 2001 major league baseball season (consider these a representative sample of all nine-inning game times). The distribution of game lengths is shown in the histogram and normal quantile plot below, followed by the numerical summaries.

 

 

Variable                N    Mean  StDev  Minimum      Q1  Median      Q3  Maximum
game length (in min.)  58  178.24  19.53   136.00  165.00  177.00  188.50   218.00

 

Confidence Interval for a Population Mean

Suppose we want to estimate the average game length (in minutes) of all nine-inning games. Based on our sample data, it’s reasonable to assume the game lengths follow an approximately normal distribution, so it’s appropriate to use a confidence interval based on the t distribution. For a t distribution with (58 – 1) = 57 degrees of freedom, the t-value with area 0.025 to the right is t = 2.002 (this value is from Minitab; you can also approximate it from A.8 in the textbook, using 60 degrees of freedom instead of 57).

 

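Hence, a 95% confidence interval for the average length of all nine-inning games is

\[ \bar{x} \pm t^{*}\,\frac{s}{\sqrt{n}} = 178.24 \pm 2.002\left(\frac{19.53}{\sqrt{58}}\right) = 178.24 \pm 5.13, \quad \text{or } (173.11,\ 183.37) \text{ minutes.} \]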
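As a check on the arithmetic, the t-based interval (sample mean plus or minus t times s/√n) can be computed directly from the summary statistics above. The following is a minimal Python sketch, not the Minitab procedure itself; the variable names are illustrative, and the critical value 2.002 is simply copied from the text rather than computed.

```python
import math

# Summary statistics from the table above.
n = 58
mean = 178.24
sd = 19.53

# Critical value quoted in the text (Minitab, 57 degrees of freedom).
t_star = 2.002

# Standard error of the sample mean and the resulting margin of error.
standard_error = sd / math.sqrt(n)
margin = t_star * standard_error

# 95% confidence interval for the population mean game length.
ci_t = (mean - margin, mean + margin)

print(f"standard error = {standard_error:.3f}")
print(f"margin of error = {margin:.2f}")
print(f"95% t interval = ({ci_t[0]:.2f}, {ci_t[1]:.2f})")
```

With these inputs the interval comes out to roughly 173.11 to 183.37 minutes.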

 

As usual, our confidence is in the method we used to create the interval. That is, we used a method that produces an interval containing the true mean for 95% of samples (and we hope this is one of those times!).
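The "95% of the time" interpretation can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: it draws repeated samples of size 58 from a hypothetical normal population (mean 178 and standard deviation 20 are made-up values chosen to resemble the data, not estimates from the handout), builds a t interval from each sample with the critical value 2.002, and counts how often the interval contains the true mean. The function names, seed, and number of repetitions are likewise arbitrary.

```python
import math
import random
import statistics


def t_interval(sample, t_star):
    """Return the t-based confidence interval for the mean of one sample."""
    n = len(sample)
    center = statistics.fmean(sample)
    margin = t_star * statistics.stdev(sample) / math.sqrt(n)
    return center - margin, center + margin


def coverage(true_mean, true_sd, n, t_star, reps, seed):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = t_interval(sample, t_star)
        if low <= true_mean <= high:
            hits += 1
    return hits / reps


# Hypothetical population parameters (assumptions for illustration only).
rate = coverage(true_mean=178.0, true_sd=20.0, n=58,
                t_star=2.002, reps=2000, seed=1)
print(f"fraction of intervals covering the true mean: {rate:.3f}")
```

The printed fraction should land close to 0.95, though it varies slightly with the seed and the number of repetitions.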

 

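Note that the sample size is large in this case (beyond the textbook’s rule of thumb of n = 40). Hence, the z-distribution and t-distribution are so close that you can use either to create the confidence interval. Based on the z-distribution, the confidence interval is

\[ \bar{x} \pm z^{*}\,\frac{s}{\sqrt{n}} = 178.24 \pm 1.96\left(\frac{19.53}{\sqrt{58}}\right) = 178.24 \pm 5.03, \quad \text{or } (173.21,\ 183.27) \text{ minutes.} \]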
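To see how little the choice of distribution matters at this sample size, the two margins of error can be compared side by side. This is a rough sketch using only the Python standard library; the z critical value comes from `statistics.NormalDist`, while the t critical value 2.002 is again taken from the text rather than computed.

```python
import math
from statistics import NormalDist

# Summary statistics from the table above.
n, mean, sd = 58, 178.24, 19.53
standard_error = sd / math.sqrt(n)

# Critical values: z from the standard normal, t as quoted in the text.
z_star = NormalDist().inv_cdf(0.975)
t_star = 2.002

ci_z = (mean - z_star * standard_error, mean + z_star * standard_error)
ci_t = (mean - t_star * standard_error, mean + t_star * standard_error)

print(f"z* = {z_star:.3f}, t* = {t_star:.3f}")
print(f"95% z interval = ({ci_z[0]:.2f}, {ci_z[1]:.2f})")
print(f"95% t interval = ({ci_t[0]:.2f}, {ci_t[1]:.2f})")
```

The two intervals differ by only about a tenth of a minute at each end.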

 

Prediction Interval for a Single Future Value

Suppose now we want to make a prediction about the game length of the next nine-inning game, not about the average length of all games (it’s important that we answer the appropriate question—sometimes a prediction interval, not a confidence interval, is what we really want).

 

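Because a prediction interval depends on the t distribution and the condition that the population distribution is normal, it’s important to check this condition. Unlike the confidence interval for a mean, a large sample size does not relax this condition, because the interval concerns a single observation rather than an average. As already mentioned, the normality condition seems reasonable.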

 

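Then a 95% prediction interval for the length of the next nine-inning game is

\[ \bar{x} \pm t^{*}\,s\sqrt{1 + \frac{1}{n}} = 178.24 \pm 2.002\,(19.53)\sqrt{1 + \frac{1}{58}} = 178.24 \pm 39.43, \quad \text{or } (138.81,\ 217.67) \text{ minutes.} \]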
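The prediction interval uses the same center and critical value as the confidence interval, but its standard error is s√(1 + 1/n) rather than s/√n. A minimal Python sketch of that calculation follows, with illustrative variable names and the critical value 2.002 copied from the text.

```python
import math

# Summary statistics from the table above.
n = 58
mean = 178.24
sd = 19.53

# Critical value quoted in the text (57 degrees of freedom).
t_star = 2.002

# Standard error for predicting one new observation:
# it combines the spread of individual games with the
# uncertainty in the estimated mean.
prediction_se = sd * math.sqrt(1 + 1 / n)
margin = t_star * prediction_se

pi = (mean - margin, mean + margin)

print(f"prediction standard error = {prediction_se:.2f}")
print(f"95% prediction interval = ({pi[0]:.2f}, {pi[1]:.2f})")
```

With these inputs the interval is roughly 138.81 to 217.67 minutes.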

 

Note that the prediction interval is much wider than the confidence interval (it’s harder to predict an individual value than an average). FYI: The next nine-inning game in 2001 lasted 152 minutes.
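How much wider is "much wider"? Dividing the prediction margin t·s·√(1 + 1/n) by the confidence margin t·s/√n gives √(n + 1), so the ratio depends only on the sample size. The sketch below, again with illustrative names and the quoted critical value, computes that ratio and checks whether the observed 152-minute game falls inside the prediction interval.

```python
import math

# Summary statistics and critical value from the text.
n, mean, sd = 58, 178.24, 19.53
t_star = 2.002

# Margins of error for the mean (confidence) and for one new game (prediction).
ci_margin = t_star * sd / math.sqrt(n)
pi_margin = t_star * sd * math.sqrt(1 + 1 / n)

# The ratio simplifies algebraically to sqrt(n + 1).
ratio = pi_margin / ci_margin

# Length of the next nine-inning game, as reported in the text.
next_game = 152
inside_pi = (mean - pi_margin) <= next_game <= (mean + pi_margin)

print(f"prediction margin / confidence margin = {ratio:.2f}")
print(f"sqrt(n + 1) = {math.sqrt(n + 1):.2f}")
print(f"152-minute game inside prediction interval: {inside_pi}")
```

For n = 58 the prediction interval is a little under eight times as wide as the confidence interval, and the 152-minute game lies inside it.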