Mean, Variance

Mean and variance

Suppose we drop a ball from a height of 1 meter three times and measure how long it takes to land on the ground each time. Unless you replicate the scenario perfectly, you will obtain three different numbers for the duration of interest. What can you report based on them?

Wikipedia: Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data.

The most important function of Statistics is to make sense of variation in data. To know the average weight of Texas residents, you could avoid Statistics if you weighed every Texas resident, or if every Texas resident had the same weight (no variation in data). In general, however, data collection is limited by resources, and variation in data cannot be eliminated. Is it feasible to weigh every Texas resident? If not, you can turn to Statistics, for example by randomly selecting 1,000 Texas residents and asking for their weights. Can you use the average weight of this sample of 1,000 to estimate the average weight of the whole Texas population? That is one of the essential questions in Statistics.

A statistic (lowercase "s", no final "s") is a single number computed from data to summarize something interesting about the data. Suppose you wrote down the times for the 1-meter fall as 0.40 s, 0.45 s, and 0.50 s. We can calculate two statistics:

  • Sample mean $\bar{X}=\frac{1}{n}\sum\limits_{i=1}^n X_i$.
  • Sample variance $S^2=\frac{1}{n-1}\sum\limits_{i=1}^n(X_i-\bar{X})^2$.

In this case, $\bar{X}=\frac{1}{3}(0.40+0.45+0.50)\,\text{s}=0.45\,\text{s}$, and $S^2=\frac{1}{3-1}\left[(0.40-0.45)^2+(0.45-0.45)^2+(0.50-0.45)^2\right]\text{s}^2=0.0025\,\text{s}^2$. The sample standard deviation is $S=\sqrt{S^2}=0.05\,\text{s}$. Therefore, we can report the time as $t=0.45\pm 0.05\,\text{s}$, which conveys both the location and the spread of the value of interest.
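
The same arithmetic can be checked with a short script. This is a minimal sketch in Python, not part of the original text: the helper names `sample_mean` and `sample_variance` are hypothetical, chosen here for illustration, and the three times are the ones quoted above. The standard library's `statistics.mean` and `statistics.variance` are used only as a cross-check; `statistics.variance` also divides by $n-1$.

```python
import math
import statistics


def sample_mean(xs):
    """Sample mean: sum of the observations divided by n."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Sample variance with the n - 1 denominator."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


# The three measured fall times from the text, in seconds.
times = [0.40, 0.45, 0.50]

mean = sample_mean(times)
variance = sample_variance(times)
std = math.sqrt(variance)  # sample standard deviation

# Cross-check the hand-written helpers against the standard library.
assert math.isclose(mean, statistics.mean(times))
assert math.isclose(variance, statistics.variance(times))

print(f"t = {mean:.2f} ± {std:.2f} s")  # prints: t = 0.45 ± 0.05 s
```

Writing the formulas out by hand keeps the $n-1$ denominator visible; in practice the `statistics` functions alone would likely be enough.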