No paper of this kind would be complete without
at least a short discussion of the confidence limits of the
results, and without a comparison with the results of
other, similar, surveys. In our tables the frequency of
occurrence of the various items being counted is
expressed as a percentage p of the total number of
items in the texts. For a 99% confidence factor, the
limits of the variability of the results will lie between
$p \pm 2.576\,[p(1-p)/n]^{1/2}$.
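As a minimal sketch of how these limits can be evaluated, the expression can be read as the usual normal approximation for a proportion, p ± 2.576·sqrt(p(1 − p)/n). This reading rests on two assumptions not stated in the text: that n is the number of items in the sample on which p is based, and that p is entered as a fraction rather than a percentage. The function names below are illustrative only.

```python
import math

Z_99 = 2.576  # normal deviate for a two-sided 99% confidence factor


def half_width(p: float, n: int, z: float = Z_99) -> float:
    """Return z * sqrt(p * (1 - p) / n), the half-width of the limits.

    p is a proportion in [0, 1]; n is the sample size (assumed meaning of n).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if n <= 0:
        raise ValueError("n must be a positive sample size")
    return z * math.sqrt(p * (1.0 - p) / n)


def confidence_limits(p: float, n: int, z: float = Z_99) -> tuple[float, float]:
    """Return (lower, upper) limits p -/+ half_width, clipped to [0, 1]."""
    h = half_width(p, n, z)
    return max(0.0, p - h), min(1.0, p + h)


if __name__ == "__main__":
    # Hypothetical values for illustration; they are not taken from the tables.
    p_example, n_example = 0.05, 10_000
    low, high = confidence_limits(p_example, n_example)
    print(f"p = {p_example:.3f}, n = {n_example}: limits {low:.4f} to {high:.4f}")
```

Because the half-width varies as 1/sqrt(n), a frequency based on a quarter of the sample has limits twice as wide, so values computed from small subtotals are correspondingly less reliable.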
This means that for this 99% confidence factor the
frequency of occurrence values given in Table II will
be within about a ±0.3% limit, which can probably
be regarded as more than satisfactory. The situation is
different for the digram frequencies given in Table IV.
The values of digram frequency are based on much
smaller samples because these sequential frequencies are
calculated from separate subtotals for each row in the
table and these totals are a function of the frequency
of occurrence of the phonemes to which they relate.
Even at best, that is, for the most frequently occurring
phonemes, the limits of variability for a 99% confidence
factor are about ±1.5%, and the limits are much wider
for the less frequently occurring phonemes. More data
would, therefore, make a useful improvement in the
reliability of the values of these digram frequencies