Identification of bigramme lists and valuation of unknown bigrammes.
The Verfahrenkenngruppe (V.K.G. or trigramme) is, as we have explained, not chosen at random, but from a list of about 11,000, and within this list the choices are not made uniformly. This fact enables us to identify which bigramme lists are being used, for if we choose the right bigramme list and work out the V.K.G. we shall find that a comparatively large proportion of them have occurred before, and if we choose the wrong one, a comparatively small proportion.
The more precise theory of this identification is as follows. Let us suppose that of the 26³ different trigrammes v1 have been used before once, v2 twice, etc. Let us call a trigramme which has occurred before t times a "trigramme of the t-class". We can then express our information in the form:
Of the occurrences of trigrammes there have been v1 in the 1-class, 2v2 in the 2-class, 3v3 in the 3-class, etc.
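The tally described above can be sketched in a few lines of Python. This is only an illustration: the function names `class_counts` and `occurrences_by_class` and the sample trigrammes are hypothetical, not taken from the original work. The sketch assumes we hold a plain list of every trigramme occurrence recorded so far.

```python
from collections import Counter

def class_counts(observed):
    """Return {t: v_t}, where v_t is the number of distinct
    trigrammes that have occurred exactly t times."""
    times_seen = Counter(observed)              # trigramme -> occurrences
    return dict(Counter(times_seen.values()))   # t -> v_t

def occurrences_by_class(v):
    """Return {t: t * v_t}, the number of occurrences in each t-class."""
    return {t: t * vt for t, vt in v.items()}

# Made-up sample data, for illustration only.
sample = ["ABC", "XYZ", "ABC", "QRS", "XYZ", "ABC", "LMN"]
v = class_counts(sample)
occ = occurrences_by_class(v)
```

The occurrences over all classes necessarily add up to the length of the list, which gives a quick check on the tally.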
Now take a random sample of these occurrences, forming a proportion α
of the whole, and let us imagine that this random sample consists of the last
of the trigrammes which were found. There will be αv1 in the
1-class, 2αv2 in the 2-class, etc. Now the ones in the 1-class
would have been, when they were found, ones which had not occurred before,
and those in the 2 class ones which had occurred before once, and so on. Hence
we can say that for the last occurrences of trigrammes entered, the numbers
which had occurred before not at all, once, twice, three times, ... are in
the ratios v1 : 2v2 : 3v3 : ... We must
expect these ratios to hold also of the next few occurrences to be entered.
The process of finding new occurrences of trigrammes and looking up the numbers
of previous occurrences can therefore be regarded as like having an urn containing
cards, each of which bears a trigramme and a number, and making draws from
the urn. The number of cards bearing the number r is to be proportional to
(r+1) v_{r+1}. On the other hand we have to consider the process of
choosing trigrammes at random. This is to be compared with drawing cards from
an urn containing cards in new proportions.
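
The first urn can be illustrated with a short Python sketch. It assumes the class counts are supplied as a mapping from t to the number of distinct trigrammes seen exactly t times; the function name `prior_count_distribution` and the example figures are hypothetical. It returns, for each r, the estimated chance that the next trigramme entered has occurred exactly r times before, that is, the share of cards bearing the number r.

```python
from fractions import Fraction

def prior_count_distribution(v):
    """v maps t to the number of distinct trigrammes seen exactly t times.
    Returns {r: probability that the next occurrence has been seen r
    times before}, with weights proportional to (r+1) * v[r+1]."""
    total = sum(t * vt for t, vt in v.items())  # all occurrences so far
    if total == 0:
        raise ValueError("no occurrences recorded yet")
    # A card in the t-class would have been seen t-1 times when entered.
    return {t - 1: Fraction(t * vt, total) for t, vt in sorted(v.items())}

# Made-up class counts, for illustration only.
example = {1: 2, 2: 1, 3: 1}
dist = prior_count_distribution(example)
```

Exact fractions are used so that the proportions can be compared directly with the ratios in the text. The second urn, for trigrammes chosen at random, is not modelled here.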