Logical Foundations of Induction

Part 2, Chapter: 1|2|3|4|5

Induction And Probability

Chapter 1. Calculus of Probability

Introduction

We have already said that induction, in its first stage, is a sort of inference. We shall show in this part that induction in this stage does not proceed from the particular to the universal, and that inductive inference does not give certainty but only the highest degree of probability. Induction in its first stage is thus related to the theory of probability, and it is well to begin with the latter. We often talk about probability in ordinary life. For example, when we are asked for the degree of probability that a coin thrown at random lands on its head, our answer is 1/2. If one of John's ten children is blind, what is the degree of probability that one of them, chosen at random, is blind? The answer is 1/10; but if we choose four of them at random, the degree of probability that the blind child is among the four is 4/10.

We shall discuss three things: (a) this ordinary meaning of probability, in order to find the axioms presupposed by the theory, which make any arithmetical process possible; (b) in view of these axioms, the rules of the probability calculus, which determine how arithmetical operations on degrees of probability are carried out; (c) a logical explanation of our ordinary meaning of probability that is consistent with those axioms.
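The two values in the example of John's children can be checked by simple counting. The following is only a sketch: the labels given to the children, and the choice of which child is blind, are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

children = range(10)   # ten children, labelled 0..9 (labels are hypothetical)
blind = 0              # assume, for illustration, that child 0 is the blind one

# Probability that a single child chosen at random is the blind one.
p_single = Fraction(1, len(children))

# Probability that the blind child is among four children chosen at random:
# count the selections of four that include the blind child.
selections = list(combinations(children, 4))
favourable = sum(1 for s in selections if blind in s)
p_among_four = Fraction(favourable, len(selections))

print(p_single, p_among_four)
```

Fractions are used rather than decimals so that the values can be compared exactly with 1/10 and 4/10.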

Axioms of the theory

We shall use "p/h" to denote the probability of the event p given another event h; we shall take this form as an undefined notion. Bertrand Russell, acknowledging Professor C.D. Broad's work, summarises the axioms of the theory of probability as follows[11]:

I. Given p and h, there is only one value of p/h. We can therefore speak of "the probability of p given h".

II. The possible values of p/h are all the real numbers from 0 to 1, both included.

III. If h implies p, then p/h = 1; we use "1" to denote certainty.

IV. If h implies not-p, then p/h = 0; we use "0" to denote impossibility.

V. The probability of both p and q given h is the probability of p given h multiplied by the probability of q given p and h, and is also the probability of q given h multiplied by the probability of p given q and h. This is called the "conjunctive" axiom. For example, suppose we want to know the degree of probability that a student in the class is excellent in both logic and mathematics. We say that the degree of probability of his excellence in both subjects is equal to the degree of probability of his excellence in logic multiplied by the probability that a student who is excellent in logic is also excellent in mathematics.

VI. The probability of p or q (or both) given h is the probability of p given h plus the probability of q given h minus the probability of both p and q given h. This is called the "disjunctive" axiom. In the previous example, when we want to know the degree of probability that a student in the class is excellent in logic or in mathematics, we take the degree of probability of his excellence in mathematics plus the degree of probability of his excellence in logic, and then subtract from this the degree of probability of his excellence in both subjects as determined by the conjunctive axiom; the result is the degree of probability of his excellence in at least one of them.

These are the six axioms that the theory of probability presupposes, and so we should give probability a meaning consistent with them. That is, the probability of p given h should have a meaning that implies only one value, in accordance with axiom I; that admits any value from 0 to 1, in accordance with axiom II; that requires the value 1 when h implies p and the value 0 when h implies not-p, in accordance with axioms III and IV; and so on.
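The axioms can be tried out on a small case. The sketch below assumes, only for illustration, that p/h is read as the proportion of members of a finite class h that also belong to p; the text itself takes p/h as an undefined notion, and the class of students and its membership are hypothetical.

```python
from fractions import Fraction

def prob(p, h):
    """p/h read as the proportion of members of h that also belong to p."""
    return Fraction(len(p & h), len(h))

h = set(range(20))            # a hypothetical class of 20 students
logic = {0, 1, 2, 3, 4, 5}    # assumed: students excellent in logic
maths = {3, 4, 5, 6, 7}       # assumed: students excellent in mathematics

both = prob(logic & maths, h)
either = prob(logic | maths, h)

# Axiom V (conjunctive), taken in both orders.
assert both == prob(logic, h) * prob(maths, logic & h)
assert both == prob(maths, h) * prob(logic, maths & h)

# Axiom VI (disjunctive).
assert either == prob(logic, h) + prob(maths, h) - both

# Axioms III and IV: the value is 1 when h implies p, 0 when h implies not-p.
assert prob(h, h) == 1
assert prob(set(), h) == 0

print(both, either)
```

Any reading of p/h that is to serve as a meaning of probability would have to pass the same checks.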

Rules of the Calculus

Rule of sum in incompatible probabilities: If h is a process which necessarily leads to one of the results a, b, c, or d, then we have the following four probabilities: a/h, b/h, c/h, d/h. If we want to know the probability of obtaining either a or b, we reach it by adding the probabilities a/h and b/h; that is, the probability of obtaining one of several results equals the sum of the probabilities of obtaining each result separately: (a or b)/h = a/h + b/h. This is an application of the disjunctive axiom, which says that the probability of one of two events a or b equals the value of a plus the value of b minus the value of both together. Since incompatible results cannot occur together, the value of both together is 0, and so it is true that the probability of one or the other occurring equals the sum of their two probabilities.

The sum of probabilities in a compatible collection is 1. Suppose we have two or more instances of which exactly one must happen; these instances are mutually exclusive, and such a collection is called a compatible collection. In throwing a coin, head and tail form a compatible collection, because one and only one of them must occur. Likewise, given a pamphlet of ten pages that is opened at random, opening it at the first page, or the second, ... or the tenth together form a compatible collection. We may thus maintain that the sum of the probabilities of the cases in a compatible collection is always equal to 1.

Rule of sum in compatible probabilities: If we have two probable instances a and b which may occur together, and we want to know the probability of a or b occurring, it is not possible to determine this simply by adding the values of a and b. Instead we subtract the value of both occurring together from the sum of the two values, in order to arrive at the probability of a or b. We can reach the same value in another way, that is, by forming a compatible collection consisting of two inconsistent cases: the occurrence of a or b, and the absence of both. The values of these two cases sum to 1, in accordance with what has been said in the previous paragraph, so the probability of a or b is 1 minus the probability that neither occurs.
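The two ways of reaching the probability of a or b can be compared on a concrete case. The following sketch assumes a fair six-sided die and two events chosen for illustration; neither appears in the text.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}     # faces of a fair die (an assumed example)

def prob(event):
    """Proportion of the equally likely outcomes that belong to the event."""
    return Fraction(len(event & outcomes), len(outcomes))

a = {2, 4, 6}      # the throw is even
b = {4, 5, 6}      # the throw is greater than 3; a and b may occur together

# First way: add the two values and subtract the value of both together.
by_subtraction = prob(a) + prob(b) - prob(a & b)

# Second way: "a or b" and "neither a nor b" are two cases of which exactly
# one must occur, so their values sum to 1.
neither = outcomes - (a | b)
by_complement = 1 - prob(neither)

print(by_subtraction, by_complement)
```

Simply adding the values of a and b here would give 1, which counts the throws 4 and 6 twice.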

Rule of multiplication in conditioned probabilities: If we have two probable cases a and b, the probability of the occurrence of b on the assumption that a occurs may be greater than the probability of b's occurrence without that assumption. For example, it is probable that a student passes both the logic and the mathematics examinations; but if we suppose that he has passed in logic, then there is a greater probability of his success in mathematics, given the ability shown by his success in logic. The converse also holds: if we suppose that he has passed in mathematics, the probability of his success in logic is greater. A probability which is affected by another probability is called a "conditioned probability". If (a) stands for the student's success in the logic exam, (b) for his success in mathematics, and (h) for his membership of the class of students, we get: value of the probability of both a and b = (a/h) x b/(h and a).

Rule of product in independent probabilities: There may be unconditioned probabilities, e.g., the probability that x will pass the logic exam and the probability that y will pass the mathematics exam. The probability of the one has the same value whether or not the other occurs; such a probability is called an "independent probability". If a stands for John's success, b for Smith's, and h for studentship, we get a/h = a/(h and b). In such a case, the probability of both a and b = a/h x b/h, in accordance with the "conjunctive axiom".
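The difference between the conditioned and the independent case can be shown with figures. In the sketch below every count and probability is an assumed value for a hypothetical class; both cases are computed by multiplication, following the conjunctive axiom.

```python
from fractions import Fraction

# Conditioned case: one student sits both exams. All counts are assumed
# figures for a hypothetical class h of 40 students.
students = 40
passed_logic = 30        # a: success in logic
passed_maths = 28        # b: success in mathematics
passed_both = 24         # both a and b

a_h = Fraction(passed_logic, students)            # a/h
b_h = Fraction(passed_maths, students)            # b/h
b_h_and_a = Fraction(passed_both, passed_logic)   # b/(h and a)

# Success in logic raises the probability of success in mathematics.
assert b_h_and_a > b_h

# Conjunctive axiom: value of both a and b = a/h x b/(h and a).
both_conditioned = a_h * b_h_and_a

# Independent case: John's success in logic and Smith's success in
# mathematics, with assumed values; neither affects the other.
john_logic = Fraction(3, 4)
smith_maths = Fraction(7, 10)
both_independent = john_logic * smith_maths

print(both_conditioned, both_independent)
```

In the conditioned case the product agrees with the direct count of students who passed both exams, 24 out of 40.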

Principle of inverse probability: The conjunctive axiom tells us that if we have two events (p and h), given the conditions of their happening (q), we get: (p and h)/q = (p/q) x [h/(q and p)]. The conjunctive axiom entails:

p/(q and h) = [(p/h) x q/(p and h)] / (q/h)

That is to say, the probability that a student is distinguished in the mathematics exam, given certain circumstances and given that he is distinguished in logic, equals the probability of his distinction in mathematics given those circumstances, multiplied by the probability of his distinction in logic given that he is distinguished in mathematics, all divided by the probability of his distinction in logic given those circumstances.

Such an equation is called "inverse probability". By virtue of this principle, one can determine the probability of the theory of gravitation, as it stood after Newton, in the light of the discovery of the planet Neptune.

Bags example and probability calculus: Take the famous example of the bags. Suppose we have three bags, each of which contains five balls. The first bag contains three white balls, the second contains four white balls, and all five balls in the third bag are white. Suppose again that we take a bag without knowing which it is, draw three balls from it, and find them all to be white. What is the probability that this is the third bag, in which all the balls are white? The probability is 2/3. This may be explained thus:

(1/3 x 1) / (1/3 x 1 + 1/3 x 1/10 + 1/3 x 4/10) = (1/3) / (1/2) = 2/3
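The calculation for the bags can be reproduced step by step. This sketch assumes that each bag is equally likely to be taken and that the three balls are drawn without being put back; the names used are illustrative only.

```python
from fractions import Fraction
from math import comb

balls_per_bag = 5
drawn = 3
white_in_bag = {"first": 3, "second": 4, "third": 5}

# Each bag is assumed equally likely to be the one taken.
prior = Fraction(1, len(white_in_bag))

def chance_all_white(white):
    """Probability that all drawn balls are white, drawing without replacement."""
    return Fraction(comb(white, drawn), comb(balls_per_bag, drawn))

# Probability of taking a given bag and then drawing three white balls.
joint = {bag: prior * chance_all_white(w) for bag, w in white_in_bag.items()}

# Probability of drawing three white balls from whichever bag was taken.
total = sum(joint.values())

# Inverse probability: each bag's share of that total.
posterior = {bag: value / total for bag, value in joint.items()}

print(total, posterior["third"])
```

The values 1/10 and 4/10 in the denominator are the chances of drawing three white balls from the first and second bags respectively.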

Bernoulli's law of large numbers

Let us illustrate this law by the case of tossing a coin. Suppose you toss a coin n times, and that the probability of heads on each toss is 1/2. What is the probability that the coin lands on its head m times and on its tail n-m times? Since this may happen in various orders, we may take one of these orders, with heads m times and tails the other n-m times, and calculate its probability thus: (1/2)^m x (1 - 1/2)^(n-m)

Further, we have to know the number of ways in which m heads can be placed among n tosses; this is given by:

[n x (n-1) x (n-2) x ... x (n-(m-1))] / [m x (m-1) x ... x 2 x 1]

The required probability is the product of these two quantities, and its value can be calculated once the values of the variables n and m are given.
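As a sketch of this calculation, the probability of one particular order can be multiplied by the number of orders. The function names and the sample values n = 10, m = 3 are assumptions chosen for illustration.

```python
from fractions import Fraction

def number_of_orders(n, m):
    """n x (n-1) x ... x (n-(m-1)) divided by m x (m-1) x ... x 2 x 1."""
    numerator = 1
    for k in range(m):
        numerator *= n - k
    denominator = 1
    for k in range(1, m + 1):
        denominator *= k
    return numerator // denominator

def prob_heads(n, m, p=Fraction(1, 2)):
    """Probability of exactly m heads in n tosses, with p the chance of heads."""
    one_order = p**m * (1 - p)**(n - m)
    return number_of_orders(n, m) * one_order

n, m = 10, 3
print(number_of_orders(n, m), prob_heads(n, m))
```

Summing prob_heads(n, m) over every m from 0 to n gives 1, since exactly one of these numbers of heads must occur.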

Now, if 'e' stands for an event and 'not-e' for its absence, we want to know the number of times the event most probably occurs in n trials. Suppose that we know the probability of the occurrence of e on a single trial, and call this value q.

Bernoulli's equations give us the solution. Let r be a certain number of occurrences in n trials, and let pe(r) be the probability that the event occurs exactly r times. First we form the fraction [pe(r+1)] / [pe(r)] and ask which is larger, numerator or denominator: is the fraction equal to 1, less than 1, or more than 1? We know this when we get the value of the fraction, which is given by the formula [(n-r)/(r+1)] x [q/(1-q)]. If pe(r+1) is larger than pe(r), the fraction is larger than 1, that is, 1 is smaller than [(n-r)/(r+1)] x [q/(1-q)]. We then find that such values of r are always smaller than n x q - (1-q), that is, than the total number of trials multiplied by the probability of the occurrence of the event, minus the probability of its absence.

For if r were equal to that expression, 1 would be equal to [(n-r)/(r+1)] x [q/(1-q)]. The quantity n x q - (1-q), the number of trials x the probability of the event - the probability of its absence, is called the limit: for any number of occurrences r below it, pe(r+1) is larger than pe(r).
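The behaviour of the fraction pe(r+1)/pe(r) can be examined numerically. The sketch below takes pe(r) to be the probability of exactly r occurrences in n independent trials and uses the ratio [(n-r)/(r+1)] x [q/(1-q)]; the values n = 20 and q = 3/10 are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb

def pe(r, n, q):
    """Probability that the event occurs exactly r times in n trials."""
    return comb(n, r) * q**r * (1 - q)**(n - r)

def ratio(r, n, q):
    """The fraction pe(r+1)/pe(r) in closed form."""
    return Fraction(n - r, r + 1) * q / (1 - q)

n, q = 20, Fraction(3, 10)     # assumed values for illustration
limit = n * q - (1 - q)        # n x q - (1 - q)

# Values of r for which one more occurrence is more probable than r.
rising = [r for r in range(n) if pe(r + 1, n, q) > pe(r, n, q)]

# The number of occurrences with the greatest probability.
most_probable = max(range(n + 1), key=lambda r: pe(r, n, q))

print(limit, rising, most_probable)
```

With these values the probabilities rise for every r below the limit and fall afterwards, so the most probable number of occurrences is the first whole number above the limit.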

Notes:

[11] Russell, B., Human Knowledge, pt. V, ch. 2, p. 363.
