Chapter 4. Modern Theories of Probability
We explained in the previous chapter the deductive phase of induction in detail, and argued that this phase need presuppose nothing except the axioms of the theory of probability as we interpreted it. Now, the pioneers of the theory of probability seem to have taken the same course as we did, with the difference that they defended the deductive phase of induction without presupposing anything concerning causality.
Laplace is one such pioneer. In what follows we shall discuss Laplace's position and then compare it with ours. To begin with, recall the example of the bags mentioned before. Suppose we have three bags a, c and d before us, each containing five balls: bag a has three white balls, bag c has four white balls, and all five balls in bag d are white.
Suppose we picked out one of these bags at random, drew three balls from it and found them white. Now, the bag in question is most likely d, for the a priori probability that it is d (before drawing the three balls) is 1/3, while its a posteriori value is 10/15. There is only one way of drawing three white balls if the bag is a, since a has only three white balls; four ways if the bag is c; and ten ways if it is d. Thus we have fifteen forms, of which we know indefinitely that one is fulfilled. Ten of these forms concern the bag d; consequently, the a posteriori probability is 10/15, i.e. 2/3.
The probability that the next ball to be drawn is white is 4/5, because two balls remain in the bag after the three are drawn. Since the next draw may take either of them, we have two possibilities which, multiplied by the fifteen forms referred to, give us thirty forms. These constitute a set of indefinite knowledge, twenty-four of whose members involve that the next ball will be white; thus the value is 24/30 = 4/5. If m denotes the number of white balls drawn and n the whole number of balls in the bag, Laplace arrived at the following two equations:
(1) the chance that all the balls in the bag are white = (m+1)/(n+1)
(2) the chance that the next ball to be drawn is white = (m+1)/(m+2)
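The arithmetic of the three-bag example can be checked by direct enumeration. The following Python sketch is our illustration, not part of Laplace's text; it counts the fifteen forms, the twenty-four favourable cases for the next draw, and confirms both of Laplace's equations for m = 3, n = 5:

```python
from fractions import Fraction
from math import comb

# Three bags, five balls each: a has 3 white balls, c has 4, d has 5.
bags = {"a": 3, "c": 4, "d": 5}

# Forms: ways of drawing 3 white balls out of each bag.
forms = {name: comb(w, 3) for name, w in bags.items()}
total = sum(forms.values())                   # 1 + 4 + 10 = 15 forms

posterior_d = Fraction(forms["d"], total)     # 10/15
print(posterior_d)                            # 2/3

# Next (fourth) ball: each form leaves 2 balls, of which (w - 3) are white.
white_next = sum(comb(w, 3) * (w - 3) for w in bags.values())
p_next = Fraction(white_next, 2 * total)      # 24/30
print(p_next)                                 # 4/5

# Laplace's two equations with m = 3 white balls drawn, n = 5 balls in all:
m, n = 3, 5
assert posterior_d == Fraction(m + 1, n + 1)  # (m+1)/(n+1) = 4/6 = 2/3
assert p_next == Fraction(m + 1, m + 2)       # (m+1)/(m+2) = 4/5
```

The counts 1, 4 and 10 are simply the binomial coefficients C(3,3), C(4,3) and C(5,3), which is why the fifteen forms arise.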
And that is true and consistent with our position.
But Laplace wanted to generalise those values to a single bag n containing five balls, in order to determine whether it has 3, 4 or 5 white balls. We get three probabilities: (i) n may be similar to a, i.e. the bag has three white balls only; (ii) n may be similar to c, i.e. the bag has four white balls; (iii) n may be similar to d, i.e. all the balls in the bag are white.
Laplace assumed that these three probabilities have equal chances, so that the value in each case is 1/3; and under each supposition there are ten possible ways of drawing three balls out of the five.
Now, if we draw three white balls, one of those forms is fulfilled, and since ten of the fifteen surviving forms favour the bag d, the chance that the bag contains only white balls is 10/15 = 2/3, and the chance that the next ball to be drawn is white is 12/15, i.e. 4/5. And what applies to the bag n applies to all processes of induction. Therefore Laplace explained inductive inference on the basis of probability theory, determining the value of the probability of generalisation on an inductive basis as (m+1)/(n+1), and the chance that the next instance has the same quality as (m+1)/(m+2).
In his Positivistic Logic, Dr. Zaki Naguib regarded Laplace's second equation as a ground for determining the chances that an event will be repeated. Yet he does not expound the equation in mathematical form, but founds it on an assumption not explicitly proved: "Suppose that a given event never occurred in the past, and that the chance of its occurrence is equal to that of its non-occurrence; then the value is 1/2. Suppose that it happened once; then the chance that it will happen again is (1+1)/(1+2) = 2/3.
Then the equal probabilities become three, one of which is positive, the second also positive and the third negative. That is, we have two chances that the event will occur and one chance that it will not. In general, if an event occurred m times, this gives us m chances of its occurrence, and adds two chances, one of which might occur and the other might not."
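The values cited in this passage are instances of Laplace's second equation, the so-called rule of succession. A short Python sketch of our own makes the pattern explicit:

```python
from fractions import Fraction

def succession(m: int) -> Fraction:
    """Laplace's rule of succession: the chance of one more
    occurrence after m observed occurrences, (m+1)/(m+2)."""
    return Fraction(m + 1, m + 2)

print(succession(0))  # 1/2 -- never occurred: occurrence and non-occurrence equal
print(succession(1))  # 2/3 -- occurred once: two chances for, one against
print(succession(4))  # 5/6 -- m occurrences give m+1 chances out of m+2
```

Each new occurrence adds one favourable chance to the numerator and one chance in all to the denominator, exactly as the quotation describes.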
This quotation clearly assumes that the occurrence of an event more than once favours its occurring once more, in accordance with Laplace's equation, though it does not justify that assumption. On the other hand, in our exposition of Laplace's theory we have found an interpretation of the deductive phase of induction different from ours. What is important in Laplace's interpretation is that it dispenses with any axioms save those of probability theory itself; further, it dispenses even with the assumption of the causal principle.
Difficulties of Laplace's theory
Laplace's theory involves some difficulties. First, what justification is there for supposing equal chances that the bag n is similar to the bags a, c and d? That is, why do we assume that the supposition that the bag n, containing five balls, has three white balls only is equal to the supposition that it has four or five white ones? The second difficulty is this: what justification has Laplace for increasing the probability that all the balls in n are white? For such justification depends on finding a hypothetical indefinite knowledge which explains the increasing chance that all are white, and Laplace's theory has no conception of such indefinite knowledge. The third difficulty: how do we explain the generalisation of inductive inference on the basis of Laplace's theory?
Let us discuss the second difficulty first. In our own solution to this difficulty, we suggested that there are two sorts of indefinite knowledge. The first sort is the knowledge that n's five balls include three or four or five white balls.
The other sort is the knowledge that drawing three balls from the bag n takes one of the ten possible forms of drawing three out of five. The first indefinite knowledge thus includes three members, while the second includes ten. When the two sorts are multiplied we get thirty forms: ten represent the ways of drawing three out of five supposing that the bag n is similar to the bag a, a second ten represent the forms supposing that n is similar to c, and the remaining ten represent the forms supposing the similarity between n and d. When we draw three white balls, nine out of ten forms disappear in the first case, six out of ten in the second case, and none in the third case. Therefore, by means of multiplication, we get fifteen forms out of thirty, ten of which favour the similarity between n and d; the value then becomes 10/15, i.e. 2/3, or (m+1)/(n+1).
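The multiplication of the two sorts of indefinite knowledge can be enumerated mechanically. In this Python sketch (ours, for checking the counts only) we label the balls 0 to 4 and, under the hypothesis that the bag has w white balls, take balls 0 to w−1 as the white ones:

```python
from fractions import Fraction
from itertools import combinations

draws = list(combinations(range(5), 3))  # the ten forms of drawing 3 out of 5
hypotheses = [3, 4, 5]                   # possible numbers of white balls in bag n

# 3 x 10 = 30 forms; keep those in which all three drawn balls are white
# (under hypothesis w, the white balls are those with index < w).
surviving = [(w, d) for w in hypotheses for d in draws if max(d) < w]
print(len(surviving))                            # 15 forms survive

favour_d = [f for f in surviving if f[0] == 5]   # bag similar to d (all white)
print(Fraction(len(favour_d), len(surviving)))   # 10/15 = 2/3
```

Under the three-white hypothesis only one draw survives, under the four-white hypothesis four survive, and under the all-white hypothesis all ten survive, giving the fifteen forms of the text.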
Such a deductive structure depends on the assumption of the two sorts of indefinite knowledge, but something is wrong with it: having drawn three balls from the bag n, we no longer have the second sort of indefinite knowledge, that is, indefinite knowledge over the ten forms of drawing three balls out of five. In fact, we have definite knowledge of just one of these forms. This shows that we do not get fifteen probable forms after drawing three white balls, as Laplace supposed.
Such is the main difference between choosing the single bag n and randomly choosing one of the bags a, c and d. In the latter case we already know that a includes three white balls only, that c includes four white balls only, and that the balls in d are all white. If we randomly choose any of those three bags and draw three white balls, then, supposing that the bag in question is d, we get fifteen probabilities, and the value would be 10/15.
What has been said does not apply to the choice of the bag n, which includes five balls. Here we get not fifteen forms but one form. Thus, in the case of choosing the bag n we do not have any indefinite knowledge on the ground of which we could explain the increasing chance that all the balls are white, and this refutes Laplace's equation for determining the a posteriori probability of generalisation.
Further, on reflection, we may discover in the case of the bag n a hypothetical indefinite knowledge, but it does not satisfy Laplace's purpose. When we draw three balls from the bag n and they are seen to be white, we cannot obtain an indefinite knowledge which informs us of the increased value for generalisation; but we can discover a hypothetical knowledge expressed in the following way: if the bag n includes at least one black ball, it is either the ball drawn first, or the second drawn, or the third, or the fourth or the fifth ball, neither of which is yet drawn. This statement involves five probable hypothetical statements, in three of which the consequent cannot be factually given, since the balls drawn are not black.
Thus we get three probable statements asserting the absence of any black ball, which means that the probability that all n's balls are white is 3/5. This value differs from the a posteriori probability determined by Laplace, i.e. 4/6, for 3/5 is smaller than 4/6. Hence such hypothetical knowledge does not fulfil Laplace's end, because it does not justify the value he assumed.
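The tally behind the value 3/5 can be written out; this brief Python sketch is our illustration of the five hypothetical statements, three of which are refuted by the white draws:

```python
from fractions import Fraction

# If bag n contains at least one black ball, it occupies one of five
# positions: 1st, 2nd or 3rd drawn, or one of the two balls not yet drawn.
positions = ["drawn 1st", "drawn 2nd", "drawn 3rd", "undrawn 4th", "undrawn 5th"]

# The three drawn balls were white, so the three "drawn" statements are refuted,
# and each refutation counts against there being any black ball at all.
refuted = [p for p in positions if p.startswith("drawn")]
p_all_white = Fraction(len(refuted), len(positions))
print(p_all_white)   # 3/5 -- smaller than Laplace's 4/6
```

The value is smaller than Laplace's 2/3 precisely because only five statements, not fifteen forms, are available here.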
If we acknowledged hypothetical knowledge as a basis of calculating probability, we could fulfil Laplace's end in another way, giving the deductive step of inductive inference a mathematical interpretation without recourse to any postulate of causality; the new equation for the a posteriori probability of generalisation would then be m/n instead of Laplace's (m+1)/(n+1) (m denoting the number of individuals examined, n the whole number of individuals concerned). But such hypothetical knowledge cannot be a ground of increasing probability, because its consequent is factually undetermined. Here we conclude our discussion of the second difficulty facing Laplace, having discovered the main mistake underlying his theory.
We now turn to the first difficulty facing Laplace: what justifies assuming equality among the three probabilities, that the bag n has three white balls among the five, that it has four, and that it has five? We have already remarked the difference between the supposition of the single bag n and that of the three bags a, c and d, one of which we take at random. In the latter case the three probabilities are equal, while in the former we do not have three bags but only one, in which we do not know the number of white balls. Now, if we have no previous knowledge of how many white and black balls the bag n includes, then the chance that any given ball is white is 1/2, and the chance that it is black is also 1/2. Thus, the three drawn balls being white, the probability that the two undetermined balls are both white, i.e. that n is similar to d, is 1/2 × 1/2 = 1/4, while the probability that exactly one of them is white, i.e. that n is similar to c, is (1/2 × 1/2) + (1/2 × 1/2) = 1/2.
There is, then, no justification for the equality of the three probabilities; hence we do not obtain fifteen equiprobable forms as Laplace suggested.
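On the reading just given, the priors over the three compositions come out as 1/4, 1/2 and 1/4 rather than equal thirds. A Python sketch of our own, treating each of the two undetermined balls as white or black with chance 1/2:

```python
from fractions import Fraction
from itertools import product

# Three balls are already drawn white; each of the two undetermined balls
# is white or black with an equal chance of 1/2.
priors = {3: Fraction(0), 4: Fraction(0), 5: Fraction(0)}
for pair in product(["white", "black"], repeat=2):
    whites = 3 + pair.count("white")           # total white balls in the bag
    priors[whites] += Fraction(1, 2) * Fraction(1, 2)

for w in sorted(priors):
    print(w, "white:", priors[w])   # 3 white: 1/4, 4 white: 1/2, 5 white: 1/4
```

The middle composition is twice as probable because it can arise in two ways, which is exactly why the three suppositions cannot be treated as equiprobable.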
The third difficulty facing Laplace is this. First, his equation cannot determine the value of the a posteriori probability of generalisation if n denotes an infinite class, because the denominator of the fraction (m+1)/(n+1) is then infinite, and it is impossible to determine the ratio of a finite number to an infinite number. Secondly, if n denotes a finite class with a great number of members, we cannot obtain a high probability of generalisation, because the ratio of the members under examination to the total number would be very low.
But our interpretation of probability, hitherto given, supplies a definite value for the probability of generalisation after a small number of successful experiments. For the value of a posteriori probability always expresses a certain ratio to the total possible forms for the occurrence of an event or its absence, and such total is always theoretically and factually definite in quantity.
Keynes and Induction
Keynes tried his best to establish induction on purely mathematical lines, by deducing the value of the a posteriori probability of generalisation from the laws of the probability calculus, as Laplace had done.
Keynes supposes that an inductive generalisation has a definite probability before any inductive process. Let p be the value of this a priori probability. As we obtain favourable instances of the generalisation, its probability rises: given the first instance, the probability becomes p together with the first instance, denoted p1; given the first two instances, it becomes p together with those two instances, denoted p2; and after n instances, the probability of the generalisation is pn.
Suppose we want to know whether pn continually approaches 1 (the number of certainty) as n increases. We may know this by determining the value of the probability of the n instances on the supposition that the generalisation is false; let this value be Kn. When Kn approaches zero as n increases, pn approaches 1 with the increase of n. The value of Kn can be determined by multiplying the probability of the first instance occurring, supposing the generalisation false, by the probability of the second instance occurring, and so on.
Suppose n consists of four instances, for example, and we denote the values of the probabilities by K1, K2, K3, K4; we can then say that Kn = K1 × K2 × K3 × K4.
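Keynes' product can be sketched in a few lines of Python. The per-instance values below are purely hypothetical, chosen only to show how the product shrinks as instances accumulate:

```python
from fractions import Fraction

def k_n(instance_probs):
    """Kn: the probability of all n favourable instances occurring on the
    supposition that the generalisation is false, i.e. K1 x K2 x ... x Kn."""
    out = Fraction(1)
    for k in instance_probs:
        out *= k
    return out

# Hypothetical per-instance values, assumed only for illustration.
ks = [Fraction(1, 2)] * 4
print(k_n(ks))   # 1/16 -- as Kn shrinks toward 0, pn moves toward 1
```

Each further favourable instance multiplies Kn by a fraction below 1, which is the mechanism by which pn is driven towards certainty.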
Difficulties of Keynes' Interpretation
Keynes' task is to give the a priori probability of the generalisation a definite value which continually moves towards certainty as the instances increase, Kn moving towards zero in the same degree. Suppose we have a generalised statement, e.g. 'all metals extend by heat', and that the probability that every metal extends by heat has some value before any inductive process is made; then through induction we confirm the generalisation, and this process brings us nearer to certainty. Keynes makes two points: first, that determining the value of the a priori probability of the generalisation is a necessary condition of explaining inductive inference and its role in raising the generalisation to a greater degree of probability; secondly, that there are two probabilities, pn and Kn, and the more Kn moves towards zero the more pn approaches the number 1.
Take the first point first. In the light of our own interpretation of the deductive phase of induction, the necessary condition of applying the general method determining that phase is to give the inductive conclusion an a priori probability assumed by the first kind of indefinite knowledge, such that its value does not exceed the value of the probability of negating that conclusion assumed by the second kind of indefinite knowledge. When this condition is fulfilled, inductive inference is workable in its deductive phase. But what is the degree of the a priori probability of an inductive conclusion? If the conclusion expresses a causal relation between two terms, the degree of its probability is determined by the number of things thought to be possible causes. If the conclusion expresses, not a causal law, but a uniform conjunction between a and b by chance, then what determines the value of its a priori probability is the indefinite knowledge consisting of the set of probabilities of a's and b's.
Two attempts have been made by Russell. Russell first suggested a position agreeing with Keynes in determining the value of the a priori probability of the generalisation, the generalisation being regarded as mere uniform conjunction. Let us suppose that the number of things in the universe is finite, say N. Let B be a class of n things, and let A be a class of m things chosen at random out of the N things. Then the number of possible A's is N!/[m!(N−m)!], and the number of these that are contained in B is n!/[m!(n−m)!].
Therefore the chance of 'all A's are B's' is [n!(N−m)!]/[N!(n−m)!], which is finite. That is to say, every generalisation as to which we have no evidence has a finite chance of being true. But this attempt is futile: first, knowing the number of things in the universe is practically impossible even if we accepted its finitude; and secondly, the number of things in the universe is immensely vast, and the greater N is, the smaller is the a priori probability.
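Russell's fraction is simply a ratio of binomial coefficients. The following Python sketch (our illustration, with arbitrary example values for N, n and m) computes it and shows that enlarging N shrinks the chance while leaving it finite:

```python
from fractions import Fraction
from math import comb

def chance_all_A_are_B(N: int, n: int, m: int) -> Fraction:
    """Chance that a random m-membered class A, drawn from a universe of
    N things, is wholly contained in a fixed n-membered class B.
    Equals n!(N-m)! / ((n-m)! N!)."""
    return Fraction(comb(n, m), comb(N, m))

print(chance_all_A_are_B(10, 6, 3))    # 1/6
print(chance_all_A_are_B(100, 6, 3))   # far smaller, but still finite
```

The numerator counts the m-membered selections lying inside B, the denominator all m-membered selections from the universe; dividing out the common m! gives Russell's form of the fraction.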
Russell offered another attempt, again regarding the generalisation as mere uniform conjunction. He suggested, first, that when we consider whether metals are extended by heat, the a priori probability is 1/2, because before any inductive process we have two equal probabilities (that they extend and that they do not). He suggested, secondly, that we deduce the value of the probability of 'the extension of all metals by heat' from the a priori probability for any single piece of metal: when the latter probability is 1/2, the probability that n pieces of metal extend is 1/2^n, which is clearly a definite value. This way of determining the value of the a priori probability of the generalisation is the one we adopted in a previous chapter.
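Russell's second attempt amounts to multiplying independent halves. A minimal Python sketch of ours:

```python
from fractions import Fraction

def prior_of_generalisation(n: int) -> Fraction:
    """A priori chance that all n instances have the property, each
    instance having an independent a priori chance of 1/2."""
    return Fraction(1, 2) ** n

print(prior_of_generalisation(1))    # 1/2
print(prior_of_generalisation(3))    # 1/8
print(prior_of_generalisation(10))   # 1/1024 -- small, but definite
```

However large n is, the value 1/2^n remains a determinate finite fraction, which is the point of the attempt.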
We come now to the second difficulty facing Keynes' theory. Keynes supposed in any case two probabilities, pn and Kn, such that the former approaches the number 1 as the latter moves to zero. That is, the less probable the occurrence of the n favourable instances is on the supposition that the generalisation is false, the higher the probability of the generalisation becomes. In fact, this is a deduction of the value of the a posteriori probability of the generalisation from hypothetical indefinite knowledge. Suppose we have the generalisation that all a is b; suppose again that a has six members, and that by experiment we found the first four members of a to be b. Let these be n, let the probability of the generalisation be pn, and let the probability that four members of a are b, supposing the generalisation false, be Kn. Then we can form a hypothetical indefinite knowledge, and such knowledge involves that the value of Kn, after four experiments, is 2/5. If n increases, Kn decreases to 1/6.
But we have already shown that this sort of hypothetical indefinite knowledge cannot be a ground of determining probability, because its consequent is not factually given. Therefore the value of Kn cannot be determined by such hypothetical indefinite knowledge.
From what has gone before concerning the deductive phase of induction, we have maintained that the necessary condition of this deductive phase is that there be no a priori justification for refuting the rationalistic theory of causality. We now consider this condition by discussing the justifications offered for refuting causality. These justifications may be classified into four: logical, philosophical, scientific and practical.
The logical justification for refuting the principle of causality rests partly on certain claims provided by logical positivism, which maintains that the meaning of any proposition is the way of its verification: a proposition is meaningful if we can affirm or deny it within experience, otherwise it is meaningless. Now, the proposition 'all a are followed by b' is meaningful, because it is possible to find out its truth or falsity through observation. But the proposition 'a is necessarily connected with b' is different, because necessity adds nothing to mere conjunction within experience; experience does not enable us to know the truth or falsity of that proposition, so positivism asserts that propositions of this kind have no meaning.
We shall later argue that logical positivism is mistaken in its conception of meaningful propositions.
Empiricism, as an epistemological doctrine, claims that sensible experience is the main and only source of human knowledge, and denies that knowledge has any other source. Empiricism, thus understood, holds certain views which may be regarded as a philosophical justification for refuting causality as necessary connection. However, empiricists differ from logical positivists: the latter maintain, as has just been said, that a proposition which we are unable to confirm or refute by experience is meaningless, whereas classical empiricism admits that such a proposition is logically meaningful, because its meaning and its truth are not identical. Empiricism is satisfied to say that we cannot accept as true those propositions which cannot be empirically verified.
Now, empiricism maintains that causality, as involving necessity, cannot be known to be true through experience, because experience shows us the cause and the effect but not the necessity involved in their connection. That is the point made by Hume, who explained causality as mere concomitance or uniform succession between two events. Such an empiricist view has dominated modern thought concerning causality: instead of regarding it as a necessary relation, we come to consider it as expressing uniform succession among phenomena.
In fact, we cannot refuse the empiricist view of causality insofar as it makes experience the criterion of causal relation. But empiricism does not emphatically deny the necessity involved in causal relation rationalistically considered; it rather implies that such necessity can neither be proved nor denied through experience, so that a proposition asserting necessary connection between events is logically probable. And such probability is all that induction needs in order to explain its deductive phase: inductive inference starts from the probability of a relation of necessity between a and b. Therefore induction finds the ground of its generalisations under both rationalism and empiricism.
Some scientists have claimed that the principle of causality, involving determinism and necessity, does not apply to the atomic world. But we cannot reject the causal principle merely because we are unable to find a causal interpretation of the behaviour of atoms; we can only say that our actual experiments do not yet show a definite cause of certain phenomena. At best, this may give rise to doubt about the causal interpretation, and such doubt still involves the probability of the truth of the principle, which is all that inductive inference needs as a postulate. Further, even if physics concluded that the behaviour of atoms has no cause, it would remain probable that causality applies to macrophysical bodies.
There is one argument left to justify moving from causality (involving determinism and necessity) to causality in the sense of mere conjunction among phenomena. This justification Lord Russell states as follows.
Suppose we have a common sense generalisation that A causes B, e.g. that acorns cause oaks. If there is any finite interval of time between A and B, something may happen during this time to prevent B; for example, pigs may eat the acorns. We cannot take account of all the infinite complexity of the world, and we cannot tell, except through previous causal knowledge, which among possible circumstances would prevent B. Our law therefore becomes: "A will cause B if nothing happens to prevent B". Or more simply: "A will cause B unless it does not". This is a poor sort of law, and not very useful as a basis for scientific knowledge.
Now, it is reasonable to offer statistical uniformities instead of rational causality. Instead of saying "A causes B unless it does not", we can say: A is succeeded by B in, say, fifty cases out of a hundred. Thus we reach a useful law. Nevertheless, all this does not prevent us from talking in causal terms on the basis of our ignorance. If we were able to know all the things that may prevent A from causing B, we could formulate the causal principle in a more precise hypothetical statement; but that is beyond our reach. For these reasons we may inquire statistically into the chances according to which A causes B, and say, for instance, that A is succeeded by B in twenty cases out of a hundred, if that is what our experiments have shown. But what we are then making is a generalisation, which itself needs the assumption of causality; otherwise we fall back on absolute chance, and that cannot be a basis of any sort of generalisation.
We may now conclude that statistical laws are not inconsistent with the assumption of causality, because any statistical law expresses a certain ratio of frequency and generalises it, and such generalisation presupposes the a priori assumption of causality, even in probabilistic terms.
Another Form of Deductive Phase
We have hitherto been considering the deductive phase on the supposition that there is an event B whose cause A we inquire into. Now we want to ask about the very existence of A. This inquiry takes the following form.
Inductive inference may determine the value of the probability of the existence of A on the basis of an indefinite knowledge which increases that probability. Suppose we say that B has two causes, one of which is A, the other C; and suppose that A is a single event whereas C denotes a complex of three determinate events d, e and f. When B occurs, we have an indefinite knowledge that either A or C has occurred; thus we may fix the a priori probability of the existence of A at 1/2.
But suppose the probability of the occurrence of A equals that of any of d, e or f; then we get a different indefinite knowledge, which includes the probabilities of the last three events. Such knowledge involves eight probabilities, one of which is the occurrence of all three events, while the other seven involve the absence of at least one of them. These seven probabilities imply the occurrence of A, because they presuppose the absence of C (the complex of d, e and f), and since B has occurred, A must have occurred; the eighth probability itself leaves A a half chance. Therefore the value of the probability of the occurrence of A is 7.5/8 = 15/16. Here we notice that the two kinds of indefinite knowledge differ, each determining a different value for the probability of the occurrence of A. We would obtain the true value by applying the dominance axiom; but in the present case we do not need such application.
We need instead to apply the multiplication axiom. Multiplying the two kinds of knowledge we get 16 forms, seven of which are impossible (those which assume the occurrence of C without d, e and f all occurring); there remain nine forms, eight of which favour A and two favour C (one of these two being common with the eight favouring A). Therefore the value of the probability of A is 8/9. We may conclude that the ground of this probability is an indefinite knowledge which raises the probability of A, with the help of the multiplication axiom.
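The sixteen forms and the value 8/9 can be enumerated directly. The following Python sketch is our reconstruction of the counting, on the assumption (as in the text) that C occurs exactly when d, e and f all occur:

```python
from fractions import Fraction
from itertools import product

# First knowledge: the cause of B is A or C; second knowledge: the eight
# combinations of the three events d, e, f.
forms = [(cause, combo)
         for cause in ("A", "C")
         for combo in product([True, False], repeat=3)
         if not (cause == "C" and not all(combo))]  # drop the 7 impossible forms

print(len(forms))                             # 9 forms remain out of 16
favour_a = [f for f in forms if f[0] == "A"]
print(Fraction(len(favour_a), len(forms)))    # 8/9
```

All eight combinations survive when A is the cause, but only the all-three combination survives when C is, which is why eight of the nine remaining forms favour A.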
Requirements of the deductive phase
It is clear from what has gone before that inductive inference involves generalisation, that is, 'all A are succeeded by B', by virtue of strengthening the probability of causality. Such probability results from the probability that there is no cause of B other than A in the first experiment, plus the non-occurrence of a cause of B other than A in the second experiment, plus [...] until we reach the final experiment. Each probability indicates that A causes B, and thus the proposition acquires a higher probability. But this grouping of probability values for the proposition 'A causes B' depends on the condition that a particular a causing a particular b implies the causality of all a's. Such dependence has its justification, because we have already shown that causality is a necessary relation between two terms.
Consequently, for induction in its deductive phase to be performed, it is necessary that the several experiments involve many a's between which there is a real unity, not mere grouping. Hence there is another condition of the generalisation, namely, that there should be no essential difference between the particular instances of the causal relation. For example, if you choose at random an individual person from every country in the world and notice that some of these persons are white, you cannot generalise that all of them are white, because this group is arbitrarily chosen and does not have an essential unity. By contrast, if you choose a normal sample of negroes and notice that some of them are black, you can inductively generalise that all negroes are black, because all have a common property.
Successful induction depends on considering natural unities or common characteristics. But this shows that induction involves causality, on rationalistic lines, namely, there being a necessary connection between its terms.
Induction and formal logic
We may now consider the view that evidence for inductive generalisation does not make induction a logically valid inference; this view claims that inductive conclusions are not logically necessary. Here we refer to some examples of invalid inductions given by Lord Russell. He classified these examples into two classes, namely those belonging to arithmetic and those belonging to physics. As concerns the former, it is easy to produce premises that give true conclusions and others that give false ones. Given the numbers 5, 15, 35, 45, 65, 95, we notice that each number ends in 5 and is divisible by 5. This suggests that every number ending in 5 is divisible by 5, and that is true.
But if you take some numbers ending in 7, such as 7, 17, 37, 47, 67, 97, they falsely suggest, by analogy, that every number ending in 7 is divisible by 7. Further, when we say 'there is no number less than n which is divisible by n', we can take n as large as we like; thus we may be led to the false generalisation that no number is divisible by n. Likewise, we can get false inductions in physics when we generalise from a small number of instances.
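The contrast between the two suggested rules is a matter of plain arithmetic, as this short Python check of ours shows:

```python
# Numbers ending in 5 are all divisible by 5 -- the suggested rule holds:
fives = [5, 15, 35, 45, 65, 95]
print(all(x % 5 == 0 for x in fives))      # True

# Numbers ending in 7 suggest the analogous rule, but it is false:
sevens = [7, 17, 37, 47, 67, 97]
print([x for x in sevens if x % 7 != 0])   # [17, 37, 47, 67, 97]
```

Only 7 itself is divisible by 7; every other listed number refutes the analogical generalisation.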
But it can easily be shown that false generalisations arise from a failure to fulfil the conditions of induction. In arithmetic, when we make n as large as we like, as in the previous example, and find that no number smaller than n is divisible by n, we cannot generalise this property to all numbers, because the instances examined have a common property, namely being smaller than n. If we neglect this condition of inductive generalisation, we get false generalisations. Suppose, on the other hand, we take a series of numbers ending in 5, such as 15, 35, 45, 65, 95. Here nothing distinguishes this series from any other series of numbers ending in 5: however large such numbers are, they are still divisible by 5. Here we may also distinguish induction in arithmetic from mathematical induction, which gives us general laws for all integers. In the latter we have the following steps: proving that the law holds for the smallest number, then proving that if it holds for a number n it holds for the number n+1. This sort of induction is always valid and gives us general mathematical laws, but it does not concern the induction considered in this book.
Let us look at the supposedly false generalisations in physics. Suppose we say 'no man whom I now know has died', and then say 'all living men are immortal'. That is a false generalisation, because immortality is understood to mean continuous living beyond human limits, and thus it does not apply to men; therefore we cannot infer that men are immortal.
Further, Russell and other logicians have remarked that induction is not valid unless it satisfies certain conditions which give rise to successful generalisations.
Thus those logicians classify induction into particular and general. Suppose we have before us two classes of things, A and B, and we want to know whether a member of A is also a member of B. If by observation we find that all observed a's are b's, then particular induction concludes that another member of A, yet unobserved, is a member of B, while general induction concludes that all A's are B's. Russell remarks that in particular induction it is necessary to find an instance verifying our previous observations, while in general induction we should show that all A's are B's, not merely that some A's are B's.