---------Ronald N. Giere---------

Testing Theoretical Hypotheses

1. Introduction

Philosophers of science concerned with theories and the nature of evidence tend currently to fall into several only partially overlapping groups. One group follows its logical empiricist ancestors at least to the extent of believing that there is a "logic" in the relation between theories and evidence. This logic is now most often embedded in the theory of a rational (scientific) agent. Bayesian agents are currently most popular, but there are notable dissenters from The Bayesian Way such as Henry Kyburg and Isaac Levi.

Another group derives its inspiration from the historical criticisms of logical empiricism begun a generation ago by such writers as Gerd Buchdahl, Paul Feyerabend, N. R. Hanson, Thomas Kuhn, and Stephen Toulmin. Partly because their roots tend to be in intellectual history, and partly in reaction to logical empiricism, this group emphasizes the evolution of scientific ideas and downplays the role of empirical data in the development of science. For these thinkers, the rationality of science is to be found in the historical process of science rather than in the (idealized) minds of scientists.

If there is something that can rightfully be called a middle group, it consists mainly of the followers of the late Imre Lakatos, who skillfully blended Popper's version of empiricism with elements of Kuhn's account of scientific development. Yet Lakatos's "methodology of scientific research programmes" also locates the ultimate rationality of science in a larger historical process rather than in relations between particular hypotheses and particular bits of data.

I shall be arguing for a theory of science in which the driving rational force of the scientific process is located in the testing of highly specific theoretical models against empirical data. This is not to deny that there are elements of rationality throughout the scientific enterprise. Indeed, it is only as part of an overall theory of science that one can fully comprehend what goes on in tests of individual hypotheses. Yet there is a "logic" in the parts as well as in the whole. Thus I agree with contemporary students of probability, induction, and the foundations of statistics that the individual hypothesis is a useful unit of analysis. On the other hand, I reject completely the idea that one can reduce the rationality of the scientific process to the rationality of individual agents. The rationality of science is to be found not so much in the heads of scientists as in objective features of its methods and institutions.

In this paper I shall not attempt even to outline an overall theory of science. Rather, I shall concentrate on clarifying the nature of tests of individual hypotheses, bringing in further elements of a broader theory of science only when necessary to advance this narrower objective. My account of how individual hypotheses are tested is not entirely new. Indeed, it is a version of the most ancient of scientific methods, the method of hypothesis, or, the hypothetico-deductive (H-D) method. But some elements of the account are new, and some have been borrowed from other contexts.

The author's research has been supported in part by a grant from the National Science Foundation.

2. Models, Hypotheses, and Theories

Views about the nature of evidence and its role in science depend crucially on views about the nature of hypotheses and theories. The major divergences of current opinion in the philosophy of science are correlated with strong differences as to just what the highly honorific title "theory" should apply to. For the moment I shall avoid the term "theory" and speak of "models" and "hypotheses" instead. My use of the term "model" (or "theoretical model") is intended to capture current scientific usage, at least insofar as that usage is itself consistent. To this end, I would adopt a form of the "semantic" or definitional view of theories (hereafter, models). On this view, one creates a model by defining a type of system. For most purposes one can simply identify the model with the definition. But to avoid the consequence that rendering the definition in another language would create a different model, it is convenient to invent an abstract entity, the system defined, and call it the model. This move also preserves consistency with the logician's and mathematician's notion of a model as a set of objects that satisfies a given linguistic structure. For present purposes it will make no difference whether we focus on the definition itself or its nonlinguistic counterpart, so long as there is no presumption that in referring to "the model" one is thereby committed to there being any such thing in the empirical world.


generalization. Within this framework it is easy to regard a theory as simply a conjunction of universal generalizations. This would mean that testing a theory is just testing universal generalizations. The distinction between models and hypotheses permits a view of the goals of science that is more particularized, or at least more restricted, and therefore, I think, more applicable to the contemporary practice of science. The simplest form of a theoretical hypothesis is the claim that a particular, identifiable real system fits a given model. Though extremely limited in scope, such claims may be very complex in detail and wide-ranging in space and time. The claim that the solar system is a Newtonian particle system (together with a suitable set of initial conditions) contains the whole mechanical history of this system, so long as it has been or will be a system of the designated type. Moreover (although this is more controversial), the same hypothesis contains all the different possible histories of this system that could result from different, but physically possible, initial conditions. Thus even a very particular theoretical hypothesis may contain a tremendous amount of empirical content. Contrary to what some philosophers have claimed, one can have a science that studies but a single real system. Current geological models of the earth are not less than scientific, or scientifically uninteresting, simply because the only hypotheses employing these models refer to a single entity limited in time and space. Nor would models of natural selection be in any way scientifically suspect if there were no life anywhere else in the universe. Geology, however, is atypical. The models of a typical science are intended to apply to one or more kinds of systems, of which there are numerous, if only finitely many, instances.

What then is a "theory"? It is tempting to identify a theory with a generalized model; for example, the theory of particle mechanics with a generalized Newtonian model (i.e., one in which the number of particles is left unspecified). But most physicists would immediately reject the suggestion that "Newton's theory" is just a definition. And most scientists would react similarly concerning the theories in their fields. They think that "theories" have empirical content. This is a good reason to use the term "theory" to refer to a more or less generalized theoretical hypothesis asserting that one or more specified kinds of systems fit a given type of model.^2 This seems broad enough to encompass all the sciences, including geology and physics. Testing a theory, then, means testing a theoretical hypothesis of more or less restricted scope. This is an important qualification because the scope of a hypothesis is crucial in any judgment of the bearing of given evidence on that hypothesis.

Knowing what kind of thing we are testing, we can now turn to an analysis of empirical tests. Here I shall not be challenging, but defending, a time-honored tradition.

3. The Hypothetico-Deductive Tradition

To put things in proper perspective, it helps to recall that the hypothetico-deductive method had its origins in Greek science and philosophy. Its most successful employment, of course, was in astronomy. Recast in the above terminology, the goal of astronomy was to construct a model of the heavens that one could use to deduce the motions of the various heavenly bodies as they appear from the earth. "Saving the phenomena" was thus a necessary requirement for an acceptable hypothesis. The methodological issue, then as well as now, was whether it is also sufficient. Greek astronomers were well aware that the phenomena could be equally well saved by more than one hypothesis. This methodological fact was exemplified by the construction of both heliocentric and geocentric models. But it was also evident on general logical grounds. Every student of Aristotle's logic knew that it is possible to construct more than one valid syllogism yielding the same true conclusion, and that this could be done as easily with false premises as with true. Truth of the conclusion provides no logical ground for truth of the premises.

This obvious logical principle generated a methodological controversy that continues to this day. If two different hypotheses both saved the phenomena, there could be no logical reason to prefer one to the other. Some thinkers seemed content to regard any empirically adequate hypothesis as acceptable and did not attempt to argue that one was fundamentally better. Others, however, wished to regard one model as representing the actual structure of the heavens, and this requires some way of picking out the correct hypothesis from among those that merely save the phenomena. In the ensuing centuries of debate, the antirealists clearly had logic on their side. The realists, however, did offer several suggestions as to what, in addition to saving the phenomena, justified regarding a hypothesis as uniquely correct. Some appealed to the internal simplicity, or harmony, of the model itself. But this suggestion met the same objections it meets today. There is no objective criterion of simplicity. And there is no way to justify thinking that the simpler of two models, by whatever criterion, is


ses." The method of hypothesis was apparently thought to be almost as discredited as Cartesian physics and Ptolemaic astronomy. Inference to general laws "by induction" from the phenomena was the methodological rule of the day. Interest in the hypothetical method did not revive until the triumphs of wave theories of optics in the nineteenth century, association with scientific success being the apparent standard against which methodological principles are in fact judged. Thus by the third quarter of the nineteenth century we find such eminent methodologists as Whewell and Jevons expounding the virtues of hypotheses with explicit reference to the remarkable predictions that had been based on the wave theory of light. Whewell, for example, writes: "If we can predict new facts which we have not seen, as well as explain those which we have seen, it must be because our explanation is not a mere formula of observed facts, but a truth of a deeper kind."^7 This passage is typical of many of Whewell's writings.

Whewell's homage to the methodological virtues of successful prediction did not go unchallenged. Mill, in particular, denigrated the celebrated predictions of the wave theory as "well calculated to impress the uninformed," but found it "strange that any considerable stress should be laid upon such coincidences by persons of scientific attainments." Moreover, Mill goes on to explain why "coincidences" between "prophecies" and "what comes to pass" should not count for a hypothesis any more than simple agreement with the predicted occurrence. I shall pass over the details of his argument here.^8 Of more interest for this brief survey is that the essentials of the exchange between Whewell and Mill were repeated more than a half-century later in a similar exchange between Peirce and John Maynard Keynes.
In many of his scattered writings, Peirce advocated versions of the following "rule of prediction": "A hypothesis can only be received upon the ground of its having been verified by successful prediction."^9 Unlike his many predecessors who either lacked the necessary concepts, did not think to apply them, or did not know how, Peirce attempted to justify his rule by explicit appeal to considerations of probability. But even this appeal was not decisive. Keynes, whose own view of scientific reasoning incorporated a theory of probability, examined Peirce's arguments for the rule of prediction and concluded that "the peculiar virtue of prediction" was "altogether imaginary." Addressing the details of Keynes's argument would again take us too far afield.^10 I shall only pause to suggest that there must be methodological principles beyond a commitment to concepts of probability that separate the tradition of Huygens, Whewell, and Peirce from that of Bacon, Mill, and Keynes.

Among contemporary methodologists, the main defenders of the hypothetico-deductive method seem to be Popper and his intellectual descendants. Elie Zahar and Alan Musgrave have even advocated a special role for successful "novel predictions" in a Lakatosian research programme.^11 Yet these writers seem to me not to be the legitimate heirs of Huygens, Whewell, or Peirce. For the main stream of the hypothetico-deductive tradition, confirmation of a hypothesis through the verification of its consequences, particularly its predicted consequences, provides a reason to believe or accept that the hypothesis is true. Popper explicitly denies that there can be any such reasons. No matter how "severely tested" and "well-corroborated" a hypothesis might be, it remains a "conjecture" whose truth we have no more reason to believe than we did on the day it was first proposed. Similarly, for Lakatos or Zahar the success of a novel prediction is merely one sign of a "progressive" research program, not a sign of the truth of any particular theory or hypothesis. Only if one accepts the "problem shift" that replaces "reasons to regard as true" with the very different notions of "corroboration" or "progress" can one place these methodological suggestions firmly within the hypothetico-deductive tradition.

Similar remarks apply to those who take their methodological cues from Quine. Insofar as Quine belongs in the hypothetico-deductive tradition, it is that of the antirealists among the classical astronomers. Saving the phenomena is the main thing. Simplicity in one's hypotheses is desirable, but not because of any supposed link between simplicity and truth. Simplicity is desirable in itself or because it contributes to some pragmatic end such as economy of thought. Similarly with prediction. Hypotheses that are useful in making reliable predictions are desirable, but not because this makes them any more likely to be true. Rather, there is pragmatic value in being able to foresee the future, and we value hypotheses with this virtue without thereby ascribing to them any "truth of a deeper kind."

In championing the hypothetico-deductive method of testing scientific hypotheses, I am adopting only the "realist" tradition of Huygens, Whewell, Peirce, and, in part, Popper. I am not defending the more pragmatic or conventionalist versions represented by Quine.^12 Nevertheless, most of the following account is compatible with the subtle antirealism of van Fraassen's The Scientific Image.^13 Just how I would differ from van Fraassen will be explained later.


The way probability enters our account is through the characterization of what constitutes an "appropriate" test of a theoretical hypothesis. So far we have concluded only that a test of a hypothesis is a process whose result provides the basis for our "verdict" either that the hypothesis is true or that it is false. This general characterization, however, is satisfied by the procedure of flipping a coin and calling the hypothesis true if heads comes up and false if tails. This procedure has the virtue that our chances of reaching the correct conclusion are fifty-fifty regardless of the truth or falsity of the hypothesis. But no one would regard this as a satisfactory way of "testing" hypotheses. It does, however, suggest that an "appropriate" test would be one that has higher probabilities for leading us to the correct conclusion. We shall follow this suggestion.

Thinking about tests in this way throws new light on the classical objections to the method of hypothesis. Let us assume for the moment that our powers of deduction and observation are perfect. This will allow us to concentrate on the nature of tests themselves. At least some realists among the classical astronomers may be viewed as advocating a testing procedure that recommended calling a hypothesis true if and only if it saves the phenomena. Following this procedure, the chances of calling a hypothesis false if it is in fact true are (ideally) zero. A true hypothesis cannot have false consequences. The defect of the procedure is that the chances of calling a false hypothesis true are at best simply not known. One might even argue that this probability is high on the ground that there are many, perhaps even infinitely many, false hypotheses that would also save the phenomena. The odds seem overwhelming that the hypothesis in question is one of these.

What is needed to improve the procedure, therefore, is some way of increasing the chances that a false hypothesis will be rejected. This must be done in such a way, however, that the probability of rejecting a true hypothesis is not increased. It would be trivially easy to design a procedure guaranteed to reject false hypotheses: simply reject any proposed hypothesis, regardless of the evidence. Unfortunately this procedure is also guaranteed to reject any true hypothesis as well. The above considerations suggest characterizing an appropriate test as a procedure that has both an appropriately high probability of leading us to accept true hypotheses as true and to reject false hypotheses as false. Alternatively, an appropriate test of a hypothesis is a procedure that is reasonably unlikely to lead us either to accept a false hypothesis or to reject a true one. This characterization still requires considerable elaboration and refinement, but it makes clear the kind of account we seek.
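
The three procedures just discussed, together with the proposed characterization of an appropriate test, can be laid side by side in a minimal Python sketch. The names `Procedure` and `is_appropriate`, the 0.9 threshold, and the figures for the "hypothetical good test" are all assumptions introduced for illustration; the text says only that the two probabilities should be "appropriately high". An unknown probability is represented as `None` and treated as failing the condition, on the reading that a chance that cannot be judged high does not support calling the test appropriate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Procedure:
    """A testing procedure, summarized by its chances of a correct verdict."""
    name: str
    p_accept_if_true: Optional[float]   # chance of calling a true hypothesis true
    p_reject_if_false: Optional[float]  # chance of calling a false hypothesis false


def is_appropriate(proc: Procedure, threshold: float = 0.9) -> bool:
    """True only if both chances are known and at least `threshold`.

    The threshold is a hypothetical stand-in for "appropriately high".
    An unknown chance (None) cannot be judged high, so it fails.
    """
    chances = (proc.p_accept_if_true, proc.p_reject_if_false)
    return all(p is not None and p >= threshold for p in chances)


PROCEDURES = [
    # Fifty-fifty whatever the truth of the hypothesis.
    Procedure("coin flip", 0.5, 0.5),
    # Never rejects a true hypothesis; chance of rejecting a false one unknown.
    Procedure("accept iff it saves the phenomena", 1.0, None),
    # Rejects every false hypothesis, but every true one as well.
    Procedure("reject whatever the evidence", 0.0, 1.0),
    # Illustrative numbers only, not taken from the text.
    Procedure("hypothetical good test", 0.95, 0.95),
]

verdicts = {proc.name: is_appropriate(proc) for proc in PROCEDURES}

for name, ok in verdicts.items():
    print(f"{name}: {'appropriate' if ok else 'not appropriate'}")
```

Only the last procedure passes, which matches the point of the passage: raising one probability at the expense of the other does not yield a test.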


One immediate task is to clarify the interpretation of probability assumed in the above characterization of a good test. By a "procedure" I mean an actual process in the real world. If such a procedure is to have probabilities for leading to different results, these must be physical probabilities. Our account thus presupposes an acceptable physical interpretation of probability, something many philosophers regard as impossible. Here I would agree with those who reject attempts to reduce physical probability to relative frequency, and opt for some form of "propensity" interpretation. But since this is again too big an issue to be debated here, I shall proceed under the assumption that there is some acceptable physical interpretation of probability.^15 Moreover, we must assume that we can at least sometimes have good empirical grounds for judging the relevant physical probabilities to be high.

5. Example: Fresnel's Model of Diffraction

An example may help to flesh out the relatively abstract outline presented so far. This example is appropriate in many ways, one being its historical association with the re-emergence of the H-D method of testing in the early eighteen-hundreds after a century in the shadows of Newtonian methodological orthodoxy. Wave models of optical phenomena had been developed by Hooke and Huygens at the end of the seventeenth century. Particle models were favored by Newton and the later Newtonians. At that time, the evidence for either type of model was genuinely ambiguous. Each type of model explained some phenomena better than the others. The then recently discovered phenomenon of polarization, for example, was an embarrassment to both, though perhaps more so to wave theorists. In general, particle models dominated eighteenth-century theorizing, perhaps partly because of greater empirical success but also, I think, because of the general triumph of Newtonianism. In any case, for most of the eighteenth century there was little serious work on wave models until Thomas Young took up the cause around 1800. The scientific establishment, including the French Academy of Sciences, was then dominated by particle theorists. Laplace, for example, published a particle model of double refraction in

  1. But interest in optics was obviously high, since the Academy prizes in 1810 and 1818, for example, were for treatments of double refraction and diffraction respectively. The diffraction prize eventually went to Augustin Fresnel for a wave model. In Fresnel's models, diffraction patterns are produced by the


be taken for granted once it was firmly established for a single case. The emphasis placed by empiricist philosophers on such generalizations is quite misleading. One reason for focusing on Fresnel's theory of diffraction rather than on a broader wave theory of light is that Arago's experiments provide an appropriate test of Fresnel's theory, but not of the broader theory, in spite of the fact that the former is a logical consequence of the latter. This follows from our characterization of an appropriate test of a hypothesis, as we shall now see. At the time of Arago's experiments, techniques for dealing with optical phenomena were sufficiently well developed that it was very probable that the spot would be observed, given that the Fresnel-Poisson model does fit this situation. So, given that Fresnel's theory is true, it was very unlikely that it should mistakenly have been rejected. This aspect of the test was entirely appropriate to the circumstances. But what if Fresnel's theory had been false? How probable was it that the testing process should have yielded the predicted spot even if Fresnel's models did not really capture diffraction phenomena? To answer this question we must first decide just how much of the episode to include within the "testing process." According to common interpretations of the discovery/justification distinction, the decisive testing process began when Poisson constructed a Fresnel-style model for the circular disk and deduced that the spot should appear. Nothing that happened earlier is relevant to the confirmation of any of the hypotheses we have considered. I expect that many who reject a discovery/justification distinction would nevertheless agree with this conclusion. And indeed, this view of the matter follows naturally from the doctrine that there is a "direct" evidential relationship, analogous to deduction, between hypothesis and evidence. But this is not our view.
On our account, the relationship between hypothesis and evidence is mediated by the testing process, and there is no a priori reason why incidents that occurred before the actual formulation of the hypothesis should not be relevant to the character of this process. In particular, the process by which a hypothesis is selected for consideration might very well influence its content and thus the likelihood of discovering a further consequence to be true. The Commissioners apparently did not regard Fresnel's success in explaining the diffraction pattern of straight edges as decisive. Why? Was this just prejudice? Or did they have good reasons for not regarding these familiar patterns as being part of a good test of Fresnel's models? I think the latter is the case. From Fresnel's own account it is clear that the straight-edge pattern acted as a constraint on his theorizing. He was unwilling to consider any model that did not yield the right pattern for straight edges. Thus we know that the probability of any model he put forward yielding the correct pattern for straight edges was near unity, independently of the general correctness of that model. Since the straight-edge pattern thus had no probability of leading to rejection of any subsequently proposed hypothesis that was in fact false, this pattern could not be part of a good test of any such hypothesis. We could regard agreement with the straight-edge pattern as a test of a hypothesis if we knew the probability that Fresnel should pick out a satisfactory model using this, together with similar data, as a constraint. At best this probability is simply unknown. And given the frequency with which even experienced scientists come up with unsatisfactory models, there is reason to judge such probabilities to be fairly low. In either case we fail to have a good test of the hypothesis. The case with the spot is quite different. We know that this result did not act as a constraint on Fresnel's choice of models. Suppose, then, that Fresnel had come up with a model that applied satisfactorily to straight edges and the like but was not correct for diffraction phenomena in general. The corresponding theory would therefore be false. What is the probability that any model selected in this way should nevertheless yield the correct answer for the disk? In answering this question we must also take into account the fact that the disk experiment was specifically chosen because it seemed to Poisson and others that no such phenomenon existed.
So the consequence selected for the test was one that knowledgeable people thought unlikely to be true. Given all these facts, it seems clear that the test was quite likely to lead to a rejection of any false theory that Fresnel might have proposed. And this judgment about the test was one that could easily be made by everyone involved. My view is that they did make this judgment, implicitly if not explicitly, concluded that Poisson's proposed test was quite adequate, and, when the result came in, acted accordingly.
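
The contrast between the straight-edge pattern and the spot can be made concrete with a deliberately crude Python sketch. Everything in it is an assumption made for illustration: candidate models are reduced to triples of coded predictions for three phenomena, the "actual" outcomes are arbitrary, and counting equally weighted toy models stands in for the physical probabilities of a real selection process, which the text does not claim can be computed this way.

```python
from fractions import Fraction
from itertools import product

# Hypothetical setup: three phenomena, each with three coded outcomes.
PHENOMENA = ("straight_edge", "disk", "other")
OUTCOMES = range(3)
TRUTH = (0, 0, 0)  # arbitrary stand-in for what nature actually does

# A candidate model is a triple of predictions, one per phenomenon.
candidates = list(product(OUTCOMES, repeat=len(PHENOMENA)))

# Selection constraint: only models that reproduce the known
# straight-edge pattern are ever put forward.
selected = [m for m in candidates if m[0] == TRUTH[0]]

# A selected model is false if it is wrong about some phenomenon.
false_models = [m for m in selected if m != TRUTH]


def pass_rate(models, index):
    """Fraction of `models` whose prediction at `index` matches TRUTH."""
    passed = sum(1 for m in models if m[index] == TRUTH[index])
    return Fraction(passed, len(models))


# Chance that a false (but selected) model survives each check.
p_false_passes_edge = pass_rate(false_models, 0)
p_false_passes_disk = pass_rate(false_models, 1)

print("false model passes straight-edge check:", p_false_passes_edge)
print("false model passes disk check:", p_false_passes_disk)
```

In this toy setting every false model passes the straight-edge check, because that result was built into the selection, while most fail the disk check. The particular fraction depends entirely on the made-up outcome space; only the asymmetry is meant to mirror the argument.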

In thinking about this and similar examples it is crucial to remember that the probabilities involved are physical probabilities inherent in the actual scientific process itself. If one slips into thinking in terms of probability relations among hypotheses, or between evidence and hypotheses, one


satisfies a "converse consequence condition." That is, if T is confirmed by the truth of some consequence, O, then, if T' implies T, T' is equally confirmed. In particular, T and H, for any H, is confirmed. And, granted that a logical consequence of any hypothesis is at least as well confirmed as the hypothesis itself, by confirming T we can equally well confirm any H whatsoever. The above discussion shows that such objections are based on an oversimplified view of the H-D method; indeed, a version to which few if any serious defenders of the H-D method ever subscribed.^17

6. The Role of Novel Predictions

As we have seen, many champions of the H-D method have suggested that successful predictions are sufficient for the confirmation of hypotheses; some, such as Peirce, have taken them to be necessary as well. Critics argued that successful predictions were neither necessary nor sufficient. From our present perspective we can see why the defenders were on the right track even though their critics were technically correct.

First let us give the critics their due. That successful predictions are not sufficient is easily seen by imagining other possible sequences of events in the Fresnel example. Suppose that Biot had repeated Fresnel's calculations for a straight edge and persuaded Arago to repeat these measurements. Of course Biot's prediction would have been verified, but no one would have regarded this replication as providing a decisive test of Fresnel's hypothesis regarding this experiment or of the general adequacy of his approach to diffraction phenomena. Why? Because the imagined process would not have been a good test of the hypothesis. The process had a high probability of supporting Fresnel's hypothesis if it were true. But it also had a high probability of supporting the hypothesis even if it were false. The many previous experiments with straight edges had provided ample evidence for the empirical generalization that this type of experiment yielded the indicated diffraction pattern. So regardless of the truth or falsity of Fresnel's hypothesis, it was highly probable that the hypothesis would be supported. This violates our conditions for a good test of a hypothesis. Both Mill and Keynes used this sort of example in their analyses of the prediction criterion, though each within a quite different framework.

This same counterexample also shows why many H-D theorists have insisted on novel predictions. If a predicted result is not novel, there will be a more or less well-justified low-level empirical hypothesis linking the type of experiment and the type of result. This makes it likely that the test will justify the hypothesis no matter whether it is true or false. Thus it is difficult to have a good test unless the prediction is novel. This point seems to have been missed by empiricist critics of the H-D method such as Mill and Keynes, although perhaps the true value of novelty was also not sufficiently understood by either Whewell or Peirce.

That successful predictions are not necessary is also easily demonstrated by an imaginary variation on the same example. It has been claimed that the bright spot in the center of a circular shadow was observed in the early part of the eighteenth century by J. N. Delisle.^18 It seems pretty clear that none of the principals in the case had ever heard of these supposed observations. But suppose they did occur and were published. Imagine, then, that Laplace, but not Fresnel, knew of these results, and upon reading Fresnel's memoir recalled Delisle's unexplained observations. It would not have taken him long to apply Fresnel's method to the case and conclude that Fresnel's model explained Delisle's results. I think the Commission would have been equally convinced that Fresnel's theory was correct. But whether or not they would have been, they should have been. It was about as improbable that Fresnel, ignorant of Delisle's results, should have developed an inadequate model that nevertheless explained Delisle's results, as it was that an inadequate model should have happened to predict correctly the result of Arago's later experiment. In either case the conditions for a good test are satisfied.

Returning to the champions of the prediction rule, it is clear that they overstated their case. But they were fundamentally correct in thinking that the fact that a result was predicted successfully may be relevant to the confirmation of a hypothesis.
The conditions that define an appropirate test of a hypothesis are themselves contingent empirical hypotheses about the actual process by which a particular hypothesis is tested. This is due to the fact that the relevant probabilities are physical probabilities embodied in the testing process. Judging a process to constitute a good test of a hypothesis thus requires judging that the relevant physical probabilities are high. All sorts of empirical facts about the case may be relevant to these judgments-including facts about when a specified consequence of a hypothesis became known, and to whom. Let us imagine yet another variant on the Fresnel example. Suppose that someone named Delisle really did observe the spot and that Fresnel, but not other principles in the case, knew of Delisle' s results right from the

think that the test has a high probability of discovering a false consequence, if there are any to be discovered. Of course a scientist need not be attempting to refute the hypothesis in question; he may just be trying to devise the best possible test. But the knowledge that a given consequence was selected for investigation in a well-informed attempt to refute the hypothesis is relevant to the judgment as to how good the test might be.

7. The Logic of Tests

A good test of a hypothesis is a physical process with specified stochastic properties, namely, a high probability of one outcome if H is true and another if H is false. That this process has one outcome rather than the other, however, also has epistemic consequences. If a good test has a favorable outcome, we are to "conclude" that H is true, "accept" H as being true, or some such thing. One must provide some account of the rationale, or "logic," of this step from the physical outcome to the epistemic conclusion. Here I shall follow those who regard the epistemic step as a kind of decision. This opens the way to a decision-theoretic analysis of scientific inference. But since we have renounced probabilities of hypotheses, our decision theory must be "classical," or "non-Bayesian," decision theory.

Casting the problem in decision-theoretic terms, we realize immediately that what really needs to be justified is not so much the decision to accept (or reject) any particular hypothesis, but the general decision rule that tells us, for each possible outcome of the experiment, whether to accept or reject the hypothesis. It is so obvious which of the four possible decision rules is correct that most traditional accounts of the H-D method do not even note the epistemic step from physical outcome to accepted conclusion. To understand the logic of the step, however, it is useful to consider the full range of possibilities.

In any test of a theoretical hypothesis there are four possible epistemic results, two correct and two incorrect. The correct ones are accepting H if it is true and rejecting H if it is false. The incorrect ones are rejecting H if it is true and accepting H if it is false. Now to conceptualize the problem in decision-theoretic terms we must assume that it is possible to assign some kind of "value" to these possible results. For the moment we shall not worry what kind of value this is or whether it has the formal properties of utility. And just for convenience (it makes no difference to the argument), we shall suppose that both correct results have the same value (which we may set arbitrarily at 1) and that both incorrect results also have the same value (which we may set arbitrarily at 0). Finally, let α be the probability that the prediction is false even though H is true and β be the probability that the prediction is true even though H is false. For the moment it will not matter much what these probabilities are so long as they are strictly between zero and one half. With these assumptions we can represent the "decision" to accept or reject H in a two by two matrix (Figure 1).

                     Hypothesis true      Hypothesis false
Accept Hypothesis    Pr = 1 - α, V = 1    Pr = β, V = 0
Reject Hypothesis    Pr = α, V = 0        Pr = 1 - β, V = 1

Figure 1.
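
As a worked check on Figure 1, the expected value of the decision under each state of the hypothesis can be read off column by column. This is only a sketch: it assumes, as the layout of the matrix suggests, that H is accepted exactly when the prediction comes out true. Here α is the probability that the prediction is false although H is true, β is the probability that the prediction is true although H is false, and V is the value of the result; the expectation notation E[·] is introduced for this illustration and does not appear in the text.

```latex
\begin{align*}
E[V \mid H \text{ true}]  &= (1-\alpha)\cdot 1 + \alpha\cdot 0 = 1-\alpha,\\
E[V \mid H \text{ false}] &= \beta\cdot 0 + (1-\beta)\cdot 1 = 1-\beta.
\end{align*}
```

Since α and β are each assumed to lie strictly between zero and one half, both expectations exceed one half.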

Each of the four possible outcomes is labeled with its respective value and probability (conditional on the hypothesis being true or false). The meta-decision problem of choosing a decision rule for making the object-level decision is represented as a four-by-two matrix (Figure 2). The obvious decision rule to accept H if and only if the prediction is true is represented as (A, R), and the others are represented accordingly. The outcomes are labeled with the appropriate expected values of applying the rule conditional on the truth or falsity of the hypothesis.

             Hypothesis true    Hypothesis false
(A, A)       1                  0
(A, R)       1 - α              1 - β
(R, A)       α                  β
(R, R)       0                  1

Figure 2.
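
The entries of Figure 2 can be recomputed mechanically from the assumptions already stated: value 1 for a correct result, value 0 for an incorrect one, a probability α that the prediction is false although H is true, and a probability β that the prediction is true although H is false. The following Python sketch does this for all four rules. The function names (`expected_values`, `rule_table`, `rules_above_half`) and the sample values α = 1/10 and β = 1/5 are hypothetical, chosen only for illustration; the final comparison against one half is a simple observation about the table, not a decision criterion attributed to the text.

```python
from fractions import Fraction

# The four decision rules: (action if prediction true, action if prediction false).
RULES = [("A", "A"), ("A", "R"), ("R", "A"), ("R", "R")]


def expected_values(rule, alpha, beta):
    """Expected value of a rule, conditional on H being true or false.

    Assumes value 1 for a correct result and 0 for an incorrect one.
    alpha: probability the prediction is false although H is true.
    beta:  probability the prediction is true although H is false.
    """
    on_pred_true, on_pred_false = rule
    # Probability that the prediction comes out true under each state.
    p_pred_true = {"H true": 1 - alpha, "H false": beta}
    # The correct action under each state.
    correct = {"H true": "A", "H false": "R"}
    result = {}
    for state, p in p_pred_true.items():
        v_true = 1 if on_pred_true == correct[state] else 0
        v_false = 1 if on_pred_false == correct[state] else 0
        result[state] = p * v_true + (1 - p) * v_false
    return result


def rule_table(alpha, beta):
    """Rebuild the four-by-two matrix of expected values."""
    return {rule: expected_values(rule, alpha, beta) for rule in RULES}


def rules_above_half(alpha, beta):
    """Rules whose expected value exceeds 1/2 whether H is true or false."""
    half = Fraction(1, 2)
    table = rule_table(alpha, beta)
    return [r for r in RULES if all(v > half for v in table[r].values())]


if __name__ == "__main__":
    alpha, beta = Fraction(1, 10), Fraction(1, 5)  # illustrative values only
    for rule, ev in rule_table(alpha, beta).items():
        print(rule, ev["H true"], ev["H false"])
    print(rules_above_half(alpha, beta))
```

With α and β both strictly between zero and one half, only the rule (A, R) has an expected value above one half in both columns, which agrees with the remark that it is the obvious rule.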