CHAPTER 3
POPPER AND THE OBJECTIVITY OF SCIENTIFIC PROBLEMS
Our overview of Popper's theory of science as problem-solving (and
a comparison of it with Dewey's account) immediately raises a series of
questions:
a) Is Popper correct in stressing the overriding importance of the objective
features of problem-situations? What does he mean by saying problems are
in World-3? Is the analogy he draws between the problem-solving of Einstein
and that of an amoeba (or even a species!) a useful one?
b) What sense can we make out of Popper's claim that we can't understand
a theory until we know what problems it was designed to solve? What does
he mean by saying scientific progress can be measured by the depth of the
problems under study?
c) When Popper provided his demarcation between science and non-science
it was in terms of characteristics of theories. But if we are going to
center our account of science around problems, shouldn't we also look for
distinguishing characteristics of scientific problems? What exactly is
it that makes a problem cogent, deep, interesting, ripe for solution? To
reiterate the central problem of this book, how can problems (as opposed
to theories) be evaluated?
In this chapter we will look more critically at Popper's claims about the
objectivity of problems. We will then look in more detail at his theory
of science as problem-solving. Our aim is to see how much of Popper's account
we want to preserve and to discover what needs to be drastically revised
or supplemented.
As an introduction to the issue of the objectivity of problems, let us
begin with what Popper understands by objective knowledge.
3.1 Popper's Conception of Scientific Knowledge
According to the Encyclopedia of Philosophy (Vol. IV, p. 345), the most
widely accepted definition of knowledge is "justified true belief".
However, on Popper's view none of these terms describe important characteristics
of knowledge. First of all, knowledge propositions need not be believed
by anyone. Suppose the world expert on widgets writes a definitive summary
for an encyclopedia but then dies and no one ever reads her article; nevertheless,
the information therein remains knowledge although no one actively believes
it. Or, to cite one of Popper's examples, consider a table of logarithms
generated by a computer program. It may turn out that certain items of
the table are never read by any human. Yet they are part of mathematical
knowledge.
So for Popper the genus of knowledge is objective propositional content,
not subjective attitudes towards propositions. Knowledge may be encoded
in human brains, but it may also be contained in books, diagrams, or software.
Whether anyone actively believes the proposition is irrelevant to its
status as knowledge. (As my old chemistry teacher used to tell us, "If
you want to be a good chemist, don't make a handbook out of your head!")
Knowledge items for Popper need not be believed, but mustn't they at least
be worthy of rational belief? Shouldn't they be true and justifiable?
Here again Popper departs from the traditional conception. Knowledge claims
need not be true, Popper argues, and he cites two sorts of examples. First
of all, many of the most important propositions in science are literally
false - e.g., all of the laws of classical physics and chemistry. Matter
is not conserved, planets do not travel in perfect ellipses, atoms are not
indivisible, and not all molecules of water are alike. Neither are the
claims of the theories which replaced classical science, such as Relativity
Theory and Quantum Mechanics, without flaw. If we were to follow the traditional
epistemological delimitation of knowledge to true propositions, it might
well turn out that there is no scientific knowledge. (Cf. Cartwright, How
the Laws of Physics Lie.) Popper goes on to argue that many of the most
basic bits of common-sense knowledge are also false. The sun does not rise
every twenty-four hours, at least not in the Land of the Midnight Sun, and
bread can be poisonous if it is made from ergotic wheat.
But if we admit the existence of false knowledge claims shouldn't we at
least require that in order to count as knowledge in a particular historical
context a statement must have been well-confirmed by the evidence available
at that time? Shouldn't we insist that one should be justified in believing
knowledge propositions, even if later evidence may cause us to reverse our
appraisal of their truth value?
Some of the details of Popper's criticisms of various justificationist
epistemologies will emerge below. Suffice it to say here that with respect
to the logical positivists Popper argues that there is no infallible empirical
base - the most trivial sounding observation report, such as "Here
is a glass of water", contains so many untested implications, such
as, "If you were to drop it, the water would spill while the glass
would break" or "If you were to cool it, the water would turn
into ice," that it is impossible to verify them all. Furthermore,
even if we were to take some set of observation reports as indubitable evidence,
Popper follows Hume in arguing that they can never provide the kind of logical
justificatory support assumed by inductivist philosophers.
But if we give up these traditional epistemological requirements, how are
we to distinguish those propositions which are part of knowledge from random
sentences generated by the apocryphal monkey at the typewriter, or, more
realistically, from the outputs of programs such as Racter? Is Stove (Popper
and After) right in claiming that Popper has so changed the meaning of knowledge,
by completely denying its sense of cognitive success and achievement, that it
is misleading for him to continue to use the word? What, according to Popper,
are the delimiters of knowledge?
I know of no place where Popper gives a short, italicized definition of
knowledge, but we can construct one from his writings. First of all, if we
are prepared to identify knowledge with our best science, he does give an
explicit minimal characterization of a scientific claim - it is one which
can be subjected to empirical test - and our best scientific claims are
falsifiable propositions which are not false as far as we know but which
have successfully passed severe empirical testing. Such a tri-partite definition
would replace each component of the old justified-true-belief account, yielding
corroborated-falsifiable-claim.
However, Popper often uses knowledge in a looser sense when he speaks of
'background knowledge' and its role in setting problems or appraising the
severity of a test. These propositions are assumed to be unproblematic
in a particular problem-situation (C. & R., p. 238), but there is no suggestion
that each of them has been carefully tested. In fact, to demand systematic
testing would lead to a falsificationist regress almost as vicious as the
inductivist one. Neither must critical scrutiny be limited to empirical
testing. Unlike the positivists, Popper would never consider excluding
mathematics or philosophy from knowledge.
The proper conclusion to draw, I think, is that Popper does not try to
delimit knowledge propositions because on his account it makes no sense
to do so. No propositions receive permanent gold stars on Popper's account.
All claims are conjectural although some have been more carefully scrutinized
than others. In some contexts a proposition will be temporarily accepted;
in others it may be challenged. What we can do at any point is to open
the case on any claim and review its test record, its logical compatibility
with other statements, its explanatory power, what problems it helps solve,
what problems it generates, etc. But it would be fruitless to demand that
we always start by reviewing the credentials of every proposition in sight.
3.2 The Objectivity of Scientific Problems
If problems arise within a knowledge context and if knowledge is the content
of some set of propositions (not human attitudes towards them), then it
would seem fairly straight-forward to define problems as some sort of logical
inadequacy within the propositional set, such as inconsistency or incompleteness.
Thus although Popper himself often uses vague, quasi-psychologistic talk
of "violated expectations" (reminiscent of Dewey), one could replace
it by analyses in terms of logical inconsistencies between theoretical systems
and singular observation statements. Of course, logic alone doesn't tell
us which statements count as observations nor which theories are "accepted"
(no problem arises if a theory already considered to be false leads to a
prediction failure), but presumably Popper hopes to give non-psychologistic
accounts of these additional factors.
By giving an objective account of problems, one can explain why there is
often considerable inter-subjective agreement among scientists on what is
problematic about a particular theory. (There is something really "out
there" which makes them feel puzzled.) And one can clarify Dewey's
point about neurotic feelings of doubt not being adequate for inquiry -
the doubt must arise from a situation which is objectively open or indeterminate.
For Popper, although theories are generated by humans, theories can in
turn autonomously generate problems of which no human being is aware. For
example, when people first invented the integers (perhaps using a Brouwer-like
construction), the problem of whether there is a largest prime also came
into existence (although people only worried about it much later). Whether
a problem is solved or not is also determined by looking at the objective
state of the arguments which could be constructed for and against a proposed
solution. Whether people actually accept those arguments or cease feeling
puzzled is not relevant.
To dramatize the difference between the objective status of a problem and
our feelings about it, Popper places them in different worlds. Roughly
speaking, World-1 contains the objects traditionally studied by natural
science (e.g., electrons and chairs). World-2 contains psychological states
(e.g., feelings of doubt, beliefs). And World-3 contains "the objective
contents of thought" (e.g., theories, arguments, problems). Although
minds in W-2 (contained in W-1 bodies) produce everything in W-3, no mind
working either individually or collectively can be subjectively aware of
everything in W-3. (We can't think about each integer nor actually derive
every Euclidean Theorem.) Popper claims the growth of knowledge is best
understood by concentrating on the logical relations between W-3 objects,
not on the psychological states of the people who produce and manipulate
them. But although Popper's notion of objective problem is very useful,
I think we cannot ignore the psychological and sociological aspects of scientific
problems. I will develop this point by listing various points at which
we must refer to these other dimensions of the problem-situation in order
to understand the growth of knowledge.
Let us suppose that humans have generated propositions T and O - both now
reside in W-3. Let us further suppose that T entails ~O, but the deduction
is abstruse and so no one has noticed. Objectively, there is now a problem
in W-3, but no inquiry will result until someone comes to believe that T
entails ~O.
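The gap between an objective inconsistency and anyone's awareness of it can be
illustrated with a toy propositional model. The sketch below is only an
illustration and is not something Popper proposes: the atoms p, q, r, the
particular formulas chosen for T and O, and the helper name `entails` are all
hypothetical. It checks by brute force over truth assignments whether T entails
~O. The answer is fixed by the formulas themselves, whether or not anyone ever
runs the check.

```python
from itertools import product

ATOMS = ("p", "q", "r")  # hypothetical atomic propositions


def entails(premise, conclusion, atoms=ATOMS):
    """Return True if every assignment satisfying premise also satisfies conclusion."""
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if premise(world) and not conclusion(world):
            return False  # counterexample: premise true, conclusion false
    return True


def T(w):
    """Hypothetical theory T: (p -> q) and (q -> r) and p."""
    return (not w["p"] or w["q"]) and (not w["q"] or w["r"]) and w["p"]


def O(w):
    """Hypothetical observation statement O: not r."""
    return not w["r"]


def not_O(w):
    return not O(w)


# The clash between T and O holds independently of anyone noticing it.
print(entails(T, not_O))
```

In this toy case the derivation takes two steps; with many atoms and long
chains of inference the same objective relation could easily go unnoticed,
which is the situation described above.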
At this point, Popper might wish to add the explicit proposition, T ⊃ ~O,
to W-3 and draw a distinction between potential logical consequences of
our theories and those which have actually been drawn. This seems fair
enough. The derivation of ~O from T constitutes a powerful argument against
T and creating new arguments surely is a contribution to knowledge. (Recall
Galileo's wonderful criticism of the Aristotelian law of falling bodies
- if two cannonballs, while falling, should somehow be tied together to
form a mass twice as heavy, according to Aristotle's theory, they should
speed up, but this is absurd.) However, this does mean that only those
features of W-3 which
people notice can contribute to the growth of knowledge.
Furthermore, if people believe T and O are inconsistent (even though they
aren't), they will try to solve a problem although it would seem that objectively
no problem exists! For example, an early criticism of Darwinian theory
(D) was that it could not explain the evolution of altruistic behavior (A).
We now know that the argument was incorrect - sociobiologists describe
a number of mechanisms, such as kin selection, which resolve the puzzle.
Here is a case where Darwinian theory and Altruism were in W-3 and people
also believed D ⊃ ~A so that was also in W-3. However, ~(D ⊃ ~A) follows
from D so presumably that was also at least potentially in W-3. Yet people's
inquiry was influenced by what was in some sense a "pseudo-problem"
- or at least a problem based on a mistake. Nevertheless, the inquiry resulted
in new knowledge, the discovery of kin selection!
I conclude that logical inconsistencies in W-3 are of no importance in
the growth of knowledge until people notice them. And even "neurotic"
problems (puzzlement based on inconsistencies which aren't really there!)
can be important for the growth of knowledge. So it seems that the subjective
and social awareness of problems is important. However, at any given time,
we are probably aware of an enormous number of problems. Not only are there
the problems arising from known inconsistencies (Lakatos claimed that every
theory lives in a "sea" of anomalies), there are also the explanatory
problems which arise from gaps in our knowledge. Which problem will actually
provoke inquiry? When it comes to the issue of problem selection, Popper
admits that subjective experience plays a part in which problem we emphasize,
or select as important (OK, pp. 166-67). Later in this essay we will explore
the possibility of finding objective ways of evaluating problems.
3.3 Popper's Strong Analogy Between Evolutionary Biology and Epistemology
The account of Popper's theory of the objectivity of knowledge and problems
given above is extracted from the essays collected together in Objective
Knowledge but it would be misleading as it stands because Popper develops
this view in a context which stresses the similarities between the growth
of knowledge in the scientific community, animal learning, and the evolution
of biological species. There is at least a metaphorical sense in which
each of these three processes involves problem-solving, trial solutions,
and error elimination, but Popper proposes that we take the analogy very
seriously indeed. Thus his examples of the problem of violated expectations
include the case of a newborn foal who sucks on the hair under the mare's
front legs and is disappointed until it finds its way to the back, as well
as the case of Newtonian astronomers who did not expect the results of the
Eddington eclipse expedition which detected light bending in a gravitational
field. Popper is not suggesting that foals have propositional attitudes.
Rather, the foal has an inborn "theory" which has "run into difficulties".
He also draws parallels between the problem-solving activities of Einstein
and an amoeba -- he cites the example of a hungry amoeba who learns to swim
towards a light in order to get food. And he includes birds' nests among
the problem solutions which reside in World-3!
Of course, there are other places where Popper stresses the unique features
of science (more of this later) but by separating knowledge from human consciousness
it is very easy for Popper to posit that knowledge is encoded in genotypes
and guppies as well as in geniuses!
Let us now look in more detail at the parallels Popper draws between biological
evolution, animal learning, and scientific inquiry, three processes which
instantiate his general schema:
P --> TS --> EE --> P'
P stands for problem, which is to be understood in an objective sense and
does not imply that the entity which "has" the problem is conscious
of it. In the biological domain, species face problems connected with survival
and reproductive success, such as the problems of escaping predators, raising
young, finding food, mates, etc.
TS stands for tentative solution, such as the information encoded within
a new genotype within the species' pool. EE, or error elimination, occurs
if the organism bearing the new genotype dies without reproducing (or reproduces
at a lower rate than its conspecifics).
The outcome of reiterations of the TS and EE steps is a species better
adapted to its environment, but new problems (P') will typically lead to
a repetition of the whole selection process.
When the schema is applied to animal learning it looks fairly similar to
Skinnerian operant conditioning. [See Skinner's "Selection by Consequences"]
In response to the "problem" posed by hunger pangs, or whatever,
the animal engages in exploratory behavior (thus proposing a tentative "solution").
Unsuccessful solutions lead to no food, or even pain, and are extinguished.
When a new behavior is successful, however, it becomes part of the individual
animal's patterned response to that type of problem-situation. New problems
then lead to more learning.
The application of the schema to scientific inquiry is quite straight-forward.
Scientists propose falsifiable conjectures in response to problems arising
within their knowledge situation, which are then subjected to empirical
test. False hypotheses are thereby eliminated and the scientist is then
free to confront new cognitive problems.
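The schema P --> TS --> EE --> P' can be read as a simple loop. The sketch
below is a toy rendering under stated assumptions, not Popper's own formalism:
the function names (`run_cycle`, `propose`, `passes_test`, `revise_problem`)
and the arithmetic example are hypothetical and chosen only to make each stage
of the schema visible. Tentative solutions are generated blindly, error
elimination discards those that fail the test, and the survivors give rise to
a new problem.

```python
def run_cycle(problem, propose, passes_test, revise_problem):
    """One pass of P -> TS -> EE -> P'.

    problem        -- the current problem P (any object)
    propose        -- returns tentative solutions TS for P
    passes_test    -- error elimination EE: True if a TS survives testing
    revise_problem -- builds the new problem P' from P and the survivors
    """
    tentative = list(propose(problem))                                # TS
    survivors = [ts for ts in tentative if passes_test(problem, ts)]  # EE
    return survivors, revise_problem(problem, survivors)              # P'


# Toy instantiation (hypothetical): "which whole number squared gives the target?"
def propose(target):
    return range(0, 11)  # blind guesses; no recipe for discovery is assumed


def passes_test(target, guess):
    return guess * guess == target  # refuted guesses are eliminated


def revise_problem(target, survivors):
    # A surviving solution becomes the target of the next problem;
    # if nothing survives, the old problem remains open.
    return survivors[0] if survivors else target


first = run_cycle(49, propose, passes_test, revise_problem)
second = run_cycle(first[1], propose, passes_test, revise_problem)
print(first, second)
```

Note that nothing in `run_cycle` requires the entity with the problem to be
conscious of it. The same loop would describe a gene pool, a foraging animal,
or a research community, which is exactly the generality Popper claims for
the schema.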
------------------------
Insert Fig. 3.1 about here
------------------------
These three instantiations of the schema are summarized in Figure 3.1.
Let us now comment on some of the important dissimilarities between scientific
inquiry and these other selection processes. One crucial difference, as
Popper notes, is that in science "our mistaken theories die in our
stead". If a species fails to solve a survival problem, it goes extinct.
If a rat fails to find a path through a maze, it goes hungry. In both
cases, the consequences of error (or success for that matter) have a direct
physical effect on organisms. The scientist, on the other hand, may experience
elation or disappointment as a result of empirical testing but these psychological
reactions are not simply coupled within the selection process. For example,
I may derive satisfaction from designing a clever test which refutes a hypothesis
even when the hypothesis was of my own creation. And to the extent to which
science is a "friendly hostile" competition between ideas (to
use Popper's phrasing), there could even be a division of labor between
creation and criticism such that every prediction failure is a personal
triumph!
Because the success or failure of a scientific hypothesis can be decoupled
from pleasure and pain, the scientist is free both to propose bold conjectural
solutions and to test them severely. And the fact that as scientists we
are free to choose problems - they are not forced on us by the environment
- also allows us to operate less cautiously. Although the evaluation of
scientific theories depends crucially on feedback from the environment,
human scientists experience relatively little feedback from prediction successes
or failures. Contrast the situation of the technologist, e.g., a potter
who is trying to solve the problem of how to prevent pots from exploding
in the kiln. Here the problem is set by practical considerations. Tentative
solutions should be economically viable and may be of such limited scope
as to apply only to the local clay and kilns. And it would be absurd to
push any solution which appears to work to extremes. Prediction failures
cost time and money and so the potter will theorize conservatively. Thus,
although the potter, unlike the animal, can articulate the hypotheses under
test and use information in books to criticize them, the potter's situation
is more like the animal's than the scientist's because she or he is directly
rewarded or punished according to the success of the tentative solution.
The scientist's relative freedom from personal repercussions sounds wonderful
and liberating, but it can also pose the following problem. Note that on
the biological or animal level it is not possible for the organism to "ignore"
refutations because it is causally connected to the environment. A dogmatic
potter may engage in a process of psychological denial of the pot shards
from exploding pots but will soon go out of business. But an individual
scientist may evade the elimination of erroneous theories by using ad hoc
modifications or conventionalist stratagems with impunity. To do so is
like cheating at Solitaire - it may not be as much fun, but nothing keeps
you from doing it - except your internalized standards of fair play. An
interesting question, then, is how scientific institutions and traditions
can best reward (or punish!) scientists' activities as they engage in scientific
inquiry. (For example, how do we discourage people from publishing non-reproducible
experimental results while encouraging them to produce interesting detailed
conjectures which may well be falsified?)
Many evolutionary epistemologists have been captivated by the formal resemblances
between the modification of species by natural selection, the modification
of behavior through differential reinforcement, and the modification of
scientific systems through hypothesis testing. For Popper the parallels
are especially easy to draw because he down-plays the importance of conscious
beliefs in science. However, analogies can lead us astray as well as illuminate
and although there may be a definite sense in which genotypes have propositional
content, I think it can hardly be helpful to say birds' nests do -- unless
we are also to say that sand dunes are the winds' solution to the problem
of where to deposit suspended particles of soil and diamonds are the solution
of carbon's problem of how best to solidify under extreme pressure!
As we now look more closely at Popper's account of scientific problem solving
we will note other mischievous features of his taking the analogy too seriously.
3.4 Popper's Theory of Science as Problem-Solving
Philosophers of science today might admit that to be complete any account
of scientific inquiry should say something about scientific problems but
nevertheless resist the idea of putting problems at the very center of the
enterprise. Let us now look in detail at Popper's methodology and see what
he says about problems at each juncture. We will use as an outline the
flowchart in Figure 3.2.
----------------------------
Insert Figure 3.2 about here
----------------------------
Throughout the discussion I will sometimes supplement Popper's examples
with my own, but they are intended to be ones consonant with his scheme.
a. Typical Scientific Problems
As we have seen, according to Popper, no inquiry begins in a vacuum. Regardless
of what the topic may be, the scientist, like all of us, begins with a motley
collection of ideas, some clear, some confused, some true, some false.
Puzzlement arises when there are inconsistencies or gaps within existing
bodies of knowledge. But how are scientific problems different from those
of ordinary life? Or are they different? Let us begin by surveying the
typical kinds of scientific problems which Popper discusses and then we
will comment on their special characteristics.
(i) Problems arising from violated expectations. A common sort of scientific
problem arises when something surprising or unexpected occurs and we wonder
how or why it happened. An important problem for early astronomers was
the following: In general, celestial bodies, such as the sun, moon and
stars, move across the sky in smooth arcs. However, it was discovered that
the planets wander around the sky irregularly. Can one describe precisely
how the planets move and explain why they move differently from the other
heavenly bodies? Plato called this the problem of the planets. Ptolemy,
Copernicus, and Kepler each offered a different solution to it.
Here is another example of a scientific problem caused by violated expectations:
In 1896 Becquerel found that a batch of photographic plates which had been
carefully stored in black paper were fogged. According to the best scientific
knowledge available at the time, only visible light or x-rays could expose
photographic plates. What could have happened? Becquerel finally began
to suspect that the fogging was caused by an unusual rock he had used as
a paper weight. And it was thus that he discovered radioactivity. Later
Madame Curie showed that the rock contained radium.
(ii) Problems arising from a quest for deep explanations. Even if the
scientist is lucky enough to discover a generalization which seems to have
no exceptions, he or she is still faced with a problem: What causes the
regularity? Why do things happen just that way? For example, early astronomers
asked why the sun rose every day in the east. Some said it was because
the sun moved in a circle around the earth. Later this geocentric theory
was replaced with a heliocentric theory. In either case, a further question
arose: What caused the sun (or earth) to move? According to Aristotle,
there was a Prime Mover. Later people suggested a law of circular inertia,
saying a wheel would move forever if there were no friction. Newton explained
the regular motion in terms of linear inertia and the force of gravity.
There are many other cases in which the problem is to explain a regularity.
Bohr wondered why the wavelengths of the spectral lines of hydrogen should
fit the simple mathematical formula discovered by Balmer. Mendeleev and
other chemists of the late 19th century wondered why the elements should
arrange themselves so nicely into a Periodic Table. By the end of the 18th
century, after the work of Boyle and Charles, everyone knew that gases expanded
on heating. But why? Caloric theorists said that heat was a fluid which
flowed into gases and as a result they took up more room. Kinetic theorists
said heat was kinetic energy and hot gases expanded because their molecules
moved faster. Both sides agreed on the regularity to be explained, but
they offered competing explanations of it.
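To make concrete the kind of regularity that calls for a deep explanation,
consider the Balmer formula mentioned above. The sketch below uses the later
Rydberg form of the formula, 1/wavelength = R (1/2^2 - 1/n^2), rather than
Balmer's original way of writing it. The constant is an approximate value of
the Rydberg constant for hydrogen, and the function name
`balmer_wavelength_nm` is hypothetical. The code only computes the
regularity; why hydrogen should obey it is the problem Bohr set out to solve.

```python
RYDBERG_H = 1.0967758e7  # approximate Rydberg constant for hydrogen, in 1/m


def balmer_wavelength_nm(n):
    """Vacuum wavelength in nm of the hydrogen line for the transition n -> 2."""
    if n < 3:
        raise ValueError("Balmer lines require n >= 3")
    inverse_wavelength = RYDBERG_H * (1 / 2**2 - 1 / n**2)  # in 1/m
    return 1e9 / inverse_wavelength  # convert metres to nanometres


# The first four visible lines of the series.
for n in (3, 4, 5, 6):
    print(n, round(balmer_wavelength_nm(n), 1))
```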
(iii) Problems arising from a quest for unity. As a science develops,
a new sort of problem often arises: Can one find a unified theory which
covers two or more domains which have previously been treated separately?
For example, for a long time organic chemistry (which deals primarily with
covalent compounds) and inorganic chemistry (which is mainly concerned with
ionic compounds) were considered to be quite distinct fields. At this time
people believed that naturally occurring organic compounds, such as urea,
could not be synthesized in the laboratory because they contained a vital
life force. However, today's theories of chemical bonding apply equally
well to inorganic and organic materials.
Before Galileo, it was held that terrestrial bodies and celestial bodies
obeyed different laws. Galileo (and later Newton) gave a unified account
of the motions of all bodies. A pressing problem in physics today is the
search for a unified field theory--a theory which would successfully combine
relativity theory and quantum mechanics. Psychologists are looking for
a unified theory of learning. Behaviorists can account for some kinds of
learning; cognitive psychology provides explanations for other types of
learning. But one would like to find a single theory which covers all instances
of learning.
(iv) Problems of conflict between theories. Often, problems of finding
a unifying explanation are exacerbated because of inconsistencies between
the component theories. And contradictions can also arise between theories
which appear to cover quite different domains. For example, the biggest
objection to Copernicus' astronomical theory was its conflict with Aristotelian
physics, according to which nothing could continue to move without a mover.
And a strong contemporary objection to Darwin's theory of biological evolution
was Kelvin's geophysical calculation of the age of the earth. (It turned
out later that Kelvin's thermal estimates were wrong because they did not
include the heat generated by radioactive decay.)
Each of the four types of scientific problems discussed above arises out
of a rich background of information and expectations. New scientific theories
are invented when scientists are faced with a problem: Why did my old theory
or set of unconscious expectations fail? What causes this regularity which
I have observed? Can I unify these two branches of science? Or resolve
the inconsistencies between them?
None of these problem types are unique to science. Myth-makers are also
looking for deep explanations and try to give unified pictures of the world
we live in. Everyday life produces many calls for explanations, often of
singular events. And many of our practical problems of existence arise
because the common-sense generalizations we make about the world, including
other people, are violated.
But although there is no sharp demarcation of scientific problems, there
are some obvious differences in degree. In a well developed scientific
field, problems arise within a body of knowledge which is generally more
extensive, more detailed, and better systematized than that of other domains.
(This is not always the case - both folk mythologies and craft technical
lore may be of comparable sophistication.) Furthermore, the scientific
tradition for the most part actively rewards people who expose contradictions
or gaps within the body of science. Folklore and religious systems, by
contrast, are often embedded within conservative institutions which discourage
criticism or revision of the traditional beliefs. To summarize, to the
extent to which scientific knowledge is well-articulated it is relatively
easy to discover flaws in it, and scientific traditions encourage us to
take these problems seriously.
b. Scientific Problem Solutions
We have described various sorts of problems which trigger scientific inquiry.
Our next task is to characterize the sorts of problem solutions which count
as scientific. This is the core of the demarcation problem with which Popper
began.
However, let me digress a moment to point out that we have skipped over
the process by which these tentative solutions are dreamt up in the first
place and the problem of whether there is a logic of discovery. Early philosophers
were optimistic about the prospects of describing a method for discovering
true theories. Bacon and other inductivists thought that through careful
observation and systematic use of Bacon's tables one could easily arrive at
the solution to scientific problems. Descartes and other rationalists thought
that a systematic analysis of our clear and distinct ideas would provide
the answers.
Popper argues that there is no recipe for discovery, but from this he concludes
that all the scientist can do is guess at the answer. Some conjectures
will be "happy guesses" as Whewell described them; others will
turn out to be dead wrong. It's all a matter of trial and error. In biological
evolution, mutations occur by chance--we can't predict what new variations
will occur. But natural selection will filter out those which are not adapted
to the environment. Likewise for science. People make up all sorts of
crazy hypotheses. But tests will weed out those which do not match reality.
Quality control is ensured by careful testing procedures, not by censorship
of new ideas. The pattern of reasoning which leads to a new hypothesis
is not important--it may be based on dreams, mystical experiences, weak
analogies or what have you. According to Popper, the origins of the idea
are irrelevant; what is crucial is how well the scientist's hunch stands
up to testing.
Today both cognitive scientists and philosophers of science are optimistic
about being able to describe the structure of the process Popper calls "trial
and error". Here is a place where he is ill-served by the analogy
to biology although ironically biologists have now given a reduced role
to blind mutations.
Sociologists of course would argue that the origins of ideas are relevant
-- a hypothesis which originates in Utah will have less initial plausibility
than one which comes from MIT. And cognitive scientists, as well as philosophers
such as Campbell and Hesse, dispute the claim that analogies only play a
role in discovery and are then discarded.
And it is interesting to recall that Popper himself claims that one can't
understand theories without knowing about the problems which they solve.
Might this be construed as meaning that the problem-situation out of which
the theory arose is relevant to its evaluation? But let us return to Popper's
order of exposition.
As our account so far makes clear, the solutions to problems which scientists
propose start out being mere hypotheses or conjectures. When they are first
proposed, we have no particular reason to believe them true. Furthermore,
these hypotheses tend to be rather bold and far-reaching. This is because
the typical scientific problems we listed above all require as solutions
theories of high content. Consider Problem Type 1: To explain why our
expectations are violated, we need a theory which accounts both for the
exceptions and the normal states of affairs we had expected. For example,
a good answer to the problem of the planets' irregular motions would also
explain the sun's regular motion.
To turn to Problem Type 2: Trying to give a deep explanation of a regularity
(such as the Balmer formula for hydrogen spectral lines) generally results
in a conjecture which has many other consequences as well (such as a formula
for the spectral lines of sodium). As for Problem Type 3, it is clear that
a unified theory will have more content than either of the separate fields.
And generally such a theory will have lots of new consequences as well.
(For example, the unified theory of chemical bonding covered not only traditional
organic and inorganic compounds, but a whole new domain of organic-metallic
compounds, such as hemoglobin.)
Although they are bold conjectures, Popper argues that conjectures do have
one very important property in their favor: they can be tested by means
of experiments. If one of our conjectures is false, it is realistic to
hope that we will eventually discover its erroneous nature.
Let us now discuss the precise requirements that a theory must satisfy
in order to be falsifiable.
(i) The Logical Requirement. Statements of the form "Some A's are
B's" cannot be refuted by any report involving a finite number of instances,
but universal generalizations, be they affirmative or negative, can be.
A necessary condition for a theory to be falsifiable is that it be logically
possible to contradict it by a finite conjunction of sentences which describe
particular instances.
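The asymmetry behind the logical requirement can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each observed instance is modeled as a pair of truth values (is it an A, is it a B), the domain is taken to be open-ended, and the function names are hypothetical ones introduced for the example.

```python
from typing import Iterable, Tuple

Instance = Tuple[bool, bool]  # (is_A, is_B) for one observed individual


def refutes_universal(report: Iterable[Instance]) -> bool:
    """A finite report refutes "All A's are B's" if it contains an A that is not a B."""
    return any(is_a and not is_b for is_a, is_b in report)


def refutes_existential(report: Iterable[Instance]) -> bool:
    """Assuming an open-ended domain, no finite report refutes "Some A's are B's":
    however many cases we list, an unobserved A that is a B may still exist."""
    return False


report = [(True, True), (True, False), (False, True)]
print(refutes_universal(report))    # True: the second instance is an A that is not a B
print(refutes_existential(report))  # False
```

The second function is deliberately trivial. Its point is that nothing in a finite report could ever make it return True, which is exactly why such statements fail the logical requirement.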
Popper used the logical requirement to argue for the unfalsifiable status
of many Marxist doctrines. Statements about the "inevitability"
of the downfall of capitalism fail the logical requirement if no time limit
is given. "Light has a maximum velocity" also fails unless a
value is specified.
Many claims which at first appear to be universal generalizations also
fail. For example, "Every metal has a melting point" or "every
action is rational" may be better analyzed as what Watkins called "all-some"
statements, i.e. as saying that for every metal there is some temperature
above which it will melt, and for every action, there is some description
of the agent's problem situation such that the action was appropriate to
it.
On the other hand, the claim "some copper is brittle" looks like
it is not open to refutation by a finite observation report; however, if
it is accompanied by a recipe, "To make copper brittle, place a thin
sheet of it for three days in a nuclear reactor where the neutron flux is..."
it becomes testable.
(ii) The Empirical Requirement. Having the proper logical form is not
sufficient to insure that a hypothesis is scientifically testable. "All
repressions are seated in the libido" satisfies the logical requirement
but, as it stands, it is not subject to experimental test. How exactly
are we to recognize a repression? And even if we could, how could we tell
whether or not it is seated in the libido?
Contrast the following sentence which has the same logical form: "All
samples of iron have a melting point less than 2000° C." This universal
generalization is subject to test. We can easily determine whether a sample
is iron or not through chemical analysis. (We might use the potassium thiocyanate
test, for example.) And there are also a variety of reliable procedures
for measuring melting points.
The contrast in the above two cases suggests the following requirement:
A falsifiable theory is one which is inconsistent with at least one finite
conjunction of observation test reports. Popper's discussion of test reports,
or 'basic' statements, as he called them in the Logic of Scientific Discovery,
is traditional in many respects: they describe observable events occurring
in an individual region of space and time (p. 103); they are inter-subjectively
testable, i.e. they describe experimental arrangements in such a way that
anyone who has learned the relevant technique can check on their validity
(p. 99).
But Popper departs from the logical positivist or other standard empiricist
accounts by not claiming that the 'basic' statements are infallible, nor
are they picked out by any psychological criteria. The store of 'basic'
statements and hence whether or not a theory is testable depends on the
technology and state of scientific development available at the time. Before
the invention of the mass spectrograph, "All atoms of an element have
the same weight" would not have been considered testable because as
yet there was no way to determine the weights of individual atoms. What
counts as an observation sentence also changes with the development of instrumentation
and with new theoretical developments. For modern scientists, "This
sample is oxygen" and "This is an electron track" are considered
to be observation statements. In an earlier era they would not have been.
"This sample is a gas which supports combustion" and "This
track in a cloud chamber curves towards the positive plate" might have
been used instead, if the identity of the gas or of the particle was still
in question. The truth of observation statements cannot be decided with
certainty; even so, members of the scientific community can tentatively
agree in their judgments about the truth of observation statements.
Although Popper originally proposed his falsifiability doctrine as a demarcation
between science and pseudo-science, one could also view it as a regulative
principle to guide the development of good scientific theories, not as a
sharp criterion. We can increase the degree of falsifiability of a conjecture
by increasing the domain of phenomena to which it applies, by making more
precise the descriptive claims about the domain, and by inventing less and
less controversial observational procedures for evaluating those claims.
More important than the question of whether Freud's theory has any potential
falsifiers whatsoever is the question of how we might increase its degree
of falsifiability, either by making its claims more precise or by using
detection methods such as plethysmography for detecting patterns of sexual
arousal instead of relying solely on dreams or other traditional psychoanalytic
techniques.
I have just recited the standard Popperian answer to the demarcation problem
which is described in his intellectual autobiography (Unended Quest) and
in Chapter 1 of Conjectures and Refutations as the problem which his falsificationist
theory of science was intended to solve.
But let us now ask how this account might differ if we take seriously Popper's
own claim that theories should be solutions to problems. On this perspective
some of the criteria for appraisal would be different. For example, before
checking on the falsifiability of a theory, shouldn't we first see if it
is even a solution of the problem? Popper discusses the Maori conjecture
that the earth is held up by a turtle and criticizes it, not because it
is false or unfalsifiable, but because it immediately raises the same problem
which it was supposed to solve, namely what holds up the earth (or turtle)?
This example strongly suggests that before (or in addition to) appraising
a conjecture in terms of its falsifiability we should check on whether it
solves "the" problem. This brings the historical context of the
conjecture and perhaps even the intentions of its inventor into the evaluation
of a hypothesis. It also suggests that a Freud (or anyone else) might not be
castigated so severely for proposing unfalsifiable conjectures if they were
at least solutions to his problem, particularly if no other more falsifiable
solution was available. Perhaps we should instead fault his choice of problem,
not his theory. We will need to return to this case when we present our
account of problem evaluation.
c. The Choice of Scientific Tests
In his account of the empirical appraisal of scientific theories, Popper
once again inverts the positivists' rhetoric. Rather than trying to collect
data which will confirm our conjectures, we should instead conduct those
tests which seem most likely to refute them.
Popper's central point is nicely illustrated by an anecdote recounted by
Francis Bacon:
...it was a good answer that was made by one who, when they showed him hanging
in a temple a picture of those who had paid their vows as having escaped
shipwreck, and would have him say whether he did not now acknowledge the
power of the gods--"Aye," asked he again, "but where are
they painted that were drowned after their vows?" And such is the
way of all superstition...(The New Organon, Bk. I, Aphorism XLVI.)
It is obvious that Bacon is criticizing the way data is being used to argue
for the "power of the gods." But we need to spell out the objection
in detail.
First of all, what exactly is the claim about the power of the gods which
is under discussion? It would appear that the basic thesis which can be
directly tested is the following: "If one makes a vow during a storm
at sea, then one will survive." We can abbreviate the conjecture as:
"If V, then S."* The proposed method for collecting data which
will either support or refute the conjecture is as follows: Go to churches
and record instances of people who paid their vows as thanks for having
escaped drowning. Using our abbreviations, we can describe the instances
so collected as cases of V and S.
At first glance, it may appear that these data do indeed tend to confirm
the conjecture because they are positive instances of the generalization.
But let us look more carefully. What kind of evidence would refute the
conjecture? The answer is a case of someone who made a solemn vow, but
drowned at sea nevertheless, i.e., a case of V and not-S. But given our
method of collecting data, it is logically impossible that we would ever
find such a refuting instance. By looking only at pictures of survivors
(i.e., cases of S) we will never come across an instance of V and not-S, even if
there be millions of such cases. One of the basic principles of scientific
testing can be stated roughly as follows: The outcome of a certain test
procedure cannot confirm a theory unless it is logically possible that there
could have been another outcome which would have disconfirmed the theory.
In order to test "If V, then S", we should sample the domain
of V and find out whether any of them drowned. As Bacon says, "Where
are they painted that were drowned after their vows?" In addition,
we should also look at examples of people who in fact drowned and find out
if any of them had made vows. (This might be difficult to do in practice,
but we could check their diaries, ask their mates, etc.) It is useless
to look at cases already known to be S or not-V. Such "tests"
are irrelevant to the conjecture under consideration because it is logically
impossible that they could ever yield a refuting case.
*This is probably somewhat oversimplified. The proponents of the power-of-the-gods
theory may have only wished to defend a weaker claim: "If one prays,
one is less likely to be drowned." We will postpone the discussion
of the testing of probabilistic generalizations until later.
We might label the procedure described by Bacon as "no-risk data
collecting" because the way in which the data is collected makes it
logically impossible for a refutation to appear. Once pointed out, the
methodological error is blatant; nevertheless it can be seductive. For
example, after teaching scientific method for a number of years, I once
caught myself reasoning as follows: I observed that all of my close friends
who blinked a lot and tipped their heads back when looking at me wore contact
lenses. I then started investigating other people who behaved similarly
and sure enough I nearly always found independent evidence that they were
wearing contacts. Sometimes I asked them. Other times I would see a lens
holder in their purse or bathroom, etc. I soon jumped to the following
conclusion: "All people who wear contact lenses blink a lot and peer
down their noses when they look at you."
This conclusion was obviously too strong, given that I had done only an
informal study on a very small sample. But I did think that my experience
justified a more modest statement: "All contact lens wearers whom I
have met blink a lot, etc." What was not clear to me for quite some
time is that none of my observations had served as a test for either conjecture.
For I had always begun my observations with people who blinked! Given
this choice of sample domain, I could have investigated all the blinkers
and peerers in the world and never found a counter-example to my conjecture--not
because there weren't any, but simply because it was logically impossible
for my method of data collection to uncover them.
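The no-risk pattern shared by the temple pictures and the contact-lens observations can be sketched as a small simulation. It is only an illustration: the population is randomly generated, the 50/50 probabilities are arbitrary assumptions, and the function names are hypothetical. Each individual is a pair (antecedent, consequent), e.g. (made a vow, survived) or (wears contacts, blinks).

```python
import random


def make_population(n, seed=0):
    """Generate n individuals as (antecedent, consequent) pairs of truth values."""
    rng = random.Random(seed)
    return [(rng.random() < 0.5, rng.random() < 0.5) for _ in range(n)]


def counterexamples(sample):
    """Cases of antecedent-and-not-consequent: the only cases that refute
    a conjecture of the form 'If V, then S'."""
    return [person for person in sample if person[0] and not person[1]]


population = make_population(10_000)

# No-risk sample: begin with cases already known to satisfy the consequent
# (survivors in the temple, or people already seen to blink).
no_risk_sample = [person for person in population if person[1]]

# Risky sample: begin with the antecedent (everyone who made a vow)
# and then find out what became of them.
risky_sample = [person for person in population if person[0]]

print(len(counterexamples(no_risk_sample)))  # 0, whatever the population contains
print(len(counterexamples(risky_sample)))    # every counterexample in the population
```

However many counterexamples the population contains, the first count is zero by construction, so the first procedure is not a test of the conjecture at all.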
Popper adds to Bacon's point by stressing that good scientific tests should
be severe ones, that is they should be deliberately designed, using our
general background knowledge to probe the conjecture at its weakest point,
i.e., to find a refutation if one does in fact exist. For example, when
Kohlberg put forward a theory about the development of moral reasoning in
children, he was well advised to test it on children from Turkey and Taiwan.
We might expect a theory developed on the basis of experience with kids
in Boston to fail when applied to children from quite different cultures
and religions. (As it turned out, the Kohlberg theory passed this severe
test.) Similarly, theories about the universality of the Oedipal complex
should be tested on aborigines, and theories about language learning on
deaf and blind children. Theories about geological change and biological
evolution should be tested, where possible, by data from other planets.
Physicists know that theories often fail under conditions of high energy
or high velocity; and often processes at the micro level violate generalizations
which work well with medium-sized objects. For this reason physicists want
to build ever bigger accelerators for smaller and smaller particles.
The general procedure for designing a severe test is as follows: The hypothesis
under test always makes a series of claims. For example, the claim "All
arsenic compounds are poisonous" says that both soluble and insoluble
arsenic compounds are poisonous. It also says that both yellow and green
non-poisonous substances are free of arsenic. (Don't forget the contrapositive!)
According to our background information, some of these claims sound less
plausible than others. For example, since we know that many poisons have
to be digested in order to act, we may decide that insoluble arsenic compounds
are less likely to be poisonous than soluble ones. A severe test is one
which tests the least plausible claims of a theory. In our example, given
our background theories about the relationship between solubility and poisonous
character, we should start testing by looking at insoluble arsenic compounds.
If the conjecture passes this severe test, we will then look at the class
of soluble arsenic compounds. Other things being equal, severe tests, i.e.,
tests of the least plausible claims of a conjecture, are more stringent
than less severe ones.
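The procedure just described amounts to ranking the sub-claims of a conjecture by their plausibility on background knowledge and testing the least plausible first. A minimal sketch follows; the numerical plausibility values are invented for illustration and the function name is hypothetical.

```python
# Hypothetical plausibility estimates based on background knowledge
# (illustrative numbers only, not measured values).
sub_claims = {
    "soluble arsenic compounds are poisonous": 0.9,
    "insoluble arsenic compounds are poisonous": 0.6,
}


def severity_order(claims):
    """Return the sub-claims from least to most plausible,
    so that the most severe test comes first."""
    return sorted(claims, key=claims.get)


for claim in severity_order(sub_claims):
    print(claim)
# insoluble arsenic compounds are poisonous
# soluble arsenic compounds are poisonous
```

Since the plausibility estimates come from fallible background knowledge, the resulting ordering is itself only a conjecture about where the theory is weakest.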
Note that our appraisal of the severity of tests depends on the background
information available at the time. Consider the two claims: (a) "All
yellow non-poisonous substances are free of arsenic" and (b) "All
green non-poisonous substances are free of arsenic." Which domain
should be investigated first if one wishes to perform a severe test of the
original conjecture? Recall that a counter-example to the original conjecture
would be a non-poisonous arsenic compound. So if we think green substances
are more likely to contain arsenic than yellow ones, we should sample the
domain of non-poisonous green substances. If we know nothing about the
typical color of arsenic compounds, however, or if we have reason to believe
that color is not correlated to chemical composition, we would judge the
tests to be equally severe. (As a matter of fact, many arsenic materials
are yellow or black, so there may be a slight preference for a test of yellow
non-poisonous substances.)
Because they depend on vague and incomplete background knowledge, judgments
about which tests are most likely to refute the conjecture are unusually
fallible. For example, the Kohlberg theory of the development of moral
reasoning worked surprisingly well when tested on boys raised in Muslim and
Confucian cultures, but failed when tested on young American girls. (See
Gilligan.) Kohlberg had thought his universal theory might well be sensitive
to differences in the religious ethos, but that factor turned out to be
much less important than gender differences.
A special case of severe testing is what Bacon called a "crucial experiment."
Here one probes the vulnerability of a hypothesis by comparing its predictions
with those of a plausible rival conjecture. If hypothesis A predicts P
and rival hypothesis B predicts not-P, checking on whether P or not-P is
the case will allow us immediately to eliminate one alternative. Contrary
to what its name may imply, a crucial experiment does not prove the truth
of the undefeated hypothesis because there may exist more alternatives which
we have not yet thought of.
For example, according to the Copernican theory, Venus should wax and wane
like the moon. The Ptolemaic system, on the other hand, predicted that
Venus should not exhibit extremely different phases at different times.
This conflict between the predictions of the rival cosmological systems
was noted by Copernicus in 1543. However, it was not possible to conduct
a crucial experiment without a telescope. In 1610, Galileo observed that
Venus did have phases and so the Ptolemaic system was refuted. This crucial
experiment in no way established the truth of the Copernican heliocentric
theory for in 1588 Tycho Brahe had proposed a geocentric system which also
gave the correct predictions concerning Venus. The next order of business
was to design a crucial experiment between the Tychonic and Copernican systems.
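The eliminative logic of the Venus case can be sketched as follows. The predictions are simply those reported in the text, coded as truth values for "Venus shows a full range of phases"; the function name is hypothetical.

```python
# Each system's prediction for "Venus waxes and wanes like the moon",
# as described in the text above.
predictions = {
    "Ptolemaic": False,
    "Copernican": True,
    "Tychonic": True,
}


def surviving(predictions, observed):
    """Keep only the hypotheses whose prediction matches the observation."""
    return [name for name, predicted in predictions.items() if predicted == observed]


# Galileo's 1610 observation: Venus does have phases.
print(surviving(predictions, observed=True))  # ['Copernican', 'Tychonic']
```

The observation removes one rival but leaves two standing, which is why the experiment refuted the Ptolemaic system without establishing the Copernican one.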
Crucial tests are only stringent when the rival hypothesis is a fairly
plausible one (as judged against background knowledge). The more plausible
the rival conjecture to the hypothesis in question, the more stringent is
a crucial test between them. For example, no one would have thought it
necessary to design a crucial test if the only rival were an ad hoc hypothesis
to the effect that Venus shone by its own light but periodically varied
its luminous area from crescent shaped to circular!
Checking on the truth of the least plausible consequences of a conjecture
is the most efficient way of trying to falsify it, and hence Popper recommends
tests with samples which are in a sense biased against the conjecture!
How can this be reconciled with the standard statistical practices of using
random samples or stratified samples? Or can it be? To develop a full-fledged
critique of the Popperian approach to statistics is beyond the scope of
this book, but I will make a few preliminary remarks. First of all, many
statistical studies are not really tests at all, but simply demographic
measurements. If Kinsey wishes to make descriptive claims about overall
American sexual practices, clearly a non-biased sample is desirable. However,
if one is testing the claim that the half-life of radium is always 1600
years or that the proportion of males among neonates is always 0.51 (regardless of conditions),
then it makes sense to focus our inquiry on samples of radium or births
in extraordinary circumstances, namely those which on our background knowledge
are most likely to violate the general claim.
In the case of evaluating causal claims by means of controlled tests, the
Popperian approach once more exhorts us to put most effort into controlling
for those factors which are most likely to be alternatives to the causes
described by our hypothesis. Of course, since our background hunches about
the weaknesses of our conjectures are always fallible, our assessments of
the severity of a test are also fallible and this is a good reason for eventually
performing a wide variety of tests whether they appear to be severe or not.
There have been a variety of reactions to Popper's account of severe testing.
Bayesians have analyzed parallels between Popper's account and their own.
Proponents of the semantic view of theories, on the other hand, sometimes
imply we should invert Popper's methodology and gradually increase the domain
of a theoretical model by first trying to apply it to the instances most
similar to the paradigm cases around which the model was originally constructed.
What new perspectives on scientific testing are provided if we view theories
as solutions to problems? Let's begin with a non-scientific example adapted
from van Fraassen (whose views we will discuss later). Suppose we wish
to test the claim C: Eve ate the apple from the tree of knowledge.
Now imagine two problem situations. In the first case, theologians are
puzzling over the exact symbolism of the apple tree. Did it stand for eternal
life or did it have something to do with the knowledge of good and evil?
C proposes an answer.
In the second case, let us suppose that the controversy is over whether
Eve also ate the apple or whether she merely tempted Adam to eat while remaining
pure herself.
Now we can well imagine that the sorts of historical and textual testing
of C which would be appropriate in the two problem situations would be quite
different. The theologians would look primarily at evidence relating to
the tree issue and might not even care whether it was Adam or Eve or both
who ate the apple. In the second problem situation the relevance of the
tests would be reversed.
I conclude that at least in some cases, knowing which problem the theory
was supposed to solve would influence our choice of tests. Since scientific
theories have lots of content (and hence lots of places to go wrong) and
since most of our theories are probably literally false, it makes sense
to focus our testing on the aspects which are most relevant to the problem
we are trying to solve. Criticism of the non-relevant parts (such as "Eve
didn't actually eat the apple -- she just bit into it and chewed it up but
didn't swallow it because just then God came and chased them out")
may strike us as pedantic.
Knowing the problem-situation seems to help us choose relevant tests in
the case of the idiographic inquiry where the conjectures are singular statements.
But what about in the case of law-like hypotheses? Do we really need to
know what the question is in order to test the truth of the answer?
I grant that in the case of fundamental scientific theories the influence
of problem on testing may be less, but I still think it may be as important
as Popperian severity which is based on improbability. Here is an illustrative
example -- consider the following conjecture:
C: The atomic weight of oxygen is sixteen.
Now the most severe test we can think of is to make measurements accurate
to six figures. (It is highly improbable that this value is exactly right.)
And if the issue is the existence of isotopes that would be quite appropriate.
But what if the problem-situation is an earlier one in which the main dispute
is whether oxygen gas is diatomic? Then accuracy to six significant figures
is not relevant at all.
Perhaps this point is better expressed by saying that before testing one
should clarify or amplify the conjecture. But then this process will also
require us to go back to the problem for which it is intended to be a solution.
d. The Ambiguity of Falsification
We have raised questions about the choice of tests to be performed, but
as described so far, the logic of testing is simple and clear-cut: (1)
We derive a prediction from our conjecture which can be subjected to experimental
check. (2) We do the experiment. (3) If the prediction is wrong, the theory
is refuted. Period. Or so it would seem. In the typical scientific case,
however, the situation is more complicated and the decision as to exactly
which premise is to be given up is less straightforward.
Let us illustrate the dilemma with a famous scientific example, the case
of stellar parallax. After Copernicus put forward his theory that the earth
revolved around the sun, astronomers noted that if his theory were true,
one should be able to detect stellar parallax. If one is moving with respect
to an object, then the direction in which the object appears changes. This
phenomenon is known as parallax. As a race driver moves past the pit stop,
at first it is ahead of him/her. Later it is behind. The angle a in the
diagram below is called the angle of parallax. A similar diagram could
be used to illustrate Copernicus' theory of the earth's annual movement
with respect to a particular star.
----------------------------
Insert Figure 3.3 about here
----------------------------
But when 17th-century observers looked for stellar parallax, they couldn't
detect any. Didn't this mean the theory was false? The supporters of Copernicus'
theory decided to blame an auxiliary assumption instead. Their argument
can be illustrated with the race-car analogy. Suppose the driver sights
on a distant radio tower instead of on the pit stop. Now the angle of parallax
may become too small to be easily noticeable. As the ratio of D to R increases,
a gets smaller. At very large values of D it will become too small to detect.
According to estimates of the distance between the earth and the stars
available at the time, stellar parallax should have been observable. But
the Copernicans argued that these estimates were wrong and claimed that
the universe was about 1,000 times bigger than had previously been imagined.
This bold move turned out to be correct, but 200 years passed before stellar
parallax was detected experimentally.
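How quickly the angle of parallax shrinks with distance can be sketched numerically. Since the figure is not reproduced here, the geometry is an assumption: R is taken to be the observer's sideways displacement, D the distance to the object sighted, and the angle a is taken to satisfy tan(a) = R/D. The function name is hypothetical.

```python
import math


def parallax_angle_arcsec(displacement, distance):
    """Angle (in arcseconds) by which the sighted object appears to shift,
    assuming tan(a) = displacement / distance."""
    angle_rad = math.atan2(displacement, distance)
    return math.degrees(angle_rad) * 3600


# Hold the displacement R fixed at 1 unit and let the ratio D/R grow.
for ratio in (1, 100, 10_000, 1_000_000):
    print(ratio, parallax_angle_arcsec(1.0, ratio))
```

On this assumed geometry the angle falls off roughly in proportion to R/D once D is much larger than R, so multiplying the assumed distance to the stars by 1,000 reduces the expected parallax by about the same factor.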
----------------------------
Insert Figure 3.4 about here
----------------------------
The logic of the testing situation was as follows:
Copernican theory: The earth revolves around the sun, which is stationary
relative to the stars.
Auxiliary hypothesis: The distance between the earth and the stars is about
20,000 earth radii.
Experimental Prediction: (Therefore) Stellar parallax should be easily observable
with the apparatus available.
Experimental Finding: No stellar parallax is observable with the available
apparatus.
Since the prediction failed, one of the premises had to be wrong. Copernicus
blamed the auxiliary hypothesis; anti-Copernicans defended it and blamed
the theory instead. With no good way at the time to test the auxiliary
hypothesis, the status of the Copernican theory was left open.
The philosopher who first stressed that almost all tests involve a lot
of auxiliary assumptions was Pierre Duhem, an early 20th-century philosopher,
physicist, and historian of science. Hence, we will call the following
the Duhemian problem:
When an experimental prediction turns out to be false, should the scientist
blame the theory under test or the auxiliary assumptions (or both)?
Popper emphasizes that there is no methodological recipe for dealing with
the Duhemian problem, but a few guidelines can be laid down. First of all,
one should not use the Duhemian problem as a general excuse for one's pet
theory. It is not good methodology to say, "My theory's prediction
failed? Well, not to worry. I probably made a false auxiliary assumption
somewhere along the line." If one wants to keep the theory despite
the prediction failure, one must point to a specific auxiliary assumption
and then design tests of that auxiliary assumption. If the auxiliary assumption
passes the tests, then we should conclude that our theory and not the auxiliary
was false. Sometimes, however, it is neither possible nor practical to test
auxiliary hypotheses. (We saw an example of this in the Copernican case.)
In such instances, we can draw no firm conclusions about the original test
situation. If a theory in conjunction with a variety of auxiliary assumptions
makes a lot of false experimental predictions, though, we tend to decide
that the theory is false, even though we can't conclusively test each auxiliary.
The Duhemian dilemma can be analyzed as follows:
The theory under test (T) when conjoined with one or more auxiliary hypotheses
(A) makes a prediction (p). Experiments show that p is not the case. By
modus tollens we know that either T or A (or both) must be false, but logic
doesn't tell us which.
(T & A) → p
~p
(Therefore) ~T, or ~A, or ~T & ~A
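That logic alone leaves three possibilities open can be checked by enumerating truth values. The following sketch does only that; the function name is hypothetical.

```python
from itertools import product


def consistent_assignments():
    """Truth values of (T, A) compatible with both premises:
    '(T & A) implies p' and 'not p'."""
    results = []
    for t, a, p in product([True, False], repeat=3):
        premise_1 = (not (t and a)) or p  # (T & A) implies p
        premise_2 = not p                 # the prediction failed
        if premise_1 and premise_2:
            results.append((t, a))
    return results


print(consistent_assignments())
# [(True, False), (False, True), (False, False)]
```

The only combination ruled out is the one in which theory and auxiliary are both true. Which of the remaining three holds is a question the truth table cannot answer.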
Note that in the pure Duhemian problem situation there is no controversy
about the experimental result, ~p. Furthermore, all parties agree that
T & A imply p. The disagreement arises about whether to revise A or
to revise T. Of course, there are also cases in which people cannot agree
on experimental results or on what exactly the implications of the theory
are. These latter disagreements can usually be settled either through further
experimentation or by means of logical analysis. The Duhemian problem is
often more recalcitrant. Popper does give one firm piece of methodological
advice. No matter which premise we decide to replace, the substitute should
never be lower in empirical content.
The main responses to Popper's remarks on the Duhemian dilemma, such as
those of Kuhn and Lakatos, point out that in the history of science, it
is fairly rare to find a case where a theory is refuted by a single, decisive
experiment. More often theories come to be rejected through a variety of
prediction failures. Theories are rarely struck down by a blow from one
type of crucial experiment, no matter how many times that experiment is
repeated. Rather they are eroded away by an accumulation of anomalous results.
We will develop this important critique in the next chapter. Here I will
only remark that if we view theories as problem solutions, then as we
modify our system in response to the Duhemian dilemma we should either ensure
that the new system also answers the original problem(s) or else explicitly
acknowledge that we are abandoning them.
e. The Status of Corroborated Theories
We have discussed what happens when our theory's prediction is refuted--either
we revise it or adjust an auxiliary hypothesis. What happens if our theory
passes the most severe experimental tests we can devise with flying colors?
Can we then declare it proven true, or at least highly probable? It is
perhaps on this issue that Popper's disagreement with the positivists is
deepest.
First of all, the history of science strongly suggests that we should never
feel completely certain about any scientific generalization, no matter how
frequently or stringently it has been tested. Newton's theory of classical
mechanics had perhaps the best track record ever; yet it was superseded
by Einstein's relativistic mechanics. Here are a few other examples of
well-established claims which eventually had to be corrected or rejected:
(i) Matter cannot be created or destroyed. (Not true in nuclear fission
or fusion processes.)
(ii) The sun rises once every twenty-four hours. (Not true at the North
Pole.)
(iii) All molecules of water are made of the same stuff. (Not true for
heavy water, deuterium oxide.)
(iv) The major difference between homo sapiens and the lower animals
is that man can use language. (Not true for chimpanzees which can use
sign language.)
(v) Living matter can only come from living matter; it cannot be formed
from inanimate substances. (Not true--amino acids can be synthesized
from ammonia, methane, hydrogen, etc.)
So the history of science warns us that any scientific claim is fallible.
Logic and philosophy of science can help us understand why this is so.
Here are some of the reasons:
(i) Generalizations cover a potential infinity of cases. But we can
only check on a finite number of predictions. We can never be sure that
the next case won't violate the rule (e.g., a black swan may turn up in
Australia).
(ii) Scientific theories make infinitely precise claims. But we can
only make measurements of finite accuracy. (For example, Newton's law of
gravitation says the force of gravity varies inversely with the square of
the distance, i.e., the exponent of r is exactly 2.00000..., but our
measurements cannot discriminate between r^2 and r^2.0000000001.)
(iii) Many of our scientific laws only hold under idealized conditions--to
give two very simple examples, the law of the lever assumes no friction
at the fulcrum, and the law of the pendulum assumes there is no air resistance.
Of course, we can try to minimize such interferences when we conduct tests,
e.g., by resting our lever on a point or setting up a pendulum in a vacuum,
but our experiments never achieve the perfect conditions which are assumed
in our ideal laws.
(iv) There may be alternative theories which we have not even dreamt of
yet which account for all of the data we have in hand.
For all these reasons, theories are underdetermined by our observational
results and can never be proved through any amount of observation and experiments.
There are no rules for deciding when to accept a theory (for the time being)
and move on to new problems, but what we can do is to answer each of the
above sources of fallibility as best we can.
(i) By testing in widely scattered domains, we guard ourselves against
parochialism, e.g., the black swans in Australia.
(ii) By making our tests as precise and ideal as possible, we can approach
the infinite precision and perfection of our theories.
(iii) And the best way to rule out alternative explanations is to deliberately
try to imagine radically different ways of explaining our results. If we
can devise a new alternative, we can then set up a crucial experiment between
the two competing accounts.
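The earlier point about infinitely precise claims and finitely accurate measurements can be illustrated numerically. This is only a sketch under stated assumptions: the distance `r` and the measurement precision `measurement_precision` are hypothetical values chosen for illustration, and the two exponents are the ones mentioned in the text.

```python
def inverse_power(r, exponent):
    """Relative strength of a force that falls off as 1 / r**exponent."""
    return r ** (-exponent)


r = 10.0                      # hypothetical distance, arbitrary units
measurement_precision = 1e-6  # hypothetical: one part in a million

newtonian = inverse_power(r, 2.0)
rival = inverse_power(r, 2.0000000001)

# Fractional difference between the two predictions at this distance.
relative_difference = abs(newtonian - rival) / newtonian
distinguishable = relative_difference > measurement_precision

print(relative_difference, distinguishable)
```

Under these assumptions the two laws differ by far less than the assumed precision, so no test at that precision could decide between them.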
But what is the exact epistemological status of theories which have survived
critical scrutiny? What positive claims can we make about them? Popper
introduced the term corroboration to describe the severity of the tests
passed by a hypothesis, but he emphatically denies that the degree of corroboration
is to be interpreted as a degree of reasonable belief in the hypothesis
or the probability that it is true. However, he does say that for purposes
of practical action, it is rational to base our behavior on our most highly
corroborated theories. And for purposes of scientific inquiry we should
use the degree of corroboration of various claims as guides to criticism
and revision of our scientific systems. The Duhemian problem would become
completely intractable if we had no way of at least tentatively assigning
the blame for prediction failures. And the whole mechanism of falsification
rests on the existence of 'basic' statements, i.e., statements which all
observers can test and presumably corroborate for themselves.
Popper's theory of corroboration and his views of induction are perhaps
the most controversial aspects of his philosophy and I will not comment
on that far-ranging debate. I will only remark that to the extent that
tests are chosen because of their relevance to the problem-situation, our
estimates of corroboration or Bayesian confirmation or what have you will
also be dependent on problems.
3.? Final Comments
Popper's characterization of the objective aspects of problems is a good
starting point, but it needs to be accompanied by a fuller account of the
factors, be they objective or subjective, which influence problem choice.
If scientists tried to work on all the problems which exist in a World-3
sense, or chose their problems randomly, science as we know it would not
exist.
Popper's methodology stresses problems as the starting point of inquiry
but makes problems less central in the later stages of theory evaluation.
A more thorough-going problems approach would lead us to modify Popper's
account of preliminary theory appraisal and the prioritizing of scientific
tests. It is less obvious how, if at all, viewing theories as solutions
to problems should affect our philosophical accounts of theory corroboration
or confirmation.
no scientific knowledge. (Cf. Cartwright, How the Laws of Physics Lie)
Popper goes on to argue that many of the most basic bits of common-sense
knowledge are also false. The sun does not rise every twenty-four hours,
at least not in the Land of the Midnight Sun and bread can be poisonous,
if it is made from ergotic wheat.
But if we admit the existence of false knowledge claims shouldn't we at
least require that in order to count as knowledge in a particular historical
context a statement must have been well-confirmed by the evidence available
at that time? Shouldn't we insist that one should be justified in believing
knowledge propositions, even if later evidence may cause us to reverse our
appraisal of their truth value?
Some of the details of Popper's criticisms of various justificationist
epistemologies will emerge below. Suffice it to say here that with respect
to the logical positivists Popper argues that there is no infallible empirical
base - the most trivial sounding observation report, such as "Here
is a glass of water", contains so many untested implications, such
as, "If you were to drop it, the water would spill while the glass
would break" or "If you were to cool it, the water would turn
into ice," that it is impossible to verify them all. Furthermore,
even if we were to take some set of observation reports as indubitable evidence,
Popper follows Hume in arguing that they can never provide the kind of logical
justificatory support assumed by inductivist philosophers.
But if we give up these traditional epistemological requirements, how are
we to distinguish those propositions which are part of knowledge from random
sentences generated by the apocryphal monkey at the typewriter, or, more
realistically, from the outputs of programs such as Racter? Is Stove (Popper
and After) right in claiming that Popper has so changed the meaning of knowledge
by completely denying its sense of
cognitive success and achievement that it is misleading for him to continue
to use it? What, according to Popper, are the delimiters of knowledge?
I know of no place where Popper gives a short, italicized definition of
knowledge but we can construct one from his writings. First of all if we
are prepared to identify knowledge with our best science, he does give an
explicit minimal characterization of a scientific claim - it is one which
can be subjected to empirical test - and our best scientific claims are
falsifiable propositions which are not false as far as we know but which
have successfully passed severe empirical testing. Such a tri-partite definition
would replace each component of the old justified-true-belief account,
yielding corroborated-falsifiable-claim.
However, Popper often uses knowledge in a looser sense when he speaks of
'background knowledge' and its role in setting problems or appraising the
severity of a test. These propositions are assumed to be unproblematic
in a particular problem-situation (C. & R, p. 238) but there is no suggestion
that they each have been carefully tested. In fact to demand systematic
testing would lead to a falsificationist regress almost as vicious as the
inductivist one. Neither must critical scrutiny be limited to empirical
testing. Unlike the positivists, Popper would never consider excluding
mathematics or philosophy from knowledge.
The proper conclusion to draw, I think, is that Popper does not try to
delimit knowledge propositions because on his account it makes no sense
to do so. No propositions receive permanent gold stars on Popper's account.
All claims are conjectural although some have been more carefully scrutinized
than others. In some contexts a proposition will be temporarily accepted;
in others it may be challenged. What we can do at any point is to open
the case on any claim and review its test record, its logical compatibility
with other statements, its explanatory power, what problems it helps solve,
what problems it generates, etc. But it would be fruitless to demand that
we always start by reviewing the credentials of every proposition in sight.
3.2 The Objectivity of Scientific Problems
If problems arise within a knowledge context and if knowledge is the
content of some set of propositions (not human attitudes towards them),
then it would seem fairly straightforward to define problems as some sort
of logical inadequacy within the propositional set, such as inconsistency
or incompleteness.
Thus although Popper himself often uses vague, quasi-psychologistic talk
of "violated expectations" (reminiscent of Dewey), one could replace
it by analyses in terms of logical inconsistencies between theoretical systems
and singular observation statements. Of course, logic alone doesn't tell
us which statements count as observations nor which theories are "accepted"
(no problem arises if a theory already considered to be false leads to a
prediction failure), but presumably Popper hopes to give non-psychologistic
accounts of these additional factors.
By giving an objective account of problems, one can explain why there is
often considerable inter-subjective agreement among scientists on what is
problematic about a particular theory. (There is something really "out
there" which makes them feel puzzled.) And one can clarify Dewey's
point about neurotic feelings of doubt not being adequate for inquiry -
the doubt must arise from a situation which is objectively open or indeterminate.
For Popper, although theories are generated by humans, theories can in
turn autonomously generate problems of which no human being is aware. For
example, when people first invented the integers (perhaps using a Brouwer-like
construction), the problem of whether there is a largest prime also came
into existence (although people only worried about it much later). Whether
a problem is solved or not is also determined by looking at the objective
state of the arguments which could be constructed for and against a proposed
solution. Whether people actually accept those arguments or cease feeling
puzzled is not relevant.
To dramatize the difference between the objective status of a problem and
our feelings about it, Popper places them in different worlds. Roughly
speaking, World-1 contains the objects traditionally studied by natural
science (e.g., electrons and chairs). World-2 contains psychological states
(e.g., feelings of doubt, beliefs). And World-3 contains "the objective
contents of thought" (e.g., theories, arguments, problems). Although
minds in W-2 (contained in W-1 bodies) produce everything in W-3, no mind
working either individually or collectively can be subjectively aware of
everything in W-3. (We can't think about each integer nor actually derive
every Euclidean Theorem.) Popper claims the growth of knowledge is best
understood by concentrating on the logical relations between W-3 objects,
not on the psychological states of the people who produce and manipulate
them. But although Popper's notion of objective problem is very useful,
I think we cannot ignore the psychological and sociological aspects of scientific
problems. I will develop this point by listing various points at which
we must refer to these other dimensions of the problem-situation in order
to understand the growth of knowledge.
Let us suppose that humans have generated propositions T and O - both now
reside in W-3. Let us further suppose that T entails ~O, but the deduction
is abstruse and so no one has noticed. Objectively, there is now a problem
in W-3, but no inquiry will result until someone comes to believe that T
entails ~O.
At this point, Popper might wish to add the explicit proposition, T -> ~O,
to W-3 and draw a distinction between potential logical consequences of
our theories and those which have actually been drawn. This seems fair
enough. The derivation of ~O from T constitutes a powerful argument against
T and creating new arguments surely is a contribution to knowledge. (Recall
Galileo's wonderful criticism of the Aristotelian law of falling bodies
- if two cannonballs, while falling, should somehow be tied together to
form a mass twice
as heavy, according to Aristotle's theory, they should speed up, but this
is absurd.) However, this does mean that only those features of W-3 which
people notice can contribute to the growth of knowledge.
Furthermore, if people believe T and O are inconsistent (even though they
aren't), they will try to solve a problem although it would seem that objectively
no problem exists! For example, an early criticism of Darwinian theory
(D) was that it could not explain the evolution of altruistic behavior (A).
We now know that the argument was incorrect - sociobiologists describe
a number of mechanisms, such as kin selection, which resolve the puzzle.
Here is a case where Darwinian theory and Altruism were in W-3 and people
also believed D -> ~A so that was also in W-3. However, ~(D -> ~A) follows
from D so presumably that was also at least potentially in W-3. Yet people's
inquiry was influenced by what was in some sense a "pseudo-problem"
- or at least a problem based on a mistake. Nevertheless, the inquiry resulted
in new knowledge, the discovery of kin selection!
I conclude that logical inconsistencies in W-3 are of no importance in
the growth of knowledge until people notice them. And even "neurotic"
problems (puzzlement based on inconsistencies which aren't really there!)
can be important for the growth of knowledge. So it seems that the subjective
and social awareness of problems is important. However, at any given time,
we are probably aware of an enormous number of problems. Not only are there
the problems arising from known inconsistencies (Lakatos claimed that every
theory lives in a "sea" of anomalies), there are also the explanatory
problems which arise from gaps in our knowledge. Which problem will actually
provoke inquiry? When it comes to the issue of problem selection, Popper
admits that subjective experience plays a part in which problem we emphasize,
or select as important (OK, pp. 166-67). Later in this essay we will explore
the possibility of finding objective ways of evaluating problems.
3.3 Popper's Strong Analogy Between Evolutionary Biology and Epistemology
The account of Popper's theory of the objectivity of knowledge and problems
given above is extracted from the essays collected together in Objective
Knowledge but it would be misleading as it stands because Popper develops
this view in a context which stresses the similarities between the growth
of knowledge in the scientific community, animal learning, and the evolution
of biological species. There is at least a metaphorical sense in which
each of these three processes involves problem-solving, trial solutions,
and error elimination, but Popper proposes that we take the analogy very
seriously indeed. Thus his examples of the problem of violated expectations
include the case of a newborn foal who sucks on the hair under the mare's
front legs and is disappointed until it finds its way to the back, as well
as the case of Newtonian astronomers who did not expect the results of the
Eddington eclipse expedition which detected light bending in a gravitational
field. Popper is not suggesting that foals have propositional attitudes.
Rather, the foal has an inborn "theory" which has "run into difficulties."
He also draws parallels between the problem-solving activities of Einstein
and an amoeba -- he cites the example of a hungry amoeba who learns to swim
towards a light in order to get food. And he includes birds' nests among
the problem solutions which reside in World-3!
Of course, there are other places where Popper stresses the unique features
of science (more of this later) but by separating knowledge from human consciousness
it is very easy for Popper to posit that knowledge is encoded in genotypes
and guppies as well as in geniuses!
Let us now look in more detail at the parallels Popper draws between biological
evolution, animal learning, and scientific inquiry, three processes which
instantiate his general schema:
P --> TS --> EE --> P'
P stands for problem, which is to be understood in an objective sense and
does not imply that the entity which "has" the problem is conscious
of it. In the biological domain, species face problems connected with survival
and reproductive success, such as the problems of escaping predators, raising
young, finding food, mates, etc.
TS stands for tentative solution, such as the information encoded within
a new genotype within the species' pool. EE, or error elimination occurs
if the phenotype bearing the new genotype dies without reproducing (or reproduces
at a lower rate than its conspecifics).
The outcome of reiterations of the TS and EE steps is a species better
adapted to its environment, but new problems (P') will typically lead to
a repetition of the whole selection process.
When the schema is applied to animal learning it looks fairly similar to
Skinnerian operant conditioning. [See Skinner's "Selection by Consequences"]
In response to the "problem" posed by hunger pangs, or whatever,
the animal engages in exploratory behavior (thus proposing a tentative "solution").
Unsuccessful solutions lead to no food, or even pain, and are extinguished.
When a new behavior is successful, however, it becomes part of the individual
animal's patterned response to that type of problem-situation. New problems
then lead to more learning.
The application of the schema to scientific inquiry is quite straightforward.
Scientists propose falsifiable conjectures in response to problems arising
within their knowledge situation, which are then subjected to empirical
test. False hypotheses are thereby eliminated and the scientist is then
free to confront new cognitive problems.
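The trial-and-error core of the schema described above can be written out as a small generate-and-test loop. This is only a toy sketch, not Popper's own formulation: the function `trial_and_error`, its parameters, and the arithmetic problem it is applied to are hypothetical, and the sketch does not model how a new problem P' emerges.

```python
import random


def trial_and_error(passes_test, propose, max_trials=1000, seed=0):
    """P -> TS -> EE: propose tentative solutions, eliminate those that fail."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    eliminated = []
    for _ in range(max_trials):
        candidate = propose(rng)      # TS: a tentative solution (a blind guess)
        if passes_test(candidate):    # the candidate survives testing, for now
            return candidate, eliminated
        eliminated.append(candidate)  # EE: error elimination
    return None, eliminated


# Hypothetical toy problem (P): find an integer from 0 to 20 whose square is 49.
solution, eliminated = trial_and_error(
    passes_test=lambda x: x * x == 49,
    propose=lambda rng: rng.randint(0, 20),
)
print(solution, len(eliminated))
```

Note that the guesses here are entirely blind, and nothing in the loop cares where a candidate came from. Only the test decides what is kept.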
------------------------
Insert Fig. 3.1 about here
------------------------
These three instantiations of the schema are summarized in Figure 3.1.
Let us now comment on some of the important dissimilarities between scientific
inquiry and these other selection processes. One crucial difference, as
Popper notes, is that in science "our mistaken theories die in our
stead". If a species fails to solve a survival problem, it goes extinct.
If a rat fails to find a path through a maze, it goes hungry. In both
cases, the consequences of error (or success for that matter) have a direct
physical effect on organisms. The scientist, on the other hand, may experience
elation or disappointment as a result of empirical testing but these psychological
reactions are not simply coupled to the selection process. For example,
I may derive satisfaction from designing a clever test which refutes a hypothesis
even when the hypothesis was of my own creation. And to the extent to which
science is a "friendly hostile" competition between ideas (to
use Popper's phrasing), there could even be a division of labor between
creation and criticism such that every prediction failure is a personal
triumph!
Because the success or failure of a scientific hypothesis can be decoupled
from pleasure and pain, the scientist is free both to propose bold conjectural
solutions and to test them severely. And the fact that as scientists we
are free to choose problems - they are not forced on us by the environment
- also allows us to operate less cautiously. Although the evaluation of
scientific theories depends crucially on feedback from the environment,
human scientists experience relatively little feedback from prediction successes
or failures. Contrast the situation of the technologist, e.g., a potter
who is trying to solve the problem of how to prevent pots from exploding
in the kiln. Here the problem is set by practical considerations. Tentative
solutions should be economically viable and may be of such limited scope
as to apply only to the local clay and kilns. And it would be absurd to
push any solution which appears to work to extremes. Prediction failures
cost time and money and so the potter will theorize conservatively. Thus,
although the potter, unlike the animal, can articulate the hypotheses under
test and use information in books to criticize them, the potter's situation
is more like the animal's than the scientist's because she or he is directly
rewarded or punished according to the success of the tentative solution.
The scientist's relative freedom from personal repercussions sounds wonderful
and liberating, but it can also pose the following problem. Note that on
the biological or animal level it is not possible for the organism to "ignore"
refutations because it is causally connected to the environment. A dogmatic
potter may engage in a process of psychological denial of the pot shards
from exploding pots but will soon go out of business. But an individual
scientist may evade the elimination of erroneous theories by using ad hoc
modifications or conventionalist stratagems with impunity. To do so is
like cheating at Solitaire - it may not be as much fun, but nothing keeps
you from doing it - except your internalized standards of fair play. An
interesting question, then, is how scientific institutions and traditions
can best reward (or punish!) scientists' activities as they engage in scientific
inquiry. (For example, how do we discourage people from publishing non-reproducible
experimental results while encouraging them to produce interesting detailed
conjectures which may well be falsified?)
Many evolutionary epistemologists have been captivated by the formal resemblances
between the modification of species by natural selection, the modification
of behavior through differential reinforcement, and the modification of
scientific systems through hypothesis testing. For Popper the parallels
are especially easy to draw because he down-plays the importance of conscious
beliefs in science. However, analogies can lead us astray as well as illuminate
and although there may be a definite sense in which genotypes have propositional
content, I think it can hardly be helpful to say birds' nests do -- unless
we are also to say that sand dunes are the winds' solution to the problem
of where to deposit suspended particles of soil and diamonds are the solution
of carbon's problem of how best to solidify under extreme pressure!
As we now look more closely at Popper's account of scientific problem solving,
we will note other mischievous features of his taking the analogy too seriously.
3.4 Popper's Theory of Science as Problem-Solving
Philosophers of science today might admit that to be complete any account
of scientific inquiry should say something about scientific problems but
nevertheless resist the idea of putting problems at the very center of the
enterprise. Let us now look in detail at Popper's methodology and see what
he says about problems at each juncture. We will use as an outline the
flowchart in Figure 3.2.
----------------------------
Insert Figure 3.2 about here
----------------------------
Throughout the discussion I will sometimes supplement Popper's examples
with my own, but they are intended to be ones consonant with his scheme.
a. Typical Scientific Problems
As we have seen, according to Popper, no inquiry begins in a vacuum. Regardless
of what the topic may be, the scientist, like all of us, begins with a motley
collection of ideas, some clear, some confused, some true, some false.
Puzzlement arises when there are inconsistencies or gaps within existing
bodies of knowledge. But how are scientific problems different from those
of ordinary life? Or are they different? Let us begin by surveying the
typical kinds of scientific problems which Popper discusses and then we
will comment on their special characteristics.
(i) Problems arising from violated expectations. A common sort of scientific
problem arises when something surprising or unexpected occurs and we wonder
how or why it happened. An important problem for early astronomers was
the following: In general, celestial bodies, such as the sun, moon and
stars, move across the sky in smooth arcs. However, it was discovered that
the planets wander around the sky irregularly. Can one describe precisely
how the planets move and explain why they move differently from the other
heavenly bodies? Plato called this the problem of the planets. Ptolemy,
Copernicus, and Kepler each offered a different solution to it.
Here is another example of a scientific problem caused by violated expectations:
In 1896 Becquerel found that a batch of photographic plates which had been
carefully stored in black paper were fogged. According to the best scientific
knowledge available at the time, only visible light or x-rays could expose
photographic plates. What could have happened? Becquerel finally began
to suspect that the fogging was caused by an unusual rock he had used as
a paper weight. And it was thus that he discovered radioactivity. Later
Madame Curie showed that the rock contained radium.
(ii) Problems arising from a quest for deep explanations. Even if the
scientist is lucky enough to discover a generalization which seems to have
no exceptions, he or she is still faced with a problem: What causes the
regularity? Why do things happen just that way? For example, early astronomers
asked why the sun rose every day in the east. Some said it was because
the sun moved in a circle around the earth. Later this geocentric theory
was replaced with a heliocentric theory. In either case, a further question
arose: What caused the sun (or earth) to move? According to Aristotle,
there was a Prime Mover. Later people suggested a law of circular inertia,
saying a wheel would move forever if there were no friction. Newton explained
the regular motion in terms of linear inertia and the force of gravity.
There are many other cases in which the problem is to explain a regularity.
Bohr wondered why the wavelengths of the spectral lines of hydrogen should
fit the simple mathematical formula discovered by Balmer. Mendeleev and
other chemists of the late 19th century wondered why the elements should
arrange themselves so nicely into a Periodic Table. By the end of the 18th
century, after the work of Boyle and Charles, everyone knew that gases expanded
on heating. But why? Caloric theorists said that heat was a fluid which
flowed into gases and as a result they took up more room. Kinetic theorists
said heat was kinetic energy and hot gases expanded because their molecules
moved faster. Both sides agreed on the regularity to be explained, but
they offered competing explanations of it.
(iii) Problems arising from a quest for unity. As a science develops,
a new sort of problem often arises: Can one find a unified theory which
covers two or more domains which have previously been treated separately?
For example, for a long time organic chemistry (which deals primarily with
covalent compounds) and inorganic chemistry (which is mainly concerned with
ionic compounds) were considered to be quite distinct fields. At this time
people believed that naturally occurring organic compounds, such as urea,
could not be synthesized in the laboratory because they contained a vital
life force. However, today's theories of chemical bonding apply equally
well to inorganic and organic materials.
Before Galileo, it was held that terrestrial bodies and celestial bodies
obeyed different laws. Galileo (and later Newton) gave a unified account
of the motions of all bodies. A pressing problem in physics today is the
search for a unified field theory--a theory which would successfully combine
relativity theory and quantum mechanics. Psychologists are looking for
a unified theory of learning. Behaviorists can account for some kinds of
learning; cognitive psychology provides explanations for other types of
learning. But one would like to find a single theory which covers all instances
of learning.
(iv) Problems of conflict between theories. Often, problems of finding
a unifying explanation are exacerbated because of inconsistencies between
the component theories. And contradictions can also arise between theories
which appear to cover quite different domains. For example, the biggest
objection to Copernicus' astronomical theory was its conflict with Aristotelian
physics, according to which nothing could continue to move without a mover.
And a strong contemporary objection to Darwin's theory of biological evolution
was Kelvin's geophysical calculation of the age of the earth. (It turned
out later that Kelvin's thermal estimates were wrong because they did not
include the heat generated by radioactive decay.)
Each of the four types of scientific problems discussed above arises out
of a rich background of information and expectations. New scientific theories
are invented when scientists are faced with a problem: Why did my old theory
or set of unconscious expectations fail? What causes this regularity which
I have observed? Can I unify these two branches of science? Or resolve
the inconsistencies between them?
None of these problem types are unique to science. Myth-makers are also
looking for deep explanations and try to give unified pictures of the world
we live in. Everyday life produces many calls for explanations, often of
singular events. And many of our practical problems of existence arise
because the common-sense generalizations we make about the world, including
other people, are violated.
But although there is no sharp demarcation of scientific problems, there
are some obvious differences in degree. In a well developed scientific
field, problems arise within a body of knowledge which is generally more
extensive, more detailed, and better systematized than that of other domains.
(This is not always the case - both folk mythologies and craft technical
lore may be of comparable sophistication.) Furthermore, the scientific
tradition for the most part actively rewards people who expose contradictions
or gaps within the body of science. Folklore and religious systems, by
contrast, are often embedded within conservative institutions which discourage
criticism or revision of the traditional beliefs. To summarize, to the
extent to which scientific knowledge is well-articulated it is relatively
easy to discover flaws in it, and scientific traditions encourage us to
take these problems seriously.
b. Scientific Problem Solutions
We have described various sorts of problems which trigger scientific inquiry.
Our next task is to characterize the sorts of problem solutions which count
as scientific. This is the core of the demarcation problem with which Popper
began.
However, let me digress a moment to point out that we have skipped over
the process by which these tentative solutions are dreamt up in the first
place and the problem of whether there is a logic of discovery. Early philosophers
were optimistic about the prospects of describing a method for discovering
true theories. Bacon and other inductivists thought that through careful
observation and systematic use of his tables one could easily arrive at
the solution to scientific problems. Descartes and other rationalists thought
that a systematic analysis of our clear and distinct ideas would provide
the answers.
Popper argues that there is no recipe for discovery, but from this he concludes
that all the scientist can do is guess at the answer. Some conjectures
will be "happy guesses" as Whewell described them; others will
turn out to be dead wrong. It's all a matter of trial and error. In biological
evolution, mutations occur by chance--we can't predict what new variations
will occur. But natural selection will filter out those which are not adapted
to the environment. Likewise for science. People make up all sorts of
crazy hypotheses. But tests will weed out those which do not match reality.
Quality control is ensured by careful testing procedures, not by censorship
of new ideas. The pattern of reasoning which leads to a new hypothesis
is not important--it may be based on dreams, mystical experiences, weak
analogies or what have you. According to Popper, the origins of the idea
are irrelevant; what is crucial is how well the scientist's hunch stands
up to testing.
Today both cognitive scientists and philosophers of science are optimistic
about being able to describe the structure of the process Popper calls "trial
and error". Here is a place where he is ill-served by the analogy
to biology, although ironically biologists have now given a reduced role
to blind mutations.
Sociologists of course would argue that the origins of ideas are relevant
-- a hypothesis which originates in Utah will have less initial plausibility
than one which comes from MIT. And cognitive scientists, as well as philosophers
such as Campbell and Hesse, dispute the claim that analogies only play a
role in discovery and are then discarded.
And it is interesting to recall that Popper himself claims that one can't
understand theories without knowing about the problems which they solve.
Might this be construed as meaning that the problem-situation out of which
the theory arose is relevant to its evaluation? But let us return to Popper's
order of exposition.
As our account so far makes clear, the solutions to problems which scientists
propose start out being mere hypotheses or conjectures. When they are first
proposed, we have no particular reason to believe them true. Furthermore,
these hypotheses tend to be rather bold and far-reaching. This is because
the typical scientific problems we listed above all require as solutions
theories of high content. Consider Problem Type 1: To explain why our
expectations are violated, we need a theory which accounts both for the
exceptions and the normal states of affairs we had expected. For example,
a good answer to the problem of the planets' irregular motions would also
explain the sun's regular motion.
To turn to Problem Type 2: Trying to give a deep explanation of a regularity
(such as the Balmer formula for hydrogen spectral lines) generally results
in a conjecture which has many other consequences as well (such as a formula
for the spectral lines of sodium). As for Problem Type 3, it is clear that
a unified theory will have more content than either of the separate fields.
And generally such a theory will have lots of new consequences as well.
(For example, the unified theory of chemical bonding covered not only traditional
organic and inorganic compounds, but a whole new domain of organic-metallic
compounds, such as hemoglobin.)
Although they are bold conjectures, Popper argues that conjectures do have
one very important property in their favor: they can be tested by means
of experiments. If one of our conjectures is false, it is realistic to
hope that we will eventually discover its erroneous nature.
Let us now discuss the precise requirements that a theory must satisfy
in order to be falsifiable.
(i) The Logical Requirement. Statements of the form "Some A's are
B's" cannot be refuted by any report involving a finite number of instances,
but universal generalizations, be they affirmative or negative, can be.
A necessary condition for a theory to be falsifiable is that it be logically
possible to contradict it by a finite conjunction of sentences which describe
particular instances.
Popper used the logical requirement to argue for the unfalsifiable status
of many Marxist doctrines. Statements about the "inevitability"
of the downfall of capitalism fail the logical requirement if no time limit
is given. "Light has a maximum velocity" also fails unless a
value is specified.
Many claims which at first appear to be universal generalizations also
fail. For example, "Every metal has a melting point" or "every
action is rational" may be better analyzed as what Watkins called "all-some"
statements, i.e. as saying that for every metal there is some temperature
above which it will melt, and for every action, there is some description
of the agent's problem situation such that the action was appropriate to
it.
On the other hand, the claim "some copper is brittle" looks like
it is not open to refutation by a finite observation report; however, if
it is accompanied by a recipe, "To make copper brittle, place a thin
sheet of it for three days in a nuclear reactor where the neutron flux is..."
it becomes testable.
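The logical requirement can be put schematically. What follows is only a sketch in standard first-order notation; the predicate letters A, B, M and the individual constants a, a_1, ..., a_n are labels introduced here for illustration and do not appear in Popper's or Watkins's text. It also assumes that the domain is not exhausted by the finitely many instances examined.

```latex
% Universal generalization: contradicted by a single instance report.
\[
  \forall x\,(Ax \rightarrow Bx)
  \quad\text{is contradicted by}\quad
  Aa \wedge \neg Ba
\]

% Existential statement: consistent with any finite list of negative instances.
\[
  \exists x\,(Ax \wedge Bx)
  \quad\text{is consistent with}\quad
  \bigwedge_{i=1}^{n} \neg(Aa_i \wedge Ba_i)
  \quad\text{for every finite } n
\]

% "All-some" statement, e.g. every metal x melts above some temperature y.
\[
  \forall x\,\bigl(\mathit{Metal}(x) \rightarrow \exists y\, M(x,y)\bigr)
\]
```

On this reading, an all-some statement inherits the unfalsifiability of its existential part: finding that a given metal fails to melt at each of a finite list of temperatures never contradicts it.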
(ii) The Empirical Requirement. Having the proper logical form is not
sufficient to ensure that a hypothesis is scientifically testable. "All
repressions are seated in the libido" satisfies the logical requirement
but, as it stands, it is not subject to experimental test. How exactly
are we to recognize a repression? And even if we could, how could we tell
whether or not it is seated in the libido?
Contrast the following sentence which has the same logical form: "All
samples of iron have a melting point less than 2000° C." This universal
generalization is subject to test. We can easily determine whether a sample
is iron or not through chemical analysis. (We might use the potassium thiocyanate
test, for example.) And there are also a variety of reliable procedures
for measuring melting points.
The contrast in the above two cases suggests the following requirement:
A falsifiable theory is one which is inconsistent with at least one finite
conjunction of observation test reports. Popper's discussion of test reports,
or 'basic' statements, as he called them in the Logic of Scientific Discovery,
is traditional in many respects: they describe observable events occurring
in an individual region of space and time (p. 103); they are inter-subjectively
testable, i.e. they describe experimental arrangements in such a way that
anyone who has learned the relevant technique can check on their validity
(p. 99).
But Popper departs from the logical positivist and other standard empiricist
accounts in claiming neither that the 'basic' statements are infallible nor
that they are picked out by any psychological criteria. The store of 'basic'
statements and hence whether or not a theory is testable depends on the
technology and state of scientific development available at the time. Before
the invention of the mass spectrograph, "All atoms of an element have
the same weight" would not have been considered testable because as
yet there was no way to determine the weights of individual atoms. What
counts as an observation sentence also changes with the development of instrumentation
and with new theoretical developments. For modern scientists, "This
sample is oxygen" and "This is an electron track" are considered
to be observation statements. In an earlier era they would not have been.
"This sample is a gas which supports combustion" and "This
track in a cloud chamber curves towards the positive plate" might have
been used instead, if the identity of the gas or of the particle was still
in question. The truth of observation statements cannot be decided with
certainty; even so, members of the scientific community can tentatively
agree in their judgments about the truth of observation statements.
Although Popper originally proposed his falsifiability doctrine as a demarcation
between science and pseudo-science, one could also view it as a regulative
principle to guide the development of good scientific theories, not as a
sharp criterion. We can increase the degree of falsifiability of a conjecture
by increasing the domain of phenomena to which it applies, by making more
precise the descriptive claims about the domain, and by inventing less and
less controversial observational procedures for evaluating those claims.
More important than the question of whether Freud's theory has any potential
falsifiers whatsoever is the question of how we might increase its degree
of falsifiability, either by making its claims more precise or by using
detection methods such as plethysmography for detecting patterns of sexual
arousal instead of relying solely on dreams or other traditional psychoanalytic
techniques.
I have just recited the standard Popperian answer to the demarcation problem
which is described in his intellectual autobiography (Unended Quest) and
in Chapter 1 of Conjectures and Refutations as the problem which his falsificationist
theory of science was intended to solve.
But let us now ask how this account might differ if we take seriously Popper's
own claim that theories should be solutions to problems? On this perspective
some of the criteria for appraisal would be different. For example, before
checking on the falsifiability of a theory, shouldn't we first see if it
is even a solution of the problem? Popper discusses the Maori conjecture
that the earth is held up by a turtle and criticizes it, not because it
is false or unfalsifiable, but because it immediately raises the same problem
which it was supposed to solve, namely what holds up the earth (or turtle)?
This example strongly suggests that before (or in addition to) appraising
a conjecture in terms of its falsifiability we should check on whether it
solves "the" problem. This brings the historical context of the
conjecture and perhaps even the intentions of its inventor into the evaluation
of a hypothesis. It also suggests that someone like Freud might not be
castigated so severely for proposing unfalsifiable conjectures if they were
at least solutions to his problem, particularly if no other more falsifiable
solution was available. Perhaps we should instead fault his choice of problem,
not his theory. We will need to return to this case when we present our
account of problem evaluation.
c. The Choice of Scientific Tests
In his account of the empirical appraisal of scientific theories, Popper
once again inverts the positivists' rhetoric. Rather than trying to collect
data which will confirm our conjectures, we should instead conduct those
tests which seem most likely to refute them.
Popper's central point is nicely illustrated by an anecdote recounted by
Francis Bacon:
...it was a good answer that was made by one who, when they showed him hanging
in a temple a picture of those who had paid their vows as having escaped
shipwreck, and would have him say whether he did not now acknowledge the
power of the gods--"Aye," asked he again, "but where are
they painted that were drowned after their vows?" And such is the
way of all superstition...(The New Organon, Bk. I, Aphorism XLVI.)
It is obvious that Bacon is criticizing the way data is being used to argue
for the "power of the gods." But we need to spell out the objection
in detail.
First of all, what exactly is the claim about the power of the gods which
is under discussion? It would appear that the basic thesis which can be
directly tested is the following: "If one makes a vow during a storm
at sea, then one will survive." We can abbreviate the conjecture as:
"If V, then S."* The proposed method for collecting data which
will either support or refute the conjecture is as follows: Go to churches
and record instances of people who paid their vows as thanks for having
escaped drowning. Using our abbreviations, we can describe the instances
so collected as cases of V and S.
At first glance, it may appear that these data do indeed tend to confirm
the conjecture because they are positive instances of the generalization.
But let us look more carefully. What kind of evidence would refute the
conjecture? The answer is a case of someone who made a solemn vow, but
drowned at sea nevertheless, i.e., a case of V and not-S. But given our
method of collecting data, it is logically impossible that we would ever
find such a refuting instance. By looking only at pictures of survivors
(i.e., cases of S) we will never come across an instance of V and not-S, even if
there be millions of such cases. One of the basic principles of scientific
testing can be stated roughly as follows: The outcome of a certain test
procedure cannot confirm a theory unless it is logically possible that there
could have been another outcome which would have disconfirmed the theory.
In order to test "If V, then S", we should sample the domain
of V and find out whether any of them drowned. As Bacon says, "Where
are they painted that were drowned after their vows?" In addition,
we should also look at examples of people who in fact drowned and find out
if any of them had made vows. (This might be difficult to do in practice,
but we could check their diaries, ask their mates, etc.) It is useless
to look at cases already known to be S or not-V. Such "tests"
are irrelevant to the conjecture under consideration because it is logically
impossible that they could ever yield a refuting case.
*This is probably somewhat oversimplified. The proponents of the power-of-the-gods
theory may have only wished to defend a weaker claim: "If one prays,
one is less likely to be drowned." We will postpone the discussion
of the testing of probabilistic generalizations until later.
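The difference between the two ways of collecting data can be made concrete with a small sketch. The population below is entirely hypothetical and invented for illustration, as are the function names; the point is only that the temple sample cannot contain a case of V and not-S, however many such cases exist.

```python
from typing import NamedTuple


class Sailor(NamedTuple):
    vowed: bool     # V: made a vow during the storm
    survived: bool  # S: survived the storm


# Hypothetical population, invented purely for illustration.
POPULATION = [
    Sailor(vowed=True, survived=True),
    Sailor(vowed=True, survived=True),
    Sailor(vowed=True, survived=False),   # a refuting case: V and not-S
    Sailor(vowed=True, survived=False),   # another refuting case
    Sailor(vowed=False, survived=True),
    Sailor(vowed=False, survived=False),
]


def is_refuter(s: Sailor) -> bool:
    """A counter-example to 'If V, then S' is a case of V and not-S."""
    return s.vowed and not s.survived


def temple_sample(population):
    """No-risk collecting: only survivors who paid their vows are painted."""
    return [s for s in population if s.vowed and s.survived]


def vower_sample(population):
    """A genuine test: start from everyone who vowed, whatever their fate."""
    return [s for s in population if s.vowed]


def count_refuters(sample) -> int:
    """Count the cases of V and not-S in a sample."""
    return sum(is_refuter(s) for s in sample)


temple_refuters = count_refuters(temple_sample(POPULATION))
vower_refuters = count_refuters(vower_sample(POPULATION))
print(temple_refuters, vower_refuters)
```

Whatever population is supplied, the temple sample yields zero refuting cases by construction, whereas sampling the domain of V exposes them if they exist.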
We might label the procedure described by Bacon as "no-risk data
collecting" because the way in which the data is collected makes it
logically impossible for a refutation to appear. Once pointed out, the
methodological error is blatant; nevertheless it can be seductive. For
example, after teaching scientific method for a number of years, I once
caught myself reasoning as follows: I observed that all of my close friends
who blinked a lot and tipped their heads back when looking at me wore contact
lenses. I then started investigating other people who behaved similarly
and sure enough I nearly always found independent evidence that they were
wearing contacts. Sometimes I asked them. Other times I would see a lens
holder in their purse or bathroom, etc. I soon jumped to the following
conclusion: "All people who wear contact lenses blink a lot and peer
down their noses when they look at you."
This conclusion was obviously too strong, given that I had done only an
informal study on a very small sample. But I did think that my experience
justified a more modest statement: "All contact lens wearers whom I
have met blink a lot, etc." What was not clear to me for quite some
time is that none of my observations had served as a test for either conjecture.
For I had always begun my observations with people who blinked! Given
this choice of sample domain, I could have investigated all the blinkers
and peerers in the world and never found a counter-example to my conjecture--not
because there weren't any, but simply because it was logically impossible
for my method of data collection to uncover them.
Popper adds to Bacon's point by stressing that good scientific tests should
be severe ones, that is, they should be deliberately designed, using our
general background knowledge to probe the conjecture at its weakest point,
i.e., to find a refutation if one does in fact exist. For example, when
Kohlberg put forward a theory about the development of moral reasoning in
children, he was well advised to test it on children from Turkey and Taiwan.
We might expect a theory developed on the basis of experience with kids
in Boston to fail when applied to children from quite different cultures
and religions. (As it turned out, the Kohlberg theory passed this severe
test.) Similarly, theories about the universality of the Oedipal complex
should be tested on aborigines, and theories about language learning on
deaf and blind children. Theories about geological change and biological
evolution should be tested, where possible, by data from other planets.
Physicists know that theories often fail under conditions of high energy
or high velocity; and often processes at the micro level violate generalizations
which work well with medium-sized objects. For this reason physicists want
to build ever bigger accelerators for smaller and smaller particles.
The general procedure for designing a severe test is as follows: The hypothesis
under test always makes a series of claims. For example, the claim "All
arsenic compounds are poisonous" says that both soluble and insoluble
arsenic compounds are poisonous. It also says that both yellow and green
non-poisonous substances are free of arsenic. (Don't forget the contrapositive!)
According to our background information, some of these claims sound less
plausible than others. For example, since we know that many poisons have
to be digested in order to act, we may decide that insoluble arsenic compounds
are less likely to be poisonous than soluble ones. A severe test is one
which tests the least plausible claims of a theory. In our example, given
our background theories about the relationship between solubility and poisonous
character, we should start testing by looking at insoluble arsenic compounds.
If the conjecture passes this severe test, we will then look at the class
of soluble arsenic compounds. Other things being equal, severe tests, i.e.,
tests of the least plausible claims of a conjecture, are more stringent
than less severe ones.
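The procedure just described amounts to ordering the sub-claims of a conjecture by their plausibility on background knowledge and testing the least plausible first. Here is a minimal sketch; the numerical plausibility scores are invented stand-ins for informal background judgments, and nothing in Popper's account requires that such judgments be quantified.

```python
# Hypothetical plausibility scores (0 = very implausible, 1 = very plausible).
# The numbers are invented; they merely encode the judgment in the text that
# insoluble arsenic compounds are less likely to be poisonous than soluble ones.
SUBCLAIM_PLAUSIBILITY = {
    "soluble arsenic compounds are poisonous": 0.9,
    "insoluble arsenic compounds are poisonous": 0.4,
}


def order_by_severity(plausibility):
    """Return the sub-claims ordered from least to most plausible."""
    return sorted(plausibility, key=plausibility.get)


test_order = order_by_severity(SUBCLAIM_PLAUSIBILITY)
print(test_order[0])
```

Changing the background scores changes the ordering, which is one way of seeing why judgments of severity are relative to the information available at the time.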
Note that our appraisal of the severity of tests depends on the background
information available at the time. Consider the two claims: (a) "All
yellow non-poisonous substances are free of arsenic" and (b) "All
green non-poisonous substances are free of arsenic." Which domain
should be investigated first if one wishes to perform a severe test of the
original conjecture? Recall that a counter-example to the original conjecture
would be a non-poisonous arsenic compound. So if we think green substances
are more likely to contain arsenic than yellow ones, we should sample the
domain of non-poisonous green substances. If we know nothing about the
typical color of arsenic compounds, however, or if we have reason to believe
that color is not correlated to chemical composition, we would judge the
tests to be equally severe. (As a matter of fact, many arsenic materials
are yellow or black, so there may be a slight preference for a test of yellow
non-poisonous substances.)
Because they depend on vague and incomplete background knowledge, judgments
about which tests are most likely to refute the conjecture are unusually
fallible. For example, the Kohlberg theory of the development of moral
reasoning worked surprisingly well when tested on boys raised in Muslim and
Confucian cultures, but failed when tested on young American girls. (See
Gilligan.) Kohlberg had thought his universal theory might well be sensitive
to differences in the religious ethos, but that factor turned out to be
much less important than gender differences.
A special case of severe testing is what Bacon called a "crucial experiment."
Here one probes the vulnerability of a hypothesis by comparing its predictions
with those of a plausible rival conjecture. If hypothesis A predicts P
and rival hypothesis B predicts not-P, checking on whether P or not-P is
the case will allow us immediately to eliminate one alternative. Contrary
to what its name may imply, a crucial experiment does not prove the truth
of the undefeated hypothesis because there may exist more alternatives which
we have not yet thought of.
For example, according to the Copernican theory, Venus should wax and wane
like the moon. The Ptolemaic system, on the other hand, predicted that
Venus should not exhibit extremely different phases at different times.
This conflict between the predictions of the rival cosmological systems
was noted by Copernicus in 1543. However, it was not possible to conduct
a crucial experiment without a telescope. In 1610, Galileo observed that
Venus did have phases and so the Ptolemaic system was refuted. This crucial
experiment in no way established the truth of the Copernican heliocentric
theory for in 1588 Tycho Brahe had proposed a geocentric system which also
gave the correct predictions concerning Venus. The next order of business
was to design a crucial experiment between the Tychonic and Copernican systems.
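The eliminative character of a crucial experiment can be sketched as a simple filter over rival hypotheses. The table of predictions below just restates the Venus example from the text; the function name is hypothetical.

```python
# Does each system predict that Venus shows a full cycle of phases?
# These entries restate the example in the text.
PREDICTS_FULL_PHASES = {
    "Ptolemaic": False,
    "Copernican": True,
    "Tychonic": True,
}


def surviving_hypotheses(predictions, observed):
    """Eliminate every hypothesis whose prediction contradicts the observation."""
    return sorted(
        name for name, predicted in predictions.items() if predicted == observed
    )


# Galileo's 1610 observation: Venus does show phases.
survivors = surviving_hypotheses(PREDICTS_FULL_PHASES, observed=True)
print(survivors)
```

The observation removes one rival but leaves two standing, which is why the experiment refutes without proving.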
Crucial tests are only stringent when the rival hypothesis is a fairly
plausible one (as judged against background knowledge). The more plausible
the rival conjecture to the hypothesis in question, the more stringent is
a crucial test between them. For example, no one would have thought it
necessary to design a crucial test if the only rival were an ad hoc hypothesis
to the effect that Venus shone by its own light but periodically varied
its luminous area from crescent shaped to circular!
Checking on the truth of the least plausible consequences of a conjecture
is the most efficient way of trying to falsify it, and hence Popper recommends
tests with samples which are in a sense biased against the conjecture!
How can this be reconciled with the standard statistical practices of using
random samples or stratified samples? Or can it be? To develop a full-fledged
critique of the Popperian approach to statistics is beyond the scope of
this book, but I will make a few preliminary remarks. First of all, many
statistical studies are not really tests at all, but simply demographic
measurements. If Kinsey wishes to make descriptive claims about overall
American sexual practices, clearly a non-biased sample is desirable. However,
if one is testing the claim that the half-life of radium is always 1600
years or that the proportion of males among neonates is always 0.51 (regardless of conditions),
then it makes sense to focus our inquiry on samples of radium or births
in extraordinary circumstances, namely those which on our background knowledge
are most likely to violate the general claim.
In the case of evaluating causal claims by means of controlled tests, the
Popperian approach once more exhorts us to put most effort into controlling
for those factors which are most likely to be alternatives to the causes
described by our hypothesis. Of course, since our background hunches about
the weaknesses of our conjectures are always fallible, our assessments of
the severity of a test are also fallible and this is a good reason for eventually
performing a wide variety of tests whether they appear to be severe or not.
There have been a variety of reactions to Popper's account of severe testing.
Bayesians have analyzed parallels between Popper's account and their own.
Proponents of the semantic view of theories, on the other hand, sometimes
imply we should invert Popper's methodology and gradually increase the domain
of a theoretical model by first trying to apply it to the instances most
similar to the paradigm cases around which the model was originally constructed.
What new perspectives on scientific testing are provided if we view theories
as solutions to problems? Let's begin with a non-scientific example adapted
from van Fraassen (whose views we will discuss later). Suppose we wish
to test the claim C: Eve ate the apple from the tree of knowledge.
Now imagine two problem situations. In the first case, theologians are
puzzling over the exact symbolism of the apple tree. Did it stand for eternal
life or did it have something to do with the knowledge of good and evil?
C proposes an answer.
In the second case, let us suppose that the controversy is over whether
Eve also ate the apple or whether she merely tempted Adam to eat while remaining
pure herself.
Now we can well imagine that the sorts of historical and textual testing
of C which would be appropriate in the two problem situations would be quite
different. The theologians would look primarily at evidence relating to
the tree issue and might not even care whether it was Adam or Eve or both
who ate the apple. In the second problem situation the relevance of the
tests would be reversed.
I conclude that at least in some cases, knowing which problem the theory
was supposed to solve would influence our choice of tests. Since scientific
theories have lots of content (and hence lots of places to go wrong) and
since most of our theories are probably literally false, it makes sense
to focus our testing on the aspects which are most relevant to the problem
we are trying to solve. Criticism of the non-relevant parts (such as "Eve
didn't actually eat the apple -- she just bit into it and chewed it up but
didn't swallow it because just then God came and chased them out")
may strike us as pedantic.
Knowing the problem-situation seems to help us choose relevant tests in
the case of the idiographic inquiry where the conjectures are singular statements.
But what about the case of law-like hypotheses? Do we really need to
know what the question is in order to test the truth of the answer?
I grant that in the case of fundamental scientific theories the influence
of problem on testing may be less, but I still think it may be as important
as Popperian severity which is based on improbability. Here is an illustrative
example -- consider the following conjecture:
C: The atomic weight of oxygen is sixteen.
Now the most severe test we can think of is to make measurements accurate
to six figures. (It is highly improbable that this value is exactly right.)
And if the issue is the existence of isotopes that would be quite appropriate.
But what if the problem-situation is an earlier one in which the main dispute
is whether oxygen gas is diatomic? Then accuracy to six significant figures
is not relevant at all.
Perhaps this point is better expressed by saying that before testing one
should clarify or amplify the conjecture. But then this process will also
require us to go back to the problem for which it is intended to be a solution.
d. The Ambiguity of Falsification
We have raised questions about the choice of tests to be performed, but
as described so far, the logic of testing is simple and clear-cut: (1)
We derive a prediction from our conjecture which can be subjected to experimental
check. (2) We do the experiment. (3) If the prediction is wrong, the theory
is refuted. Period. Or so it would seem. In the typical scientific case,
however, the situation is more complicated and the decision as to exactly
which premise is to be given up is less straightforward.
Let us illustrate the dilemma with a famous scientific example, the case
of stellar parallax. After Copernicus put forward his theory that the earth
revolved around the sun, astronomers noted that if his theory were true,
one should be able to detect stellar parallax. If one is moving with respect
to an object, then the direction in which the object appears changes. This
phenomenon is known as parallax. As a race driver moves past the pit stop,
at first it is ahead of him/her. Later it is behind. The angle a in the
diagram below is called the angle of parallax. A similar diagram could
be used to illustrate Copernicus' theory of the earth's annual movement
with respect to a particular star.
----------------------------
Insert Figure 3.3 about here
----------------------------
But when 17th-century observers looked for stellar parallax, they couldn't
detect any. Didn't this mean the theory was false? The supporters of Copernicus'
theory decided to blame an auxiliary assumption instead. Their argument
can be illustrated with the race-car analogy. Suppose the driver sights
on a distant radio tower instead of on the pit stop. Now the angle of parallax
may become too small to be easily noticeable. As the ratio of D to R increases,
a gets smaller. At very large values of D it will become too small to detect.
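The dependence of the angle on distance can be written out explicitly. Since the figure is not reproduced here, the following is a sketch under stated assumptions: R is taken to be the length of the observer's baseline (the stretch of track, or the span of the earth's orbit), D the distance to the sighted object measured perpendicular to that baseline, and a the angle the baseline subtends at the object. The figure itself may define these symbols somewhat differently.

```latex
\[
  \tan a = \frac{R}{D},
  \qquad\text{so that}\qquad
  a \approx \frac{R}{D} \ \text{(in radians) when } D \gg R .
\]
```

On these assumptions, multiplying D by some factor while holding R fixed divides a by roughly the same factor, so a sufficiently distant object shows a parallax below the resolution of any given instrument.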
According to estimates of the distance between the earth and the stars
available at the time, stellar parallax should have been observable. But
the Copernicans argued that these estimates were wrong and claimed that
the universe was about 1,000 times bigger than had previously been imagined.
This bold move turned out to be correct, but 200 years passed before stellar
parallax was detected experimentally.
----------------------------
Insert Figure 3.4 about here
----------------------------
The logic of the testing situation was as follows:
Copernican theory: The earth revolves around the sun, which is stationary
relative to the stars.
Auxiliary hypothesis: The distance between the earth and the stars is about
20,000 earth radii.
Experimental Prediction: (Therefore) Stellar parallax should be easily observable
with the apparatus available.
Experimental Finding: No stellar parallax is observable with the available
apparatus.
Since the prediction failed, one of the premises had to be wrong. Copernicus
blamed the auxiliary hypothesis; anti-Copernicans defended it and blamed
the theory instead. With no good way at the time to test the auxiliary
hypothesis, the status of the Copernican theory was left open.
The philosopher who first stressed that almost all tests involve a lot
of auxiliary assumptions was Pierre Duhem, an early 20th-century philosopher,
physicist, and historian of science. Hence, we will call the following
the Duhemian problem:
When an experimental prediction turns out to be false, should the scientist
blame the theory under test or the auxiliary assumptions (or both)?
Popper emphasizes that there is no methodological recipe for dealing with
the Duhemian problem, but a few guidelines can be laid down. First of all,
one should not use the Duhemian problem as a general excuse for one's pet
theory. It is not good methodology to say, "My theory's prediction
failed? Well, not to worry. I probably made a false auxiliary assumption
somewhere along the line." If one wants to keep the theory despite
the prediction failure, one must point to a specific auxiliary assumption
and then design tests of that auxiliary assumption. If the auxiliary assumption
passes the tests, then we should conclude that our theory and not the auxiliary
was false. Sometimes, however, it is neither possible nor practical to test
auxiliary hypotheses. (We saw an example of this in the Copernican case.)
In such instances, we can draw no firm conclusions about the original test
situation. If a theory in conjunction with a variety of auxiliary assumptions
makes a lot of false experimental predictions, though, we tend to decide
that the theory is false, even though we can't conclusively test each auxiliary.
The Duhemian dilemma can be analyzed as follows:
The theory under test (T) when conjoined with one or more auxiliary hypotheses
(A) makes a prediction (p). Experiments show that p is not the case. By
modus tollens we know that either T or A (or both) must be false, but logic
doesn't tell us which.
(T & A) -> p
~p
(Therefore) ~T, or ~A, or ~T & ~A
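That logic fixes the disjunctive conclusion but not the choice between its disjuncts can be checked by brute force over truth values. The sketch below enumerates every assignment to T, A, and p that satisfies both premises; the variable and helper names are chosen here for illustration.

```python
from itertools import product


def implies(x: bool, y: bool) -> bool:
    """Material conditional: x -> y."""
    return (not x) or y


# All truth assignments (T, A, p) satisfying both premises:
# (T & A) -> p, and not-p.
models = [
    (T, A, p)
    for T, A, p in product([True, False], repeat=3)
    if implies(T and A, p) and not p
]

# The disjunction "not-T or not-A" holds in every such assignment ...
conclusion_follows = all((not T) or (not A) for T, A, p in models)
# ... but neither disjunct is forced on its own.
theory_must_be_false = all(not T for T, A, p in models)
auxiliary_must_be_false = all(not A for T, A, p in models)

print(conclusion_follows, theory_must_be_false, auxiliary_must_be_false)
```

Three assignments survive: T false with A true, T true with A false, and both false. The premises alone do not single out any one of them.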
Note that in the pure Duhemian problem situation there is no controversy
about the experimental result, ~p. Furthermore, all parties agree that
T & A imply p. The disagreement arises about whether to revise A or
to revise T. Of course, there are also cases in which people cannot agree
on experimental results or on what exactly the implications of the theory
are. These latter disagreements can usually be settled either through further
experimentation or by means of logical analysis. The Duhemian problem is
often more recalcitrant. Popper does give one firm piece of methodological
advice. No matter which premise we decide to replace, the substitute should
never be lower in empirical content.
The main responses to Popper's remarks on the Duhemian dilemma, such as
those of Kuhn and Lakatos, point out that in the history of science, it
is fairly rare to find a case where a theory is refuted by a single, decisive
experiment. More often theories come to be rejected through a variety of
prediction failures. Theories are rarely struck down by a blow from one
type of crucial experiment, no matter how many times that experiment is
repeated. Rather, they are eroded away by an accumulation of anomalous results.
We will develop this important critique in the next chapter. Here I will
only remark that if we view theories as problem solutions, then as we
modify our system in response to the Duhemian dilemma we should either ensure
that the new system also answers the original problem(s) or else explicitly
acknowledge that we are abandoning them.
e. The Status of Corroborated Theories
We have discussed what happens when our theory's prediction is refuted--either
we revise it or adjust an auxiliary hypothesis. What happens if our theory
passes the most severe experimental tests we can devise with flying colors?
Can we then declare it proven true, or at least highly probable? It is
perhaps on this issue that Popper's disagreement with the positivists is
deepest.
First of all, the history of science strongly suggests that we should never
feel completely certain about any scientific generalization, no matter how
frequently or stringently it has been tested. Newton's theory of classical
mechanics had perhaps the best track record ever; yet it was superseded
by Einstein's relativistic mechanics. Here are a few other examples of
well-established claims which eventually had to be corrected or rejected:
(i) Matter cannot be created or destroyed. (Not true in nuclear fission
or fusion processes.)
(ii) The sun rises once every twenty-four hours. (Not true at the North
Pole.)
(iii) All molecules of water are made of the same stuff. (Not true for
heavy water, deuterium oxide.)
(iv) The major difference between Homo sapiens and the lower animals
is that man can use language. (Not true for chimpanzees, which can use
sign language.)
(v) Living matter can only come from living matter; it cannot be formed
from inanimate substances. (Not true--amino acids can be synthesized
from ammonia, methane, hydrogen, etc.)
So the history of science warns us that any scientific claim is fallible.
Logic and philosophy of science can help us understand why this is so.
Here are some of the reasons:
(i) Generalizations cover a potential infinity of cases. But we can
only check on a finite number of predictions. We can never be sure that
the next case won't violate the rule (e.g., a black swan may turn up in
Australia).
(ii) Scientific theories make infinitely precise claims. But we can
only make measurements of finite accuracy. (For example, Newton's law of
gravitation says the force of gravity varies inversely with the square of
the distance, i.e., the exponent of r is exactly 2.00000..., but our measurements
cannot discriminate between r^2 and r^2.0000000001.)
(iii) Many of our scientific laws only hold under idealized conditions--to
give two very simple examples, the law of the lever assumes no friction
at the fulcrum, and the law of the pendulum assumes there is no air resistance.
Of course, we can try to minimize such interferences when we conduct tests,
e.g., by resting our lever on a point or setting up a pendulum in a vacuum,
but our experiments never achieve the perfect conditions which are assumed
in our ideal laws.
(iv) There may be alternative theories which we have not even dreamt of
yet which account for all of the data we have in hand.
For all these reasons, theories are underdetermined by our observational
results and can never be proved through any amount of observation and experiment.
There are no rules for deciding when to accept a theory (for the time being)
and move on to new problems, but what we can do is to answer each of the
above sources of fallibility as best we can.
(i) By testing in widely scattered domains, we guard ourselves against
parochialism, e.g., the black swans in Australia.
(ii) By making our tests as precise and ideal as possible, we can approach
the infinite precision and perfection of our theories.
(iii) And the best way to rule out alternative explanations is to deliberately
try to imagine radically different ways of explaining our results. If we
can devise a new alternative, we can then set up a crucial experiment between
the two competing accounts.
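The earlier point about finite measurement accuracy can be made concrete with a little arithmetic. The sketch below is only illustrative: it compares an exact inverse-square law with a rival whose exponent is 2.0000000001, assuming both are normalized to agree at some reference distance. The thousandfold distance range and the one-part-in-a-million measurement precision are assumptions chosen for illustration, not figures from any actual experiment, and the function name is hypothetical.

```python
import math


def relative_force_difference(distance_ratio, exponent_offset):
    """Fraction by which a force law with exponent 2 + exponent_offset
    falls below the exact inverse-square law at r = distance_ratio * r0,
    when both laws are normalized to agree at the reference distance r0."""
    x = exponent_offset * math.log(distance_ratio)
    # 1 - distance_ratio ** (-exponent_offset), computed stably for tiny x
    return -math.expm1(-x)


# Assumed values, for illustration only
distance_ratio = 1_000.0      # tests span a thousandfold range of distances
exponent_offset = 1e-10       # rival exponent is 2.0000000001
assumed_precision = 1e-6      # hypothetical one part in a million

difference = relative_force_difference(distance_ratio, exponent_offset)
print(f"relative difference between the two laws: {difference:.3e}")
print("distinguishable at the assumed precision:", difference > assumed_precision)
```

Under these assumptions the two laws differ by less than one part in a billion, well below the assumed precision, so no test of this kind could tell them apart. Sharper measurements or a wider range of distances narrow the set of surviving rivals but never reduce it to a single exponent.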
But what is the exact epistemological status of theories which have survived
critical scrutiny? What positive claims can we make about them? Popper
introduced the term corroboration to describe the severity of the tests
passed by a hypothesis, but he emphatically denies that the degree of corroboration
is to be interpreted as a degree of reasonable belief in the hypothesis
or the probability that it is true. However, he does say that for purposes
of practical action, it is rational to base our behavior on our most highly
corroborated theories. And for purposes of scientific inquiry we should
use the degree of corroboration of various claims as guides to criticism
and revision of our scientific systems. The Duhemian problem would become
completely intractable if we had no way of at least tentatively assigning
the blame for prediction failures. And the whole mechanism of falsification
rests on the existence of 'basic' statements, i.e., statements which all
observers can test and presumably corroborate for themselves.
Popper's theory of corroboration and his views on induction are perhaps
the most controversial aspects of his philosophy, and I will not comment
on that far-ranging debate. I will only remark that to the extent that
tests are chosen because of their relevance to the problem-situation, our
estimates of corroboration or Bayesian confirmation or what have you will
also be dependent on problems.
3.? Final Comments
Popper's characterization of the objective aspects of problems is a good
starting point, but it needs to be accompanied by a fuller account of the
factors, be they objective or subjective, which influence problem choice.
If scientists tried to work on all the problems which exist in a World-3
sense, or chose their problems randomly, science as we know it would not
exist.
Popper's methodology stresses problems as the starting point of inquiry
but makes problems less central in the later stages of theory evaluation.
A more thoroughgoing problems approach would lead us to modify Popper's
account of preliminary theory appraisal and the prioritizing of scientific
tests. It is less obvious how, if at all, viewing theories as solutions
to problems should affect our philosophical accounts of theory corroboration
or confirmation.