A Solution to Carey, Exercise I

The causal link suggested in the problem is between chiropractic treatment (which is left unspecified but generally involves manipulation of the spine) and lower back problems. The question we need to try to answer by our various types of studies is: Is manipulation of the spine more effective at treating lower back problems than is treatment involving drugs and surgery? The passage does not give us the success rate of medical doctors in treating such problems so we want to design experiments that will provide us with information about the relative effectiveness of the two types of treatment.


1. Randomized experiment. We might begin with a group of people all having lower back problems of roughly the same severity and none of whom have yet sought medical aid of any sort. Where we might find such a group is difficult to say, but we might cull one from among workers in a profession that is known to involve a high risk of back injury, say furniture movers or longshoremen. Or we might simply run an ad in the newspaper asking for volunteers. At any rate, having found a group of experimental subjects, we will want to "fine tune" the group a bit to account for factors other than treatment known to influence the rate of improvement for back problems: weight, age, and fitness come to mind. Once we have come up with a group of subjects who are pretty much alike with respect to such factors, we will divide them into experimental and control groups.

Members of the experimental group will be sent to chiropractors for treatment and members of the control group will be sent to medical doctors who specialize in treatment of lower back problems. Since we know that 70% of people who see chiropractors report improvement within 90 days, we need to let our experiment run for at least that long. At the end of the specified period of time, we will evaluate the conditions of the subjects. If chiropractors are more effective than medical doctors we would expect more improvement in the experimental group.
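The text does not say how the subjects are divided into the two groups. Assuming, as the name "randomized experiment" suggests, that assignment is left to chance, the division might be sketched as follows. The subject IDs, the group size of 200, and the function name assign_groups are hypothetical and serve only as illustration.

```python
import random

def assign_groups(subject_ids, seed=None):
    """Randomly split subjects into two halves: (experimental, control)."""
    rng = random.Random(seed)      # a seeded generator makes the split reproducible
    shuffled = list(subject_ids)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical pool of 200 screened volunteers.
subjects = [f"S{i:03d}" for i in range(1, 201)]
experimental, control = assign_groups(subjects, seed=42)

print(len(experimental), "subjects go to chiropractors")
print(len(control), "subjects go to medical doctors")
```

Leaving the split to chance means that neither the subjects nor the experimenters decide who receives which treatment.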


1a. Do you have a good sense, statistically speaking, of the level of effect required to indicate a causal link? The level of difference in effect will depend, of course, on the size of our experimental and control groups. If, say, we were to use two groups of 100, we would need to see a difference in effect of about 20% (or perhaps a few percent less), as the margin of error for groups of 100 is ±10%. Any smaller difference would warrant the conclusion that the two types of treatment are approximately equal in effectiveness or that any difference in effectiveness is too small to measure in a study of this size.
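The figures above can be checked with a short calculation. This is only a sketch: it assumes the margin of error is the usual worst-case 95% margin for a sample proportion (z = 1.96, p = 0.5) and that the required difference is the sum of the two groups' margins. The text states only the resulting numbers, so both assumptions are ours, as are the function names.

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Approximate 95% margin of error for a proportion in a group of size n.

    Assumption: p = 0.5 is the worst case, giving the widest margin.
    """
    return z * math.sqrt(p * (1 - p) / n)

def required_difference(n):
    """Smallest difference between two groups of size n whose margins do not overlap."""
    return 2 * margin_of_error(n)

n = 100
print(f"margin of error for groups of {n}: {margin_of_error(n):.1%}")
print(f"difference needed between groups: {required_difference(n):.1%}")
```

Under these assumptions the margin for groups of 100 comes out just under 10%, and the required difference just under 20%, in line with the rounded figures in the text.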

1b. Have you controlled for other factors that might affect the outcome of your experiment? In selecting our initial group we took pains to ensure that all subjects had complaints of roughly the same severity and that all are roughly the same with respect to factors other than those for which we are testing that might contribute to improvement. One other factor comes to mind that might influence our results. There are no doubt differences in the effectiveness of treatment provided by various chiropractors and medical doctors. To control for this, we might want to specify the exact treatments each group will be allowed to use. Beyond this, it is hard to imagine what we might do to further ensure that we have really effective practitioners.

1c. Does your experimental design rule out the possibility of experimenter bias? One potential source of bias concerns the experimenter or experimenters who will be evaluating the results. It seems unlikely that most back problems will completely disappear after 90 days, so what will need to be assessed, in many cases, is the level of improvement; one crucial measure of this will be the subjects' subjective reports of how much better they feel: how much less pain they are feeling and how much more mobile they seem to be. Assessing such reports will be difficult enough, since the reports may not be all that precise in any quantifiable way. Here, the preconceptions of the evaluators might influence their rating of various subjects. Hence it seems important that our evaluators not know whether subjects were members of the experimental or control groups.


1d. Does it rule out effects due to experimental subject expectations? This question raises a real difficulty for our experiment. We cannot hope to keep our subjects "blind" to the type of treatment they are receiving. And it seems possible that reports by subjects of their level of improvement may be tainted by their beliefs about conventional medical and chiropractic treatment. About the only thing we could do to control for this possibility would be to interview our potential subjects prior to the experiment and eliminate those who seem to have a strong bias one way or the other.

One additional factor must be considered in our thinking about this experiment and what various results might be taken to show. As we noted earlier, we have as yet no information about the percent of clients who claim conventional medical treatment is successful for lower back problems. Nor, however, do we know the percent of cases in which such problems improve with no treatment whatsoever! Yet such information would be crucial to the proper assessment of our results. Suppose, for example, we were to discover that chiropractic patients improve at a significantly higher level than do the patients of medical doctors. If the level of improvement for those who seek no treatment is near that of chiropractors, we would need to consider two possibilities: first, that chiropractic treatment is not a causal factor and, second, that medical doctors actually do more harm than good. Fortunately, our results should provide us with some interesting information on this crucial issue.


2. Prospective experiment. In a prospective experiment we begin with two groups, one composed of people with lower back problems who are seeking treatment by medical doctors; the other, our experimental group, will be made up of people with lower back problems who are being treated by chiropractors. As our experiment needs only 90 days to run its course, we might admit only people who have started treatment within, say, the last 10 days to ensure that both groups will be treated over roughly the same amount of time.


2a. Do you have a good sense, statistically speaking, of the level of effect required to indicate a causal link? We may be able to work with larger groups than in our randomized experiment, as we will only need to examine the records of existing patients rather than recruiting a group of potential subjects who fall within a narrow set of guidelines. By beginning with groups much larger than in our randomized experiment, we will be able to accept a much smaller difference in levels of the effect as evidence for a causal link. If, for example, we could work with groups of 500, a difference of only 8% or a little less would suggest that one kind of treatment is more effective than the other.
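How quickly the required difference shrinks as the groups grow can be sketched numerically. The sketch assumes the worst-case 95% margin of error for a proportion (z = 1.96, p = 0.5) and treats the required difference as the sum of the two groups' margins. The text does not state a formula, so this is an assumption, and the group of 1000 is added purely for comparison.

```python
import math

def required_difference(n, z=1.96):
    """Difference needed between two groups of size n (sum of two worst-case margins)."""
    return 2 * z * math.sqrt(0.25 / n)

for n in (100, 500, 1000):
    print(f"groups of {n:4d}: difference of about {required_difference(n):.1%}")

# The required difference falls with the square root of the group size,
# so five times as many subjects cuts it by a factor of about 2.24.
ratio = required_difference(100) / required_difference(500)
print(f"ratio between groups of 100 and groups of 500: {ratio:.2f}")
```

For groups of 500 this formula gives a figure between 8% and 9%. The 8% quoted above corresponds to rounding each group's margin of error down to ±4%.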


2b. Have you controlled for other factors that might affect the outcome of your experiment? Many people seek chiropractic care only after conventional medical treatment has failed. Such people may well have problems that are much more difficult to treat than the typical problems for which new back pain sufferers seek treatment. Hence if a large number of chiropractic patients fall into this category we would expect the success rate of chiropractors to be lower than that of medical doctors; a higher percentage of chiropractic patients will suffer from problems that have no quick and easy cure. We might control for this possibility by eliminating from both groups all subjects who have been treated for their back problem by a medical doctor within, say, the last year or so. Another factor that may contribute to the success rates of the two types of practitioners, however, would be difficult to control: Our subjects have chosen the kind of treatment they are undergoing and it seems reasonable to suppose that many of them think the kind of treatment they are undergoing is the most effective. Otherwise they would have selected the other kind. (There are, of course, other reasons why people select chiropractors over doctors and vice versa; one reason why many people select chiropractors, even as their primary physicians, is that chiropractors are typically much less expensive than medical doctors.) Perhaps we can do something about this problem by surveying our subjects and eliminating those with the most outspoken prejudices. A problem with this sort of hands-on treatment of subjects is that it becomes quite time-consuming and expensive when dealing with the large groups that prospective studies have the potential to deliver. Other factors that may affect the outcome of our experiment, such as weight, age, and exercise, can be controlled for by matching.


2c. Does your experimental design rule out the possibility of experimenter bias? The same precautions must be taken here as proposed for the randomized experiment discussed earlier. Our evaluators must be kept "blind" about whether subjects were members of the experimental or control group.


2d. Does it rule out effects due to experimental subject expectations? Our subjects have, in a sense, determined the group to which they belong, and their choice may well have been influenced by their beliefs about whether chiropractors are more effective than medical doctors. Thus we will want to make sure our subjects do not know the nature of the experiment when they are interviewed at the end of the 90-day test period. Otherwise their evaluation of their own condition may be influenced by their attitudes toward the type of treatment they are receiving.


3. Retrospective experiment. In a retrospective experiment, we look into the background of subjects who do and do not have the suspected effect. It may seem that the appropriate study here would be one in which we look for differences in type of treatment for subjects who have reported success after treatment. However, such a study does not meet the requirements for a retrospective experiment in that it involves nothing like a control group. So instead we might compare subjects who have reported improvement after treatment (the experimental group) with subjects who have reported no improvement after treatment (the control group). We can then look for differences in the percentages of people within the two groups who have been treated by chiropractors and medical doctors.


3a. Do you have a good sense, statistically speaking, of the level of effect required to indicate a causal link? In retrospective studies there is no way of gauging the level of effect because all subjects in the experimental group will have the effect in question while none in the control group will. We can, however, look for differences in the level of the suspected cause in the two groups. How we do so in this case is a bit tricky. Suppose, for example, we were to discover that among the experimental group 50% were treated by medical doctors, 30% by chiropractors, and 20% by other kinds of practitioners. It may at this point be tempting to conclude that medical doctors have a better success rate. Here lies the value of our control group. Suppose among the control group 70% were treated by medical doctors, 10% by chiropractors, and 20% by others. Suppose also that our two groups each number 1000. Of the 1200 people from the two groups treated by medical doctors (50% of the experimental group plus 70% of the control group), 500 or about 42% reported improvement; of the 400 treated by chiropractors, 300 or 75% reported improvement. This would suggest that chiropractors have a significantly higher success rate despite the fact that in our study the raw number of successful treatments for chiropractors is lower than that for medical doctors. Thus it is important to have some sort of control group in order to assess the significance of the results obtained in the experimental group.
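The arithmetic in this hypothetical example can be laid out as a short calculation. The percentages and the group size of 1000 are the supposed figures from the paragraph above, not real data. The variable and function names are ours.

```python
GROUP_SIZE = 1000  # supposed size of each group in the example

# Percent of each group treated by each kind of practitioner (supposed figures).
improved_pct = {"medical doctor": 50, "chiropractor": 30, "other": 20}      # experimental group
not_improved_pct = {"medical doctor": 70, "chiropractor": 10, "other": 20}  # control group

def success_rates(improved_pct, not_improved_pct, group_size):
    """Share of each practitioner's patients who fall in the improved group."""
    rates = {}
    for practitioner in improved_pct:
        improved = group_size * improved_pct[practitioner] // 100
        not_improved = group_size * not_improved_pct[practitioner] // 100
        rates[practitioner] = improved / (improved + not_improved)
    return rates

rates = success_rates(improved_pct, not_improved_pct, GROUP_SIZE)
for practitioner, rate in rates.items():
    print(f"{practitioner}: {rate:.0%} reported improvement")
```

These rates are best read relative to one another. Their absolute values depend on the two groups having been made equal in size, which is a choice of the study design rather than a fact about back patients in general.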


3b. Have you controlled for other factors that might affect the outcome of your experiment? We might attempt some backward matching. We might, for example, eliminate subjects who have a prior history of treatment if we find that more such subjects visited chiropractors. But such matching provides little additional evidence for any differences we might uncover, as they are adjustments made after the experimental data is in, not prior to the experiment.


3c. Does your experimental design rule out the possibility of experimenter bias? The likelihood of experimenter bias seems low in that the experimenters will not have a chance to evaluate individual cases or to determine membership in the experimental or control groups. Attempts at backward matching might be suspect.


3d. Does it rule out effects due to experimental subject expectations? Though experimental subject expectations cannot influence the outcome of this experiment, something very similar does come into play. The initial decision as to which group a given subject falls into will be completely determined by the subjects' own assessment of their amount of improvement. Moreover, such assessment requires that they compare their current status to their recollection of their condition 90 days or so ago. Such comparisons are liable to involve a lot of guesswork and estimation and to be influenced by the subjects' beliefs about the efficacy of the type of treatment they have undergone.