Multi-Site Collaborations Provide Robust Tests of Theories
According to Popper (1959), "We can say of a theory, provided it is falsifiable, that it rules out, or prohibits, not merely one occurrence, but always at least one event" (p. 70). I argue that, all else being equal, multi-site collaborations test theories more robustly than studies done at a single site at a single time by a single researcher, because the data from a multi-site collaboration more robustly represent the theoretically falsifying event.
Let's break down the key concepts of this argument.
What is a multi-site collaboration?
A multi-site collaboration is a study that involves a team of researchers at several locations who each test the same hypothesis. Often these collaborations use the same data collection procedures and the same stimuli. Their individual results are then pooled together, often in a meta-analysis, regardless of the results from any of the individual labs.*
Thus, the features necessary to test the hypotheses are the same across all labs. But there are inevitably some lab-to-lab differences in the specifics of the samples, the physical setting of the lab, the precise time the data are collected, etc.
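To make the pooling step concrete, here is a minimal sketch of one common way to combine per-lab results: fixed-effect, inverse-variance weighting. This is only an illustration under stated assumptions, not the procedure of any particular collaboration. The function name and the per-lab numbers are hypothetical and made up for the example.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance (fixed-effect) pooling of per-lab effect estimates.

    Every lab is included no matter how its own result came out.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical per-lab effect estimates and standard errors
# (made-up numbers, for illustration only).
lab_estimates = [0.30, -0.10, 0.20]
lab_std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = pool_fixed_effect(lab_estimates, lab_std_errors)
print(f"pooled estimate = {pooled:.3f}, standard error = {pooled_se:.3f}")
```

Note that the lab with the negative estimate stays in the pooled analysis. A random-effects model, which allows the effect itself to vary from lab to lab, is a frequent alternative.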
Good exemplars of multi-site collaborations are the ManyLabs projects and Registered Replication Reports.
Occurrences vs. Events
The next key concept is the distinction between occurrences and events. In the first sentence I said that a scientific theory must forbid at least one event. Popper considered a specific instance of a researcher deducing a hypothesis, operationalizing the theoretically-necessary features, and making an observation to be an occurrence. Each occurrence includes the features of a study that are deduced from the theory. And each occurrence takes place in the presence of a unique and idiosyncratic combination of other factors such as the specific time and specific location of a study. An event, on the other hand, represents the class of all possible occurrences that are equally deducible from the theory (an event = {occurrence_1, occurrence_2, occurrence_3, ..., occurrence_k}).
Thus, occurrences are confounded with the idiosyncratic combination of other factors at a specific time and specific location, whereas events transcend those factors. Events represent only what can be logically deduced from a theory; occurrences also contain the infinite other factors that are inevitably present when an event is instantiated. Thus, the more robustly we can create events, the more robustly we can test our theories.
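The distinction can be written compactly. The symbols here are assumed shorthand for illustration, not Popper's notation: T stands for the features deduced from the theory, z_i for the idiosyncratic factors present in the i-th occurrence, o_i for that occurrence, and E for the event.

```latex
o_i = (T, z_i), \qquad
E = \{\, o_1, o_2, \ldots, o_k \,\}, \qquad
z_i \neq z_j \ \text{for } i \neq j
```

Every occurrence shares the same T and differs only in its z_i, which is why no single o_i can stand in for E.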
An example
Suppose I have a theory that "listening to a song with violent lyrics increases the accessibility of aggressive cognitions". This is a legitimate scientific theory because it allows you to deduce which events are consistent with the theory and which events are inconsistent with the theory. Namely, those who listen to songs with violent lyrics should have an increase in aggressive thoughts and should not have a similar level or a decrease in aggressive thoughts.
Suppose Researcher A conducts a study. This study will include the necessary features to test a hypothesis that was deduced from a theory. For example, Researcher A may hypothesize that listening to Johnny Cash's Folsom Prison Blues (a song with violent lyrics) would cause participants to complete more word stems (e.g., KI _ _) with aggressive words (e.g., KILL) than non-aggressive words (e.g., KISS; a measure of the accessibility of aggressive cognitions). The results from this study would be an occurrence. In addition to the deduced theoretically-necessary features to test a hypothesis, this single occurrence is confounded with an idiosyncratic combination of theoretically-irrelevant factors. For example, the observations in this single study occur in the presence of participants' interaction with the experimenter, what the 3rd participant ate for breakfast yesterday, the ambient temperature of the room, the position of the stars when the last participant completed the study, etc., etc., etc.
Now suppose Researcher B also conducts a study. This researcher also deduces the features that would be theoretically necessary to test the hypothesis. Suppose this researcher follows Researcher A's approach and uses Johnny Cash's Folsom Prison Blues as the song with violent lyrics and also uses the word-fragment completion task as the measure of aggressive thoughts. The results from this study also would be an occurrence. Thus, this study includes the features of the study that were deduced from a theory and occurs in the presence of an idiosyncratic combination of theoretically-irrelevant variables. Further, the idiosyncratic combination of theoretically-irrelevant variables is different for Researcher A and Researcher B. That is, the observations made by Researcher B will likely occur in the presence of different interactions with the experimenter, a different breakfast by the 3rd participant, a different ambient temperature of the room, a different position of the stars when the last participant completed the study, etc., etc., etc.
Because the combination of theoretically-irrelevant factors differs for each occurrence, the occurrence made by Researcher A will not be equivalent to the occurrence made by Researcher B in all possible ways. This non-equivalence is what people often refer to when they say "there is no such thing as an exact replication": Two studies always differ in some aspects (such people often point to the inarguable presence of differences between occurrences and imply those occurrences do not belong to the same event class). However, and critically, each of the occurrences in this example is equally deducible from the theory. So each of these occurrences belongs to the same event class, which means they are equally useful for potentially falsifying the theory.
In fact, because a single occurrence is confounded by the combination of theoretically-relevant and theoretically-irrelevant factors that are present when a single observation is made, any individual occurrence is ambiguous: Was the observation due to the theoretically-necessary variables? Or was the observation due to a freaky alignment of other factors that will never be recreated?
With a single study at a single site, we can assume that an occurrence was due to the theoretically-necessary variables and we can assume that it was not due to a freaky alignment of other factors. It is up to individuals as to whether or not they want to accept those assumptions. To empirically test whether an event is consistent or inconsistent with a theory, we need observations from several occurrences. That is, we need several observations that maintain the deduced theoretically-necessary features, but differ in the theoretically-irrelevant features that confound each individual observation, in order to disentangle the former from the latter.
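A small simulation can illustrate this disentangling. It is a sketch under strong assumptions that are not claims from the argument itself: each occurrence is modeled as a fixed theoretical effect plus an independent, zero-mean, normally distributed site-specific quirk. The function names and parameter values are hypothetical.

```python
import random
import statistics


def simulate_occurrence(true_effect, quirk_sd, rng):
    """One occurrence: the theoretically deduced effect plus a site-specific quirk."""
    return true_effect + rng.gauss(0.0, quirk_sd)


def simulate_collaboration(true_effect, quirk_sd, n_sites, rng):
    """Average of the occurrences from n_sites labs following the same protocol."""
    results = [simulate_occurrence(true_effect, quirk_sd, rng) for _ in range(n_sites)]
    return statistics.mean(results)


rng = random.Random(1)  # fixed seed so the sketch is reproducible
TRUE_EFFECT = 0.2       # hypothetical effect of the theoretically-necessary features
QUIRK_SD = 0.3          # hypothetical spread of the theoretically-irrelevant factors

# Repeat each design many times to see how much its result varies.
single_site = [simulate_occurrence(TRUE_EFFECT, QUIRK_SD, rng) for _ in range(2000)]
ten_sites = [simulate_collaboration(TRUE_EFFECT, QUIRK_SD, 10, rng) for _ in range(2000)]

spread_single = statistics.stdev(single_site)
spread_multi = statistics.stdev(ten_sites)
print(f"spread of single-site results: {spread_single:.3f}")
print(f"spread of 10-site averages:    {spread_multi:.3f}")
```

Under these assumptions the ten-site averages cluster more tightly around the theoretical effect, because the quirks differ across sites and partly cancel while the shared features do not. If the quirks were correlated across sites, or did not average to zero, the benefit would be smaller.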
Putting it all together
Let's go back to our example. The observation made by Researcher A is an occurrence. The observation made by Researcher B is an occurrence. Because these occurrences were equally deducible from the theory, these occurrences belong to the same event. It is necessary to observe several occurrences to disentangle the effects due to the theoretically-deduced factors from the theoretically-irrelevant factors.
Multi-site collaborations involve several researchers who each make observations across a range of occurrences. That is, multi-site collaborations involve observations being made across a range of idiosyncratic combinations of theoretically-irrelevant factors. Collectively, these individual occurrences better approximate the event that is used to test a theory than any individual occurrence does. Thus, all else being equal, multi-site collaborations provide more robust tests of our theories than a single study done at a single location at a single time.
I argue that Researcher A and Researcher B should agree on what study is logically deduced from their theory, each collect data following the same agreed-upon protocol (i.e., each make an occurrence within the same event class), combine their data into a common analysis regardless of how their individual data come out, and plan to grab a beer together at the next conference.**
For this, and for many other reasons, I hope that multi-site collaborations become commonplace in psychological science.
Where to begin?
Do you want to get involved in some multi-site collaborations? Here are some places to begin.
StudySwap: an online platform where you can find other collaborators.
The Psychological Science Accelerator: a network of labs that have committed to devoting some of their research resources to multi-site collaborations.
Registered Replication Reports: multi-site collaborations of replications of previously-published research.
*I believe the lack of inclusion bias in the meta-analyses from multi-site collaborations is probably the greatest methodological strength of these studies. However, this post is focusing on a different benefit of multi-site collaborations.
**This last part is a crucial feature of a successful multi-site collaboration.