Second, to guard against threats to reliability (Neuendorf 2002), we performed a pilot test on articles meeting the search parameters from other top journals. That is, the articles used in the pilot test (a) were not part of the data set generated in Phase 1, and (b) the data generated from the pilot test were not included in the final data analysis for this study. Researchers independently categorized the articles in the pilot test based on the best fit among the nine research strategies. After all articles in the pilot test were categorized, the researchers compared their analyses. In instances where the independent categorizations did not match, the researchers re-evaluated the article collaboratively by reviewing the research strategy definitions, discussing the disagreement thoroughly, and assigning the article to a single category. This process allowed the researchers to develop a shared interpretation of the research strategy definitions. Simply stated, this pilot test served as a training session for accurately categorizing the articles for this study with respect to research strategy.
Each research strategy is defined by a specific design approach, and each carries tradeoffs that researchers must make when designing a study. These tradeoffs are inherent limitations that constrain the conclusions that can be drawn from a particular research strategy, and they concern three aspects of a study that vary with the strategy employed: generalizability from the sample to the target population (external validity); precision in measurement and control of behavioural variables (internal and construct validity); and realism of context (Scandura and Williams 2000).
Cook and Campbell (1976) stated that a study has generalizability when it has external validity across times, settings, and individuals. Formal theory/literature reviews and sample surveys have a high degree of generalizability because they establish the relationship between two constructs and illustrate that this relationship holds externally. A research strategy with low external validity but high internal validity is the laboratory experiment: measurement precision is high and cause-and-effect relationships may be determined, but those relationships may not generalize to other times, settings, and populations. While formal theory/literature reviews and sample surveys offer a high degree of generalizability, and the laboratory experiment offers a high degree of measurement precision, all of these strategies have a low degree of contextual realism. The only two strategies that maximize contextual realism are field studies using either primary or secondary data, because the data are collected in an organizational setting (Scandura and Williams 2000).
The other four strategies maximize neither generalizability, nor degree of precision in measurement, nor degree of contextual realism. This point illustrates the futility of using only one strategy when conducting Internet marketing research. Because no single strategy can maximize all types of validity, it is best for researchers to use a variety of research strategies. Table 2 contains an overview of the nine strategies and their ranking on the three strategy tradeoffs (Scandura and Williams 2000).
Two coders independently reviewed and classified each article according to research strategy. Only a few articles were reviewed at one sitting to minimize coder fatigue and thus protect intercoder reliability (Neuendorf 2002). Upon completion of the independent classification, agreements and disagreements were tabulated: intercoder crude agreement (percent agreement) was 91.8 percent, and intercoder reliability, calculated using Cohen's kappa (Cohen 1960), was k = 0.847. Both values are well within the acceptable ranges for intercoder crude agreement and intercoder reliability (Neuendorf 2002). As mandated by Weber (1990), the reliability measures were calculated prior to discussing disagreements. If the original reviewers did not agree on how a particular article was coded, an additional reviewer arbitrated the discussion of how the disputed article should be coded. This process resolved the disputes in all cases.