INTERNATIONAL JOURNAL OF COMPARATIVE AND APPLIED CRIMINAL JUSTICE

SPRING 1992, VOL. 16, NO. 1


Impediments to Cross-national Research:

Problems of Reliability and Validity


MARC G. GERTZ

Florida State University


LAURA B. MYERS

East Tennessee State University


Extrapolating from experiences in three European countries, the authors enumerate some of the practical problems in conducting criminal court research and gathering data cross-nationally. Within the context of the key concepts of validity and reliability, the impact of substantial research problems is discussed. The authors conclude that although compromises must be made, if care is taken, the compromises do not have to damage the ultimate worth of the research. The inherent value of cross-national research makes it incumbent upon criminal justice scholars to find ways to overcome unavoidable problems.


Introduction

Problems associated with reliability and validity in criminological research are a serious concern in large part due to the decentralized and complex nature of the criminal justice system. Reliability problems stem from data collection difficulties. For instance, one cannot assume the reliability of official data. Among other problems, the potential exists for clerical carelessness, ambiguous official interpretation, and varying jurisdictional provisions. Validity problems are inherent in the measurement of critical criminal justice concepts (Krahn, Hartnagel, and Gartrell, 1986; Eisenstein and Jacob, 1974; Gertz and Talarico, 1977). The validity of even key concepts, such as offense and sentence, is questionable. These terms appear clear before operationalization, but multiple measures may be plausible, all with some claim to legitimacy. Which is the more valid measure of offense, for example? Is it the convicted offense or the charged offense? This choice can affect the meaning of research results. The problems of reliability and validity increase when cross-national criminological studies are undertaken. Presently, the logical progression of criminological research is toward a more global perspective (Cole, Frankowski, and Gertz, 1987). To execute cross-national research correctly, not only must the usual considerations described above be acknowledged, but social science training also must alert scholars to language and cultural differences and how they further compound the challenge of conducting research in a reliable and valid manner.

The impact of these factors on measurement is of primary concern. The variation across nations can make the measurement of concepts questionable.

Ordinary sampling and observational problems associated with social science research also have to be considered in light of the complications arising from language barriers and cultural differences. An understanding of these differences and methods of addressing them must be developed if the validity and reliability of research findings are to be increased and if comparisons are to be made (Steiber, 1980). Otherwise, the information gained will be bereft of any meaning or, even worse, interpretations will be based on invalid data.

To illustrate the problems of validity and reliability inherent in comparative research, this paper includes examples from criminological cross-national studies and from current cross-national criminal court research. The discussion begins with validity and reliability problems produced by language and cultural differences. Discussion then proceeds to illustrate how these factors cause sampling and observational problems.


Impact of Language and Cultural Differences on Measurement

The Language Barrier

From a practical point of view, the language barrier is usually the first concern. When the English-speaking researcher approaches cross-national research, a command of the other country's language is of great benefit. Yet, multiple language abilities are necessary to carry out comparisons of more than one country, and, unfortunately, such advantages are rare for most Americans. Several alternatives are available, such as using interpreters, relying on the English abilities of the subject, or utilizing one's own fragmented foreign language to obtain data. This decision is based on constraints of ability, time, and money, especially when studying multiple countries. When attempting two-way translation, the researcher should be concerned with whether communication is actually taking place. Reliability is probably increased with the use of qualified interpreters.

It may also be the case that there are no English equivalents for foreign words or, at best, incomplete translation. Measurement instruments constructed in one country are not usually designed to handle differences in meaning among indicators tapped in another country (Junger-Tas, 1989). In West Germany, for example, the German term for rape can be translated as rapture of freedom. In one study (Gertz, 1990), interpretation was impeded because the foreign judge did not know the term rape. The phrase, rapture of freedom, had to be broken down to determine its meaning. In addition, the formal, polite nature of the Germans made discussion of sex crimes very difficult.

These problems are of direct concern to the construction of the codebook and the collection of data. Before preparing the codebook of needed variables, the researcher must confront all language and cultural problems if the variables are to have any meaning. Preliminary, exploratory work may be necessary before any actual research project can be started (Schummann, 1983). Blanck (1987) discusses the importance of field research in the study of the courts.

The complications induced by language and cultural differences in cross-national research make these preparatory steps even more necessary. The important concerns of any study include (1) theory development and the construction of the purpose of the study, (2) developing proper techniques for entry to the research setting and proper logistics, (3) assessment of ethical costs and benefits, (4) gathering the data and construction of methods for the particular setting, and (5) dealing with concerns after the fact (Blanck, 1987). These five considerations are not easily addressed in one's native country. Confronting these concerns in a foreign country, with limited time, money, and access could be very discouraging. Preliminary fieldwork can reduce the problems associated with unfamiliar language, new research settings, and limited access. Sending out letters in advance to find willing participants can save lost time and effort, and should begin to highlight language problems which might hamper the research process at some later point. Some situations may even dictate that new techniques of data gathering be used. Being flexible enough to try new data gathering techniques is very helpful when doing cross-cultural research (Van den Berghe, 1973).

Cultural Differences

While most language problems eventually can be remedied, cultural differences are more problematic. Different values and mores present a distinctive background from which crimes emerge. In many countries, there are crimes unique to that culture. In a detailed comparison of criminal statistics from the United States and the Federal Republic of Germany (Teske and Arnold, 1982), the uncommon crimes of homicide upon request of person killed, predatory extortion, and extraction of electrical power were discussed in the analysis. Unfortunately for the reader, these crimes were never defined. In a sentencing study of Germany, Holland, and Denmark (Gertz, forthcoming), the crime of feeding hormones to cows, a practice common in the United States, was the most serious crime sampled in the Netherlands. The gradations of certain crimes in the Netherlands were also distinct. The gradations of theft, for instance, included theft, attempted theft, theft by two or more united persons, theft with violence by two or more united persons, and theft with violence. The different values associated with theft in the United States reflect a concern for the amount stolen rather than the number of perpetrators and amount of violence. Understanding the country's values and incorporating that understanding into the collection, treatment, and interpretation of the data are essential to increase the reliability and validity of cross-cultural studies.

To lessen the problems, researching the history of the country, visiting the country, and/or interviewing natives can be of value. Each of these tools, however, has its own problems and consequently should be utilized with caution. First, historical explanations can sometimes be contradictory, lending further confusion to the inquiry. Second, while visiting a country, language barriers may compound the difficulties of asking the right questions and making the proper contacts. Lastly, interviewing a native must be conducted with the knowledge of what role this person plays in society. It is important to understand the potential bias behind a person's answers. A judge, for example, reflects the values, needs, and constraints associated with judicial activities. While a researcher may understand these values within the native country, such values would probably be very different in other countries.


Interpretation Problems

The problems inherent in cross-national research relate not only to conceptualization and measurement, which occur in the data collection phase, but also to the interpretation of findings, a validity issue. A comparative analysis of citizen crime reporting in such countries as Canada, the Netherlands, the United States, England, and Australia (Skogan, 1984) illustrates the problem of the equivalence of indicators (measures that mean essentially the same thing), but fails to discuss these differences. One must be careful not to fall into the trap of interpreting from one's own social background. Ethnocentrism is the inclination to judge the culture of others by one's own native standards (Hess, Markson, and Stein, 1982; Schaeffer, 1984). Great care should be taken to exercise cultural relativism, or objective analysis, when confronting a different culture. Interpretation has to be grounded in meanings derived from the indigenous country's culture and values (Hess, Markson, and Stein, 1982).

For example, a Netherlands criminal court study (Gertz and Myers, 1990) found that guilty pleas were very high. The first inclination was to assume plea bargaining was taking place due to the cooperative nature of the courtroom workgroup. Discussions with a native researcher divulged that most Netherlanders confess to their crimes because maximum sentences are very lenient, resulting primarily in fines, and few offenders ever receive the maximum. In addition, trials differ significantly from trials in the United States. Trials are administrative hearings used to determine the penalty. With this knowledge, assumptions about the behavior of American courtroom workgroups have little applicability.

Schummann (1983) illustrates another interpretation problem. He discusses Jasinski's (1976) comparative study of the relationship between crime rates and punitiveness in seventeen European countries. One of Jasinski's relationships showed that Poland was much more likely than the Netherlands to incarcerate offenders, by a ratio of 20:1. A critique by Graboski (1978) takes issue with this interpretation on the basis that the systems of social control in these two countries are not necessarily comparable, and that Poland is not necessarily more punitive than the Netherlands.

Schummann (1983) uses this argument in his message that each punishment system as a whole must be understood before such comparisons can be made. A high incarceration rate may indicate a lack of alternative sanctions. A low incarceration rate, as in the Netherlands, could result from the use of non-incarceration alternatives, such as fines. Such an interpretation is supported by [...] interpretations of findings, comprehension can be increased further, results validated, and reliability checked through a process of interaction with the country involved. Sending back univariates and percentages, along with interpretations, to those from whom the data were obtained can be helpful. In one study (Gertz, 1990), data obtained from courts in Germany, Denmark, and Holland were validated by sending this information back to the judges who had been visited to determine if interpretations and findings were consistent with what they knew. The process proved to be helpful in refining the project and led to greater understanding for the next endeavor.
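The univariates and percentages mentioned above are simple frequency distributions, one variable at a time. The following is a minimal sketch of how such a table might be prepared before it is returned to the judges for comment. The function name `univariate_table` and the coded sentence values are hypothetical illustrations and are not data from the studies cited.

```python
from collections import Counter

def univariate_table(values):
    """Return (category, count, percent) rows, most frequent category first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (category, n, round(100 * n / total, 1))
        for category, n in counts.most_common()
    ]

# Hypothetical coded sentences for one court (illustration only, not real data).
sentences = ["fine", "fine", "prison", "fine",
             "probation", "prison", "fine", "fine"]

for category, n, pct in univariate_table(sentences):
    print(f"{category:<10} {n:>3} {pct:>5.1f}%")
```

A table of this kind, accompanied by the researcher's interpretation, gives the respondent something concrete to confirm or dispute.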


Sampling and Observation Obstacles

Sampling

Proper sampling, a critical process, is often hampered by the realities of the subject under study. Choices may involve analyzing existing statistics or collecting primary data. When analyzing existing statistics, it may be difficult to determine what type of sampling has been utilized, if any. The use of secondary analysis is compounded even further by the language and cultural problems discussed earlier. Attempts must be made to understand the data from the viewpoint of the person who actually collected them. This means understanding the intent of the initial project, but more importantly, understanding the cultural background from which they were collected. Gertz and Myers' (1990) prosecutorial guideline analysis was undertaken on a secondary data set obtained from the Ministry of Justice in the Netherlands. The entire data set of over 20,000 criminal court cases in the Netherlands was recorded in Dutch. Fortunately, the Ministry of Justice had published several analyses in English from this data set which helped to explain the purpose of the initial data collection.

The primary collection of data is superior to the use of secondary data sources because the validity of measurements is enhanced by direct contact with the collection process (Fitzgerald and Cox, 1987). A problem which impedes primary data collection is the difficulty of using probability sampling. Learning how the units of analysis are arranged so the proper sampling procedure can be chosen is the first step. This may sound simple but, in fact, the units of analysis may not be readily accessible, e.g., they may be stored in distant localities. If the data have been computerized, then the process is less complicated. Some jurisdictions in Gertz's (1990) criminal court research project were computerized, but most were not. One country, in particular, had absolutely no computerized sentencing data at the individual level. This characteristic was usually associated with the academic climate of the area. The country was more likely to have computerized statistics if there were a university or research institute nearby.

Since many areas are not computerized, attempts must be made to explain complicated sampling procedures to people not trained in the social sciences and who are coping with interpretation problems. Undertaking proper sampling can be a very frustrating experience. In the criminal court research project (Gertz, 1990), judges were interviewed about their cases. The importance of random sampling and the problems of sampling bias were explained. With time and patience, comprehension was believed to have been achieved. In one German court, as the data were being collected, the cases began to appear highly sensational and often involved a celebrity. The judge made assurances that he was conducting the process correctly, but it was felt that the judge might be biasing the procedure to impress his American visitor with interesting cases. If this is believed to be a problem, the impact on the reliability of the data could be checked by contacting alternate sources in the follow-up procedure.
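One way to limit the selection problem described above is for the researcher, rather than the court, to decide which files are pulled. The sketch below is a minimal illustration, assuming that case files carry consecutive docket numbers. The function name `draw_case_sample`, the docket range, and the seed are hypothetical and do not describe the procedure used in the studies cited.

```python
import random

def draw_case_sample(first_docket, last_docket, sample_size, seed):
    """Draw a simple random sample of docket numbers without replacement.

    Assumes (hypothetically) that files are numbered consecutively from
    first_docket to last_docket. A fixed seed makes the draw reproducible,
    so a second researcher can check which files should have been pulled.
    """
    population = range(first_docket, last_docket + 1)
    if sample_size > len(population):
        raise ValueError("sample_size exceeds the number of case files")
    rng = random.Random(seed)
    return sorted(rng.sample(population, sample_size))

# Hypothetical example: 50 files drawn from dockets 1 through 1200.
sample = draw_case_sample(1, 1200, 50, seed=1992)
print(len(sample), "files selected; first five:", sample[:5])
```

Because every docket number has the same chance of selection, a list produced this way leaves no room for the respondent to favor sensational cases, and it can be handed to court staff without requiring them to understand the sampling logic.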

Other sampling concerns include sample size and sample selection. In cross-national research, the impediments to ensuring proper sample selection and sample size arise from interacting with a foreign country, the expense of travel, and general accessibility. Reliance on a sample of convenience may become necessary because of political concerns, social connections, or simply time and money constraints. When utilizing interviews with personnel, such as court judges, one cannot conceivably ask for a great deal of the person's time. They are busy people who do not have a lot of time for what they may consider clerical work. In addition, access to particular personnel can be a limitation. Whom one knows has a great deal to do with whom one works.

The inability to obtain optimum samples should not inhibit important cross-national work. The importance of such knowledge is too great to be ignored. The logical step forward in the solving of problems is the national ecological level (Cole, Frankowski, and Gertz, 1987). Therefore, in the pursuit of knowledge, acknowledgement of these problems must be made when findings are reported. Brown and Woolley (1985) discuss how small samples and samples of convenience can bias results. Small samples can incorrectly inflate the value of R-Square and the resulting beta values can be unstable. Small samples can lead to problems with generalizability. Understanding, accompanied by acknowledgement, of any problems in this area is mandatory if cross-national research is to continue properly. The best circumstance is one in which a grant allows optimum time, money, and access for the research project.
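The inflation of R-Square in small samples can be illustrated with a standard textbook result rather than with anything specific to the studies cited: in an ordinary least squares regression with an intercept and normally distributed errors, k predictors that are entirely unrelated to the outcome still yield an expected R-Square of k / (n - 1) in a sample of n cases. The sketch below, with hypothetical function names and sample sizes, shows how large that spurious value becomes when n is small, and how the conventional adjusted R-Square removes it.

```python
def expected_null_r2(n, k):
    """Expected R-Square when k predictors are unrelated to the outcome.

    Standard result for OLS with an intercept and normal errors: k / (n - 1).
    """
    if n <= k + 1:
        raise ValueError("need more cases than predictors plus one")
    return k / (n - 1)

def adjusted_r2(r2, n, k):
    """Conventional adjusted R-Square, which penalizes the predictor count."""
    if n <= k + 1:
        raise ValueError("need more cases than predictors plus one")
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical design: five predictors, samples of increasing size.
k = 5
for n in (15, 30, 100, 1000):
    spurious = expected_null_r2(n, k)
    print(f"n={n:>4}  expected R-Square from noise alone: {spurious:.3f}  "
          f"adjusted: {abs(adjusted_r2(spurious, n, k)):.3f}")
```

With fifteen cases and five predictors, more than a third of the variance appears to be explained by variables that explain nothing, which is why reporting sample size alongside any R-Square is essential when samples of convenience are unavoidable.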


Observation

In the development of observational plans, interviews almost always are necessary to obtain data because of the language problem, as well as time constraints. At best, going through individual case files is not easy. Much information could be lost if this process is attempted alone. Using the organizational actor directly involved is a measure intended to increase the reliability of the data obtained. This is a practical approach which, at the same time, meets the theoretical needs of a research project. Understanding the subject matter from the viewpoint of the organizational actor directly involved in the process under study is critical to proper model building (Hagan, 1982). Some attempts have been made to use self-administered questionnaires. The strengths of such an approach involve speed and efficiency, as well as less expense. The weaknesses involve the language problem of interpretation, a reliability issue. Are the questions worded properly? What do the results mean? This may require prior validation to be secure.


Conclusion

Cross-national research, by its nature, compounds the already critical issues of validity and reliability in the investigation process. With the growing importance of cross-national applications to dilemmas in the United States, the need for this type of research will continue to increase (Cole, Frankowski, and Gertz, 1987). If it is to be conducted properly, awareness of the complications induced by language barriers and cultural differences is a must. This article points out how such issues impede the ability to increase validity and reliability, and how they compound the difficult tasks of sampling and observation. The experiences of several cross-national researchers have been described in an attempt to help the future cross-national researcher address these problems. Utilizing the proper tools, cross-national research can be conducted, with proper caution, so that invalid and unreliable data will not render the cross-national experience useless.