Category: Couples / Close Relationships
Observational coding is a widely used method to study couples’ communication behavior (e.g., Baucom et al., 2015). However, one of the main challenges associated with observational coding is obtaining high interrater reliability. High interrater reliability is necessary to ensure that differences in observed communication behaviors among couples reflect between-couple differences rather than differences among coders (Heyman, 2001). Low interrater reliability may increase the likelihood of Type II error by reducing power and introducing noise into the observed behavior ratings, which limits the ability to detect behavioral change over time (Baucom, Leo, Adamo, Georgiou, & Baucom, under review; Hallgren, 2012). Cronbach (1951) demonstrated that the internal consistency reliability of a measure can be increased by adding items that measure the same construct. Extending the same logic to observational coding, we hypothesized that increasing the number of coders would yield higher interrater reliability for observed aspects of relationship functioning.
Data were collected as part of a larger study of the effectiveness of an intervention program for low-SES couples transitioning to parenthood (Baucom et al., in press). Couples who participated in the larger study were over 18 years old, pregnant with their first child together, and living together. Fourteen couples participated in 10-min video-recorded conflict and parenting interactions at up to two time points (approximately 1 and 3 months after the birth of their child), resulting in a total of 44 interactions.
Couples’ relationship quality and communication behaviors (e.g., negative and positive reciprocity, woman demand/man withdraw and man demand/woman withdraw, mutual avoidance, and vulnerability/empathy) were coded using the Naïve Observational Rating System (NORS; Christensen, 2006), which requires minimal training of coders. A team of 15 undergraduate coders coded all 44 interactions, resulting in 660 sets of observational coding data. Data were analyzed by computing and averaging Cronbach’s alpha across all possible coding teams of size k, where k varied from 2 to 15. For example, Cronbach’s alpha for coding teams of 2 was the average across all 105 possible combinations of 2 coders drawn from the team of 15.
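The averaging procedure described above can be sketched as follows. Note that the ratings here are simulated stand-ins (a shared true score plus coder-specific noise), not the NORS data from the study; only the team size (15 coders) and number of interactions (44) follow the abstract.

```python
import itertools
import math
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha treating each coder (column) as an 'item'.
    scores: (n_interactions, n_coders) array of ratings."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical stand-in for the ratings: 44 interactions x 15 coders,
# simulated as a shared "true" rating plus independent coder noise.
rng = np.random.default_rng(0)
true_scores = rng.normal(5.0, 2.0, size=(44, 1))
ratings = true_scores + rng.normal(0.0, 2.0, size=(44, 15))

# Average alpha over all combinations of k coders, for k = 2..15.
mean_alpha = {}
for k in range(2, 16):
    combos = itertools.combinations(range(15), k)
    alphas = [cronbach_alpha(ratings[:, list(c)]) for c in combos]
    mean_alpha[k] = sum(alphas) / len(alphas)

print(f"k=2:  mean alpha = {mean_alpha[2]:.2f} "
      f"({math.comb(15, 2)} combinations)")
print(f"k=7:  mean alpha = {mean_alpha[7]:.2f} "
      f"({math.comb(15, 7)} combinations)")
```

As in the abstract, there are 105 possible teams of 2 and 6,435 possible teams of 7 drawn from 15 coders; averaging alpha across all teams of a given size removes the dependence of the estimate on which particular coders happen to be paired.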
Consistent with Cronbach’s theory, increasing the number of coders resulted in higher interrater reliabilities for relationship quality as well as all six communication behavior codes in every instance. For example, interrater reliability for positive reciprocity increased from α = .47 (2 coders, 105 combinations) to α = .79 (7 coders, 6,435 combinations). The results illustrate the utility of larger coding teams for increasing interrater reliability, which in turn helps ensure the power necessary to detect behavioral change over time. Although larger coding teams may seem more effortful, they are particularly well suited for use with systems that require minimal training of coders. We discuss the application of the Spearman-Brown prophecy formula to calculate the number of coders (k) needed to achieve a desired interrater reliability, using our results as an example, which helps optimize the design of observational assessment studies.
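A minimal sketch of the Spearman-Brown prophecy calculation, seeded with the positive-reciprocity figure reported above (α = .47 with 2 coders); the target reliability of .80 is an illustrative choice, not a value from the study:

```python
import math

def spearman_brown_k(alpha_obs, k_obs, alpha_target):
    """Number of coders needed to reach alpha_target, given an observed
    alpha from a team of k_obs coders (Spearman-Brown prophecy formula)."""
    # Single-coder reliability implied by the observed alpha:
    # alpha = k*rho / (1 + (k-1)*rho), solved for rho.
    rho1 = alpha_obs / (k_obs - (k_obs - 1) * alpha_obs)
    # Prophecy formula solved for the team size that yields alpha_target.
    return alpha_target * (1 - rho1) / (rho1 * (1 - alpha_target))

# How many coders would be needed to reach alpha = .80 for a code
# observed at alpha = .47 with 2 coders?
k_needed = spearman_brown_k(0.47, 2, 0.80)
print(f"{k_needed:.1f} coders, i.e., round up to {math.ceil(k_needed)}")
```

Under these inputs the formula implies roughly 9 coders (rounded up to 10), which is broadly consistent with the empirical jump from α = .47 (2 coders) to α = .79 (7 coders) reported above.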
Karena Leo – Graduate Student, University of Utah, Salt Lake City, Utah
Colin Adamo – Graduate Student, University of Utah, Salt Lake City, Utah
Panayiotis Georgiou – Assistant Professor, University of Southern California
Brian Baucom – Assistant Professor, University of Utah, Salt Lake City, Utah
Katherine Baucom – New York University