Standardized materials are a key element in science and essential for achieving reliable, replicable, and internally valid findings. Whereas other scientific disciplines have been using standardized materials for a long time, psychology is still lagging behind (Lang & Bradley, 2007). To equip psychologists with standardized stimulus sets, several research groups have assembled databases of stimuli, such as the International Affective Picture System (IAPS; Lang et al., 2008) or the Open Affective Standardized Image Set (OASIS; Kurdi et al., 2017). As an example, the OASIS consists of 900 license-free photographs with standardized information on their valence and arousal. Based on this information, the pictures are used as stimuli in research across disciplines, ranging from emotion research through psychopathology and neuroscience (Lang & Bradley, 2007) to social and cognitive psychology. Typical research paradigms that use valenced pictures include priming (Herring et al., 2013), attitude formation (e.g., Vogel et al., 2019), and attitude measurement (e.g., Kurdi & Banaji, 2019).
Despite clear norms for the valence of these pictures, a closer look also reveals substantial heterogeneity in the pictures’ evaluations. As an example, picture I185 (showing a couple sitting on a bench) in the OASIS database yields a mean rating of 4.01, a seemingly neutral evaluation on the 1–7 scale. However, the standard deviation of the picture’s valence ratings is 1.41. Assuming a normal distribution of the valence ratings, approximately 32% of a random participant sample would evaluate it as negative or positive (below 2.6 or above 5.4 on the 1–7 scale). In the OASIS database, the average standard deviation of a picture’s valence rating is 1.10, indicating substantial heterogeneity in the evaluation of most pictures. Similar heterogeneity can also be found in other standardized sets (e.g., Lang et al., 2008). Where might this heterogeneity come from?
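The 32% figure can be checked with a short calculation. The following is a minimal sketch that takes the mean and standard deviation quoted above and, as in the text, assumes normally distributed ratings; because the 1–7 scale is bounded and discrete, the result is only an approximation.

```python
from statistics import NormalDist

# Values quoted in the text for OASIS picture I185
mean_rating = 4.01
sd_rating = 1.41

# Assumption (as in the text): ratings follow a normal distribution.
ratings = NormalDist(mu=mean_rating, sigma=sd_rating)

# Cut-offs one standard deviation below and above the mean
lower = mean_rating - sd_rating
upper = mean_rating + sd_rating

# Share of raters expected below the lower or above the upper cut-off
share_outside = ratings.cdf(lower) + (1 - ratings.cdf(upper))

print(f"cut-offs: {lower:.2f} and {upper:.2f}")
print(f"share outside one SD: {share_outside:.1%}")
```

Under these assumptions, the cut-offs round to 2.6 and 5.4, and slightly less than a third of raters fall outside them.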
Interindividual Differences in Picture Evaluations
Evaluating pictures is a matter of complex appraisals and therefore prone to be influenced by many factors. Certainly, measurement error from single-item ratings used for most stimulus sets contributes to the large standard deviations. Even more systematic problems could be occurring with the response scale, as neutral ratings often represent ambivalent and not neutral attitudes (Schneider et al., 2016). However, in addition to situational influences, heterogeneity in picture evaluations could also reflect stable interindividual differences in psychological constructs. Previous research has already shown differences in valenced picture evaluations associated with sociodemographic variables such as age (Grühn & Scheibe, 2008) or gender (Lang & Bradley, 2007). Hence, differences associated with personality are also likely.
Notably, personality differences in the pictures’ evaluations would have crucial implications for interpreting previous results obtained in different research paradigms using them. That is, variance in these paradigms might actually be explained by interindividual differences in picture evaluations rather than interindividual differences in the processes under investigation. For example, consider interindividual differences in evaluative conditioning, which refers to a change in stimulus liking because of its paired presentation with a positive/negative stimulus (Hofmann et al., 2010). Vogel et al. (2019) presented participants with conditioned stimuli (e.g., faces) together with either positive or negative pictures from a standardized set. They also assessed the Big Five personality traits (John & Srivastava, 1999) and found that evaluative conditioning was stronger for people high in Neuroticism or Agreeableness. While this might indicate that a person high in Neuroticism or Agreeableness is more likely to form associations between a neutral and a positive/negative stimulus, alternatively, it could mean that the pictures evoked more intense evaluations. This example makes clear that interindividual differences in picture evaluations are important to consider beyond personality research as they could change the interpretation of many findings.
To shed more light on interindividual differences in picture evaluations from standardized sets, we set out to study them in relation to the most prominent and accepted taxonomy of personality traits, the Big Five (John & Srivastava, 1999). In the next section, we introduce the Big Five and propose how they should relate to evaluations of valenced pictures.
Picture Evaluation and Big Five Personality Traits
In classic definitions, personality is the “coherent patterning of affect, behavior, cognition, and desires” (Revelle & Scherer, 2009, p. 304). Thus, many personality traits are associated with or even defined by the differential experience of positive/negative stimuli (Augustine & Larsen, 2015). Arguably, this is particularly true for the Big Five. The Big Five include the dimensions Openness (to experience), Conscientiousness, Extraversion, Agreeableness, and Neuroticism. Out of these traits, Neuroticism and Extraversion are theoretically and empirically the most promising for this research question:
As noted by Costa and McCrae (1980, p. 673), “Extraversion […] predisposes individuals toward positive affect, whereas Neuroticism […] predisposes individuals toward negative affect”. This view on Neuroticism and Extraversion is also reflected in classic personality theories (H. J. Eysenck & Eysenck, 1985; Gray, 1981) and supported by a plethora of empirical findings (Augustine & Larsen, 2015). However, it is less clear how this predisposition translates into behavior when evaluating pictures.
From an affect-level view, one should generally expect more positive affect for people high in Extraversion and more negative affect for people high in Neuroticism (Howell & Rodzon, 2011; Lucas & Baird, 2004). This perspective has received some empirical support (e.g., Gross et al., 1998; Howell & Rodzon, 2011; Lucas & Baird, 2004) and would predict a negative association of Neuroticism and a positive association of Extraversion with picture evaluations, irrespective of the pictures’ valence.
From an affect-reactivity view, one would expect stronger reactivity of highly extraverted individuals to positive stimuli and highly neurotic individuals to negative stimuli. This perspective has also received empirical support (e.g., Canli et al., 2001; Gross et al., 1998; Larsen & Ketelaar, 1991; Rusting & Larsen, 1997; Smillie et al., 2012) and predicts more positive evaluations exclusively of positive pictures for individuals with higher Extraversion and more negative evaluations exclusively of negative pictures for individuals with higher Neuroticism. As both views agree on the latter associations, we conservatively expect:
H1: Higher levels of Neuroticism are associated with more negative evaluations of negative pictures.
H2: Higher levels of Extraversion are associated with more positive evaluations of positive pictures.
Regarding Agreeableness, there is less direct evidence of how it should be associated with picture evaluations. Yet, a vast amount of research has shown an overlap between disagreeableness and psychopathy (Decuyper et al., 2009; Stead & Fekken, 2014). People with high psychopathy show deviant reactions to emotional stimuli (Hoff et al., 2009; Kiehl et al., 2001). Correspondingly, Czerwon et al. (2011) found stronger valence judgments for both positive and negative faces for people with higher Agreeableness. Also, Vogel et al. (2019) found stronger evaluative conditioning effects for people with higher Agreeableness. In addition, with increasing levels of Agreeableness, people show stronger approach reactions towards positive pictures and stronger avoidance reactions towards negative pictures (Bresin & Robinson, 2015; Finley et al., 2017). Thus, we expect:
H3: Higher levels of Agreeableness are associated with more positive evaluations of positive pictures.
H4: Higher levels of Agreeableness are associated with more negative evaluations of negative pictures.
Beyond positive and negative pictures, personality might also determine whether a neutral picture is seen as rather positive or rather negative. Thus, some of the previously mentioned relationships could also be present for neutral pictures. Indeed, neutral ratings in picture evaluations often reflect mixed responses towards ambivalent pictures (Schneider et al., 2016). Neutral pictures could represent “weak situations” in which associations with personality are the strongest. People high in Neuroticism, for example, are more likely to interpret even ordinary situations as threatening (e.g., Lommen et al., 2010). However, the theoretical basis for directional hypotheses on neutral pictures is much weaker than for positive or negative pictures. Therefore, we refrain from formulating explicit hypotheses here.
Lastly, for the remaining traits of Conscientiousness and Openness, there is considerably less theoretical or empirical background than for the other three traits to make assumptions about how they might be associated with picture evaluations (Augustine & Larsen, 2015). Thus, we want to examine the association of the two traits with picture evaluations in an exploratory manner.
Despite the relevance of the issue and a plethora of previous research on personality and affect in general (Augustine & Larsen, 2015), empirical evidence for the association of the Big Five and picture evaluations in standardized sets is scarce. Of the few studies that have touched on the question, one did not include neutral pictures and used a statistical model not suited to answer our research question (Tok et al., 2010). Another study did not employ the Big Five but Impulsiveness and Anxiety (Aluja et al., 2015) and thus offers limited insight for our purposes. More recently, Levine and colleagues (2020) investigated interindividual differences in how participants cluster pictures from the IAPS. Since this study did not assess any valence ratings, it is not directly applicable, but their results do suggest substantial interindividual differences in how participants cluster pictures depending on the Big Five. Thus, previous research attests to the importance of our research question but also shows that a (more) systematic investigation is necessary to answer it.
Overview of the Present Research
In this research, we examine to what extent picture evaluations from standardized sets are associated with the Big Five. For that purpose, we administer the BFI-2 (Danner et al., 2019; Soto & John, 2017) and let participants evaluate pictures of different normed valence in the OASIS (Kurdi et al., 2017). As population estimates for correlations with personality traits require large and also heterogeneous samples (Schönbrodt & Perugini, 2013), we collected data from 936 German-speaking and English-speaking participants of different ages, gender, and education.
We report how we determined our sample size, all data exclusions, all manipulations, and all measures in the study.
Design and Participants
In a single-factor design, normed valence of the pictures (positive vs. neutral vs. negative) varied within participants. The Big Five served as continuous covariates. To determine the sample size, we conducted an a priori power analysis with G*Power (Faul et al., 2007). As a rough approximation for our design and analysis, we took the mixed ANOVA design with two groups and three repeated measures (α = .05, 1-β = .9). Our goal was to detect a small effect size of f = .1 for the between-within interaction and the between-participants main effect, which resulted in minimum sample sizes of N = 214 and N = 704, respectively. We thus aimed at a minimum sample size of N = 800. To achieve the necessary power and trait heterogeneity, 936 participants were recruited via the Respondi panel. We considered only finished interviews. English-speaking participants (53.95%) came from the UK and German-speaking participants from Germany, Austria, and Switzerland. Participants were compensated according to the panel’s incentive system (min. 1€) for a 20-minute study that consisted of multiple independent tasks. Detailed descriptive statistics of our sample are displayed in Table 1. Overall, our sample was very heterogeneous regarding gender, age, and education.
|Mage (SD, Min, Max)||53.3 (14.82, 18, 84)||49.49 (15.28, 18, 88)|
|No formal education||8||0|
Note. Education levels of English- and German-speaking participants do not correspond exactly due to differences in the education systems. If not indicated otherwise, numbers represent frequencies.
Procedure and Materials
After providing informed consent, participants first filled out the personality questionnaire. Next, 30 pictures (10 per valence level) were drawn randomly via the PHP shuffle function from our stimulus pool and presented to the participants, who rated them on valence. After this evaluation task, participants proceeded with other tasks unrelated to this research question, provided demographic information, and were thanked and debriefed about the study’s purpose. In line with our university’s ethics committee guidelines, the study did not require specific approval. We received approval from our university’s data protection office.
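The random draw of 10 pictures per valence level can be illustrated as follows. This is a minimal sketch in Python rather than the PHP shuffle function used in the study; the picture identifiers, the function name `draw_pictures`, and the mixing of the presentation order across valence levels are assumptions made for illustration only.

```python
import random

def draw_pictures(pool, per_level=10, rng=None):
    """Draw `per_level` pictures from each valence level of `pool`.

    `pool` maps a valence level to a list of picture identifiers.
    Returns a list of (level, picture) tuples.
    """
    rng = rng or random.Random()
    drawn = []
    for level, pictures in pool.items():
        shuffled = list(pictures)      # copy, so the pool itself is untouched
        rng.shuffle(shuffled)          # random order within the valence level
        drawn.extend((level, pic) for pic in shuffled[:per_level])
    # Assumption: the presentation order is mixed across valence levels.
    rng.shuffle(drawn)
    return drawn

# Hypothetical stimulus pool with 30 placeholder identifiers per level
pool = {
    level: [f"{level}_{i:02d}" for i in range(1, 31)]
    for level in ("positive", "neutral", "negative")
}

selection = draw_pictures(pool, per_level=10, rng=random.Random(2024))
print(len(selection), "pictures drawn")
```

Passing a seeded `random.Random` instance makes a draw reproducible, which can be convenient when testing a study script.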
Big Five Measures
For measuring the Big Five, we used the BFI-2 with 60 items (German: Danner et al., 2019; English: Soto & John, 2017). Descriptive statistics, Cronbach’s alphas, and intercorrelations are provided in Table 2.
Note. For all items, the scale ranged from 1 to 5. The first value corresponds to the English-speaking participants, whereas the second value corresponds to the German-speaking participants. N = Neuroticism; E = Extraversion; A = Agreeableness; C = Conscientiousness; O = Openness. All correlations are significant at p < .001, except for the correlation of Neuroticism and Openness in the English-speaking sample. According to five exploratory t-tests, German- and English-speaking participants did not differ significantly in mean trait levels (all p’s > .069). Following the preregistration, we nevertheless standardized the Big Five within each language for all following analyses to avoid any potential confound by language. The intercorrelations and internal consistencies were similar to those reported in the original publications (Danner et al., 2019; Soto & John, 2017).
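Standardizing within each language means that every trait score is expressed relative to the mean and standard deviation of the participant's own language group. The following is a minimal sketch with made-up scores; the function name and the use of the sample standard deviation (n - 1 in the denominator) are assumptions, as the source does not specify these details.

```python
from collections import defaultdict
from statistics import mean, stdev

def standardize_within(scores, groups):
    """z-standardize `scores` separately within each group in `groups`."""
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)
    # Mean and sample SD per group (assumption: n - 1 denominator)
    stats = {g: (mean(v), stdev(v)) for g, v in by_group.items()}
    return [(s - stats[g][0]) / stats[g][1] for s, g in zip(scores, groups)]

# Hypothetical trait scores on the 1-5 scale for two language groups
scores = [3.0, 4.0, 5.0, 2.0, 2.5, 3.0]
languages = ["en", "en", "en", "de", "de", "de"]

z_scores = standardize_within(scores, languages)
print(z_scores)
```

After this step, each language group has a mean of 0 and a standard deviation of 1 on the trait, so mean differences between languages cannot carry over into the trait effects.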
As variation of our experimental factor ‘normed valence’, we selected 3 × 30 pictures from the OASIS (Kurdi et al., 2017) as stimuli. We used the OASIS because the pictures are current and have high quality. Furthermore, the pictures are license-free and thus usable in online research. Thirty pictures with normed valence ratings higher than +1 SD above the mean of all OASIS pictures were chosen as positive stimuli (OASIS valence rating > 5.56 on the scale of 1–7), thirty pictures with normed valence ratings between -0.33 SD (3.99) and +0.33 SD (4.71) as neutral stimuli, and thirty pictures with normed valence ratings below -1 SD (2.86) as negative stimuli. Arousal was kept constant at medium levels, with normed arousal ratings between -1 SD (2.86) and +1 SD (4.50). Hence, pictures obviously differed in their valence ratings reported in the manual, F(2, 87) = 1559.17, p < .001, η² = .97, 95% CI [.96, .98], Mneg = 2.40, Mneu = 4.42, Mpos = 5.88, but not in their arousal ratings, F(2, 87) = 1.76, p = .178, η² = .04, 95% CI [.00, .13]. Also, the sets had similar valence standard deviations reported in the manual, F(2, 87) = 1.19, p = .308, η² = .03, 95% CI [.00, .11], and the mean valence ratings within a set were similarly heterogeneous according to a Levene test, F(2, 87) = 2.13, p = .125. Each valence set consisted of ten images depicting scenes, ten depicting persons, five depicting objects, and five depicting animals. A list of the used stimuli is provided in the Supplementary Materials. We did not use pictures that depicted extreme violence or nudity.
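The eligibility criteria described above can be expressed as a simple rule. The sketch below uses the cut-off values quoted in the text; the function name, the example ratings, and the treatment of the range boundaries as inclusive are assumptions. It covers only the valence and arousal criteria, not the balancing of picture content or the exclusion of extreme violence and nudity.

```python
# Cut-offs quoted in the text (OASIS norms, 1-7 scale)
POSITIVE_MIN = 5.56            # positive: valence above +1 SD
NEUTRAL_RANGE = (3.99, 4.71)   # neutral: valence within +/- 0.33 SD
NEGATIVE_MAX = 2.86            # negative: valence below -1 SD
AROUSAL_RANGE = (2.86, 4.50)   # medium arousal: within +/- 1 SD

def valence_set(valence, arousal):
    """Return the valence set a picture is eligible for, or None."""
    # Assumption: the arousal and neutral ranges include their boundaries.
    if not AROUSAL_RANGE[0] <= arousal <= AROUSAL_RANGE[1]:
        return None
    if valence > POSITIVE_MIN:
        return "positive"
    if valence < NEGATIVE_MAX:
        return "negative"
    if NEUTRAL_RANGE[0] <= valence <= NEUTRAL_RANGE[1]:
        return "neutral"
    return None  # falls into a gap between the three sets

# Hypothetical normed ratings (valence, arousal)
for valence, arousal in [(5.88, 3.5), (4.42, 3.5), (2.40, 3.5), (5.00, 3.5)]:
    print(valence, arousal, valence_set(valence, arousal))
```

Pictures with valence ratings in the gaps between the three bands (for example, 5.00) are not eligible for any set, which keeps the valence levels clearly separated.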
For the evaluation task, we used the same instructions and rating scale format (a seven-point scale labeled at each point) as in the OASIS (Kurdi et al., 2017). For the German-speaking participants, all instructions and materials were first translated with the online tool DeepL, and the translations were then slightly modified by two German native speakers who are proficient in English (C2 level; instructions are provided in the Supplementary Materials). As in the original OASIS norming study, each picture was presented on a single slide. Below the picture, the heading “Valence” was presented together with the labeled scale (very negative to very positive). For the 90 pictures, the aggregated valence ratings in our study almost perfectly correlated with those reported in the OASIS manual, r(88) = .98, p < .001, also when using only the German-speaking participants.
Following the preregistration protocol, we excluded 35 participants who provided the same answer for over 25 pictures in the evaluation task, or the same answer for more than 50 items of the BFI-2, leading to a final sample of 901 participants. This exclusion criterion was chosen to exclude participants who answered redundantly (e.g., giving the same response to pass the study as fast as possible). In an exploratory manner, we also repeated the analyses without any exclusions. These analyses yielded nearly identical results and are thus only provided as an HTML document in the Supplementary Materials.
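The exclusion rule can be sketched as follows. The function names and the example data are hypothetical, and the sketch assumes that "the same answer" refers to the most frequent response value overall rather than to consecutive identical responses.

```python
from collections import Counter

def max_identical(responses):
    """Number of times the most frequent response value occurs."""
    return max(Counter(responses).values()) if responses else 0

def is_excluded(picture_ratings, bfi_responses):
    """Apply the rule from the text: same answer for over 25 of the
    30 pictures, or for more than 50 of the 60 BFI-2 items."""
    return max_identical(picture_ratings) > 25 or max_identical(bfi_responses) > 50

# Hypothetical participants
varied_bfi = [1, 2, 3, 4, 5] * 12              # 60 items, no dominant answer
uniform_ratings = [4] * 26 + [1, 2, 3, 5]      # 26 identical picture ratings
varied_ratings = [4] * 25 + [1, 2, 3, 5, 6]    # exactly 25 identical ratings

print(is_excluded(uniform_ratings, varied_bfi))  # True
print(is_excluded(varied_ratings, varied_bfi))   # False
```

The second participant is retained because exactly 25 identical ratings do not exceed the threshold of "over 25".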
Preregistered Analytical Approach
As measurements were nested within participants, we ran multilevel regression models with the R package lme4 (Bates et al., 2019). In a first baseline model, we decomposed the variance of the picture evaluations by including random intercepts for the participant and the specific OASIS picture. There was substantial variance between pictures, SD = 1.52, which is not surprising given that the pictures had been selected to capture a large range of valence, but there was also variance between participants, SD = 0.35, next to residual variance, SD = 1.21.
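To make the relative size of these variance components easier to see, the reported standard deviations can be converted into shares of the total variance. This is a minimal sketch; the shares are derived here from the three SDs quoted above and are not reported as such in the source.

```python
# Random-effect and residual SDs reported for the baseline model
sds = {"picture": 1.52, "participant": 0.35, "residual": 1.21}

# Variances are squared SDs; the total is their sum because the
# components are modeled as independent.
variances = {name: sd ** 2 for name, sd in sds.items()}
total = sum(variances.values())
shares = {name: var / total for name, var in variances.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%} of the total variance")
```

By this calculation, roughly three fifths of the variance lie between pictures, a few percent between participants, and the remainder is residual variance.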
Next, we added two dummy variables for positive and negative normed picture valence in the model (see Model 1 in Table 3). We allowed these effects to vary between participants by including random slopes in this and all following models. As expected, positive pictures were evaluated more positively, b = 1.49, 95% CI [1.30, 1.68], and negative pictures more negatively, b = -2.09, 95% CI [-2.29, -1.90], than neutral pictures. Notably, the effects of positive and negative pictures were heterogeneous across participants, as indicated by the standard deviation of the random slopes, SDpos = 0.29, SDneg = 0.56.
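One way to write down this model is sketched below. The notation is an assumption introduced for illustration: y_ij is the rating of picture j by participant i, Pos_j and Neg_j are the two valence dummies, u denotes participant-level random effects, v_j the random intercept of the picture, and ε_ij the residual. The covariance structure of the random effects is not specified in the text and is left open here.

```latex
\[
y_{ij} = \beta_0 + \beta_1\,\mathrm{Pos}_j + \beta_2\,\mathrm{Neg}_j
       + u_{0i} + u_{1i}\,\mathrm{Pos}_j + u_{2i}\,\mathrm{Neg}_j
       + v_j + \varepsilon_{ij}
\]
```

With the reported fixed effects (β0 = 4.42, β1 = 1.49, β2 = -2.09), the expected rating of an average participant is 4.42 for a neutral picture, 4.42 + 1.49 = 5.91 for a positive picture, and 4.42 - 2.09 = 2.33 for a negative picture.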
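| Term | Model 1: Estimate | Model 1: 95% CI lower | Model 1: 95% CI upper | Model 1: p | Model 2: Estimate | Model 2: 95% CI lower | Model 2: 95% CI upper | Model 2: p | Estimate with control variables |
|---|---|---|---|---|---|---|---|---|---|
| Intercept | 4.42 | 4.28 | 4.55 | < .001 | 4.42 | 4.28 | 4.55 | < .001 | 4.42 |
| Positive | 1.49 | 1.30 | 1.68 | < .001 | 1.49 | 1.30 | 1.68 | < .001 | 1.49 |
| Negative | -2.09 | -2.29 | -1.90 | < .001 | -2.09 | -2.29 | -1.90 | < .001 | -2.09 |
| N * Positive | | | | | 0.10 | 0.04 | 0.15 | < .001 | 0.05a |
| N * Negative | | | | | -0.08 | -0.14 | -0.01 | .023 | -0.05a |
| N (Negative) | | | | | -0.11 | -0.17 | -0.06 | < .001 | -0.06 |
| E * Positive | | | | | -0.00 | -0.06 | 0.04 | .890 | -0.01 |
| E * Negative | | | | | -0.13 | -0.20 | -0.07 | < .001 | -0.13 |
| A * Positive | | | | | 0.20 | 0.15 | 0.25 | < .001 | 0.16 |
| A * Negative | | | | | -0.19 | -0.25 | -0.13 | < .001 | -0.19 |
| A (Positive) | | | | | 0.25 | 0.20 | 0.29 | < .001 | 0.23 |
| A (Negative) | | | | | -0.14 | -0.19 | -0.09 | < .001 | -0.12 |
| C * Positive | | | | | 0.06 | 0.00 | 0.11 | .040 | 0.03a,b |
| C * Negative | | | | | -0.09 | -0.15 | -0.02 | .008 | -0.09 |
| O * Positive | | | | | 0.01 | -0.04 | 0.06 | .819 | 0.01 |
| O * Negative | | | | | -0.05 | -0.11 | 0.01 | .115 | -0.05 |
| Random Effect SDs | | | | | | | | | |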
Note. All models were run with N = 901. The Big Five were standardized. Confidence Intervals were computed with the confint.merMod function of the R package lme4 (Bates et al., 2019) with the profile method. N = Neuroticism; E = Extraversion; A = Agreeableness; C = Conscientiousness; O = Openness. N * Positive refers to the interaction of Neuroticism and positive normed valence of a picture; N (Positive) refers to the simple effect of Neuroticism for pictures with positive normed valence. Model 2 was repeated with the control variables age, gender, and language, and also when aggregating within valence level and participants. Detailed results of these analyses are provided in the Supplementary Materials (Tables A1 and B1).
aChange in significance (α = .05) in the control variable model.
bChange in significance (α = .05) in the aggregated model.
For our main model, we standardized the Big Five within languages. They were entered together with their two-way interactions with the two dummy variables into the model (see Model 2 in Table 3). This reduced the variance between participants, SD = 0.53, and the variance of the slopes, SDpos = 0.25, SDneg = 0.48, but not the variance between the pictures, SD = 0.13. In this model, the main effect of a Big Five trait refers to the effect for neutral pictures. The interaction term of positive/negative valence with a Big Five trait can be interpreted in two ways: First, it captures the extent to which interindividual differences in valence effects (i.e., the variation in the random slopes) can be explained statistically by the respective trait. Second, it captures changes in the main effect when a picture is positive/negative. We reran this model with different valence dummy variables to obtain simple slope estimates for each valence level, thereby changing the baseline. Due to our coding scheme, regression weights can be interpreted as the increase on the 7-point scale by steps of 1 standard deviation on a personality trait. The results of these analyses are visualized in Figure 1.
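The relation between the interaction terms and the simple slopes can be illustrated with the published coefficients. The sketch below adds the interaction term to the effect for neutral pictures, which is what changing the baseline dummy achieves in the model; the function name is hypothetical, and because the inputs are rounded to two decimals, the results can deviate by 0.01 from the simple slopes reported in the Results and in Table 3.

```python
# Effects for neutral pictures (main effects in Model 2), as reported
neutral_slope = {"N": -0.04, "E": 0.11, "A": 0.05}

# Trait x valence interaction terms from Table 3 (Model 2)
interaction = {
    ("N", "positive"): 0.10, ("N", "negative"): -0.08,
    ("E", "positive"): -0.00, ("E", "negative"): -0.13,
    ("A", "positive"): 0.20, ("A", "negative"): -0.19,
}

def simple_slope(trait, valence):
    """Expected change in rating per 1 SD increase on `trait`."""
    if valence == "neutral":
        return neutral_slope[trait]
    # Simple slope = effect for neutral pictures + interaction term
    return round(neutral_slope[trait] + interaction[(trait, valence)], 2)

for trait in ("N", "E", "A"):
    for valence in ("negative", "neutral", "positive"):
        print(trait, valence, simple_slope(trait, valence))
```

For Agreeableness, for example, this gives 0.05 + 0.20 = 0.25 for positive pictures and 0.05 - 0.19 = -0.14 for negative pictures, matching the simple slopes in Table 3.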
As can be seen in Figure 1 and Table 3, higher levels of Neuroticism were not associated with evaluations of neutral pictures, b = -0.04, 95% CI [-0.08, 0.01], but with more negative evaluations of negative pictures, b = -0.11, 95% CI [-0.17, -0.06]. This is consistent with Hypothesis 1. Unexpectedly, higher levels of Neuroticism were also related to more positive evaluations of positive pictures, b = 0.06, 95% CI [0.01, 0.11].
Higher levels of Extraversion were associated with more positive evaluations of positive pictures, b = 0.11, 95% CI [0.06, 0.16], therefore supporting Hypothesis 2. However, this positive association was also present for neutral pictures, b = 0.11, 95% CI [0.07, 0.16], but not for negative pictures, b = -0.02, 95% CI [-0.07, 0.03].
Consistent with Hypotheses 3 and 4, higher levels of Agreeableness were associated with more positive evaluations of positive pictures, b = 0.25, 95% CI [0.20, 0.29], and more negative evaluations of negative pictures, b = -0.14, 95% CI [-0.19, -0.09]. In addition, higher levels of Agreeableness were weakly positively related to the evaluations of neutral pictures, b = 0.05, 95% CI [0.01, 0.10].
Next to the hypothesized relevant traits, we also observed associations of Conscientiousness with picture evaluations: Higher levels of Conscientiousness were associated with more positive evaluations of positive pictures, b = 0.06, 95% CI [0.01, 0.11] and more negative evaluations of negative pictures, b = -0.08, 95% CI [-0.14, -0.03]. Finally, Openness was not significantly associated with evaluations of neutral, positive, or negative pictures (see Table 3). Thus, overall, our hypotheses were supported by the data.
Following the preregistration protocol, we next examined to what extent the found associations between the Big Five and picture evaluations were incremental to associations with sociodemographic variables. Both age and gender have been found to correlate with picture evaluations (Grühn & Scheibe, 2008; Lang & Bradley, 2007) and the Big Five (e.g., Donnellan & Lucas, 2008). Therefore, we repeated Model 2 while controlling for gender (standardized, higher scores = male), age (standardized), and language (standardized, higher scores = English). All control variables were allowed to interact with the two valence dummies. We only report results here if the inclusion of the control variables changed the significance (p < .05) of one of the terms (see Table 3). Detailed results of these analyses are provided in the Supplementary Materials (Table A1). Overall, the associations postulated in our hypotheses remained stable when controlling for sociodemographics. Only the association of Neuroticism with evaluations of negative pictures (H1) was weaker but still significant, and the unexpected association of Neuroticism with evaluations of positive pictures was not significant anymore.
Due to a nonnormal distribution of residuals, we repeated Model 2 with aggregated data as a statistical robustness check (which was not preregistered). The ten measures per individual and valence level were aggregated, which led to a model that only contained three measures per participant and thus only intercepts of the participants as random effects. Again, the results did not change much (see Table 3). The detailed results of these analyses can be found in the Supplementary Materials (Table B1).
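The aggregation step can be sketched as follows: the ratings of each participant are averaged within each valence level, leaving one value per participant and level. The data layout (tuples of participant, valence level, and rating), the function name, and the example values are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

def aggregate(ratings):
    """Average ratings within each (participant, valence level) cell.

    `ratings` is an iterable of (participant_id, valence_level, rating).
    """
    cells = defaultdict(list)
    for participant, level, rating in ratings:
        cells[(participant, level)].append(rating)
    return {cell: mean(values) for cell, values in cells.items()}

# Hypothetical long-format data (shortened to two ratings per cell)
long_data = [
    ("p1", "positive", 6), ("p1", "positive", 7),
    ("p1", "neutral", 4), ("p1", "neutral", 5),
    ("p1", "negative", 2), ("p1", "negative", 1),
]

aggregated = aggregate(long_data)
print(aggregated)
```

In the study, each cell would contain ten ratings, so the aggregated data set holds three values per participant.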
Exploratory Facet-Level Analyses
In an exploratory manner, we also repeated our main model with the 15 facets of the BFI-2 (instead of the five dimensions) as predictors. As these analyses were not preregistered, we do not present the results here but in the Supplementary Materials (Table E1). In essence, none of the Neuroticism facets were significantly associated with picture ratings. For all other dimensions, the facets showed diverging and sometimes even opposing associations (e.g., Productiveness was positively associated but Responsibility negatively associated with ratings of negative pictures, despite both belonging to the dimension Conscientiousness).
In this research, we examined whether the Big Five personality traits are associated with valence ratings of pictures from a standardized database. Our preregistered large-scale study (N = 901) revealed that all Big Five traits except Openness are associated with evaluations of positive, neutral, or negative pictures. In the following, we first discuss these results in light of personality research, then implications for research that uses valenced pictures, and finally limitations of our study.
Big Five Traits and Picture Evaluations
Overall, our predictions for Neuroticism and Extraversion were validated by the pattern of results: Based on classical personality theories (H. J. Eysenck & Eysenck, 1985; Gray, 1981), Neuroticism should be associated with negative and Extraversion with positive affect. Consistent with this, higher levels of Neuroticism were associated with more negative evaluations of negative but not positive pictures. Correspondingly, higher levels of Extraversion were associated with more positive evaluations of positive but not negative pictures. These results are consistent with the affect-reactivity perspective on the link between personality and affect (Canli et al., 2001; Gross et al., 1998; Larsen & Ketelaar, 1991; Rusting & Larsen, 1997; Smillie et al., 2012). However, there was also a positive relationship of Extraversion with the ratings of neutral pictures. This might also fit the affect-reactivity perspective, as previous research has shown that neutral evaluations often reflect ambivalent attitudes (Schneider et al., 2016). Highly extraverted individuals might react positively to the positive aspects of a picture. However, it is also possible that participants evaluated the pictures in a way consistent with how they wanted to feel. For instance, extraverts prefer experiencing positive emotions and might thus evaluate the neutral pictures consistent with their preferred emotional state (e.g., Tamir, 2009).
Our results regarding Agreeableness also matched with our theoretical reasoning – higher self-reported Agreeableness was associated with more positive evaluations of positive pictures and more negative evaluations of negative pictures. In other words, high Agreeableness emphasizes the valence implied in the norm ratings. This fits the idea that less agreeable individuals show deviant affective reactions to emotional stimuli (Decuyper et al., 2009; Stead & Fekken, 2014) and is in line with previous research on Agreeableness and emotional stimuli (Bresin & Robinson, 2015; Czerwon et al., 2011; Finley et al., 2017; Vogel et al., 2019). Apparently, people high in Agreeableness are most likely to show the consensual reaction, thus a positive (negative) evaluation of pictures that are evaluated positively (negatively) by the vast majority of people.
However, there might also be alternative explanations of our findings on Agreeableness. For instance, previous research has shown that agreeable individuals are emotionally more responsive to social situations relevant to interpersonal relationships (Tobin et al., 2000). This would imply that the pronounced relationships of Agreeableness with picture evaluations might be due to pictures depicting social content. We thus reran our main model with an additional dummy variable that codes picture content (1 = social, 0 = non-social). These exploratory analyses revealed that the relationships of Agreeableness and picture evaluations were descriptively slightly stronger for the social pictures but still present for the non-social pictures. Detailed results of these analyses are provided in the Supplementary Materials (Table C1). As a further interpretation, agreeable participants might simply be more compliant with the evaluation task and thus provide more reliable ratings (Vogel et al., 2019). Overall, our findings show that Agreeableness and its role in affective processes deserve to be further examined in future research.
However, our results also revealed some effects that we had not predicted, mainly regarding Conscientiousness. Higher levels of Conscientiousness were associated with more positive evaluations of positive pictures and more negative evaluations of negative pictures. One major aspect of Conscientiousness is acting dutifully and focused (Soto & John, 2017). Therefore, we speculate that people with high Conscientiousness simply took the task more seriously and provided more reliable judgments.
Overall, our results show that the Big Five are indeed related to interindividual differences in valence ratings from a standardized database, even beyond sociodemographic variables. This is consistent with previous research on personality and affect and further contributes to this field. Next to personality research, however, these results could also have important implications for psychological paradigms in other disciplines.
Implications Beyond Personality Research
We started this paper by arguing that systematic interindividual differences in picture evaluations could pose a potential problem for prominent paradigms in psychology, and our results indeed suggest that this may be the case. To pick up the introductory example, Vogel et al. (2019) found stronger evaluative conditioning effects for people with higher Neuroticism and Agreeableness – two traits for which we also found more pronounced effects of a picture’s valence. Thus, it seems plausible that those traits do not moderate the conditioning process itself but that the unconditioned stimuli have a stronger valence for people with higher Neuroticism and Agreeableness. Clearly, future research is necessary to examine the Big Five x Evaluative Conditioning moderations further.
This also raises the question of how researchers using these paradigms should deal with interindividual differences in the pictures’ evaluation. On the one hand, researchers who want to avoid any association with personality could select only those pictures that are evaluated positively/negatively by the individual participant. Another possibility would be to control for interindividual differences statistically (i.e., adding pre-ratings as covariates). On the other hand, detecting (instead of avoiding) associations with personality could also improve our knowledge on the underlying mental processes in these paradigms. For instance, the fact that participants high in Extraversion evaluate positive pictures more positively but apparently do not show elevated conditioning effects (Vogel et al., 2019) could imply that some of the processes underlying evaluative conditioning are weaker amongst extraverts.
Finally, we are confident that our results have similar implications for other research designs or even other research areas (e.g., neuroscience). On the positive side for research using these pictures, one should also keep in mind that we find modest effect sizes, with maximum shifts of a quarter of a scale point on the 1–7 scale for a change of ±1 SD on a trait. Still, we focused exclusively on the conceptually broad Big Five in our research; narrower personality traits (e.g., Need for Affect, Maio & Esses, 2001) could lead to even more pronounced effects. However, this is speculation, which brings us to the limitations of our research.
Limitations and Future Research Directions
Our findings are restricted to a selection of 90 pictures. We chose this stimulus pool because it resembles those used in typical psychological paradigms (such as evaluative conditioning) in its size, its symmetric differences in valence, and its lack of differences in arousal. Future research should aim to replicate our findings with another selection of pictures, perhaps even from other popular standardized databases, such as the IAPS (Lang et al., 2008).
Also, we used the rating paradigm and instructions from the OASIS to make our findings comparable to the normed ratings. However, such ratings capture only rather spontaneous appraisals and do not assess the temporal dynamics of affective reactions. In many research paradigms, the same picture is presented either for a longer time or on multiple occasions. Previous research has shown that the Big Five are also associated with emotion regulation processes (Augustine & Larsen, 2015; Bresin & Robinson, 2015). Thus, future research should investigate whether the associations between the Big Five and evaluations of affective pictures found here also vary over time.
In addition, we tested our hypotheses in a broad sample of German- and English-speaking participants. The fact that we find the same pattern independent of participants’ language (or nationality) speaks to the robustness of our findings across Western countries. Yet, replications in different cultures are recommended, as cultural differences in the appraisal of such pictures can be expected (cf. Kurdi et al., 2017).
Last, we focused exclusively on valence ratings in this research. In general, valence is considered to be the most important dimension in affective experiences (Lang & Bradley, 2007). However, classic theories on personality would also predict systematic interindividual differences in arousal ratings (H. J. Eysenck & Eysenck, 1985). Because valence and arousal are not independent dimensions (i.e., arousal is higher for pictures of positive and negative than of neutral valence)4, a thorough investigation of arousal effects might also further our understanding of interindividual differences in valence ratings. Future research should therefore investigate interindividual differences in arousal ratings as well. We have no reason to believe that the results depend on other characteristics of the participants, materials, or context.
In the present research, we investigated interindividual differences in picture evaluations from a standardized database. We show that all Big Five traits except Openness are associated with interindividual differences in valence ratings of positive, neutral, and negative pictures. These findings have important implications for research designs in psychology and point to possible problems for interpreting their results. At the same time, they demonstrate the role of the Big Five in interindividual differences in emotional experiences.