Implicit Personality Self-Concept Assessment and Validation
Konrad Schnabel

Two Pilot Studies for the Adaptation of a New Indirect Measure for Shyness

Introduction

Depending on the context, a moderate correlation between direct and indirect measures is sometimes interpreted as convergent validity, sometimes as discriminant validity. However, direct measures were considered to assess explicit representations, and indirect measures to assess implicit representations. Explicit and implicit representations were conceptualized as elements of two different, but interacting systems (see Chapter 2.2). Thus, correlations between direct and indirect measures can neither be unambiguously interpreted as convergent nor as discriminant validity. Instead, in order to correctly evaluate the convergent validity of an indirect measure, a correlational analysis with another indirect measure is needed. Also, a comparison between two different indirect measures is necessary to judge the method effects of any specific indirect assessment procedure. Therefore, an additional indirect measure was developed. The measure was adapted to assess the implicit personality self-concept of shyness, and was pre-tested for the purpose of the next study (Study 1).

Priming methods have only partially been shown to be an adequate referent to the IAT from an individual assessment perspective (see Cunningham, Preacher, & Banaji, 2001, for a successful attempt, and Bosson, Swann, & Pennebaker, 2000, for an unsuccessful attempt, to correlate priming methods with the IAT). As an alternative, the Evaluative Movement Assessment (EMA) by Brendl, Markman, and Messner (2003) was adapted to the study of the implicit personality self-concept.

The EMA was designed to employ automatic movement tendencies for the assessment of implicit preferences and motivations. The procedure induces automatic movement tendencies by two joystick movements that represent either approach behavior (pulling the joystick toward a target) or avoidance behavior (pushing the joystick away from a target). In cooperation with Brendl and Messner, the EMA was noticeably modified in order to assess the associative strength between the concept of self and attribute concepts (e.g., shy). The modified EMA was named the Implicit Association Procedure (IAP). Its main difference from the IAT is that the response itself (pulling the joystick toward a target or pushing it away from a target) already carries its own valence by triggering an automatic movement tendency. Another difference is that it is possible to specify unipolar target categories (such as self without specifying an opposite category such as others).

The detailed procedure of the IAP is described in the method section. In line with the EMA methodology, it was hypothesized that attributes that play an important role in the self-concept could be responded to more quickly with a joystick movement towards oneself than away from oneself. The opposite should be true for attributes that are not associated with the concept of self. The psychometric properties of three different IAP variants for shyness were pre-tested in two pilot studies. The IAP variant that would be considered for further studies was expected to meet the following criteria. First, its internal consistency should be at least α = .70. Second, it should show a substantial correlation with the shyness IAT, that is, at least r = .40. Third, it should, like the shyness IAT, correlate intermediately with direct self-ratings of shyness, that is, .30 < r < .50. Fourth, it should, like the shyness IAT, not correlate with social desirability. These criteria were explored in the pilot studies.
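The four criteria can be written down as a simple checklist. The following Python sketch is only an illustration: the function name and the input values are hypothetical, and treating "does not correlate with social desirability" as "the correlation is not statistically significant" is an assumption, since the text does not define a cut-off for that criterion.

```python
def check_iap_criteria(alpha, r_iat, r_self_rating, sd_correlation_significant):
    """Evaluate the four selection criteria for an IAP variant.

    sd_correlation_significant: assumed operationalization of criterion 4
    (True if the correlation with social desirability is significant).
    """
    return {
        "internal_consistency": alpha >= 0.70,
        "convergent_with_iat": r_iat >= 0.40,
        "intermediate_with_self_rating": 0.30 < r_self_rating < 0.50,
        "independent_of_social_desirability": not sd_correlation_significant,
    }


# Illustrative (not reported) values for a hypothetical IAP variant.
result = check_iap_criteria(alpha=0.80, r_iat=0.60, r_self_rating=0.40,
                            sd_correlation_significant=False)
print(result)
```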

Pilot Study 1: The Bipolar and the Unipolar IAP Variant

In Pilot Study 1, a bipolar and a unipolar IAP variant were examined. Their main difference was that the bipolar variant included Shy and Nonshy words but no Me and Notme words, whereas the unipolar variant included Shy, Me, and Notme words but no Nonshy words.

Methods

Participants and design. Participants were 32 (25 female and 7 male) psychology students who received research participation credit for an experiment on computer-aided personality assessment. Their mean age was M = 22.3 years, with a range from 19 to 29 years. Since the joystick was situated on the right side of the keyboard and was operated with the right hand, we made sure to select only right-handed participants. Due to technical shortcomings of the first joystick that was used, data from 10 participants of the bipolar IAP version and from 7 participants of the unipolar IAP version had to be excluded.

All participants completed (a) self-ratings on bipolar personality-describing items, (b) the bipolar or unipolar shyness IAP, (c) other personality items, (d) the shyness IAT, (e) two social desirability scales, (f) the IAP variant different from (b), and (g) were interviewed about the experiment. The shyness items of the IAPs and the IAT were included as direct ratings in step (a). The application of the unipolar and the bipolar IAP in steps (b) and (f) alternated between participants, such that half of the participants completed the bipolar IAP in step (b) and the unipolar IAP in step (f). The other half of the participants completed the IAPs in the reverse order.

Direct self-ratings. All direct self-ratings were assessed on the computer and were presented in a fixed random order. In step (a), participants had to rate their shyness on 10 bipolar adjective pairs (e.g., “shy 1-2-3-4-5-6-7 nonshy”) that were mixed with 30 conscientiousness, intellect, and irritability pairs. Step (c) comprised 28 personality-descriptive items on a 5-point scale (1 = not at all true for me, 5 = completely true for me). Five items referred to shyness and were the same as those used by Asendorpf et al. (2002). In step (e), participants responded to the 39 items of the social desirability scales from Lück and Timaeus (1969; English version by Crowne & Marlowe, 1960) and Stöber (1999; without the item “Have you ever consumed drugs”). These scales contain 16 and 23 items, respectively, and measure socially desirable responding by asking for socially desirable but infrequent or socially undesirable but frequent behaviors in a true-false format. To obtain a score for socially desirable responding, items of both scales were aggregated.

Implicit Association Test (IAT). The shyness IAT was identical to the one used in Asendorpf et al.’s (2002) studies. Task sequence and stimuli are depicted in Table 3. IAT scores were computed as the difference between mean response latencies in sequence 5 and sequence 3 (see Table 3). These sequences realized different combinations of the two target categories (Me versus Others) with the two attribute categories (Shy versus Nonshy). Thus, high IAT scores represented quicker associations of Me-Shy and Others-Nonshy as opposed to Me-Nonshy and Others-Shy.

Throughout the five discrimination tasks, category labels assigned to the right or left response key were displayed in the upper right or upper left corner of the screen, respectively. Response keys were the number “5” of the right-side numeric keypad and the letter “a” on the left side of the keyboard. On each trial, a stimulus word was displayed in the center of the screen. Participants were instructed to categorize the stimulus as quickly and accurately as possible. Responses were recorded using ERTS software (Behringer, 1994). After correct responses the interstimulus interval was 300 ms. After incorrect responses, the stimulus was immediately replaced by the word FEHLER (German for ‘error’) for 1000 ms, resulting in a 1300 ms interstimulus interval. Since this study focused on interindividual differences, and I did not want to confound interindividual variance with order variance, the stimulus order was the same for all participants. In the two combined tasks, the stimuli alternated between target and attribute discrimination. The 10 target and 10 attribute stimuli were randomized in order within 4 blocks of 20 trials. Internal consistency was evaluated across these 4 subtests. Trials with incorrect responses were excluded from analysis, and response latencies above 3000 ms were recoded as 3000 ms. Since the adaptation of the shyness IAP was based on this data reduction procedure (raw instead of log-transformed latencies, inclusion of the first two trials of combined blocks), the reported results refer to this procedure.
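The data reduction described above (excluding error trials, recoding latencies above 3000 ms, and subtracting the mean latency of sequence 3 from that of sequence 5) can be sketched in a few lines of Python. The data layout (a list of latency/correctness pairs per sequence) and the function names are assumptions made for illustration; the original analysis was not necessarily implemented this way.

```python
from statistics import mean

MAX_LATENCY_MS = 3000  # latencies above this value are recoded to it


def mean_latency(trials):
    """Mean raw latency over correct trials, with long latencies recoded.

    trials: list of (latency_ms, correct) tuples (assumed data layout).
    """
    kept = [min(latency, MAX_LATENCY_MS) for latency, correct in trials if correct]
    return mean(kept)


def iat_score(seq3_trials, seq5_trials):
    """Difference score: mean latency in sequence 5 minus sequence 3."""
    return mean_latency(seq5_trials) - mean_latency(seq3_trials)


# Made-up trials for one participant.
seq3 = [(600, True), (700, True), (900, False)]    # error trial is dropped
seq5 = [(800, True), (3500, True), (1000, True)]   # 3500 ms is recoded to 3000 ms
print(iat_score(seq3, seq5))  # 1600 - 650 = 950
```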

Implicit Association Procedure (IAP). The IAP was based on the Evaluative Movement Assessment (EMA), developed by Brendl, Markman, and Messner (2003). Within Pilot Study 1, two earlier EMA versions were adapted to assess the self-concept of shyness. The two shyness IAP variants were similar to the shyness IAT in that they combined discriminations of Shy versus Nonshy (attribute discrimination) with discriminations of Me versus Notme (target discrimination). Contrary to the IAT, only Me was explicitly shown on the computer screen and no label for alternative targets was given. Therefore, Notme described the nonself-relevant alternatives better than Others. However, the main difference from the IAT was that participants responded by moving a joystick instead of pressing a response key. With the joystick, stimuli had to be pushed toward or away from the word Me, depending on whether the stimuli had to be associated with Me or Notme. In the two IAP variants of Pilot Study 1, the joystick was situated in front of the participant, on the right side of the keyboard. The word Me was displayed in the center of the screen, whereas stimuli were presented on its right or left side. For stimuli that appeared on the right side, the joystick had to be pushed to the left if the stimulus had to be associated with Me, and to the right if the stimulus had to be associated with Notme. For stimuli that appeared on the left side, the opposite was true.
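The side-dependent response mapping of the horizontal IAP variants can be made explicit with a small Python sketch. The function name and the string labels are hypothetical; the mapping itself follows the description above.

```python
def required_direction(stimulus_side, assign_to_me):
    """Return the correct horizontal joystick direction for one trial.

    stimulus_side: 'left' or 'right' of the centered word Me.
    assign_to_me:  True if the stimulus has to be associated with Me,
                   False if it has to be associated with Notme.
    """
    if stimulus_side not in ("left", "right"):
        raise ValueError("stimulus_side must be 'left' or 'right'")
    # Moving toward Me means moving toward the screen center.
    toward_me = "left" if stimulus_side == "right" else "right"
    # Moving away from Me means moving further out on the stimulus side.
    away_from_me = stimulus_side
    return toward_me if assign_to_me else away_from_me


for side in ("left", "right"):
    for to_me in (True, False):
        print(side, to_me, required_direction(side, to_me))
```

The sketch shows that the same category assignment requires opposite movements depending on the presentation side, so the mapping between category and movement direction is not constant across trials.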

A bipolar and a unipolar IAP variant were adapted in Pilot Study 1. The task sequence of both is depicted in Table 4. In the bipolar version, there was a discrimination of Shy and Nonshy but not of Me and Notme words. Participants first had to push Shy words toward Me and Nonshy words away from Me. Then, the response direction was reversed, and Shy words had to be pushed away from Me and Nonshy words toward Me. The IAP score was computed as the difference in mean latency between both tasks (sequence 2 minus sequence 1, see Table 4). The Shy and Nonshy words were identical to those of the IAT and were randomized in order within 10 blocks of 10 trials. Internal consistency was evaluated across 5 subtests with 20 trials each. In the unipolar version, there were Me, Notme, and Shy but no Nonshy words. First, participants learned to discriminate the target concepts that consisted of three Me (self, my, own) and three Notme (your, them, other) words that were identical to the IAT target stimuli. In the following initial combined task, the five Shy words from the bipolar version were added and had to be pushed toward Me. Finally, the response direction for the Shy words was reversed. The IAP score was computed as the difference in mean latency between both combined tasks (sequence 3 minus sequence 2, see Table 4). Stimuli were randomized in order within 10 blocks of 11 trials. Internal consistency was evaluated across 5 subtests with 22 trials each.
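Internal consistency across subtests is reported as Cronbach's α in the results. A minimal Python sketch of that coefficient is given below, assuming that each participant contributes one difference score per subtest; how the subtest scores were actually formed from the blocks is not spelled out in the text, so the data layout and the example values are illustrative only.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a participants x subtests score matrix.

    scores: list of rows, one per participant, each holding k subtest
    scores (assumed here to be latency difference scores).
    """
    k = len(scores[0])
    subtest_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(subtest_vars) / total_var)


# Made-up difference scores (ms) for four participants on five subtests.
example = [
    [120, 100, 140, 110, 130],
    [-30, -10, -50, -20, -40],
    [60, 80, 40, 70, 50],
    [10, 0, 30, 20, -10],
]
print(round(cronbach_alpha(example), 2))
```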

As in the IAT, participants were instructed to respond as quickly and accurately as possible. The correct response directions for the Me words (ME WORDS = TOWARDS ME) and/or the Shy words (SHY WORDS = TOWARDS ME or SHY WORDS = AWAY FROM ME) were presented in green in the middle of the upper screen line. During all trials, the word Me (white letters) with a frame around it was displayed in the center of the screen. Trials began by displaying the stimulus mask XXXX (red letters) for an interval of 500 ms at the right or left side of the Me. Next, a target or attribute word (red letters) was presented in the same place. The stimulus disappeared when participants moved the joystick clearly in one direction, whereas the reaction time was registered immediately at the beginning of the movement. Reaction time was measured as the time elapsed from the onset of the stimulus presentation. After correct responses the interstimulus interval was 600 ms. After incorrect responses, the stimulus was immediately replaced by (a) the word FEHLER (German for ‘error’) if the joystick was moved in the wrong direction, (b) the words ZU LANGSAM (German for ‘too slow’) if there was no response after 3000 ms, or (c) the words ZU FRÜH BEWEGT (German for ‘moved too early’) if there was any response during the presentation of the stimulus mask. All error messages were displayed in yellow in the center of the screen for 200 ms and were followed by the 600 ms interstimulus interval. Within both IAP variants, stimulus order was not randomized between participants. All trials with incorrect responses were excluded from analysis. As the presentation of the stimulus stopped after 3000 ms, there were no response latencies longer than that.
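The three kinds of error feedback can be summarized as a simple decision rule. The Python sketch below is hypothetical: the function and parameter names are invented, and the order in which the conditions are checked is an assumption, since the text does not say which message took precedence.

```python
TIMEOUT_MS = 3000  # stimulus presentation stopped after this interval


def classify_trial(moved_during_mask, latency_ms, moved_direction, correct_direction):
    """Return the feedback message for one IAP trial, or 'correct'.

    latency_ms is None if no movement was registered before the timeout.
    """
    if moved_during_mask:
        return "ZU FRÜH BEWEGT"   # moved too early
    if latency_ms is None or latency_ms > TIMEOUT_MS:
        return "ZU LANGSAM"       # too slow
    if moved_direction != correct_direction:
        return "FEHLER"           # wrong direction
    return "correct"


print(classify_trial(False, 720, "left", "left"))
print(classify_trial(False, 720, "right", "left"))
print(classify_trial(False, None, None, "left"))
print(classify_trial(True, 150, "left", "left"))
```

Only trials classified as correct would enter the latency analysis, in line with the exclusion rule stated above.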

Interview. Finally, participants were asked to comment on the experiment and whether they had difficulties with the IAT or the IAPs. In addition, they estimated the difficulty of the IAT and the two IAP variants on five-point scales ranging from 1 = easy to 5 = very demanding.

Results and Discussion

Error rates and distribution of test scores. Error rates were M = 7.9%, SD = 5.2% for the bipolar IAP, M = 4.9%, SD = 4.2% for the unipolar IAP, and M = 6.8%, SD = 4.0% for the IAT. Differences were tested by a 2x3 ANOVA with order (bipolar vs. unipolar IAP first) as the between-subjects factor and test (bipolar IAP, unipolar IAP, IAT) as the within-subjects factor. Results showed no main effect of order, F(1, 19) = .72, n.s., but a marginal main effect of test, F(2, 38) = 2.80, p < .10, and a marginal interaction effect, F(2, 38) = 3.16, p < .10. Post hoc comparisons with Bonferroni correction (p < .005) indicated that when the bipolar IAP was the first test, its error rates were higher than those of the unipolar IAP, t(11) = 4.00, p < .005, d = 1.64, and error rates for the IAT were likewise higher than for the unipolar IAP, t(11) = 4.05, p < .005, d = 1.65. (The effect size d for repeated measures was computed as √2(M1 - M2)/SD, where SD is the standard deviation of the difference scores; see Cohen, 1988.) All other differences were not even marginally significant, all |t|(11) < 2.30, n.s. For all three indirect tests, no participant had error rates higher than 19%, and the distributions of the test scores were not even marginally different from a normal distribution, Z < 1.
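The effect size formula for repeated measures can be illustrated with a short Python sketch. Because the paired t statistic equals the mean difference divided by SD/√n, the same d can also be recovered from a reported t value and the number of participants as d = √2 · t / √n. The function names are chosen for illustration only.

```python
from math import sqrt
from statistics import mean, stdev


def d_repeated(x1, x2):
    """d = sqrt(2) * (M1 - M2) / SD, with SD of the difference scores."""
    diffs = [a - b for a, b in zip(x1, x2)]
    return sqrt(2) * mean(diffs) / stdev(diffs)


def d_from_t(t, n):
    """Same effect size computed from a paired t value and sample size n."""
    return sqrt(2) * t / sqrt(n)


# t(11) = 4.05 implies n = 12 participants in that order group.
print(round(d_from_t(4.05, 12), 2))  # 1.65
```

Applied to t(11) = 4.05, the second function reproduces the reported d = 1.65.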

Reliabilities and correlations of indirect and direct measures. As can be seen in Table 3, the two IAPs only partially met the criteria for a new indirect procedure. First, reliability for both IAP variants was satisfactory and comparable to the IAT, although it tended to be lower for the bipolar IAP. Inspection of scatterplots (first test half against second test half) revealed that the somewhat higher reliability of the unipolar version was driven by one outlier. When this participant was discarded from analysis, Cronbach’s α decreased to .73 for the unipolar variant, too. However, exclusion of this participant did not affect the correlations of the unipolar IAP. Taken together, reliability was slightly lower for the IAPs than for the IAT but still at an acceptable level. Second, neither of the IAP variants correlated even marginally with the IAT. Although this correlation was somewhat higher for the bipolar IAP, it still did not reach the substantial convergent validity that was expected. Moreover, the two IAPs were only intermediately correlated, indicating low convergent validity between the two variants. Third, concerning direct shyness measures, the bipolar IAP showed high correlations, whereas the unipolar IAP tended to correlate only marginally. Thus, the intermediate correlation of the IAT with direct measures was replicated only for the unipolar IAP, while the bipolar IAP showed high convergent validity with direct self-ratings. Fourth, like the IAT, the IAPs did not correlate with social desirability. However, this was also true for the direct measures, which may well be a chance finding, as shyness self-ratings are usually correlated with social desirability (Jones, Briggs, & Smith, 1986), and were so in Pilot Study 2. Finally, the two shyness self-ratings were highly correlated, replicating the convergent validity of the bipolar items, which were used in the indirect tests (Asendorpf et al., 2002).

Following advice from the EMA authors (C. Messner, personal communication, December, 2000), the unipolar IAP score may be better calculated by considering response latencies for the Shy words only, without the Me and Notme words. However, this had almost no effect on the results. If reaction times for Me and Notme words were excluded rather than included, the unipolar IAP’s reliability was virtually the same, α = .86 versus α = .84. The correlation with the bipolar IAP, which contained no Me and Notme words at all, was slightly higher, r = .45 versus r = .37. All other correlations tended to be smaller, such as the correlation with the IAT (r = -.15, versus r = .13), the bipolar shyness self-rating (r = .32, versus r = .33), and the shyness questionnaire (r = .32, versus r = .38). Taken together, this illustrated that inclusion of the Me and Notme trials in the scoring algorithm did not decrease the validity of the unipolar IAP.

Interview. A 2x3 ANOVA with order (bipolar vs. unipolar IAP first) as the between-subjects factor and test (bipolar IAP, unipolar IAP, IAT) as the within-subjects factor was performed on the difficulty estimates that participants reported for the three indirect tests. Results showed significant main effects of order, F(1, 19) = 4.52, p < .05, and of test, F(2, 38) = 13.74, p < .001, and a marginally significant interaction effect, F(2, 38) = 3.01, p < .10. Post hoc comparisons with Bonferroni correction (p < .005) revealed that across the two order groups the unipolar IAP was judged as marginally easier when it was the last rather than the first test, t(19) = -2.93, p < .01, d = 1.34. This was not true for the bipolar IAP, t(19) = -.06, n.s. The IAT, which was always the second test, was not judged differently between the two groups, t(19) = -1.49, n.s. Post hoc comparisons within the two order groups indicated that when the bipolar IAP was the first test, it was judged as more difficult than the IAT and the unipolar IAP, t(11) = 4.42, p < .005, d = 1.80, t(11) = 3.80, p < .005, d = 1.55. In contrast, when the unipolar IAP was the first test, it was not judged as more difficult than either the IAT or the bipolar IAP, t(8) = 2.63, n.s., t(8) = .00, n.s. Neither the bipolar nor the unipolar IAP was judged as more difficult than the IAT when it was the last test, t(8) = 2.86, n.s., t(11) = 1.08, n.s.

What made the bipolar IAP, at least when it was the first test, more difficult and, as observed before, more susceptible to errors than the unipolar IAP? In the interview, participants reported that they had difficulty associating the horizontal joystick movement to the right or to the left with a movement toward or away from Me. A movement toward versus away from Me could have been more directly associated with a vertical joystick movement, that is, with pulling the joystick towards oneself versus pushing it away from oneself. In the unipolar IAP version, the Me-Notme dimension, although horizontal, was continuously practiced by including the Me-Notme words. In both IAP versions, the Me-Notme discrimination might have been additionally difficult because Me-Notme could not be constantly assigned to a movement to the right versus to the left. Instead, the correct movement direction changed depending on whether the stimulus appeared on the right or the left side of the Me. For example, when Shy words had to be associated with Me, the joystick had to be pushed to the right if a Shy word was presented on the left, versus to the left if a Shy word was presented on the right. Whereas the assignment of response keys stayed constant during the combined tasks of the IAT, the assignment of movement directions in the IAP did not. As a consequence, the IAP required not only a discrimination of categories but also a consideration of presentation side. Both the horizontal movement to the right versus to the left and its changeable mapping to Me versus Notme might have made the categorization within the bipolar IAP more difficult, especially since this was not trained by the presentation of Me and Notme words.

The task difficulty of the bipolar IAP may also account for its high correlation with direct shyness measures that reached almost the level of the bipolar IAP’s internal consistency. Due to the task difficulty, participants might have been forced to react more reflectively rather than spontaneously. Therefore, the bipolar IAP might have been more consistent with the direct measures than with the IAT. Evidence for this assumption was obtained through a 2x3 ANOVA with order (bipolar vs. unipolar IAP first) as the between-subjects factor and test (bipolar IAP, unipolar IAP, IAT) as the within-subjects factor that was performed on mean reaction times within the tests. Results showed no main effect of order, F(1, 19) = 2.11, n.s., but a main effect of test, F(2, 38) = 13.74, p < .001, and an interaction effect, F(2, 38) = 6.21, p < .01. Post hoc comparisons with Bonferroni correction (p < .005) revealed the same pattern as for the difficulty estimates within the two order groups. When the bipolar IAP was the first test, it was completed more slowly than the IAT and the unipolar IAP, t(11) = 4.07, p < .005, d = 1.66, t(11) = 6.78, p < .001, d = 2.77. All other differences were not even marginally significant, all |t|(11) < 2.26, n.s. Thus, when the bipolar IAP was the first test, participants needed more response time than for the other tests, which may indicate that their reactions were more influenced by the reflective system. Another reason for the high correlations between the bipolar IAP and direct shyness measures could be that it was not confounded by task-switching effects (Mierke & Klauer, 2001), as there was only a discrimination of Shy-Nonshy but not of Me-Notme. However, one would rather expect shorter instead of longer response latencies in the absence of task-switching (Mierke & Klauer, 2001). Thus, although the reported response latency differences were significant only for the first IAP and the sample size was small in this study, it would be an interesting topic for further research to explore whether correlations between indirect and direct measures increase with task difficulty and reflection time for the indirect test.

Conclusion. The IAPs’ satisfactory internal consistency as well as their congruent validity with direct measures showed that the IAPs are an acceptable procedure for the assessment of interindividual differences. Nevertheless, the interview and the correlation pattern made it clear that three main features had to be changed. First, the joystick had to be moved vertically rather than horizontally, as this would better represent a Me-Notme dimension. Second, Shy and Nonshy words should be included in the IAP, since the bipolar IAP showed higher correlations with the IAT and direct measures. Third, Me and Notme words should also be included, because task difficulty seems to be more comparable with the IAT. These changes were realized in Pilot Study 2.

Pilot Study 2: The Final IAP Variant

In Pilot Study 2, the final IAP variant was examined. Like the IAT, it included Shy, Nonshy, Me, and Notme words.

Methods

Participants and design. Participants were 31 (27 female and 4 male) psychology students who had not participated in Pilot Study 1. They were recruited for an experiment on computer-aided personality assessment, and received research participation credit. Their mean age was M = 21.6 years, with a range from 19 to 32 years.

All participants completed (a) the shyness IAP, (b) two social desirability scales, (c) the shyness IAT, (d) personality-describing items, (e) a retest of (a), (f) self-ratings on bipolar personality items, and (g) were interviewed about the IAP. The shyness items of the IAP and the IAT were included as direct ratings in step (f). Contrary to Pilot Study 1, there were no direct shyness self-ratings before the indirect tests.

Direct self-ratings and interview. Again, direct self-ratings were assessed on the computer and were presented in a fixed random order. In step (b), participants responded to the same social desirability scales as in Pilot Study 1. Step (d) comprised a 32-item self-monitoring scale and an 8-item irritability scale that were not analyzed for the purpose of the present study. Bipolar adjective pairs in step (f) were identical to Pilot Study 1 and included the shyness self-rating. The interview at the end of the experiment was the same as in Pilot Study 1.

Implicit Association Test (IAT) and Implicit Association Procedure (IAP). The shyness IAT was identical to Pilot Study 1. For the shyness IAP, the main difference from the preceding variants was that the joystick was moved vertically rather than horizontally. The joystick had to be pulled toward oneself for words that were associated with Me, and pushed away from oneself for words that were not associated with Me. The task sequence for the final IAP version is depicted in Table 6. As in the unipolar variant of Pilot Study 1, participants first learned to discriminate the three Me and Notme words. In the following initial combined task, the five Shy and Nonshy words from the bipolar variant were added and had to be pulled toward or pushed away from the participant, respectively. Finally, the direction for the Shy and Nonshy words was reversed, assigning Shy words to a movement away from the participant and Nonshy words to a movement toward the participant. The IAP score was computed as the difference in mean latency between both combined tasks (sequence 3 minus sequence 2, see Table 6). Stimuli were randomized in order within 8 blocks of 16 trials. Internal consistency was evaluated across 4 subtests with 32 trials each.

Trial presentation was identical to Pilot Study 1, except for the following points. The word Me with a frame around it, representing the participant, was presented in the center of the lowest screen line (see Appendix). Stimuli appeared above it in the center of the screen. Stimuli and the stimulus mask were displayed in white to make the screen design more comparable to the IAT. The correct response directions for the Shy (SHY = ME in sequence 2) or Nonshy (NONSHY = ME in sequence 3) words were presented in a subtle red in the upper left corner of the screen and only during the combined tasks. The joystick was located on the table directly in front of the participant, right in front of the keyboard and the screen (see Appendix). The joystick could be operated with the right or the left hand, allowing for both right-handed and left-handed participants.

Results and Discussion

Error rates and distribution of test scores. Error rates were M = 5.3%, SD = 4.3% for the first IAP, M = 4.1%, SD = 3.7% for the retest IAP, and M = 5.1%, SD = 3.3% for the IAT. A one-way ANOVA with test (IAP, IAT, retest IAP) as a within-subjects factor revealed that they were not even marginally different, F(2, 60) = 2.36, n.s. For all three tests, no participant had error rates higher than 17%, and the distributions of the test scores were not even marginally different from a normal distribution, Z < 1.

Reliabilities and correlations of indirect and direct measures. The reliabilities and correlations, which are depicted in the first line of Table 7, met the criteria that were expected from the new IAP. First, the IAP’s internal consistency was completely satisfactory. Second, the IAP correlated highly with the IAT. Third, it correlated intermediately with the direct self-rating, and as highly as the IAT did. Fourth, the IAP, like the IAT, did not correlate with social desirability, whereas the direct self-rating did. Furthermore, the test-retest reliability of the IAP was lower than its internal consistency, which replicated results for the IAT in other studies (cf. Egloff, Schwerdtfeger, & Schmukle, 2003). Finally, the second IAP showed lower correlations with both the IAT and the direct self-rating. A decrease in validity for the second test was also shown for the IAT (Asendorpf et al., 2002). Taken together, the correlational pattern of the IAP met all criteria and was highly comparable to the IAT.

Interview. A one-way ANOVA with test (IAP, IAT, retest IAP) as a within-subjects factor revealed that the difficulty estimates for the three indirect tests were not even marginally different, F(2, 60) = 2.39, n.s. When the same ANOVA was performed on mean reaction times, a significant main effect emerged, F(2, 60) = 7.98, p < .001. Post hoc single comparisons with Bonferroni correction (p < .015) indicated that the first IAP was completed more slowly than the IAT and the retest IAP, t(30) = 3.18, p < .01, d = .85, t(30) = 3.64, p < .01, d = .81. However, reaction times between the IAT and the retest IAP were not even marginally different, t(30) = 1.24, n.s. Since I did not vary the order of the IAT and the IAP between subjects, I could not examine whether the difference between the first IAP and the subsequent IAT was due to learning effects. Nevertheless, when the order of the IAT and the IAP was counterbalanced across participants in the subsequent study (Study 1), their mean response latencies were not even marginally different, t(295) = 1.59, n.s. More importantly, the first IAP in Pilot Study 2 was completed significantly more quickly than the first bipolar IAP of Pilot Study 1, t(41) = 4.67, p < .001, d = 1.46.

Conclusion. The correlational pattern as well as the difficulty estimates by the participants revealed a correspondence between IAT and IAP. This is also illustrated by the high correlation (r = .60) between both tests that reached almost the level of the IAP’s retest reliability (r = .67). In general, the IAP seemed to be a good candidate for the purpose of replicating results of the IAT and estimating the method-specific variance of both tests.