The test, named the Implicit Association Test (IAT), was ostensibly designed to reveal and measure unconscious racism, writes Goldhill. It has since been administered to millions of people worldwide and used to justify references to implicit bias that perpetuate the myth of the gender pay gap or of racist police shootings.
Goldhill notes that of various versions of the test (there are more than a dozen), the test receiving the most attention has been the black-white race IAT, which assumes that accurate reflections of implicit bias are gained from identifying words as “good” or “bad” as quickly as possible. She notes, “The slower you are and the more mistakes you make when asked to categorize African-American faces and good words using the same key, the higher your level of anti-black implicit bias — according to the test.”
[clip]
Then Goldhill gets down to the crux of the matter:
Flawed Psychological Test Used Everywhere To Ignite Claims Of 'Implicit Bias'
In recent years, a series of studies have led to significant concerns about the IAT’s reliability and validity. These findings, raising basic scientific questions about what the test actually does, can explain why trainings based on the IAT have failed to change discriminatory behavior.
First, reliability: In psychology, a test has strong “test-retest reliability” when a user can retake it and get a roughly similar score. Perfect reliability is scored as a 1, and defined as when a group of people repeatedly take the same test and their scores are always ranked in the exact same order. It’s a tough ask. A psychological test is considered strong if it has a test-retest reliability of at least 0.7, and preferably over 0.8.
Current studies have found the race IAT to have a test-retest reliability score of 0.44, while the IAT overall is around 0.5; even the high end of that range is considered “unacceptable” in psychology. It means users get wildly different scores whenever they retake the test …
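To make the reliability figures concrete, here is a minimal sketch of how a test-retest score could be computed for a group of people who take a test twice. It assumes a rank-based (Spearman) correlation, chosen only because it matches the "ranked in the exact same order" description above; the studies Goldhill cites may well use a different statistic. The function names, the sample scores, and the default 0.7 cutoff parameter are illustrative assumptions, not anything taken from the IAT itself.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Rank values from 1 (lowest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def retest_reliability(first_scores, second_scores):
    """Rank correlation between two sittings of the same test by the same people.

    Assumption: Spearman correlation stands in for "test-retest reliability".
    """
    return pearson(ranks(first_scores), ranks(second_scores))


def is_strong(reliability, threshold=0.7):
    """Apply the 'at least 0.7' rule of thumb quoted above."""
    return reliability >= threshold


# Hypothetical scores for five people, invented for illustration only.
first_sitting = [0.10, 0.40, 0.35, 0.80, 0.60]
second_sitting = [0.20, 0.50, 0.45, 0.90, 0.70]  # same ordering of people
reversed_sitting = [0.90, 0.50, 0.60, 0.10, 0.30]  # exactly opposite ordering

same_order = retest_reliability(first_sitting, second_sitting)
opposite_order = retest_reliability(first_sitting, reversed_sitting)
```

In this sketch, two sittings that rank everyone identically score 1, and a complete reversal scores -1. Under the same rule of thumb, the reported values of 0.44 and 0.5 both fall short of the 0.7 cutoff.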