Assessment Rigor Mortis
by Roberta Hill
Readers of my column will know by now that I am quite irreverent about the "absolutes" of using assessments. In this issue, I want to attack the concept of psychometric rigor, or, as I like to refer to it, psychometric rigor mortis. This whole discussion is a bit like flogging a dead horse.
Let me qualify this by saying that the scientific research behind an assessment is an extremely important component. However, the user of assessment instruments does not need to go into great depth on the studies.
Coaches who provide assessments to clients should ask two simple questions:
- Have both reliability and validity studies been conducted?
- Do the results fall within the accepted norms?
That's it. The exact statistics are less important. Why?
- These studies are usually paid for and sponsored by the company that created the tool and they are therefore subject to some perceived bias.
- The math is too complicated for most of us. Statistics may not lie, but I am not so sure about the statisticians.
- Depending on how the results are reported, they can be quite misleading. Usually the assessment company pays for more than one validity study out of five possible types. The results must then be compiled and reported, usually in a way that is favorable to the sponsor.
However, coaches seem to feel that they have to have the data in case someone asks. Most people won't ask, but if you do have someone who wants "proof," here is my short standard answer. Memorize it and you won't have any issues:
"I can tell you that the research and studies have been done. I am not a statistician and that is why the company that developed this instrument hired someone else to do the research. I have been assured that the results fall within the acceptable range for psychometrically sound instruments."
If they still push to actually see the data for themselves, I add, "I am sure that we can get you a copy of the study if it is that important to you, but since the report will be quite large, there will be a cost associated." This has always stopped the discussion and provided a way for someone to back down gracefully.
Now for the part of this issue that really annoys me. I am tired of some vendors' representatives telling me (us) that their instrument is more valid (they should say more scientifically sound) because it has a .90 correlation versus a competitor's .70. Poppycock.
This may sound a little too much like math, but bear with me. First, when these numbers are provided, they relate to reliability, NOT validity. Reliability measures stability--test-retest consistency over a period of time. Validity measures accuracy--whether the instrument actually measures what it says it measures. Second, "over a period of time" in the research is usually six months. A .80 correlation suggests that if I take an assessment today, my results in six months will look broadly similar (loosely, "about 80% the same," although a correlation is not literally a percentage), but it says nothing about ten years from now. Third, some instruments by their very nature will have higher correlations on this scale.
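For readers who do want to see what sits behind a test-retest number, here is a minimal sketch in Python. It assumes the reliability figure is an ordinary Pearson correlation between two sittings of the same assessment; the scores are invented for illustration, the helper name `pearson_r` is my own, and no vendor's actual method is implied.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six people (made up, not from any real study).
first_sitting = [12, 18, 25, 31, 40, 44]
six_months_later = [14, 17, 27, 29, 41, 43]

reliability = pearson_r(first_sitting, six_months_later)
print(f"test-retest reliability: {reliability:.2f}")

# Everyone scores exactly 5 points higher the second time around.
shifted = [score + 5 for score in first_sitting]
print(f"reliability after a uniform shift: {pearson_r(first_sitting, shifted):.2f}")
```

The second comparison is the one worth noticing: no individual score is "the same," yet the correlation is essentially perfect, because the statistic tracks whether people keep their relative positions rather than whether their numbers repeat.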
This is my old complaint of trying to compare apples and oranges. It is much easier to get high reliability results with behavioral assessments, such as the DISC profile. Assessments that measure internal traits (motives, preferences, instincts, etc.) rather than externally observed behavior are less likely to achieve high reliability. The MBTI (Myers-Briggs Type Indicator), which is not behavioral in nature, has one of the lower reliability correlations. It is still within the acceptable range under the Standards for Educational and Psychological Testing, developed jointly by the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education. In addition, it is probably the most comprehensively researched tool of its kind in the "popular" market of ipsative (forced-choice) instruments.
Here is an analogy that might help. Assessing traits like motives, preferences, and instincts is like listening to a person speak in order to determine his or her mother tongue. It's not quite as obvious as it appears, because of accents, dialect, education and many other factors. On the other hand, assessing behavior is more like looking at a person and determining their physical gender. Sometimes you may not be sure, but I'll bet you are right over 95% of the time (think of it, loosely, as a .95 correlation).
I have three pieces of advice:
- Make sure the studies have been conducted.
- Forget the numbers and stop worrying about the technical stuff.
- Keep to the basics. Remember that the tool is not an end in itself--it's a means to help the client with discovery.
Roberta Hill, MBA, is a Professional Certified Coach (PCC), as well as a Professional Mentor Coach (PMC) and Certified Teleclass Leader with Corporate Coach U International. Roberta owns www.AssessmentsNow.com, an online assessment provider with a network of more than 40 qualified coaches worldwide. Read more about Roberta in the WABC Coach Directory. Roberta may be reached by email at firstname.lastname@example.org.