BACKGROUND: Mitigating rating inconsistency can improve measurement fidelity and detection of treatment response. METHODS: The International Society for CNS Clinical Trials and Methodology convened an expert Working Group that developed consistency checks for ratings of the Hamilton Anxiety Rating Scale (HAM-A) and the Clinical Global Impression of Severity of anxiety (CGIS), which are widely used in studies of mood and anxiety disorders. Flags were applied to 40,349 HAM-A administrations from 15 clinical trials and to Monte Carlo-simulated data as a proxy for applying flags under conditions of inconsistency. RESULTS: Thirty-three flags were derived; these included logical consistency checks and statistical outlier-response pattern checks. Twenty percent of the HAM-A administrations had at least one logical scoring inconsistency flag, and 4% had two or more. Twenty-six percent of the administrations had at least one statistical outlier flag, and 11% had two or more. Overall, 35% of administrations had at least one flag of any type: 19% had one and 16% had two or more. Most of the administrations in the Monte Carlo-simulated data raised multiple flags. LIMITATIONS: Flagged ratings may represent less-common presentations of administrations done correctly. CONCLUSIONS: Application of flags to clinical ratings may aid in detecting imprecise measurement. Flags can be used for monitoring of raters during an ongoing trial and as part of post-trial evaluation. Applying flags may improve reliability and validity of trial data.
- MeSH
- humans MeSH
- psychiatric status rating scales MeSH
- psychometrics MeSH
- reproducibility of results MeSH
- anxiety * MeSH
- anxiety disorders * diagnosis drug therapy MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
BACKGROUND: Symptom manifestations in mood disorders can be subtle. Cumulatively, small imprecisions in measurement can limit our ability to measure treatment response accurately. Logical and statistical consistency checks between item responses (i.e., cross-sectionally) and across administrations (i.e., longitudinally) can contribute to improving measurement fidelity. METHODS: The International Society for CNS Clinical Trials and Methodology convened an expert Working Group that assembled flags indicating consistency or inconsistency of ratings for the Hamilton Rating Scale for Depression (HAM-D17), a widely used rating scale in studies of depression. Proposed flags were applied to assessments derived from the NEWMEDS data repository of 95,468 HAM-D administrations from 32 registration trials of antidepressant medications and to Monte Carlo-simulated data as a proxy for applying flags under conditions of known inconsistency. RESULTS: Two types of flags were derived: logical consistency checks and statistical outlier-response pattern checks. Almost 30% of the HAM-D administrations had at least one logical scoring inconsistency flag. Seven percent had flags judged to suggest that a thorough review of the rating is warranted. Almost 22% of the administrations had at least one statistical outlier flag, and 7.9% had more than one. Most of the administrations in the Monte Carlo-simulated data raised multiple flags. LIMITATIONS: Flagged ratings may represent less-common presentations of administrations done correctly. CONCLUSIONS: Application of flags to clinical ratings may aid in detecting imprecise measurement. Reviewing and addressing these flags may improve reliability and validity of clinical trial data.
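As a rough illustration of the two kinds of check described above (cross-sectional and longitudinal), the following is a minimal Python sketch. It does not reproduce any of the Working Group's flags: the rule name, the item keys `suicide` and `depressed_mood`, and the `max_change` threshold are all hypothetical. The fixed-threshold change check is only a simple stand-in for the statistical outlier-response pattern checks.

```python
from typing import Callable, Dict, List

Ratings = Dict[str, int]  # item name -> item score for one administration

# Hypothetical cross-sectional rule; the name and cut-offs are illustrative
# assumptions, not flags taken from the published work.
LOGICAL_RULES: Dict[str, Callable[[Ratings], bool]] = {
    "high_suicide_with_no_depressed_mood":
        lambda r: r["suicide"] >= 3 and r["depressed_mood"] == 0,
}


def logical_flags(ratings: Ratings) -> List[str]:
    """Return the names of all cross-sectional rules this administration violates."""
    return [name for name, rule in LOGICAL_RULES.items() if rule(ratings)]


def large_change_flags(totals: List[int], max_change: int = 15) -> List[int]:
    """Return indices of administrations whose total score differs from the
    previous administration by more than max_change (a hypothetical threshold)."""
    return [
        i for i in range(1, len(totals))
        if abs(totals[i] - totals[i - 1]) > max_change
    ]


if __name__ == "__main__":
    visit = {"suicide": 3, "depressed_mood": 0}
    print(logical_flags(visit))
    print(large_change_flags([24, 22, 5, 6]))
```

A flagged administration in such a scheme would be queued for review rather than treated as wrong, since a flag may reflect an uncommon but correctly rated presentation.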
The International Society for CNS Clinical Trials and Methodology convened an expert Working Group that assembled consistency/inconsistency flags for the Personal and Social Performance Scale (PSP). One hundred and forty-seven flags were identified: 16 flag errors in deriving the PSP decile (i.e., total) score from the four individual domain scores, 74 flag inconsistencies between domain scores and Positive and Negative Syndrome Scale (PANSS) item ratings, and 57 flag inconsistencies between the PSP decile score and PANSS item ratings. The flags were applied to almost 18,000 ratings from randomized clinical trials of antipsychotics in schizophrenia. Twenty-two flags were raised in at least 5 of 1,000 ratings. Nearly 20% of the PSP ratings had at least one inconsistency flag raised. Application of flags to clinical ratings may improve the reliability of ratings and validity of trials.