BACKGROUND: Mitigating rating inconsistency can improve measurement fidelity and detection of treatment response. METHODS: The International Society for CNS Clinical Trials and Methodology convened an expert Working Group that developed logical consistency (LC) checks for ratings of the Young Mania Rating Scale (YMRS), which is widely used in studies of mood and bipolar disorders. LC and statistical outlier-response pattern checks (SC) were applied to 63,228 YMRS administrations from 14 clinical trials evaluating treatments for bipolar disorder. Checks were also applied to Monte Carlo-simulated data as a proxy for their use under conditions of inconsistency. RESULTS: 42 LC flags were developed, and four SC flags were created from the data set (n = 14). Almost 20 % of the rating administrations had at least one LC flag, 6.7 % had two or more, and 1.7 % had three or more; 17.3 % of the administrations had at least one SC flag and 4.6 % had two or more. Overall, 31 % of administrations had at least one flag of any type, 12.1 % had two or more, and 5.3 % had three or more. In acute antimanic treatment trials (n = 10) there were more flags of any type than in relapse prevention trials (n = 4). LIMITATIONS: Flagged ratings may represent less-common presentations assessed correctly. CONCLUSIONS: Using established methods, we illustrate development and application of consistency flags for YMRS ratings. Applying flags and mitigation during trials may improve the value of YMRS data, help focus attention on rater training, and improve reliability and validity of trial data.
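The cumulative percentages reported above ("at least one", "two or more", "three or more") can be tallied from per-administration flag counts. The following is a minimal sketch only: the function name `share_with_at_least` and the toy input are hypothetical and are not taken from the study.

```python
from typing import Dict, Iterable, Sequence


def share_with_at_least(
    flag_counts: Sequence[int],
    thresholds: Iterable[int] = (1, 2, 3),
) -> Dict[int, float]:
    """Return, for each threshold k, the share of administrations with >= k flags.

    flag_counts holds one integer per administration: the number of flags
    (of whatever type is being tallied) raised by that administration.
    """
    n = len(flag_counts)
    if n == 0:
        raise ValueError("flag_counts must contain at least one administration")
    return {k: sum(count >= k for count in flag_counts) / n for k in thresholds}


# Toy data (hypothetical): ten administrations and their flag counts.
toy_counts = [0, 0, 1, 2, 3, 0, 0, 0, 1, 0]
shares = share_with_at_least(toy_counts)
```

Because the thresholds are cumulative, the share with two or more flags is always a subset of the share with at least one, which matches how the abstract reports its figures.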
BACKGROUND: Mitigating rating inconsistency can improve measurement fidelity and detection of treatment response. METHODS: The International Society for CNS Clinical Trials and Methodology convened an expert Working Group that developed consistency checks for ratings of the Hamilton Anxiety Rating Scale (HAM-A) and Clinical Global Impression of Severity of anxiety (CGIS), which are widely used in studies of mood and anxiety disorders. Flags were applied to 40,349 HAM-A administrations from 15 clinical trials and to Monte Carlo-simulated data as a proxy for applying flags under conditions of inconsistency. RESULTS: Thirty-three flags were derived; these included logical consistency checks and statistical outlier-response pattern checks. Twenty percent of the HAM-A administrations had at least one logical scoring inconsistency flag and 4 % had two or more. Twenty-six percent of the administrations had at least one statistical outlier flag and 11 % had two or more. Overall, 35 % of administrations had at least one flag of any type: 19 % had one and 16 % had two or more. Most of the administrations in the Monte Carlo-simulated data raised multiple flags. LIMITATIONS: Flagged ratings may represent less-common presentations of administrations done correctly. CONCLUSIONS: Application of flags to clinical ratings may aid in detecting imprecise measurement. Flags can be used for monitoring of raters during an ongoing trial and as part of post-trial evaluation. Applying flags may improve reliability and validity of trial data.
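The idea of using Monte Carlo-simulated data as a proxy for inconsistent rating can be sketched as follows. This is an illustration under stated assumptions, not the Working Group's procedure: item scores are drawn independently and uniformly (an assumption; the abstract does not describe the simulation design), and `extreme_spread_flag` is a hypothetical flag invented for this example. The only scale facts relied on are that the HAM-A has 14 items, each scored 0 to 4.

```python
import random
from typing import List

N_ITEMS = 14        # the HAM-A has 14 items
MAX_ITEM_SCORE = 4  # each item is scored 0 to 4


def simulate_administration(rng: random.Random) -> List[int]:
    """Draw one administration with independent, uniformly random item scores.

    Assumption: uniform random scoring stands in for an inconsistent rater.
    """
    return [rng.randint(0, MAX_ITEM_SCORE) for _ in range(N_ITEMS)]


def extreme_spread_flag(items: List[int]) -> bool:
    """Hypothetical flag: the same administration contains both the lowest
    and the highest possible item score."""
    return min(items) == 0 and max(items) == MAX_ITEM_SCORE


def flag_rate(n_sims: int, seed: int = 0) -> float:
    """Share of simulated administrations that raise the hypothetical flag."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    hits = sum(
        extreme_spread_flag(simulate_administration(rng)) for _ in range(n_sims)
    )
    return hits / n_sims


rate = flag_rate(2000, seed=1)
```

Under these assumptions the flag fires on roughly nine in ten simulated administrations (analytically, 1 − 2·0.8¹⁴ + 0.6¹⁴ ≈ 0.91), which illustrates why random scoring tends to raise flags far more often than real trial data would.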
- MeSH
- Humans MeSH
- Psychiatric Status Rating Scales MeSH
- Psychometrics MeSH
- Reproducibility of Results MeSH
- Anxiety * MeSH
- Anxiety Disorders * diagnosis drug therapy MeSH
- Check Tag
- Humans MeSH
- Publication Type
- Journal Article MeSH
BACKGROUND: Symptom manifestations in mood disorders can be subtle. Cumulatively, small imprecisions in measurement can limit our ability to measure treatment response accurately. Logical and statistical consistency checks between item responses (i.e., cross-sectionally) and across administrations (i.e., longitudinally) can contribute to improving measurement fidelity. METHODS: The International Society for CNS Clinical Trials and Methodology convened an expert Working Group that assembled flags indicating consistency or inconsistency of ratings for the Hamilton Rating Scale for Depression (HAM-D17), a widely used rating scale in studies of depression. Proposed flags were applied to assessments derived from the NEWMEDS data repository of 95,468 HAM-D administrations from 32 registration trials of antidepressant medications and to Monte Carlo-simulated data as a proxy for applying flags under conditions of known inconsistency. RESULTS: Two types of flags were derived: logical consistency checks and statistical outlier-response pattern checks. Almost 30% of the HAM-D administrations had at least one logical scoring inconsistency flag. Seven percent had flags judged to suggest that a thorough review of the rating is warranted. Almost 22% of the administrations had at least one statistical outlier flag and 7.9% had more than one. Most of the administrations in the Monte Carlo-simulated data raised multiple flags. LIMITATIONS: Flagged ratings may represent less-common presentations of administrations done correctly. CONCLUSIONS: Application of flags to clinical ratings may aid in detecting imprecise measurement. Reviewing and addressing these flags may improve reliability and validity of clinical trial data.
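The distinction between cross-sectional checks (between items within one administration) and longitudinal checks (across administrations) can be illustrated with a short sketch. Both rules below are hypothetical examples chosen for illustration; the abstract does not list the Working Group's actual flags, and the item keys `depressed_mood` and `suicide` are names assumed here for two HAM-D17 items.

```python
from typing import Dict, List, Sequence


def cross_sectional_flags(items: Dict[str, int]) -> List[str]:
    """Check item scores against each other within a single administration.

    Hypothetical rule: a high suicide item score alongside a depressed mood
    score of zero is unusual enough to merit review.
    """
    flags: List[str] = []
    if items["depressed_mood"] == 0 and items["suicide"] >= 3:
        flags.append("high_suicide_without_depressed_mood")
    return flags


def longitudinal_flags(previous: Sequence[int], current: Sequence[int]) -> List[str]:
    """Compare item scores from two consecutive administrations.

    Hypothetical rule: every item scored identically at both visits, with a
    non-zero total, may indicate scores carried forward rather than re-rated.
    """
    if len(previous) != len(current):
        raise ValueError("both administrations must have the same number of items")
    flags: List[str] = []
    if list(previous) == list(current) and sum(current) > 0:
        flags.append("identical_item_scores_across_visits")
    return flags


# Toy usage with made-up scores.
visit = {"depressed_mood": 0, "suicide": 3}
within_visit = cross_sectional_flags(visit)
across_visits = longitudinal_flags([2, 1, 0, 1], [2, 1, 0, 1])
```

As the LIMITATIONS statement notes, a raised flag marks a rating for review and does not by itself establish that the rating is wrong.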