Interobserver and intraobserver agreement
Purpose: In a longitudinal prospective study we follow the development of optic disc changes in open-angle glaucoma. To assess changes in the cup, the neuroretinal rim and the retinal nerve fibre layer, one must know how much error arises when the same physician assesses a disc repeatedly (intraobserver agreement) and when different physicians assess the same disc (interobserver agreement). Methods: 85 colour slides of the optic discs of patients followed for open-angle glaucoma or ocular hypertension were evaluated by four ophthalmologists with differing experience in glaucoma. Repeated evaluations by the same physician were compared, as were the evaluations of the individual observers. The horizontal and vertical cup-to-disc (C/D) ratio, the neuroretinal rim, circumpapillary changes and the overall impression (glaucoma yes/suspect/no) were assessed. Results: Intraobserver agreement was best for the observer working in the glaucoma outpatient department and for another observer, who evaluated only the 51 discs that seemed to him clearly assessable. For the other observers the findings were less reliable, with differences of 0.2 DD or more in up to 4.3%. All four observers rated a disc as glaucomatous in only 30.3%. Conclusions: Repeated assessments agree well for observers who specialise in glaucoma; for the others they agree only where clearly assessable discs are described. Judging whether a disc is glaucomatous from the C/D ratio is not very reliable.
OBJECTIVE: To evaluate interobserver agreement for the assessment of local tumor extension in women with cervical cancer, among experienced and less experienced observers, using transvaginal ultrasound (TVS) and magnetic resonance imaging (MRI). METHODS: The TVS observers were all gynecologists and consultant ultrasound specialists, six with and seven without previous experience in cervical cancer imaging. The MRI observers were five radiologists experienced in pelvic MRI and four less experienced radiology residents without previous experience in MRI of the pelvis. The less experienced TVS observers and all MRI observers underwent a short basic training session in the assessment of cervical tumor extension, while the experienced TVS observers received only a written directive. All observers were assigned the same images from cervical cancer patients at all stages (n = 60) and performed offline evaluation to answer the following three questions: (1) Is there a visible primary tumor? (2) Does the tumor infiltrate > ⅓ of the cervical stroma? and (3) Is there parametrial invasion? Interobserver agreement within the four groups of observers was assessed using Fleiss kappa (κ) with 95% CI. RESULTS: Experienced and less experienced TVS observers, respectively, had moderate interobserver agreement with respect to tumor detection (κ (95% CI), 0.46 (0.40-0.53) and 0.46 (0.41-0.52)), stromal invasion > ⅓ (κ (95% CI), 0.45 (0.38-0.51) and 0.53 (0.40-0.58)) and parametrial invasion (κ (95% CI), 0.57 (0.51-0.64) and 0.44 (0.39-0.50)). 
Experienced MRI observers had good interobserver agreement with respect to tumor detection (κ (95% CI), 0.70 (0.62-0.78)), while less experienced MRI observers had moderate agreement (κ (95% CI), 0.51 (0.41-0.62)), and both experienced and less experienced MRI observers, respectively, had good interobserver agreement regarding stromal invasion (κ (95% CI), 0.80 (0.72-0.88) and 0.71 (0.61-0.81)) and parametrial invasion (κ (95% CI), 0.69 (0.61-0.77) and 0.71 (0.61-0.81)). CONCLUSIONS: We found interobserver agreement for the assessment of local tumor extension in patients with cervical cancer to be moderate for TVS and moderate-to-good for MRI. The level of interobserver agreement was associated with experience among TVS observers only for parametrial invasion. © 2021 The Authors. Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of International Society of Ultrasound in Obstetrics and Gynecology.
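As a rough illustration of the statistic named in the abstract above, Fleiss' kappa for several observers rating the same cases can be sketched as follows. The function name `fleiss_kappa` and the toy ratings are hypothetical; the study's data and software are not given in the abstract, and the reported confidence intervals are not reproduced here.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of cases, each a list of labels (one per rater).

    Every case must carry the same number of ratings.
    """
    if not ratings:
        raise ValueError("no cases given")
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")

    counts = [Counter(case) for case in ratings]
    categories = sorted({label for case in ratings for label in case})

    # Observed agreement: share of agreeing rater pairs, averaged over cases.
    p_obs = sum(
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ) / n_cases

    # Chance agreement from the overall share of each category.
    shares = [sum(c[k] for c in counts) / (n_cases * n_raters) for k in categories]
    p_exp = sum(s * s for s in shares)

    if p_exp == 1.0:
        return float("nan")  # only one category was ever used: kappa is undefined
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical toy data: 4 cases, 3 raters answering one yes/no question.
toy_ratings = [
    ["yes", "yes", "yes"],
    ["no", "no", "no"],
    ["yes", "yes", "no"],
    ["no", "no", "yes"],
]
print(round(fleiss_kappa(toy_ratings), 3))
```

Each of the three questions in the study would be passed through such a function separately, once per observer group.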
- MeSH
- cervix uteri diagnostic imaging MeSH
- adult MeSH
- gynecology statistics and numerical data MeSH
- clinical competence statistics and numerical data MeSH
- middle aged MeSH
- humans MeSH
- magnetic resonance imaging methods statistics and numerical data MeSH
- uterine cervical neoplasms diagnostic imaging pathology MeSH
- observer variation MeSH
- radiology statistics and numerical data MeSH
- reproducibility of results MeSH
- neoplasm staging methods statistics and numerical data MeSH
- ultrasonography methods statistics and numerical data MeSH
- vagina diagnostic imaging MeSH
- Check Tag
- adult MeSH
- middle aged MeSH
- humans MeSH
- female MeSH
- Publication type
- journal article MeSH
- evaluation study MeSH
- MeSH
- quadriceps muscle * diagnostic imaging MeSH
- critical illness * MeSH
- humans MeSH
- observer variation MeSH
- ultrasonography MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
Staging criteria for renal cell carcinoma differ from many other cancers, in that renal tumors are often spherical with subtle, finger-like extensions into veins, renal sinus, or perinephric tissue. We sought to study interobserver agreement in pathologic stage categories for challenging cases. An online survey was circulated to urologic pathologists interested in kidney tumors, yielding 89% response (31/35). Most questions included 1 to 4 images, focusing on: vascular and renal sinus invasion (n=24), perinephric invasion (n=9), and gross pathology/specimen handling (n=17). Responses were collapsed for analysis into positive and negative/equivocal for upstaging. Consensus was regarded as an agreement of 67% (2/3) of participants, which was reached in 20/33 (61%) evaluable scenarios regarding renal sinus, perinephric, or vein invasion, of which 13/33 (39%) had ≥80% consensus. Lack of agreement was especially encountered regarding small tumor protrusions into a possible vascular lumen, close to the tumor leading edge. For gross photographs, most were interpreted as suspicious but requiring histologic confirmation. Most participants (61%) rarely used special stains to evaluate vascular invasion, usually endothelial markers (81%). Most agreed that a spherical mass bulging well beyond the kidney parenchyma into the renal sinus (71%) or perinephric fat (90%) did not necessarily indicate invasion. Interobserver agreement in pathologic staging of renal cancer is relatively good among urologic pathologists interested in kidney tumors, even when selecting cases that test the earliest and borderline thresholds for extrarenal extension. Disagreements remain, however, particularly for tumors with small, finger-like protrusions, closely juxtaposed to the main mass.
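The consensus rule described above (responses collapsed into positive versus negative/equivocal, with consensus at two thirds of participants) can be sketched as below. The function names and the response labels are hypothetical, and treating the more common collapsed answer as the consensus answer is an assumption about how the rule was applied.

```python
from fractions import Fraction


def consensus_share(responses):
    """Share of participants giving the more common collapsed answer.

    Responses are collapsed into "positive" versus everything else
    (negative or equivocal), mirroring the description in the abstract.
    """
    if not responses:
        raise ValueError("no responses given")
    positive = sum(1 for r in responses if r == "positive")
    other = len(responses) - positive
    return Fraction(max(positive, other), len(responses))


def has_consensus(responses, threshold=Fraction(2, 3)):
    """True when the more common collapsed answer reaches the threshold."""
    return consensus_share(responses) >= threshold


# Hypothetical scenario answered by 31 participants.
scenario = ["positive"] * 21 + ["negative"] * 10
print(has_consensus(scenario), has_consensus(scenario, Fraction(4, 5)))
```

Exact fractions are used so that a share of exactly two thirds is not lost to floating-point rounding.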
- MeSH
- carcinoma, renal cell pathology MeSH
- pathology, clinical methods MeSH
- humans MeSH
- kidney neoplasms pathology MeSH
- observer variation MeSH
- pathologists MeSH
- neoplasm staging methods MeSH
- urology methods MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
Tubular adenoma (TA) and syringocystadenoma papilliferum (SCAP) may show histopathological overlap, with some lesions having features of both neoplasms (SCAP + TA). TA has been recently suggested to represent a carcinoma. Four observers blindly assessed 67 cases of TA, SCAP, and their lookalikes (poroma, apocrine adenoma, apocrine carcinoma; all lesions focally featuring a pseudopapillary pattern), and classified the lesions into one of four categories: (1) TA, (2) SCAP, (3) SCAP + TA, and (4) others. Lesions were also classified as benign or malignant. In only 29 cases was there unanimous agreement among the four observers, who classified 22 lesions as TA, three as SCAP, and four cases as others. Of the 38 cases where there was interobserver diagnostic variation, in 30, the diagnosis varied between TA or SCAP or SCAP + TA; the remainder fell in the others category. Analysis of the factors leading to interobserver variability indicated that diagnostic problems occurred when there were any of the following: epidermal acanthosis, papillomatosis, connection of the neoplastic tubules to the overlying epidermis and/or follicular infundibula, and plasma cell infiltration. These features accounted for the morphological overlap between TA and SCAP. All observers agreed that the lesions were benign; the only apocrine carcinoma included was recognized as such by all observers. From the study, it was concluded that TA may arise in the deep dermis without any epidermal connection, or, in other cases, it may be more superficially located with or without an epidermal connection. It may be reasonably inferred that, possibly as a response to infection, there may be accompanying plasma cells and variable acanthosis and papillomatosis, such that the appearances are those of "pure" SCAP, or lesions may have features "intermediate" or overlapping between TA and SCAP.
- MeSH
- adenoma, sweat gland diagnosis classification MeSH
- cystadenoma diagnosis classification MeSH
- dermatology methods MeSH
- diagnosis, differential MeSH
- adult MeSH
- middle aged MeSH
- humans MeSH
- adolescent MeSH
- neoplasms, multiple primary diagnosis MeSH
- sweat gland neoplasms diagnosis classification MeSH
- observer variation MeSH
- pathology methods MeSH
- aged, 80 and over MeSH
- aged MeSH
- Check Tag
- adult MeSH
- middle aged MeSH
- humans MeSH
- adolescent MeSH
- male MeSH
- aged, 80 and over MeSH
- aged MeSH
- female MeSH
- Publication type
- multicenter study MeSH
Polymorphous adenocarcinoma (PAC) shows histologic diversity with streaming and targetoid features whereas cribriform adenocarcinoma of salivary gland (CASG) demonstrates predominantly cribriform and solid patterns with glomeruloid structures and optically clear nuclei. Opinions diverge on whether CASG represents a separate entity or a variant of PAC. We aimed to assess the level of agreement among 25 expert Head and Neck pathologists in classifying these tumors. Digital slides of 48 cases were reviewed and classified as: PAC, CASG, tumors with ≥50% of papillary architecture (PAP), and tumors with indeterminate features (IND). The consensus diagnoses were correlated with a previously reported molecular alteration. The consensus diagnoses were PAC in 18/48, CASG in 16/48, PAP in 3/48, and IND in 11/48. There was a fair interobserver agreement in classifying the tumors (κ=0.370). The full consensus was achieved in 3 (6%) cases, all of which were classified as PAC. A moderate agreement was reached for PAC (κ=0.504) and PAP (κ=0.561), and a fair agreement was reached for CASG (κ=0.390). IND had only slight diagnostic concordance (κ=0.091). PAC predominantly harbored PRKD1 hotspot mutation, whereas CASG was associated with fusion involving PRKD1, PRKD2, or PRKD3. However, such molecular events were not exclusive as 7% of PAC had fusion and 13% of CASG had mutation. In conclusion, a fair to moderate interobserver agreement can be achieved in classifying PAC and CASG. However, a subset (23%) showed indeterminate features and was difficult to place along the morphologic spectrum of PAC/CASG among expert pathologists. This may explain the controversy in classifying these tumors.
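The abstract above reports a separate κ for each diagnostic category. One common way to obtain such values is the category-specific form of Fleiss' kappa, sketched below; whether the authors used this exact formula is an assumption, and the function name and toy ratings are hypothetical.

```python
def category_kappas(ratings):
    """Category-specific Fleiss' kappa for each label.

    ratings: list of cases, each a list of labels (one per rater), with the
    same number of raters for every case.
    """
    if not ratings:
        raise ValueError("no cases given")
    n_cases = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(case) != n_raters for case in ratings):
        raise ValueError("each case needs the same number (>= 2) of ratings")

    result = {}
    for label in sorted({x for case in ratings for x in case}):
        per_case = [case.count(label) for case in ratings]
        share = sum(per_case) / (n_cases * n_raters)
        # Rater pairs that split on "this label" versus "any other label".
        split = sum(n * (n_raters - n) for n in per_case)
        denom = n_cases * n_raters * (n_raters - 1) * share * (1 - share)
        result[label] = float("nan") if denom == 0 else 1 - split / denom
    return result


# Hypothetical toy data: 4 cases, 3 raters.
toy_cases = [
    ["PAC", "PAC", "PAC"],
    ["CASG", "CASG", "CASG"],
    ["PAC", "PAC", "IND"],
    ["CASG", "IND", "IND"],
]
for label, kappa in category_kappas(toy_cases).items():
    print(label, round(kappa, 3))
```

In this toy data the indeterminate label agrees far less well than the two named diagnoses, the same kind of pattern the abstract describes for IND.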
- MeSH
- adenocarcinoma classification genetics pathology MeSH
- biopsy MeSH
- gene fusion MeSH
- genetic predisposition to disease MeSH
- in situ hybridization, fluorescence MeSH
- real-time polymerase chain reaction MeSH
- humans MeSH
- mutation MeSH
- DNA mutational analysis MeSH
- biomarkers, tumor genetics MeSH
- salivary gland neoplasms classification genetics pathology MeSH
- observer variation MeSH
- predictive value of tests MeSH
- reproducibility of results MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- multicenter study MeSH
- Research Support, N.I.H., Extramural MeSH
- Geographic names
- Europe MeSH
- Canada MeSH
- United States MeSH
Aim: The aim of the study was to investigate the inter-rater agreement (IRA) of the 8-item Brief Bedside Dysphagia Screening Test-Revised (BBDST-R). Design: An observational IRA study was conducted. Methods: Forty-six patients with stroke were independently assessed by two nurse raters using the BBDST-R. Rater agreement was described and analysed using descriptive statistics, Cohen's kappa (κ), the proportion of observed agreement (Po), the prevalence index (pindex), and the bias index (bindex). Results: Kappa ranged from -0.046 (p = 0.641; 95% CI = -0.126–0.034) for the item “ability to clench the teeth” to 0.784 (p < 0.001; 95% CI = 0.377–1.000) for “thick liquid: cough”. For the overall BBDST-R result, κ was 0.241 (p = 0.114; 95% CI = 0.000–0.558). The Po was ≥ 0.80 for five items; the pindex was ≥ 0.80 for three items. The bindex ranged from 0.03 to 0.31. Conclusion: The IRA, as expressed by κ, was low for the overall result and variable for the individual items. However, for some items, κ may have misrepresented the IRA due to the high pindex and bindex. Several strategies are recommended to improve the IRA of the instrument. This should enhance the instrument's capacity to produce consistent results across raters.
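To make the point about κ and a high prevalence index concrete, the sketch below computes the four quantities named above for one yes/no item rated by two raters. The formulas for the prevalence and bias indices are the commonly used 2x2 definitions and are an assumption here, since the abstract does not state them; the function name and the counts are hypothetical.

```python
def two_rater_indices(both_yes, only_first_yes, only_second_yes, both_no):
    """Agreement indices for one binary item rated by two raters.

    The four arguments are the cells of the 2x2 cross-table of the raters.
    """
    n = both_yes + only_first_yes + only_second_yes + both_no
    if n == 0:
        raise ValueError("empty table")

    p_obs = (both_yes + both_no) / n
    # Chance agreement from each rater's marginal "yes" and "no" totals.
    p_exp = (
        (both_yes + only_first_yes) * (both_yes + only_second_yes)
        + (only_second_yes + both_no) * (only_first_yes + both_no)
    ) / n**2
    kappa = float("nan") if p_exp == 1.0 else (p_obs - p_exp) / (1.0 - p_exp)

    return {
        "po": p_obs,
        "kappa": kappa,
        "prevalence_index": abs(both_yes - both_no) / n,  # assumed definition
        "bias_index": abs(only_first_yes - only_second_yes) / n,  # assumed definition
    }


# Hypothetical item: 46 patients, almost all rated "yes" by both raters.
for name, value in two_rater_indices(40, 3, 3, 0).items():
    print(name, round(value, 3))
```

With these made-up counts the raters agree on 40 of 46 patients, yet κ comes out slightly negative because nearly every rating falls in one category, which is the situation the conclusion warns about.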
- MeSH
- stroke complications MeSH
- data interpretation, statistical MeSH
- clinical competence MeSH
- humans MeSH
- observer variation * MeSH
- nursing assessment * methods standards statistics and numerical data MeSH
- deglutition disorders * diagnosis nursing MeSH
- reproducibility of results MeSH
- symptom assessment MeSH
- point-of-care testing MeSH
- Check Tag
- humans MeSH
- Publication type
- observational study MeSH
- research support, non-U.S. gov't MeSH
Purpose of the study: Several classification systems exist for trochanteric fractures of the proximal femur, among them the widely used AO/ASIF and Evans classifications. To be useful for choosing a method of treatment and as a precise means of communication between surgeons, a classification system must be reliable and reproducible. The aim of this study was to compare the interobserver reliability and intraobserver reproducibility of the AO/ASIF and Evans classifications. Material and Methods: Plain preoperative and first postoperative radiographs of 39 consecutive patients with a trochanteric fracture were classified according to the AO/ASIF (with and without subgroups) and Evans classifications by five observers (two experienced trauma surgeons, an experienced radiologist and two medical students). The same radiographs were classified by the same observers three months later. Agreement was assessed with the weighted kappa coefficient (κ). Results: The mean κ value for interobserver reliability was 0.43 with the AO/ASIF subgroups (31 A1.1-A3.3), 0.51 with the AO/ASIF main groups (31 A1-A3), and 0.37 with the Evans classification. The mean κ value for intraobserver reproducibility was 0.50, 0.53 and 0.42, respectively. Complete agreement among all observers was obtained in 18 cases (46%) with the AO/ASIF main groups and in 2 cases (5%) with the AO/ASIF subgroups in both sessions, and with the Evans classification in 5 cases (13%) in the first session and 4 cases (10%) in the second. Discussion and Conclusions: Our data confirm the findings of other authors that both classifications have moderate to fair inter- and intraobserver reliability, with the highest mean κ for the AO/ASIF classification into main groups. Using the first postoperative radiographs did not increase the mean κ in our series compared with values reported by other authors.
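A weighted kappa for two observers, and its mean over all observer pairs, can be sketched as follows. The abstract does not say which weights were used or how the mean was formed, so linear weights and a simple average over pairs are assumptions, and the function names and toy ratings are hypothetical.

```python
from itertools import combinations


def weighted_kappa(first, second, categories):
    """Linearly weighted kappa for two raters over ordered categories."""
    k = len(categories)
    n = len(first)
    if k < 2 or n == 0 or n != len(second):
        raise ValueError("need >= 2 categories and two equal, non-empty rating lists")

    position = {label: i for i, label in enumerate(categories)}
    a = [position[x] for x in first]
    b = [position[x] for x in second]

    def weight(i, j):
        # Full credit for identical grades, partial credit for near misses.
        return 1 - abs(i - j) / (k - 1)

    p_obs = sum(weight(i, j) for i, j in zip(a, b)) / n
    share_a = [a.count(i) / n for i in range(k)]
    share_b = [b.count(i) / n for i in range(k)]
    p_exp = sum(
        weight(i, j) * share_a[i] * share_b[j] for i in range(k) for j in range(k)
    )
    if p_exp == 1.0:
        return float("nan")
    return (p_obs - p_exp) / (1.0 - p_exp)


def mean_pairwise_kappa(all_ratings, categories):
    """Average weighted kappa over every pair of raters (assumed aggregation)."""
    pairs = list(combinations(all_ratings, 2))
    return sum(weighted_kappa(x, y, categories) for x, y in pairs) / len(pairs)


# Hypothetical toy data: three raters grading four fractures into main groups.
groups = ["A1", "A2", "A3"]
rater_1 = ["A1", "A2", "A3", "A1"]
rater_2 = ["A1", "A2", "A3", "A2"]
rater_3 = ["A1", "A2", "A3", "A1"]
print(round(weighted_kappa(rater_1, rater_2, groups), 3))
print(round(mean_pairwise_kappa([rater_1, rater_2, rater_3], groups), 3))
```

Intraobserver reproducibility would use the same function with one observer's first-session and second-session ratings as the two lists.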
RATIONALE, AIMS AND OBJECTIVES: To evaluate obstetricians' inter- and intra-observer agreement on intrapartum cardiotocogram (CTG) recordings and to examine obstetricians' evaluations with respect to umbilical artery pH and base deficit. METHODS: Nine experienced obstetricians annotated 634 intrapartum CTG recordings. The evaluation of each recording was divided into four steps: evaluation of two 30-minute windows in the first stage of labour, evaluation of one window in the second stage of labour and labour outcome prediction. The complete set of evaluations used for this experiment is available online. The inter- and intra-observer agreement was evaluated using the proportion of agreement and the kappa coefficient. Clinicians' sensitivity and specificity were computed with respect to umbilical artery pH, base deficit and the Apgar score at the fifth minute. RESULTS: The overall proportion of agreement between clinicians reached 48% (95% confidence interval (CI): 47-50). Regarding the different classes, the proportion of agreement ranged from 57% (CI: 54-60) for the normal class to 41% (CI: 36-46) for the pathological class. The sensitivity of clinicians' majority vote to objective outcome was 39% (CI: 16-63) for the umbilical artery base deficit and 27% (CI: 16-42) for pH. The specificity was 89% (CI: 86-92) for both types of objective outcome. CONCLUSIONS: The reported inter-/intra-observer variability is large and this holds irrespective of clinicians' experience or work place. The results support the need for modernized guidelines for CTG evaluation and/or objectivization and repeatability by introduction of a computerized approach that could standardize the process of CTG evaluation within the delivery ward.
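The comparison of the clinicians' majority vote with an objective outcome can be sketched as below. Reducing each annotation to a yes/no "pathological" call and each outcome to a yes/no "adverse" flag is a simplifying assumption; the abstract does not give the outcome thresholds, and the function names and toy data are hypothetical.

```python
def majority_vote(votes):
    """True when more than half of the raters called the recording pathological."""
    return 2 * sum(votes) > len(votes)


def sensitivity_specificity(predicted, actual):
    """Sensitivity and specificity of boolean predictions against boolean outcomes."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    true_neg = sum(1 for p, a in pairs if not p and not a)
    false_pos = sum(1 for p, a in pairs if p and not a)

    adverse = true_pos + false_neg
    normal = true_neg + false_pos
    sensitivity = true_pos / adverse if adverse else float("nan")
    specificity = true_neg / normal if normal else float("nan")
    return sensitivity, specificity


# Hypothetical toy data: 4 recordings, 3 raters each, plus the observed outcome.
annotations = [
    [True, True, False],
    [False, False, True],
    [True, False, False],
    [False, False, False],
]
adverse_outcome = [True, True, False, False]

calls = [majority_vote(v) for v in annotations]
print(sensitivity_specificity(calls, adverse_outcome))
```

With an odd number of raters, as in the study, a binary majority vote cannot tie.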
- MeSH
- cardiotocography statistics and numerical data MeSH
- clinical competence * MeSH
- hydrogen-ion concentration MeSH
- humans MeSH
- observer variation MeSH
- obstetrics statistics and numerical data MeSH
- reproducibility of results MeSH
- sensitivity and specificity MeSH
- software MeSH
- Check Tag
- humans MeSH
- Publication type
- journal article MeSH
- research support, non-U.S. gov't MeSH
OBJECTIVES: The aim is to estimate agreement between two-dimensional transvaginal ultrasound (2D-TVU) and three-dimensional volume contrast imaging (3D-VCI) in diagnosing deep myometrial invasion (MI) and cervical stromal involvement (CSI) of endometrial cancer and to compare the two methods regarding inter-rater reliability and diagnostic accuracy. METHODS: Fifteen ultrasound experts assessed off-line de-identified 3D-VCI volumes and 2D-TVU video clips from 58 patients with biopsy-confirmed endometrial cancer regarding the presence of deep (≥50%) MI and CSI. Video clips and 3D volumes were assessed independently. Interrater reliability was measured using kappa statistics. Histological diagnosis after hysterectomy served as the gold standard. Accuracy measurements were correlated to rater experience using Spearman's rank correlation coefficient (ρ). RESULTS: Agreement between 2D-TVU and 3D-VCI for diagnosing MI was median 76% (range 64-93%) and for CSI median 88% (range 79-97%). Interrater reliability was better for 2D-TVU than for 3D-VCI (Fleiss' kappa 0.41 vs. 0.31 for MI and 0.55 vs. 0.45 for CSI). Median accuracy for diagnosing deep MI was 76% (range 59-84%) with 2D-TVU and 69% (range 52-83%) for 3D-VCI; the corresponding figures for CSI were 88% (range 81-93%) and 86% (range 72-95%). Accuracy was significantly correlated to how many cases the raters assessed annually. CONCLUSIONS: Off-line assessment of MI and CSI in women with endometrial cancer using 3D-VCI has lower interrater reliability and lower accuracy than 2D-TVU video clip assessment. Since accuracy was correlated to the number of cases assessed annually it is advised to centralize these examinations to high-volume centres.
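The correlation between rater experience and accuracy mentioned above uses Spearman's ρ, which is the Pearson correlation of the ranks. A minimal sketch follows; the function names and the toy figures are hypothetical and are not the study's data.

```python
from math import sqrt


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the two rank lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # one list is constant: correlation is undefined
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: cases assessed per year and accuracy for five raters.
cases_per_year = [5, 10, 10, 30, 60]
accuracy = [0.59, 0.66, 0.72, 0.76, 0.84]
print(round(spearman_rho(cases_per_year, accuracy), 3))
```

Average ranks are used for ties because several raters may report the same annual caseload.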
- MeSH
- adult MeSH
- neoplasm invasiveness MeSH
- contrast media MeSH
- middle aged MeSH
- humans MeSH
- myometrium diagnostic imaging pathology MeSH
- endometrial neoplasms diagnostic imaging pathology surgery MeSH
- observer variation MeSH
- reproducibility of results MeSH
- retrospective studies MeSH
- aged, 80 and over MeSH
- aged MeSH
- neoplasm staging methods MeSH
- ultrasonography methods MeSH
- imaging, three-dimensional * MeSH
- Check Tag
- adult MeSH
- middle aged MeSH
- humans MeSH
- aged, 80 and over MeSH
- aged MeSH
- female MeSH
- Publication type
- journal article MeSH
- research support, non-U.S. gov't MeSH
- comparative study MeSH