Lesson 5: EVIDENCE FROM DIAGNOSTIC STUDIES

Matthew J. Thompson, GP & Senior Clinical Scientist

Department of Primary Health Care, University of Oxford

Lesson contents

• Basis of diagnosis
• Appraising diagnostic studies

What is diagnosis?

• Increases certainty about the presence or absence of disease
• Disease severity
• Monitoring the clinical course
• Assessing prognosis: risk/stage of disease
• Planning treatment
• Timeliness

Knottnerus, BMJ 2002

Diagnostic errors

• Most diagnostic errors are cognitive errors (Croskerry, Ann Emerg Med 2003):
  • Conditions of uncertainty
  • Thinking is pressured
  • Shortcuts are used

• Diagnostic errors (Diagnostic errors - The next frontier for Patient Safety. Newman-Toker, JAMA 2009):
  • 40,000-80,000 US hospital deaths from misdiagnosis per year
  • Adverse events, negligence cases and serious disability are more likely to be related to misdiagnosis than to drug errors

Diagnostic reasoning

• Diagnostic strategies are particularly important where patients present with a variety of conditions and possible diagnoses.

Diagnostic reasoning

• For example, what causes cough?
• Comprehensive history … examination … differential diagnosis … final diagnosis

• Cardiac failure (left sided), Chronic obstructive pulmonary disease, Lung abscess
• Pulmonary alveolar proteinosis, Wegener's granulomatosis, Bronchiectasis
• Pneumonia, Atypical pneumonia, Pulmonary hypertension
• Measles, Oropharyngeal cancer, Goodpasture's syndrome
• Pulmonary oedema, Pulmonary embolism, Mycobacterium tuberculosis
• Foreign body in respiratory tract, Diffuse panbronchiolitis, Bronchogenic carcinoma
• Broncholithiasis, Pulmonary fibrosis, Pneumocystis carinii
• Captopril, Whooping cough, Fasciola hepatica
• Gastroesophageal reflux, Schistosoma haematobium, Visceral leishmaniasis
• Enalapril, Pharyngeal pouch, Suppurative otitis media
• Upper respiratory tract infection, Arnold's nerve cough syndrome, Allergic bronchopulmonary aspergillosis
• Chlorine gas, Amyloidosis, Cyclophosphamide
• Tropical pulmonary eosinophilia, Simple pulmonary eosinophilia, Sulphur dioxide
• Tracheolaryngobronchitis, Extrinsic allergic alveolitis, Laryngitis
• Cryptogenic fibrosing alveolitis, Toluene di-isocyanate, Coal worker's pneumoconiosis
• Lisinopril, Functional disorders, Nitrogen dioxide, Fentanyl
• Asthma, Omapatrilat, Sinusitis
• Gabapentin, Cilazapril

• …… diagnostic reasoning

Appraising diagnostic tests

1. Are the results valid?

2. What are the results?

3. Will they help me look after my patients?

Basic design of a diagnostic accuracy study

Series of patients -> Index test -> Reference ("gold") standard -> Blinded cross-classification

Validity of diagnostic studies

1. Was an appropriate spectrum of patients included?

2. Were all patients subjected to the gold standard?

3. Was there an independent, blind or objective comparison with the gold standard?

1. Was an appropriate spectrum of patients included? Spectrum bias

Selected patients -> Index test -> Reference standard -> Blinded cross-classification

1. Was an appropriate spectrum of patients included? Spectrum bias

• You want to find out how good chest X-rays are for diagnosing pneumonia in the Emergency Department
• Best = all patients presenting with difficulty breathing get a chest X-ray
• Spectrum bias = only those patients in whom you really suspect pneumonia get a chest X-ray

2. Were all patients subjected to the gold standard? Verification (work-up) bias

Series of patients -> Index test -> Reference standard -> Blinded cross-classification

2. Were all patients subjected to the gold standard? Verification (work-up) bias

• You want to find out how good exercise ECG ("treadmill test") is for identifying patients with angina
• The gold standard is angiography
• Best = all patients get angiography
• Verification (work-up) bias = only patients who have a positive exercise ECG get angiography

3. Was there an independent, blind or objective comparison with the gold standard? Observer bias

Series of patients -> Index test -> Reference standard -> Unblinded cross-classification

3. Was there an independent, blind or objective comparison with the gold standard? Observer bias

• You want to find out how good exercise ECG is for identifying patients with angina
• All patients get the gold standard (angiography)
• Observer bias = the cardiologist who does the angiography knows what the exercise ECG showed (not blinded)

Incorporation Bias

Series of patients -> Index test -> Reference standard (which includes parts of the index test) -> Unblinded cross-classification

Differential Reference Bias

Series of patients -> Index test -> Ref. Std. A or Ref. Std. B -> Blinded cross-classification

Validity of diagnostic studies

1. Was an appropriate spectrum of patients included?

2. Were all patients subjected to the gold standard?

3. Was there an independent, blind or objective comparison with the gold standard?

Appraising diagnostic tests

1. Are the results valid?

2. What are the results?

3. Will they help me look after my patients?

Sensitivity, specificity, positive & negative predictive values, likelihood ratios … aaarrrggh!!

2 by 2 table

            Disease +               Disease -
Test +      a (true positives)      b (false positives)
Test -      c (false negatives)     d (true negatives)

2 by 2 table: sensitivity

            Disease +
Test +      a
Test -      c

Sensitivity = a / (a + c)

Proportion of people with the disease who have a positive test result.

… a highly sensitive test will not miss many people.

2 by 2 table: sensitivity

            Disease +
Test +      99
Test -      1

Sensitivity = a / (a + c)

Sensitivity = 99/100 = 99%

2 by 2 table: specificity

            Disease -
Test +      b
Test -      d

Specificity = d / (b + d)

Proportion of people without the disease who have a negative test result.

… a highly specific test will not falsely identify people as having the disease.

Tip …

• Sensitivity is useful to me
• Specificity isn't … I want to know about the false positives … so … use 1-specificity, which is the false positive rate

2 by 2 table: sensitivity and false positive rate

            Disease +     Disease -
Test +      a             b
Test -      c             d

Sensitivity = a / (a + c)

False positive rate = b / (b + d) (same as 1-specificity)

2 by 2 table: sensitivity and false positive rate

            Disease +     Disease -
Test +      99            10
Test -      1             90

Sensitivity = 99/100 = 99%

False positive rate = 10/100 = 10% (same as 1-specificity)
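The definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `two_by_two_metrics` and the dictionary keys are assumptions made for this example, while the cell labels a, b, c, d and the counts 99, 10, 1, 90 follow the slides.

```python
def two_by_two_metrics(a, b, c, d):
    """Compute accuracy measures from a 2 by 2 table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    sensitivity = a / (a + c)          # diseased people who test positive
    specificity = d / (b + d)          # non-diseased people who test negative
    false_positive_rate = b / (b + d)  # equals 1 - specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": false_positive_rate,
    }


# Counts taken from the worked 2 by 2 table on the slide.
metrics = two_by_two_metrics(a=99, b=10, c=1, d=90)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```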

Example

Your father went to his doctor and was told that his test for a disease was positive. He is really worried, and comes to ask you for help!

• After doing some reading, you find that for men of his age:
  • The prevalence of the disease is 30%
  • The test has a sensitivity of 50% and a specificity of 90%

• "Son, tell me what's the chance I have this disease?"
  • 100%: always
  • 50%: maybe
  • 0%: never

Prevalence of 30%, Sensitivity of 50%, Specificity of 90%

100 people
  30 disease +ve -> 15 test +ve (sensitivity = 50%)
  70 disease -ve -> 7 test +ve (false positive rate = 10%)

22 people test positive, of whom 15 have the disease.

So, the chance of disease is 15/22, about 70%.
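The natural-frequency reasoning above (start from 100 people, split by disease status, then count the positives) can be sketched in Python. The function name `natural_frequencies` and its `population` parameter are assumptions made for this illustration; the inputs are the prevalence, sensitivity and specificity from the example.

```python
def natural_frequencies(prevalence, sensitivity, specificity, population=100):
    """Follow a notional population through the test, as on the slide."""
    diseased = population * prevalence
    non_diseased = population - diseased
    true_positives = diseased * sensitivity
    false_positives = non_diseased * (1 - specificity)
    all_positives = true_positives + false_positives
    # Chance of disease given a positive test result.
    chance = true_positives / all_positives
    return true_positives, all_positives, chance


tp, positives, chance = natural_frequencies(0.30, 0.50, 0.90)
print(f"{positives:.0f} test positive, {tp:.0f} have the disease")
print(f"chance of disease after a positive test: {chance:.0%}")
```

With these inputs the exact ratio is 15/22, which the slide rounds to about 70%.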

Try it again

• A disease with a prevalence of 4% must be diagnosed.
• The test has a sensitivity of 50% and a specificity of 90%.
• If the patient tests positive, what is the chance they have the disease?

Prevalence of 4%, Sensitivity of 50%, Specificity of 90%

100 people
  4 disease +ve -> 2 test +ve (sensitivity = 50%)
  96 disease -ve -> 9.6 test +ve (false positive rate = 10%)

11.6 people test positive, of whom 2 have the disease.

So, the chance of disease is 2/11.6, about 17%.
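The same result can be written as a single formula (Bayes' theorem). The symbols below are notation chosen for this note rather than taken from the slides: $D^+$ means disease present, $T^+$ means a positive test, and prev, sens and spec are the prevalence, sensitivity and specificity.

```latex
P(D^+ \mid T^+)
  = \frac{\text{sens} \times \text{prev}}
         {\text{sens} \times \text{prev} + (1 - \text{spec}) \times (1 - \text{prev})}
  = \frac{0.50 \times 0.04}{0.50 \times 0.04 + 0.10 \times 0.96}
  = \frac{0.02}{0.116}
  \approx 0.17
```

The numerator is the 2 true positives per 100 people and the denominator is the 11.6 positives per 100 people from the tree above.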

Doctors with an average of 14 yrs experience

… answers ranged from 1% to 99%

… half of them estimating the probability as 50%

Gigerenzer G. BMJ 2003;327:741-744

Sensitivity and specificity don't vary with prevalence

• Test performance can vary in different settings, patient groups, etc.
• This is occasionally attributed to differences in disease prevalence, but is more likely due to differences in the diseased and non-diseased spectrums

2 x 2 table: positive predictive value

            Disease +     Disease -
Test +      a             b
Test -      c             d

PPV = a / (a + b)

Proportion of people with a positive test who have the disease

2 x 2 table: negative predictive value

            Disease +     Disease -
Test +      a             b
Test -      c             d

NPV = d / (c + d)

Proportion of people with a negative test who do not have the disease

What's wrong with PPV and NPV?

• They depend on both the accuracy of the test and the prevalence of the disease
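To illustrate this dependence on prevalence, the sketch below recomputes PPV and NPV for the same test (sensitivity 50%, specificity 90%, as in the earlier examples) at the two prevalences used there. The function name `predictive_values` is an assumption made for this illustration.

```python
def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same test, two different prevalences.
results = {prev: predictive_values(prev, 0.50, 0.90) for prev in (0.30, 0.04)}
for prev, (ppv, npv) in results.items():
    print(f"prevalence {prev:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

The test itself is unchanged between the two rows; only the prevalence differs, yet the PPV falls sharply when the disease is rarer.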

Likelihood ratios

• Can be used in situations with more than 2 test outcomes
• Direct link from pre-test probabilities to post-test probabilities

2 x 2 table: positive likelihood ratio

How much more often a positive test occurs in people with the disease compared to those without the disease

            Disease +     Disease -
Test +      a             b
Test -      c             d

LR+ = [a / (a + c)] / [b / (b + d)]

or

LR+ = sens / (1 - spec)

2 x 2 table: negative likelihood ratio

How much less likely a negative test result is in people with the disease compared to those without the disease

            Disease +     Disease -
Test +      a             b
Test -      c             d

LR- = [c / (a + c)] / [d / (b + d)]

or

LR- = (1 - sens) / spec
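Both likelihood ratios follow directly from sensitivity and specificity. The sketch below applies the two formulas to the test used in the earlier examples (sensitivity 50%, specificity 90%); the function name `likelihood_ratios` is an assumption made for this illustration.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


lr_pos, lr_neg = likelihood_ratios(sensitivity=0.50, specificity=0.90)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```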

Interpreting likelihood ratios

LR>10 = strong positive test result

LR=1 = no diagnostic value

LR<0.1 = strong negative test result

McGee: Evidence based Physical Diagnosis (Saunders Elsevier)

Bayesian reasoning

? Appendicitis: McBurney tenderness LR+ = 3.4

Pre-test probability 5% -> post-test probability 20% (Fagan nomogram)
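The step a Fagan nomogram performs graphically can also be done arithmetically: convert the pre-test probability to odds, multiply by the likelihood ratio, and convert back. The sketch below is an illustration under that standard odds formulation; the function name `post_test_probability` is an assumption. For the appendicitis example the exact calculation gives roughly 15%, in the same region as the approximate value read from the nomogram.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# McBurney tenderness, LR+ = 3.4, pre-test probability 5% (from the slide).
post = post_test_probability(0.05, 3.4)
print(f"post-test probability: {post:.1%}")
```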

Do doctors use quantitative methods of test accuracy?

• Survey of 300 US physicians
• 8 used Bayesian methods, 3 used ROC curves, 2 used LRs
• Why?
  • … indices unavailable …
  • … lack of training …
  • … not relevant to setting/population …
  • … other factors more important …

(Reid et al. Academic calculations versus clinical judgements: practicing physicians' use of quantitative measures of test accuracy. Am J Med 1998)

Appraising diagnostic tests

1. Are the results valid?

2. What are the results?

3. Will they help me look after my patients?

Will the test apply in my setting?

• Reproducibility of the test and its interpretation in my setting
• Do results apply to the mix of patients I see?
• Will the results change my management?
• Impact on outcomes that are important to patients?
• Where does the test fit into the diagnostic strategy?
• Costs to patient/health service?

Reliability: how reproducible is the test?

• Kappa = measure of intra-observer reliability

Test                          Kappa value
Tachypnoea                    0.25
Crackles on auscultation      0.41
Pleural rub                   0.52
CXR for cardiomegaly          0.48
MRI spine for disc            0.59

Value of Kappa      Strength of Agreement
<0.20               Poor
0.21-0.40           Fair
0.41-0.60           Moderate
0.61-0.80           Good
0.81-1.00           Very Good
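As an illustration of how a kappa value arises, the sketch below computes Cohen's kappa for two readings of the same 100 patients and maps the result onto the strength-of-agreement bands in the table. The counts are hypothetical and invented for this example, and the function names `cohens_kappa` and `strength_of_agreement` are assumptions; the slides do not state which kappa variant was used for the values shown.

```python
def cohens_kappa(both_pos, first_only, second_only, both_neg):
    """Cohen's kappa for two readings of a binary sign (2 by 2 agreement table)."""
    n = both_pos + first_only + second_only + both_neg
    observed = (both_pos + both_neg) / n
    first_pos = (both_pos + first_only) / n
    second_pos = (both_pos + second_only) / n
    # Agreement expected by chance alone.
    expected = first_pos * second_pos + (1 - first_pos) * (1 - second_pos)
    return (observed - expected) / (1 - expected)


def strength_of_agreement(kappa):
    """Map a kappa value onto the bands used in the table above."""
    if kappa <= 0.20:
        return "Poor"
    if kappa <= 0.40:
        return "Fair"
    if kappa <= 0.60:
        return "Moderate"
    if kappa <= 0.80:
        return "Good"
    return "Very Good"


# Hypothetical counts: 20 both positive, 10 and 5 discordant, 65 both negative.
kappa = cohens_kappa(both_pos=20, first_only=10, second_only=5, both_neg=65)
print(f"kappa = {kappa:.3f} ({strength_of_agreement(kappa)})")
```

Note that the raw agreement here is 85%, but kappa is lower because a large share of that agreement would be expected by chance.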

Will the result change management?

Probability of disease, from 0% to 100%:

• Below the testing threshold: no action
• Between the testing threshold and the action threshold: test
• Above the action threshold: action (e.g. treat)
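The threshold idea can be expressed as a simple decision rule. The sketch below is illustrative only: the function name `management` and the threshold values of 10% and 60% are hypothetical, since the slides give no numeric thresholds and in practice they depend on the disease, the test and the treatment.

```python
def management(probability, testing_threshold, action_threshold):
    """Choose a management option from the estimated probability of disease."""
    if probability < testing_threshold:
        return "no action"
    if probability < action_threshold:
        return "test"
    return "action (e.g. treat)"


# Hypothetical thresholds, chosen only to illustrate the three zones.
TESTING_THRESHOLD = 0.10
ACTION_THRESHOLD = 0.60

for p in (0.05, 0.30, 0.80):
    print(f"{p:.0%}: {management(p, TESTING_THRESHOLD, ACTION_THRESHOLD)}")
```

A test result is only worth obtaining if it could move the probability across one of the two thresholds.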

Any questions?