ERIS - Travaux statistiques

'Placebo' Example

An article reports the following results for a study designed to test the efficacy of a drug by comparing two groups (treatment vs placebo) of 15 patients each:

the observed (raw) difference D=+1.52 in favour of the treatment,

a "Student t test": t=+0.683, q=28 degrees of freedom, p=0.50, nonsignificant.

I would be interested in an interval estimate (frequentist confidence interval, or fiducial-Bayesian credible interval) in order to assess if the inefficacy of the treatment has really been proved.

Can I get an interval estimate for the true difference?

Yes! For a 100(1-α)% interval, it is sufficient to know t_{1-α/2}: the upper (1-α/2) point of the Student distribution with q degrees of freedom.
Since t is the ratio of D to its standard error, D/t recovers that standard error, and the 100(1-α)% interval estimate (frequentist or fiducial-Bayesian interval) for the true difference δ can be immediately deduced:

[ D - (D/t) t_{1-α/2} , D + (D/t) t_{1-α/2} ]

We find here for α = 0.05 and q=28 degrees of freedom t_0.975= +2.0484, hence the 95% interval [-3.04,+6.08] (of course it is assumed that D and t are computed with appropriate accuracy).
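
The computation above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `interval_from_t` is hypothetical, and the critical value t_{1-α/2} is passed in by hand (here the value +2.0484 quoted above) rather than computed from the Student distribution.

```python
def interval_from_t(d, t, t_crit):
    """Interval estimate for the true difference.

    d      : observed difference D
    t      : reported Student t statistic (nonzero, same sign as d)
    t_crit : t_{1-alpha/2}, upper point of the Student distribution with q df
    """
    if t == 0:
        raise ValueError("t must be nonzero to recover the standard error")
    se = d / t                  # standard error implied by D and t (positive)
    half_width = se * t_crit
    return d - half_width, d + half_width


# 'Placebo' example: D = +1.52, t = +0.683, q = 28, t_0.975 = +2.0484
low, high = interval_from_t(1.52, 0.683, 2.0484)
print(f"[{low:+.2f}, {high:+.2f}]")   # prints [-3.04, +6.08]
```

The result depends on the accuracy of the reported D and t: with t rounded to three digits, the limits are reliable to about two decimals.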

Interpretation

This interval can be interpreted as a 95% "frequentist" confidence interval or as a 95% "fiducial-Bayesian" interval.

Student's example (1908)

In his original article about the "t test", Student illustrates his test with an inference about the difference between the additional hours of sleep gained by the use of two soporifics. The observed average (raw) difference is D=+1.58. In modern terms, we compute the t test statistic t=+4.06 (with q=9 degrees of freedom).
We find here for α = 0.05 and q=9 degrees of freedom t_{0.975} = +2.2622, hence the 95% interval [+0.70,+2.46] (of course it is assumed that D and t are computed with appropriate accuracy).

Interval estimate and significance test

A formula equivalent to the previous one is

[ D ( 1 - t_{1-α/2}/t ) , D ( 1 + t_{1-α/2}/t ) ]

If |t| = t_{1-α/2},
the t test is "exactly significant" at two-sided level α (p=α) ⇔ the interval is [0,2D] (if D>0) or [2D,0] (if D<0).

If |t| > t_{1-α/2},
the t test is significant at two-sided level α (p<α) ⇔ the interval does not include 0.
This is the case in Student's example: the p-value is p=0.003.

If |t| < t_{1-α/2},
the t test is nonsignificant at two-sided level α (p>α) ⇔ the interval includes 0.
This is the case in the 'placebo' example: the p-value is p=0.50.
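
The correspondence between the test and the interval can be checked numerically. The sketch below uses the equivalent form of the interval and the two examples from the text; the function names are hypothetical, and the critical values are those quoted in the text, entered by hand.

```python
def interval_equivalent_form(d, t, t_crit):
    # Equivalent form: [ D (1 - t_crit/t) , D (1 + t_crit/t) ]
    return d * (1 - t_crit / t), d * (1 + t_crit / t)


def significant(t, t_crit):
    # Two-sided test at level alpha: significant when |t| exceeds t_crit
    return abs(t) > t_crit


def excludes_zero(low, high):
    return low > 0 or high < 0


# (D, t, t_0.975) for the two examples of the text
examples = {
    "Student": (1.58, 4.06, 2.2622),    # q = 9
    "placebo": (1.52, 0.683, 2.0484),   # q = 28
}
for name, (d, t, t_crit) in examples.items():
    low, high = interval_equivalent_form(d, t, t_crit)
    print(name, significant(t, t_crit), excludes_zero(low, high))
```

In both examples the two booleans agree: the test is significant exactly when the interval excludes 0.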

Conceptual confusions

Even experts in statistics are not immune to conceptual confusions. For instance, Rosnow and Rosenthal (1996, page 336*) interpret the specific interval [0,+0.532] as "a 77% [frequentist] confidence interval" (given D=+0.532 and the one-sided p-value for the usual t test p=0.115, hence 77%=(1-2×0.115)100%). If we repeat the experiment, 2D and the p-value will be different, and, in a long run of repetitions, the proportion of intervals [2D,0] or [0,2D] (according to the sign of D) that contain the true value of the difference will not be 77%. Clearly, 77% is here a data-dependent probability, which needs a Bayesian approach to be correctly interpreted.
[*Computing contrasts, effect sizes, and counternulls on other people's published data: General procedures for research consumers. Psychological Methods, 1, 331-340.]
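
The long-run argument can be illustrated by a small simulation. This is a simplified sketch, not the setting of the cited article: it assumes that D is normally distributed around the true difference with a known standard error, and the values of `delta` and `se` are hypothetical. Under these assumptions the proportion of intervals from 0 to 2D that contain a true difference δ>0 is Φ(δ/(2·se)), which depends on the unknown δ and is not fixed at any stated level.

```python
import random
from statistics import NormalDist


def coverage_of_zero_to_2d(delta, se, n_rep=20000, seed=1):
    """Proportion of intervals [0, 2D] (or [2D, 0]) containing delta.

    Assumption: D ~ Normal(delta, se) with known se (illustration only).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        d = rng.gauss(delta, se)                 # one replication of the study
        low, high = min(0.0, 2 * d), max(0.0, 2 * d)
        hits += low <= delta <= high
    return hits / n_rep


se = 1.0                                         # hypothetical standard error
for delta in (0.5, 1.0, 2.0):                    # hypothetical true differences
    simulated = coverage_of_zero_to_2d(delta, se)
    theoretical = NormalDist().cdf(delta / (2 * se))
    print(f"delta={delta}: simulated {simulated:.3f}, theoretical {theoretical:.3f}")
```

The coverage changes with the true difference, so no single frequentist confidence level can be attached to this family of intervals.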

Remark: Student and the interpretation of the p-value

Student wrote in 1908: "the probability is .9985 [1-p/2] or the odds are about 666 to 1 that 2 is the better soporific". This is clearly a Bayesian (or fiducial) statement, and certainly not an orthodox frequentist statement!

Be careful!

It is only in the fiducial-Bayesian framework that you can state: "there is a 99.85% chance that the true difference is positive" and "there is a 97.5% chance that it is larger than +0.70".

If you adopt the frequentist framework, you must ban any colloquialism such as "I am 95% confident that the true difference lies between +0.70 and +2.46", which suggests that the confidence level is a measure of uncertainty after the data have been seen, which it is not.

'Interaction' example

Consider an experiment involving two crossed factors, Age and Treatment, each with two levels. The means of the four experimental conditions (with 10 subjects in each) are respectively 5.77 (a1,t1), 5.25 (a2,t1), 4.83 (a1,t2) and 4.71 (a2,t2).

The interaction effect can be characterized by the difference of differences:
D = (5.77-4.83) - (5.25-4.71) = +0.40
The ANOVA F ratio for this effect is F=0.47, p=0.50 (with 1 and q=36 degrees of freedom).

Given the property that the F ratio for a contrast is the square of the t statistic, we replace D/t with |D|/√F.
We find here the 95% interval [-0.78,+1.58] (of course it is assumed that D and F are computed with appropriate accuracy).
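
The same computation starting from the F ratio can be sketched as follows. The function name `interval_from_f` is hypothetical, and the critical value t_{0.975} = 2.0281 for q=36 degrees of freedom is not given in the text; it is taken here from standard Student tables and entered by hand.

```python
import math


def interval_from_f(d, f, t_crit):
    """Interval estimate for a contrast from D and its ANOVA F ratio.

    Uses |D| / sqrt(F) in place of D / t, since F = t**2 for a contrast.
    """
    if f <= 0:
        raise ValueError("F must be positive to recover the standard error")
    se = abs(d) / math.sqrt(f)
    half_width = se * t_crit
    return d - half_width, d + half_width


# 'Interaction' example: difference of differences, F = 0.47, q = 36
d = (5.77 - 4.83) - (5.25 - 4.71)          # +0.40
low, high = interval_from_f(d, 0.47, 2.0281)
print(f"[{low:+.2f}, {high:+.2f}]")        # prints [-0.78, +1.58]
```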
