Category: Suicide and Self-Injury


Exploratory Data Mining for a Single Outcome in Clinical Research

Friday, November 17
12:00 PM - 1:30 PM
Location: Cobalt 502, Level 5, Cobalt Level

Keywords: Statistics
Presentation Type: Symposium

In contrast to the models typically used by applied researchers, the field of Exploratory Data Mining (EDM; McArdle & Ritschard, 2013; also referred to as data mining, machine learning, statistical learning, or Big Data) has popularized a number of statistical methods that efficiently fit highly nonlinear models and, in many cases, produce interpretable results with good statistical properties. The past 10 years have seen an emergence of EDM applications in the social and behavioral sciences (for an overview, see McArdle & Ritschard, 2013). We use the term “exploratory data mining” to denote the use of statistical models that do not incorporate a priori constraints based on hypothesized relationships, such as constraints on directionality or on the structure of a model (e.g., confirmatory factor analysis). The purpose of this talk is to provide an overview of methods that are increasingly being used for clinical research.


The talk is divided into two sections: decision trees and their extensions, and regularization methods. For the section on decision trees, guidelines for each method's use are provided, highlighting the balance between predictive performance and interpretability. In contrast to traditional methods that rely on p-values for assessing the significance of variables, emphasis is placed on how to interpret variable importance. For regularization, emphasis is placed on using lasso regression for variable selection. Lasso regression performs better than stepwise regression methods, and options exist for calculating p-values that take into account the adaptive nature of the method. Finally, the programming of the methods in the R statistical environment is discussed, covering the resources available and providing sample scripts. In summary, this talk provides a detailed tutorial on the use of decision trees and their extensions, as well as regularization methods.

Ross Jacobucci

University of Notre Dame

