Trainee Resources

Webinars and Resources created by shAIRe
Explore training sessions, practical tools, and knowledge resources designed by shAIRe to advance trainees' competence in handling lung health data in research.
Career Development for Lung Health Research Trainees
In progress
Using Artificial Intelligence in Lung Health Data
In progress
Best Practices for Statistical Analyses using Lung Health Data
In progress
Lung Health Data Cohort & Registry Data Access and Management
In progress

External Resources for Trainees
Access a collection of curated and trusted resources and learning materials to support data-driven respiratory research.
HarvardX: Causal Diagrams
Causal diagrams have revolutionized the way in which researchers ask: What is the causal effect of X on Y? They have become a key tool for researchers who study the effects of treatments, exposures, and policies. By summarizing and communicating assumptions about the causal structure of a problem, causal diagrams have helped clarify apparent paradoxes, describe common biases, and identify adjustment variables. As a result, a sound understanding of causal diagrams is becoming increasingly important in many scientific disciplines.
Maelstrom: Rmonize
Rmonize is an R package that supports structured and well-documented processing of data from individual studies into a common harmonized format. It provides functions to prepare and validate the required inputs and to produce harmonized datasets and documentation based on a user-specified list of variables to generate, together with the elements and algorithms for data processing. Rmonize also includes functions to identify potential processing issues and produce descriptive summaries and visual reports of harmonized variables. As such, Rmonize provides a streamlined, reusable pipeline that helps users improve the efficiency, consistency, and transparency of their harmonization initiative.
Click here to access Rmonize
Maelstrom: Mica
Mica is a software application used to create online portals for individual epidemiological studies or multi-study networks. It helps investigators and data custodians efficiently disseminate information about their studies and networks. As such, Mica allows annotation, organization, and dissemination of study and variable metadata. A web-based search interface then allows users to browse and query this metadata, thereby helping them identify studies and data items of interest to answer their research questions. Mica also includes other useful communication and dissemination features.
Click here to access Mica
Maelstrom: Opal
Opal is a software application used to manage, harmonize, and integrate epidemiological study data. As a central study data repository, Opal allows users to import, validate, derive, analyze, and export data. Opal provides a uniform interface capable of integrating data collected from multiple sources for a single study, and enables the derivation of common-format (i.e., harmonized) data across multiple studies. The Opal application also provides a state-of-the-art software infrastructure for data encryption, participant-identifier management (with import and export functions), and user authentication/authorization.
GraphPad: Choosing a Statistical Test
This is chapter 37 of the first edition of Intuitive Biostatistics by Harvey Motulsky. Copyright © 1995 by Oxford University Press Inc.
Choosing the correct analytical approach for your situation can be a daunting process. This resource contains a video and a review table that will help you determine the best analytical approach.
UCLA Statistical Methods and Data Analytics: Which Statistical Test to Use?
This table covers a number of common analyses and helps you choose among them based on the number of dependent variables (sometimes referred to as outcome variables) and the nature of your independent variables (sometimes referred to as predictors). You also want to consider the nature of your dependent variable, namely whether it is an interval, ordinal, or categorical variable, and whether it is normally distributed (see What is the difference between categorical, ordinal and interval variables? for more information on this). The table then shows one or more statistical tests commonly used given these types of variables (but not necessarily the only type of test that could be used) and links showing how to do such tests using SAS, Stata and SPSS.
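As a rough illustration of the kind of lookup such a table supports, here is a minimal Python sketch. It is a simplified subset covering only a single grouping variable, not a reproduction of the UCLA table, and the names COMMON_TESTS and suggest_test, as well as the category labels, are hypothetical choices made for this example.

```python
from typing import Dict, Tuple

# Keys: (outcome type, number of groups, paired design?) -> a commonly used test.
# Simplified, illustrative subset; other tests may be equally appropriate.
COMMON_TESTS: Dict[Tuple[str, str, bool], str] = {
    ("interval_normal", "two", False): "independent-samples t-test",
    ("ordinal_or_non_normal", "two", False): "Wilcoxon-Mann-Whitney test",
    ("categorical", "two", False): "chi-square test",
    ("interval_normal", "two", True): "paired t-test",
    ("ordinal_or_non_normal", "two", True): "Wilcoxon signed-rank test",
    ("categorical", "two", True): "McNemar test",
    ("interval_normal", "more_than_two", False): "one-way ANOVA",
    ("ordinal_or_non_normal", "more_than_two", False): "Kruskal-Wallis test",
    ("categorical", "more_than_two", False): "chi-square test",
}


def suggest_test(outcome_type: str, groups: str, paired: bool = False) -> str:
    """Return a commonly used test for the given design, or raise if not covered."""
    key = (outcome_type, groups, paired)
    if key not in COMMON_TESTS:
        raise ValueError(f"No suggestion in this simplified lookup for {key}")
    return COMMON_TESTS[key]


if __name__ == "__main__":
    # Example: comparing a normally distributed outcome between two independent groups.
    print(suggest_test("interval_normal", "two"))
```

A lookup like this only narrows the options; assumptions such as normality and independence still need to be checked against the data before a test is chosen.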
UCLA Statistical Methods and Data Analytics: Annotate Output
These pages contain example programs and output with footnotes explaining the meaning of the output. This is to help you more effectively read the output that you obtain and be able to give accurate interpretations.
UCLA Statistical Methods and Data Analytics: Data Analysis Examples
The pages contain examples (often hypothetical) illustrating the application of different statistical analysis techniques using different statistical packages. Each page provides a handful of examples of when the analysis might be used along with sample data, an example analysis and an explanation of the output, followed by references for more information. These pages merely introduce the essence of the technique and do not provide a comprehensive description of how to use it.
UCLA Statistical Methods and Data Analytics: Textbook Examples
This page lists all of the books and papers for which we have developed web pages showing how to solve the examples using common statistical packages. We encourage you to obtain the textbooks or papers associated with these pages to gain a deeper conceptual understanding of the analyses illustrated. We are very grateful to the authors of these textbooks and papers for granting us permission to create these pages and to distribute their data files via our web pages. These books are just some of the books available for you to borrow via our Statistics Books for Loan.
UCLA Statistical Methods and Data Analytics: Upcoming and Past Seminars
Upcoming and past seminars available through UCLA Advanced Research Computing.
AI4PH Short Courses
AI4PH is pleased to offer a suite of free short courses for graduate students, public health professionals, and data science professionals who want to develop their skills and understanding in AI, public health, and equity in order to apply them in their research and practice. This program is concerned with transformative change in addressing population and public health challenges and understanding how these tools impact health equity.
These free courses are offered throughout the year and require an application.
Causal Inference: What If? Textbook by Miguel A. Hernan & James M. Robins
We expect that the book will be helpful to anyone interested in causal inference, including epidemiologists, statisticians, psychologists, economists, sociologists, political scientists, computer scientists… The book is divided into three parts of increasing difficulty: (1) causal inference without models, (2) causal inference with models, and (3) causal inference from complex longitudinal data.
Statistical Rethinking: A Bayesian Course by Richard McElreath
This is a series of online YouTube lectures by Richard McElreath that presents the material from his textbook in lecture form.
CANSTAT
The Canadian Network for Statistical Training in Trials (CANSTAT) is a pan-Canadian, multi-institutional and multidisciplinary training platform that will provide participants with the technical skills and practical experience needed to become leaders in their field and to ensure that clinical trials generate the highest-quality evidence to improve the health of Canadians.
The goals of the program are to equip fellows with:
- Knowledge about clinical trials
- Required technical and interpersonal skills
- Opportunities to implement skills into practice
The CANSTAT program will bring fellows together with clinical and statistical experts in clinical trials and allow fellows to learn through a comprehensive experiential learning program. Formal education will also be provided through workshops led by clinical trial experts from around the world, and through in-person capacity-building meetings.
Upon completion of the program, fellows will be prepared to work as professionals in biostatistics as effective collaborators, communicators, scholars and leaders who will contribute significantly to academic and industry clinical trials in Canada.
CANTRAIN
CANUE
The Canadian Urban Environmental Health Research Consortium (CANUE) provides researchers with access to standardized, analysis-ready environmental exposure data covering all of Canada. This includes concentrations of key air pollutants (e.g., PM₂.₅, NO₂, ozone), land use, green space, noise, and climate variables, all indexed to six-character postal codes. CANUE data are pre-linked to major Canadian cohort studies and administrative health databases that include measures and outcomes relevant to lung health. Researchers can also request customized datasets to link with their own study populations. These resources support high-quality, reproducible research on the environmental determinants of respiratory health.
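Because the exposure data are indexed to six-character postal codes, linking them to a study population comes down to a join on a normalized postal code. The Python sketch below shows one way this could look. The field names (participant_id, postal_code, pm25), the function names, and the values in the usage example are all hypothetical and do not reflect CANUE's actual file layout.

```python
import re
from typing import Dict, List, Optional

# Canadian postal codes alternate letter-digit-letter-digit-letter-digit.
_POSTAL_CODE = re.compile(r"^[A-Z]\d[A-Z]\d[A-Z]\d$")


def normalize_postal_code(raw: str) -> Optional[str]:
    """Strip spaces and hyphens, uppercase, and return None if the format is invalid."""
    code = re.sub(r"[\s-]", "", raw).upper()
    return code if _POSTAL_CODE.match(code) else None


def link_exposures(participants: List[Dict], exposures: List[Dict]) -> List[Dict]:
    """Left-join participants to exposure rows on normalized postal code.

    Participants with an invalid or unmatched postal code are kept with
    pm25 set to None, so that missing linkage stays visible downstream.
    """
    by_code: Dict[str, Dict] = {}
    for row in exposures:
        code = normalize_postal_code(row["postal_code"])
        if code is not None:
            by_code[code] = row

    linked = []
    for person in participants:
        code = normalize_postal_code(person["postal_code"])
        match = by_code.get(code) if code else None
        linked.append({
            "participant_id": person["participant_id"],
            "postal_code": code,
            "pm25": match["pm25"] if match else None,
        })
    return linked


if __name__ == "__main__":
    # Made-up records for illustration only.
    participants = [
        {"participant_id": 1, "postal_code": "k1a 0b1"},
        {"participant_id": 2, "postal_code": "unknown"},
    ]
    exposures = [{"postal_code": "K1A0B1", "pm25": 7.2}]
    for record in link_exposures(participants, exposures):
        print(record)
```

Keeping unmatched participants in the output, rather than dropping them, makes it easier to report how many records could not be linked.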