NY R Conference
Thanks for attending the 10th anniversary of the New York R Conference!
Check out the Photo Gallery and Videos! Stay tuned for 2025 conference details!
Agenda
Wednesday, May 15
08:00 AM - 09:00 AM
Registration & Breakfast
09:00 AM - 05:00 PM
Workshop: Machine Learning in R
Max Kuhn
Scientist @ Posit
Join Max Kuhn on a tour through Machine Learning in R, with emphasis on using the software as opposed to general explanations of model building. This workshop is an abbreviated introduction to the tidymodels framework for modeling.
You'll learn about data preparation, model fitting, model assessment and predictions. The focus will be on data splitting and resampling, data pre-processing and feature engineering, model creation, evaluation, and tuning. This is not a deep learning course and will focus on tabular data.
Prerequisites: some experience with modeling in R and the tidyverse (you don't need to be an expert); prior experience with lm is enough to get started and learn advanced modeling techniques. For participants who can't install the packages on their machines, RStudio Server Pro instances pre-loaded with the appropriate packages and the GitHub repository will be available.
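For a flavor of the tidymodels workflow covered in the workshop, here is a minimal illustrative sketch using the built-in mtcars data; it is not the workshop's actual material, and the real sessions go much deeper into resampling and tuning.

```r
# Minimal tidymodels sketch: split, preprocess, fit, and assess a model.
# Illustrative only -- uses built-in mtcars, not the workshop's datasets.
library(tidymodels)

set.seed(123)
car_split <- initial_split(mtcars, prop = 0.8)        # data splitting
car_train <- training(car_split)
car_test  <- testing(car_split)

car_recipe <- recipe(mpg ~ ., data = car_train) |>    # pre-processing / feature engineering
  step_normalize(all_numeric_predictors())

car_model <- linear_reg() |> set_engine("lm")          # model specification

car_wflow <- workflow() |>
  add_recipe(car_recipe) |>
  add_model(car_model)

car_fit <- fit(car_wflow, data = car_train)            # model fitting

predict(car_fit, car_test) |>                          # assessment on held-out data
  bind_cols(car_test) |>
  metrics(truth = mpg, estimate = .pred)
```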
(In-Person & Virtual Ticket Options Available)
09:00 AM - 05:00 PM
Workshop: Causal Inference in R
Malcolm Barrett & Lucy D'Agostino McGowan
In this workshop, we'll teach the essential elements of answering causal questions in R through causal diagrams and causal modeling techniques such as propensity scores and inverse probability weighting.
In both data science and academic research, prediction modeling is often not enough; to answer many questions, we need to approach them causally. We'll also show that by distinguishing predictive models from causal models, we can better take advantage of both tools. You'll be able to use the tools you already know--the tidyverse, regression models, and more--to answer the questions that are important to your work.
This course is for you if you:
- Know how to fit a linear regression model in R
- Have a basic understanding of data manipulation and visualization using tidyverse tools
- Are interested in understanding the fundamentals behind how to move from estimating correlations to causal relationships
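For orientation, the propensity-score and weighting workflow described above might look roughly like the sketch below; the data frame `df` and its columns (`treatment`, `outcome`, `age`, `severity`) are hypothetical, and the workshop's own packages and examples may differ.

```r
# Rough sketch of propensity scores and inverse probability weighting (IPW).
# `df`, `treatment`, `outcome`, `age`, and `severity` are hypothetical placeholders.
library(dplyr)

# 1. Model the probability of treatment given confounders (the propensity score)
ps_model <- glm(treatment ~ age + severity, data = df, family = binomial())

# 2. Turn propensity scores into inverse probability weights
df_weighted <- df |>
  mutate(
    ps  = predict(ps_model, type = "response"),
    ipw = if_else(treatment == 1, 1 / ps, 1 / (1 - ps))
  )

# 3. Estimate the average treatment effect in the weighted pseudo-population
#    (in practice you would also want robust/sandwich standard errors)
outcome_model <- lm(outcome ~ treatment, data = df_weighted, weights = ipw)
coef(outcome_model)["treatment"]
```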
(In-Person & Virtual Ticket Options Available)
09:00 AM - 05:00 PM
Workshop: Exploratory Data Analysis with the Tidyverse
David Robinson
Director of Data Science @ Heap
The tidyverse is a powerful collection of packages following a standard set of principles for usability. During this workshop David will demonstrate an exploratory data analysis in R using tidy tools. He will demonstrate the use of tools such as dplyr and ggplot2 for data transformation and visualization, as well as other packages from the tidyverse as they're needed. He'll narrate his thought process as attendees follow along and offer their own solutions.
The workshop expects some familiarity with dplyr and ggplot2—enough to work with data using functions like mutate, group_by, and summarize and to create graphs like scatterplots or bar plots in ggplot2. These concepts will be re-introduced to ensure a smooth workshop, but it isn't designed for brand new R programmers.
The workshop is designed to be interactive and participants are expected to type along on their own keyboards.
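To give a sense of the level expected, here is a small illustrative sequence using the mpg dataset that ships with ggplot2; the dataset actually explored live in the workshop may differ.

```r
# A short exploratory sequence with dplyr and ggplot2.
# Uses the mpg data bundled with ggplot2; the workshop's live dataset may differ.
library(dplyr)
library(ggplot2)

# Summarize highway fuel economy by vehicle class
mpg |>
  group_by(class) |>
  summarize(avg_hwy = mean(hwy), n = n()) |>
  arrange(desc(avg_hwy))

# Visualize engine displacement against highway fuel economy
ggplot(mpg, aes(displ, hwy, color = class)) +
  geom_point() +
  geom_smooth(method = "loess", se = FALSE)
```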
(In-Person & Virtual Ticket Options Available)
Thursday, May 16
08:00 AM - 08:50 AM
Registration & Breakfast
08:50 AM - 09:00 AM
Opening Remarks
09:00 AM - 09:20 AM
Not Your College Stats Course: Engaging Stakeholders Through Data Science
Megan Robertson
Senior Data Scientist @ Freelance
When working in industry, data scientists must collaborate with colleagues across many different roles. You need to understand stakeholder needs and communicate results to non-technical teams. While you don't need to share the mathematical details, explaining analyses builds a stronger relationship with stakeholders and helps them understand the data science process. How do you determine the best way to deliver results? What are some techniques you can use to break down data science methods and algorithms? This talk will review methods for effectively sharing data science analyses and why it is important to stay aligned with stakeholders.
09:25 AM - 09:45 AM
Building Data Tooling in Rust for Multimodal AI
Chang She
CEO & Cofounder @ LanceDB
AI adoption is bringing a host of new challenges for data management and new workloads. This is especially true for multimodal AI, where data challenges extend far beyond embeddings and require new tooling for working with images, audio, video, PDFs, and more. Traditional formats and tooling are optimized for purely tabular data and cannot effectively manage unstructured data types. Instead, a new set of infrastructure and tooling is being built in Rust. Rust makes high-performance data manipulation code much safer, which means developers can move more quickly and with more confidence. It is easy to bridge Rust into higher-level languages like Python and R, wrapping it in APIs that are much more familiar to data science and machine learning users. Finally, Rust offers powerful features for concurrency, which let developers parallelize data manipulation tasks much more easily. In this talk we'll use Lance and LanceDB as a source of examples for building high-performance data tools for AI in Rust. We'll show you how Rust is used to create blazing-fast vector search with hardware acceleration, how Rust helps us create new data management tooling for unstructured data, and how these tools can be exposed in higher-level languages like Python and JavaScript.
09:50 AM - 10:10 AM
Open-Source Football: A Brief History of the NFL's Big Data Bowl Competition
Mike Band
Sr. Manager, Research & Analytics @ NFL Next Gen Stats
For the past six years, the National Football League has hosted the annual Big Data Bowl, an open-source data competition. This event invites data scientists, analysts, and fans alike to develop innovative advanced metrics using Next Gen Stats player-tracking data. In my talk, I will explore the competition's history and highlight submissions that have led to the creation of several key NGS metrics. These metrics are not only featured in every live broadcast but are also utilized by all 32 teams. Don't miss the seventh annual Big Data Bowl coming Fall 2025.
10:10 AM - 10:40 AM
Break
10:40 AM - 11:00 AM
Reporting Survival Analysis Results with the gtsummary and ggsurvfit Packages
Emily Zabor
Associate Staff Biostatistician @ Cleveland Clinic, Department of Quantitative Health Sciences
Survival analysis is an essential tool to handle censored time-dependent endpoints such as overall survival, which are common across a variety of biomedical and other applications. The survival package in R provides the most essential tools to conduct a survival analysis, including estimating survival probabilities, fitting Cox proportional hazards models, and plotting Kaplan-Meier curves. While the functions are powerful, user-friendly, and well documented, getting publication-ready tables and figures can still be a challenge. In this talk, I will review the basics of survival analysis, and will demonstrate how to take results from the console to the manuscript using the gtsummary and ggsurvfit packages.
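For context, a minimal sketch of the kind of workflow described, using the lung dataset bundled with the survival package; the talk's own examples and formatting options may differ.

```r
# From model to publication-ready output with survival, ggsurvfit, and gtsummary.
# Uses the lung data bundled with the survival package; the talk's examples may differ.
library(survival)
library(ggsurvfit)
library(gtsummary)

# Kaplan-Meier curves by sex
survfit2(Surv(time, status) ~ sex, data = lung) |>
  ggsurvfit() +
  add_confidence_interval()

# Cox proportional hazards model, formatted as a summary table
coxph(Surv(time, status) ~ age + sex, data = lung) |>
  tbl_regression(exponentiate = TRUE)
```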
11:05 AM - 11:25 AM
15 Years of Data Science in NYC
Jared P. Lander
Chief Data Scientist @ Lander Analytics
Back when the meetup got started in 2009, data science wasn't even a thing yet; we called ourselves statisticians or analysts. Within a few short years, Columbia had its first data science course, there were multiple data meetups (all with different names), and an unofficial data mafia. Come take a look at the New York data community and how it has evolved over the past 15 years.
11:30 AM - 11:50 AM
Smooths, Splines, and the Chamber of Secrets - Demystifying Female Reproductive Health
Ipek Ensari
Assistant Professor @ Windreich Department of Artificial Intelligence and Human Health, Hasso Plattner Institute for Digital Health, Icahn School of Medicine at Mount Sinai
Chronic disorders affecting the female reproductive system often present diagnostic and treatment challenges due to their under-documentation within electronic health records and a lack of objective measures. Multimodal data from mobile health (mHealth) technologies can help close this gap by providing comprehensive patient profiles, insights into symptom patterns, and the interplay between symptomatic variance and personal factors. However, extracting meaningful insights from these noisy, high-dimensional data requires properly addressing their complex longitudinal patterns and irregular sampling. To address these challenges, this talk will investigate generalized additive models (GAMs) using example cases from pelvic pain disorders (PPDs) - a cluster of conditions with many unknowns. To this end, we will employ smoothing functions and mixture models to reveal underlying trends and relationships that may not be immediately apparent. We will use real-life prospective patient data and nonparametric methods that can be used when there is uncertainty in the shape and patterns of the data.
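Generalized additive models of this kind are commonly fit in R with the mgcv package; here is a hedged sketch assuming a hypothetical long-format data frame `symptoms` with a daily `pain` score, a `day` index, and a patient `id` factor -- the talk's actual data and model structure will differ.

```r
# Hedged GAM sketch: smooth trend over time plus patient-level random effects.
# `symptoms`, `pain`, `day`, and `id` are hypothetical; `id` must be a factor.
library(mgcv)

fit <- gam(
  pain ~ s(day, bs = "cr") +   # cubic regression spline capturing the symptom trend
         s(id, bs = "re"),     # random intercept for each patient
  data   = symptoms,
  method = "REML"
)

summary(fit)          # effective degrees of freedom and smooth-term tests
plot(fit, pages = 1)  # visualize the estimated smooths
```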
11:50 AM - 01:00 PM
Lunch
01:00 PM - 01:20 PM
Analyzing and Visualizing Event Sequence Data
Sean Taylor
Chief Scientist @ Motif Analytics
Many business processes can be represented as event sequence data, especially from product instrumentation in web and mobile applications. However, low-level events are challenging to wrangle, model, and visualize. As a result, analysts typically aggregate data before visualization and estimation, discarding valuable information and introducing bias. In this talk I discuss how to work with event sequences directly, with a focus on exploratory analysis and hypothesis generation, and step through interactive visualizations that support these analysis goals.
01:25 PM - 02:05 PM
It’s About Time
Andrew Gelman
Professor @ Department of Statistics and Department of Political Science, Columbia University
Statistical processes occur in time, but this is often not accounted for in the methods we use and the models we fit. Examples include imbalance in causal inference, generalization from A/B tests even when there is balance, sequential analysis, adjustment for pre-treatment measurements, poll aggregation, spatial and network models, chess ratings, sports analytics, and the replication crisis in science. The point of this talk is to motivate you to include time as a factor in your statistical analyses. This may change how you think about many applied problems!
02:05 PM - 02:35 PM
Break
02:35 PM - 02:55 PM
I Built a Robot to Write This Talk
Jon Harmon
Executive Director @ Data Science Learning Community
Are large language models coming for your job? To examine both sides of that argument, I wrote {robodeck}, an R package that uses the OpenAI API to auto-generate a Quarto slide deck from as little as a title. See how it helped, where it failed miserably, and how I coerced it to work at least most of the time.
03:00 PM - 03:20 PM
The Science of Product Development: Bringing Causal Inference to Conversion and Retention Metrics
David Robinson
Director of Data Science @ Contentsquare
Modern websites track every pageview and click that their users perform, and have a strong interest in using that data to discover friction and smooth the journey. So then why are so many websites still so hard to use? I'll make the case that the problem is largely a scientific one: even when we have the right data, we lack the conceptual and statistical tools to draw causal conclusions about user behavior. In this talk, I'll lay out an early vision of what "product science" could be. I'll introduce journeygrams, a method for quantifying and reasoning about sequential user behavior, and show how they can make product concepts like friction, backtracking, and retention more rigorous and actionable. I'll include some examples of how typical product problems should be analyzed, and why our new approach is better suited to these problems than classical statistics and ML. These principles could help anyone looking to use data to improve their own products, and I hope will contribute to bringing the causal revolution to product development.
03:25 PM - 03:45 PM
RAGtime in the Big Apple: Chat with a Decade of NYR Talks
Alan Feder
Senior Principal Data Scientist @ Freelance
As the adoption of Large Language Models (LLMs) like ChatGPT has increased over the past year, there has been growing excitement about using these technologies to query existing documents and datasets. However, training your own LLM chatbot from scratch is out of reach for all but the largest tech companies. Retrieval-Augmented Generation (RAG) is a versatile method for addressing these challenges. I will show how this works with a live demo exploring the past 10 years of NYR talks.
03:45 PM - 04:15 PM
Break
04:15 PM - 04:35 PM
Automating Tests for your RAG Chatbot or Other Generative Tool
Abigail Haddad
Lead Data Scientist @ Capital Technology Group
Building a Retrieval-Augmented Generation (RAG) chatbot that answers questions about a specific set of documents is straightforward. But how do you tell if it's working? Automated evaluation of generative tools for specific use cases is tricky, but it's also important if you want to easily compare performance across different underlying LLMs, system prompts, temperatures, or other parameters -- or just make sure you're not breaking something when you push your code. In this talk, I'll discuss why this kind of evaluation is challenging and review a few options for the kinds of assessments you can create, including using an LLM to evaluate your LLM-based tool. We'll then look at several ways to write automated LLM-led evaluations, including a library that lets you create complex grading rubrics for your tests easily and with very little code.
04:40 PM - 05:00 PM
Kick or Receive? Determining Optimal NFL Playoff Overtime Strategy via Simulation
Walker Harrison
Analyst @ New York Yankees
This year's Super Bowl was the first to feature an overtime period under the NFL's new playoff rules, which guarantee that each team will possess the ball in the added time. The San Francisco 49ers opted to have the first possession, subsequently lost, and were roundly criticized for not forcing their opponent to start with the ball. But did they actually make a poor strategic decision? To answer this question, we can simulate overtime periods by re-sampling historical plays under some added constraints.
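As a toy illustration of the resampling idea (not the talk's actual methodology), the sketch below assumes a hypothetical data frame `drives` of historical possession outcomes with a `points` column taking values 0, 3, or 7; the real analysis resamples individual plays under much richer constraints.

```r
# Toy Monte Carlo sketch of the new playoff overtime rule: both teams get a possession.
# `drives` and its `points` column (0, 3, or 7) are hypothetical placeholders.
set.seed(2024)

simulate_ot <- function(drives, n_sims = 10000) {
  receive_first_wins <- replicate(n_sims, {
    first  <- sample(drives$points, 1)   # team that receives the opening kickoff
    second <- sample(drives$points, 1)   # team that kicked off
    if (first != second) {
      first > second
    } else {
      runif(1) < 0.5                     # tied after both possessions: crude sudden-death stand-in
    }
  })
  mean(receive_first_wins)               # estimated win probability of taking the ball first
}

# simulate_ot(drives)
```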
05:00 PM - 05:10 PM
Closing Remarks
05:10 PM - 06:30 PM
Happy Hour
Friday, May 17
09:00 AM - 09:50 AM
Registration & Breakfast
09:50 AM - 10:00 AM
Opening Remarks
10:00 AM - 10:20 AM
R is for Retention: Using Regression Models to Increase Revenue in Sports
Kelsey McDonald
Ticketing Director @ Two Circles
A conversation about how we've used R in the sports world to build logistic regression models that predict season ticket member retention, and multinomial regression models to identify upsell opportunities.
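A hedged sketch of the two model types mentioned, assuming a hypothetical data frame `members` with a binary `renewed` flag, an `upsell_tier` factor, and predictors such as `tenure_years` and `games_attended`; the production models and features differ.

```r
# Logistic regression for retention, multinomial regression for upsell opportunities.
# `members` and all column names are hypothetical placeholders.
library(nnet)

# Probability that a season ticket member renews
retention_fit <- glm(
  renewed ~ tenure_years + games_attended,
  data = members, family = binomial()
)

# Which upsell package a member is most likely to take (e.g. "none", "club", "suite")
upsell_fit <- multinom(
  upsell_tier ~ tenure_years + games_attended,
  data = members
)

head(predict(retention_fit, type = "response"))
head(predict(upsell_fit, type = "probs"))
```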
10:25 AM - 10:45 AM
Analyzing Consistency in LLM Outputs Leveraging Colourful Queries
Anna Kircher
Senior Data Scientist @ EY | AI & Data, FSO
Approaching GenAI's black box with the power of colours: an illuminating journey through the mysteries of colour symbolism and its interpretations using ChatGPT and R. Responses are generated through specific queries about the metaphors, sayings, and meanings of colours, then analyzed to interpret and summarize colour perception, shedding a little light on the intricate nuances of colour symbolism. By introducing randomness through temperature variation in the underlying language models, the creative potential and consistency of the responses are explored and shifts in the interpretation of colour symbolism uncovered, ultimately unraveling the intersection between language, colours, perception, and AI.
10:45 AM - 11:15 AM
Break
11:15 AM - 11:35 AM
Strategic Football Operations: Department Philosophies and Integrating Statistical Applications
John Park
Director of Strategic Football Operations @ Dallas Cowboys
In this talk, we'll explore ideas we've leaned on to establish an identity for the SFO Department of the Dallas Cowboys, and we'll unpack how we are integrated into the different elements of traditional football operations. We'll cover topics such as where we choose to operate on the continuum between theoretical and applied research, the premium we place on manifesting a mindset of humility, clear communication, and collaboration, and the critical importance of trust. These wide-ranging topics are some of the ideas we're pouring into the foundation of what we are continuing to build in Dallas.
11:40 AM - 12:20 PM
R in Production
Hadley Wickham
Chief Scientist @ Posit
In this talk, we delve into the strategic deployment of R in production environments, elevating your work from individual exploration to scalable, collaborative data science. The essence of putting R into production lies not just in executing code but in crafting solutions that are robust, repeatable, and collaborative, guided by three key principles:
- Not just once: Successful data science projects are not one-offs; they will be run repeatedly for months or years. I'll discuss some of the challenges of creating R scripts and applications that run repeatedly, handle new data seamlessly, and adapt to evolving analytical requirements without constant manual intervention. This principle ensures your analyses are enduring assets, not throw-away toys.
- Not just my computer: The transition from development on your laptop (usually Windows or Mac) to a production environment (usually Linux) introduces a number of challenges. Here, I'll discuss some strategies for making R code portable, how you can minimise pain when something inevitably goes wrong, and a few unresolved auth challenges that we're currently working on.
- Not just me: R is not just a tool for individual analysts but a platform for collaboration. I'll cover some best practices for writing readable, understandable code, and how you might go about sharing that code with your colleagues. This principle underscores the importance of building R projects that are accessible, editable, and usable by others, fostering a culture of collaboration and knowledge sharing.
By adhering to these principles, we pave the way for R to be not just a tool for individual analyses but a cornerstone of enterprise-level data science solutions. Join me to explore how to harness the full potential of R in production, creating workflows that are robust, portable, and collaborative.
12:20 PM - 01:30 PM
Lunch
01:30 PM - 01:50 PM
The Future Roadmap for the Composable Data Stack
Wes McKinney
Principal Architect @ Posit
In this talk, I plan to review the progress we have made over the last 10 years developing composable, interoperable open standards for the data processing stack, from infrastructure projects such as Parquet and Arrow to user-facing interface libraries like Ibis for Python and the tidyverse for R. In discussing the current landscape of projects, I will dig into the different areas where more innovation and growth are needed, and where we would ideally like to end up in the coming years.
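As one concrete example of that interoperability, the arrow R package already lets dplyr verbs run against Parquet data without loading it into memory; a minimal sketch, assuming a hypothetical directory of Parquet files at "data/events/" with `event_type`, `country`, and `amount` columns.

```r
# Query a Parquet dataset lazily with Arrow using ordinary dplyr verbs.
# The "data/events/" path and column names are hypothetical.
library(arrow)
library(dplyr)

open_dataset("data/events/") |>       # scans the Parquet files without reading them all
  filter(event_type == "purchase") |>
  group_by(country) |>
  summarize(orders = n(), revenue = sum(amount)) |>
  collect()                           # execute the query and return an R data frame
```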
01:55 PM - 02:15 PM
SHINYLIVE IS SO EASY
Max Kuhn
Scientist @ Posit
shinylive is an extension to the Quarto open-source scientific and technical publishing system. It enables Shiny applications to run locally, without a Shiny server, using WebAssembly. I'll show examples and discuss the limitations of using shinylive.
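For context, a hedged sketch of what this looks like: after adding the extension to a Quarto project (e.g. `quarto add quarto-ext/shinylive` and listing shinylive under the document's filters), an ordinary Shiny app placed in a `{shinylive-r}` chunk with `#| standalone: true` runs entirely in the browser. The app code itself is unchanged:

```r
# Ordinary Shiny code; inside a Quarto {shinylive-r} chunk with `#| standalone: true`
# it is compiled to run via WebAssembly in the browser, with no Shiny server.
library(shiny)

ui <- fluidPage(
  sliderInput("n", "Sample size", min = 10, max = 1000, value = 100),
  plotOutput("hist")
)

server <- function(input, output) {
  output$hist <- renderPlot(hist(rnorm(input$n)))
}

shinyApp(ui, server)
```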
02:20 PM - 02:40 PM
Data, AI, and Creativity
Hilary Mason
Co-Founder @ Hidden Door
In this talk, we'll explore the lines between analytics, data science, machine learning, and AI, and what current developments open up in terms of creativity and impact.
02:40 PM - 03:10 PM
Break
03:10 PM - 04:10 PM
Retrospective Panel
Join us for a captivating retrospective panel as we celebrate a decade of the New York R Conference, 15 years of the New York Open Statistical Programming Meetup, and the vibrant journey of the Data Science community. Dive into the highlights, memories, and collective achievements that have shaped our community's remarkable evolution. Don't miss this nostalgic journey reflecting on the past and embracing the exciting future of data science!
Hosted by Jon Krohn, this retrospective panel includes special guests Drew Conway, Emily Zabor, JD Long, and Jared Lander.
04:10 PM - 04:20 PM
Closing Remarks
Workshops
Machine Learning in R
Hosted by Max Kuhn
Wednesday, May 15 | 9:00am - 5:00pm
Join Max Kuhn on a tour through Machine Learning in R, with emphasis on using the software as opposed to general explanations of model building. This workshop is an abbreviated introduction to the tidymodels framework for modeling.
You'll learn about data preparation, model fitting, model assessment and predictions. The focus will be on data splitting and resampling, data pre-processing and feature engineering, model creation, evaluation, and tuning. This is not a deep learning course and will focus on tabular data.
Prerequisites: some experience with modeling in R and the tidyverse (you don't need to be an expert); prior experience with lm is enough to get started and learn advanced modeling techniques. For participants who can't install the packages on their machines, RStudio Server Pro instances pre-loaded with the appropriate packages and the GitHub repository will be available.
(In-Person & Virtual Ticket Options Available)
Causal Inference in R
Hosted by Malcolm Barrett & Lucy D'Agostino McGowan
Wednesday, May 15 | 9:00am - 5:00pm
In this workshop, we'll teach the essential elements of answering causal questions in R through causal diagrams and causal modeling techniques such as propensity scores and inverse probability weighting.
In both data science and academic research, prediction modeling is often not enough; to answer many questions, we need to approach them causally. We'll also show that by distinguishing predictive models from causal models, we can better take advantage of both tools. You'll be able to use the tools you already know--the tidyverse, regression models, and more--to answer the questions that are important to your work.
This course is for you if you:
- Know how to fit a linear regression model in R
- Have a basic understanding of data manipulation and visualization using tidyverse tools
- Are interested in understanding the fundamentals behind how to move from estimating correlations to causal relationships
(In-Person & Virtual Ticket Options Available)
Exploratory Data Analysis with the Tidyverse
Hosted by David Robinson
Wednesday, May 15 | 9:00am - 5:00pm
The tidyverse is a powerful collection of packages following a standard set of principles for usability. During this workshop David will demonstrate an exploratory data analysis in R using tidy tools. He will demonstrate the use of tools such as dplyr and ggplot2 for data transformation and visualization, as well as other packages from the tidyverse as they're needed. He'll narrate his thought process as attendees follow along and offer their own solutions.
The workshop expects some familiarity with dplyr and ggplot2—enough to work with data using functions like mutate, group_by, and summarize and to create graphs like scatterplots or bar plots in ggplot2. These concepts will be re-introduced to ensure a smooth workshop, but it isn't designed for brand new R programmers.
The workshop is designed to be interactive and participants are expected to type along on their own keyboards.
(In-Person & Virtual Ticket Options Available)
Speakers
Andrew Gelman
Professor
Department of Statistics and Department of Political Science, Columbia University
Talk: It’s About Time
Abigail Haddad
Lead Data Scientist
Capital Technology Group
Talk: Automating Tests for your RAG Chatbot or Other Generative Tool
Wes McKinney
Principal Architect
Posit
Talk: The Future Roadmap for the Composable Data Stack
Emily Zabor
Associate Staff Biostatistician
Cleveland Clinic, Department of Quantitative Health Sciences
Talk: Reporting Survival Analysis Results with the gtsummary and ggsurvfit Packages
Sean Taylor
Chief Scientist
Motif Analytics
Talk: Analyzing and Visualizing Event Sequence Data
Ipek Ensari
Assistant Professor
Windreich Department of Artificial Intelligence and Human Health, Hasso Plattner Institute for Digital Health, Icahn School of Medicine at Mount Sinai
Talk: Smooths, Splines, and the Chamber of Secrets - Demystifying Female Reproductive Health
Anna Kircher
Senior Data Scientist
EY | AI & Data, FSO
Talk: Analyzing Consistency in LLM Outputs Leveraging Colourful Queries
Mike Band
Sr. Manager, Research & Analytics
NFL Next Gen Stats
Talk: Open-Source Football: A Brief History of the NFL's Big Data Bowl Competition
Jared P. Lander
Chief Data Scientist
Lander Analytics
Talk: 15 Years of Data Science in NYC
John Park
Director of Strategic Football Operations
Dallas Cowboys
Talk: Strategic Football Operations: Department Philosophies and Integrating Statistical Applications
Kelsey McDonald
Ticketing Director
Two Circles
Talk: R is for Retention: Using Regression Models to Increase Revenue in Sports
Walker Harrison
Analyst
New York Yankees
Talk: Kick or Receive? Determining Optimal NFL Playoff Overtime Strategy via Simulation
Megan Robertson
Senior Data Scientist
Freelance
Talk: Not Your College Stats Course: Engaging Stakeholders Through Data Science
David Robinson
Director of Data Science
Contentsquare
Talk: The Science of Product Development: Bringing Causal Inference to Conversion and Retention Metrics
Chang She
CEO & Cofounder
LanceDB
Talk: Building Data Tooling in Rust for Multimodal AI
Jon Harmon
Executive Director
Data Science Learning Community
Talk: I Built a Robot to Write This Talk
Alan Feder
Senior Principal Data Scientist
Freelance
Talk: RAGtime in the Big Apple: Chat with a Decade of NYR Talks
Retrospective Panel
Join us for a captivating retrospective panel as we celebrate a decade of the New York R Conference, 15 years of the New York Open Statistical Programming Meetup, and the vibrant journey of the Data Science community. Dive into the highlights, memories, and collective achievements that have shaped our community’s remarkable evolution. Don’t miss this nostalgic journey reflecting on the past and embracing the exciting future of data science!
Emily Zabor
Associate Staff Biostatistician
Cleveland Clinic, Department of Quantitative Health Sciences
Sponsors