CANSSI National Case Study Competition 2019 at Carleton University

Categories: General

Wednesday, October 09, 2019

4:00 PM - 7:00 PM

3422 Herzberg Laboratories

1125 Colonel By Dr, Ottawa, ON

Contact Information

Dave Campbell, 613-520-2600


No registration required.



About this Event

Host Organization: School of Mathematics and Statistics

The Carleton community and the general public are invited to observe the Canadian Statistical Sciences Institute (CANSSI) National Case Study Competition at Carleton on Wednesday, Oct. 9.


The CANSSI National Case Study Competition (NCSC) is a project for students enrolled in undergraduate and graduate programs at Canadian universities. Students will compete in a statistical prediction task. The data for this competition will be made available on September 3rd, and students will be able to submit their solutions online until October 3rd. Students may register for the CANSSI NCSC starting September 3rd. Registration for the regional competitions will remain open until September 29th.

Carleton University, Concordia University, MacEwan University, Simon Fraser University and the University of New Brunswick will each host a regional competition, with cash prizes, to judge the solutions submitted by their participating students. Winners of the regional competitions will be invited to compete in a final national poster championship on November 2nd at the CANSSI Headquarters at Simon Fraser University in Burnaby, BC.

The Challenge:

This national case study competition is about predicting delays in BC Ferries sailings around Vancouver harbours. The dataset consists of 61,880 sailings occurring between August 2016 and March 2018. It is split into a training dataset containing 80% of the sailings (49,504 sailings between August 2016 and November 2017) and a testing dataset containing the remaining 20% (12,376 sailings between November 2017 and March 2018). The task is to predict whether or not each sailing described in the testing dataset was delayed. A variety of covariates are provided for each sailing (date, time of departure, departure terminal, arrival terminal, the name of the vessel, and so on). These covariates are described more fully in the Data section below. In addition to these covariates, some weather data and traffic data are provided.
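The split sizes quoted above can be checked with a little arithmetic. The sketch below is only an illustration: it assumes the split is made by departure date (the date ranges above suggest this, but the organizers' exact procedure is not stated here), and the record layout with a "departure" key is hypothetical.

```python
TOTAL_SAILINGS = 61_880  # dataset size quoted in the challenge description


def chronological_split(records, train_percent=80):
    """Sort records by departure date and split them by position.

    `records` is a list of dicts with a hypothetical "departure" key
    holding an ISO date string, so plain string sorting is chronological.
    Returns (train, test), with the earliest records in train.
    """
    ordered = sorted(records, key=lambda r: r["departure"])
    cut = len(ordered) * train_percent // 100  # integer arithmetic, no rounding issues
    return ordered[:cut], ordered[cut:]


# Check the quoted sizes: 80% for training, the rest for testing.
train_size = TOTAL_SAILINGS * 80 // 100
test_size = TOTAL_SAILINGS - train_size

# Tiny made-up sample, listed out of order, to show the split in action.
sample = [
    {"departure": "2018-01-15"},
    {"departure": "2016-08-01"},
    {"departure": "2017-03-10"},
    {"departure": "2017-11-20"},
    {"departure": "2016-12-05"},
]
sample_train, sample_test = chronological_split(sample)
```

Here train_size and test_size come out to 49,504 and 12,376, matching the figures in the description, and the sample's single latest sailing ends up in the test portion.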

In the regional competitions and the national poster championship, students will be judged on the accuracy of their delay predictions (percent correct) and on a poster in which they discuss their methods, their results, and any additional insight into the data that their analysis provides.
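For readers unfamiliar with the accuracy criterion, percent correct is simply the share of test sailings whose delayed or not-delayed label was predicted correctly. A minimal sketch follows, assuming labels are coded as 1 (delayed) and 0 (not delayed); the function name and the example values are made up for illustration and are not part of the competition's official scoring code.

```python
def percent_correct(predicted, actual):
    """Return the percentage of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one sailing is required")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * hits / len(actual)


# Made-up labels for five sailings: 1 = delayed, 0 = not delayed.
example_predictions = [1, 0, 0, 1, 0]
example_truth = [1, 0, 1, 1, 0]

score = percent_correct(example_predictions, example_truth)  # 4 of 5 match: 80.0
```

Because every sailing counts equally under this measure, a model's score should be read alongside how often delays actually occur in the data.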