Massive Computational Experiments, Painlessly

Course Description

Ambitious Data Science requires massive computational experimentation; in some fields, the entry ticket for a solid PhD is now to conduct experiments involving 1 million CPU hours. Recently, several groups have created efficient computational environments that make it painless to run such massive experiments. This course reviews state-of-the-art practices for running massive computational experiments on compute clusters in a painless and reproducible manner. Students will learn how to automate their computing experiments, first using nuts-and-bolts tools such as Perl and Bash, and later using comprehensive frameworks such as ClusterJob and CodaLab, which enable them to take on ambitious Data Science projects. The course also features a few guest lectures by renowned scientists in the field of Data Science. Students should be familiar with computational experiments and be fluent in a high-level computer language such as R, Matlab, or Python.

Grading Basis

RSN - Satisfactory/No Credit

Min

2

Max

2

Course Repeatable for Degree Credit?

No

Course Component

Research Seminar

Enrollment Optional?

No

Programs

STATS285 is a completion requirement for:
  • (from the following course set: )