Mining Massive Data Sets
Course Description
The availability of massive datasets is revolutionizing science and industry. This course covers data mining and machine learning algorithms for analyzing very large amounts of data. Topics include: big data systems (Hadoop, Spark); link analysis (PageRank, spam detection); similarity search (locality-sensitive hashing, shingling, min-hashing); stream data processing; recommender systems; analysis of social-network graphs; association rules; dimensionality reduction (UV, SVD, and CUR decompositions); algorithms for large-scale mining (clustering, nearest-neighbor search); large-scale machine learning (decision tree ensembles); multi-armed bandits; computational advertising. Prerequisites: at least one of CS107 or CS145.
Grading Basis
ROP - Letter or Credit/No Credit
Min Units
3
Max Units
4
Course Repeatable for Degree Credit?
No
Course Component
Lecture
Enrollment Optional?
No
This course has been approved for the following WAYS
Formal Reasoning (FR)
Does this course satisfy the University Language Requirement?
No
Programs
CS246 is a completion requirement for:
- (from the following course set: )
- (from the following course set: )
- (from the following course set: )