Math 4803, Fall 2019
Instructor: Wenjing Liao
Office: Skiles 258
Email: wliao60@gatech.edu
Lectures: M W 4:30-5:45pm, Clough 131
Office Hours: Tuesday 1-4 PM or by appointment
Course webpage: http://people.math.gatech.edu/~wliao60/19Fall_Math4803.html
General course information

Prerequisites: Calculus I and II (Math 1551 and Math 1552), and linear algebra, such as Math 1553, 1554, or 1564
 Course description
 Course goals:
 Introduce modern data science techniques and the foundational mathematical concepts in linear algebra, probability, and basic optimization related to these techniques.
 Teach how to use software to perform learning tasks while adequately addressing practical challenges (e.g., modeling, parameter tuning, computation, and speed).
 Provide students with valuable firsthand experience in handling real and complex data.
 Textbook and references:
 Review of linear algebra: http://cs229.stanford.edu/summer2019/cs229linalg.pdf
 Basic concepts in probability: http://cs229.stanford.edu/summer2019/cs229prob.pdf
 Statistical learning book: James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani, An Introduction to Statistical Learning, Vol. 112, New York: Springer, 2013: http://faculty.marshall.usc.edu/garethjames/ISL/
 Elements of statistical learning: https://web.stanford.edu/~hastie/ElemStatLearn/
 Von Luxburg, Ulrike, "A tutorial on spectral clustering," Statistics and Computing 17.4, 395-416, 2007.
 Tentative topics (not all may be covered):
 Chapter 2 of the statistical learning book: introduction to regression, classification, and clustering.
 Chapter 3 of the statistical learning book: linear regression.
 Chapter 4 of the statistical learning book: classification, including logistic regression and linear discriminant analysis.
 Chapter 5 of the statistical learning book: resampling methods, especially cross-validation.
 Chapter 7 of the statistical learning book: moving beyond linearity, including polynomial regression, step functions, regression splines, and local regression.
 Chapter 10 of the statistical learning book: unsupervised learning, including principal component analysis and K-Means clustering.
 Spectral clustering
 Multidimensional scaling
 Grading:
 Grading will be based on exams, homework assignments, and a final project: Homework 40%, Midterm 20%, Final project 40%.
 There will be 5 homework assignments. The lowest HW score will be dropped.
 The midterm is on October 7.
 Final project: Students form groups of 3 or 4 to work on a project with real data sets. Groups are expected to be formed no later than Oct 23. The instructor will provide a list of possible projects. Each group will pick a topic by the end of October and then write a project proposal. The main work of the project is expected to be completed in November. Each group will make a poster and present it in the last lecture on Dec 2.
 The course grade will be determined by an absolute scale, with a slight modification using the normal distribution curve if appropriate.
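 As an illustration of the weighting above, here is a minimal Python sketch of how a raw course score could be computed. It assumes that all scores are on a 0-100 scale and that the homework assignments kept after dropping the lowest count equally; neither assumption is stated in this syllabus. The function name course_score is hypothetical, and the possible curve adjustment is not modeled.

```python
def course_score(homework, midterm, project):
    """Raw weighted course score on a 0-100 scale (illustrative only).

    homework: list of homework scores (0-100); the single lowest is dropped.
    midterm, project: scores on a 0-100 scale.
    Assumption: the remaining homework scores are weighted equally.
    """
    if len(homework) < 2:
        raise ValueError("need at least two homework scores to drop one")
    kept = sorted(homework)[1:]      # drop the single lowest homework score
    hw_avg = sum(kept) / len(kept)   # average of the remaining scores
    # Weights from the syllabus: Homework 40%, Midterm 20%, Final project 40%
    return 0.40 * hw_avg + 0.20 * midterm + 0.40 * project


# Hypothetical scores, for illustration only
print(round(course_score([90, 80, 100, 70, 95], midterm=85, project=92), 2))
```

 With these made-up numbers, the lowest homework score (70) is dropped before averaging, so it does not affect the result.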
 General policies:
 Calculators are not allowed in exams.
 Attendance: Attendance at all lectures is required. If you miss a lecture, it is your responsibility to catch up on the topics that you missed.
 If a student is found responsible through the Office of Student Integrity for academic dishonesty on a graded item in this course, the student will receive a score of zero for that assignment, and the final grade for the course will be further reduced by one letter grade.
 In this course, as in many math courses, working in groups to study particular problems and discuss theory is strongly encouraged. Your ability to talk mathematics is of particular importance to your general understanding of mathematics. You may discuss with other students how to approach a problem. However, you must write up the solutions and the programs for the homework problems individually and separately.
 Programming: You can choose to use Matlab, Python, or R for programming.
Homework:
HW 1: due on Wednesday, September 4. Sample Solution 1, Solution 2, Solution 3
HW 2: due on Wednesday, September 18. Sample Solution 1, Solution 2
HW 3: Part I is due on Wednesday, October 2, and Part II is due on Friday, October 11. Solution for conceptual questions
HW 4: due on Monday, November 25
Exams:
Midterm 1: covers material from Week 1 to Week 5
Course Progress:
 Week 1
 Week 2
 Week 3
 Week 4
 Week 5
 Week 6
 Week 7
 Week 8
 Week 9
 Week 10
 Week 11
 Principal Component Analysis (PCA)
 Week 12
 PCA from the textbook, slides
 Theory of PCA, notes
 Presentation of project proposal
 Week 13
Ethics:
The strength of the university depends on academic and personal integrity. In this course, everyone must be honest and truthful. Violations include cheating on exams, plagiarism, improper use of the internet and electronic devices, unauthorized collaboration, alteration of graded assignments, forgery and falsification, lying, facilitating academic dishonesty, and unfair competition. Ignorance of these rules is not an excuse.
Special aid:
Students with disabilities or other special needs that require classroom accommodation or other arrangements must let the instructor know at the beginning of the semester.