Companies and organizations collect massive amounts of data, and the task of a data scientist is to extract actionable knowledge from that data: for scientific needs, to improve public health, to promote businesses, for social studies, and for various other purposes. This course focuses on the practical aspects of the field and aims to provide a comprehensive set of tools for extracting knowledge from data.
Location: Bloomberg Center, Room 131
Meets: MW 4:45pm - 6:00pm
# | Date | Topic | Assignment |
---|---|---|---|
1 | Wed Jan 24, 2018 | Introduction to Data Science | - |
2 | Mon Jan 29, 2018 | Data Modeling: Supervised Learning Methods | Assign 0 Due |
3 | Wed Jan 31, 2018 | Data Modeling: Unsupervised Learning Methods | - |
4 | Mon Feb 5, 2018 | ETL, Feature Engineering, Bootstrapping, Sampling | Assign 1 Due |
5 | Wed Feb 7, 2018 | Case Study: Scaling Machine Learning in Ad Tech | - |
6 | Mon Feb 12, 2018 | Deep Learning | - |
7 | Wed Feb 14, 2018 | Deep Learning | Assign 2 Due |
8 | Mon Feb 19, 2018 | NO CLASS | - |
9 | Wed Feb 21, 2018 | NLP and Knowledge Bases | - |
10 | Mon Feb 26, 2018 | NLP and Knowledge Bases | Assign 3 Due |
11 | Wed Feb 28, 2018 | NLP and Knowledge Bases | - |
12 | Mon Mar 5, 2018 | Recommendation Systems pt 1 | - |
13 | Mon Mar 5, 2018 | Recommendation Systems pt 2 | - |
14 | Wed Mar 7, 2018 | Class cancelled due to weather | Assign 4 Due |
15 | Mon Mar 12, 2018 | Recommendation Systems pt 3 | - |
16 | Wed Mar 14, 2018 | Social Network Analysis | - |
17 | Mon Mar 19, 2018 | Social Network Analysis | Assign 5 Due |
18 | Wed Mar 21, 2018 | CLASS CANCELLED | - |
19 | Mon Mar 26, 2018 | Data Visualization | Project Part 0 Due |
20 | Wed Mar 28, 2018 | Computer Vision and Fashion Mining | - |
21 | Mon Apr 2, 2018 | NO CLASS | - |
22 | Wed Apr 4, 2018 | NO CLASS | - |
23 | Mon Apr 9, 2018 | Datalogue - Company Presentation | Assign 6 Due |
24 | Wed Apr 11, 2018 | Deeper Look at Bootstrap | - |
25 | Mon Apr 16, 2018 | Map Reduce and Streaming Calculations | Project Part 1 Due |
26 | Wed Apr 18, 2018 | Map Reduce and Streaming Calculations | - |
27 | Mon Apr 23, 2018 | Big Data Tools | - |
28 | Wed Apr 25, 2018 | Big Data Tools | Project Part 2 Due |
29 | Mon Apr 30, 2018 | Time Series and Practical Considerations | - |
30 | Wed May 2, 2018 | Privacy, Ethics of Data Science, Course Summary | - |
31 | Mon May 7, 2018 | Final Projects in Class | - |
32 | Wed May 9, 2018 | Final Projects in Class | Final Project Due |
The topics listed above are a brief, non-exhaustive outline of what the course covers.
Top Level Category | Grade Percentage |
---|---|
Assignments | 60% |
Project | 40% |
Assignments | Category Percentage |
---|---|
Assign 0 | 10% |
Assign 1 | 15% |
Assign 2 | 15% |
Assign 3 | 15% |
Assign 4 | 15% |
Assign 5 | 15% |
Assign 6 | 15% |
Project | Category Percentage |
---|---|
Part 0 | 10% |
Part 1 | 30% |
Part 2 | 30% |
Final Part | 30% |
Late homework. Each student has 3 "slip" days per assignment with no penalty; after that, 20% is deducted for each additional day late.
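As a rough illustration of the late policy, the sketch below applies the slip days and the per-day deduction to a single assignment. The function name `late_adjusted_score` is hypothetical, and two details are assumptions rather than stated policy: that the 20% is taken from the earned score (not from the maximum possible score), and that the result never drops below zero.

```python
def late_adjusted_score(raw_score: float, days_late: int,
                        slip_days: int = 3,
                        penalty_per_day: float = 0.20) -> float:
    """Return the score after the late penalty (illustrative only).

    Assumptions, not confirmed by the syllabus:
    - the penalty is a fraction of the earned score per penalized day;
    - the adjusted score is floored at zero.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    # Days beyond the free slip days are the only ones penalized.
    penalized_days = max(0, days_late - slip_days)
    factor = max(0.0, 1.0 - penalty_per_day * penalized_days)
    return raw_score * factor


if __name__ == "__main__":
    for days in (0, 3, 4, 6, 10):
        print(days, round(late_adjusted_score(90.0, days), 2))
```

Under these assumptions, a submission up to 3 days late keeps its full score, and each further day removes another 20% of it. Check with the instructor or TAs for the exact interpretation.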
Dropped homework. There are 6 assignments, plus an assignment 0 to set up your programming environment. We will drop the lowest score among assignments 1-6.
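To see how the percentages in the grading tables might combine with the dropped assignment, here is a minimal sketch. The weights are copied from the tables above; the function names and the renormalization step are assumptions. In particular, the syllabus does not say how the remaining assignment weights are rescaled once the lowest score is dropped, so this sketch simply divides by the sum of the weights that remain.

```python
from typing import Dict

# Weights taken from the grading tables (fractions of each category).
ASSIGNMENT_WEIGHTS = {
    "Assign 0": 0.10, "Assign 1": 0.15, "Assign 2": 0.15, "Assign 3": 0.15,
    "Assign 4": 0.15, "Assign 5": 0.15, "Assign 6": 0.15,
}
PROJECT_WEIGHTS = {"Part 0": 0.10, "Part 1": 0.30, "Part 2": 0.30, "Final Part": 0.30}
TOP_LEVEL_WEIGHTS = {"Assignments": 0.60, "Project": 0.40}


def weighted_average(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average over the items present in `scores`.

    Dividing by the sum of the weights actually used is the assumed
    renormalization after an assignment is dropped.
    """
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


def course_grade(assignment_scores: Dict[str, float],
                 project_scores: Dict[str, float]) -> float:
    """Combine category scores (0-100 scale) into a course percentage."""
    # Only assignments 1-6 are eligible to be dropped; Assign 0 always counts.
    droppable = {n: s for n, s in assignment_scores.items() if n != "Assign 0"}
    lowest = min(droppable, key=droppable.get)
    kept = {n: s for n, s in assignment_scores.items() if n != lowest}
    assignments = weighted_average(kept, ASSIGNMENT_WEIGHTS)
    project = weighted_average(project_scores, PROJECT_WEIGHTS)
    return (TOP_LEVEL_WEIGHTS["Assignments"] * assignments
            + TOP_LEVEL_WEIGHTS["Project"] * project)


if __name__ == "__main__":
    assignments = {name: 80.0 for name in ASSIGNMENT_WEIGHTS}
    assignments["Assign 3"] = 0.0  # a missed assignment, which gets dropped
    project = {name: 90.0 for name in PROJECT_WEIGHTS}
    print(round(course_grade(assignments, project), 2))
```

In the usage example, the missed assignment is dropped, the assignment category averages 80, the project category averages 90, and the course grade works out to 0.6 × 80 + 0.4 × 90 = 84 under the stated assumptions.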
Homework collaboration. You are encouraged (but not required) to work in groups of no more than 2 students on each assignment. Please indicate the name of your collaborator at the top of each assignment and cite any references you used (including articles, books, code, websites, and personal communications). If you’re not sure whether to cite a source, err on the side of caution and cite it. Each student should submit their own writeup. Remember not to plagiarize: you must write the solutions yourself.
Project collaboration. TBA.
Attendance. Some homework assignments may require information that was only shared in class (not in the online slides). In addition, there may be surprise quizzes to further encourage attendance.
Statement about students with disabilities. Your access in this course is important. Please give me (Giri Iyengar) or one of the TAs your Student Disability Services (SDS) accommodation letter early in the semester so that we have adequate time to arrange your approved academic accommodations. If you need an immediate accommodation for equal access, please speak with me after class or send an email message to me and/or SDS at sds_cu@cornell.edu. If the need arises for additional accommodations during the semester, please contact SDS. You may also feel free to speak with Student Services at Cornell Tech, who will connect you with the university SDS office.
Academic integrity. Each student in this course is expected to abide by the Cornell University Code of Academic Integrity. Any work submitted by a student in this course for academic credit will be the student's own work. You are encouraged to study together and to discuss information and concepts covered in lecture and the sections with other students. You can give "consulting" help to or receive "consulting" help from such students. However, this permissible cooperation should never involve one student having possession of a copy of all or part of work done by someone else, in the form of an e-mail, an e-mail attachment file, a diskette, or a hard copy. Should copying occur, the student who copied work from another student and the student who gave material to be copied will both automatically receive a zero for the assignment. Penalty for violation of this Code can also be extended to include failure of the course and University disciplinary action.