BIA 660 Web Analytics
Course Catalog Description
Students must have programming experience. It is also highly recommended that students have taken Multivariate Data Analytics (BIA 652), Data & Knowledge Management (MIS 630), and Knowledge Discovery in Databases (MIS 637).
In this course, students will learn through hands-on experience how to extract data from the web and analyze web-scale data using distributed computing. Students will learn different analysis methods that are widely used across the range of internet companies, from start-ups to online giants like Amazon or Google. At the end of the course, students will apply these methods to answer a real scientific question.
Additional learning objectives include the development of:
- Data Collection and Preprocessing Skills: students will learn how to identify and profile candidate sources of valuable data, as well as how to automatically collect and manage the information they need for their analytics tasks.
- Diverse Analytic Skills: students will be exposed to a wide range of both quantitative and qualitative analytics techniques with applications across multiple business domains.
- Team Skills: students will be organized in teams and will collaborate on projects for the duration of the course. Each student will evaluate their teammates twice during the semester via a customized team survey tool. The tool provides a detailed analysis of each person’s contributions to the different stages of the team’s operation and will be used to promptly identify and address possible problems.
Homework: 50%, Midterm project: 20%, Final project: 20%, Class attendance and participation: 10%.
Midterm project:
Collect, clean, and organize online data from one or more websites of your choice. The deliverable includes a cleaned dataset, scripts for data scraping and cleaning, exploratory data analysis, and a short proposal on what insights can be derived from the dataset and what methods can be used to obtain them.
Final project:
Choose an important research question that emerges in the context of the dataset collected for the midterm project. Develop, apply, and record an analytics methodology to address your question. The entire project work will be presented in the last class.
Grading Scale:
|Letter grade|Score range|
|A|93 - 100|
|A-|90 - 92|
|B+|87 - 89|
|B|83 - 86|
|B-|80 - 82|
|C+|77 - 79|
|C|73 - 76|
|C-|70 - 72|
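As an illustration of how the component weights and the grading scale combine, the following Python sketch computes a weighted score and looks up the matching letter grade. The function names, the rounding to the nearest integer, and the handling of scores below 70 (which the scale does not list) are assumptions made for this example, not stated course policy.

```python
# Component weights in percent, as listed in the grade breakdown.
WEIGHTS = {
    "homework": 50,
    "midterm_project": 20,
    "final_project": 20,
    "participation": 10,
}

# Lower bound of each letter grade, highest first, from the grading scale.
SCALE = [
    (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
]


def weighted_score(scores):
    """Combine per-component scores (each on a 0-100 scale) into one score."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(score):
    """Map a numeric score to a letter grade.

    Assumption: the score is rounded to the nearest integer before lookup.
    Scores below 70 are not covered by the scale, so None is returned.
    """
    rounded = round(score)
    for lower_bound, letter in SCALE:
        if rounded >= lower_bound:
            return letter
    return None


# Hypothetical student scores, used only to show the calculation.
example = {
    "homework": 90,
    "midterm_project": 85,
    "final_project": 88,
    "participation": 100,
}
total = weighted_score(example)
print(total, letter_grade(total))
```

For the hypothetical scores above, the weighted total is 45 + 17 + 17.6 + 10 = 89.6, which under the rounding assumption falls in the 90 - 92 band.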
Recommended Readings: Links to free material will be provided in class.
|Week|Topic|
|1|Introduction & Python basics|
|3|Data scraping from websites|
|4|Data scraping from websites|
|5|Data scraping from API|
|14|Final project presentations|