BIA660 Web Analytics



Course Catalog Description

Introduction

Prerequisites:

Students must have programming experience. It is also highly recommended that students have taken Multivariate Data Analytics (BIA 652), Data & Knowledge Management (MIS 630), and Knowledge Discovery in Databases (MIS 637).


Campus Fall Spring Summer
On Campus
Web Campus

Instructors

Professor Email Office
Rong Liu rliu20@stevens.edu Babbio 619
Jingyi Sun jsun54@stevens.edu
Theodoros Lappas tlappas@stevens.edu
Jordan Suchow jsuchow@stevens.edu Babbio 634

Course Outcomes

In this course, students will learn through hands-on experience how to extract data from the web and analyze web-scale data using distributed computing. Students will learn different analysis methods that are widely used across the range of internet companies, from start-ups to online giants like Amazon or Google. At the end of the course, students will apply these methods to answer a real scientific question.

Additional learning objectives include the development of:

  • Data collection and preprocessing skills: students will learn how to identify and profile candidate sources of valuable data, as well as how to automatically collect and manage the information they need for their analytics tasks.
  • Diverse analytic skills: students will be exposed to a wide range of both quantitative and qualitative analytics techniques with applications across multiple business domains.
  • Team skills: students will be organized into teams and collaborate on projects for the duration of the course. Each student will evaluate their teammates twice during the semester via a customized team survey tool. The tool provides a detailed analysis of a person’s contributions to the different stages of the team’s operation and will be used to promptly identify and address possible problems.

Grading

Grading Policies

Grading Percentages:

Homework: 50%, Midterm project: 20%, Final project: 20%, Class attendance and participation: 10%.

Midterm project:

Collect, clean, and organize online data from one or more websites of your choice. The deliverables include a cleaned dataset, scripts for data scraping and cleaning, an exploratory data analysis, and a short proposal on what insights can be derived from the dataset and which methods can be used to obtain them.
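As a rough illustration of the collect-and-clean step, the following is a minimal sketch rather than a required approach. It uses only the Python standard library, and everything specific in it is an assumption made for the example: the HTML fragment, the `review` class, the `data-rating` attribute, and the names `ReviewParser` and `clean` are all hypothetical. A real project would download pages from the chosen website and adapt the parsing to that site's markup.

```python
from html.parser import HTMLParser

# Hypothetical page fragment standing in for a downloaded web page.
SAMPLE_HTML = """
<ul>
  <li class="review" data-rating="5">  Great   product!  </li>
  <li class="review" data-rating="2">Too expensive</li>
  <li class="review" data-rating="5">  Great   product!  </li>
  <li class="ad">Buy now</li>
</ul>
"""


class ReviewParser(HTMLParser):
    """Collect the text and rating of every <li class="review"> element."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._rating = None  # rating of the review currently being read, if any
        self._chunks = []    # text pieces seen inside the current review

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "review":
            self._rating = attrs.get("data-rating")
            self._chunks = []

    def handle_data(self, data):
        if self._rating is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "li" and self._rating is not None:
            self.rows.append({"rating": self._rating, "text": "".join(self._chunks)})
            self._rating = None


def clean(rows):
    """Normalize whitespace, convert ratings to int, and drop exact duplicates."""
    seen, cleaned = set(), []
    for row in rows:
        record = (int(row["rating"]), " ".join(row["text"].split()))
        if record not in seen:
            seen.add(record)
            cleaned.append({"rating": record[0], "text": record[1]})
    return cleaned


parser = ReviewParser()
parser.feed(SAMPLE_HTML)
dataset = clean(parser.rows)
print(dataset)
```

In this sample the raw scrape yields three review rows, and cleaning reduces them to two because one review appears twice. The advertisement item is skipped because it does not carry the hypothetical `review` class.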

Final project:

Choose an important research question that emerges in the context of the dataset collected for the midterm project. Develop, apply and record an analytics methodology to address your question. The entire project work will be presented in the last class.

Grading Scale:
Grade Score
A 93 - 100
A- 90 - 92
B+ 87 - 89
B 83 - 86
B- 80 - 82
C+ 77 - 79
C 73 - 76
C- 70 - 72
F < 70
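
For illustration only, the sketch below shows how the grading percentages and the grading scale could be combined into a final letter grade. The weights and cutoffs come from this syllabus. The component scores are made-up example values, and comparing fractional totals against the lower bounds without rounding is an assumption, since no rounding policy is stated here.

```python
# Weights from the "Grading Percentages" line of this syllabus.
WEIGHTS = {
    "homework": 0.50,
    "midterm_project": 0.20,
    "final_project": 0.20,
    "participation": 0.10,
}

# Lower bound of each letter grade from the "Grading Scale" table.
# Assumption: fractional totals are compared against these bounds
# without rounding; the syllabus does not state a rounding policy.
SCALE = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
]


def weighted_score(scores):
    """Combine component scores (each on a 0-100 scale) into one total."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


def letter_grade(total):
    """Map a 0-100 total to a letter grade using the lower bounds in SCALE."""
    for lower_bound, letter in SCALE:
        if total >= lower_bound:
            return letter
    return "F"


# Hypothetical component scores for one student.
example = {
    "homework": 92,
    "midterm_project": 85,
    "final_project": 88,
    "participation": 100,
}
total = weighted_score(example)
print(round(total, 1), letter_grade(total))  # prints: 90.6 A-
```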

Lecture Outline

Recommended Readings: Links to free material will be provided in class.

Lecture Topic
1 Introduction & Python basics
2 Python basics
3 Data scraping from websites
4 Data scraping from websites
5 Data scraping from APIs
6 Text preprocessing
7 Text preprocessing
8 Classification
9 Classification
10 Clustering
11 Topic modeling
12 Sentiment
13 Word embeddings
14 Final project presentations