Big Data Technology 2
Technologies
2024.9
School of Computer Science & Technology,
Beijing Institute of Technology
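Teaching Team
• Since February 2020: China University MOOC (7,943 students enrolled)
• Launched on Xuexi Qiangguo (499,595 views)
• "Rongyou Xuetang" platform of the Beijing association for high-quality university courses (21,123 students enrolled)
• Led the founding of the "Big Data Course Group" construction alliance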
Grading Structure
• MOOC/Online Courses: 10% (Discussion)
• Course Practice: 40% (Assignments: 20 points,
Case Study: 20 points)
• Final Exam: 50% (English version for
international students)
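The weighting above can be sketched as a small calculation. This is only an illustration: it assumes every component is first scored on a 0-100 scale and then weighted, which the slides do not state, and the names `WEIGHTS` and `final_grade` are hypothetical.

```python
# Weights in percent, taken from the grading structure above.
# Assumption: each component is scored on a 0-100 scale before weighting.
WEIGHTS = {
    "mooc": 10,         # MOOC / online discussion
    "assignments": 20,  # Course practice: assignments
    "case_study": 20,   # Course practice: case study
    "final_exam": 50,   # Final exam
}


def final_grade(scores):
    """Return the weighted course grade (0-100) for a dict of component scores."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= scores[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


example = {"mooc": 80, "assignments": 90, "case_study": 70, "final_exam": 60}
print(final_grade(example))
```

Under this assumption, the final exam alone accounts for half of the result, so the example student's grade is dominated by the exam score of 60.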
Coursework Requirements
• Find your own data, design your own scenario (dataset, storage and
processing, desensitization, mining, visualization).
• Large models should be used.
• English classes and international students: The final project must be
presented with an English PPT, documented in an English course design
report, and include a working software prototype.
• Chinese classes: The final project can be presented with an English PPT,
documented in an English course design report, and include a working
software prototype.
• General elective classes and Yanhe Alliance classes: The final project can
be presented with an English PPT, documented in an English course design
report, and include a working software prototype.
• Generative models like ChatGPT or OpenSource GPT should be used in
your project work.
• Experience code generation and sharing
• Copilot, Starcoder2, CodeGeeX, CodeFuse, etc.
Coursework and Submission
• English Classes and International Students:
Submit an English PPT, course design report in
English, and a working software prototype.
• Chinese Classes: Submit in English (PPT and
report) with a working software prototype.
• General and Yanhe Classes: Similar
requirements, but software is optional.
Coursework and Submission
• Proposal (Week 2), Midterm Checkpoint (Week 5), Final Submission
(Week 7 pre-submission, Week 8 final review).
• Platform for submission:
• https://lexue.bit.edu.cn/course/view.php?id=15701
Course Registration
• Chinese Class, General Elective Class, Yanhe Alliance (students from 9 schools):
https://www.icourse163.org/course/BIT-1449814164
• English Class:
https://www.livedu.com.cn/ispace4.0/moocxjkc/toKcView.do?kcid=9AA774B48FA4E906E0501B0ADF4F0B2A
• International Students: need not use the platform.
Ten Questions and Ten Answers
• Each student is required to complete 'Ten
Questions and Ten Answers'.
Experiment Datasets
• Dataset 1: Alibaba Tianchi Competition
https://tianchi.aliyun.com/competition/entrance/531830/information
• Dataset 2: Kaggle Hotel Booking Demand
https://www.kaggle.com/jessemostipak/hotel-booking-demand?select=hotel_bookings.csv
• Dataset 3: Stanford SNAP Dataset
http://snap.stanford.edu/data/web-Google.txt.gz
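As a starting point for Dataset 3, the following minimal sketch reads a SNAP-style edge list and counts out-links per node. It assumes the usual SNAP text layout (comment lines beginning with `#`, then one whitespace-separated `FromNodeId ToNodeId` pair per line). The function names and the three-edge sample are made up for illustration and are not taken from the course material.

```python
import gzip
from collections import Counter


def parse_edges(lines):
    """Yield (source, target) integer pairs, skipping comments and blank lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        src, dst = line.split()[:2]
        yield int(src), int(dst)


def out_degrees(edges):
    """Count outgoing edges per source node."""
    return Counter(src for src, _ in edges)


def read_snap_gz(path):
    """Stream edges from a gzip-compressed edge list without unpacking it to disk."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        yield from parse_edges(fh)


# Tiny in-memory sample (hypothetical edges) so the sketch runs without a download.
sample = [
    "# FromNodeId\tToNodeId",
    "0\t11342",
    "0\t824020",
    "11342\t0",
]
degrees = out_degrees(parse_edges(sample))
print(degrees.most_common(2))
```

For the real file, `out_degrees(read_snap_gz("web-Google.txt.gz"))` would process the edges as a stream after the archive has been downloaded locally; the local file name here is an assumption.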
Group Work and Submission
• Teams: 1-2 people per group, with clear
responsibilities.
• Submit voice-recorded PPTs and project
demos via the online platform.
Submission Timelines
• Proposal: By Week 2
• Midterm Checkpoint: By Week 5
• Final Submission: Week 7 (pre-submission),
Week 8 (final review)
Proposal Requirements
• Cover Page: Project Title, Name (including group members)