
Advanced Analytics - Course Outline

This document outlines an advanced analytics course on mining unstructured data from text, web, and images. The 20-session course will cover topics like text mining, web structure and content mining, sentiment analysis, and image classification. Students will learn techniques for analyzing large amounts of unstructured data and applying analytics tools to real-world problems and case studies in order to strategize based on mining results.

Uploaded by: Saksham Goyal

Course: Advanced Analytics

Faculty: Prof. V. Nagadevara, [email protected]


Term: Term 6 (2015-17); Pre-requisites: Core courses.
No. of sessions: 20 sessions of 75 minutes each.
Textbook: Bing Liu, “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data”,
2nd ed., Springer, 2011.

In addition, a number of journal articles will be supplied for additional reading.

Course Objective
This course is planned to provide students with:

• an understanding of basic concepts and methods in text mining, such as document
  representation, information extraction, text classification and clustering;
• the ability to use benchmark corpora and commercial and open-source text analysis and
  visualization tools to explore interesting patterns in textual data;
• an understanding of various techniques and algorithms (such as support vector machines and
  naïve Bayes) for advanced text mining, text classification and clustering, and opinion mining,
  and of their applications in real-world problems;
• knowledge of the various components of web mining, such as web structure mining, web
  content mining and web usage mining;
• familiarity with the basic concepts involved in image mining, including image processing and
  classification.

Given the large amounts of unstructured data flooding the Internet, mining high-quality information
from text and the web is becoming increasingly critical. The actionable knowledge extracted from text
data facilitates effective decision making in a broad spectrum of areas, including business
intelligence, information acquisition, social behaviour analysis and strategization. This course will
cover important topics in text mining, web mining and image mining, leading to text and web
analytics. Students will also be exposed to the use of software for text and web mining. The course
emphasises the use of techniques for different aspects of text and web mining, and strategization
based on the mining results. To make the course more relevant and practice-oriented, participants
will use the popular analytics package WEKA and apply the techniques to different corpora (text
databases).

Each topic will be supported by a real-life case study or application.

Course Outline

| S No. | Topic | Reading material | Application |
|-------|-------|------------------|-------------|
| 1 | Introduction to mining unstructured data | Chapter 6 | |
| 2 and 3 | Natural language processing and document representation | Article on NLP, Chapter 6 | |
| 4 | Classification Techniques for Textual Documents | Chapters 3.2 and 3.3 | Automated patent classification |
| 5 | Support Vector Machines for Classification | Chapter 3.8 | Sentiment analysis based on sports forum |
| 6 | Naïve Bayes method for classification | Chapters 3.6 and 3.7 | Extraction of Product Attributes |
| 7 | Clustering techniques for mining text data | Chapters 4.1 to 4.4 | Automatic labelling of hierarchical clusters |
| 8 | Latent Semantic Analysis | Chapter 6.7 | Documentation-to-Source-Code Traceability |
| 9 | Latent Semantic Analysis | | |
| 10 | Sentiment analysis | Chapter 11.1 | Sentiment analysis of Movie reviews |
| 11 | Use of WEKA for Text Mining | | |
| 12 | Text Summarization techniques | Chapter 11.2 | Source Code summarization |
| 13 | Text Summarization Techniques (Contd.) | | |
| 14 | Introduction to web mining | Hand-out | Analyzing e-Commerce website - Flipkart |
| 15 | Web structure mining | Chapter 8 | Effectiveness of web search engines |
| 16 | Web content mining | Chapters 9 and 10 | Case – Analyzing Customer Reviews (audio) |
| 17 | Web usage mining | Chapter 12 | Making automated recommendations and personalization |
| 18 | Click stream analysis | Chapter 12 | Analysis of Matchmaking website |
| 19 | Image mining, Image processing and Image classification | Hand-out | Image mining and classification |
| 20 | Project Presentations | | |

Some suggested readings
(Copies will be made available as a part of the reading material)

1. Farhad Soleimanian Gharehchopogh and Zeinab Abbasi Khalifehlou, Analysis and evaluation
of unstructured data: Text mining versus natural language processing, IEEE Xplore,
DOI: 10.1109/ICAICT.2011.6111017

2. Andreas Hotho, A Brief Survey of Text Mining, available at
http://www.kde.cs.uni-kassel.de/hotho/pub/2005/hotho05TextMining.pdf

3. James Pustejovsky and Branimir Boguraev, Lexical knowledge representation and natural
language processing, Artificial Intelligence 63 (1993) 193–223

4. C. J. Fall et al., Automated Categorization in the International Patent Classification

5. Nan Li and Desheng Dash Wu, Using text mining and sentiment analysis for online forums
hotspot detection and forecast

6. Rayid Ghani et al., Text Mining for Product Attribute Extraction

7. Pucktada Treeratpituk and Jamie Callan, Automatically Labeling Hierarchical Clusters

8. Andrian Marcus and Jonathan I. Maletic, Recovering Documentation-to-Source-Code
Traceability Links using Latent Semantic Indexing

9. Sonia Haiduc et al., On the Use of Automated Text Summarization Techniques for
Summarizing Source Code, IEEE, 2010, DOI: 10.1109/WCRE.2010.13

10. Rudy Prabowo and Mike Thelwall, Sentiment analysis: A combined approach, Journal of
Informetrics, March 2013

11. Alexander Pak and Patrick Paroubek, Twitter as a Corpus for Sentiment Analysis and Opinion
Mining, available at http://lexitron.nectec.or.th/public/LREC-010_Malta/pdf/385_Paper.pdf

12. Anurag Bejju, Sales Analysis of E-Commerce Websites using Data Mining Techniques,
International Journal of Computer Applications (0975 – 8887), Volume 133 – No. 5, January
2016

13. Bernard J. Jansen and Paulo R. Molina, The effectiveness of Web search engines for retrieving
relevant ecommerce links, Information Processing and Management 42 (2006) 1075–109

14. Nadine Höchstötter and Dirk Lewandowski, What users see – Structures in search engine
results pages, Information Sciences 179 (2009) 1796–1812

15. Khribi, M. K., Jemni, M., & Nasraoui, O. (2009). Automatic Recommendations for E-Learning
Personalization Based on Web Usage Mining Techniques and Information Retrieval.
Educational Technology & Society, 12 (4), 30–42.

16. Vishnuprasad Nagadevara, Jacob Paracka and Sudarshan K T, Mapping Visitors’ Behavior to
Business Goals through Click Stream Analysis

17. Badrish Chandramouli, Jonathan Goldstein and Songyun Duan, Temporal Analytics on Big Data
for Web Advertising

18. Antonio Plaza et al., Recent advances in techniques for hyperspectral image processing,
Remote Sensing of Environment 113 (2009) S110–S122
Learning Outcomes
At the end of this course, a participant should be able to:
• Mine and analyse text, web and image data
• Identify the appropriate techniques for analysing data drawn from text, web and image
  sources
• Extract sentiment from social media and strategize based on sentiment analysis
• Use an open-source mining software package
• Strategize based on the analysis of unstructured data

Evaluation:

1. Term Paper (Individual) 30%
2. End term (Take home individual) 30%
3. Project (Group) 40%
