Data Mining
EG 3212 CT | Total: 7 hours/week
Year: III | Lecture: 3 hours/week
Semester: VI | Tutorial: 1 hour/week
Practical: 3 hours/week
Course Introduction
Data Mining studies algorithms and computational paradigms that allow computers to find patterns and regularities in databases, perform prediction and forecasting, and generally improve their performance through interaction with data. The course covers these topics and illustrates the whole process with examples.
Objectives
The general objectives of this course are as follows:
• To introduce the concepts of data preprocessing and data mining
• To discuss multi-dimensional data representation and OLAP operations
• To develop skill in illustrating clustering, classification, and association rule mining algorithms
• To introduce advanced concepts of data mining
Course Contents:
Unit | Topics | Contents | Hours | Methods/Media | Marks
Unit 1 | Introduction to Data Mining | 5 hrs
1.1 Data Mining Concepts, KDD vs Data Mining, Data Mining System Architecture
1.2 Data Mining Functionalities, Kinds of Data on which Data Mining is Performed
1.3 Applications of Data Mining
Unit 2 | Data Warehouse and OLAP | 6 hrs
2.1 Data Warehouse Definition and Characteristics, DBMS vs Data Warehouse, Multi-dimensional Data, Data Cube, Cube Materialization
2.2 Data Warehouse Schemas: Star, Snowflake and Fact Constellation Schema
2.3 OLAP Operations: Roll-up, Drill-down, Slice and Dice, and Pivot Operations
2.4 OLAP Servers: ROLAP, MOLAP, HOLAP, Data Warehouse Architecture
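For orientation, the roll-up and slice operations named in 2.3 can be sketched on a tiny in-memory cube. This is a minimal illustration only: the sales rows, the dimension names, and the helper functions `roll_up` and `slice_cube` are all hypothetical, and Python is used purely for illustration (the lab work itself specifies SQL Server or Oracle).

```python
from collections import defaultdict

# Hypothetical fact rows: (year, quarter, city, product, units_sold).
DIMS = ("year", "quarter", "city", "product")
SALES = [
    ("2023", "Q1", "Kathmandu", "laptop", 10),
    ("2023", "Q1", "Pokhara", "laptop", 4),
    ("2023", "Q2", "Kathmandu", "phone", 7),
    ("2023", "Q2", "Pokhara", "phone", 3),
    ("2024", "Q1", "Kathmandu", "laptop", 6),
]

def roll_up(rows, keep):
    """Sum the measure over every dimension not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)

def slice_cube(rows, dim, value):
    """Fix one dimension to a single value, keeping the other dimensions."""
    i = DIMS.index(dim)
    return [row for row in rows if row[i] == value]

# Roll-up from (year, quarter, city, product) to year only.
by_year = roll_up(SALES, ["year"])
# Slice on city = Kathmandu, then roll up to product.
ktm_by_product = roll_up(slice_cube(SALES, "city", "Kathmandu"), ["product"])
print(by_year)
print(ktm_by_product)
```

Drill-down is the reverse of the roll-up shown here: calling `roll_up` with more dimensions in `keep` returns the finer-grained totals.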
Unit 3 | Data Preprocessing and DMQL | 6 hrs
3.1 Data Pre-processing Concepts
3.2 Data Cleaning, Data Integration, Data Transformation, Data Reduction
3.3 Data Discretization and Concept Hierarchy Generation
3.4 DMQL, Syntax of DMQL, Full Specification of DMQL
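Two of the preprocessing topics above, data transformation (3.2) and discretization (3.3), can be illustrated with a short sketch. Min-max normalization and equal-width binning are chosen here as common examples of each; the syllabus does not name specific techniques, and the function names and the `ages` data are hypothetical.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: map everything to new_min to avoid dividing by zero.
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]

def equal_width_bins(values, k):
    """Assign each value a bin index 0..k-1 using k intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0 for _ in values]
    # The maximum value would land in bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]

ages = [20, 30, 40, 50, 60]  # hypothetical attribute values
normalized = min_max_normalize(ages)
bins = equal_width_bins(ages, 2)
print(normalized)
print(bins)
```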
Unit 4 | Clustering | 6 hrs
4.1 Introduction to Clustering, Distance Measures, Categories of Clustering Algorithms
4.2 K-means and K-medoid Algorithms
4.3 Agglomerative Clustering, Concept of Divisive Clustering
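As a rough sketch of the K-means algorithm listed in 4.2, the code below alternates an assignment step and a centroid-update step until the centroids stop changing. It assumes Euclidean distance and caller-supplied initial centroids; the sample points and the function name `kmeans` are hypothetical, and Python is used only for illustration (the lab work specifies Weka).

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Basic K-means: assign points to the nearest centroid, then recompute centroids."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid becomes the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(coords) / len(members) for coords in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, [(1, 1), (8, 8)])
print(centroids)
print(clusters)
```

K-medoid differs in the update step: instead of the mean, each cluster's representative is an actual member point that minimizes total distance to the other members.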
Unit 5 | Classification and Prediction | 8 hrs
5.1 Concept of Classification and Clustering, Evaluating Classification Algorithms
5.2 Bayesian Classification, Decision Tree Classification, Concept of Entropy
5.3 Linear Regression, Concept of Non-linear Regression
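The concept of entropy in 5.2 is usually introduced through decision tree induction, where an attribute is scored by how much it reduces the entropy of the class labels (information gain). The sketch below shows that calculation under the assumption of categorical attributes; the weather-style rows and the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """Entropy of the labels minus the weighted entropy after splitting on one attribute."""
    n = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(part) / n * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder

# Hypothetical training data: (outlook, temperature) -> play?
rows = [("sunny", "hot"), ("sunny", "cool"), ("rainy", "hot"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]

gain_outlook = information_gain(rows, labels, 0)      # outlook separates the classes fully
gain_temperature = information_gain(rows, labels, 1)  # temperature tells us nothing
print(entropy(labels), gain_outlook, gain_temperature)
```

A decision tree learner that uses this criterion would split on the attribute with the highest gain, here the first one.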
Unit 6 | Association Rule Mining | 8 hrs
6.1 Frequent Patterns, Association Rule, Concept of Support and Confidence
6.2 Apriori Property, Apriori Algorithm, Generating Association Rules
6.3 FP-growth Algorithm, FP-tree, Generating Association Rules
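Support and confidence (6.1) can be made concrete with a few lines of code. The sketch below computes both for a rule over a small set of transactions; the market-basket data and the function names are hypothetical, and Python is used only for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """support(antecedent and consequent together) / support(antecedent).

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical market-basket transactions.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

sup_bread_milk = support(TRANSACTIONS, {"bread", "milk"})
conf_bread_to_milk = confidence(TRANSACTIONS, {"bread"}, {"milk"})
print(sup_bread_milk, conf_bread_to_milk)
```

The Apriori property in 6.2 follows from the definition of `support`: adding an item to an itemset can never increase the number of transactions that contain it, so every superset of an infrequent itemset is also infrequent.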
Unit 7 | Advanced Data Mining | 6 hrs
7.1 Information Retrieval, Measuring Effectiveness of Information Retrieval
7.2 Concept of Time-Series Data and Analysis, Image and Video Retrieval
7.3 Concept of Support Vector Machine and Deep Learning
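Retrieval effectiveness (7.1) is commonly measured with precision, recall, and their harmonic mean, the F1 score. The syllabus does not list specific measures, so these three are an assumption about what the topic covers; the document IDs and the function name are hypothetical.

```python
def precision_recall_f1(retrieved, relevant):
    """Precision, recall, and F1 for one query, given retrieved and relevant document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were actually retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1

# Hypothetical query result: 4 documents retrieved, 3 documents truly relevant.
precision, recall, f1 = precision_recall_f1(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d3", "d5"],
)
print(precision, recall, f1)
```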
Unit 8 | Laboratory Work | 45 hrs
Perform the following:
1. Design a data warehouse using SQL Server or Oracle
2. Implement OLAP operations
3. Implement the clustering algorithms K-means and K-medoid using Weka
4. Implement the classification algorithms Naïve Bayes and decision trees using Weka
5. Implement regression algorithms using Weka
6. Implement association mining algorithms using Weka
Recommended Books
1. Jiawei Han, Micheline Kamber, Jian Pei; Data Mining: Concepts and Techniques, Morgan Kaufmann Publication, 3rd Edition, 2011
References
2. Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, Vipin Kumar; Introduction to Data Mining, Pearson Publication, First Edition, 2013
3. Charu C. Aggarwal; Data Mining: The Textbook, Springer Nature Publication, First Edition, 2015