Thursday, March 10, 2016

CS514 - DATA WAREHOUSING AND DATA MINING April May 2014

Anna University Questions, CS514 DATA WAREHOUSING AND DATA MINING April May 2014, Computer Science and Engineering (CSE), Eighth Semester, Regulation 2004




Exam: B.E/B.Tech. (Full Time) DEGREE END SEMESTER EXAMINATIONS
Academic Year: April May 2014
Subject Code: CS514
Subject Name: Data Warehousing and Data Mining
Branch: Computer Science and Engineering
Semester: Eighth Semester
Regulation: 2004

B.E / B.Tech. (Full Time) DEGREE END SEMESTER EXAMINATIONS, APRIL / MAY 2014
Computer Science and Engineering
Eighth Semester
CS514 DATA WAREHOUSING AND DATA MINING
(Regulation 2004)
Time : 3 Hours                      Answer ALL Questions                Max. Marks 100
PART-A (10 x 2 = 20 Marks)

1. What are the data mining task primitives?
2. What is metadata? Explain its types.
3. Distinguish between noise and outliers. Give suitable examples.
4. What are the advantages of dimensional modeling?
5. How do you perform discretizations of numeric attributes?
6. What is a lazy learner? Give an example.
7. What is rule based classification? Give an example.
8. Differentiate between density-based and grid-based clustering.
9. What are precision and recall?
10. What are the applications of data mining?

PART-B (5 x 16 = 80 Marks)

11. (i) What is data mining? Explain in detail with the KDD process. (10)
(ii) How are data warehouse systems different from operational database systems, and how are they similar? (6)

12 (a)
(i) What is data warehouse? Explain three-tier data warehouse architecture with a neat sketch. (8)
(ii) A data warehouse can be modeled by either a star schema or a snowflake schema. Briefly describe the similarities and the differences of the two models, and analyze their advantages and disadvantages. State which might be more empirically useful and give the reasons. (8)
(OR)
12 (b)
(i) How are concept hierarchies useful in OLAP? Explain OLAP operations in the multi-dimensional data model. (8)
(ii) Suppose that a data warehouse consists of the four dimensions date, spectator, location, and game, and the two measures count and charge, where charge is the fare that a spectator pays when watching a game on a given date. Spectators may be students, adults, or seniors, with each category having its own charge rate.
(1) Draw a star schema diagram for the data warehouse. (4)
(2) Starting with the base cuboid [date, spectator, location, game], what specific OLAP operations should one perform in order to list the total charge paid by student spectators at GM_Place in 2000? (4)

13 (a)
(i) List and discuss the steps for integrating a data mining system with a data warehouse. (10)
(ii) A data set for analysis includes only one attribute X = {7, 12, 5, 18, 5, 9, 13, 12, 19, 7, 12, 12, 13, 3, 4, 5, 13} (6)
(1) What are the mean and median of the data set X?
(2) Find the standard deviation for X.
(3) Find outliers in the data set X. Discuss the results.
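The arithmetic in 13 (a)(ii) can be cross-checked with a short Python sketch. The question does not say whether the population or the sample standard deviation is wanted, so both are computed, and the 1.5 x IQR fence used for outliers is an assumed rule, not one the paper prescribes.

```python
import statistics

# Data set X exactly as listed in question 13 (a)(ii).
X = [7, 12, 5, 18, 5, 9, 13, 12, 19, 7, 12, 12, 13, 3, 4, 5, 13]

mean_x = statistics.mean(X)        # 169 / 17
median_x = statistics.median(X)    # middle (9th) value of the sorted data
pop_std = statistics.pstdev(X)     # population standard deviation (divide by n)
sample_std = statistics.stdev(X)   # sample standard deviation (divide by n - 1)

# Assumption: flag outliers with the common 1.5 * IQR rule.
q1, _, q3 = statistics.quantiles(X, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in X if x < lower_fence or x > upper_fence]

print("mean:", round(mean_x, 3))
print("median:", median_x)
print("population std:", round(pop_std, 3))
print("sample std:", round(sample_std, 3))
print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

Under this assumed rule no value falls outside the fences, so a written answer would need to state which outlier criterion it uses before discussing the result.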
(OR)
13 (b)
(i) What is association rule mining? Apply the FP-Growth algorithm to find the strong association rules for the transactional database given below. (12)
TID   ITEMS
01    A, B, C, D
02    A, C, D, F
03    C, D, E, G
04    A, D, F, B
05    B, C, G
06    D, F, G
07    A, B, G
Analyze misleading associations for the rule set obtained in the above transactional database.
(ii) What are the essential differences between association rules and decision rules? (4)
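A hand-built FP-tree for 13 (b)(i) can be checked against a brute-force count. The sketch below is not FP-Growth: it simply enumerates itemsets and counts support, which is practical only because the database has seven transactions. The paper gives no thresholds, so MIN_COUNT = 3 and MIN_CONF = 0.7 are assumed values chosen for illustration. Lift is printed as one possible way to spot misleading rules (a lift close to 1 means the antecedent adds little).

```python
from itertools import combinations

# Transactions 01-07 from question 13 (b)(i).
transactions = [
    set("ABCD"), set("ACDF"), set("CDEG"), set("ADFB"),
    set("BCG"), set("DFG"), set("ABG"),
]
MIN_COUNT = 3   # assumption: minimum support count (not given in the paper)
MIN_CONF = 0.7  # assumption: minimum confidence (not given in the paper)


def frequent_itemsets(transactions, min_count):
    """Brute-force support counting; returns {itemset: support count}."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            count = sum(1 for t in transactions if set(combo) <= t)
            if count >= min_count:
                result[frozenset(combo)] = count
                found = True
        if not found:  # no frequent k-itemset means no larger one either
            break
    return result


def strong_rules(frequent, n_transactions, min_conf):
    """Return (antecedent, consequent, confidence, lift) for strong rules."""
    rules = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                lhs = frozenset(lhs)
                rhs = itemset - lhs
                conf = count / frequent[lhs]
                lift = conf / (frequent[rhs] / n_transactions)
                if conf >= min_conf:
                    rules.append((lhs, rhs, conf, lift))
    return rules


frequent = frequent_itemsets(transactions, MIN_COUNT)
rules = strong_rules(frequent, len(transactions), MIN_CONF)
for lhs, rhs, conf, lift in rules:
    print(sorted(lhs), "->", sorted(rhs),
          "conf=%.2f" % conf, "lift=%.2f" % lift)
```

With these assumed thresholds the frequent 2-itemsets are {A, B}, {A, D}, {C, D} and {D, F}, and no 3-itemset reaches the support count. Different thresholds will give a different rule set.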

14 (a)
(i) What are the prediction techniques supported by a data mining system? (4)
(ii) State the ID3 algorithm. Apply the same to construct a decision tree for the data given below. (12)

(OR)
14 (b) The data mining task is to cluster the following eight points into three clusters: A1(2,10), A2(2,5), A3(8,4), B1(5,8), B2(7,9), B3(6,4), C1(1,2), C2(5,9). The distance function is Euclidean distance. Suppose initially we assign A1, B1 and C1 as the center of each cluster, respectively. Apply the K-means algorithm to show the three cluster centers after the first round of execution and the final three clusters. (16)
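A hand computation for 14 (b) can be checked with the minimal K-means sketch below. It assumes the garbled coordinates in the question read B1(5,8) and B2(7,9), and it takes every other point as printed. The function names are illustrative only.

```python
from math import dist

# Points from question 14 (b); B1 and B2 are read as (5,8) and (7,9),
# which is an assumption about the garbled source text.
points = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4),
    "B1": (5, 8), "B2": (7, 9), "B3": (6, 4),
    "C1": (1, 2), "C2": (5, 9),
}
initial_centers = [points["A1"], points["B1"], points["C1"]]


def assign(points, centers):
    """Put each point into the cluster of its nearest center (Euclidean)."""
    clusters = [[] for _ in centers]
    for name, p in points.items():
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(name)
    return clusters


def update(points, clusters):
    """Recompute each center as the mean of its members.

    Assumes no cluster is empty, which holds for this data set.
    """
    return [
        tuple(sum(points[n][d] for n in members) / len(members)
              for d in range(2))
        for members in clusters
    ]


def kmeans(points, centers):
    """Iterate until the centers stop changing; keep the centers per round."""
    history = []
    while True:
        clusters = assign(points, centers)
        new_centers = update(points, clusters)
        history.append(new_centers)
        if new_centers == centers:
            return clusters, history
        centers = new_centers


clusters, history = kmeans(points, initial_centers)
print("centers after round 1:", history[0])
print("final clusters:", clusters)
```

With the coordinates read this way, the assignment does not change after the first round, so the second round only confirms convergence.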

15 (a)
(i) What is web mining? Differentiate between web content mining, web structure mining, and web usage mining. What is trend analysis? (12)
(ii) What are the differences between visual data mining and data visualization? (4)
(OR)
15 (b) What are the major challenges faced in bringing data mining research to market? Illustrate data mining issues that may have a strong impact on the market and society. Discuss how to approach such issues. (16)

************************






