What are the user interaction issues related to data mining methodology?

Data mining methodology raises several user interaction issues, which are as follows −

Mining different kinds of knowledge in databases − Different users can be interested in different kinds of knowledge. Data mining must therefore cover a broad spectrum of data analysis and knowledge discovery tasks, including data characterization, discrimination, association, classification, clustering, trend and deviation analysis, and similarity analysis.

Interactive mining of knowledge at multiple levels of abstraction − Because it is difficult to know in advance exactly what can be found within a database, the data mining process must be interactive. Interactive mining lets users focus the search for patterns, issuing data mining requests and refining them based on the returned results. This allows the user to view data and discover patterns at several granularities and from multiple angles.
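The idea of viewing the same data at several granularities can be sketched in a few lines of Python. This is only an illustration: the sales records, the city-to-country concept hierarchy, and the function name count_by_level are all hypothetical and do not come from any particular data mining system. The user first asks for a coarse summary, then refines the request to look at one country in finer detail.

```python
from collections import Counter

# Hypothetical concept hierarchy: city -> country (illustrative data only).
CITY_TO_COUNTRY = {
    "Vancouver": "Canada",
    "Toronto": "Canada",
    "Chicago": "USA",
    "New York": "USA",
}

# Hypothetical sales records as (city, item) pairs.
SALES = [
    ("Vancouver", "laptop"),
    ("Toronto", "laptop"),
    ("Chicago", "phone"),
    ("New York", "laptop"),
    ("Chicago", "laptop"),
    ("Toronto", "phone"),
]


def count_by_level(records, level):
    """Count sales per (location, item) at the requested abstraction level."""
    if level not in ("city", "country"):
        raise ValueError(f"unknown level: {level}")
    counts = Counter()
    for city, item in records:
        location = city if level == "city" else CITY_TO_COUNTRY[city]
        counts[(location, item)] += 1
    return counts


# First request: a coarse view at the country level.
by_country = count_by_level(SALES, "country")

# Refined request: restrict to one country and view it at the city level.
canada_only = [r for r in SALES if CITY_TO_COUNTRY[r[0]] == "Canada"]
by_city_in_canada = count_by_level(canada_only, "city")
```

The second request is built from what the first one returned, which is the interactive loop described above in miniature.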

Incorporation of background knowledge − Domain knowledge associated with databases, including integrity constraints and deduction rules, can help target and speed up a data mining process, or judge the interestingness of identified patterns.

Data mining query languages and ad-hoc data mining − A high-level data mining query language needs to be developed that can be integrated with a database or data warehouse query language. Such a language should let users define ad-hoc data mining tasks by specifying the relevant sets of data for analysis, the domain knowledge, the kinds of knowledge to be mined, and the conditions and interestingness constraints to be enforced on the discovered patterns.

Presentation and visualization of data mining results − The discovered knowledge should be expressed in high-level languages, visual representations, or other expressive forms so that it can be easily understood and directly used by humans.

Handling outliers or incomplete data − The data stored in a database can contain noise, exceptional cases, or incomplete data objects, which can reduce the accuracy of the discovered patterns. Data cleaning methods and data analysis methods that can handle outliers are needed.
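As a rough illustration of such cleaning steps, the sketch below fills in missing values with the median and then flags outliers using the common interquartile range (IQR) rule. These are just two simple techniques chosen for the example, not the only or the recommended ones; the function names, the sample ages list, and the multiplier k=1.5 are assumptions made here.

```python
import statistics


def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical attribute with two missing entries and one suspicious value.
ages = [23, 25, None, 27, 24, 26, 250, None, 25]

filled = impute_missing(ages)    # None entries replaced by the median
outliers = iqr_outliers(filled)  # values far outside the middle 50%
```

The median is used for imputation because, unlike the mean, it is barely affected by the extreme value 250.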

Pattern evaluation − A data mining system can uncover hundreds of patterns. Many of the discovered patterns can be uninteresting to a given user because they represent common knowledge or lack novelty. Using interestingness measures to guide the discovery process and reduce the search space is an active area of research.
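Support, confidence, and lift are three widely used interestingness measures for association rules. The sketch below computes them for a single rule over a small, made-up set of transactions; the function names and the data are illustrative assumptions, not part of any specific tool.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def rule_measures(transactions, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    s_both = support(transactions, set(antecedent) | set(consequent))
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    if s_a == 0 or s_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = s_both / s_a
    lift = confidence / s_c
    return {"support": s_both, "confidence": confidence, "lift": lift}


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

measures = rule_measures(transactions, {"bread"}, {"milk"})
```

Here the rule bread -> milk has a lift below 1, meaning milk appears less often with bread than it does overall, so a system filtering on lift could discard the rule as uninteresting even though its support is fairly high.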

Parallel, distributed, and incremental updating algorithms − The large size of many databases, the wide distribution of data, and the computational complexity of some data mining methods are factors motivating the development of parallel and distributed data mining algorithms. Incremental algorithms, in turn, fold database updates into the existing results instead of mining the entire database again from scratch.
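The incremental idea can be illustrated with a deliberately simplified sketch: a counter of item pairs that is updated as each new batch of transactions arrives, so earlier batches never need to be rescanned. The class name IncrementalPairCounter and the sample batches are hypothetical, and keeping every pair count in memory is an assumption that only works for small item sets; practical incremental mining algorithms are considerably more careful about which candidates they track.

```python
from collections import Counter
from itertools import combinations


class IncrementalPairCounter:
    """Maintains co-occurrence counts of item pairs across batches."""

    def __init__(self):
        self.pair_counts = Counter()
        self.n_transactions = 0

    def update(self, batch):
        """Fold a new batch of transactions into the running counts."""
        for transaction in batch:
            self.n_transactions += 1
            # Sort so that each pair has one canonical (a, b) form.
            for pair in combinations(sorted(set(transaction)), 2):
                self.pair_counts[pair] += 1

    def frequent_pairs(self, min_support):
        """Return pairs whose support is at least min_support."""
        if self.n_transactions == 0:
            return {}
        return {
            pair: count / self.n_transactions
            for pair, count in self.pair_counts.items()
            if count / self.n_transactions >= min_support
        }


# Hypothetical batches arriving at different times.
batch1 = [["a", "b"], ["a", "c"]]
batch2 = [["a", "b", "c"], ["b", "c"]]

counter = IncrementalPairCounter()
counter.update(batch1)           # mine the initial data
counter.update(batch2)           # later: only the new batch is processed
frequent = counter.frequent_pairs(0.5)
```

Processing the two batches one after the other gives the same counts as processing all transactions at once, which is the property an incremental algorithm has to preserve.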