Inductive Logic Programming For
ABSTRACT
Advanced monitoring systems for industrial processes sit at a higher level than control systems and use specific techniques and methods. An important part of the supervision task is the detection and diagnosis of the various faults that can affect the process. Methods of fault detection and diagnosis (FDD) differ in the type of process knowledge they require. They can be classified as data-driven, analytical, or knowledge-based approaches. A collaborative FDD approach that combines the strengths of various heterogeneous FDD methods can maximize diagnostic performance. The new generation of knowledge-based systems or decision support systems needs to tap into knowledge that is very broad yet specific to a domain, combining learning, structured representations of domain knowledge such as ontologies, and reasoning tools. In this paper, we present a decision-aid tool for malfunctions of a high-power industrial steam boiler. For this purpose, an ontology was developed and used as prior conceptual knowledge in Inductive Logic Programming (ILP) for inducing diagnosis rules. The next step of the process concerns the inclusion of the rules acquired by induction in the knowledge base as well as their exploitation for reasoning.
KEYWORDS
Inductive Logic Programming (ILP); SHIQ+log; Hybrid Reasoning; Semantic Web Technologies; Control System; Knowledge Management.
1. INTRODUCTION
The increasing demand for quality, safety, availability and cost optimization of industrial processes requires the use of advanced supervision systems. Traditionally, this aspect of supervision was the responsibility of human operators, possibly assisted by a set of sensors and detectors. Nevertheless, this apparatus was set up to control the process, not to detect and diagnose faults. Consequently, the development of new approaches is essential for robust monitoring. In this context, many approaches have been developed for fault detection and diagnosis (cf. 2.3). They include data-driven, analytical, and knowledge-based approaches [1]. Each of these methods has its strengths and weaknesses, so combining complementary methods is an effective way to achieve high performance.
Dhinaharan Nagamalai et al. (Eds): CSE, DBDM, CCNET, AIFL, SCOM, CICS, CSIP - 2014, pp. 229-241, 2014. CS & IT-CSCP 2014. DOI: 10.5121/csit.2014.4420
Since supervision models depend on disparate information drawn from distributed sources, shared semantics based on a common ontology offers a way to develop these linkages. We address this critical need in this paper. Ontologies are a suitable formal representation for conveying this complex knowledge, but their use in learning algorithms is still a research issue. Inference rules may be crafted by the domain expert as part of the ontology design, or learned automatically by machine learning techniques. We focus on the latter case, treating rule learning as a generic component that can easily be adapted to new domains. However, as opposed to previous approaches, learning takes place in the ontology language to produce deductive diagnosis rules, which is possible with inductive logic programming (ILP). We also propose a framework that allows the rules acquired by induction to coexist with the ontology and to be exploited together with it for reasoning and search.
The use of large steam boilers is quite common in industry due to their advantageous features [2][3]. However, such facilities are subject to several operating failures that could seriously endanger the structural integrity of the system and cause huge economic losses and loss of human life. Early detection of such faults during operation is of great importance: it helps reduce possible damage to equipment and the productivity loss caused by (otherwise) unscheduled boiler shut-down, and it also ensures safe operation of the systems.
The paper is organized as follows: first, a discussion of current methods of fault diagnosis is presented. Then, the proposed fault diagnosis system is developed. After discussing the system structure, the main steps of the designed methodology are described in detail.
2. APPLICATION FIELD
To make natural gas practical and commercially viable to transport from one country to another, its volume has to be greatly reduced. To obtain maximum volume reduction, the gas has to be liquefied (condensed) by refrigeration to less than -161 °C. This process also requires very strict safety measures and precautions during all liquefaction stages, due to the flammable nature of the gas involved. The LNG (Liquefied Natural Gas) plant (GL-1K complex) is located 5 km east of Skikda, Algeria. It has an area of 92 hectares and has been in production since the early 1970s. Gas is sourced from the Hassi R'mel fields, which also supply the Arzew plants. The plant, which is owned and operated by Sonatrach, the Algerian state-owned oil and gas company, had grown to six trains by the 1990s, with the last of these commissioned in 1981. An LNG train is a liquefied natural gas plant's liquefaction and purification facility. Each LNG plant consists of one or more trains that compress natural gas into liquefied natural gas. A typical train consists of a compression area, a propane condenser area, and methane and ethane areas. All the trains received upgrades in the 1990s to bring them up to the required specifications, and the plant was capable of producing 7.68 million tons of LNG per year.
generator and the main superheated steam line. Figure 1 shows a schematic representation of the steam generator.
The integrated control and monitoring systems include various safety systems designed to prevent postulated damage during normal and abnormal transients. The top of the steam drum is equipped with three safety valves that control system pressure. Two other safety valves are installed on the main superheated steam line and there are four isolation valves and four flapper valves in different locations of the steam boiler facility.
Monitoring manufacturing plants is a complex task. On the one hand, we are interested in monitoring and decision-aiding in case of malfunction of the production facility. On the other hand, a malfunction of the plant leads to some higher-level problems: in such a case, we do not have any formal model of the plant, of the process, or of the malfunction that occurred. For such high-risk production, the most efficient way to handle a malfunction at the moment is to stop production, which entails a financial impact. Our system allows knowledge about the production facility to be extracted during malfunction situations. Later, this knowledge may be used during a decision-aid step.
processing, and scheduling. Twenty-three software toolboxes were developed during the project. MAGIC (Multi-Agent-Based Diagnostic Data Acquisition and Management in Complex Systems) [8] was developed by a joint venture of several European universities and companies. The MAGIC system consists of several model-based and cause-effect diagnostic agents and a process specification agent that specifies the process to be monitored and diagnosed. Depending on the process specifications, the appropriate data and knowledge acquisition is performed by another agent. A diagnostic decision agent and a diagnostic support agent propose a final diagnostic decision, which is displayed with other information by an operator interface agent. The MAGIC system prototype was developed for the metal processing industry. However, the knowledge used by the control systems mentioned above is not available in structured formats. For this reason, the new generation of decision support systems needs to tap into knowledge that is very broad, combining learning, structured representations of domain knowledge such as ontologies, and reasoning tools. Our work falls within this context.
DL-HCL representation for both hypotheses and background knowledge: CARIN-ALN, resorts to AL-log, and builds upon SHIQ+log.
3.2.1. Learning in CARIN-ALN
The framework proposed in [13] focuses on discriminant induction and adopts the ILP setting of learning from interpretations. Hypotheses are represented as CARIN-ALN non-recursive rules with a Horn literal in the head that plays the role of target concept. The coverage relation of hypotheses against examples adapts the usual one in learning from interpretations to the case of hybrid CARIN-ALN background knowledge (BK). The generality relation between two hypotheses is defined as an extension of generalized subsumption. Procedures for testing both the coverage relation and the generality relation are based on the existential entailment algorithm of CARIN.
3.2.2. Learning in AL-log
In [11], hypotheses are represented as constrained Datalog clauses that are linked, connected (or range-restricted), and compliant with the bias of Object Identity (OI). The literal in the head of a hypothesis represents a concept to be either discriminated from others (discriminant induction) or characterized (characteristic induction). The generality relation for this hypothesis language is an adaptation of generalized subsumption, named B-subsumption, to the AL-log KR framework. It gives rise to a quasi-order and can be checked with a decidable procedure based on constrained SLD-resolution.
3.2.3. Learning in SHIQ+log
This ILP framework represents hypotheses as SHIQ+log rules restricted to positive Datalog [14] and organizes them according to a generality ordering inspired by generalized subsumption. The resulting hypothesis space can be searched by means of refinement operators, either top-down or bottom-up. SHIQ+log is a decidable KR framework and the most powerful among those currently available for the integration of DLs and HCLs.
Relational databases are valuable sources for ontology learning. In this paper, we describe an approach for constructing a steam boiler ontology from heterogeneous databases. Our objective is to build an ontological resource in as automated a way as possible. The main data and information constituting our system come from disparate databases of equipment characteristics. Methods and tools have been proposed to generate ontologies from such structured input. The mappings are the correspondences between each created ontology component (e.g., concept, property) and its original database schema element (e.g., table, column). The proposed solution was implemented using the Protégé plug-in DataMaster and followed the steps below:
[Figure: system architecture, with on-line and off-line parts. Components: data acquisition, real-time database, integrated database, background knowledge, example database, inference, monitor, system interface.]
First, the user chooses the type of connection driver, Open Database Connectivity (ODBC) or Java Database Connectivity (JDBC), and the data source. Selecting a given table displays its content, and the user then chooses whether or not to import it. Each chosen database table is transferred into one class or sub-class, depending on the user's choice.
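The table-to-class step can be illustrated with a minimal sketch. It does not use DataMaster, ODBC or JDBC: it assumes an in-memory SQLite database as a stand-in data source, and the table names, column names, and the `has_` property prefix are hypothetical choices made only for this illustration.

```python
import sqlite3

def tables_to_classes(conn, selected):
    """Map each table chosen by the user to a class name and its columns to property names."""
    mapping = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        if table not in selected:
            continue  # the user decided not to import this table
        # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        mapping[table] = {
            "class": table.capitalize(),                        # table  -> class
            "properties": [f"has_{col[1]}" for col in columns],  # column -> property
        }
    return mapping

# Hypothetical equipment tables standing in for the plant databases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE valve (tag TEXT, location TEXT, state TEXT)")
conn.execute("CREATE TABLE pump (tag TEXT, flow_rate REAL)")
conn.execute("CREATE TABLE audit_log (entry TEXT)")

mapping = tables_to_classes(conn, selected={"valve", "pump"})
for table, entry in mapping.items():
    print(table, "->", entry["class"], entry["properties"])
```

The returned dictionary plays the role of the mappings described above: each entry records which database table a class came from and which column each property came from.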
TBox T plus the set R of rules) plays the role of background knowledge and the extensional part (i.e., the ABox A plus the set F of facts) contributes to the definition of observations. Therefore ontologies may appear as input to the learning problem of interest. The observations are represented as a finite set of logical facts $E$. In general, $E$ can be decomposed into the positive examples $E^+$ and the negative ones $E^-$. The background knowledge $B$ is supposed to be insufficient to explain the positive observations, which translates logically to $B \not\models E^+$, but there is no contradiction with the negative knowledge: $B \cup E^- \not\models \bot$. So an ILP machinery with input $E$ and $B$ will output a program $H$ such that $B \cup H \models E$. Thus $H$ constitutes a kind of explanation of our observations $E^+$. The language $L$ of hypotheses must allow for the generation of SHIQ+log rules. More precisely, we consider definite clauses of the form
$p(X) \leftarrow r_1(Y_1), \ldots, r_m(Y_m), s_1(Z_1), \ldots, s_k(Z_k)$
where $m \geq 0$, $k \geq 0$, each of $p(X)$, $r_j(Y_j)$, $s_l(Z_l)$ is an atom, and the literal $p(X)$ in the head represents the target concept.
Figure 4 reports the main procedure of an algorithm analogous to FOIL [15] for learning onto-relational rules. The outer loop learns new rules one at a time, removing the positive examples covered by the latest rule before attempting to learn the next rule. The inner loop searches a second hypothesis space, consisting of conjunctions of literals, to find a conjunction that will form the body of the new rule.
Hset := $\emptyset$
Pos := $E^+$
while Pos $\neq \emptyset$ do
    h := $\{p(X) \leftarrow\}$
    Negh := $E^-$
    while Negh $\neq \emptyset$ do
        add a new literal L to specialize h
    endwhile
    Hset := Hset $\cup$ $\{h\}$
    Posh := $\{e \in \text{Pos} \mid B \cup h \models e\}$
    Pos := Pos $\setminus$ Posh
endwhile
return Hset
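The two nested loops can be sketched in a few lines of Python. This is a deliberately simplified illustration, not the onto-relational learner itself: it assumes that every candidate literal is a unary predicate over the example constant (binary atoms are flattened into names such as `pressure_increase`), that coverage is a plain membership test in a set of ground facts rather than entailment from a SHIQ+log KB, and that the literal to add is the one leaving the fewest covered negatives instead of FOIL's information gain. All predicate and constant names are hypothetical.

```python
def covers(body, example, facts):
    """A rule body covers an example if every literal holds for it in the facts."""
    return all((pred, example) in facts for pred in body)

def learn_rules(pos, neg, facts, candidates):
    """Sequential covering: learn one rule at a time until all positives are covered."""
    rules, remaining = [], set(pos)
    while remaining:                       # outer loop: one new rule per iteration
        body, neg_h = [], set(neg)         # start from the most general rule (empty body)
        while neg_h:                       # inner loop: specialize while negatives are covered
            best = None
            for lit in candidates:
                if lit in body:
                    continue
                trial = body + [lit]
                if not any(covers(trial, e, facts) for e in remaining):
                    continue               # would no longer cover any remaining positive
                still_neg = {e for e in neg_h if covers(trial, e, facts)}
                if best is None or len(still_neg) < len(best[1]):
                    best = (lit, still_neg)
            if best is None:
                return rules               # no consistent rule in this hypothesis language
            body.append(best[0])
            neg_h = best[1]
        rules.append(body)
        remaining -= {e for e in remaining if covers(body, e, facts)}
    return rules

# Hypothetical observations: v1 is the faulty component, v2..v4 are not.
facts = {
    ("pressure_increase", "v1"), ("after_p1", "v1"), ("closed", "v1"),
    ("pressure_increase", "v2"), ("after_p1", "v2"),
    ("pressure_increase", "v3"), ("closed", "v3"),
    ("after_p1", "v4"), ("closed", "v4"),
}
candidates = ["pressure_increase", "after_p1", "closed"]
pos, neg = {"v1"}, {"v2", "v3", "v4"}

rules = learn_rules(pos, neg, facts, candidates)
print(rules)  # [['pressure_increase', 'after_p1', 'closed']]
```

In this toy run each literal excludes exactly one negative example, so the inner loop needs all three literals before the rule body stops covering negatives.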
We want to show, on a small example, how ILP may be used to induce a model of plant malfunctions, that is, of these unusual situations. Suppose we have a SHIQ+log KB consisting of the following intensional knowledge K:
$\text{Valve}(C) \rightarrow \text{Component}(C)$
$\text{Pump}(C) \rightarrow \text{Component}(C)$
Valve(v1), ..., Valve(v2), Pump(P1), Parameter(pressure)
We aim at inducing a definition for the predicate default/1, which models the fault of one component of the plant. In the entire system, the arity of this predicate may be greater than one, depending on the number of components involved in the malfunction. If we run our system on the complete set of positive examples that describe the problem, the system induces the following definition for the predicate default/1:
$\text{default}(A) \leftarrow \text{increase}(A, \text{pressure}),\ \text{after}(A, P1),\ \text{closed}(A)$
which means that if the pressure increases in A, A is located after the pump P1, and A is closed, then the component A causes a malfunction of the plant.
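To make the reading of the induced rule concrete, the following sketch applies it to a handful of ground facts. It is only an illustration under stated assumptions: the observations about v1 and v2 (`increase`, `after`, `closed`) are hypothetical, the taxonomy is reduced to a dictionary lookup, and restricting the candidates to components is a choice made for this example rather than something the rule itself requires.

```python
# Taxonomy from the example KB: every Valve and every Pump is a Component.
SUBCLASS = {"Valve": "Component", "Pump": "Component"}

# Ground facts: class memberships from the KB plus hypothetical observations.
facts = {
    ("Valve", "v1"), ("Valve", "v2"), ("Pump", "P1"),
    ("increase", "v1", "pressure"), ("after", "v1", "P1"), ("closed", "v1"),
    ("increase", "v2", "pressure"), ("after", "v2", "P1"),  # v2 is not closed
}

def components(facts):
    """Individuals whose asserted class is subsumed by Component."""
    return {f[1] for f in facts if len(f) == 2 and SUBCLASS.get(f[0]) == "Component"}

def default(a, facts):
    """default(A) <- increase(A, pressure), after(A, P1), closed(A)."""
    return (
        ("increase", a, "pressure") in facts
        and ("after", a, "P1") in facts
        and ("closed", a) in facts
    )

faulty = sorted(c for c in components(facts) if default(c, facts))
print(faulty)  # ['v1']
```

Only v1 satisfies all three body literals; v2 shows the same pressure increase after the pump but is not closed, so the rule does not fire for it.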
4.6. Monitor
The monitoring module is used to supervise the production process. It monitors data streams obtained from the control system, e.g. temperature, pressure, and flow. If the module judges a situation to be abnormal, the data are automatically transferred to the inference machine to solve the problem. At the same time, the data are stored in the integrated database.
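A minimal sketch of this behaviour is given below. The operating ranges are hypothetical placeholders, a simple range check stands in for whatever abnormality test the module actually applies, and plain lists stand in for the integrated database and the input of the inference machine. The sketch stores every reading, which is one possible reading of the description above.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    variable: str   # e.g. "temperature", "pressure", "flow"
    value: float

# Hypothetical operating ranges; real limits would come from the plant documentation.
LIMITS = {
    "temperature": (200.0, 540.0),
    "pressure": (50.0, 75.0),
    "flow": (100.0, 400.0),
}

def is_abnormal(reading):
    """A reading is abnormal when it falls outside its operating range."""
    low, high = LIMITS[reading.variable]
    return not (low <= reading.value <= high)

def monitor(stream, database, inference_queue):
    """Store each reading and forward abnormal ones to the inference machine."""
    for reading in stream:
        database.append(reading)              # stored in the integrated database
        if is_abnormal(reading):
            inference_queue.append(reading)   # handed over for diagnosis

database, inference_queue = [], []
stream = [Reading("temperature", 510.0), Reading("pressure", 82.0), Reading("flow", 250.0)]
monitor(stream, database, inference_queue)
print(inference_queue)  # [Reading(variable='pressure', value=82.0)]
```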
5. CONCLUSION
Complex processes involve many process variables, and operators are faced with the tasks of monitoring, control, and diagnosis of these processes. They often find it difficult to monitor the process data effectively, analyze current states, detect and diagnose process anomalies, or take appropriate actions to control the processes. To assist plant operators, decision support systems that incorporate artificial intelligence (AI) and non-AI technologies have been adopted for the tasks of monitoring, control, and diagnosis. In this paper, a real-time system is proposed for monitoring and diagnosing chemical processes. The representation of the knowledge base, the inference machine, and the relations among them are designed according to the characteristics of chemical processes.
This system helps operators eliminate potential and disastrous faults. It also reduces the losses caused by unstable process situations and by faults that take too long to eliminate. When a new fault occurs, the stored data help the domain expert analyze the cause of the fault and predict its trend earlier. Our system design approach can be exploited to develop and rapidly prototype real-time distributed multi-agent systems.
ACKNOWLEDGEMENTS
The authors are very grateful to all the personnel of the natural gas liquefaction complex of Skikda for providing all the information related to the steam boiler equipment that made the present study possible.
REFERENCES
[1] H.L Chiang, L.E. Russell and D.R. Braatz, (2001) Fault Detection and Diagnosis in Industrial Systems. Springer, London, Great Britain.
[2] Technical operations manual of the industrial steam boiler ABB ALSTOM (2000).
[3] D. R. Tucakovic, V. D. Stevanovic and T. Zivanovic, (2007) Thermal hydraulic analysis of a steam boiler with rifled evaporating tubes. Applied Thermal Engineering, 27, 509-519.
[4] S. Yoon, F.J. MacGregor, (2004) Principle-component analysis of multiscale data for process monitoring and fault diagnosis. AIChE Journal 50 (11), 2891-2903.
[5] A.E. Garcia, M.P. Frank, (1996) On the relationship between observer and parameter identification based approaches to fault detection. In: Proceedings of the 13th IFAC World Congress, vol. N, Piscataway, New Jersey, pp. 25-29.
[6] X. Luo, C. Zhang and R.N. Jennings, (2002) A hybrid model for sharing information between fuzzy, uncertain and default reasoning models in multi-agent systems. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 10 (4), 401-450.
[7] S. Cauvin, (2004) CHEM-DSS: Advanced decision support system for chemical/petrochemical industry, Fifteenth International Workshop on Principles of Diagnosis (DX04), AAAI, Carcassonne, France.
[8] B. Köppen-Seliger, T. Marcu, M. Capobianco, S. Gentil, M. Albert, and S. Latzel, (2003) MAGIC: An integrated approach for diagnostic data management and operator support, in Proceedings of the 5th IFAC Symposium Fault Detection, Supervision and Safety of Technical Processes SAFEPROCESS05, Washington D.C.
[9] S. Muggleton, (1999) Inductive Logic Programming: Issues, Results and the Challenge of Learning Language in Logic, Artificial Intelligence, 114(1-2), pp. 283-296.
[10] T. Mitchell, (1982) Generalization as search. Artificial Intelligence 18, 203-226.
[11] F. Lisi, (2008) Building Rules on Top of Ontologies for the Semantic Web with Inductive Logic Programming, Theory and Practice of Logic Programming 8(03), 271-300.
[12] A. Borgida, (1996) On the relative expressiveness of description logics and predicate logics, Artificial Intelligence 82(1-2), 353-367.
[13] C. Rouveirol, V. Ventos, (2000) Towards Learning in CARIN-ALN. In: Cussens, J., Frisch, A.M. (eds.) ILP 2000. LNCS (LNAI), vol. 1866, pp. 191-208. Springer, Heidelberg.
[14] F. Lisi, F. Esposito, (2007) Building Rules on top of Ontologies? Inductive Logic Programming can help! SWAP 2007.
[15] J.R. Quinlan, (1990) Learning logical definitions from relations. Machine Learning, 5:239-266.
[16] S. Bouarroudj, Z. Boufaida, (2010) A multi-reasoner system for semantic search of annotated images, EGC-M 2010, Algiers, Algeria, pp. 118-129.
Authors
Samiya BOUARROUDJ is a doctoral student of Computer Science at University of Constantine 2, Algeria. Her current research activities are conducted at the LIRE Laboratory. Her research interests are knowledge representation and reasoning, semantic web technologies and advanced monitoring systems of processes.
Zizette BOUFAIDA is a Professor of Computer Science at University of Constantine 2, Algeria and the co-head of the SI&BC research group at the LIRE laboratory. Her research interests include knowledge representation and reasoning; formal knowledge representation for Semantic Web, ontology development,...