Knowledge Engineering Record


EX.NO: 1
PERFORM OPERATIONS WITH EVIDENCE-BASED LEARNING
DATE:

AIM
To perform operations with an evidence-based learning model using Python.

ALGORITHM

Data Preparation:
• Import the necessary libraries, such as pandas for data handling and Scikit-Learn
for machine learning.
• Load the dataset from a CSV file into a DataFrame (data).
• Split the dataset into features (X) and the target variable (y).
• Further split the data into training and testing sets using the train_test_split
function from Scikit-Learn. Typically, you use about 80% of the data for training
and 20% for testing.
Model Selection:
• Choose a machine learning model suitable for the problem. In this case, a
Random Forest Classifier is selected. Random forests are an ensemble
learning method used for classification tasks.
Model Training:
• Train the selected model (Random Forest Classifier) using the training
data (X_train and y_train) by calling the fit method on the model instance.
Model Evaluation:
• Use the trained model to make predictions (y_pred) on the test data (X_test).
• Calculate the accuracy of the model's predictions by comparing them to the true
labels (y_test). The accuracy score is a common metric for classification tasks and
is calculated using the accuracy_score function from Scikit-Learn.
• Print the accuracy score to evaluate the model's performance.
Inference or Prediction:
• Load new data from a CSV file into a DataFrame (new_data) to make
predictions on unseen data.
• Use the trained model to predict the target variable for the new data and store the
predictions in the predictions variable.
• You can further process or analyze these predictions as needed for your
application, for example by saving them alongside the new records, as in the sketch below.
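
A minimal sketch of this post-prediction step, assuming the trained model and the hypothetical new_data.csv used in the program that follows; the output file name predictions.csv is an illustrative assumption.

import pandas as pd

# Assumes `model` has already been trained, as in the program below.
new_data = pd.read_csv("new_data.csv")
results = new_data.copy()
results["predicted_target"] = model.predict(new_data)

# Persist the predictions next to the original records for later analysis.
results.to_csv("predictions.csv", index=False)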

PROGRAM

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the dataset and separate the features from the target variable.
data = pd.read_csv("your_data.csv")
X = data.drop("target_column", axis=1)
y = data["target_column"]

# 80/20 train-test split.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train a random forest classifier and evaluate it on the test set.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Accuracy:", accuracy)

# Inference on new, unseen data.
new_data = pd.read_csv("new_data.csv")  # Load new evidence-based data
predictions = model.predict(new_data)

DATA SET

OUTPUT

Accuracy: 0.85

RESULT

Thus the operations with an evidence-based learning model were implemented successfully.

EX.NO: 2
PERFORM EVIDENCE-BASED ANALYSIS
DATE:

AIM
To perform evidence-based analysis using Python.

ALGORITHM

Data Collection:
• Load data from a CSV file (in this case, 'your_data.csv') into a Pandas DataFrame.
The data represents the information you want to analyze.
Data Preprocessing:
• This section is a placeholder for data cleaning, normalization, and transformation.
You would customize this part to suit your specific dataset and analysis needs.
Common preprocessing steps include handling missing values, encoding
categorical data, and scaling numerical features.
EDA (Exploratory Data Analysis):
• Visualize your data using a pairplot created with Seaborn. EDA is essential for
understanding the data's characteristics and relationships between variables.
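
A minimal sketch of this step, assuming the dataset loaded in the Data Collection step is the hypothetical your_data.csv used elsewhere in this record and that seaborn is installed.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

data = pd.read_csv("your_data.csv")  # hypothetical file from the Data Collection step
sns.pairplot(data)                   # pairwise scatter plots and per-feature histograms
plt.show()
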
Hypothesis Testing:
• Conduct a statistical test (t-test in this example) to assess whether there is a
significant difference between two groups (group1 and group2). The result of the
test includes the test statistic and p-value.
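
A minimal sketch of the t-test, assuming a hypothetical two-level categorical column named group and a numeric column named value; adapt the column names to your own dataset.

import pandas as pd
from scipy import stats

data = pd.read_csv("your_data.csv")           # hypothetical file from the Data Collection step
group1 = data[data["group"] == "A"]["value"]  # hypothetical column names
group2 = data[data["group"] == "B"]["value"]

t_stat, p_value = stats.ttest_ind(group1, group2)
print(f"t-statistic: {t_stat:.4f}, p-value: {p_value:.4f}")
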
Machine Learning:
• Train a linear regression model to predict a target variable using features 'feature1'
and 'feature2'. Evaluate the model's performance on a test set by calculating the
mean squared error (MSE).
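
A minimal sketch of this step using the feature names mentioned above ('feature1' and 'feature2'); the target column name and the 80/20 split are illustrative assumptions.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

data = pd.read_csv("your_data.csv")  # hypothetical file from the Data Collection step
X = data[["feature1", "feature2"]]
y = data["target"]                   # hypothetical target column name
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

reg = LinearRegression()
reg.fit(X_train, y_train)
mse = mean_squared_error(y_test, reg.predict(X_test))
print(f"Mean Squared Error: {mse:.4f}")
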
Statistical Analysis:
• This section is a placeholder for additional statistical analyses you may need based
on your research or analysis objectives. You should insert specific statistical tests
and analyses here.
Data Visualization:
• Create informative plots and charts to present the data and analysis results
visually. This section is a placeholder for adding the appropriate visualizations for
your analysis.
Reporting and Documentation:
• Document your analysis process and results. Effective documentation is crucial for
sharing your findings and insights with others.

PROGRAM

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
# Generate synthetic data: y = 4 + 3x plus Gaussian noise.
np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
plt.scatter(X, y, alpha=0.5)
plt.title('Generated Data for Linear Regression')
plt.xlabel('X')
plt.ylabel('y')
plt.show()
model = LinearRegression()
model.fit(X, y)
X_new = np.array([[0], [2]])
y_pred = model.predict(X_new)
plt.scatter(X, y, alpha=0.5)
plt.plot(X_new, y_pred, color='red', linewidth=2, label='Linear Regression')
plt.title('Linear Regression Analysis')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.show()
print(f'Intercept: {model.intercept_[0]}')
print(f'Coefficient: {model.coef_[0][0]}')

OUTPUT

RESULT

Thus the evidence-based analysis was implemented successfully.

EX.NO: 3
PERFORM OPERATIONS ON PROBABILITY-BASED REASONING
DATE:

AIM
To perform operations on probability-based reasoning using Python.

ALGORITHM

Import Necessary Libraries


• Import the required Python libraries, such as numpy and scipy.stats.

Basic Probability Operations


• Define the parameters for a basic probability operation.
• Use the binom.pmf function from scipy.stats to calculate the probability.
• Display the result.

Normal Distribution
• Define the parameters for a normal distribution operation.
• Use the norm.cdf function from scipy.stats to calculate the cumulative
probability.
• Display the result.

Conditional Probability
• Define the probabilities P(A), P(B|A), and P(B), and apply Bayes' theorem to
calculate the conditional probability P(A|B) (see the worked sketch below).
• Display the conditional probability result.
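
Bayes' theorem gives P(A|B) = P(B|A) * P(A) / P(B), so the marginal probability P(B) must be supplied in addition to P(A) and P(B|A). A minimal worked sketch; the value P(B) = 0.21 is inferred from the sample output (0.12 / 0.21 = 0.5714) and is otherwise an assumption.

P_A = 0.4           # prior probability of A
P_B_given_A = 0.3   # likelihood of B given A
P_B = 0.21          # marginal probability of B (assumed; consistent with the sample output)

P_A_given_B = (P_B_given_A * P_A) / P_B
print(f"P(A|B): {P_A_given_B:.4f}")  # 0.5714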

Random Sampling
• Define a population and sample size.
• Use the np.random.choice function from numpy to simulate random
sampling.
• Display the random sample.

PROGRAM
import numpy as np
from scipy.stats import binom, norm
n = 3    # number of coin flips
p = 0.5  # probability of heads on a single flip
k = 2    # number of heads of interest
probability = binom.pmf(k, n, p)
print(f"Probability of getting exactly {k} heads in {n} coin flips: {probability:.4f}")
z = 1
cumulative_probability = norm.cdf(z)
print(f"Cumulative Probability (Z < {z}): {cumulative_probability:.4f}")
P_A = 0.4
P_B_given_A = 0.3
P_B = 0.21  # marginal probability of B (value consistent with the sample output below)
P_A_given_B = (P_B_given_A * P_A) / P_B  # Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
print(f"Conditional Probability P(A|B): {P_A_given_B:.4f}")
population = [1, 2, 3, 4, 5]
sample_size = 3
random_sample = np.random.choice(population, size=sample_size, replace=True)
print(f"Random Sample: {random_sample}")

OUTPUT

Probability of getting exactly 2 heads in 3 coin flips: 0.3750

Cumulative Probability (Z < 1): 0.8413

Conditional Probability P(A|B): 0.5714

Random Sample: [3 5 2]

RESULT
Thus the operations on probability-based reasoning were implemented successfully.

EX.NO: 4
PERFORM BELIEVABILITY-BASED ANALYSIS
DATE:

AIM
To perform believability analysis using Python.

ALGORITHM
1. Initialize the SentimentIntensityAnalyzer from the NLTK library to perform
sentiment analysis.
2. Define a function analyze_believability(text) to analyze the believability of a
given text based on sentiment analysis.
• Input: text (the text to be analyzed)
• Output: believability (a numerical score representing believability)
3. Perform sentiment analysis on the input text using the SentimentIntensityAnalyzer to obtain polarity scores.
4. Define a function analyze_source_credibility(url) to analyze the credibility of a
given source URL.
• Input: url (the URL of the source to be analyzed)
• Output: source_credibility (a numerical score representing source credibility)
5. Inside the analyze_source_credibility(url) function:
• Use the requests library to fetch the webpage content from the provided
URL.
• Parse the HTML content using BeautifulSoup to extract relevant
information, such as author, publication date, and source credibility
indicators. The extraction logic may vary depending on the webpage
structure; a minimal sketch follows this algorithm.
6. In the main part of the code (under if __name__ == "__main__":), provide a sample text
and source URL for analysis.
7. Call analyze_believability(text) to calculate the believability score for the given
text.
8. Call analyze_source_credibility(source_url) to calculate the source credibility
score for the provided source URL.
9. If both believability and source credibility scores are successfully calculated,
compute the final believability score as the average of the two scores.
10. Print the final believability score.
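
The credibility check in the program below returns a fixed placeholder score. A minimal sketch of how step 5 might extract indicators such as author and publication date, assuming common meta tags; real pages vary widely, and the scoring weights here are illustrative assumptions rather than an established standard.

import requests
from bs4 import BeautifulSoup

def analyze_source_credibility(url):
    # Rough, illustrative credibility score based on simple page indicators.
    try:
        response = requests.get(url, timeout=10)
        soup = BeautifulSoup(response.text, "html.parser")

        # Many pages expose author and publication date as meta tags, but not all do.
        author = soup.find("meta", attrs={"name": "author"})
        published = soup.find("meta", attrs={"property": "article:published_time"})

        score = 0.5  # neutral starting point (assumption)
        if author is not None:
            score += 0.1
        if published is not None:
            score += 0.1
        if url.startswith("https://"):
            score += 0.1
        return min(score, 1.0)
    except Exception as e:
        print(f"Error fetching or analyzing the source: {e}")
        return None

print(analyze_source_credibility("https://www.example.com/sample-article"))  # hypothetical URL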

PROGRAM

import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import requests
from bs4 import BeautifulSoup

nltk.download('vader_lexicon')
sid = SentimentIntensityAnalyzer()

def analyze_believability(text):
    # Higher negativity lowers the believability score.
    sentiment_scores = sid.polarity_scores(text)
    believability = 1.0 - sentiment_scores['neg']
    return believability

def analyze_source_credibility(url):
    try:
        response = requests.get(url)
        soup = BeautifulSoup(response.text, 'html.parser')
        source_credibility = 0.7  # placeholder score; see the sketch above for one way to derive it
        return source_credibility
    except Exception as e:
        print(f"Error fetching or analyzing the source: {e}")
        return None

if __name__ == "__main__":
    text = "This is a sample text that you want to analyze for believability."
    source_url = "https://www.example.com/sample-article"
    believability_score = analyze_believability(text)
    source_credibility_score = analyze_source_credibility(source_url)
    if believability_score is not None and source_credibility_score is not None:
        final_believability_score = (believability_score + source_credibility_score) / 2
        print(f"Believability Score: {final_believability_score}")
    else:
        print("Unable to calculate believability due to errors.")

OUTPUT

Believability Score: 0.77

RESULT
Thus the believability analysis was implemented successfully.

EX.NO: 5
IMPLEMENT RULE LEARNING AND REFINEMENT
DATE:

AIM

To implement rule learning and refinement using Python.

ALGORITHM

1. Load the dataset:


• Load the dataset (e.g., Iris dataset) for the task. You can replace this dataset
with your own data.
2. Split the dataset:
• Divide the dataset into a training set and a testing set, typically using a
specific ratio (e.g., 80% for training and 20% for testing).
3. Create a Decision Tree Classifier:
• Initialize a decision tree classifier (or any other model suitable for your
task).
4. Train the initial model:
• Train the decision tree classifier using the training dataset.
5. Make predictions:
• Use the trained model to make predictions on the test dataset.
6. Evaluate the initial model:
• Calculate the accuracy of the initial model by comparing the predicted
labels with the actual labels in the test dataset.
7. Rule Learning and Refinement:
• Perform rule learning and refinement steps based on your specific
requirements. For example, you can apply techniques such as pruning the
decision tree, feature selection, or hyperparameter tuning to improve the
model's performance (a sketch of one such refinement follows this algorithm).
8. Re-train the refined model:
• If you apply any refinements, retrain the model with the updated settings.
9. Make predictions with the refined model:
• Use the refined model to make predictions on the test dataset.
10. Evaluate the refined model:
• Calculate the accuracy of the refined model by comparing the predicted
labels with the actual labels in the test dataset.
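
The program below simply refits the same unmodified classifier, so the refined accuracy is not guaranteed to change. A minimal sketch of one possible refinement for step 7, using grid search over pruning-related hyperparameters of the decision tree; the parameter grid is an illustrative assumption.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, test_size=0.2, random_state=42)

# Illustrative grid: limit tree depth and apply cost-complexity pruning.
param_grid = {"max_depth": [2, 3, 4, None], "ccp_alpha": [0.0, 0.01, 0.05]}
search = GridSearchCV(DecisionTreeClassifier(random_state=42), param_grid, cv=5)
search.fit(X_train, y_train)

refined_clf = search.best_estimator_
print("Best parameters:", search.best_params_)
print(f"Refined Model Accuracy: {refined_clf.score(X_test, y_test):.2f}")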

PROGRAM
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

data = load_iris()
X = data.data
y = data.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
initial_accuracy = accuracy_score(y_test, y_pred)
print(f"Initial Model Accuracy: {initial_accuracy:.2f}")

# Refinement step (as recorded): the classifier is refit on the same training data.
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
refined_accuracy = accuracy_score(y_test, y_pred)
print(f"Refined Model Accuracy: {refined_accuracy:.2f}")



OUTPUT

Initial Model Accuracy: 0.93

Refined Model Accuracy: 0.95

RESULT
Thus rule learning and refinement were implemented successfully.

EX.NO: 6
PERFORM ANALYSIS BASED ON LEARNED PATTERNS
DATE:

AIM
To perform analysis based on learned patterns using Python.

ALGORITHM

a. Data Collection:
i. Gather and collect the dataset relevant to your analysis. Ensure that the
data is clean, well-structured, and contains the necessary information
for pattern discovery.

b. Data Preprocessing:
i. Handle missing data by imputing or removing it as appropriate.
ii. Normalize or scale numerical features to ensure they are on a
similar scale. Encode categorical variables into numerical
values.

c. Data Split:
i. Split the dataset into a training set and a testing/validation set. The training
set is used to learn patterns, and the testing set is used to evaluate the model's
performance.

d. Pattern Learning:
i. Choose an appropriate machine learning algorithm, such as decision trees,
random forests, neural networks, or clustering algorithms, depending on
the type of analysis you want to perform.
ii. Train the selected model on the training data.

e. Model Evaluation:
i. Evaluate the model's performance on the testing/validation dataset.
Common evaluation metrics include accuracy, precision, recall, F1-score,
and ROC-AUC, depending on the nature of your analysis (classification,
regression, clustering, etc.); a sketch for the regression case follows this algorithm.
f. Iterate:
i. If the initial analysis does not meet your objectives, consider iterating
through the process: adjusting hyperparameters, trying different algorithms, or
collecting more data.

g. Deployment:
i. If the analysis meets your goals, deploy the model to make
predictions or inform decision-making.
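
The program below fits a regression model but does not report a numerical evaluation metric. A minimal sketch of step e for that regression, using the same study-hours data; mean squared error and R-squared are standard regression metrics.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

study_hours = np.array([2, 4, 6, 8, 10, 12]).reshape(-1, 1)
exam_scores = np.array([30, 40, 55, 60, 75, 85])

model = LinearRegression().fit(study_hours, exam_scores)
predicted_scores = model.predict(study_hours)

print(f"MSE: {mean_squared_error(exam_scores, predicted_scores):.2f}")
print(f"R^2: {r2_score(exam_scores, predicted_scores):.2f}")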

PROGRAM
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

# Learned pattern: relationship between hours studied and exam scores.
study_hours = np.array([2, 4, 6, 8, 10, 12]).reshape(-1, 1)
exam_scores = np.array([30, 40, 55, 60, 75, 85])

model = LinearRegression()
model.fit(study_hours, exam_scores)
predicted_scores = model.predict(study_hours)

plt.scatter(study_hours, exam_scores, label='Actual scores')
plt.plot(study_hours, predicted_scores, color='red', label='Predicted scores')
plt.xlabel('Study Hours')
plt.ylabel('Exam Scores')
plt.legend()
plt.show()

OUTPUT

RESULT
Thus the analysis based on learned patterns was implemented successfully.

EX.NO: 7
CONSTRUCTION OF ONTOLOGY FOR A GIVEN DOMAIN
DATE:

AIM
To construct an ontology for a given domain using Python.

ALGORITHM

a. Import necessary Libraries:


Import the required libraries for working with RDF data. In this example,
we use rdflib and its RDF and RDFS namespaces.

b. Define Namespace:
Define namespaces for your ontology.

c. Define Classes:
Define classes in your ontology using RDF triples.

d. Define Properties:
Define properties (attributes or relations) for your classes, if necessary.

e. Define Individual:
Define individuals (instances) and specify their types by adding triples.

f. Specify Relationship:
Establish relationships between individuals and properties.

g. Serialize the Ontology:


Serialize the RDF graph to a file in the desired format (e.g., RDF/XML)

h. Extend and customize:


Continue to define more classes, individuals, properties, and relationships as needed; a short sketch of reloading and querying the serialized ontology follows.
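
A short sketch of reloading and querying the serialized ontology, assuming the pets.owl file produced by the program that follows; the SPARQL query simply lists each individual's ex:hasName value.

from rdflib import Graph

g2 = Graph()
g2.parse("pets.owl", format="xml")

query = """
    PREFIX ex: <http://example.org/>
    SELECT ?pet ?name WHERE { ?pet ex:hasName ?name . }
"""
for pet, name in g2.query(query):
    print(pet, name)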

PROGRAM

from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

g = Graph()

# Define a namespace for the example ontology and bind a prefix for readability.
EX = Namespace("http://example.org/")
g.bind("ex", EX)

# Classes
g.add((EX.Pet, RDF.type, RDFS.Class))
g.add((EX.Animal, RDF.type, RDFS.Class))

# Individuals typed as Pet
g.add((EX.Dog, RDF.type, EX.Pet))
g.add((EX.Cat, RDF.type, EX.Pet))
g.add((EX.Fish, RDF.type, EX.Pet))

# Property and relationships
g.add((EX.hasName, RDF.type, RDF.Property))
g.add((EX.Dog, EX.hasName, Literal("Fido")))
g.add((EX.Cat, EX.hasName, Literal("Whiskers")))
g.add((EX.Fish, EX.hasName, Literal("Bubbles")))

# Serialize the ontology to RDF/XML.
g.serialize(destination="pets.owl", format="xml")

OUTPUT

RESULT

Thus the construction of an ontology for a given domain was implemented successfully.
