WEB AND SOCIAL MEDIA ANALYTICS LAB
For
B. Tech IV Year I Semester
(COMPUTER SCIENCE AND ENGINEERING)
(DATA SCIENCE)
(R18 Regulations)
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
(DATA SCIENCE)
Sreyas Institute of Engineering and Technology
An UGC Autonomous Institution
Prepared by B. Venkata Varma
4. Conduct investigations of complex problems: Use research-based knowledge and research methods including design of experiments, analysis and interpretation of data, and synthesis of the information to provide valid conclusions.
5. Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern engineering and IT tools including prediction and modeling to complex engineering activities with an understanding of the limitations.
6. The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal, health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional engineering practice.
8. Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of the engineering practice.
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
(DATA SCIENCE)
JNTU HYDERABAD
WEB AND SOCIAL MEDIA ANALYTICS LAB
CO-PO MAPPING:
PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
CO1 3 3 3 3 3 - 2 3 3 - - 3
CO2 3 3 3 2 2 2 - 3 3 - - 3
CO3 3 3 3 3 3 - - 3 3 - - 3
CO4 3 3 3 3 3 - - 3 3 - - 3
CO5 3 3 3 - 3 - - 3 3 - - 3
AVG 3 3 3 3 3 2 2 3 3 2 2 3
CO-PSO MAPPING:
PSO1 PSO2
CO1 - 2
CO2 - 1
CO3 - 1
CO4 - 2
CO5 - 1
AVG 0 2
2. Stop Word List: Have a predefined list of stop words (e.g., provided by NLP
libraries or custom lists).
3. Filtering: Remove words from the text that are in the stop word list.
Example:
The NLTK library maintains a list of around 179 stopwords (shown below) that can be used to filter stopwords from the text. You may also add or remove stopwords from the default list.
import nltk
from nltk.corpus import stopwords
nltk.download('stopwords')
print(stopwords.words('english'))
Out:-
‘i’, ‘me’, ‘my’, ‘myself’, ‘we’, ‘our’, ‘ours’, ‘ourselves’, ‘you’, “you’re”, “you’ve”, “you’ll”,
“you’d”, ‘your’, ‘yours’, ‘yourself’, ‘yourselves’, ‘he’, ‘him’, ‘his’, ‘himself’, ‘she’, “she’s”,
‘her’, ‘hers’, ‘herself’, ‘it’, “it’s”, ‘its’, ‘itself’, ‘they’, ‘them’, ‘their’, ‘theirs’, ‘themselves’,
‘what’, ‘which’, ‘who’, ‘whom’, ‘this’, ‘that’, “that’ll”, ‘these’, ‘those’, ‘am’, ‘is’, ‘are’,
‘was’, ‘were’, ‘be’, ‘been’, ‘being’, ‘have’, ‘has’, ‘had’, ‘having’, ‘do’, ‘does’, ‘did’, ‘doing’,
‘a’, ‘an’, ‘the’, ‘and’, ‘but’, ‘if’, ‘or’, ‘because’, ‘as’, ‘until’, ‘while’, ‘of’, ‘at’, ‘by’, ‘for’,
‘with’, ‘about’, ‘against’, ‘between’, ‘into’, ‘through’, ‘during’, ‘before’, ‘after’, ‘above’,
‘below’, ‘to’, ‘from’, ‘up’, ‘down’, ‘in’, ‘out’, ‘on’, ‘off’, ‘over’, ‘under’, ‘again’, ‘further’,
‘then’, ‘once’, ‘here’, ‘there’, ‘when’, ‘where’, ‘why’, ‘how’, ‘all’, ‘any’, ‘both’, ‘each’,
‘few’, ‘more’, ‘most’, ‘other’, ‘some’, ‘such’, ‘no’, ‘nor’, ‘not’, ‘only’, ‘own’, ‘same’, ‘so’,
‘than’, ‘too’, ‘very’, ‘s’, ‘t’, ‘can’, ‘will’, ‘just’, ‘don’, “don’t”, ‘should’, “should’ve”, ‘now’,
‘d’, ‘ll’, ‘m’, ‘o’, ‘re’, ‘ve’, ‘y’, ‘ain’, ‘aren’, “aren’t”, ‘couldn’, “couldn’t”, ‘didn’, “didn’t”,
‘doesn’, “doesn’t”, ‘hadn’, “hadn’t”, ‘hasn’, “hasn’t”, ‘haven’, “haven’t”, ‘isn’, “isn’t”,
‘ma’, ‘mightn’, “mightn’t”, ‘mustn’, “mustn’t”, ‘needn’, “needn’t”, ‘shan’, “shan’t”,
‘shouldn’, “shouldn’t”, ‘wasn’, “wasn’t”, ‘weren’, “weren’t”, ‘won’, “won’t”, ‘wouldn’,
“wouldn’t”
import nltk
nltk.download('stopwords')

def stopword_elimination(text):
    stopwords = nltk.corpus.stopwords.words('english')
    filtered_words = [word for word in text.split() if word.lower() not in stopwords]
    return filtered_words

if __name__ == '__main__':
    text = "This is a sample text with stopwords."
    filtered_words = stopword_elimination(text)
    print(filtered_words)
Output
['sample', 'text', 'stopwords.']
b) Stemming
Stemming also reduces words to their root forms, but unlike lemmatization, the stem itself may not be a valid word in the language.
NLTK provides several stemming functions based on different algorithms; we will use PorterStemmer here. In practice you would usually perform either stemming or lemmatization, not both; we apply stemming to our data purely for illustration. We define a custom function stemming() that returns the text with each word converted to its stem, and finally apply it to the Twitter dataframe.
import nltk
from nltk.stem import PorterStemmer

def stemming(text):
    stemmer = PorterStemmer()
    stemmed_words = [stemmer.stem(word) for word in text.split()]
    return stemmed_words

text = "This is a sample text with stemming."
print(stemming(text))
Output
['thi', 'is', 'a', 'sampl', 'text', 'with', 'stemming.']
from nltk.stem import PorterStemmer

def stemming(text):
    porter = PorterStemmer()
    result = []
    for word in text:
        result.append(porter.stem(word))
    return result

# Test
text = ['Connects', 'Connecting', 'Connections', 'Connected', 'Connection', 'Connectings', 'Connect']
stemmed_words = stemming(text)
print(stemmed_words)
Output
['connect', 'connect', 'connect', 'connect', 'connect', 'connect', 'connect']
C) Lemmatization
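Lemmatization reduces each word to its dictionary base form (lemma); unlike stemming, the result is always a valid word. The program for this step is not reproduced in this manual, so the following is a minimal sketch that combines NLTK's WordNetLemmatizer with the stopword removal shown earlier; its exact output may differ slightly from the sample output shown below.
import nltk
from nltk.stem import WordNetLemmatizer
from nltk.corpus import stopwords
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('stopwords')

def lemmatization(text):
    lemmatizer = WordNetLemmatizer()
    stop_words = set(stopwords.words('english'))
    # Tokenize, lowercase, drop punctuation and stopwords, then lemmatize each token
    tokens = nltk.word_tokenize(text.lower())
    return [lemmatizer.lemmatize(word) for word in tokens
            if word.isalpha() and word not in stop_words]

if __name__ == '__main__':
    text = "This is a sample text with lemmatization."
    print(lemmatization(text))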
Output
['this','sample','text','lemmatization']
D) POS tagging
Part-of-speech (POS) tagging assigns a grammatical category (determiner, noun, verb, and so on) to each token in the text. Here we define a custom function pos_tagging() that tokenizes the text with NLTK's word_tokenize() and labels each token with nltk.pos_tag(), and finally we apply this function to a sample sentence.
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

def pos_tagging(text):
    tokens = nltk.word_tokenize(text)
    tagged_tokens = nltk.pos_tag(tokens)
    return tagged_tokens

if __name__ == '__main__':
    text = "This is a sample text with POS tagging."
    tagged_tokens = pos_tagging(text)
    print(tagged_tokens)
Output
[('This','DT'),('is','VBZ'),('a','DT'),('sample','NN'),('text','NN'),('with','IN'), ('POS',
'NN'), ('tagging', 'VBG')]
E) Lexical analysis
Lexical analysis is the process of converting a sequence of characters in a source code file
into a sequence of tokens that can be more easily processed by a compiler or interpreter. It
is often the first phase of the compilation process and is followed by syntax analysis and
semantic analysis.
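To illustrate the compiler sense of lexical analysis described above, here is a small sketch (not part of the original lab program) that uses Python's built-in tokenize module to split a one-line statement into (token type, token text) pairs.
import io
import tokenize

def lex(source_code):
    # Break Python source text into (token type name, token string) pairs
    tokens = tokenize.generate_tokens(io.StringIO(source_code).readline)
    return [(tokenize.tok_name[tok.type], tok.string) for tok in tokens]

if __name__ == '__main__':
    print(lex("total = price * 2 + tax\n"))
In this lab, the same idea is applied to natural-language text: the program below tokenizes a sentence and tags each token with its part of speech.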
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

def lexical_analysis(text):
    tokens = nltk.word_tokenize(text)
    tagged_tokens = nltk.pos_tag(tokens)
    return tagged_tokens

if __name__ == '__main__':
    text = "This is a sample text with lexical analysis."
    tagged_tokens = lexical_analysis(text)
    print(tagged_tokens)
Output
[('This','DT'),('is','VBZ'),('a','DT'),('sample','NN'),('text','NN'),('with','IN'),
('lexical','JJ'),('analysis','NN')]
F) Sentiment analysis
Sentiment analysis assigns polarity scores (negative, neutral, positive, and a combined compound score) to a piece of text. Here we use NLTK's VADER SentimentIntensityAnalyzer.
import nltk
nltk.download('vader_lexicon')
from nltk.sentiment.vader import SentimentIntensityAnalyzer

def sentiment_analysis(text):
    analyzer = SentimentIntensityAnalyzer()
    sentiment = analyzer.polarity_scores(text)
    return sentiment

if __name__ == '__main__':
    text = "This is a sample text with positive sentiment."
    sentiment = sentiment_analysis(text)
    print(sentiment)
Output
{ 'neg': 0.0, 'neu': 0.625, 'pos': 0.375, 'compound': 0.5574}
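The compound score can be converted into a single label; below is a small sketch using the commonly cited VADER thresholds of ±0.05 (the thresholds are a convention, not part of the original program).
def label_sentiment(scores):
    # Map VADER's compound score to a coarse sentiment label
    if scores['compound'] >= 0.05:
        return 'positive'
    elif scores['compound'] <= -0.05:
        return 'negative'
    return 'neutral'

print(label_sentiment({'neg': 0.0, 'neu': 0.625, 'pos': 0.375, 'compound': 0.5574}))  # positive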
3. Web analytics
a. Web usage data (web server log data, clickstream analysis)
import pandas as pd

def web_usage_analysis(log_file):
    # Read web log data from CSV file
    try:
        log_data = pd.read_csv(log_file)
    except Exception as e:
        print(f"Error reading file: {e}")
        return

    # Check if necessary columns exist
    required_columns = ['user_id', 'session_id', 'timestamp']
    if not all(col in log_data.columns for col in required_columns):
        print("Missing required columns in the log data.")
        return

    # Group by user to count requests per user
    user_requests = log_data.groupby('user_id')['session_id'].count()

    # Display the results
    print("Web Requests per User:")
    print(user_requests)

# Example usage with a log file
log_file = '/content/web_log.csv'  # Example file path
web_usage_analysis(log_file)
Output
Web Requests per User:
user_id
101 3
102 2
103 1
104 2
105 2
Name: session_id, dtype: int64
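The same log can also be viewed as a clickstream. As a further sketch (assuming the same /content/web_log.csv with user_id, session_id and timestamp columns used above), the snippet below measures how long each session lasted:
import pandas as pd

def session_duration_analysis(log_file):
    # Duration of each session = time between its first and last logged request
    log_data = pd.read_csv(log_file, parse_dates=['timestamp'])
    durations = log_data.groupby('session_id')['timestamp'].agg(lambda ts: ts.max() - ts.min())
    print("Session durations:")
    print(durations)

# session_duration_analysis('/content/web_log.csv')  # same sample log file as above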
b. Hyperlink data
Hyperlink (link structure) analysis looks at the anchors on a page: which URLs the page links to and how often each target appears.
import requests
import bs4

def hyperlink_analysis(url):
    # Send a request to the URL and parse the HTML
    response = requests.get(url)
    soup = bs4.BeautifulSoup(response.content, 'html.parser')

    # Find all hyperlinks in the page
    links = soup.find_all('a')

    # Count how many times each target URL appears
    link_counts = {}
    for link in links:
        anchor_text = link.text
        href = link.get('href', '')  # Get href, handle if it doesn't exist
        if href not in link_counts:
            link_counts[href] = 0
        link_counts[href] += 1

    # Report the counts
    for href, count in link_counts.items():
        print(f"{href}: {count}")

if __name__ == '__main__':
    # The sample output below appears to come from Google's home page
    hyperlink_analysis('https://www.google.com')
To run the program, first install the required packages:
pip install requests
pip install bs4
Output (python hyperlink_analysis.py):
/search: 5
/maps: 1
/shopping: 1
/about: 1
https://policies.google.com/privacy: 1
/intl/en/policies/terms/: 1
4. Search engine optimization: implement spamdexing
Spamdexing (keyword stuffing) tries to manipulate a search engine's ranking by padding a page's text with repeated target keywords. The function below removes stopwords from the input text and then stuffs in each target keyword several times.
import nltk
nltk.download('stopwords')

def spamdexing(text):
    # Load English stopwords from NLTK
    stopwords = nltk.corpus.stopwords.words('english')
    # Define the keywords to be added
    keywords = ['keyword1', 'keyword2', 'keyword3']
    # Filter the text by removing stopwords
    filtered_text = [word for word in text.split() if word.lower() not in stopwords]
    # Stuff each keyword into the filtered text several times
    for keyword in keywords:
        filtered_text.extend([keyword] * 8)
    return filtered_text

if __name__ == '__main__':
    text = "This is a sample text with stopwords."
    print(spamdexing(text))
Output
['sample', 'text', 'stopwords.', 'keyword1', 'keyword1', 'keyword1', 'keyword1', 'keyword1', 'keyword1', 'keyword1', 'keyword1', 'keyword2', 'keyword2', 'keyword2', 'keyword2', 'keyword2', 'keyword2', 'keyword2', 'keyword2', 'keyword3', 'keyword3', 'keyword3', 'keyword3', 'keyword3', 'keyword3', 'keyword3', 'keyword3']
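A quick way to see why such a page looks spammy is keyword density: the share of all words taken up by each repeated term. A minimal sketch, assuming the stuffed word list produced above:
from collections import Counter

def keyword_density(words):
    # Fraction of all words accounted for by each distinct term
    counts = Counter(word.lower() for word in words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}

stuffed = ['sample', 'text', 'stopwords.'] + ['keyword1'] * 8 + ['keyword2'] * 8 + ['keyword3'] * 8
density = keyword_density(stuffed)
print(f"keyword1 density: {density['keyword1']:.0%}")  # roughly 30% of all words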
a. Conversion Statistics
import requests

def get_conversion_data(conversion_id):
    url = 'https://analytics.google.com/analytics/v3/data/ga'
    params = {
        'ids': f'ga:{conversion_id}',
        'start-date': '2023-01-01',
        'end-date': '2023-08-01',
        'metrics': 'ga:conversions',
        'dimensions': 'ga:date',
        'samplingLevel': '1'
    }
    response = requests.get(url, params=params)
    return response.json()

if __name__ == '__main__':
    conversion_id = '1234567890'
    conversion_data = get_conversion_data(conversion_id)
    print(conversion_data)
Output
The output of the program will depend on the data returned by the API. However, it might include the following information:
• The conversion rate
• The number of conversions
• The number of visitors
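For reference, the conversion rate is simply conversions divided by visitors. A minimal sketch with made-up counts (not taken from any real report):
conversions = 45   # hypothetical count
visitors = 1200    # hypothetical count
conversion_rate = conversions / visitors * 100
print(f"Conversion rate: {conversion_rate:.2f}%")  # prints: Conversion rate: 3.75%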
b. Visitor Profiles
To create Visitor Profiles in Python, you need to analyze visitor data from sources like an
API, database, or CSV files. Visitor profiles typically include attributes such as
demographics, preferences, behavior patterns, and interaction history. Here's how you can
structure your approach:
import pandas as pd
import matplotlib.pyplot as plt

# visitor_data: records collected from an API, database, or CSV file as described above.
# A small illustrative placeholder is used here so the example can run on its own.
visitor_data = [
    {'gender': 'F', 'location': 'Hyderabad', 'visits': 12, 'purchases': 3, 'conversion_rate': 0.25},
    {'gender': 'M', 'location': 'Chennai', 'visits': 8, 'purchases': 1, 'conversion_rate': 0.12},
    {'gender': 'F', 'location': 'Hyderabad', 'visits': 5, 'purchases': 2, 'conversion_rate': 0.40},
]

# Convert to DataFrame
df = pd.DataFrame(visitor_data)

# Group by gender
gender_summary = df.groupby('gender').agg({'visits': 'mean', 'purchases': 'mean', 'conversion_rate': 'mean'})
print("\nGender-Based Summary:")
print(gender_summary)

# Plot profiles
plt.figure(figsize=(10, 6))
df.groupby('location')['visits'].sum().plot(kind='bar', color='skyblue')
plt.title('Visits by Location')
plt.xlabel('Location')
plt.ylabel('Total Visits')
plt.show()
c. Traffic Sources
# Excerpt: the check below belongs inside a function that requests traffic-source data
# from the Google Analytics API; the surrounding function is not shown in this manual.
if response.status_code == 200:
    return response.json()
else:
    print(f"Error: {response.status_code} - {response.text}")
    return None

if __name__ == '__main__':
    profile_id = '1234567890'           # Replace with your actual profile ID
    access_token = 'YOUR_ACCESS_TOKEN'  # Replace with a valid access token
    # traffic_sources holds the result returned by that function;
    # save_data_to_json() is defined elsewhere in the full program.
    if traffic_sources:
        save_data_to_json(traffic_sources)
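The fragment above omits the function that actually fetches the data. Below is a hedged sketch of what such a get_traffic_sources() helper might look like, modeled on the get_conversion_data() function earlier; the metric/dimension names ga:sessions and ga:source and the Bearer-token header are assumptions, not taken from the original code.
import requests

def get_traffic_sources(profile_id, access_token):
    # Hypothetical helper modeled on get_conversion_data() above
    url = 'https://analytics.google.com/analytics/v3/data/ga'
    params = {
        'ids': f'ga:{profile_id}',
        'start-date': '2023-01-01',
        'end-date': '2023-08-01',
        'metrics': 'ga:sessions',
        'dimensions': 'ga:source',
    }
    headers = {'Authorization': f'Bearer {access_token}'}
    response = requests.get(url, params=params, headers=headers)
    if response.status_code == 200:
        return response.json()
    print(f"Error: {response.status_code} - {response.text}")
    return None
The error responses shown below are what the API returns when the access token is missing or invalid, or when the profile ID is malformed.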
Missing Authentication Credentials:-
Error: 401 - {
  "error": {
    "code": 401,
    "message": "Request is missing required authentication credential.",
    "errors": [
      {
        "message": "Request is missing required authentication credential.",
        "domain": "global",
        "reason": "required"
      }
    ]
  }
}
Invalid Profile ID:-
Error: 400 - {
  "error": {
    "code": 400,
    "message": "Invalid value 'ga:123456'. Values must match the pattern 'ga:[0-9]+'.",
    "errors": [
      {
        "message": "Invalid value 'ga:123456'. Values must match the pattern 'ga:[0-9]+'.",
        "domain": "global",
        "reason": "invalid"
      }
    ]
  }
}