Python Questions
14. List the different types of file modes available in Python. Provide examples of
each.
15. What are the key advantages of using Python for data analytics over other
programming languages such as R?
16. What is the significance of feature engineering in text mining? What are some
common techniques used for feature extraction?
17. What is a word cloud in text analysis? How can it be used to understand textual
data?
18. How can Python be used for database handling? Explain with an example using
pandas.
19. What are the challenges of handling high-dimensional text data?
20. What is the role of Named Entity Recognition (NER) in text processing? Provide
an example.
21. What are word embeddings in NLP? How do they improve text classification
models?
22. How do machine learning models handle imbalanced text classification datasets?
23. What is topic modeling in text mining? Briefly explain the concept with an
example.
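
As a starting point for question 14, the sketch below exercises the most common file modes on a temporary file. The file name `notes.txt` and the helper `demo_file_modes` are hypothetical names chosen for illustration; a full answer would also cover the combined modes such as "w+" and "ab".

```python
import os
import tempfile


def demo_file_modes(directory):
    """Exercise common text and binary modes on a hypothetical file."""
    path = os.path.join(directory, "notes.txt")  # hypothetical file name

    # "w": create the file (or truncate an existing one) and write text.
    with open(path, "w", encoding="utf-8") as f:
        f.write("first line\n")

    # "a": append to the end of the file without truncating it.
    with open(path, "a", encoding="utf-8") as f:
        f.write("second line\n")

    # "r": read-only text mode (the default mode of open()).
    with open(path, "r", encoding="utf-8") as f:
        lines = f.read().splitlines()

    # "r+": read and write without truncating; writing starts at position 0.
    with open(path, "r+", encoding="utf-8") as f:
        f.write("FIRST")

    # "rb": read raw bytes instead of decoded text.
    with open(path, "rb") as f:
        raw = f.read()

    # "x": exclusive creation; raises FileExistsError if the file exists.
    try:
        with open(path, "x", encoding="utf-8"):
            pass
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

    return lines, raw, exclusive_failed


with tempfile.TemporaryDirectory() as tmp:
    lines, raw, exclusive_failed = demo_file_modes(tmp)

print(lines)
print(raw)
print(exclusive_failed)
```

The "r+" step overwrites the first five characters in place, which is why the bytes read back in "rb" mode differ from the lines read earlier.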
Section 3: Programming-Based Questions (8 Marks Each)
24. Write a Python script to read a text file, count the number of words, and display
the frequency of each word.
25. Write a Python program to read a CSV file using pandas and perform basic
operations such as handling missing values and filtering specific columns.
26. Develop a Python program to tokenize a given text and remove stopwords using
the nltk library.
27. Implement a simple Naïve Bayes classifier for text classification using sklearn.
Test it on a given dataset.
28. Write a Python function to extract numerical values (such as prices, revenue,
etc.) from an unstructured text document using regular expressions.
29. Create a bar chart using Matplotlib to visualize the word frequency in a given
text file. Provide the Python code and interpret the results.
30. Demonstrate how to use the pandas read_excel() function to load data from
an Excel file. Explain how to handle missing values.
31. Write a Python script to classify emails as spam or not spam using text mining
techniques.
32. Demonstrate how to use SQL queries in Python using pandas. Write a script to
load an SQL database table into a pandas DataFrame and filter the data.
33. Write a Python program to perform Named Entity Recognition (NER) using the
spaCy library. Explain its use cases.
34. Write a Python function to implement K-Means clustering on a text dataset.
Explain how clustering helps in text categorization.
35. Build a simple sentiment analysis model using Python. Use a dataset of movie
reviews and apply logistic regression for classification.
36. Write a Python script to compute TF-IDF values for a given text dataset using
sklearn. Explain how TF-IDF improves text classification.
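
For question 24, one possible approach uses only the standard library. This is a minimal sketch, not a model answer: the function name `word_frequencies`, the sample sentence, and the rule that a word is any run of lowercase letters, digits, or apostrophes are all assumptions made for the illustration.

```python
import os
import re
import tempfile
from collections import Counter


def word_frequencies(path):
    """Return (total_word_count, Counter of word frequencies) for a text file."""
    with open(path, "r", encoding="utf-8") as f:
        text = f.read().lower()
    # Assumption: a word is a run of letters, digits, or apostrophes.
    words = re.findall(r"[a-z0-9']+", text)
    return len(words), Counter(words)


# Hypothetical sample text, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("The cat sat on the mat. The mat was flat.")
    sample_path = tmp.name

total, freq = word_frequencies(sample_path)
os.remove(sample_path)  # clean up the temporary file

print("Total words:", total)
for word, count in freq.most_common():
    print(word, count)
```

The same `Counter` can feed the bar chart asked for in question 29, since `freq.most_common(n)` returns the top n words together with their counts.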
1. Explain the key differences between text mining and data mining. How does each
handle structured and unstructured data?
2. What are some preprocessing techniques used in text mining? Discuss with
examples.
3. Discuss the advantages and disadvantages of using machine learning for text
classification. Provide examples.
4. Explain the concept of text transformation and feature engineering in NLP. How
do they improve model performance?
5. What is topic modeling? Compare Latent Dirichlet Allocation (LDA) with
Non-Negative Matrix Factorization (NMF).
6. Describe the role of deep learning in text mining. How do models like recurrent
neural networks (RNNs) and transformers help in NLP?
7. What are word embeddings, and how do they improve NLP models? Compare
Word2Vec and GloVe.
8. Explain the challenges of working with high-dimensional text data. How can
dimensionality reduction techniques help?
New Short Answer Questions (4 Marks Each)
16. Write a Python script to preprocess a text dataset. Perform tokenization, stopword
removal, and stemming.
17. Write a Python function to generate n-grams from a given text. Explain how
n-grams improve text representation.
18. Develop a Python program to scrape text data from a website using
BeautifulSoup. Provide an example.
19. Write a Python script to build a logistic regression model for text classification.
Train it on a dataset and evaluate its accuracy.
20. Implement a Python program to detect and extract named entities using spaCy.
Explain its real-world applications.
21. Create a Python script to visualize word frequencies using a word cloud. Explain
the significance of this visualization.
22. Write a Python script to compare the performance of BoW and TF-IDF on a
given dataset. Explain the differences.
23. Develop a Python script to apply Latent Dirichlet Allocation (LDA) for topic
modeling. Interpret the output.
24. Write a Python program to implement a simple chatbot using NLP techniques.
Explain its components.
25. Implement a deep learning model (LSTM) for text classification using
TensorFlow/Keras. Train it on a dataset and evaluate the results.
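
For question 17 in the list above, an n-gram generator can be sketched in a few lines. The function name `generate_ngrams` and the choice to lowercase and split on whitespace are assumptions for this illustration; a fuller answer might tokenize with nltk instead.

```python
def generate_ngrams(text, n):
    """Return the list of n-grams (as tuples of tokens) found in `text`."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # Assumption: lowercase the text and split on whitespace to get tokens.
    tokens = text.lower().split()
    # Slide a window of width n across the token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


bigrams = generate_ngrams("the movie was not good", 2)
print(bigrams)
```

Because each n-gram keeps neighbouring words together, a bigram such as ("not", "good") preserves a negation that a single-word representation would split into two unrelated tokens.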