Machine Learning (ML) is a computer science field focused on enabling computers to learn from experience. Key issues in ML include inadequate training data, poor data quality, overfitting and underfitting, and the need for skilled resources, which can lead to inaccurate predictions and model performance challenges. Additionally, complexities in the ML process and data bias can hinder effective implementation and results.
ML Module 1
ML Module 1: Introduction to ML (15-20 M)
Q.1 Define Machine Learning. Explain the issues in ML.
Definition: Machine Learning is an area of computer science that involves teaching computers to perform tasks by learning from experience, rather than by following explicitly programmed rules.

Issues in ML

1. Inadequate Training Data:
- Lack of Quality and Quantity: Insufficient or poor-quality data leaves the algorithm too little to learn from.
- Noisy Data: Leads to inaccurate predictions and classification errors.
- Incorrect Data: Produces a faulty model and lowers accuracy.
- Complex Generalization: Difficulty generalizing beyond the training data results in poor predictions on new inputs.

2. Poor Quality of Data:
- Noisy, Incomplete, Inaccurate, and Unclean Data: Leads to less accurate classification and low-quality results.

3. Non-representative Training Data:
- Unrepresentative Samples: Result in less accurate predictions and biased models.
- Generalization Issues: Too small a training sample introduces sampling noise and leads to inaccurate predictions.

4. Overfitting and Underfitting:
- Overfitting: The model captures noise and inaccuracies in the training data, which reduces its performance on new data. Methods to reduce: increase training data, reduce model complexity, apply regularization, use early stopping, reduce noise and the number of attributes, constrain the model.
- Underfitting: The model is too simple to capture the underlying pattern, leading to inaccurate predictions. Methods to reduce: increase model complexity, remove noise, train on better features, reduce constraints, increase the number of epochs.

5. Monitoring and Maintenance:
- Regular Updates Needed: Regular monitoring and updates are essential to keep the model's output general and accurate.

6. Getting Bad Recommendations:
- Data Drift: The model gives outdated recommendations because user behavior or the meaning of the data has changed since training.
- Solution: Regularly update and monitor the data.

7. Lack of Skilled Resources:
- Shortage of Expertise: ML needs professionals with deep knowledge of mathematics, science, and technology.

8. Customer Segmentation:
- Identifying User Behavior: Customer behavior must be recognized correctly in order to make relevant recommendations based on past experience.

9. Process Complexity of Machine Learning:
- Complex and Tedious Process: Involves data analysis, bias removal, training, and complex calculations, which raises the probability of error.

10. Data Bias:
- Biased Datasets: Lead to inaccurate results and analytical errors.
- Solution: Identify and reduce bias through diverse data sources, bias testing, regular analysis, and multi-pass annotation.

11. Lack of Explainability:
- Opaque Outputs: Outputs that are not easily understood reduce the model's credibility.

12. Slow Implementations and Results:
- Time-Consuming: Excessive data and slow programs delay results and require continuous maintenance and monitoring.

13. Irrelevant Features:
- Garbage In, Garbage Out: Irrelevant input data results in poor model outcomes.
- Solution: Use relevant features to build a good training dataset.
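
The overfitting and underfitting issue (point 4) can be made concrete with a small experiment. The sketch below is only an illustration: the notes do not name a model, so a k-nearest-neighbour regressor is assumed, and the toy data and the names `knn_predict`, `mse`, `train`, and `test` are all hypothetical. Here `k` controls model complexity: a small `k` gives a very flexible model, and a large `k` gives a very simple one.

```python
# Illustrative sketch only: the k-NN model, the toy data, and every name
# below are assumptions made for this example, not part of the notes.

def knn_predict(train, x, k):
    """Average the targets of the k training points closest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k


def mse(train, points, k):
    """Mean squared error of the k-NN model on the given (x, y) points."""
    total = sum((knn_predict(train, x, k) - y) ** 2 for x, y in points)
    return total / len(points)


# Training data: the true relation is y = x, with alternating +1/-1 noise.
train = [(float(i), i + (1.0 if i % 2 == 0 else -1.0)) for i in range(20)]

# Held-out data: noise-free points that lie between the training inputs.
test = [(i + 0.25, i + 0.25) for i in range(1, 19)]

# k=1 is the most complex model, k=20 (all points) the simplest.
results = {k: (mse(train, train, k), mse(train, test, k)) for k in (1, 3, 20)}

for k, (train_err, test_err) in results.items():
    print(f"k={k:2d}  train MSE={train_err:.3f}  test MSE={test_err:.3f}")
```

With `k=1` the model reproduces every noisy training point exactly, so its training error is zero, yet it does worse on the held-out points than `k=3`: this is overfitting. With `k=20` the model predicts the same average everywhere and has a large error on both sets: this is underfitting. Moving from `k=1` to `k=3` corresponds to the "reduce model complexity" remedy, and moving from `k=20` to `k=3` corresponds to "increase model complexity".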