Chapter 6 - Data Science and K Nearest Neighbour Model (PART B)
Uploaded by
uug449162
CHAPTER 6
DATA SCIENCES AND K-NEAREST NEIGHBOUR MODEL

GET SET GO (Page No. 239)
1. Surveys
2. Interviews
3. Documents and Records
4. Polls
5. Questionnaire

CHECKBOT (Page No. 243)
Regression:
1. In regression, the algorithm generates a mapping function from the given data, represented by the solid line.
2. The dots shown in the graph are the data values and the solid line represents the mapping done for them.
3. With the help of this mapping function, we can predict future data. To apply the regression modelling technique, we need continuous data.
Classification:
1. In classification, the algorithm can determine which set a given data point belongs to by utilising a classification function, represented by the dotted line.
2. The model classifies datasets according to the rules given to it. Usually, the datasets used for classification are labelled, and the data then gets sorted according to its labelling.
3. Classification works on discrete datasets.

CHECKBOT (Page No. 244)
To be done by the students.

CHECKBOT (Page No. 247)
The sources from which relevant data can be collected are as follows:
1. Sensors
2. Surveys
3. Interviews
4. Observations
5. Open-sourced Government Portals
6. Reliable Websites (Kaggle)
7. World Organisations' open-sourced statistical websites

CHECKBOT (Page No. 252)
The K-nearest neighbours (KNN) algorithm is used to solve both classification and regression problems. The algorithm is simple and easy to implement. A supervised learning algorithm is trained with data and the corresponding labels. After training, the algorithm can label data that is not yet labelled. Classification problems are problems in which the data can be classified into two or more categories. The KNN algorithm assumes that similar things exist nearby.

EXERCISEBOT
A. 1. d  2. a  3. c  4. c  5. b  6. a  7. d  8. c  9. a
B. 1. 2008  2. data science application  3. patterns  4. data scientist  5. median  6. Keras  7. Scatter plot  8. nearby
C. 1. F  2. T  3. T  4. F  5. F
D.
1. A supervised learning algorithm analyses the training data and produces an inferred function, which can be used for mapping new examples. This is the simplest and most easily implemented type of algorithm. A fully trained algorithm will be able to observe new, never-before-seen examples and predict a good label for them. It demands more and more examples until it can accurately perform the task.

2. Mode: This function returns the most common value in a set of data.
>>> import statistics as st
>>> nums = [1, 2, 3, 5, 7, 9, 7, 2, 7, 6]
>>> print(st.mode(nums))
The output will be 7.

3. A plot is an effective way to display data in pictorial form. It makes it easier to draw comparisons and to analyse the growth, relationships and trends among the values in a table. Different types of plots used in Python are as follows:
a. Line plot
b. Bar plot
c. Histogram plot
d. Scatter plot
e. Box and Whisker plot

4. stdev() returns the standard deviation of the sample. This is equal to the square root of the sample variance.
>>> import statistics as st
>>> nums = [1, 2, 3, 5, 7, 9, 7, 2, 7, 6]
>>> print(st.stdev(nums))
The output will be 2.7264140062238043.

5. The K-nearest neighbours (KNN) algorithm is used to solve both classification and regression problems. The algorithm is simple and easy to implement. A supervised learning algorithm is trained with data and the corresponding labels. After training, the algorithm can label data that is not yet labelled. Classification problems are problems in which the data can be classified into two or more categories. The KNN algorithm assumes that similar things exist nearby.

E.
1. Statistical learning is a framework for Machine Learning drawing from the fields of statistics and functional analysis. It deals with the problem of finding a predictive function based on data. Learning can be of the following types:
a. Supervised Learning: A supervised learning algorithm analyses the training data and produces an inferred function, which can be used for mapping new examples. This is the simplest and most easily implemented type of algorithm. A fully trained algorithm will be able to observe new, never-before-seen examples and predict a good label for them. It demands more and more examples until it can accurately perform the task.
b. Unsupervised Learning: Unsupervised learning is a type of Machine Learning that looks for previously undetected patterns in a data set with no pre-existing labels and minimal human supervision. It can learn to group, cluster and organise data in such a way that humans can make sense of the newly organised data. Such an algorithm can take terabytes of unlabelled data and make sense of it.
c. Reinforcement Learning: When the algorithm is given a signal that associates good behaviour with a positive value and bad behaviour with a negative one, it can be reinforced to prefer good behaviour over bad behaviour. Over time, the algorithm learns and makes fewer mistakes. This approach is strongly influenced by neuroscience and psychology.

2. Companies require data scientists to make data-driven decisions. The resulting models can be used to give a better customer experience. Some of the applications are as follows: (any five)
a. For Better Marketing: Companies use data and feedback for marketing, but they do not directly know what customers will say about the product. Data scientists collect that data and, after analysis, can suggest a better marketing strategy. Marketing is an important step in business. The data scientist can also tell which advertisement is having an impact and which is not, which saves money and effort. By studying customer feedback, companies can create the best advertisement.
b. For Customer Acquisition: Data scientists can analyse feedback and other data to identify the needs of the customer. The company can use this information to tailor the product so that it best suits the customer's requirements. It also helps to find potential customers: the data scientist can help in recognising potential customers and their needs.
c. For Innovation and Manufacturing: Customer feedback can be used to innovate and manufacture the product. It can be used to craft a new product or to make changes to an existing product. The information given by the data scientist can lead you in the right direction in decision-making and product innovation.
d. For Banking: A data scientist can give information about frauds. Data science is applied in customer service, forecasting, understanding consumer sentiment, customer profiling and targeted marketing, based on customer feedback and queries analysed by data scientists. Banks also use data science to approve loans.
e. In E-commerce: Data analysis can be used to find potential customers and to recommend products by analysing reviews and feedback. For that purpose, a skilled data scientist is required. The total sales of an e-commerce business depend on this analysis.
f. In Healthcare: Medical images and reports can be analysed and compared with those of other patients showing the same symptoms. The analysis of reports and case studies of various patients can be used to suggest the drugs used in treatment. Even during the Covid-19 pandemic, many people used virtual assistants for their treatment.
g. In Transportation: Self-driving cars can be improved by analysing collected data and related accidents. The driving experience can be improved by estimating the traffic on the road. Google Maps uses such data analysis to tell us the estimated time.
h. In Finance: Customers can be segmented based on their purchasing and saving habits so that the data can be used to suggest loans and investment schemes. Data analysis can be used to predict the market so that shares can be bought and sold accordingly. The risk in an investment can be analysed, and investments can be made with these risks in mind.
i. In Education: Educational institutions use various techniques to analyse and evaluate data. This data helps them to understand student requirements, course content demand, teaching methodologies, etc. Data science also reduces the chances of evaluator's bias. Data science makes it possible for institutions to devise innovative curriculums. The performance of students is measured by teachers, and data science helps in measuring the performance of teachers.

3. The steps involved in the AI project cycle are as follows:
a. Acquire data that will become the base of the project, as it will help in understanding the parameters related to problem scoping.
b. Data acquisition is done by collecting data from various reliable and authentic sources. Since the data collected will be in large quantities, it is important to visualise it using different types of representations such as graphs, databases, flowcharts, maps, etc. This makes it easier to interpret the patterns that the acquired data follows.
c. After exploring the patterns, it is easier to decide which model should be built to achieve the goal. For this, online research can be done and various models that give a suitable output can be selected.
d. Test the selected models and figure out which is the most efficient one.
e. The most efficient model is now the base of your AI project. Develop an algorithm around it.
f. Once the modelling is complete, test your model on some newly obtained data. The results will help in evaluating the model and improving it.
g. Finally, after evaluation, the project cycle is complete and the AI project is ready.
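Steps d to f of the project cycle (test the candidate models, keep the most efficient one, then check it on newly obtained data) can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the toy data, the two hypothetical candidate models `mean_model` and `nearest_model`, and the choice of mean absolute error as the measure of efficiency.

```python
import statistics as st

# Toy data (made up for illustration): hours studied -> marks scored.
train = [(1, 35), (2, 42), (3, 50), (4, 58), (5, 66), (6, 71)]
test = [(2.5, 46), (4.5, 62), (5.5, 69)]  # newly obtained, unseen data


def mean_model(x, data):
    """Hypothetical baseline: always predict the mean of the training targets."""
    return st.mean(y for _, y in data)


def nearest_model(x, data):
    """Hypothetical candidate: predict the target of the closest training point."""
    return min(data, key=lambda pair: abs(pair[0] - x))[1]


def mean_absolute_error(model, data, unseen):
    """Average absolute gap between predictions and true values on unseen data."""
    return st.mean(abs(model(x, data) - y) for x, y in unseen)


candidates = {"mean_model": mean_model, "nearest_model": nearest_model}

# Step d: test every selected model on data it has not seen.
errors = {name: mean_absolute_error(m, train, test) for name, m in candidates.items()}

# Step e: the model with the lowest error becomes the base of the project.
best = min(errors, key=errors.get)
print(errors)
print("Selected:", best)
```

With this toy data the nearest-point model has the lower error and is selected. In a real project the candidates, the data split and the error measure would all depend on the problem being scoped.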
4. There are various sources from which you can collect the required data, and the data collection process can be categorised in two ways: Offline and Online.
Offline data sources are as follows:
a. Sensors
b. Surveys
c. Interviews
d. Observations
Online data sources are as follows:
a. Open-sourced Government Portals
b. Reliable Websites (Kaggle)
c. World Organisations' open-sourced statistical websites
The following points should be kept in mind while accessing data from any of the data sources:
- Only data that is available for public usage should be taken up.
- Personal datasets should only be used with the consent of the owner.
- One should never breach someone's privacy to collect data.
- Data should only be taken from reliable sources, as data collected from random sources can be wrong or unusable.
Reliable data sources ensure the authenticity of data, which helps in the proper training of the AI model.

5. Classification:
1. In classification, the algorithm can determine which set a given data point belongs to by utilising a classification function, represented by the dotted line.
2. The model classifies datasets according to the rules given to it.
3. Classification works on discrete datasets.
Regression:
1. In regression, the algorithm generates a mapping function from the given data, represented by the solid line.
2. The dots shown in the graph are the data values and the solid line represents the mapping done for them.
3. With the help of this mapping function, we can predict future data. To apply the regression modelling technique, we need continuous data.
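The KNN idea described in this chapter (similar things exist nearby) and the contrast between classification on discrete labels and regression on continuous values can be sketched in plain Python. This is a minimal illustration, not the textbook's implementation: the helper names `k_nearest`, `knn_classify` and `knn_regress`, the toy data, the use of Euclidean distance and the value k = 3 are all assumptions. It needs Python 3.8 or later for `math.dist`.

```python
import math
import statistics as st


def k_nearest(query, points, k):
    """Return the k (features, target) pairs closest to query (Euclidean distance)."""
    return sorted(points, key=lambda p: math.dist(p[0], query))[:k]


def knn_classify(query, points, k=3):
    """Classification: majority vote over the discrete labels of the neighbours."""
    return st.mode(label for _, label in k_nearest(query, points, k))


def knn_regress(query, points, k=3):
    """Regression: average of the continuous targets of the neighbours."""
    return st.mean(value for _, value in k_nearest(query, points, k))


# Toy labelled data (made up): two features per point, one discrete label.
labelled = [
    ((1.0, 1.0), "small"), ((1.5, 2.0), "small"), ((2.0, 1.5), "small"),
    ((6.0, 6.5), "large"), ((7.0, 6.0), "large"), ((6.5, 7.5), "large"),
]

# Toy continuous data (made up): one feature per point, one numeric target.
priced = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((4.0,), 40.0), ((5.0,), 50.0)]

print(knn_classify((2.0, 2.0), labelled))  # small
print(knn_regress((2.2,), priced))         # 20.0
```

The only difference between the two functions is how the neighbours' targets are combined: `st.mode` picks the most common label, which suits discrete data, while `st.mean` averages values, which suits continuous data.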