Internship Report PDF 2022
CHAPTER 1
ORGANIZATION INTRODUCTION
1.1 ABOUT THE ORGANIZATION
The business develops various types of testing equipment, including IC testers,
component testers, GSM modules, GPS modules and microcontroller trainer kits based on
8051/8052, as well as microcontroller-based educational systems and
ARM-7/8051/MSP430-based development kits. We also undertake development of
custom-designed and application-specific equipment.
HISTORY
Since 2013, we have been contributing successfully to the design and development
of various types of hardware circuit boards, electronics products, and industrial and
academic projects. With a strong industrial background and experience, we offer student
training, corporate training and placement. aMSa is a dedicated team of Bachelor of
Engineering graduates who are working as technical specialists in embedded
companies in Bangalore and Hubli. With this industrial expertise, we would like to help
students become aware of industry demands and work during their training in our
academy.
COMPANY STRATEGY
AMSa Embedded Solution will be promoted in various local media. Clients have
already begun promotion by word of mouth. Response has been favorable. Formal
advertising is planned to begin three months before the center is scheduled to open.
Mobile No 0836-2270318
Email ID [email protected]
CHAPTER 2
ABOUT THE DEPARTMENT
2.1 COMPANY SERVICES
• Electronic Product Design
• Embedded systems design using 8, 16, 32-bit microcontroller.
• Embedded Software & Firmware development
• Re-engineering services to correct flaws or to optimize existing designs for lowering costs.
• Electronics Consulting services
2.2 TRAINING
Engineering students will from now on have to undergo a mandatory period of
internship to gain practical experience, a decision prompted by industry complaints about
the poor employability of BE/BTech graduates. Students studying in college do not have
the opportunity to learn the practical skills required for skill development and
employment, so it is better to join an internship in the required domain at a company
that has the potential to shape the student’s future with hands-on experience of
technologies including hardware and software development.
Service Offerings:
1. Application Development
2. Web-based Applications
3. Windows-based Applications
4. IoT-based Applications
AES has an R&D team of four members: two work as Hardware Design
Engineers and the other two as Firmware Design Engineers, whose
responsibilities are as follows:
CHAPTER 3
TASK PERFORMED
3.1 Machine Learning
Machine learning is a growing technology which enables computers to learn
automatically from past data. Machine learning uses various algorithms for building
mathematical models and making predictions using historical data or information.
Currently, it is being used for various tasks such as image recognition, speech recognition,
email filtering, Facebook auto-tagging, recommender systems, and many more.
What is Machine Learning
In the real world, we are surrounded by humans who can learn from their
experiences, and we have computers or machines which work on our instructions.
But can a machine also learn from experiences or past data like a human does?
This is the role of Machine Learning.
Machine Learning is a subset of artificial intelligence that is mainly concerned
with the development of algorithms which allow a computer to learn from data and
past experiences on its own. The term machine learning was first introduced by Arthur
Samuel in 1959. We can define it in a summarized way as: a machine has the ability to
learn if it can improve its performance by gaining more data.
How does Machine Learning work
A Machine Learning system learns from historical data, builds prediction models,
and whenever it receives new data, predicts the output for it. The accuracy of the
predicted output depends in part on the amount of data, as a larger amount of data
generally helps to build a better model which predicts the output more
accurately. Suppose we have a complex problem where we need to perform some
predictions. Instead of writing code for it, we just need to feed the data to generic
algorithms, and with the help of these algorithms, the machine builds the logic as per
the data and predicts the output. Machine learning has changed our way of thinking about
the problem. The below block diagram explains the working of a Machine Learning algorithm:
➢ Unsupervised Learning
Unsupervised learning models are utilized for three main tasks: clustering,
association, and dimensionality reduction.
➢ Reinforcement learning
Advantages
• Linear regression is an extremely simple method. It is very easy and intuitive to use
and understand.
• A person with only the knowledge of high school mathematics can understand and use
it. In addition, it works in most of the cases. Even when it doesn’t fit the data exactly,
we can use it to find the nature of the relationship between the two variables.
Disadvantages
• By definition, linear regression only models relationships between dependent and
independent variables that are linear. It assumes there is a straight-line relationship
between them, which is sometimes incorrect.
• Linear regression is very sensitive to anomalies in the data (outliers).
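The sensitivity to outliers can be illustrated with a minimal sketch of an ordinary least-squares line fit. The function name `fit_line` and the data values are made up for illustration and are not taken from the internship work.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data lying exactly on the line y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
clean_slope, clean_intercept = fit_line(xs, ys)

# The same data with the last value replaced by a single outlier.
ys_outlier = [3, 5, 7, 9, 30]
outlier_slope, outlier_intercept = fit_line(xs, ys_outlier)

print("clean fit:  ", clean_slope, clean_intercept)
print("outlier fit:", outlier_slope, outlier_intercept)
```

With this toy data, changing one point moves the slope from 2 to roughly 5.8, which shows how strongly a single anomaly can pull the fitted line.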
This is where logistic regression comes into play. In logistic regression, you get
a probability score that reflects the probability of the occurrence of the event. An event
in this case is each row of the training dataset. It could be something like classifying
whether a given email is spam, whether a mass of cells is malignant, or whether a user
will buy a product.
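As a rough sketch of how such a probability score is produced, logistic regression passes a weighted sum of the features through the sigmoid function and compares the result with a threshold. The weights, bias and feature values below are hypothetical numbers chosen by hand, not values learned from any dataset in this report.

```python
import math


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, weights, bias):
    """Probability that the event occurs for one row of features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability score into a 0/1 class label."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0


# Hypothetical, hand-picked parameters (assumption, not trained values).
weights = [1.5, -2.0]
bias = 0.5

p = predict_proba([2.0, 1.0], weights, bias)
print("probability:", round(p, 3))
print("label:", predict_label([2.0, 1.0], weights, bias))
```

The 0.5 threshold is a common default; it can be raised or lowered depending on whether false positives or false negatives are more costly.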
Advantages
➢ It doesn’t require high computational power.
➢ Is easily interpretable.
➢ Is used widely by data analysts and data scientists.
➢ Is very easy to implement.
➢ It doesn’t require scaling of features.
Disadvantages
➢ While working with logistic regression, you are not able to handle a large number
of categorical features/variables.
➢ It is vulnerable to overfitting.
➢ Logistic regression will not perform well with independent (X) variables that are not
correlated with the target (Y) variable.
3.3.3 KNN
K nearest neighbors, or KNN, is a simple algorithm which uses the entire
dataset in its training phase. Whenever a prediction is required for an unseen data
instance, it searches through the entire training dataset for the k most similar
instances, and the prediction is derived from them (for classification, typically the
most common class among those k instances). KNN is often used in search applications
where you are looking for similar items, such as "find items similar to this one".
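The search described above can be sketched in a few lines. This is a minimal illustration that assumes Euclidean distance as the similarity measure and a majority vote among the k neighbours; the function name `knn_predict` and the toy points are invented for the example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair every training point's distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Sort by distance only and keep the k closest.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the neighbours is the prediction.
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: two small, well-separated groups (made up for illustration).
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(train_points, train_labels, (2, 2)))
print(knn_predict(train_points, train_labels, (6, 5)))
```

Note that there is no separate training step: all the work happens at prediction time, which is why prediction becomes slow as the dataset grows.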
Advantages
• The algorithm is simple and easy to implement.
• There’s no need to build a model, tune several parameters, or make additional
assumptions.
• The algorithm is versatile. It can be used for classification, regression, and search.
• The training phase of K-nearest neighbor classification is much faster compared to other
classification algorithms.
Disadvantages
▪ The algorithm gets significantly slower as the number of examples and/or
predictors/independent variables increases.
▪ The testing phase of K-nearest neighbor classification is slower and costlier in terms
of time and memory. It requires large memory for storing the entire training dataset
for prediction.
▪ KNN is also not suitable for high-dimensional data.
3.3.4 SVM
“Support Vector Machine” (SVM) is a supervised machine learning algorithm that
can be used for both classification and regression challenges. However, it is mostly used
in classification problems. In the SVM algorithm, we plot each data item as a point in
n-dimensional space (where n is the number of features you have), with the value of each
feature being the value of a particular coordinate.
• SVM is a Supervised Learning algorithm that uses labelled input data set to predict
the output of the data points.
• It is one of the simplest Machine learning algorithms and it can be easily
implemented for a varied set of problems.
• SVM can be used for solving both classification and regression problems.
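To make the geometric picture concrete, the following sketch shows how a linear SVM classifies a point once a separating hyperplane is known: the class is the sign of w·x + b, and the hinge loss measures how far a point falls inside the margin. The weights and bias here are hypothetical values picked by hand, not the result of training, and the function names are illustrative.

```python
def decision_value(point, weights, bias):
    """Signed score w.x + b; its sign gives the side of the hyperplane."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def classify(point, weights, bias):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(point, weights, bias) >= 0 else -1


def hinge_loss(point, label, weights, bias):
    """Zero when the point is correctly classified outside the margin."""
    return max(0.0, 1.0 - label * decision_value(point, weights, bias))


# Hypothetical hyperplane x1 + x2 - 5 = 0 in a 2-feature space (assumption).
weights = [1.0, 1.0]
bias = -5.0

print(classify([4.0, 4.0], weights, bias))         # point on the positive side
print(classify([1.0, 1.0], weights, bias))         # point on the negative side
print(hinge_loss([3.0, 2.5], 1, weights, bias))    # inside the margin
```

Training an SVM amounts to choosing the weights and bias that keep this loss small while making the margin as wide as possible; that optimisation step is omitted here.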
Advantages
• SVM works relatively well when there is a clear margin of separation between classes.
• SVM is more effective in high dimensional spaces.
• SVM is relatively memory efficient.
• SVM is effective in cases where the number of dimensions is greater than number of
samples.
Disadvantages
• SVM algorithm is not suitable for large data sets.
• SVM does not perform very well when the data set is noisy, i.e. when the target
classes overlap.
• In cases where the number of features for each data point is much greater than the
number of training data samples, the SVM can overfit unless the kernel and
regularization are chosen carefully.
Decision Tree is a supervised learning technique that can be used for both
classification and regression problems, but it is mostly preferred for solving
classification problems. It is a tree-structured classifier, where internal nodes represent
the features of a dataset, branches represent the decision rules and each leaf node
represents the outcome. It is a graphical representation for getting all the possible
solutions to a problem/decision based on given conditions. In a decision tree, there
are two types of nodes: the Decision Node and the Leaf Node. Decision nodes are used to
make a decision and have multiple branches, whereas leaf nodes are the output of those
decisions and do not contain any further branches.
• Decision trees usually mimic human thinking ability while making a decision, so they
are easy to understand.
• The logic behind the decision tree can be easily understood because it shows a tree-like
structure.
• It is very easy to understand and implement.
3.3.5.2 Working
In a decision tree, to predict the class of a given record, the algorithm starts
from the root node of the tree. It compares the value of the root attribute with
the corresponding attribute of the record (real dataset) and, based on the comparison,
follows the branch and jumps to the next node. For the next node, the algorithm again
compares the attribute value with the other sub-nodes and moves further. It continues
the process until it reaches a leaf node of the tree.
The complete process can be better understood using the below algorithm:
Step-1: Begin the tree with the root node, say S, which contains the complete dataset.
Step-2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
Step-3: Divide S into subsets that contain the possible values of the best attribute.
Step-4: Generate the decision tree node, which contains the best attribute.
Step-5: Recursively make new decision trees using the subsets of the dataset created in
Step-3.
Continue this process until a stage is reached where you cannot further classify the
nodes; the final node is then called a leaf node.
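The steps above can be sketched as a short recursive function. This is a minimal illustration that assumes Gini impurity as the Attribute Selection Measure (the report does not state which measure was used) and categorical attributes only; the function names and the toy rows are invented for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def split_by_attribute(rows, attribute):
    """Step-3: group rows by the value they hold for `attribute`."""
    subsets = {}
    for row in rows:
        subsets.setdefault(row[attribute], []).append(row)
    return subsets


def weighted_gini(rows, attribute, target):
    """Impurity left after splitting on `attribute`, weighted by subset size."""
    total = len(rows)
    return sum(
        len(subset) / total * gini([row[target] for row in subset])
        for subset in split_by_attribute(rows, attribute).values()
    )


def best_attribute(rows, attributes, target):
    """Step-2: pick the attribute whose split leaves the least impurity."""
    return min(attributes, key=lambda a: weighted_gini(rows, a, target))


def build_tree(rows, attributes, target):
    """Steps 1, 4 and 5: build the tree recursively as nested dicts."""
    labels = [row[target] for row in rows]
    # Stop when the node is pure or no attributes remain: emit a leaf.
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]
    best = best_attribute(rows, attributes, target)
    remaining = [a for a in attributes if a != best]
    return {
        best: {
            value: build_tree(subset, remaining, target)
            for value, subset in split_by_attribute(rows, best).items()
        }
    }


# Toy dataset (made up): "outlook" fully determines "play", "windy" does not.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
tree = build_tree(rows, ["outlook", "windy"], "play")
print(tree)
```

Information gain is another common choice of selection measure; swapping it in only changes `best_attribute`, while the recursive structure stays the same.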
Advantages
• It is simple to understand, as it follows the same process that a human follows while
making any decision in real life.
• It can be very useful for solving decision-related problems.
• It helps to think about all the possible outcomes for a problem.
• There is less requirement of data cleaning compared to other algorithms.
Disadvantages
CHAPTER 4
REFLECTION
4.1 PROJECT: TITANIC SURVIVAL PREDICTION
➢ Data collection:
The very first step is data collection. Data can be obtained from many
sources, such as the company side, the Kaggle repository, surveys and third-party APIs.
The data set is imported as a comma-separated (CSV) file, and the required modules are
imported.
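A minimal sketch of loading such a comma-separated file is shown below, using only the Python standard library. The column names and the three sample rows are placeholders made up for illustration; they are not the actual columns or records of the data set used in the project, and an in-memory string stands in for the real file.

```python
import csv
import io

# Made-up rows standing in for the real file; column names are assumptions.
SAMPLE_CSV = """passenger_id,age,sex,survived
1,22,male,0
2,38,female,1
3,,female,1
"""


def load_rows(csv_file):
    """Read CSV records and convert each field to a suitable Python type."""
    rows = []
    for record in csv.DictReader(csv_file):
        rows.append({
            "passenger_id": int(record["passenger_id"]),
            # An empty age field is kept as None so it can be cleaned later.
            "age": float(record["age"]) if record["age"] else None,
            "sex": record["sex"],
            "survived": int(record["survived"]),
        })
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), "rows loaded")
print(rows[2])
```

For a file on disk, the same function can be called with an open file object in place of the `io.StringIO` wrapper.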
➢ Algorithm used
The algorithm I used here is Logistic Regression, which is generally used in binary
classification problems. It is used in problems where the data is linearly
separable, and the algorithm finds the line which best separates the two classes.
It predicts the probability that a particular event will happen.
We then create the classifier, fit the training data to it, and use the
predict function to obtain the output. Since the model is now ready, it is time to
evaluate the performance of our classifier or model.
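The create, fit and predict sequence can be sketched with a small hand-written classifier. This is an illustration only: it assumes plain batch gradient descent on the log loss, and the class name `TinyLogisticRegression`, its parameters and the one-feature toy data are all hypothetical rather than the code or data used in the project.

```python
import math


class TinyLogisticRegression:
    """Minimal logistic regression trained with batch gradient descent."""

    def __init__(self, learning_rate=0.1, epochs=1000):
        self.learning_rate = learning_rate
        self.epochs = epochs
        self.weights = []
        self.bias = 0.0

    def _proba(self, features):
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, X, y):
        n_samples, n_features = len(X), len(X[0])
        self.weights = [0.0] * n_features
        self.bias = 0.0
        for _ in range(self.epochs):
            grad_w = [0.0] * n_features
            grad_b = 0.0
            # Accumulate the gradient of the log loss over all samples.
            for features, label in zip(X, y):
                error = self._proba(features) - label
                for j, value in enumerate(features):
                    grad_w[j] += error * value
                grad_b += error
            # Take one averaged gradient step.
            self.weights = [
                w - self.learning_rate * g / n_samples
                for w, g in zip(self.weights, grad_w)
            ]
            self.bias -= self.learning_rate * grad_b / n_samples
        return self

    def predict(self, X):
        return [1 if self._proba(features) >= 0.5 else 0 for features in X]


# Toy, linearly separable data with one feature (made up for illustration).
X_train = [[-3.0], [-2.0], [-1.0], [1.0], [2.0], [3.0]]
y_train = [0, 0, 0, 1, 1, 1]

classifier = TinyLogisticRegression().fit(X_train, y_train)
predictions = classifier.predict([[-2.5], [2.5]])
print(predictions)
```

A library implementation would normally be used in practice; the sketch only shows what fitting and predicting involve underneath.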
➢ Problem statement:
Here we can choose any of the models to predict the survival of a test sample. Since
we evaluated all models using a confusion matrix, we predict using the model
which has the highest accuracy. We performed prediction on the dataset using logistic
regression, SVC and Random Forest. Additionally, the final prediction is made through
the voting outcome of the algorithms using an ensemble technique. As is clear
from the above table, the Random Forest model has the highest accuracy.
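The evaluation and voting steps can be sketched as follows. The example assumes binary labels (1 for survived, 0 otherwise) and a simple majority vote as the ensemble rule, which the report does not specify; the prediction lists are toy values invented for illustration and are not the project's results.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for binary 0/1 labels."""
    pairs = list(zip(actual, predicted))
    return {
        "tp": sum(1 for a, p in pairs if a == 1 and p == 1),
        "tn": sum(1 for a, p in pairs if a == 0 and p == 0),
        "fp": sum(1 for a, p in pairs if a == 0 and p == 1),
        "fn": sum(1 for a, p in pairs if a == 1 and p == 0),
    }


def accuracy(matrix):
    """Share of correct predictions: (TP + TN) / all predictions."""
    return (matrix["tp"] + matrix["tn"]) / sum(matrix.values())


def majority_vote(*model_predictions):
    """Combine several models by taking the most common label per sample."""
    return [
        Counter(votes).most_common(1)[0][0]
        for votes in zip(*model_predictions)
    ]


# Toy labels and predictions from three hypothetical models (made up).
actual = [1, 0, 1, 1, 0, 0]
model_a = [1, 0, 0, 1, 0, 1]
model_b = [1, 1, 1, 1, 0, 0]
model_c = [0, 0, 1, 1, 0, 0]

matrix_a = confusion_matrix(actual, model_a)
ensemble = majority_vote(model_a, model_b, model_c)

print("model A:", matrix_a, "accuracy", round(accuracy(matrix_a), 3))
print("ensemble:", ensemble,
      "accuracy", accuracy(confusion_matrix(actual, ensemble)))
```

Using an odd number of models avoids ties in a binary majority vote.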
SOFTWARE REQUIREMENTS
• Operating System: Windows 8/10
• Programming Language: Python
• Framework: Anaconda
• IDE: Jupyter Notebook
• ML Libraries: NumPy, Pandas, Matplotlib.
4.4 SNAPSHOT
RESULT
CONCLUSION
Data cleaning is the first step when performing data analysis. Exploratory data
analysis (EDA) helps one to understand the dataset and the dependencies among the
attributes. EDA is used to figure out the relationships between the features of the
dataset. This is done using various graphical techniques; the ones used above are ggplot
and histograms. By applying EDA, some conclusions are drawn and facts are found. Age has
a strong influence on survival: we can see from Table 2 that as age increases, survival
decreases.