CCD Chapter 6 Notes

Uploaded by

Ishwari khebade

UNIT 6: Managed Machine Learning Systems

• Introduction to various ML systems available in the market

The Machine Learning (ML) systems available in the market can be classified into
different categories based on their applications and capabilities:

1) Cloud-based ML systems

− Amazon SageMaker
− Google AutoML
− Azure ML Studio

2) Open-source ML libraries and frameworks

− TensorFlow
− PyTorch
− Scikit-learn
− Keras

1) TensorFlow

− TensorFlow, developed by Google, is one of the most widely used
open-source ML frameworks for building and deploying machine learning
and deep learning models.
− TensorFlow is designed to work with large datasets made up of many
individual attributes.
− Any data we want to process with TensorFlow is stored in
multidimensional arrays; these multidimensional arrays are known as
tensors.
− It primarily uses Python but also offers APIs for C++ and JavaScript
(a Swift version existed but is no longer actively developed).
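
To make the idea of a tensor concrete, the sketch below represents a scalar, a vector, a matrix, and a small batch of matrices as nested Python lists and reports the rank and shape of each. It is plain Python so that it runs without TensorFlow installed; the helper name `tensor_shape` is invented for this illustration and is not part of any library. In TensorFlow, passing the same nested lists to `tf.constant` should give tensors whose `.shape` holds the same dimensions.

```python
def tensor_shape(data):
    """Return the shape of a regular nested list as a tuple of dimension sizes."""
    shape = []
    # Walk down the first element of each nesting level and record its length.
    while isinstance(data, list):
        shape.append(len(data))
        data = data[0] if data else None
    return tuple(shape)


scalar = 7.0                                   # rank 0: a single number
vector = [1.0, 2.0, 3.0]                       # rank 1: one axis
matrix = [[1, 2, 3], [4, 5, 6]]                # rank 2: rows x columns
batch = [matrix, [[7, 8, 9], [10, 11, 12]]]    # rank 3: batch x rows x columns

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("batch", batch)]:
    shape = tensor_shape(value)
    print(f"{name}: rank={len(shape)}, shape={shape}")
```

The rank is simply the number of axes, so a batch of two 2x3 matrices has rank 3 and shape (2, 2, 3).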

Advantages:

− Supports a wide range of tasks, including deep learning, reinforcement
learning, and statistical modeling.
− Allows distributed training across multiple CPUs, GPUs, and TPUs
(Tensor Processing Units).
− Supports multiple languages, such as Python, C++, and JavaScript.
− Works well for both small experiments and large-scale production
systems.
− Provides extensive documentation, tutorials, and a large, active
community.

Disadvantages:

− Can be complex for beginners due to its extensive functionality.
− Requires significant computational power for training large models.
− Debugging TensorFlow models can be challenging.

Applications:

− Computer vision: image classification, object detection, and segmentation.
− Natural Language Processing (NLP): text generation, sentiment analysis,
and translation.
− Predictive analytics: analyzing trends and making predictions.

2) PyTorch

− PyTorch is an open-source machine learning framework developed by
Facebook (now Meta).
− It is widely used in academic research and industrial applications for its
flexibility, dynamic computation graph, and simplicity.
− PyTorch is based on the Python programming language and the Torch library.
Torch is an open-source library used for creating deep neural networks.

Advantages:

− Has a growing community with a wide range of tutorials and pre-trained
models available.
− Easily switches between CPU and GPU for faster computations.
− Allows changes to the model during runtime, making it easier to debug.
− It is Python-friendly and a natural fit for Python developers.

Disadvantages:

− Compared to TensorFlow, it requires additional tools for deploying
models.
− Its ecosystem is not as extensive as TensorFlow's.
− Fewer built-in utilities for automating workflows compared to other
libraries.

Applications:

− Used for image classification, object detection, and segmentation tasks.
− Enables tasks like text classification, sentiment analysis, and machine
translation.
− Used for building GANs (Generative Adversarial Networks) and other
generative systems.
− Supports advanced architectures like CNNs, RNNs, Transformers, and
more.

3) Scikit-learn

− Scikit-learn is a popular Python library for machine learning, built on top of
NumPy, SciPy, and matplotlib.
− It provides tools for traditional machine learning tasks like classification,
regression, clustering, and preprocessing.
− Its simplicity and versatility make it a good choice for both beginners and
seasoned data scientists to build and implement machine learning models.
− Includes a wide range of algorithms for supervised and unsupervised
learning. Examples: Decision Trees, SVM, K-Means.
− Provides tools for scaling, normalization, and encoding categorical
variables.
− Works easily with other Python libraries like Pandas and matplotlib for data
handling and visualization.
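
The preprocessing and algorithm tools listed above are usually combined in a pipeline. The following is a minimal sketch, assuming scikit-learn is installed and using its bundled Iris dataset; the choice of a depth-3 decision tree and a 25% test split is arbitrary and only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset: 150 flowers, 4 numeric features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data to measure performance on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Chain a scaling step and a classifier so both are fitted together.
model = make_pipeline(
    StandardScaler(),
    DecisionTreeClassifier(max_depth=3, random_state=0),
)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

Swapping `DecisionTreeClassifier` for another estimator (for example an SVM) leaves the rest of the code unchanged, which is a large part of why the library is considered beginner-friendly.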

Advantages

− It is user-friendly and easy for beginners to use.
− It covers a wide range of traditional machine learning tasks, including
regression, classification, clustering, and more.
− Provides extensive documentation and community support for
troubleshooting and learning.
− Optimized for small to medium-sized datasets.
− It is free and does not require a license.

Disadvantages

− Offers only basic neural networks (a simple multilayer perceptron) and no
GPU-accelerated deep learning.
− Not designed for distributed computing or handling massive datasets.
− Relies on external libraries like matplotlib or seaborn for data visualization.
− Limited support for applications that require real-time or dynamic updates.

Applications:

− Spam detection, disease diagnosis, and prediction of house prices, stock
values, or sales trends.

4) Keras

− Keras is a high-level, user-friendly deep learning API written in Python and
developed at Google for implementing neural networks.
− It is designed to enable fast experimentation with deep learning models and
is built on top of backend engines such as TensorFlow and Theano.
− It provides a simple, intuitive interface for building and training neural
networks.
− It includes tools for preprocessing, visualization, and training neural
networks.

Advantages

− Easy to Use: Simplifies the process of creating deep learning models with
minimal code.
− Extensive Documentation and Community: Offers detailed documentation,
tutorials, and a large community for support.
− Backend Flexibility: classic multi-backend Keras can switch between
TensorFlow, Theano, or CNTK without changing the code.
− Built-in Pretrained Models: Saves time with ready-to-use models for transfer
learning.
− Allows quick experimentation, making it ideal for research and iterative
development.

Disadvantages

− Limited Customization: it is difficult to customize models at a low level.
− Performance Overhead: Slightly slower compared to directly using backend
libraries like TensorFlow.
− Dependency on Backend: Requires a backend engine like TensorFlow for
computation.
− Less suitable for highly complex or research-focused models.

Applications

− Image processing, Natural Language Processing (NLP), speech
recognition, generative models, and transfer learning.

• Benefits of using managed ML platforms

Managed ML platforms provide a comprehensive, cloud-based environment for
building, training, deploying, and managing machine learning models. They are
designed to simplify the ML lifecycle.

1) Easy to use: Managed ML platforms make it easy to get started with machine
learning. There is no need to worry about setting up servers or complex systems.
They provide simple interfaces, such as drag-and-drop features, to create models
without deep technical knowledge.
2) Time saving: They automate tasks like data cleaning and model tuning, so
you can focus on solving the problem at hand. You can quickly build and train
models without wasting time on setup.
3) Scalability: These platforms dynamically allocate resources based on the
complexity of the task. For a small project they use fewer resources, saving
costs. For large projects or big data, they automatically scale up to handle
the extra work.
4) Cost effective: Users only pay for the resources they use. There's no
need to buy expensive hardware or worry about maintenance. The platform
optimizes resources to avoid waste, saving you money.
5) Integrated tools: The platform includes tools for everything: data handling,
model training, and deployment. You don't have to look for separate services
for each step of the ML process.
6) Pre-built models: Many platforms offer ready-made models for common
tasks. For example, you can use models that already understand images, text,
or speech. This saves you time by not having to train a model from scratch.
7) Security: Managed ML platforms keep your data safe. They encrypt your data
to protect it from unauthorized access. These platforms also follow security
standards to comply with legal requirements.
8) Easy collaboration: If you're working in a team, these platforms make it easy
to share models, data, and code with others on the same platform. Multiple
users can access and modify resources in real time, which helps everyone stay
on the same page.
9) Monitoring and maintenance: Once the model is deployed, these platforms
monitor it to make sure it is working well. They automatically track
performance, and you get alerts if something goes wrong. The platform helps
you keep your model updated when new data comes in.
10) Cloud integration: Since everything is in the cloud, there's no need for heavy
local infrastructure. You can access the platform and work from anywhere,
without needing powerful local machines. It also integrates with cloud storage
and computing services, so you don't have to worry about running out of
resources.
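
The pay-per-use argument in point 4 comes down to simple arithmetic. The sketch below compares renting cloud compute by the hour with buying hardware up front. Both prices are made-up placeholders, not real quotes from any provider, and the function names are invented for this illustration.

```python
def pay_as_you_go_cost(hours_used, hourly_rate):
    """Cost of renting compute only for the hours actually used."""
    return hours_used * hourly_rate


def break_even_hours(hardware_cost, hourly_rate):
    """Hours of cloud usage at which renting has cost as much as buying."""
    return hardware_cost / hourly_rate


HOURLY_RATE = 2.50       # hypothetical price of one cloud GPU hour
HARDWARE_COST = 6000.0   # hypothetical price of a comparable workstation

monthly_hours = 40
monthly_cost = pay_as_you_go_cost(monthly_hours, HOURLY_RATE)
hours_to_match = break_even_hours(HARDWARE_COST, HOURLY_RATE)
months_to_match = hours_to_match / monthly_hours

print(f"Cloud cost for {monthly_hours} h/month: {monthly_cost:.2f}")
print(f"Break-even after {hours_to_match:.0f} h (about {months_to_match:.0f} months)")
```

With light usage the break-even point lies years away, which is why pay-per-use suits occasional training jobs. Under heavy, constant usage the comparison can tip the other way.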

• Jupyter Notebook: introduction and workflow

− Jupyter Notebook is an open-source web application that allows users to
create and share documents containing live code, equations, visualizations,
and narrative text.
− It is widely used in data science, machine learning, and academic research for
its interactive and user-friendly interface.
− The name "Jupyter" comes from the combination of the Julia, Python, and R
programming languages.
− Although primarily used with Python, Jupyter supports various languages
such as R, Julia, and others.
− Jupyter integrates well with plotting libraries like Matplotlib, Seaborn, and
Plotly, allowing you to visualize data and results within the notebook itself.
− We can combine code, output, and explanatory text in one place, making
Jupyter a powerful tool for data exploration and reporting.

Components of Jupyter Notebook:

− Cells: The notebook consists of cells, where each cell can contain code or
text. There are two main types of cells:
o Code Cells: Contain code that can be executed within the
notebook.
o Markdown Cells: Contain formatted text, equations, and other
narrative components.
− Kernel: The kernel is responsible for executing code. When you open a
notebook, Jupyter launches a kernel associated with the language you
choose (Python is the default). The kernel maintains the state of variables
and objects during execution.
− Notebook Interface: The Jupyter interface allows you to add, delete, or
modify cells. It also lets you run code, view results, and export the notebook
to other formats (like HTML or PDF).
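
The cell and kernel concepts can be illustrated with the standard library alone. The sketch below builds a tiny notebook as a Python dictionary, serializes it to the JSON text an .ipynb file contains, and then runs the code cells against one shared namespace to mimic how a kernel keeps variables alive between cells. The field names follow the version 4 notebook layout as commonly documented. Treat the details as an approximation, and note that `exec` is only a stand-in here: a real kernel is a separate process that Jupyter talks to.

```python
import json

# A minimal notebook: one Markdown cell followed by two code cells.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# Demo\n", "Compute a total price."]},
        {"cell_type": "code", "execution_count": None, "metadata": {},
         "outputs": [], "source": ["prices = [10, 20, 30]"]},
        {"cell_type": "code", "execution_count": None, "metadata": {},
         "outputs": [], "source": ["total = sum(prices)"]},
    ],
}

# An .ipynb file is this structure written out as JSON text.
ipynb_text = json.dumps(notebook, indent=1)

# Mimic a kernel: a single namespace that survives from one cell to the next.
kernel_state = {}
for cell in json.loads(ipynb_text)["cells"]:
    if cell["cell_type"] == "code":
        exec("".join(cell["source"]), kernel_state)

# The second cell could use `prices` because the first cell defined it.
print(kernel_state["total"])
```

This also shows why running cells out of order can give confusing results: each cell sees whatever state earlier executions left behind.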

Advantages:

1. Interactive and Real-Time Execution:
You can write and run code in small chunks (cells), and immediately see the
output below the code. This makes it easy to test and experiment.
2. Great for Data Visualization:
Jupyter works well with libraries like Matplotlib, Seaborn, and Plotly,
allowing you to create and display charts and graphs directly within the
notebook.
3. Supports Multiple Languages:
Although Python is the most common, Jupyter supports many programming
languages like R, Julia, and Scala through different kernels, making it
versatile.
4. Easy to Share and Collaborate:
You can save your work as .ipynb files, share them with others, and even
upload them to platforms like GitHub. This allows easy collaboration.
5. Documentation and Code in One Place:
You can write explanations, notes, and even include LaTeX formulas
alongside the code in markdown cells. This makes it easy to document your
work.
6. Reproducibility:
Since Jupyter notebooks store both the code and its output, others can easily
reproduce your work by running the same code on their systems.
7. Great for Learning:
It's perfect for learning programming, as you can run code interactively, see
results, and make changes quickly without needing a full-fledged
development environment.

Disadvantages:

− Not Ideal for Large-Scale Projects:
Jupyter is best suited for small scripts or data analysis. For larger projects,
it can become hard to maintain, and managing complex code across many
cells can get messy.
− Limited Code Navigation:
Navigating through large codebases in Jupyter can be difficult. It’s not as
organized as traditional code editors, and you don’t have the same features
for code refactoring.
− Weak Version Control Support:
Jupyter has no built-in version control. Notebooks are stored as JSON files
that mix code, output, and metadata, so Git diffs are noisy and managing
different versions of a notebook can be challenging without extra tools.
− Performance Issues for Heavy Computations:
Jupyter is not optimized for large computations or running memory-
intensive tasks. The browser-based environment can slow down when
handling heavy workloads.
− Security Concerns:
Jupyter notebooks can be a security risk if they are not used properly,
especially when running untrusted code or sharing notebooks that contain
sensitive data.
− Not Ideal for Deployment:
Jupyter notebooks are great for development and analysis, but they are not
intended for production-level deployment. You'll need to convert them into
scripts or other formats for production use.

Workflow of Jupyter Notebook:

1. Setting Up the Environment
Install Jupyter Notebook:
Install Jupyter using pip install notebook, or conda install notebook if you
are using Anaconda.
Start Jupyter Notebook:
Run jupyter notebook in the command line or terminal. This will open a
web interface in your browser where you can create and manage notebooks.
Choose a Kernel:
Choose the programming language kernel (usually Python) that you want
to work with.

2. Create a New Notebook
From the Jupyter interface, click on the “New” button and select the
desired kernel (e.g., Python 3).
This will open a new notebook with an empty code cell.

3. Writing and Running Code
Write Code:
Type Python code (or another language depending on your kernel) into the
code cells.
Run Code:
Press Shift + Enter or click the “Run” button to execute the code in the
current cell. Jupyter will run the code and show the output below the cell.
Check Output:
The output can be anything from simple print statements to complex
visualizations, depending on what the code does.

4. Writing Documentation
You can add Markdown cells to document your process, explain your
code, and write mathematical equations in LaTeX format.
Markdown cells allow you to structure your notebook into sections, add
bullet points, hyperlinks, and even images to improve readability.

5. Adding Visualizations
You can integrate libraries like Matplotlib, Seaborn, or Plotly to generate
visualizations within the notebook.
Visualizations are automatically rendered and displayed directly beneath
the code that generates them.

6. Saving and Exporting
Save the Notebook:
Regularly save your notebook using Ctrl + S or the save button. Jupyter
notebooks are saved with the .ipynb file extension.
Export the Notebook:
You can export your notebook to formats like HTML, PDF, or slides for
sharing and presentation. Go to File > Download As and select your
preferred format.

7. Sharing the Notebook
Once you're done, you can share the .ipynb file with others, allowing them
to run the notebook on their systems, or you can upload it to platforms like
GitHub or Google Colab for others to access.

• Azure ML Studio

− Azure Machine Learning Studio (Azure ML Studio) is a cloud-based
integrated development environment (IDE) offered by Microsoft, designed
to build, train, and deploy machine learning models.
− It is part of Azure AI and provides tools to help data scientists and
developers to work on machine learning projects easily and efficiently.
− Azure ML Studio offers a user-friendly interface that enables you to
experiment with data, train models, and deploy machine learning solutions
without needing to handle complex infrastructure. The platform is scalable
and integrates seamlessly with other Microsoft services.

Key Features of Azure Machine Learning Studio


1. Drag-and-Drop Interface:
o Azure ML Studio provides a visual interface where you can drag
and drop modules to create machine learning models.
o It simplifies the process of building models without needing to
write a lot of code, making it easy for beginners to get started with
machine learning.
2. Pre-built Modules:
o The platform offers a wide range of pre-built modules for tasks
such as data preprocessing, model training, evaluation, and
deployment.
o These modules cover a variety of machine learning algorithms like
decision trees, regression models, clustering, and neural networks.
3. End-to-End Machine Learning Workflow:
o You can handle the entire machine learning lifecycle within Azure
ML Studio, from data ingestion to data preparation, training,
evaluation, and deployment.
o This allows for seamless workflows, reducing the need for different
tools or platforms during the development process.
4. Integration with Azure Services:
o Azure ML Studio integrates well with other Azure services, such as
Azure Data Lake, Azure Blob Storage, and Azure SQL Database,
for easy data access and storage.
o You can also deploy models to Azure Kubernetes Service (AKS) or
Azure App Services, making it simple to host and scale machine
learning applications.
5. Automated Machine Learning (AutoML):
o Azure ML Studio offers AutoML, which automates the process of
selecting the best model and hyperparameters for your data.
o AutoML is useful for users who might not have deep expertise in
machine learning, as it simplifies the process of finding the best
model.
6. Custom Code Support:
o While Azure ML Studio has a drag-and-drop interface, it also
allows you to write custom code using Python or R for more
advanced machine learning tasks.
o This flexibility ensures that experienced data scientists can still use
their preferred libraries and frameworks.
7. Collaboration:
o Azure ML Studio allows multiple users to collaborate on the same
project. You can share datasets, models, and results easily with your
team.
o This is particularly helpful for team-based projects, where different
team members can work on various stages of the machine learning
pipeline.
8. Model Deployment and Monitoring:
o Once a model is trained and evaluated, it can be deployed to
Azure's cloud infrastructure for real-time predictions.
o Azure ML Studio also provides tools for monitoring deployed
models, tracking their performance, and retraining them when
necessary.
9. Experimentation and Tracking:
o Azure ML Studio helps track experiments, enabling you to compare
different models, configurations, and training parameters.
o You can log and view metrics, making it easier to find the best-
performing model.
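
The idea behind AutoML (feature 5) and experiment tracking (feature 9) can be sketched in a few lines: try several candidate models, score each one the same way, record the results, and keep the best. The code below is only an illustration of that idea using scikit-learn and its bundled Iris dataset. It is not how Azure implements AutoML, and the three candidate models are an arbitrary choice.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Candidate models an automated search might consider (illustrative choice).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "k_nearest_neighbors": KNeighborsClassifier(n_neighbors=5),
}

# "Experiment log": mean 5-fold cross-validated accuracy per candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:22s} {score:.3f}")

best_name = max(scores, key=scores.get)
print("Selected model:", best_name)
```

A managed AutoML service does the same kind of comparison at a larger scale, also searching over preprocessing steps and hyperparameters and running the trials on cloud compute.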

Workflow in Azure Machine Learning Studio

The typical workflow in Azure Machine Learning Studio involves the
following steps:
1. Data Preparation:
o Start by importing data from various sources such as local files,
Azure Blob Storage, or databases.
o Use data cleaning modules to handle missing values, data
transformation, and normalization.
2. Exploratory Data Analysis (EDA):
o Use statistical and visualization tools to understand the data,
check for trends, and find correlations.
3. Model Building:
o Drag-and-drop machine learning modules onto the canvas to
construct your model. Choose the right algorithm (e.g., decision
trees, logistic regression, neural networks) based on your problem.
o Alternatively, use AutoML for automatic model selection and
hyperparameter tuning.
4. Model Training:
o Train the model using the prepared dataset. Azure ML Studio
provides various options to control the training process, such as
specifying the training algorithm and adjusting parameters.
5. Model Evaluation:
o Evaluate the model’s performance using metrics like accuracy,
precision, recall, F1-score, and ROC curves.
o Compare different models to select the one that performs best.
6. Model Deployment:
o Once you have a trained and evaluated model, deploy it using
Azure’s cloud infrastructure for real-time prediction services.
o You can deploy models to Azure Kubernetes Service (AKS) or
Azure Container Instances for scaling.
7. Monitoring and Maintenance:
o Monitor the deployed model’s performance to ensure it is
delivering accurate predictions.
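
The evaluation metrics named in step 5 all derive from the four counts of a binary confusion matrix. Here is a small worked example in plain Python. The counts (40 true positives, 10 false positives, 20 false negatives, 30 true negatives) are invented for illustration, and the function name is not part of any Azure API.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision: of everything predicted positive, how much was right?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything actually positive, how much was found?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Here accuracy is 70/100 = 0.7, precision is 40/50 = 0.8, recall is 40/60 (about 0.667), and F1 is 8/11 (about 0.727). Looking at all four together matters because a model can score well on one metric while doing poorly on another.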

Advantages:

1. User-Friendly:
The drag-and-drop interface simplifies machine learning for non-
experts and speeds up the workflow for data scientists.
2. Scalable and Flexible:
Azure ML Studio can handle large datasets and scale according to the
needs of the project, making it suitable for both small experiments and
large enterprise applications.
3. Integration with Azure Ecosystem:
Azure ML Studio works seamlessly with other Azure services like
storage, databases, and compute resources, enhancing its capabilities
and ease of use.
4. Collaborative:
The platform supports collaboration, allowing teams to work together
on projects and share insights efficiently.
5. Automated ML:
AutoML makes it easy for users to build models without needing deep
expertise, allowing for faster experimentation.
6. Secure:
With Azure’s robust security features, you can ensure that your data
and models are protected and comply with industry regulations.

Disadvantages of Azure Machine Learning Studio

1. Learning Curve:
While the platform is user-friendly, new users might still face a
learning curve, especially those unfamiliar with machine learning
concepts or the Azure ecosystem.
2. Cost:
Although Azure offers flexible pricing, the costs can add up when
dealing with large datasets or deploying models at scale. Budgeting
for usage is necessary.
3. Limited Customization in Some Areas:
While the drag-and-drop interface is helpful for beginners, it can be
limiting for advanced users who want complete control over the
model-building process.
4. Internet Dependency:
As a cloud-based service, Azure ML Studio requires an internet
connection, which might be a limitation in environments with
unreliable internet access.
5. Resource Intensive:
Running large-scale models or processing large datasets can require
significant cloud resources, which may lead to high operational costs.

• Google AutoML Vision

− Google AutoML Vision is a tool from Google that helps you create machine
learning models that work with images, for tasks like recognizing objects or
classifying images, without needing deep knowledge of machine learning.

Key Features:
1. Image Classification:
You can train a model to sort images into categories (e.g., identifying
different types of animals or products).
2. Object Detection:
The tool can locate and identify multiple objects within an image, like
detecting cars or people in a photo.
3. Transfer Learning:
AutoML Vision uses existing, pre-trained models and adjusts them to
your specific data, which saves you time and resources.
4. Easy-to-Use Interface:
It’s simple to use, even for people with no machine learning background.
Just upload images, and the system takes care of the rest.
5. Fast Model Training:
AutoML Vision can train models quickly because it uses powerful
Google Cloud computing.
6. Performance Evaluation:
After training, it tells you how well your model is doing, with clear
feedback on accuracy and other performance measures.
7. Automatic Tuning:
The system automatically adjusts settings to improve the model’s
performance, making it more efficient.
8. Cloud-Based:
All the processing happens in the cloud, so you don’t need to worry
about setting up servers or hardware.
How it Works:
1. Prepare Your Data:
First, collect your images and label them (like "cat," "dog,"
etc.).
2. Train the Model:
You upload your labeled images to Google Cloud, and AutoML Vision
trains a model to recognize patterns in those images.
3. Evaluate the Model:
After training, AutoML Vision checks how well the model is
performing, and you can improve it if needed.
4. Deploy the Model:
Once the model is trained, you can use it in apps or websites to
automatically classify or detect objects in new images.
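
The data-preparation step usually ends with a file that pairs each image with its label. The sketch below builds such a list as a plain `image_uri,label` CSV. This layout is an assumption made for illustration; the exact import format AutoML Vision expects should be checked in Google's documentation. The bucket name, folder layout, and file names are all hypothetical.

```python
import csv
import io

# Hypothetical labelled images, grouped by label.
labelled_images = {
    "cat": ["cat_001.jpg", "cat_002.jpg"],
    "dog": ["dog_001.jpg"],
}
BUCKET_PREFIX = "gs://example-bucket/images"  # hypothetical storage location


def build_manifest(images_by_label, prefix):
    """Return (image_uri, label) rows, one per image, sorted by label."""
    rows = []
    for label in sorted(images_by_label):
        for filename in images_by_label[label]:
            rows.append((f"{prefix}/{label}/{filename}", label))
    return rows


rows = build_manifest(labelled_images, BUCKET_PREFIX)

# Write the rows as CSV text (kept in memory here instead of a file on disk).
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
manifest_csv = buffer.getvalue()
print(manifest_csv)
```

Keeping labels in a simple manifest like this also makes it easy to check class balance before spending money on training.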
Advantages:
1. Easy for Beginners:
You don’t need to be an expert in machine learning to use it.
2. Saves Time:
AutoML Vision speeds up the process of training and deploying
models.
3. Good Accuracy:
It uses powerful pre-trained models, so even with small datasets, it can
still give good results.
4. Scalable:
It can handle large datasets and perform heavy computations in the
cloud.
5. Customizable:
You can create models that are specific to your needs, like identifying
your own products or objects.
Disadvantages:
1. Cost:
It can become expensive if you have a lot of data or need to train
many models.
2. Limited Flexibility:
If you need very specific customizations, you might find it limiting
compared to building your own model from scratch.
3. Data Privacy:
Since it uses Google Cloud, sensitive data may need to be handled
carefully to ensure privacy.
4. Cloud Dependency:
It requires you to use Google Cloud, which might not be ideal if
you're using another cloud service.

Applications :

Retail, Healthcare, Security, Agriculture, Self-Driving Cars


• AWS SageMaker

− Amazon SageMaker is a fully managed machine learning (ML) service
offered by Amazon Web Services (AWS) that makes it easier to build, train,
and deploy machine learning models.
− It helps developers and data scientists to focus on creating models without
having to manage the underlying infrastructure.

Key Features of AWS SageMaker

1. Build Models:
SageMaker provides tools to help you prepare your data, select the right
algorithms, and create a model from scratch or using pre-built templates.
You don’t need to worry about setting up servers or managing resources.
2. Train Models:
Once you’ve prepared your data and built your model, SageMaker lets
you train it using AWS’s powerful cloud computing resources. It helps
speed up the training process by handling the infrastructure and scaling
automatically.
3. Hyperparameter Tuning:
SageMaker can automatically adjust the settings (hyperparameters) of
your model to improve its performance. This is done using techniques like
Bayesian optimization, which helps find the best parameters for your
model.
4. Deploy Models:
After training, SageMaker makes it easy to deploy your model into
production. It automatically sets up endpoints, allowing you to get
predictions from your model quickly and at scale.
5. Model Monitoring:
SageMaker provides tools to monitor the performance of your model in
real-time. You can track its accuracy, ensure it continues to work well
over time, and retrain it if needed.
6. Pre-built Algorithms:
If you don’t want to build a model from scratch, SageMaker offers several
pre-built machine learning algorithms that you can use for different tasks,
such as image classification, text analysis, and recommendation systems.
7. Notebook Instances:
SageMaker provides Jupyter Notebooks as part of the service, where you
can write and run your code interactively. It’s a convenient way to explore
data, build models, and visualize results.
8. Integration with Other AWS Services:
SageMaker integrates with other AWS services, such as Amazon S3 for
storage and AWS Lambda for serverless functions, making it easier to
build end-to-end machine learning workflows.
9. Security and Compliance:
AWS SageMaker offers built-in security features like encryption, and it
complies with various standards, making it suitable for industries that
require high levels of data security and privacy.
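
Hyperparameter tuning (feature 3) means repeatedly training with different settings and keeping the best result. The notes mention Bayesian optimization. The sketch below deliberately uses plain random search instead, because it is the simplest strategy that shows the same loop. The objective function is a toy stand-in for "train a model and return its validation score", and the parameter names `learning_rate` and `depth` are illustrative only.

```python
import random


def validation_score(learning_rate, depth):
    """Toy stand-in for training a model; peaks at learning_rate=0.1, depth=6."""
    return 1.0 - 10 * (learning_rate - 0.1) ** 2 - 0.01 * (depth - 6) ** 2


def random_search(n_trials, seed=0):
    """Sample random settings and return the best (params, score) found."""
    rng = random.Random(seed)  # fixed seed so the search is repeatable
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {
            "learning_rate": rng.uniform(0.01, 0.3),
            "depth": rng.randint(2, 10),
        }
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(n_trials=50)
print("Best settings:", best_params)
print(f"Best score: {best_score:.3f}")
```

Bayesian optimization improves on this by using the scores of earlier trials to decide which settings to try next, so it typically needs fewer training runs to reach a good configuration.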

How AWS SageMaker Works

1. Prepare Data:
First, you prepare and clean your data. SageMaker integrates with Amazon
S3 for storing data, so you can easily access and process it.
2. Build and Train Models:
You select or build a machine learning algorithm and train the model on
your data. SageMaker provides several options, such as using built-in
algorithms or bringing your custom algorithms.
3. Optimize the Model:
SageMaker uses hyperparameter optimization to improve your model by
finding the best settings that make it more accurate.
4. Deploy the Model:
Once trained, the model is deployed to an endpoint, where it can start
making predictions on new data.
5. Monitor the Model:
After deployment, you can track the performance of the model, ensure
it’s working well, and retrain it if necessary.
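
The monitoring step can be pictured as a rolling accuracy check that raises a flag when quality drops. The following is a minimal sketch in plain Python, not SageMaker's monitoring API. The class name, the window size of 5, the 0.8 threshold, and the sample stream of predictions are all invented for illustration.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag drops."""

    def __init__(self, window_size=5, threshold=0.8):
        self.results = deque(maxlen=window_size)  # keeps only the newest results
        self.threshold = threshold

    def record(self, prediction, actual):
        self.results.append(prediction == actual)

    def accuracy(self):
        return sum(self.results) / len(self.results) if self.results else None

    def needs_retraining(self):
        # Only judge the model once a full window of results is available.
        window_full = len(self.results) == self.results.maxlen
        return window_full and self.accuracy() < self.threshold


monitor = AccuracyMonitor(window_size=5, threshold=0.8)

# (prediction, actual) pairs: the model starts well, then begins to fail.
stream = [(1, 1), (0, 0), (1, 1), (1, 1), (0, 0), (1, 0), (0, 1), (1, 0)]

alerts = []
for step, (prediction, actual) in enumerate(stream, start=1):
    monitor.record(prediction, actual)
    if monitor.needs_retraining():
        alerts.append(step)

print("Retraining suggested at steps:", alerts)
```

In this stream the first miss at step 6 leaves the rolling accuracy at exactly 0.8, so no alert fires. The second and third misses push it below the threshold, which is when a retraining job would be triggered.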

Advantages of AWS SageMaker

1. Fully Managed:
SageMaker takes care of the infrastructure, so you don’t need to
manage servers or worry about scaling. This saves time and resources.
2. Flexible and Scalable:
It allows you to use a wide range of machine learning frameworks (like
TensorFlow, PyTorch) and scale up or down based on your needs. It
also integrates with other AWS services to create complete ML
pipelines.
3. Easy to Use:
SageMaker simplifies many machine learning tasks like data
preparation, training, and deployment, making it easier even for
beginners or those with less ML expertise.
4. Cost-Effective:
You only pay for the resources you use. SageMaker offers flexible
pricing, so you can scale as needed without upfront costs.
5. Speed:
SageMaker provides fast processing and model training, using AWS’s
powerful computing infrastructure. This makes it suitable for projects
with large datasets or complex models.
6. Security:
SageMaker provides features like encryption for data and complies
with industry standards, making it a secure option for businesses
dealing with sensitive data.

Disadvantages of AWS SageMaker

1. Complexity for Beginners:
While SageMaker simplifies many tasks, it can still be complex for
people who are brand new to machine learning or cloud services.
Understanding how to properly use AWS services alongside
SageMaker can take time.
2. Costs Can Add Up:
Depending on the resources and models used, costs can increase,
especially if you are working with large datasets or require a lot of
training time.
3. Learning Curve:
For those not familiar with AWS, there is a learning curve to
understanding how SageMaker fits into the larger AWS ecosystem.
4. Limited Free Tier:
While SageMaker has a free tier, it’s limited in terms of resources.
Larger or more complex projects may require a paid plan.

Applications:

Predictive Analytics, Image Recognition, Natural Language Processing
(NLP), Fraud Detection, Recommendation Systems, Autonomous Systems
