RPA-Unit-3 (Notes-Short-&-Typed-AKM)
RPA UNIT-3
Web scraping and screen scraping are the main techniques used in data extraction.
They differ in terms of the type of data collected and the practices used to collect this
public data.
However, web scraping and screen scraping terms frequently overlap. You will
require a different data collection tool depending on the data type. Understanding the
distinctions between these terms is critical for selecting the right data scraping
tool for your business’s needs and use case.
This article explains the differences between web scraping and screen scraping in
terms of use cases and methods used to extract data.
The extracted visual data is displayed or used in another application or system for
various business purposes. There are three screen scraping techniques to collect
screen display data from applications, documents, and information systems.
2. Native: Unlike FullText and OCR, with native screen scraping techniques,
users can only extract data from apps. The native screen scraping method
allows users to extract the screen coordinates of each word on a screen. It
cannot retrieve hidden text.
Screen scraping is used for collecting customers’ financial data and data on financial
transactions between banks and third-party providers (TPPs). Screen scraping
enables TPPs to access customers’ financial data, such as transaction history and
app login credentials, with their permission.
2. Marketing
Screen scraping allows businesses to monitor product reviews and prices and to check
whether their ads are displayed on the desired platforms.
3. Quality Assurance
Companies use screen scraping technology to perform user experience (UX) and
user interface (UI) QA on their websites. Screen scraping allows developers and test
automation engineers to verify that a website looks and functions as intended.
Screen scraping tools extract graphical user interface (GUI) elements from a
website or application, such as menus, buttons, and icons, to determine whether its
functionality and design structure work as required.
The other differences between screen scraping and data scraping are-
While traditional web scraping libraries like Beautiful Soup and Requests are
excellent for fetching and parsing static HTML, some websites rely heavily on JavaScript to
load dynamic content. In such cases, browser automation tools like Selenium or
Puppeteer, which can drive a headless browser, can be incredibly useful. These tools
automate browsers, allowing you
to interact with web pages just like a human user. You can trigger JavaScript
actions, fill out forms, and scrape dynamically generated content.
For example, you can scrape data from websites that use infinite scrolling,
interactive maps, or require user logins for access.
2. API Scraping
API scraping is particularly valuable when dealing with websites like social
media platforms, e-commerce sites, or data providers that offer API access.
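As a rough illustration of API scraping, the sketch below walks through a paginated JSON API and collects every item. The `items` and `next_page` field names and the `fake_fetch` function are assumptions made for this example; `fake_fetch` stands in for a real HTTP call so that the code runs without network access.

```python
import json
from typing import Callable, Iterator, Optional


def iter_api_items(fetch_page: Callable[[int], str]) -> Iterator[dict]:
    """Yield every item from a paginated JSON API.

    fetch_page(page_number) must return the raw JSON text of one page.
    The "items" and "next_page" field names are assumptions for this sketch.
    """
    page: Optional[int] = 1
    while page is not None:
        payload = json.loads(fetch_page(page))
        yield from payload["items"]
        page = payload.get("next_page")  # a missing or null value ends the loop


# Hypothetical canned responses that replace a real HTTP request.
_FAKE_PAGES = {
    1: {"items": [{"id": 1}, {"id": 2}], "next_page": 2},
    2: {"items": [{"id": 3}], "next_page": None},
}


def fake_fetch(page: int) -> str:
    return json.dumps(_FAKE_PAGES[page])


items = list(iter_api_items(fake_fetch))
print(items)
```

In a real project, `fake_fetch` would be replaced by a function that calls the provider's documented endpoint and returns the response body.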
Proxy rotation is essential for large-scale scraping projects or when dealing with
websites that have strict anti-scraping measures in place.
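One simple way to rotate proxies is round-robin selection, sketched below. The proxy addresses are placeholders from a documentation-only IP range and the URLs are hypothetical; a real pool would come from your proxy provider.

```python
from itertools import cycle
from typing import Iterable, Iterator


def rotating_proxies(proxies: Iterable[str]) -> Iterator[str]:
    """Return an endless round-robin iterator over the given proxies."""
    pool = list(proxies)
    if not pool:
        raise ValueError("at least one proxy is required")
    return cycle(pool)


# Placeholder addresses (203.0.113.0/24 is reserved for documentation).
proxy_pool = rotating_proxies([
    "http://203.0.113.10:8080",
    "http://203.0.113.11:8080",
    "http://203.0.113.12:8080",
])

# Pair each (hypothetical) URL with the next proxy in the rotation.
urls = [f"https://example.com/page/{n}" for n in range(1, 6)]
assignments = [(url, next(proxy_pool)) for url in urls]

for url, proxy in assignments:
    print(url, "via", proxy)
```

With the Requests library, each pairing could then be used as `requests.get(url, proxies={"http": proxy, "https": proxy})`.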
5. Scraping as a Service
For businesses and organizations with frequent scraping needs, there are
scraping-as-a-service providers like ScrapingHub that offer managed web
scraping solutions. These platforms provide scalable infrastructure, scheduling,
and monitoring, allowing you to focus on the data rather than the technicalities
of scraping.
Integrating machine learning and natural language processing (NLP) into your
scraping pipeline can help you extract valuable insights from unstructured text
data. You can use libraries like spaCy, NLTK, or Transformers to perform tasks
like sentiment analysis, entity recognition, and topic modeling on scraped text.
UiPath is one of the most popular RPA tools used for Windows desktop automation.
It is used to automate repetitive tasks without human intervention, and it offers
drag-and-drop activities that you must have learned about in the previous
blogs. In this blog on Error Handling in UiPath, I will cover the basics of how you
can handle errors in projects.
Error Handling in UiPath mainly consists of two topics that you need to understand:
• Debugging
• Exception Handling
Once you go through the above two topics, we will discuss a few tips & tricks which will
make you aware of some common errors and how to avoid them.
Debugging
Debugging, in simple terms, is the process of identifying and removing errors from the
project. To debug errors, you need to go to the Execute tab. The Execute tab has
three sections: the Launch section, the Debug section, and the Logs section, as you can see in the
image below:
Launch Section:
As you can see in the above image, the Run option is used when you simply want to
execute your project. With this option, you do not see the step-by-step
execution; you directly see the output if the project executes successfully.
The Stop button is used to stop the execution of your project midway,
and Debug is used to debug errors step by step.
Debug Section:
• Steps are used to execute your project step by step. So, when you click on Step Into,
it executes the next step and then it waits.
• The Validate button is used to validate your project and check whether it has any
errors. When you choose this option, UiPath checks your automation and reports
any errors it finds.
• Breakpoints are the points at which you want to stop the execution and start
debugging step by step. The breakpoints button offers two options:
▪ Toggle Breakpoints
▪ Remove All Breakpoints
• Slow Step slows down your execution so that you can keep track of what is happening.
• Options provide various highlighting options to highlight the activities. So, you can
use this when you want to highlight any activity while you are debugging your project.
Logs Section:
The log section has only one option, which is Open Logs.
The Open Logs button lets you debug the program with the help of the logs. You can
check where your values went wrong from the logs.
So, that was about Debugging folks. Let us move to our next topic which is Exception
Handling.
Exception Handling
Exception Handling mainly deals with handling errors with respect to various activities
in UiPath. The Error Handling category offers four activities: Rethrow, Terminate
Workflow, Throw, Try Catch.
• Rethrow is used inside a Catch block when you want some activities (such as logging)
to run before the caught exception is thrown again.
• Terminate Workflow is used to terminate the workflow the moment the task
encounters an error.
• Throw activity is used when you want to raise an exception deliberately, so that the
steps that follow are not executed.
• Try Catch activity is used when you want to test something and handle the exception
accordingly. So, whatever you want to test you can put it under the try section, and
then if any error occurs, then it can be handled using the catch section, based on your
input to the catch section. Apart from the try-catch, we also have a Finally section
which is used to mention those activities which have to be performed after the try and
catch blocks are executed.
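UiPath activities are configured graphically, so the following is only a Python analogy of how Throw, Try Catch, Finally, and Rethrow relate to each other. The `post_invoice` function and its business rule are hypothetical and exist only for illustration.

```python
events = []


def post_invoice(amount: int) -> str:
    # "Throw": raise an exception deliberately when a business rule is violated.
    if amount < 0:
        raise ValueError("invoice amount cannot be negative")
    return f"invoice for {amount} posted"


def run(amount: int) -> None:
    try:
        # "Try": the step that may fail goes here.
        events.append(post_invoice(amount))
    except ValueError as exc:
        # "Catch": handle the error, for example by recording it ...
        events.append(f"caught: {exc}")
        # ... then "Rethrow" it so the caller also learns about the failure.
        raise
    finally:
        # "Finally": runs whether or not an exception occurred.
        events.append("cleanup")


run(100)
try:
    run(-5)
except ValueError:
    events.append("handled by caller")

print(events)
```

Note that the cleanup step is recorded for both the successful and the failing run, which mirrors the role of the Finally section.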
Now that you folks know the various options that UiPath offers for handling errors, it
is a good time to look at the common mistakes that people make and learn how to
resolve them.
Debugging techniques
There are various techniques provided by UiPath Studio for debugging in order to check
whether the workflow is running successfully or to find out errors in order to rectify them. At
the top of the UiPath window, we can see various available methods of debugging inside
the EXECUTE block, as shown in the following screenshot:
• Setting breakpoints
• Slow step
• Highlighting
• Break
1. Exception Handling:
• Try-Catch Blocks: Implement try-catch blocks around sections of code
where errors are likely to occur. The try block contains the code that may
raise an exception, while the catch block handles the exception if it
occurs.
• Specific Exception Handling: Catch specific types of exceptions rather
than generic ones. This allows for more granular error handling and
enables different actions to be taken depending on the type of error
encountered.
• Finally Block: Use a finally block to execute cleanup code that should
run regardless of whether an exception occurs. This is useful for releasing
resources or performing other necessary tasks.
2. Logging:
• Comprehensive Logging: Implement logging throughout your
automation scripts to record important information about the execution
process, including steps completed, errors encountered, and variable
values.
• Error Logging: Ensure that errors and exceptions are logged with
sufficient detail to aid in troubleshooting and diagnosis. Include
timestamps, error messages, stack traces, and any relevant contextual
information.
• Log Levels: Utilize different log levels (e.g., DEBUG, INFO, WARN,
ERROR) to categorize log messages based on their importance and
severity. This allows for better control over the amount of detail logged
and helps in filtering and prioritizing log messages.
3. Notification Mechanisms:
• Alerts and Notifications: Implement mechanisms to alert stakeholders
when errors occur during automation execution. This could involve
sending emails, SMS messages, or notifications to a monitoring
dashboard.
• Thresholds and Escalations: Define thresholds for acceptable error rates
or response times, and configure escalation procedures to notify
appropriate personnel if these thresholds are exceeded. This ensures that
errors are promptly addressed and escalated if necessary.
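The three practices above can be combined in a single batch run, sketched here in Python with the standard `logging` module. The record layout, the 25% threshold, and the `notify` callback (which stands in for an email, SMS, or dashboard alert) are assumptions chosen for this example.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",  # timestamp + level + message
)
log = logging.getLogger("rpa_bot")

ERROR_RATE_THRESHOLD = 0.25  # assumed acceptable share of failed records


def process(record: dict) -> int:
    """Hypothetical processing step: read and convert the 'amount' field."""
    return int(record["amount"])


def run_batch(records: list, notify) -> float:
    failures = 0
    for index, record in enumerate(records):
        try:
            value = process(record)
            log.debug("record %d processed with value %d", index, value)
        except KeyError as exc:
            # Specific handler: a required field is missing.
            failures += 1
            log.warning("record %d is missing field %s", index, exc)
        except ValueError:
            # Specific handler: the field is present but not numeric.
            failures += 1
            log.error("record %d has a non-numeric amount", index, exc_info=True)
        finally:
            log.info("finished record %d", index)

    error_rate = failures / len(records) if records else 0.0
    if error_rate > ERROR_RATE_THRESHOLD:
        # Escalate only when the threshold is exceeded.
        notify(f"error rate {error_rate:.0%} exceeded threshold")
    return error_rate


alerts = []
rate = run_batch(
    [{"amount": "10"}, {"amount": "abc"}, {}, {"amount": "7"}],
    notify=alerts.append,
)
print(rate, alerts)
```

Because the log level is set to INFO, the DEBUG messages are filtered out while warnings and errors (with a stack trace via `exc_info=True`) are kept.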
Security in RPA:
5. Audit Trails and Logging: Maintain detailed audit trails and logs of RPA
activities to track who accessed the system, what actions were performed,
and when they occurred. This information is critical for detecting and
investigating security incidents.
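A minimal sketch of an audit trail entry, assuming one JSON object per line with a UTC timestamp. The field names, user names, and targets are hypothetical, and an in-memory stream is used in place of a real, access-controlled log store.

```python
import io
import json
from datetime import datetime, timezone
from typing import TextIO


def write_audit_entry(stream: TextIO, user: str, action: str, target: str) -> dict:
    """Append one audit record (who, what, when) as a JSON line."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # when
        "user": user,                                         # who
        "action": action,                                     # what
        "target": target,                                     # on which resource
    }
    stream.write(json.dumps(entry) + "\n")
    return entry


# In-memory stream used here so the example has no side effects.
audit_log = io.StringIO()
write_audit_entry(audit_log, "bot_01", "read", "invoices.xlsx")
write_audit_entry(audit_log, "alice", "update_credentials", "orchestrator")

entries = [json.loads(line) for line in audit_log.getvalue().splitlines()]
for entry in entries:
    print(entry["timestamp"], entry["user"], entry["action"], entry["target"])
```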
Compliance in RPA
Compliance in Robotic Process Automation (RPA) refers to ensuring that RPA
implementations adhere to relevant laws, regulations, and industry standards.
Compliance is essential for mitigating legal risks, protecting sensitive data, and
maintaining trust with customers and stakeholders.
2. Data Privacy: Ensure that RPA processes comply with data privacy laws and
regulations, such as the General Data Protection Regulation (GDPR) in the
European Union or the California Consumer Privacy Act (CCPA) in the United
States. This includes obtaining consent for data processing, implementing data
minimization practices, and ensuring the security of personal information.
3. Sarbanes-Oxley Act (SOX): SOX is a U.S. federal law that mandates strict
governance and financial disclosure requirements for publicly traded
companies. RPA implementations in finance and accounting must comply with
SOX regulations to ensure accurate financial reporting and prevent fraud.
4. Payment Card Industry Data Security Standard (PCI DSS): PCI DSS is a
set of security standards designed to ensure that all companies that accept,
process, store, or transmit credit card information maintain a secure
environment. RPA solutions involved in payment processing must comply with
PCI DSS to protect cardholder data.
5. Process Discovery and Mining: Process discovery and mining tools analyze
digital footprints to identify inefficiencies and opportunities for automation
within business processes. Future RPA solutions will incorporate advanced
process discovery capabilities to automatically identify, prioritize, and optimize
processes for automation.
6. Robotic Process Intelligence (RPI): RPI combines RPA with analytics and
intelligence capabilities to provide insights into process performance,
compliance, and optimization opportunities. Future RPA platforms will
incorporate RPI features to enable continuous improvement and monitoring of
automated processes.