Capstone - Project - 3 Report

Uploaded by

Gladwin Tirkey
AI for Future Workforce
Gladwin Tirkey
News Summary in Shorts

The Intel® Digital Readiness Programs and Intel® AI for Future Workforce program are developed by
Intel Corporation. ©Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of
Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of
others. All rights reserved. Program dates and lesson plans are subject to change. Intel technologies
may require enabled hardware, software, or service activation. No product or component can be
absolutely secure. Results have been estimated or simulated. Intel does not control or audit third-party
data. You should consult other sources to evaluate accuracy. Your costs and results may vary.
1. Introduction

The application I worked on takes a URL from the user and lets the user select a
word count. It then scrapes the article text from the given URL, breaks that text
into tokens, and removes stop words and punctuation. Next it counts the frequency
of every remaining word and divides each count by the count of the most frequent
word. These normalized scores are stored in a dictionary keyed by word. Every
sentence is then extracted from the text, its words are matched against the
dictionary, and the matching scores are added up to give the sentence an
accumulated score. These scores are used to rank the sentences and create a
summary of the news.

1.1 Problem Statement

The application is given the URL of a news website. Using that URL, it
extracts the news article and returns a summary of the news within the
word count provided by the user.

2. Methodology

2.1 Systems Thinking

A system is made up of a group of interconnected components. Systems
thinking is a holistic way of thinking that shows how different pieces
of a jigsaw puzzle come together to form a complete picture. Changing one
part of a system may affect other parts or the whole system, and it may be
possible to predict these changes in patterns of behavior. Now we shall look
at the whole system of the problem.

Traditional thinking vs systems thinking

We can apply the systems thinking process to our news summarization problem.
By breaking the system into components and devising relations between them,
we can come up with a systems map. The components of our system are as
follows:
• BeautifulSoup (web scraping)
• spaCy (provides the tools to tokenize words)
• Streamlit (for constructing a quick frontend)
• FastAPI (for constructing the backend)
• React.js (for constructing the frontend)
• Docker (for building images that can be run in a virtual environment)

All these components are interconnected and strongly affect one another.
Here we present a systems diagram of the problem at hand:

2.2 Approach

First and foremost, the most important part of the project is the data that
needs to be collected from the news website. To collect that data we have to
learn how to use Beautiful Soup and understand the inner workings of HTML
pages. We have to learn how paragraphs are organized on a news website and
then use Beautiful Soup to extract that information.
Once the data has been extracted, we remove the stop words and punctuation
that add no context to the crux of the news. The resulting summary is then
displayed in our frontend application.

2.3 Design Thinking

Design Thinking is an iterative process in which we seek to understand the
user, challenge assumptions, and redefine problems in an attempt to identify
alternative strategies and solutions that might not be instantly apparent with
our initial level of understanding. What is special about Design Thinking is
that designers' work processes can help us systematically extract, teach,
learn, and apply these human-centered techniques to solve problems in a
creative and innovative way: in our designs, in our businesses, in our
countries, in our lives. Here we will see how we can design a system that
summarizes news according to a word count:
Empathy: Summarizing news in shorts gives readers the opportunity to save time
while still gathering information from the news.
Define: To summarize news efficiently, the sentences have to be ranked
properly.
Ideate: How can the problem be solved, with or without AI?
Prototype: Choose feasible ideas from the pool of ideas, select the ones
that can be tested in the real world, and see how well the summarization
really works.
Test: Get some quick comments about problems identified in the prototype
phase and come up with a full-fledged solution.

3. Implementation

The first step is importing the libraries. Here we use BeautifulSoup for
scraping, requests for sending HTTP requests, and pandas for data
manipulation.

Finding all the Paragraphs:

Removing Tags from paragraph:
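
The two steps above (finding the paragraphs, then stripping their tags) can be sketched as follows. The report uses BeautifulSoup for this, where the equivalent calls would be `BeautifulSoup(html, "html.parser").find_all("p")` followed by `get_text()` on each paragraph. To stay runnable without extra packages, this sketch swaps in Python's standard-library `html.parser` instead. The class name, function name, and sample HTML are illustrative assumptions, not the project's actual code.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the plain text of every <p> element, dropping inline tags."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []  # finished paragraph strings
        self._depth = 0       # greater than 0 while inside a <p> element
        self._parts = []      # text fragments of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join the fragments and collapse runs of whitespace.
                text = " ".join("".join(self._parts).split())
                if text:
                    self.paragraphs.append(text)
                self._parts = []

    def handle_data(self, data):
        # Text outside <p> elements (headlines, ads, menus) is ignored.
        if self._depth > 0:
            self._parts.append(data)


def extract_paragraphs(html):
    """Return the tag-free text of each paragraph in the HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Hypothetical page content standing in for a downloaded news article.
sample_html = """
<html><body>
  <h1>Headline</h1>
  <p>Markets <b>rose</b> on Monday.</p>
  <div class="ad">Buy now</div>
  <p>Analysts expect   more gains.</p>
</body></html>
"""
paragraphs = extract_paragraphs(sample_html)
```

In the real application the HTML string would come from an HTTP request to the user's URL rather than from a literal.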

Importing spaCy for removing stop words and tokenizing words:
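
A minimal sketch of this step is given below. The report relies on spaCy, whose tokens expose `is_stop` and `is_punct` attributes for this kind of filtering. The sketch instead uses a regular expression and a tiny hand-written stop-word list, both assumptions chosen so that it runs with the standard library alone. The function names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); spaCy ships a much larger one.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "on", "the", "to"}


def content_words(text):
    """Lowercase the text, split it into word tokens, and drop stop words.

    The regex keeps only runs of letters, digits, and apostrophes,
    so punctuation is discarded during tokenization.
    """
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def word_frequencies(text):
    """Count how often each remaining word occurs."""
    return Counter(content_words(text))


text = "The markets rose on Monday. Analysts say the markets are strong."
freqs = word_frequencies(text)
```

Here `freqs` maps each content word to its raw count; these counts are the input to the normalization step.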
Finding maximum frequency of words:

This max_freq is used to normalize the word frequencies, and the normalized
frequencies are then used to score the sentences that contain those words.
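
The normalization and scoring can be sketched as below, assuming the word counts are already available as a dictionary. Only `max_freq` comes from the report; the function names and the sample data are illustrative.

```python
import re


def normalize(word_freq):
    """Divide every count by the largest count, so scores fall in (0, 1]."""
    if not word_freq:
        return {}
    max_freq = max(word_freq.values())
    return {word: count / max_freq for word, count in word_freq.items()}


def score_sentences(sentences, norm_freq):
    """Sum the normalized scores of the known words in each sentence."""
    scores = {}
    for sentence in sentences:
        words = re.findall(r"[a-z0-9']+", sentence.lower())
        # Words missing from the dictionary (stop words) contribute nothing.
        scores[sentence] = sum(norm_freq.get(w, 0.0) for w in words)
    return scores


# Hypothetical counts for a two-sentence article.
word_freq = {"markets": 2, "rose": 1, "analysts": 1, "strong": 1}
norm_freq = normalize(word_freq)
sentences = ["The markets rose.", "Analysts say markets are strong."]
scores = score_sentences(sentences, norm_freq)
```

With these counts, "markets" normalizes to 1.0 and every other word to 0.5, so the first sentence scores 1.5 and the second 2.0. Note that summing favors longer sentences, since they have more words to contribute.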

Creating a DataFrame for those sentences:

Using the DataFrame to construct the dictionary, then sorting the
sentences by score and selecting the length of the summary.
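
A rough sketch of the ranking and selection step follows. The report does not say exactly how the word count limits the summary, so the rule used here is an assumption: take sentences greedily from highest to lowest score while the running word total stays within the limit. A plain dictionary stands in for the pandas DataFrame, and `build_summary` is a hypothetical name.

```python
def build_summary(sentence_scores, max_words):
    """Pick the highest-scoring sentences that fit within max_words.

    sentence_scores maps each sentence to its score; dict insertion order
    is taken to be the order of the sentences in the article.
    """
    order = {s: i for i, s in enumerate(sentence_scores)}
    ranked = sorted(sentence_scores, key=sentence_scores.get, reverse=True)

    chosen, used = [], 0
    for sentence in ranked:
        length = len(sentence.split())
        if used + length <= max_words:
            chosen.append(sentence)
            used += length

    # Restore article order so the summary reads naturally (a design assumption).
    chosen.sort(key=order.get)
    return " ".join(chosen)


# Hypothetical scored sentences, five words each.
sentence_scores = {
    "The markets rose on Monday.": 1.5,
    "It was a sunny day.": 0.2,
    "Analysts say markets are strong.": 2.0,
}
summary = build_summary(sentence_scores, max_words=10)
```

With a limit of 10 words the two highest-scoring sentences fit and the low-scoring one is dropped.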

Building Frontend Streamlit:

Building FastApi:

Cross-Origin Resource Sharing (CORS) is an HTTP-header based mechanism that allows a
server to indicate any origins (domain, scheme, or port) other than its own from which a browser
should permit loading resources.
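
The mechanism can be illustrated with a small framework-free sketch. In a FastAPI application this is typically delegated to `CORSMiddleware`, added with `app.add_middleware(CORSMiddleware, allow_origins=[...])`. The function below only mimics the core decision, and the origins shown are hypothetical.

```python
def cors_headers(request_origin, allowed_origins):
    """Return the CORS response headers a server would add for this request.

    If the request's Origin is on the server's allow-list, the server echoes
    it back in Access-Control-Allow-Origin and the browser lets the page read
    the response. Otherwise no CORS header is sent and the browser blocks it.
    """
    if request_origin in allowed_origins:
        return {
            "Access-Control-Allow-Origin": request_origin,
            "Vary": "Origin",
        }
    return {}


# Hypothetical allow-list: a React dev server permitted to call the backend.
ALLOWED = {"http://localhost:3000"}
ok = cors_headers("http://localhost:3000", ALLOWED)
blocked = cors_headers("http://evil.example", ALLOWED)
```

This matters in this project because the React frontend and the FastAPI backend are served from different origins.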

Routes: This code defines a RESTful API for managing news data, including retrieval, creation,
update, and deletion. It leverages FastAPI’s routing capabilities to organize endpoints efficiently.

• get_news(): Retrieves a list of news items.
• get_By_Id(id: str): Retrieves news by its unique ID.
• get_By_Title(title: Optional[str] = None): Retrieves news by title.
• get_By_Url(url: str): Retrieves news data from a given URL.
• get_Summary(url: str, summary_len: int): Generates a summary for news data.
• post_news(news: News): Adds a new news item.

• put_news(id: str, news: News): Updates an existing news item.
• delete_news(id: str): Deletes a news item by ID.
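
The create, read, update, and delete handlers listed above might look roughly like the sketch below. It is framework-free: in the actual backend each function would be registered with a FastAPI decorator such as `@app.get`, `@app.post`, `@app.put`, or `@app.delete`. The `News` fields and the in-memory `_store` are assumptions, since the report does not show the real model or storage.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class News:
    # Hypothetical fields; the report does not list the real model's fields.
    id: str
    title: str
    url: str


_store: Dict[str, News] = {}  # in-memory stand-in for the real data store


def post_news(news: News) -> News:
    """Add a new news item, keyed by its ID."""
    _store[news.id] = news
    return news


def get_news() -> List[News]:
    """Return every stored news item."""
    return list(_store.values())


def get_By_Id(id: str) -> Optional[News]:
    """Return the item with this ID, or None if it does not exist."""
    return _store.get(id)


def put_news(id: str, news: News) -> Optional[News]:
    """Replace an existing item; return None if the ID is unknown."""
    if id not in _store:
        return None
    _store[id] = news
    return news


def delete_news(id: str) -> bool:
    """Delete the item and report whether anything was removed."""
    return _store.pop(id, None) is not None


post_news(News(id="1", title="Markets rise", url="https://example.com/a"))
put_news("1", News(id="1", title="Markets rise again", url="https://example.com/a"))
```

In a real API the `None` results would usually be turned into HTTP 404 responses.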

Frontend (React js):

1. Functional Component:
• The main component is defined as a functional component named App.
• It uses React hooks (useState) to manage state.

State Variables:
• isCopied: A boolean state variable to track whether the summary has been
copied to the clipboard.
• status: A boolean state variable to track whether the form has been
submitted.
• summaryData: An array state variable to store summary data fetched from
the API.
• newsQuery: An object state variable with properties url and wordcount.

2. Functions:
• fetchSummaryData: An asynchronous function that fetches summary data from
the API based on newsQuery.url and newsQuery.wordcount.
• handleInputChange: Updates the newsQuery state when input fields change.
• handleSubmit: Handles form submission, calls fetchSummaryData, and resets
the input fields.
• handleCopyToClipboard: Copies the summary text to the clipboard.

4. Final Results:

Docker Files:

A Dockerfile is a text document that contains instructions for building a Docker image. It serves
as a blueprint for constructing containerized applications.

Frontend:

Backend:

Compose.yaml:

Conclusion:
Through this project I learned how to divide a problem into different
components, i.e. backend and frontend, and to build each component one step
at a time. Another important thing I learned was how to use Docker and upload
a project to it. I also learned how prominent a role GitHub plays in
developing a project, as it can be used to update the project and can be
linked for deploying websites.

MISC:
GitHub link: https://github.com/CrimeMaster/Intel-Final-Project-News-Summary-App.git
News Summary Render deployment link: https://news-summary-r4ne.onrender.com
