Capstone - Project - 3 Report
Workforce
Gladwin Tirkey
News Summary in Shorts
The Intel® Digital Readiness Programs and Intel® AI for Future Workforce program are developed by
Intel Corporation. ©Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of
Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of
others. All rights reserved. Program dates and lesson plans are subject to change. Intel technologies
may require enabled hardware, software, or service activation. No product or component can be
absolutely secure. Results have been estimated or simulated. Intel does not control or audit third-party
data. You should consult other sources to evaluate accuracy. Your costs and results may vary.
1. Introduction
The application I built takes a URL from the user and lets the user select a
word count. It then scrapes the text from the given web page, breaks it into
tokens, and removes stop words and punctuation. Next, it counts the frequency
of every remaining word and divides each count by the highest word count.
These normalized scores are stored in a dictionary keyed by word. Each sentence
is then extracted from the text, its words are looked up in the dictionary, and
the matching scores are summed into a score for that sentence. The sentence
scores are used to rank the sentences and build a summary of the news article.
1.1 Problem Statement
The application is given the URL of a news web page. Using that URL, it
extracts the news text and returns a summary that fits the word count
chosen by the user.
2. Methodology
Traditional thinking vs systems thinking
We can apply the systems thinking process to our news summarization problem.
By breaking the system into components and devising relations between them,
we can come up with a systems map. The components of our system are as
follows:
BeautifulSoup (web scraping)
spaCy (tokenizing words)
Streamlit (building a quick frontend)
FastAPI (building the backend)
React.js (building the frontend)
Docker (building images that can be run in a virtual environment)
All these components are interconnected and strongly affect each other.
Here we present a systems diagram of the problem at hand:
2.2 Approach
First and foremost, the most important part of the project is the data that
needs to be collected from the news website. To collect that data we have to
learn how to use Beautiful Soup and how HTML pages work internally. We have
to learn how paragraphs are organized on a news website and then use
Beautiful Soup to extract that information.
Once the data has been extracted, we remove all the stop words and
punctuation that add no context to the crux of the news. The resulting
summary is then displayed in our frontend application.
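The stop-word and punctuation removal described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the report uses spaCy for tokenization, while this sketch uses plain whitespace splitting and a tiny hand-written stop-word list (both are assumptions), and the name clean_tokens is hypothetical.

```python
import string
from typing import List

# Tiny illustrative stop-word list (an assumption); spaCy ships a much larger one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "and", "to", "that", "it"}


def clean_tokens(text: str) -> List[str]:
    """Lowercase the text, split on whitespace, strip punctuation, drop stop words."""
    tokens = []
    for raw in text.lower().split():
        # Remove punctuation attached to the start or end of the token.
        word = raw.strip(string.punctuation)
        if word and word not in STOP_WORDS:
            tokens.append(word)
    return tokens
```

For example, clean_tokens("The cat sat on the mat, and it purred.") keeps only the words cat, sat, mat and purred.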
3. Implementation
The first step is importing the libraries. Here we use BeautifulSoup for
scraping, requests for sending HTTP requests, and pandas for data
manipulation.
Removing Tags from paragraph:
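A rough sketch of this step is given below. To stay dependency-free it uses Python's built-in html.parser as a stand-in for BeautifulSoup, so it is not the project's actual code; with BeautifulSoup the same idea is usually expressed with soup.find_all("p") and get_text() on each paragraph. The names ParagraphExtractor and extract_paragraphs are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class ParagraphExtractor(HTMLParser):
    """Collect the text inside each <p> element, dropping any nested tags."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs: List[str] = []  # finished paragraph strings
        self._in_p = False               # True while inside a <p>
        self._buffer: List[str] = []     # text pieces of the current paragraph

    def _flush(self) -> None:
        # Join the pieces and collapse runs of whitespace into single spaces.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.paragraphs.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()  # an unclosed <p> ends when the next one starts
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def extract_paragraphs(html: str) -> List[str]:
    """Return the plain text of every paragraph in the given HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs
```

Text outside paragraph elements, such as scripts or navigation links, is ignored, which is usually what is wanted for the body of a news article.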
The maximum frequency, max_freq, is used to normalize the word counts, and
the normalized values are then used to score the sentences that contain
those words.
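The normalization and sentence scoring can be illustrated with the sketch below. It is written under assumptions: a regular expression replaces spaCy's tokenizer and sentence splitter, the stop-word list is a small placeholder, and the function names word_scores and score_sentences are hypothetical. Only the idea of dividing each count by max_freq and summing the word scores per sentence comes from the report.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# Small placeholder stop-word list (an assumption).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "on", "and", "to"}
WORD_RE = re.compile(r"[a-z']+")


def word_scores(text: str) -> Dict[str, float]:
    """Map each non-stop word to its count divided by the highest count."""
    words = [w for w in WORD_RE.findall(text.lower()) if w not in STOP_WORDS]
    counts = Counter(words)
    if not counts:
        return {}
    max_freq = max(counts.values())
    return {word: count / max_freq for word, count in counts.items()}


def score_sentences(text: str, scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (sentence, score) pairs in document order.

    A sentence's score is the sum of the normalized scores of its words;
    words missing from the dictionary (stop words) contribute nothing.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [
        (s, sum(scores.get(w, 0.0) for w in WORD_RE.findall(s.lower())))
        for s in sentences
    ]
```

Because each sentence score is a plain sum, longer sentences tend to score higher; whether the project corrects for sentence length is not stated in the report.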
Creating a DataFrame for those sentences:
The DataFrame is used to construct the dictionary. The sentences are then
sorted by score and selected to fit the chosen summary length.
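One possible way to do the ranking and length selection is sketched below. It uses a plain list of (sentence, score) pairs instead of a pandas DataFrame, and the rule for fitting the word count (skip any sentence that would exceed the limit, then restore document order) is an assumption; the report does not say how the project handles this. The name build_summary is hypothetical.

```python
from typing import List, Tuple


def build_summary(scored: List[Tuple[str, float]], word_limit: int) -> str:
    """Pick the highest-scoring sentences that fit within word_limit words.

    `scored` holds (sentence, score) pairs in document order. The selected
    sentences are returned in their original order so the summary stays readable.
    """
    # Rank sentence indices by score, highest first; sorted() is stable,
    # so sentences with equal scores keep their document order.
    ranked = sorted(range(len(scored)), key=lambda i: scored[i][1], reverse=True)

    chosen = []
    used = 0
    for i in ranked:
        length = len(scored[i][0].split())
        if used + length > word_limit:
            continue  # this sentence would exceed the limit, try a shorter one
        chosen.append(i)
        used += length

    return " ".join(scored[i][0] for i in sorted(chosen))
```

With a limit of 5 words and the pairs ("Birds sing.", 1.0), ("Cats chase mice.", 2.5), ("Dogs chase cats today.", 2.0), the result is "Birds sing. Cats chase mice.": the second-ranked sentence is skipped because it would push the summary to 7 words.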
Building the Streamlit frontend:
Building the FastAPI backend:
Routes: This code defines a RESTful API for managing news data, including retrieval, creation,
update, and deletion. It leverages FastAPI’s routing capabilities to organize endpoints efficiently.
put_news(id: str, news: News): Updates an existing news item.
delete_news(id: str): Deletes a news item by ID.
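The logic behind these handlers might look like the framework-free sketch below. It is only an illustration: the fields of News, the in-memory NEWS_STORE dictionary, and the create_news and get_news helpers are assumptions, and the FastAPI wiring is left out. In a FastAPI application, functions like these would typically be registered as route handlers, for example on an APIRouter.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class News:
    # Hypothetical fields; the report does not list the real model's fields.
    url: str
    summary: str


# In-memory stand-in for whatever storage the project uses (an assumption).
NEWS_STORE: Dict[str, News] = {}


def create_news(id: str, news: News) -> dict:
    """Store a new news item under the given ID (hypothetical helper)."""
    NEWS_STORE[id] = news
    return asdict(news)


def get_news(id: str) -> Optional[News]:
    """Return the news item with this ID, or None if it does not exist."""
    return NEWS_STORE.get(id)


def put_news(id: str, news: News) -> dict:
    """Update an existing news item; raise KeyError if the ID is unknown."""
    if id not in NEWS_STORE:
        raise KeyError(id)
    NEWS_STORE[id] = news
    return asdict(news)


def delete_news(id: str) -> bool:
    """Delete a news item by ID; return True if something was removed."""
    return NEWS_STORE.pop(id, None) is not None
```

In an HTTP API, the KeyError in put_news and the False result from delete_news would normally be translated into a 404 response.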
1. Functional Component:
The main component is defined as a functional component named App.
It uses React hooks (useState) to manage state.
State Variables:
2. Functions:
fetchSummaryData: An asynchronous function that fetches summary data from
the API based on newsQuery.url and newsQuery.wordcount.
4. Final Results:
Docker Files:
A Dockerfile is a text document that contains instructions for building a Docker image. It serves
as a blueprint for constructing containerized applications.
Frontend:
Backend:
Compose.yaml:
Conclusion:
Through this project I learned how to divide a problem into different
components, i.e. backend and frontend, and build each component one step at
a time. I also learned how to use Docker and upload a project to it. Finally,
I learned how prominent a role GitHub plays in developing a project, as it
can be used to update the project and can be linked for deploying websites.
MISC:
GitHub link: https://github.com/CrimeMaster/Intel-Final-Project-News-Summary-App.git
News Summary deployed on Render: https://news-summary-r4ne.onrender.com