Pankaj Python
I hereby certify that the training in Python undergone at Internshala by
"PANKAJ KUMAR", in partial fulfillment of the requirements for the award of the
degree of B.Tech. (ELECTRICAL AND ELECTRONICS ENGG.), submitted in the
Department of ELECTRICAL AND ELECTRONICS ENGG. at SRM
UNIVERSITY, SONEPAT, HARYANA, is an authentic record of my own
work carried out during the period from JULY 2019 to NOVEMBER 2019
under the supervision of Mr. Mukesh Singh. The matter presented in this
project has not been submitted by me to any other University / Institute for
the award of the B.Tech. degree.
Pankaj Kumar
This is to certify that the above statement made by the student is correct to the best
of our knowledge and belief.
Signature of Supervisors
CERTIFICATE
This is to certify that this report on the PYTHON language by Pankaj Kumar
(10516210004), submitted in partial fulfillment of the requirements for the
degree of Bachelor of Technology in Electrical and Electronics Engineering of
SRM University Delhi-NCR, Sonepat, during the academic year 2019-20, is a
bonafide record of work carried out under our guidance and supervision.
The results embodied in this report have not been submitted to any other University
or Institution for the award of any degree.
1. INTRODUCTION
2. HISTORY
3. Web Scraping
4. Uses of Web Scraping
5. Benefits of Scraping
6. Using Requests
7. Python Code
8. Beautiful Soup
9. Conclusion and Beautiful Soup
10. Conclusion and Future Scope
11. Reference
PYTHON
The language's core philosophy is summarized in the document The Zen of Python
(PEP 20), which includes aphorisms such as:
Beautiful is better than ugly
Explicit is better than implicit
Simple is better than complex
Complex is better than complicated
Readability counts
Users and admirers of Python, especially those considered knowledgeable or
experienced, are often referred to as Pythonists, Pythonistas, and Pythoneers.
Python's large standard library, commonly cited as one of its greatest strengths,
provides tools suited to many tasks. For Internet-facing applications, many
standard formats and protocols such as MIME and HTTP are supported. It includes
modules for creating graphical user interfaces, connecting to relational databases,
generating pseudorandom numbers, arithmetic with arbitrary precision decimals,
manipulating regular expressions, and unit testing.
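A few of the standard-library modules named above can be sketched in a handful of lines. This is only an illustration: the sample sentence, the seed value, and the variable names are assumptions chosen for the example, not part of the original training material.

```python
import random
import re
from decimal import Decimal, getcontext

# Arbitrary-precision decimal arithmetic: exact where binary floats are not.
getcontext().prec = 28
exact_sum = Decimal("0.1") + Decimal("0.2")
float_sum = 0.1 + 0.2  # binary floating point cannot represent 0.1 exactly

# Regular expressions: pull the four-digit years out of a sentence.
sentence = "Python 2.0 came out in 2000, Python 3.0 in 2008."
years = re.findall(r"\b\d{4}\b", sentence)

# Pseudorandom numbers: a seeded generator gives a reproducible sequence.
rng = random.Random(42)
roll = rng.randint(1, 6)  # always between 1 and 6 inclusive

print(exact_sum, float_sum == 0.3, years, roll)
```

None of these modules has to be installed separately; they ship with every standard Python distribution.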
Some parts of the standard library are covered by specifications (for example, the
Web Server Gateway Interface (WSGI) implementation wsgiref follows PEP 333),
but most modules are not. They are specified by their code, internal
documentation, and test suites (if supplied). However, because most of the
standard library is cross-platform Python code, only a few modules need
altering or rewriting for variant implementations.
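As a small illustration of the WSGI interface mentioned above, the sketch below defines a WSGI application and calls it once with a synthetic request built by wsgiref. The names hello_app and call_app and the response text are hypothetical, chosen only for this example.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A minimal WSGI application (hypothetical name, for illustration)."""
    body = b"Hello from WSGI\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app):
    """Invoke a WSGI app once with a synthetic request; no network is used."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the environ keys WSGI requires
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_app(hello_app)
print(status, body)
```

To serve the same application over HTTP, it could be passed to wsgiref.simple_server.make_server("", 8000, hello_app) and started with serve_forever(); the port number here is an arbitrary choice.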
As of March 2018, the Python Package Index (PyPI), the official repository for
third-party Python software, contained over 130,000 packages with a wide range of
functionality, including:
Graphical user interfaces
Web frameworks
Multimedia
Databases
Networking
Test frameworks
Automation
Web scraping
Documentation
System administration
Scientific computing
Text processing
Image processing
Python's name is derived from the British comedy group Monty Python, whom
Python creator Guido van Rossum enjoyed while developing the language.
Since 2003, Python has consistently ranked in the top ten most popular
programming languages in the TIOBE Programming Community Index where, as
of January 2018, it was the fourth most popular language (behind Java, C, and C++).
It was selected Programming Language of the Year in 2007 and 2010.
History of Python
Guido van Rossum published the first version of the Python code (version 0.9.0) at
alt.sources in February 1991. This release already included exception handling,
functions, and the core data types list, dict, str and others. It was also
object-oriented and had a module system. Python version 1.0 was released in January
1994. The major new features in this release were the functional
programming tools lambda, map, filter and reduce, which Guido van Rossum
never liked. Six and a half years later, in October 2000, Python 2.0 was introduced.
This release included list comprehensions, a full garbage collector, and support
for Unicode. Python flourished for another eight years in the 2.x versions
before the next major release, Python 3.0 (also known as "Python 3000" and
"Py3K"), was released. Python 3 is not backwards compatible with Python 2.x. The
emphasis in Python 3 was on the removal of duplicate programming
constructs and modules, thus fulfilling or coming close to fulfilling the 13th law of
the Zen of Python: "There should be one -- and preferably only one -- obvious way
to do it."
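The two styles mentioned in this history, the functional tools from Python 1.0 and the list comprehensions from Python 2.0, can be compared side by side. This is an illustrative sketch; the sample list and variable names are assumptions made for the example.

```python
# reduce was a builtin in Python 2; Python 3 moved it into functools.
from functools import reduce

numbers = [1, 2, 3, 4, 5, 6]

# Functional style: lambda, map, filter and reduce.
even_squares_functional = list(
    map(lambda n: n * n, filter(lambda n: n % 2 == 0, numbers))
)
total_functional = reduce(lambda acc, n: acc + n, even_squares_functional, 0)

# List comprehension style, with the builtin sum for the total.
even_squares_comprehension = [n * n for n in numbers if n % 2 == 0]
total_builtin = sum(even_squares_comprehension)

print(even_squares_functional, even_squares_comprehension, total_builtin)
```

Both versions compute the same result. The comprehension is generally regarded as the more readable form in modern Python.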
WEB SCRAPING
Web scraping, web harvesting, or web data extraction is data scraping used for
extracting data from websites. Web scraping software may access the World Wide
Web directly using the Hypertext Transfer Protocol, or through a web browser.
While web scraping can be done manually by a software user, the term typically
refers to automated processes implemented using a bot or web crawler. It is a form
of copying, in which specific data is gathered and copied from the web, typically
into a central local database or spreadsheet, for later retrieval or analysis.
Scraping a web page involves fetching it and then extracting data from it.
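A minimal sketch of fetching a page and pulling data out of it, using only the standard library, might look like the following. The helper names fetch and extract_links, the LinkExtractor class, the User-Agent string, and the sample HTML are all assumptions made for illustration. A real scraper would more likely use the Requests and Beautiful Soup libraries listed in the contents.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


def fetch(url, timeout=10):
    """Download a page over HTTP and return its text (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkExtractor(HTMLParser):
    """Collect the href attribute of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return the list of link targets found in an HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    return parser.links


# Extraction is shown on an inline page, so no network access is needed here.
sample_html = (
    '<html><body><a href="/docs">Docs</a> '
    '<a href="https://example.com/">Home</a></body></html>'
)
links = extract_links(sample_html)
print(links)
```

For a live page, the two steps would be chained as extract_links(fetch(url)), subject to the site's terms of use and robots.txt.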