Web Scraping

Uploaded by

Srushti patil
5/28/23, 10:49 PM    webscrapping.ipynb - Colaboratory

!pip install app_store_scraper

Output (abridged; parts of the log are illegible in the scan):

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Downloading app_store_scraper-0.3.5-py3-none-any.whl (8.3 kB)
Downloading requests-2.23.0-py2.py3-none-any.whl (58 kB)
Downloading chardet-3.0.4-py2.py3-none-any.whl
Downloading urllib3-1.25.11-py2.py3-none-any.whl (127 kB)
...
Uninstalling urllib3-1.26.15: ...
Attempting uninstall: requests ...
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. ...
pandas-profiling 3.2.0 requires requests>=2.24.0, but you have requests 2.23.0 which is incompatible.
google-colab 1.0.0 requires requests..., but you have requests 2.23.0 which is incompatible.
Successfully installed app_store_scraper-0.3.5 chardet-3.0.4 requests-2.23.0 urllib3-1.25.11 ...

# pandas and numpy are required by the DataFrame code below
import numpy as np
import pandas as pd
from app_store_scraper import AppStore

# Fetch reviews of the Tinder app from the US App Store
slack = AppStore(country='us', app_name='tinder', app_id='547702041')
slack.review(how_many=202)

# Each review is a dict; expand the dicts into one column per field
slackdf = pd.DataFrame(np.array(slack.reviews), columns=['review'])
slackdf2 = slackdf.join(pd.DataFrame(slackdf.pop('review').tolist()))
slackdf2.head()

[Output: the first five rows of the reviews table. The cell contents are not legible in the scan.]

slackdf2.to_csv('Slack-app-reviews.csv')

Web Scraper Using Beautiful Soup from Python

import requests
import pandas as pd
from bs4 import BeautifulSoup  # required by the parsing loop below
from urllib.parse import urlencode

# Creating a list of URLs that will be scraped.
# The four URLs are Amazon.in customer-review pages; the review IDs and
# query strings are illegible in the scan and are left elided.
list_of_urls = ['https://www.amazon.in/gp/customer-reviews/…',
                'https://www.amazon.in/gp/customer-reviews/…',
                'https://www.amazon.in/gp/customer-reviews/…',
                'https://www.amazon.in/gp/customer-reviews/…']

# Retrieve each of the URLs' HTML data and convert the data into a Beautiful Soup object.
names = []
reviews = []
data_string = ""
for url in list_of_urls:
    # The API key is illegible in the scan; substitute your own.
    params = {'api_key': 'YOUR_API_KEY', 'url': url}
    response = requests.get('http://api.scraperapi.com/', params=urlencode(params))
    soup = BeautifulSoup(response.text, 'html.parser')

    # Reviewer names
    for item in soup.find_all("span", class_="a-profile-name"):
        data_string = data_string + item.get_text()
        names.append(data_string)
        data_string = ""

    # Review bodies
    for item in soup.find_all("span", {"data-hook": "review-body"}):
        data_string = data_string + item.get_text()
        reviews.append(data_string)
        data_string = ""

reviews_dict = {'Reviewer Name': names, 'Reviews': reviews}

# Print the lengths of each list
print(len(names), len(reviews))

# Create a new dataframe.
df = pd.DataFrame.from_dict(reviews_dict, orient='index')
df.head()

[Output: a two-row table (Reviewer Name, Reviews) with columns 0 to 5. The trailing cells of the Reviews row are None because more names than reviews were collected.]

# Delete all the columns that have missing values.
df.dropna(axis=1, inplace=True)
df.head()

                           0                     1                     2                     3
Reviewer Name         Achint                Achint            Abhishek M            Abhishek M
Reviews        HONEST REVIEW :-I got these ...   The hinge is kinda loose ...   Best thing is that , It's bass ...   Sound quality to much great ...

# Transpose the dataframe
prod_reviews = df.T
print(prod_reviews.head(4))

  Reviewer Name                                            Reviews
0        Achint  \nHONEST REVIEW :-I got these buds in ...
1        Achint  \nThe hinge is kinda loose other than its a pr...
2    Abhishek M  \nBest thing is that , It's bass is better tha...
3    Abhishek M  \nSound quality to much great and so so much s...

# Remove special characters from review text
prod_reviews['Reviews'] = prod_reviews['Reviews'].str.replace('\n', '')
prod_reviews.head(4)

  Reviewer Name                                            Reviews
0        Achint  HONEST REVIEW :-I got these buds in ...
1        Achint  The hinge is kinda loose other than its a pr...
2    Abhishek M  Best thing is that , It's bass is better than...
3    Abhishek M  Sound quality to much great and so so much sat...

# Convert dataframe to CSV file.
prod_reviews.to_csv('reviews.csv', index=False, header=True)
df = pd.read_csv('reviews.csv')
print(df)

  Reviewer Name                                            Reviews
0        Achint  HONEST REVIEW :-I got these buds in ...
1        Achint  The hinge is kinda loose other than its a pret...
2    Abhishek M  Best thing is that , It's bass is better than ...
3    Abhishek M  Sound quality to much great and so so much sat...
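
The notebook sends each page request through a scraping proxy by passing the target page as a `url` query parameter, encoded with `urlencode`. The sketch below shows only what that encoding step produces, using the standard library and no network call. The helper name `build_request_url`, the placeholder key `YOUR_API_KEY`, and the example.com target are assumptions for illustration, not values from the notebook.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Proxy endpoint as it appears in the notebook.
API_ENDPOINT = "http://api.scraperapi.com/"


def build_request_url(api_key: str, target_url: str) -> str:
    """Return the proxy URL with the target page percent-encoded in the query.

    Hypothetical helper: the notebook passes the encoded string to
    requests.get(..., params=...) instead of building the URL by hand.
    """
    query = urlencode({"api_key": api_key, "url": target_url})
    return f"{API_ENDPOINT}?{query}"


# Illustrative target with its own query string (not a URL from the notebook).
target = "https://www.example.com/reviews?page=2&sort=recent"
request_url = build_request_url("YOUR_API_KEY", target)
print(request_url)

# Decoding the query recovers the original target unchanged, which shows that
# the target's own '?' and '&' were escaped instead of being read as extra
# parameters of the proxy request.
round_trip = parse_qs(urlsplit(request_url).query)
print(round_trip["url"][0] == target)
```

Encoding matters because the target URL contains its own `?` and `&` characters. With requests, passing the dictionary directly as `params=params` should give the same result, since requests encodes dictionaries itself.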

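
The notebook builds its table with `from_dict(..., orient='index')`, drops columns that contain missing values, and then transposes. That sequence is a way to cope with the name and review lists having different lengths. The sketch below replays it on small made-up lists so each intermediate shape is visible. The sample names and review strings are invented for illustration.

```python
import pandas as pd

# Made-up data: one more name than reviews, as can happen when a page
# repeats a reviewer name outside the review list.
names = ["Reviewer A", "Reviewer B", "Reviewer C"]
reviews = ["\nGood", "\nBad"]

# orient="index" makes each dict key a row, so unequal list lengths are
# allowed; the shorter row is padded with missing values.
df = pd.DataFrame.from_dict(
    {"Reviewer Name": names, "Reviews": reviews}, orient="index"
)
print(df.shape)  # 2 rows (the keys) by 3 columns (the longest list)

# Drop every column that has a missing value, i.e. names without a review.
df = df.dropna(axis=1)

# Transpose so each row is one (name, review) pair.
prod_reviews = df.T

# Strip the newline characters; regex=False treats the pattern literally.
prod_reviews["Reviews"] = prod_reviews["Reviews"].str.replace(
    "\n", "", regex=False
)
print(prod_reviews)
```

The pairing is purely positional: the i-th name is matched with the i-th review. If the extra names appear before the reviews on the page rather than after them, the rows will be misaligned, so it is worth comparing the two list lengths before trusting the table.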