5/28/23, 10:49 PM    webscrapping.ipynb - Colaboratory
!pip install app_store_scraper
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
  Downloading app_store_scraper-0.3.5-py3-none-any.whl (8.3 kB)
  Downloading requests-2.23.0-py2.py3-none-any.whl (58 kB)
     58.4/58.4 kB 2.4 MB/s eta 0:00:00
  Downloading chardet-3.0.4-py2.py3-none-any.whl (133 kB)
     ... 5.3 MB/s eta 0:00:00
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.9/dist-packages (from requests==2.23.0->app_store_scraper) (2.10)
  Downloading urllib3-1.25.11-py2.py3-none-any.whl (127 kB)
     128.0/128.0 kB 12.0 MB/s eta 0:00:00
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.9/dist-packages (from requests==2.23.0->app_store_scraper) ...
    Uninstalling urllib3-1.26.15:
      Successfully uninstalled urllib3-1.26.15
  Attempting uninstall: requests
      Successfully uninstalled requests-2.2...
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed.
pandas-profiling 3.2.0 requires requests>=2.24.0, but you have requests 2.23.0 which is incompatible.
google-colab 1.0.0 requires requests>=2.25.1, but you have requests 2.23.0 which is incompatible.
Successfully installed app_store_scraper-0.3.5 chardet-3.0.4 requests-2.23.0 urllib3-1.25.11
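import numpy as np
import pandas as pd
from app_store_scraper import AppStore

# Fetch reviews for the Tinder app from the US App Store.
slack = AppStore(country='us', app_name='tinder', app_id='547702041')
slack.review(how_many=202)

# Put each review dict in one column, then expand the dicts into separate columns.
slackdf = pd.DataFrame(np.array(slack.reviews), columns=['review'])
slackdf2 = slackdf.join(pd.DataFrame(slackdf.pop('review').tolist()))
slackdf2.head()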
(Output: the first rows of slackdf2. The table contents are illegible in the source.)
slackdf2.to_csv('Slack-app-reviews.csv')
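The column-expansion step above can be tried without calling the App Store. This is a minimal sketch on made-up data: the field names (userName, rating, title, review) and the sample values are assumptions for illustration, not output from the scraper. It repeats the notebook's pop-and-join pattern and compares it with passing the list of dicts straight to pd.DataFrame.

```python
import pandas as pd

# Hypothetical sample shaped like a list of review dicts.
# Field names and values are assumptions, not real scraper output.
sample_reviews = [
    {"userName": "user_a", "rating": 4, "title": "Nice", "review": "Works well."},
    {"userName": "user_b", "rating": 1, "title": "Bad", "review": "Keeps crashing."},
]

# Same pattern as the notebook: one object column holding dicts,
# then pop that column and join the expanded dict keys back on the index.
wrapped = pd.DataFrame({"review": sample_reviews})
expanded = wrapped.join(pd.DataFrame(wrapped.pop("review").tolist()))

# Shorter alternative: pandas builds one column per dict key directly.
direct = pd.DataFrame(sample_reviews)

print(expanded.columns.tolist())
print(direct.columns.tolist())
```

Both frames end up with one column per dict key, so the intermediate single-column frame is optional when the input is already a list of dicts.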
Web Scraper Using Beautiful Soup in Python
import requests
import pandas as pd
from urllib.parse import urlencode
from bs4 import BeautifulSoup
# Creating a list of URLs that will be scraped.
# Four Amazon.in customer-review URLs; the review IDs and ASIN are illegible in the source.
list_of_urls = [
    'https://www.amazon.in/gp/customer-reviews/…/ref=cm_cr_…?ie=UTF8&ASIN=…',
    'https://www.amazon.in/gp/customer-reviews/…/ref=cm_cr_…?ie=UTF8&ASIN=…',
    'https://www.amazon.in/gp/customer-reviews/…/ref=cm_cr_…?ie=UTF8&ASIN=…',
    'https://www.amazon.in/gp/customer-reviews/…/ref=cm_cr_…?ie=UTF8&ASIN=…',
]
# Retrieve each URL's HTML data and convert it into a Beautiful Soup object.
names = []
reviews = []
data_string = ""
for url in list_of_urls:
    # The API key is illegible in the source; substitute your own ScraperAPI key.
    params = {'api_key': '<your ScraperAPI key>', 'url': url}
    response = requests.get('http://api.scraperapi.com/', params=urlencode(params))
    soup = BeautifulSoup(response.text, 'html.parser')

    # Collect every reviewer name on the page.
    for item in soup.find_all("span", class_="a-profile-name"):
        data_string = data_string + item.get_text()
        names.append(data_string)
        data_string = ""

    # Collect every review body on the page.
    for item in soup.find_all("span", {"data-hook": "review-body"}):
        data_string = data_string + item.get_text()
        reviews.append(data_string)
        data_string = ""

reviews_dict = {'Reviewer Name': names, 'Reviews': reviews}

# Print the lengths of each list.
print(len(names), len(reviews))
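The two building blocks of the loop above, encoding the request parameters and selecting spans by class or by data-hook attribute, can be checked offline. This is a minimal sketch under stated assumptions: the HTML snippet, the DEMO_KEY value and the example.com URL are invented for illustration and are not taken from Amazon or ScraperAPI.

```python
from urllib.parse import urlencode
from bs4 import BeautifulSoup

# Hypothetical stand-in for a fetched review page (markup is an assumption).
sample_html = """
<div>
  <span class="a-profile-name">Reviewer One</span>
  <span data-hook="review-body">
Great sound.</span>
</div>
"""

# urlencode percent-encodes the target URL so it can travel as one query parameter.
query = urlencode({"api_key": "DEMO_KEY", "url": "https://example.com/review?id=1"})
print(query)

soup = BeautifulSoup(sample_html, "html.parser")

# Select by CSS class (class_ avoids clashing with the Python keyword).
names = [tag.get_text() for tag in soup.find_all("span", class_="a-profile-name")]

# Select by an arbitrary attribute using an attribute dict.
reviews = [tag.get_text() for tag in soup.find_all("span", {"data-hook": "review-body"})]

print(names)
print(reviews)
```

The review text keeps its leading newline here, which is the same kind of whitespace the notebook strips later with str.replace.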
# Create a new dataframe.
df = pd.DataFrame.from_dict(reviews_dict, orient='index')
df.head()
(Output: a table with rows "Reviewer Name" and "Reviews" and columns 0 to 5. The last two "Reviews" cells are None; the remaining cells are illegible in the source.)
# Delete all the columns that have missing values.
df.dropna(axis=1, inplace=True)
df.head()
               0                       1                       2                       3
Reviewer Name  Achint                  Achint                  Abhishek M              Abhishek M
Reviews        HONEST REVIEW :-I g...  The hinge is kinda ...  Best thing is that ...  Sound quality to mu...
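The from_dict and dropna steps rely on pandas padding the shorter list with missing values. This is a minimal sketch on invented lists (six placeholder names, four placeholder reviews) showing how the shape changes at each step; the values are assumptions, not the scraped data.

```python
import pandas as pd

# Hypothetical lists of unequal length, mimicking more names than reviews.
names = ["A", "A", "B", "B", "C", "D"]
reviews = ["r1", "r2", "r3", "r4"]

# orient="index" makes each dict key a row; the shorter row is padded with missing values.
df = pd.DataFrame.from_dict({"Reviewer Name": names, "Reviews": reviews}, orient="index")
print(df.shape)

# axis=1 drops every column that contains a missing value.
trimmed = df.dropna(axis=1)
print(trimmed.shape)

# Transposing turns the two rows into two columns, one record per row.
prod_reviews = trimmed.T
print(prod_reviews.columns.tolist())
```

Note that this pairs names and reviews purely by position. If a page yields extra profile names that do not belong to a review, the pairing can be off, so it is worth spot-checking a few rows against the page.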
# Transpose the dataframe.
prod_reviews = df.T
print(prod_reviews.head(4))
  Reviewer Name                                            Reviews
0        Achint  \nHONEST REVIEW :-I got these buds in ₹695/- a...
1        Achint  \nThe hinge is kinda loose other than its a pr...
2    Abhishek M  \nBest thing is that , it's bass is better tha...
3    Abhishek M  \nSound quality to much great and so so much s...
# Remove special characters from review text.
prod_reviews["Reviews"] = prod_reviews["Reviews"].str.replace("\n", "")
prod_reviews.head(4)
  Reviewer Name                                            Reviews
0        Achint  HONEST REVIEW :-I got these buds in ₹695/- and...
1        Achint  The hinge is kinda loose other than its a pret...
2    Abhishek M  Best thing is that , it's bass is better than ...
3    Abhishek M  Sound quality to much great and so so much sat...
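The newline cleanup can be compared with a plain strip on a small Series. This is a minimal sketch on invented strings (not the scraped reviews): str.replace with regex=False removes every newline literally, while str.strip only trims whitespace at the ends.

```python
import pandas as pd

# Hypothetical review strings with leading and embedded newlines.
s = pd.Series(["\nHONEST REVIEW", "\nThe hinge\n is loose "])

# Literal replacement: removes every newline, wherever it occurs.
no_newlines = s.str.replace("\n", "", regex=False)

# Strip: trims leading and trailing whitespace only, embedded newlines stay.
stripped = s.str.strip()

print(no_newlines.tolist())
print(stripped.tolist())
```

Which one fits depends on whether newlines inside a review should be kept as paragraph breaks or flattened away.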
# Convert dataframe to CSV file.
prod_reviews.to_csv("reviews.csv", index=False, header=True)
df = pd.read_csv('reviews.csv')
print(df)
  Reviewer Name                                            Reviews
0        Achint  HONEST REVIEW :-I got these buds in ₹695/- and...
1        Achint  The hinge is kinda loose other than its a pret...
2    Abhishek M  Best thing is that , it's bass is better than ...
3    Abhishek M  Sound quality to much great and so so much sat...
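The CSV round trip can be exercised in memory. This is a minimal sketch using io.StringIO in place of reviews.csv, with invented rows; it shows why index=False matters when the file is read back.

```python
import io
import pandas as pd

# Hypothetical two-row frame standing in for prod_reviews.
frame = pd.DataFrame({"Reviewer Name": ["A", "B"], "Reviews": ["good", "bad"]})

# Write without the index, then read the same text back.
buffer = io.StringIO()
frame.to_csv(buffer, index=False, header=True)
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored.columns.tolist())

# Writing with the index adds an unnamed leading column on read-back.
with_index = io.StringIO()
frame.to_csv(with_index, index=True)
with_index.seek(0)
print(pd.read_csv(with_index).columns.tolist())
```

With index=False the restored frame has only the two data columns, so the file can be re-read without an extra cleanup step.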