0€
07
08 chi>FiRHRBIe/mI>
09 2018 ASR AAR ERIEDI
declarationN )
10
1 pa RNAUI
2 <a id="detail" href="/detail/info.html">
a
16
17 FRAME :2019.1.3
18 —#Fiic/p>
18
20
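2.4.1 Element Selectors
The most common CSS selector is the element selector: the elements of the document themselves act as the most basic selectors, for example p, h1, div, and a, as shown in Table 2-9.
Table 2-9 CSS element selector
Selector | Example | Description
element | p | Selects all <p> elements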
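2.4.2 Class Selectors
A class selector matches elements by the value of their class attribute rather than by element type. The class name is written after a period (.), as shown in Table 2-10.
Table 2-10 CSS class selectors
Selector | Example | Description
.class | .info | Selects all elements with class="info"
element.class | div.info | Selects all <div> elements with class="info"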
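2.4.3 ID Selectors
ID selectors are similar to class selectors, but an ID selector is prefixed with "#". An ID is unique within an HTML document: unlike a class, it should appear on only one element. See Table 2-11.
Table 2-11 CSS ID selectors
Selector | Example | Description
#id | #detail | Selects the element with id="detail"
element#id | a#detail | Selects the <a> element with id="detail"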
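2.4.4 Attribute Selectors
Elements can also be selected by attributes other than class and id, as shown in Table 2-12.
Table 2-12 CSS attribute selectors
Selector | Example | Description
[attribute] | [target] | Selects all elements that have a target attribute
[attribute=value] | [target=_blank] | Selects all elements whose target attribute equals _blank
[attribute~=value] | … | Selects all elements whose attribute contains the given word in a space-separated list
[attribute|=value] | … | Selects all elements whose attribute value is the given value or starts with it followed by a hyphen
element[attribute] | a[target] | Selects all <a> elements that have a target attribute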
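2.4.5 Descendant Selectors
A descendant selector (Descendant Selector) selects elements that are descendants of another element, at any depth, as shown in Table 2-13.
Table 2-13 CSS descendant selectors
Selector | Example | Description
element element | div p | Selects all <p> elements inside <div> elements
element.class element | div.info img | Selects all <img> elements inside <div> elements with class="info"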
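2.4.6 Child Selectors
Compared with the descendant selector, the child selector (Child Selector) narrows the range: it selects only elements that are direct children of another element, as shown in Table 2-14.
Table 2-14 CSS child selector
Selector | Example | Description
element > element | div > p | Selects all <p> elements whose parent is a <div> element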
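2.4.7 Adjacent Sibling Selectors
To select an element that immediately follows another element, where both share the same parent, use the adjacent sibling selector (Adjacent Sibling Selector), as shown in Table 2-15.
Table 2-15 CSS adjacent sibling selector
Selector | Example | Description
element + element | div + p | Selects every <p> element placed immediately after a <div> element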
The selectors above can be combined into a single expression, as shown in Table 2-16.
Table 2-16 Combining CSS selectors
Selector | Description
html > body div.info + p | Selects the <p> element that immediately follows a <div> with class="info", where that <div> is contained in body and body is a child of html
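The standard library has no CSS selector engine, so the sketch below expresses a few of the selectors above in the limited XPath that `xml.etree.ElementTree` understands, and handles the adjacent sibling case by hand. The markup, the `adjacent_siblings` helper, and the XPath translations are illustrative assumptions, not the book's method; a real crawler would more likely use a dedicated selector library.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup used only for this illustration.
DOC = """<html><body>
<div class="info"><p>first</p><span><p>nested</p></span></div>
<p>sibling</p>
<a id="detail" href="/detail/info.html" target="_blank">more</a>
</body></html>"""

root = ET.fromstring(DOC)

# "div p" (descendant): every <p> anywhere inside a <div>
descendants = [p.text for p in root.findall(".//div//p")]

# "div > p" (child): only <p> elements whose parent is a <div>
children = [p.text for p in root.findall(".//div/p")]

# "#detail" and "[target]": match on an attribute value or its presence
by_id = root.find(".//*[@id='detail']")
with_target = root.findall(".//*[@target]")

# "div.info": exact match on the class attribute (a real CSS engine
# would also match elements that carry several classes)
info_divs = root.findall(".//div[@class='info']")


def adjacent_siblings(tree, first, second):
    """'first + second': <second> elements placed right after a <first>."""
    found = []
    for parent in tree.iter():
        kids = list(parent)
        for left, right in zip(kids, kids[1:]):
            if left.tag == first and right.tag == second:
                found.append(right)
    return found


siblings = [p.text for p in adjacent_siblings(root, "div", "p")]

print(descendants)
print(children)
print(by_id.get("href"))
print(siblings)
```

The descendant query returns both paragraphs inside the div, while the child query returns only the one that is a direct child, which is the difference between Tables 2-13 and 2-14.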
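2.5 Regular Expressions
Besides XPath and CSS selectors, regular expressions are another common way to extract content from web pages. A regular expression describes a string-matching pattern: it can check whether a string contains a given substring, replace the matched substring, or extract the substrings that satisfy some condition.
2.5.1 Basic Syntax
A regular expression is built from ordinary characters and metacharacters. The basic metacharacters are listed in Table 2-17.
Table 2-17 Basic syntax
Syntax | Description | Example pattern | Matches
. | Matches any single character except the newline "\n" | a.c | abc
\ | Escape character: the character that follows loses its special meaning and is matched literally | a\.c | a.c
[...] | Character set: matches any one character in the set. Characters can be listed one by one ([abc]) or given as a range ([a-c]); a leading ^ negates the set | a[bcd]e | abe, ace, ade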
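The three constructs in Table 2-17 can be checked directly with Python's `re` module. This is a minimal sketch; the sample strings are made up for illustration.

```python
import re

# "." matches any single character except a newline
dot_hits = [s for s in ["abc", "a-c", "a\nc"] if re.fullmatch(r"a.c", s)]

# "\" escapes a metacharacter so that it is taken literally
literal_dot = re.fullmatch(r"a\.c", "a.c") is not None
escaped_rejects_abc = re.fullmatch(r"a\.c", "abc") is None

# "[...]" matches one character from the set; a leading "^" negates it
in_set = [s for s in ["abe", "ace", "ade", "afe"]
          if re.fullmatch(r"a[bcd]e", s)]
negated = [s for s in ["abe", "afe"] if re.fullmatch(r"a[^bcd]e", s)]

print(dot_hits)
print(literal_dot, escaped_rejects_abc)
print(in_set)
print(negated)
```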
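2.5.2 Special Characters
Some characters and escape sequences stand for a position in the string or for a whole class of characters, which saves listing every character explicitly. They are shown in Table 2-18.
Table 2-18 Special characters
Syntax | Description | Example pattern | Matches
^ | Matches the start of the string | ^abc | abcd
$ | Matches the end of the string | abc$ | 123abc
\b | Matches a word boundary | \babc | abc
\d | Matches a digit, 0-9 | a\dc | a1c
\D | Matches any character that is not a digit | a\Dc | abc
\s | Matches a whitespace character (space, tab, newline, and so on) | a\sc | a c
\S | Matches any character that is not whitespace | a\Sc | abc
\w | Matches a word character, [A-Za-z0-9_] | a\wc | abc, a1c
\W | Matches any character that is not a word character | a\Wc | a c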
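As a quick check of the shorthand classes `\d`, `\s`, `\w`, and `\W`, the sketch below runs them over one made-up sentence. Note that for Python 3 `str` patterns `\w` and `\d` also match non-ASCII letters and digits; on plain ASCII text they behave as described above.

```python
import re

# Made-up sample text for illustration
text = "Order 66 shipped to Unit_7 on 2018-12-12"

digits = re.findall(r"\d+", text)     # runs of digits
words = re.findall(r"\w+", text)      # runs of word characters
spaces = re.findall(r"\s", text)      # individual whitespace characters
non_words = re.findall(r"\W", text)   # spaces and hyphens here

print(digits)
print(words)
print(len(spaces), len(non_words))
```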
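2.5.3 Quantifiers
When part of a pattern has to repeat, a quantifier saves writing it out several times. A mobile phone number, for example, has 11 digits: instead of writing \d 11 times, write \d{11}. The quantifiers are listed in Table 2-19.
Table 2-19 Quantifiers
Syntax | Description | Example pattern | Matches
* | Matches the preceding character zero or more times | abc* | ab, abccc
+ | Matches the preceding character one or more times | abc+ | abc, abccc
? | Matches the preceding character zero times or once | abc? | ab, abc
{m} | Matches the preceding character exactly m times | ab{2}c | abbc
{m,} | Matches the preceding character at least m times | ab{2,}c | abbc, abbbc
{,n} | Matches the preceding character from 0 to n times | ab{,2}c | ac, abc, abbc
{m,n} | Matches the preceding character from m to n times | ab{1,2}c | abc, abbc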
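The quantifiers in Table 2-19 can be compared side by side by testing each pattern against the same candidates. This is a small sketch; the `matching` helper and the sample strings are illustrative names, not part of `re`.

```python
import re

samples = ["ab", "abc", "abcc", "abccc"]


def matching(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]


star = matching(r"abc*")         # zero or more "c"
plus = matching(r"abc+")         # one or more "c"
optional = matching(r"abc?")     # zero or one "c"
exact = matching(r"abc{2}")      # exactly two "c"
ranged = matching(r"abc{1,2}")   # one or two "c"
phone = re.fullmatch(r"\d{11}", "13800000000") is not None

for name, hits in [("*", star), ("+", plus), ("?", optional),
                   ("{2}", exact), ("{1,2}", ranged)]:
    print(name, hits)
print(phone)
```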
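2.5.4 Alternation and Grouping
Real data often comes in several formats, so a pattern has to accept alternatives and pick out parts of the match. A date, for example, may be written as 2018-10-10, 2018/10/10, or 2018.10.10. The syntax for this is shown in Table 2-20.
Table 2-20 Alternation and grouping
Syntax | Description | Example pattern | Matches
| | Matches either the expression on its left or the one on its right; the left side is tried first | \d{4}(-|/|\.)\d{2}(-|/|\.)\d{2} | 2018-12-12
(...) | Groups the enclosed expression; each group can be retrieved separately after a match | (\d{4})-(\d{2})-(\d{2}) | 2018-12-12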
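The following sketch applies alternation and grouping to the three date formats mentioned above. The pattern and the sample strings are assumptions made for illustration; the pattern does not require both separators to be the same.

```python
import re

# (-|/|\.) uses "|" inside a group to accept any one of three separators;
# groups 1, 3 and 5 capture the year, month and day.
date_re = re.compile(r"(\d{4})(-|/|\.)(\d{2})(-|/|\.)(\d{2})")

dates = ["2018-10-10", "2018/10/10", "2018.10.10", "2018_10_10"]
parsed = {}
for d in dates:
    m = date_re.fullmatch(d)
    parsed[d] = (m.group(1), m.group(3), m.group(5)) if m else None

for d, parts in parsed.items():
    print(d, parts)
```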
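2.5.5 Examples
The regular expression in Table 2-20 matches a date such as 2018-12-12, but it also matches an invalid one such as 2018.99.99. To reject such values, restrict the month to (0[1-9]|1[0-2]) and the day to (0[1-9]|[12][0-9]|3[01]).
In Python, re.match tries to match the pattern at the start of the string. If the match succeeds it returns a Match object; otherwise it returns None. The Match object provides methods such as group and groups for reading the matched text.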
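As a sketch of the stricter date pattern, the month and day alternatives can be dropped into a pattern and tried with `re.match`. The separator class `[-/.]` and the sample strings are assumptions for illustration.

```python
import re

# Month limited to 01-12 and day limited to 01-31
strict_date = re.compile(
    r"\d{4}[-/.](0[1-9]|1[0-2])[-/.](0[1-9]|[12][0-9]|3[01])"
)

ok = re.match(strict_date, "2018-12-12")
bad = re.match(strict_date, "2018.99.99")

print(ok.group(0), ok.groups())
print(bad)
```

The pattern still accepts impossible dates such as 2018-02-31; checking the day against the month is easier to do in ordinary code after the match.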
[Example 2-2] Using re.match
    import re

    # Compile the regular expression into a Pattern object
    pattern = re.compile(r"(\d{4})-(\d{8})")
    # Strings to match against
    string1 = "0755-44445555 is our new office phone number"
    string2 = "the old number 0755-11112222 is no longer used"
    # Get the Match objects
    match1 = re.match(pattern, string1)
    match2 = re.match(pattern, string2)

    if match1:
        # Type of the object returned by re.match
        print("match1 type:", type(match1))
        # groups() returns all captured groups as a tuple
        print("match1 groups():", match1.groups())
        # group(0) returns the whole matched string
        print("match1 group(0):", match1.group(0))
        # group(n) returns the n-th captured group
        print("match1 group(1):", match1.group(1))
        print("match1 group(2):", match1.group(2))
    else:
        print("match1 matched nothing")

    if match2:
        print("match2 type:", type(match2))
        print("match2 groups():", match2.groups())
        print("match2 group(0):", match2.group(0))
        print("match2 group(1):", match2.group(1))
        print("match2 group(2):", match2.group(2))
    else:
        print("match2 matched nothing")
The output (on Python 3.7 or later) is:
    match1 type: <class 're.Match'>
    match1 groups(): ('0755', '44445555')
    match1 group(0): 0755-44445555
    match1 group(1): 0755
    match1 group(2): 44445555
    match2 matched nothing
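(3) re.search(pattern, string, flags=0)
The search function is similar to match. The difference is that match only tries to match at the start of the string, whereas search scans the whole string for the first location where the pattern matches. If a match is found, search returns a Match object.
[Example 2-3] Using re.search()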
    import re

    # Compile the regular expression into a Pattern object
    pattern = re.compile(r"(\d{4})-(\d{8})")
    # Strings to search
    string1 = "0755-44445555 is our new office phone number"
    string2 = "the old number 0755-11112222 is no longer used"
    # Get the Match objects returned by search
    search1 = re.search(pattern, string1)
    search2 = re.search(pattern, string2)

    if search1:
        # Type of the object returned by re.search
        print("search1 type:", type(search1))
        # groups() returns all captured groups as a tuple
        print("search1 groups():", search1.groups())
        # group(0) returns the whole matched string
        print("search1 group(0):", search1.group(0))
        # group(n) returns the n-th captured group
        print("search1 group(1):", search1.group(1))
        print("search1 group(2):", search1.group(2))
    else:
        print("search1 matched nothing")

    if search2:
        print("search2 type:", type(search2))
        print("search2 groups():", search2.groups())
        print("search2 group(0):", search2.group(0))
        print("search2 group(1):", search2.group(1))
        print("search2 group(2):", search2.group(2))
    else:
        print("search2 matched nothing")
The output (on Python 3.7 or later) is:
    search1 type: <class 're.Match'>
    search1 groups(): ('0755', '44445555')
    search1 group(0): 0755-44445555
    search1 group(1): 0755
    search1 group(2): 44445555
    search2 type: <class 're.Match'>
    search2 groups(): ('0755', '11112222')
    search2 group(0): 0755-11112222
    search2 group(1): 0755
    search2 group(2): 11112222
(4) re.findall(pattern, string, flags=0)
Searches the string and returns all substrings that match the pattern, as a list.
[Example 2-4] Using re.findall
    import re

    # Compile the regular expression into a Pattern object
    pattern = re.compile(r"\d+")
    # String to search
    strings = "Your activation code is 73629-72993-00983-84721"
    result = re.findall(pattern, strings)

    print(result)
The output is:
    ['73629', '72993', '00983', '84721']
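One detail worth knowing about `re.findall` is that its return value depends on how many groups the pattern has. The sketch below uses a made-up string to show the three cases.

```python
import re

# Made-up sample text for illustration
log = "alice=30, bob=25, carol=41"

# No groups: a list of whole matches
whole = re.findall(r"\w+=\d+", log)
# One group: a list of that group's text
names = re.findall(r"(\w+)=\d+", log)
# Several groups: a list of tuples, one tuple per match
pairs = re.findall(r"(\w+)=(\d+)", log)

print(whole)
print(names)
print(pairs)
```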
(5) re.split(pattern, string, maxsplit=0, flags=0)
Splits the string at every substring that matches the pattern and returns the pieces as a list.
[Example 2-5] Using re.split
    import re

    # Compile the regular expression into a Pattern object
    pattern = re.compile(r"\W")
    # String to split
    strings = 'This#is#the#largest#ball'

    result = re.split(pattern, strings)

    print(result)
The output is:
    ['This', 'is', 'the', 'largest', 'ball']
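The `maxsplit` argument in the signature limits how many splits are made, and a capturing group in the pattern keeps the separators in the result. A short sketch with made-up input:

```python
import re

line = "This#is#the#largest#ball"

all_parts = re.split(r"\W", line)
# At most two splits; the remainder is returned unsplit as the last item
first_two = re.split(r"\W", line, maxsplit=2)
# A capturing group makes re.split return the separators as well
with_seps = re.split(r"(\W)", "a,b;c")

print(all_parts)
print(first_two)
print(with_seps)
```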
(6) re.sub(pattern, repl, string, count=0, flags=0)
Replaces every substring of string that matches the pattern with repl and returns the new string. repl can be either a string or a function. When repl is a function, it receives a single Match object as its argument and returns the replacement string. count sets the maximum number of replacements; when it is omitted (or 0), all matches are replaced.
[Example 2-6] Using re.sub
    import re

    # Compile the regular expression into a Pattern object
    pattern = re.compile(r'(\d{4}-\d{2}-\d{2})')
    strings = ("Today is 2018-12-12, the date of the meeting is set at 2018-09-10, "
               "please confirm whether to participate before 2018-12-25")

    # Replacement function: rewrite the matched date with dots
    def totype(match):
        return match.group(0).replace('-', '.')

    new_strings = re.sub(pattern, totype, strings)

    print(new_strings)
The output is:
    Today is 2018.12.12, the date of the meeting is set at 2018.09.10, please confirm whether to participate before 2018.12.25
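When `repl` is a string, it can refer to captured groups with `\1`, `\2`, and so on, and `count` limits the number of replacements. The following sketch shows both alongside a function replacement; the text and the `to_dmy` helper are made up for illustration.

```python
import re

text = "Due 2018-12-12, moved to 2018-09-10, final 2018-12-25"
date_re = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

# String replacement with backreferences to the three groups
dotted = date_re.sub(r"\1.\2.\3", text)
# count=1 replaces only the first match
first_only = date_re.sub(r"\1.\2.\3", text, count=1)


def to_dmy(match):
    """Reorder year-month-day into day/month/year."""
    return "{}/{}/{}".format(match.group(3), match.group(2), match.group(1))


dmy = date_re.sub(to_dmy, text)

print(dotted)
print(first_only)
print(dmy)
```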
2.6 Python's HTTP Request Library: urllib
The first step of any crawler is to send an HTTP request to the server and fetch the page content. urllib is Python's built-in HTTP request library, so it needs no installation.
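2.6.1 Sending Requests
The main function for sending a request is urlopen, defined as:
    urllib.request.urlopen(url, data=None, [timeout, ]*, cafile=None,
                           capath=None, cadefault=False, context=None)
where:
● url: the address to request, given as a string or as a Request object.
● data: optional. If data is given, the request is sent as POST. Note that data must be a bytes object; it can be built with urllib.parse.urlencode.
● timeout: the timeout for the request, in seconds.
● cafile: for HTTPS requests, a file containing the trusted CA certificates.
● capath: for HTTPS requests, a directory containing CA certificate files.
● cadefault: ignored.
● context: an ssl.SSLContext instance describing the SSL options.
The object returned by urlopen() provides the following methods:
● read(): reads the response body.
● geturl(): returns the URL that was actually retrieved.
● getcode(): returns the HTTP status code.
● info(): returns the response headers.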
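The response methods listed above can be tried without network access by opening a `data:` URL, which `urlopen` handles locally. This is only a sketch: the URL is made up, and because no HTTP exchange takes place there is no status code to read, so `getcode()` is left out here.

```python
import urllib.request

# A data: URL carries its own content, so nothing is sent over the network
url = "data:text/plain;charset=utf-8,hello%20crawler"

with urllib.request.urlopen(url) as r:
    body = r.read()          # response body as bytes
    final_url = r.geturl()   # URL that was retrieved
    headers = r.info()       # header object, indexable by header name

print(body.decode("utf-8"))
print(final_url)
print(headers["Content-type"])
```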
The following example sends a simple GET request:
>>> import urllib.request
>>> url = 'http://bing.com'
>>> r = urllib.request.urlopen(url)
If the GET request needs parameters, use urlencode from urllib.parse to encode them and append the result to the URL:
>>> import urllib.request
>>> import urllib.parse
>>> url = 'http://bing.com/search'
>>> data = {"q": "python"}
>>> req_data = urllib.parse.urlencode(data)
>>> req_data
'q=python'
>>> requrl = url + "?" + req_data
>>> requrl
'http://bing.com/search?q=python'
>>> r = urllib.request.urlopen(requrl)
>>> r.getcode()
200
>>> r.geturl()
'http://cn.bing.com/?scope=web&setmkt=zh-CN&setmkt=zh-CN&setmkt=zh-CN'
>>> r.info()
<http.client.HTTPMessage object at 0x00000188853E6E10>
To send a POST request, pass the data parameter:
data = bytes(urllib.parse.urlencode({'name': '11'}), encoding='utf8')
r = urllib.request.urlopen('http://example.com', data=data)
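The rule that supplying `data` turns the request into a POST can be observed without sending anything, by building `urllib.request.Request` objects and asking for their method. A minimal sketch; the URLs and parameter values are placeholders.

```python
import urllib.parse
import urllib.request

# GET: parameters are URL-encoded and appended after "?"
query = urllib.parse.urlencode({"q": "python crawler", "page": 2})
get_req = urllib.request.Request("http://example.com/search?" + query)

# POST: parameters are URL-encoded, converted to bytes, and passed as data
form = urllib.parse.urlencode({"name": "demo"}).encode("utf-8")
post_req = urllib.request.Request("http://example.com/submit", data=form)

print(query)
print(get_req.get_method())
print(post_req.get_method(), post_req.data)
```

Either Request object could then be passed to urlopen in place of a URL string.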
2.6.2 Using Cookies
A cookie is data that a website stores on the user's machine in order to identify the user and track the session. Some pages can only be visited after logging in, so a crawler often has to obtain the cookie first and send it with later requests. urllib handles cookies through urllib.request.HTTPCookieProcessor(cookie), which is used to build an opener. Python's http.cookiejar module provides objects that store cookies; its CookieJar class captures the cookies set by a response and sends them back on subsequent requests.
[Example 2-7] Getting cookies with urllib
    # Get cookies
    import http.cookiejar, urllib.request

    cookie = http.cookiejar.CookieJar()
    handler = urllib.request.HTTPCookieProcessor(cookie)
    opener = urllib.request.build_opener(handler)
    response = opener.open('http://www.baidu.com')
    for item in cookie:
        print(item.name + "=" + item.value)
The result is shown in Figure 2-12.
Figure 2-12 Getting cookies
[Example 2-8] Saving cookies with urllib
    # Save cookies
    import http.cookiejar, urllib.request

    filename = 'saved_cookies.txt'
    # FileCookieJar, MozillaCookieJar and LWPCookieJar can all save cookie
    # information to a file; they differ in the file format they use
    cookie = http.cookiejar.MozillaCookieJar(filename)
    handler = urllib.request.HTTPCookieProcessor(cookie)
    opener = urllib.request.build_opener(handler)
    response = opener.open('http://www.baidu.com')
    cookie.save(ignore_discard=True, ignore_expires=True)
The result is shown in Figure 2-13.
Figure 2-13 Saving cookies
[Example 2-9] Loading saved cookies with urllib
    # Use saved cookies
    import http.cookiejar, urllib.request

    cookie = http.cookiejar.MozillaCookieJar()
    cookie.load('saved_cookies.txt', ignore_discard=True,
                ignore_expires=True)
    handler = urllib.request.HTTPCookieProcessor(cookie)
    opener = urllib.request.build_opener(handler)
    response = opener.open('http://www.baidu.com')
    print(response.read().decode('utf-8'))
The result is shown in Figure 2-14.
Figure 2-14 Using saved cookies
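The save and load steps can also be exercised offline by putting hand-built cookies into the jar instead of fetching a page. The sketch below is an illustration under assumptions: the cookie names, values, domain, and the `make_cookie` helper are all made up, and the file is written to a temporary directory.

```python
import http.cookiejar
import os
import tempfile


def make_cookie(name, value, domain="example.com"):
    """Build a session cookie by hand; every field here is illustrative."""
    return http.cookiejar.Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=False, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=False, expires=None, discard=True,
        comment=None, comment_url=None, rest={},
    )


path = os.path.join(tempfile.mkdtemp(), "saved_cookies.txt")

# Save: ignore_discard=True also writes session cookies to the file
jar = http.cookiejar.MozillaCookieJar(path)
jar.set_cookie(make_cookie("session", "abc123"))
jar.set_cookie(make_cookie("theme", "dark"))
jar.save(ignore_discard=True, ignore_expires=True)

# Load with the same flags to get the session cookies back
restored = http.cookiejar.MozillaCookieJar()
restored.load(path, ignore_discard=True, ignore_expires=True)
loaded = {c.name: c.value for c in restored}

# With the default flags, session cookies are skipped on load
skipped = http.cookiejar.MozillaCookieJar()
skipped.load(path)

print(loaded)
print(len(skipped))
```

This shows why the examples pass ignore_discard=True: cookies without an expiry time are treated as session cookies and would otherwise be dropped.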
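2.7 A Powerful Third-Party Library: requests
Compared with urllib, the third-party library requests is simpler and more user-friendly, and it is widely used for crawling.
There are two ways to install requests:
● Install it with pip install requests.
● Download the source from GitHub (https://github.com/requests/requests) and install it by running setup.py.
Using requests is straightforward, as the following example shows.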
[Example 2-10] Basic use of requests
    import requests

    r = requests.get(url="http://www.bing.com")
    print(r.content)
Besides GET, requests supports the other HTTP request types:
    r = requests.post('http://httpbin.org/post')
    r = requests.put('http://httpbin.org/put', data={'key': 'value'})
    r = requests.delete('http://httpbin.org/delete')
    r = requests.head('http://httpbin.org/get')
    r = requests.options('http://httpbin.org/get')
Of these, GET and POST are used most often. The following sections cover the basic use of requests, including URL parameters, request headers, and encoding.
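2.7.1 Sending Requests
1. GET requests
GET is the most common request type. Its parameters are passed in the URL itself, so they are visible in the address bar. For example, after searching Bing for requests, the address bar shows the following URL:
https://cn.bing.com/search?q=requests&qs=n&form=QBLH&sp=-1&pq=requests&sc=8-8&sk=&cvid=72590B4841941B79094E826A164CC50
The parameters appear after the "?" as key=value pairs separated by "&", as shown in Figure 2-15.
Figure 2-15 URL parameters
The parameter "q" holds the search keyword, here "requests". With requests, these parameters are passed as a dictionary, as in the following example.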
[Example 2-11] Passing URL parameters with requests
    import requests

    payload = {
        'q': 'requests',
        'qs': 'n',
        'pq': 'requests',
        'sc': '8-8',
        'cvid': '...',
        'form': 'QBLH',
        'sp': '-1',
    }
    response = requests.get(url='http://cn.bing.com/search', params=payload)
The URL that is actually requested is:
http://cn.bing.com/search?q=requests&qs=n&pq=requests&sc=8-8&cvid=...&form=QBLH&sp=-1
The params argument is passed as a dictionary. Note that any key whose value is None is not added to the URL's query string.
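To make the None rule concrete without depending on requests or the network, the sketch below imitates the behavior with the standard library. It is not how requests is implemented; `build_url` is a hypothetical helper and the parameter values are examples.

```python
from urllib.parse import urlencode


def build_url(base, params):
    """Append params as a query string, skipping keys whose value is None."""
    kept = {k: v for k, v in params.items() if v is not None}
    if not kept:
        return base
    return base + "?" + urlencode(kept)


url = build_url(
    "http://cn.bing.com/search",
    {"q": "requests", "qs": "n", "sp": "-1", "cvid": None},
)
bare = build_url("http://cn.bing.com/search", {"cvid": None})

print(url)
print(bare)
```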
2. POST requests
POST requests are typically used to submit forms or upload data; the parameters travel in the request body rather than in the URL. With requests, the data for a POST request is passed through the data parameter.
[Example 2-12] Sending a POST request with requests
    import requests

    payload = {
        'key1': 'value1',
        'key2': 'value2'
    }
    response = requests.post(url='http://www.bing.com', data=payload)
If the server expects the body in JSON format, pass the dictionary through the json parameter instead:
    response = requests.post(url='http://www.bing.com', json=payload)
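The difference between `data=` and `json=` is the encoding of the request body. The sketch below reproduces the two encodings with the standard library only, as an approximation of what ends up on the wire; it does not call requests, and the exact spacing of the JSON text may differ from what requests produces.

```python
import json
from urllib.parse import urlencode

payload = {"key1": "value1", "key2": "value2"}

# data=payload: form-encoded body,
# sent with Content-Type application/x-www-form-urlencoded
form_body = urlencode(payload)

# json=payload: JSON text body, sent with Content-Type application/json
json_body = json.dumps(payload)

print(form_body)
print(json_body)
```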
2.7.2 Request Headers
Some sites inspect the request headers and refuse requests that do not appear to come from a browser. To set header fields such as User-Agent, pass a dictionary through the headers parameter:
>>> import requests
>>> user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36"
>>> headers = {'User-Agent': user_agent}
>>> response = requests.get('http://www.baidu.com', headers=headers)
2.7.3 Encoding
After a request is sent, requests makes an educated guess about the encoding of the response based on the HTTP headers, and uses that encoding when r.text is accessed. The encoding in use can be read from r.encoding:
>>> import requests
>>> r = requests.get('http://blog.jobbole.com/all-posts/')
>>> r.encoding
'utf-8'
>>> r.text
".. \e\a\e\r\n\eInstagram