e$SrSSKrSSKrSSKrSSKrS/r\R"SS5r"SS5r "SS5r "S S 5r g) a robotparser.py Copyright (C) 2000 Bastian Kleineidam You can choose between two licenses when using this package: 1) GNU GPLv2 2) PSF license for Python 2.2 The robots.txt Exclusion Protocol is implemented as specified in http://www.robotstxt.org/norobots-rfc.txt NRobotFileParser RequestRatezrequests secondscd\rSrSrSrSSjrSrSrSrSr Sr S r S r S r S rS rSrSrg)rzjThis class provides a set of methods to read, parse and answer questions about a single robots.txt file. cz/Ul/UlSUlSUlSUlUR U5 SUlg)NFr)entriessitemaps default_entry disallow_all allow_allset_url last_checkedselfurls 8/opt/imh/python3.13/lib/python3.13/urllib/robotparser.py__init__RobotFileParser.__init__s;  !! ScUR$)zReturns the time the robots.txt file was last fetched. This is useful for long-running web spiders that need to check for new robots.txt files periodically. )rrs rmtimeRobotFileParser.mtime&s   rc6SSKnUR5Ulg)zISets the time the robots.txt file was last fetched to the current time. rN)timer)rrs rmodifiedRobotFileParser.modified/s  IIKrcnXl[RRU5SSuUlUlg)z,Sets the URL referring to a robots.txt file.N)rurllibparseurlparsehostpathrs rr RobotFileParser.set_url7s+%||44S9!A> 49rc[RRUR5nUR 5nUR UR S5R55 g![RRaYnURS;aSUl O'URS:aURS:aSUl UR5 SnAgSnAff=f)z4Reads the robots.txt URL and feeds it to the parser.zutf-8)iiTiiN)r!requesturlopenrreadr"decode splitlineserror HTTPErrorcoder r close)rfrawerrs rr*RobotFileParser.read<s 9&&txx0A&&(C JJszz'*557 8||%% xx:%$(!SSXX^!% IIKK  s)A**C!ACC!cSUR;aURcXlggURRU5 gN*) useragentsr rappend)rentrys r _add_entryRobotFileParser._add_entryJs; %"" "!!)%*"* LL   &rcBSn[5nUR5 UGHnU(d6US:Xa [5nSnO#US:XaURU5 [5nSnURS5nUS:aUSUnUR 5nU(dMvUR SS5n[ U5S:XdMUSR 5R5US'[RRUSR 55US'USS:XaDUS:XaURU5 [5nURRUS5 SnGM=USS:Xa6US:wa-URR[USS 55 SnGMyGM|USS :Xa6US:wa-URR[USS 55 SnGMGMUSS :XaGUS:wa>USR 5R5(a[!US5UlSnGMGM USS :XaUS:waUSR S5n[ U5S:XauUSR 5R5(aOUSR 5R5(a)[%[!US5[!US55UlSnGMGMUSS:XdGMUR(RUS5 GM US:XaURU5 gg)z|Parse the input lines from a robots.txt file. We allow that a user-agent: line is not preceded by one or more blank lines. rr#N:z user-agentdisallowFallowTz crawl-delayz request-rate/sitemap)Entryrr;findstripsplitlenlowerr!r"unquoter8r9 rulelinesRuleLineisdigitintdelayrreq_rater )rlinesstater:lineinumberss rr"RobotFileParser.parseSs DA:!GEEaZOOE*!GEE #AAvBQx::>   \\**6<<+?+?+DE ll%%r"Z__   j.. 0C0C'EFll  %C\\E **s++"   %%//4 4rcUR5(dgURH'nURU5(dMURs $ UR(aURR$gN)rrr_rPr rrar:s r crawl_delayRobotFileParser.crawl_delaysYzz||\\E **{{""   %%++ +rcUR5(dgURH'nURU5(dMURs $ UR(aURR$grf)rrr_rQr rgs r request_rateRobotFileParser.request_ratesYzz||\\E **~~%"   %%.. .rc>UR(dgUR$rf)r rs r site_mapsRobotFileParser.site_mapss}}}}rcURnURbXR/-nSR[[U55$)Nz )rr joinmapstr)rrs r__str__RobotFileParser.__str__s>,,    )!3!3 44G{{3sG,--r) r r r rr$rr%r rN)rY)__name__ __module__ __qualname____firstlineno____doc__rrrr r*r;r"rcrhrkrnrt__static_attributes__rrrrsE !(? 9'G#R: .rc*\rSrSrSrSrSrSrSrg)rMzhA rule line is a single "Allow:" (allowance==True) or "Disallow:" (allowance==False) followed by a path.cUS:Xa U(dSn[RR[RRU55n[RR U5UlX lg)NrYT)r!r"rZr#r^r%r`)rr%r`s rrRuleLine.__init__sN 2:iI||&&v||'<'>zTADIIMMrrN) rvrwrxryrzrr_rtr{r|rrrMrMs1#BNrrMc0\rSrSrSrSrSrSrSrSr g) rEz?An entry has one or more user-agents and zero or more rulelinesc</Ul/UlSUlSUlgrf)r8rLrPrQrs rrEntry.__init__s  rc/nURHnURSU35 M URbURSUR35 URb7URnURSURSUR 35 UR [[UR55 SRU5$)Nz User-agent: z Crawl-delay: zRequest-rate: rC ) r8r9rPrQrequestssecondsextendrrrsrLrq)rretagentrates rrt Entry.__str__s__E JJeW- .% :: ! JJtzzl3 4 == $==D JJ a ~F G 3sDNN+,yy~rcURS5SR5nURH"nUS:Xa gUR5nX!;dM" g g)z2check if this entry applies to the specified agentrCrr7TF)rHrJr8)rrars rr_Entry.applies_tosQOOC(+113 __E|KKME! %rcrURH'nURU5(dMURs $ g)zJPreconditions: - our agent applies to this entry - filename is URL decodedT)rLr_r`)rrrTs rr`Entry.allowance s0NNDx((~~%#r)rPrQrLr8N) rvrwrxryrzrrtr_r`r{r|rrrErEsI  rrE) rz collections urllib.errorr! urllib.parseurllib.request__all__ namedtuplerrrMrEr|rrrsV   $$]4FG ..DNN$((r