- as.list() method for robots.txt files
- explicit encoding parameter in get_robotstxt() that defaults to "UTF-8", which the content function uses anyway - but now it will not complain about it
- paths_allowed() and robotstxt() switched from future::future_lapply() to future.apply::future_lapply() to make the package compatible with versions of future after 1.8.1
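  A minimal sketch of the change, assuming a parallel plan and two example domains:

  ```r
  # any future plan works; multisession is used here as an example
  future::plan(future::multisession)

  domains <- c("cran.r-project.org", "www.r-project.org")

  # before (future <= 1.8.1): future::future_lapply(domains, robotstxt::get_robotstxt)
  # after: the same call, now provided by the future.apply package
  res <- future.apply::future_lapply(domains, robotstxt::get_robotstxt)
  ```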
- get_robotstxts() function which is a 'vectorized' version of get_robotstxt()
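  A usage sketch with assumed example domains:

  ```r
  # one request per domain, input vectorized
  rtxts <- robotstxt::get_robotstxts(c("cran.r-project.org", "github.com"))

  # the scalar equivalent for a single domain
  rtxt <- robotstxt::get_robotstxt("cran.r-project.org")
  ```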
- paths_allowed() now allows checking via either robotstxt's own parsed robots.txt files or via functionality provided by the spiderbar package (the latter should be faster by approximately a factor of 10)
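  A usage sketch; the check_method parameter name is an assumption about how the backend described above is selected:

  ```r
  library(robotstxt)

  paths_allowed(
    paths  = c("/", "/search"),
    domain = "example.com",
    bot    = "*",
    check_method = "spiderbar"  # assumed switch; "robotstxt" for the package's own parser
  )
  ```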
- sessionInfo()$R.version$version.string is used as the default user agent
- get_robotstxt() tests for HTTP errors and handles them; warnings might be suppressed, while implausible HTTP status codes will lead to stopping the function https://github.com/ropenscilabs/robotstxt#5
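  Warnings from the fetch can be silenced with base R if needed; the error behaviour stays intact:

  ```r
  # suppressWarnings() only hides warnings; stop() on an implausible
  # HTTP status code still propagates as an error
  rtxt <- suppressWarnings(robotstxt::get_robotstxt("example.com"))
  ```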
- dropping R6 dependency and using a list implementation instead https://github.com/ropenscilabs/robotstxt#6
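  An illustrative contrast only, not the package's actual code; the field names are assumed:

  ```r
  # R6 flavour (adds a package dependency):
  # Robotstxt <- R6::R6Class("robotstxt",
  #   public = list(text = NULL, permissions = NULL))

  # plain-list flavour (base R only):
  robotstxt_obj <- function(text) {
    structure(
      list(
        text        = text,
        bots        = character(0),  # assumed fields, for illustration
        permissions = data.frame()
      ),
      class = "robotstxt"
    )
  }
  ```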
- use caching for get_robotstxt() https://github.com/ropenscilabs/robotstxt#7 / https://github.com/ropenscilabs/robotstxt/commit/90ad735b8c2663367db6a9d5dedbad8df2bc0d23
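  The linked commit contains the actual implementation; a minimal sketch of the idea, caching responses per domain in an environment (the wrapper name is hypothetical):

  ```r
  rt_cache <- new.env(parent = emptyenv())

  # hypothetical wrapper: fetch once per domain, then serve from cache
  get_robotstxt_cached <- function(domain) {
    if (!exists(domain, envir = rt_cache)) {
      assign(domain, robotstxt::get_robotstxt(domain), envir = rt_cache)
    }
    get(domain, envir = rt_cache)
  }
  ```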
- make explicit, less error-prone usage of httr::content(rtxt) (see the example at the end of this list) https://github.com/ropenscilabs/robotstxt#
- replace usage of missing() for parameter checks with an explicit NULL default value for the parameter https://github.com/ropenscilabs/robotstxt#9
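  The pattern, shown on a hypothetical function; a NULL default is also easier to forward programmatically from calling code than an argument checked with missing():

  ```r
  # before: relying on missing()
  f_old <- function(user_agent) {
    if (missing(user_agent)) user_agent <- "*"
    user_agent
  }

  # after: explicit NULL default
  f_new <- function(user_agent = NULL) {
    if (is.null(user_agent)) user_agent <- "*"
    user_agent
  }
  ```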
- partial match useragent / useragents https://github.com/ropenscilabs/robotstxt#10
- explicit encoding declaration: encoding = "UTF-8" in httr::content() https://github.com/ropenscilabs/robotstxt#11
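  An example covering both httr::content() items above, with an assumed example URL:

  ```r
  library(httr)

  rtxt <- GET("https://cran.r-project.org/robots.txt")

  # implicit: content(rtxt) guesses the output type and encoding
  # explicit and less error-prone:
  txt <- content(rtxt, as = "text", encoding = "UTF-8")
  ```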