Mandja
Usage
```
usage: mandja-crawl [-h] [-a HTTPAUTH_CREDS] [-s SESSION_FILE] [-g GO_REGEX]
                    [-u VISIT_REGEX]
                    [-e {a-href,link-href,script-src,form-action,sick-match}]
                    [-o {url,status,from_main_domain}] [-f OUTPUT_FILTER]
                    url
A little recursive crawler. It can list all URLs it has visited. It reuses
the connection via keep-alive when crawling on the set domain and uses
separate connections for the HEAD ping of URLs on other domains.
positional arguments:
  url                   URL to crawl

optional arguments:
  -h, --help            show this help message and exit
  -a HTTPAUTH_CREDS, --httpauth-creds HTTPAUTH_CREDS
                        HTTP authentication credentials passed as a json
                        string: -a "{"": {"username": "",
                        "password": ""}}"
  -s SESSION_FILE, --session-file SESSION_FILE
                        If a session file is used - all the data from URL
                        processing is saved including