Crawpy

Yet another content discovery tool written in python.

What makes this tool different than others:

Sends asynchronous HTTP requests, fast
Calibration mode, applies filters on its own
Recursive scan mode with additional flags
Report generation, you can later go and check your results
Multiple url scan

An example run

An example run with auto calibration and recursive mode enabled

Example reports

Example reports can be found here

https://morph3.blog/crawpy/example.html
https://morph3.blog/crawpy/example.txt

Installation

git clone https://github.com/morph3/crawpy
pip3 install -r requirements.txt 
or
python3 -m pip install -r requirements.txt

Usage

morph3 ➜ crawpy/ [main✗] λ python3 crawpy.py --help
usage: crawpy.py [-h] [-u URL] [-w WORDLIST] [-t THREADS] [-rc RECURSIVE_CODES] [-rp RECURSIVE_PATHS] [-rd RECURSIVE_DEPTH] [-e EXTENSIONS] [-to TIMEOUT] [-follow] [-ac] [-fc FILTER_CODE] [-fs FILTER_SIZE] [-fw FILTER_WORD] [-fl FILTER_LINE] [-k] [-m MAX_RETRY]
                 [-H HEADERS] [-o OUTPUT_FILE] [-gr] [-l URL_LIST] [-lt LIST_THREADS] [-s] [-X HTTP_METHOD] [-p PROXY_SERVER]

optional arguments:
  -h, --help            show this help message and exit
  -u URL, --url URL     URL
  -w WORDLIST, --wordlist WORDLIST
                        Wordlist
  -t THREADS, --threads THREADS
                        Size of the semaphore pool
  -rc RECURSIVE_CODES, --recursive-codes RECURSIVE_CODES
                        Recursive codes to scan recursively Example: 301,302,307
  -rp RECURSIVE_PATHS, --recursive-paths RECURSIVE_PATHS
                        Recursive paths to scan recursively, please note that only given recursive paths will be scanned initially Example: admin,support,js,backup
  -rd RECURSIVE_DEPTH, --recursive-depth RECURSIVE_DEPTH
                        Recursive scan depth Example: 2
  -e EXTENSIONS, --extension EXTENSIONS
                        Add extensions at the end. Seperate them with comas Example: -x .php,.html,.txt
  -to TIMEOUT, --timeout TIMEOUT
                        Timeouts, I suggest you to not use this option because it is procudes lots of erros now which I was not able to solve why
  -follow, --follow-redirects
                        Follow redirects
  -ac, --auto-calibrate
                        Automatically calibre filter stuff
  -fc FILTER_CODE, --filter-code FILTER_CODE
                        Filter status code
  -fs FILTER_SIZE, --filter-size FILTER_SIZE
                        Filter size
  -fw FILTER_WORD, --filter-word FILTER_WORD
                        Filter words
  -fl FILTER_LINE, --filter-line FILTER_LINE
                        Filter line
  -k, --ignore-ssl      Ignore untrusted SSL certificate
  -m MAX_RETRY, --max-retry MAX_RETRY
                        Max retry
  -H HEADERS, --headers HEADERS
                        Headers, you can set the flag multiple times.For example: -H "X-Forwarded-For: 127.0.0.1", -H "Host: foobar"
  -o OUTPUT_FILE, --output OUTPUT_FILE
                        Output folder
  -gr, --generate-report
                        If you want crawpy to generate a report, default path is crawpy/reports/<url>.txt
  -l URL_LIST, --list URL_LIST
                        Takes a list of urls as input and runs crawpy on via multiprocessing -l ./urls.txt
  -lt LIST_THREADS, --list-threads LIST_THREADS
                        Number of threads for running crawpy parallely when running with list of urls
  -s, --silent          Make crawpy not produce output
  -X HTTP_METHOD, --http-method HTTP_METHOD
                        HTTP request method
  -p PROXY_SERVER, --proxy PROXY_SERVER
                        Proxy server, ex: 'http://127.0.0.1:8080'

Examples

python3 crawpy.py -u https://facebook.com/FUZZ -w ./common.txt  -k -ac  -e .php,.html
python3 crawpy.py -u https://google.com/FUZZ -w ./common.txt -k -fw 9,83 -rc 301,302 -rd 2 -ac
python3 crawpy.py -u https://morph3sec.com/FUZZ -w ./common.txt -e .php,.html -t 20 -ac -k
python3 crawpy.py -u https://google.com/FUZZ -w ./common.txt  -ac -gr
python3 crawpy.py -u https://google.com/FUZZ -w ./common.txt  -ac -gr -o /tmp/test.txt
sudo python3 crawpy.py -l urls.txt -lt 20 -gr -w ./common.txt -t 20 -o custom_reports -k -ac -s
python3 crawpy.py -u https://google.com/FUZZ -w ./common.txt -ac -gr -rd 1 -rc 302,301 -rp admin,backup,support -k

Name		Name	Last commit message	Last commit date
Latest commit History 37 Commits
reports		reports
src		src
.gitignore		.gitignore
README.md		README.md
common.txt		common.txt
crawpy.py		crawpy.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Crawpy

An example run

An example run with auto calibration and recursive mode enabled

Example reports

Installation

Usage

Examples

About

Releases

Packages

Languages

morph3/crawpy

Folders and files

Latest commit

History

Repository files navigation

Crawpy

An example run

An example run with auto calibration and recursive mode enabled

Example reports

Installation

Usage

Examples

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages