691 stable releases

Version | Release date |
---|---|
2.13.3 (new) | Nov 8, 2024 |
2.11.0 | Oct 31, 2024 |
1.99.37 | Aug 14, 2024 |
1.99.13 | Jul 31, 2024 |
1.26.7 | Mar 22, 2023 |
Spider Worker
A spider worker to decentralize the crawl lifting.
Dependencies
This project depends on the spider crate.
Usage
By default, the worker starts on port 3030 and the scraper used for HTML gathering starts on port 3031.
SPIDER_WORKER_PORT=3030 SPIDER_WORKER_SCRAPER_PORT=3031 cargo run
Feature Flags
scrape
- Run the instance with this flag when the HTML is needed. The client must be started with the matching spider feature flag. This also starts the instance on port 3031 instead.
full_resources
- Start the basic worker that gathers links together with the scraper.
tls
- Enable TLS support. Use the env variable SPIDER_WORKER_CERT_PATH for the .pem file and SPIDER_WORKER_KEY_PATH for your .rsa file. Defaults to /cert.pem and /key.rsa.
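The certificate and key lookup described for the tls flag can be sketched as follows. This is a minimal illustration, not the crate's actual code: the helper names `path_or_default` and `tls_paths_from_env` are hypothetical, and treating an empty variable as unset is an assumption. Only the variable names and default paths come from the text above.

```rust
use std::env;
use std::path::PathBuf;

/// Return the configured path, or `default` when the value is missing.
/// Hypothetical helper; treating an empty string as "unset" is an assumption.
fn path_or_default(raw: Option<String>, default: &str) -> PathBuf {
    match raw {
        Some(value) if !value.is_empty() => PathBuf::from(value),
        _ => PathBuf::from(default),
    }
}

/// Resolve the (certificate, key) paths from the documented env variables,
/// falling back to the documented defaults /cert.pem and /key.rsa.
fn tls_paths_from_env() -> (PathBuf, PathBuf) {
    (
        path_or_default(env::var("SPIDER_WORKER_CERT_PATH").ok(), "/cert.pem"),
        path_or_default(env::var("SPIDER_WORKER_KEY_PATH").ok(), "/key.rsa"),
    )
}
```

Keeping the fallback logic in a function that takes the raw value makes it easy to check without touching the process environment.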
Ports
By default the instance runs on port 3030. Use SPIDER_WORKER_PORT to adjust the port.
The scraper runs on port 3031 when enabled. Use SPIDER_WORKER_SCRAPER_PORT to adjust the port.
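As a rough sketch of the port selection described above, the two variables could be read with a fallback to the documented defaults. The function names `parse_port` and `ports_from_env` are hypothetical, and falling back to the default on an invalid value is an assumption; the worker itself may handle bad input differently.

```rust
use std::env;

/// Default ports stated in this README.
const DEFAULT_WORKER_PORT: u16 = 3030;
const DEFAULT_SCRAPER_PORT: u16 = 3031;

/// Parse an optional raw value into a port number.
/// Falls back to `default` when the value is missing or not a valid u16
/// (hypothetical helper; the fallback on invalid input is an assumption).
fn parse_port(raw: Option<&str>, default: u16) -> u16 {
    raw.and_then(|value| value.trim().parse::<u16>().ok())
        .unwrap_or(default)
}

/// Read the (worker, scraper) ports from the documented env variables.
fn ports_from_env() -> (u16, u16) {
    let worker = env::var("SPIDER_WORKER_PORT").ok();
    let scraper = env::var("SPIDER_WORKER_SCRAPER_PORT").ok();
    (
        parse_port(worker.as_deref(), DEFAULT_WORKER_PORT),
        parse_port(scraper.as_deref(), DEFAULT_SCRAPER_PORT),
    )
}
```

With neither variable set, `ports_from_env()` yields the defaults 3030 and 3031.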