In some cases, the default settings in Netpeak Spider are not suitable for crawling your website. This usually happens when crawling:
- Slow websites with low request processing speed.
- Websites with additional anti-crawling protection, such as a limit on the number of simultaneous requests to the server.
To reduce the load on a website, you can use the following settings:
- On the ‘General’ tab:
- Reduce the number of crawling threads to a minimum (for instance, one or two threads). By default, Netpeak Spider uses 10 threads, which provides fairly fast crawling for most websites. If that doesn’t help, try setting one thread with a 1500–3000 ms delay between requests to minimize the crawling speed (see the sketch after this tab’s settings);
- Turn off crawling of images, PDF, CSS, and other MIME types of files, and disable JavaScript rendering. This decreases the number of requests during crawling. Note that the main SEO parameters will not be checked for these files.
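
To show what the single-thread-plus-delay setting means in practice, here is a minimal Python sketch of a polite, single-threaded fetch loop. It is not how Netpeak Spider works internally; the URLs and the 2000 ms delay are illustrative values within the 1500–3000 ms range mentioned above:

```python
import time
import urllib.request

# Illustrative value within the 1500-3000 ms range suggested above.
DELAY_SECONDS = 2.0

# Hypothetical list of pages to fetch.
urls = [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/contact",
]

# A single-threaded loop: each request waits for the previous one to
# finish, then pauses, so the server never sees parallel load.
for url in urls:
    with urllib.request.urlopen(url, timeout=10) as response:
        print(url, response.status)
    time.sleep(DELAY_SECONDS)  # the 'delay between requests' setting
```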
- On the ‘Restrictions’ tab, you can set:
- Max number of crawled URLs;
- Max depth of crawling (the distance from the initial URL to a target URL, measured by clicks);
- Max URL depth (the number of segments in the URL of an analyzed page; see the sketch below).
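
To clarify the difference between the two depth limits, the following plain Python sketch (not part of Netpeak Spider) counts URL segments, assuming a segment is each non-empty part of the URL path:

```python
from urllib.parse import urlparse

def url_depth(url: str) -> int:
    """Count non-empty path segments, i.e. the 'URL depth'."""
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])

# This page has a URL depth of 3 (blog, 2023, post), even if it is
# reachable in one click from the homepage (crawling depth of 1).
print(url_depth("https://example.com/blog/2023/post"))  # 3
print(url_depth("https://example.com/"))                # 0
```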
- On the ‘Rules’ tab:
- Restrict the crawling area by excluding specific folders or entire sections of the site (see the sketch below).
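
Conceptually, an exclusion rule is a filter applied to each discovered URL before it is queued. Below is a minimal Python sketch of that idea; the folder names are hypothetical, and Netpeak Spider applies such rules internally via the ‘Rules’ tab rather than through code:

```python
# Hypothetical directories to skip during the crawl.
EXCLUDED_FOLDERS = ("/blog/", "/tag/")

def should_crawl(url: str) -> bool:
    """Skip any URL whose path falls under an excluded folder."""
    return not any(folder in url for folder in EXCLUDED_FOLDERS)

urls = [
    "https://example.com/products/widget",
    "https://example.com/blog/2023/post",
]
print([u for u in urls if should_crawl(u)])
# ['https://example.com/products/widget']
```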