QUOTE(Dorota Matysiak @ Dec 11 2006, 06:50 PM)

I have a similar problem to Marcin's, except that it concerns the Yahoo search engine's robot. Does Yahoo! Slurp also honor the crawl-delay directive?
Yahoo! Slurp also supports the Crawl-delay directive. More information is available in the search engine's help section:
QUOTE
Since we crawl billions of pages from the entire Web, we use a large number of systems for web crawling. Therefore your web server may log requests from a number of different Yahoo! crawler client IP addresses. The different crawler systems are coordinated to limit the activity on any single web server. We determine a single "web server" by IP address, so if your host is serving multiple IPs it may see higher levels of activity.
If there are directories on your web server which you do not want represented in web search results, use robot exclusion rules as described in "How do I prevent certain subdirectories from being crawled". An exclusion rule can reduce the number of pages Slurp will read from your server.
There is a Yahoo! Slurp-specific extension to robots.txt which allows you to set a lower limit on our crawler request rate.
You can add a "Crawl-delay: xx" instruction, where "xx" is a delay value between successive crawler accesses. If the crawler rate is a problem for your server, you can set the delay up to 5 or 10 or a comfortable value for your server.
Setting a crawl-delay of 10 for Yahoo! Slurp would look something like:
User-agent: Slurp
Crawl-delay: 10
In general you should restrict total crawler activity to your server by disallowing unimportant content with a robots.txt rule. Setting a crawl-delay may limit the coverage and freshness of your content representation in Yahoo! search results. If you do feel that a crawl-delay is necessary, use small values to avoid blocking Slurp discovery and refresh of your key content.
Regards,
A. Pawlus