<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
	<channel>
		<title><![CDATA[Latest posts for the topic "Blocking Bad Bots"]]></title>
		<link>https://proxy2.de/forum/posts/list/3.php</link>
		<description><![CDATA[Latest messages posted in the topic "Blocking Bad Bots"]]></description>
		<generator>JForum - http://www.jforum.net</generator>
			<item>
				<title>Blocking Bad Bots</title>
				<description><![CDATA[ Some info I found on the Internet to help stop bad bots from indexing the Guestbook, as well as harvesting email addresses and using up bandwidth.<br /> <br /> On this forum, someone has posted their code for a "php spider trap". It uses robots.txt, .htaccess and getout.php:<br /> <a class="snap_shots" href="http://www.webmasterworld.com/forum88/3104.htm" target="_blank" rel="nofollow">http://www.webmasterworld.com/forum88/3104.htm</a><br /> <br /> The Perl version of the above script, named "trap.cgi":<br /> <a class="snap_shots" href="http://www.webmasterworld.com/forum13/1823.htm" target="_blank" rel="nofollow">http://www.webmasterworld.com/forum13/1823.htm</a><br /> <br /> This is a slick article with instructions for using mod_rewrite. I don't understand the concept or know whether it's possible on most hosts, but maybe some of the experts here can translate it for us:<br /> <a class="snap_shots" href="http://diveintomark.org/archives/2003/02/26/how_to_block_spambots_ban_spybots_and_tell_" target="_blank" rel="nofollow">http://diveintomark.org/archives/2003/02/26/how_to_block_spambots_ban_spybots_and_tell_</a><br /> unwanted_robots_to_go_to_hell<br /> (the address is split to keep the line from running long; after clicking the link, paste the last part onto the end of the address in your browser's address bar)<br /> <br /> A sample .htaccess spider-blocking script (using mod_rewrite) with a long list of bots added:<br /> <a class="snap_shots" href="http://techpatterns.com/downloads/scripts/sample_wbmw.txt" target="_blank" rel="nofollow">http://techpatterns.com/downloads/scripts/sample_wbmw.txt</a><br /> <br /> A robots.txt tutorial with lists of spambots, email harvesters and bots searching for plagiarism:<br /> <a class="snap_shots" href="http://www.clockwatchers.com/robots_list.html" target="_blank" rel="nofollow">http://www.clockwatchers.com/robots_list.html</a><br /> <br /> A ready-made robots.txt file to be downloaded from phpbbhacks.com.  
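<br /> <br /> For anyone who wants the short version, here is a minimal sketch of the robots.txt + .htaccess idea from the links above, assuming Apache with mod_rewrite enabled. The getout.php name follows the linked spider-trap thread, and the bot names are only illustrative; the linked lists are much longer:<br /> <br />

```apache
# robots.txt -- polite bots are told to keep out of the trap URL,
# so only bots that ignore robots.txt ever reach it:
#   User-agent: *
#   Disallow: /getout.php

# .htaccess -- send known bad user-agents straight to the trap script
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP [NC]
RewriteRule ^.* /getout.php [L]
```

The [OR] flag chains the conditions, so a match on any one user-agent triggers the rule; the last condition deliberately has no [OR].<br />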
Use for any site, not just phpbb:<br /> <a class="snap_shots" href="http://www.phpbbhacks.com/download/3182" target="_blank" rel="nofollow">http://www.phpbbhacks.com/download/3182</a>]]></description>
				<guid isPermaLink="true">https://proxy2.de/forum/posts/preList/3643/10569.php</guid>
				<link>https://proxy2.de/forum/posts/preList/3643/10569.php</link>
				<pubDate><![CDATA[Fri, 24 Sep 2004 11:40:13 GMT]]></pubDate>
				<author><![CDATA[amber222]]></author>
			</item>
	</channel>
</rss>