We often have to scan sites whose structure we don't know in advance. When reviewing the crawl, it frequently looks like there are loops, or references to resource directories that get pulled in repeatedly via includes. The crawl path might look something like this:
site root/calendar/maincalendar/includes/calendarl/maincalendar/includes.. and so on; it will only stop several iterations deep.
Is there a safe way to prevent these loops, e.g. by detecting redundant links or maintaining a blacklist, without knowing the exact purpose of these files or directories?
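For context, one heuristic I've considered (a rough sketch, not a tested solution — the function name and the `max_repeats` threshold are my own invention) is to reject URLs whose path repeats the same directory segment too many times, which is the signature of the trap above:

```python
from urllib.parse import urlparse

def has_repeating_segments(url, max_repeats=2):
    """Heuristic crawler-trap check: flag URLs whose path contains
    any directory segment more than max_repeats times, e.g.
    /calendar/includes/calendar/includes/calendar/..."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    return any(segments.count(seg) > max_repeats for seg in set(segments))

# A looping path trips the check; a normal one does not.
print(has_repeating_segments(
    "http://example.com/calendar/includes/calendar/includes/calendar"))
print(has_repeating_segments("http://example.com/a/b/c"))
```

The obvious risk is false positives on sites that legitimately reuse a segment name, which is why I'd rather hear how others handle this.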