--45Z9DzgjV8m4Oswq
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Hi,

I'm having an odd problem with swish-e 2.4.2. I have an index generated using
spider.pl. Contrary to my expectations, it appears to be indexing the contents
of the href attributes of HTML anchors. I've attached the index configuration
file to this message. The only odd thing I can think of about this particular
website is that the URLs don't have file extensions (see
http://pmr.corbas.co.uk/dynamic/). However, the content type is definitely
correct.

I'd be really grateful if someone more experienced in the ways of SWISH-E could
tell me that I've done something particularly stupid. Failing that, clues as to
where to look next would be great.

cheers
nic gibson
--
love is the shit that makes life bloom
--45Z9DzgjV8m4Oswq
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="site.conf"

IndexDir /usr/local/src/pmr/bin/spider.pl
SwishProgParameters /usr/local/src/pmr/search/spider.conf
IndexFile /usr/local/src/pmr/search/index.site
IndexName 'PMR Site'
IndexDescription "Index of the PMR Site Content"
MetaNames title
PropertyNames title category
ExtractPath category regex !/dynamic/([^/]+)/!$1!
ExtractPathDefault category general
DefaultContents HTML2
--45Z9DzgjV8m4Oswq--