: I have a project: a search engine. My part is:
: 1. Create the database for the search engine
: 2. Indexing and searching
: My friend told me I need to index all the webpages, then save the
: indexed files in the database.
: I don't know which is right.
databases can build internal "indexes" on tables to make certain
queries faster ... so if you have a database of webpages you can build an
index on something like a "size" field to make searching for pages by size
faster.
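for example, with a hypothetical "pages" table in SQLite (the table and
column names here are just illustrative):

```python
import sqlite3

# hypothetical schema: a "pages" table with a url and a size column
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (url TEXT, size INTEGER)")
conn.executemany("INSERT INTO pages VALUES (?, ?)",
                 [("http://a.example/", 1200),
                  ("http://b.example/", 45000),
                  ("http://c.example/", 800)])

# the index lets the database answer queries on "size" without
# scanning every row
conn.execute("CREATE INDEX idx_pages_size ON pages (size)")

small = conn.execute(
    "SELECT url FROM pages WHERE size < 2000 ORDER BY size").fetchall()
print(small)  # [('http://c.example/',), ('http://a.example/',)]
```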
some databases have a feature called a "fulltext" index that can be built
on text columns to make searching for words faster than doing simple "LIKE"
queries. This can work in some use cases, but these database "fulltext"
indexes tend to be very limiting and not easy to customize.
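to make the contrast concrete, here's a sketch using SQLite's FTS5
fulltext index next to a LIKE scan (assumes your SQLite build includes
the FTS5 extension, which most do):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# an FTS5 virtual table is SQLite's builtin "fulltext" index
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(url, body)")
conn.executemany("INSERT INTO docs VALUES (?, ?)",
                 [("http://a.example/", "lucene is a search library"),
                  ("http://b.example/", "databases store rows")])

# word search via the fulltext index...
hits = conn.execute(
    "SELECT url FROM docs WHERE docs MATCH 'search'").fetchall()

# ...versus a LIKE scan, which reads every row and cannot use a normal
# index when the pattern starts with a wildcard
like_hits = conn.execute(
    "SELECT url FROM docs WHERE body LIKE '%search%'").fetchall()
print(hits, like_hits)
```

both queries find the same row here, but only the MATCH query scales;
and note how little control you get over tokenization/customization
compared to a real search library.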
based on what you've described, a couple of Lucene subprojects might be
useful to you...
http://lucene.apache.org/nutch/
Nutch is specifically designed to crawl and index webpages.
http://lucene.apache.org/solr/
Solr is a search "application" that lets you index/query content from
any language over HTTP. It comes with a DataImportHandler plugin that
lets you automatically index databases, using configuration to describe
how to fetch the logical contents of each "document".
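a DataImportHandler config is just a small XML file mapping a SQL query
to Solr fields; something along these lines (the table, columns, and
connection details here are hypothetical):

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/webpages"
              user="db" password="db"/>
  <document>
    <entity name="page" query="SELECT id, url, body FROM pages">
      <field column="id"   name="id"/>
      <field column="url"  name="url"/>
      <field column="body" name="text"/>
    </entity>
  </document>
</dataConfig>
```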
http://lucene.apache.org/java/
Lucene-Java is the underlying search library used in both Nutch and Solr,
if you want to custom build search based logic you can use this library.
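the core data structure Lucene maintains is an "inverted index": a map
from each term to the documents containing it. a toy sketch of the idea
(in python for brevity; this is not Lucene's actual code):

```python
from collections import defaultdict

# toy corpus: doc id -> text (stand-in for fetched webpages)
docs = {
    1: "lucene is a java search library",
    2: "nutch crawls and indexes webpages",
    3: "solr is a search application",
}

# build the inverted index: term -> set of doc ids containing it
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        index[term].add(doc_id)

# query: return docs containing every query term (AND semantics)
def search(query):
    ids = [index.get(t, set()) for t in query.lower().split()]
    return sorted(set.intersection(*ids)) if ids else []

print(search("search"))       # [1, 3]
print(search("java search"))  # [1]
```

Lucene layers tokenization, scoring, and on-disk storage on top of this
basic structure.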
As you mentioned, there is also a Hibernate project for integrating with
Lucene.
if you have follow-up questions about any of those 3 subprojects, please
consult the user mailing list for the specific project you are
interested in.
-Hoss