Search
Search technology is a huge subject, encompassing:
- networking (spidering the web),
- string and markup-language manipulation (parsing HTML),
- language and text parsing (finding words and sentences in documents, stemming and other linguistic analysis),
- algorithms (finding matches, AND/OR queries, combining multiple word results), and
- performance (both increasing spidering speed and making large catalogs fast to search).
In addition to the articles and code below, these search-related links
might be interesting or useful.
Searcharoo.NET - Version 1
How to build a simple, extensible search engine using ASP.NET that
can crawl files and create a searchable catalog by processing the
text from HTML source.
Searcharoo.NET - Version 2
NEW!
Extend Searcharoo to populate its search
catalog by spidering HTML pages: follow links and image maps
to process both static and dynamically generated pages.
You can also search for multiple words.
Useful links
On Search, the Series
Lucene.net [Open Source]
Nata1 [Open Source]
SiteSearchEngine [article]
What is Stemming?
Robots.txt