Current Issue (Issue 120)

Thought of the Week:- "To be successful you must accept all challenges that come your way. You can't just accept the ones you like."
Xapian:- http://xapian.org/ Open Source (1.83 MB).

Xapian is an Open Source Probabilistic Information Retrieval library, released under the GPL. Put simply, it is an open source search engine. Xapian is designed to be a highly adaptable toolkit that lets developers easily add advanced indexing and search facilities to their own applications. Xapian also ships with Omega, a packaged search engine that can be used to search your own website. The features of the software are:

Free Software/Open Source - licensed under the GPL.

Portable to most Unix platforms (known to work on Linux, FreeBSD, OpenBSD, Solaris, and MacOS X). It also works on Microsoft Windows.

Ranked probabilistic search - important words get more weight than unimportant words, so the most relevant documents are more likely to come near the top of the results list.

Relevance feedback - given one or more documents, Xapian can suggest the most relevant index terms to expand a query, suggest related documents, categorise documents, etc.

Phrase and proximity searching - users can search for words occurring in an exact phrase or within a specified number of words, either in a specified order, or in any order.

Full range of structured Boolean search operators. The results of the Boolean search are ranked by the probabilistic weights. Boolean filters can also be applied to restrict a probabilistic search.

Supports stemming of search terms. This helps to find relevant documents which might otherwise be missed. Stemmers are currently included for Danish, Dutch, English, Finnish, French, German, Italian, Norwegian, Portuguese, Russian, Spanish, and Swedish.

Supports database files larger than 2 GB - essential for scaling to large document collections.

Platform independent data formats - you can build a database on one machine and search it on another.

Allows simultaneous update and searching. New documents become searchable right away.

The supplied indexer can index HTML, PHP, PDF, PostScript, and plain text. Adding support for indexing other formats is easy where conversion filters are available (e.g. Microsoft Word).

It can also index data from any SQL database or other data source supported by the Perl DBI module, including MySQL, PostgreSQL, SQLite, Sybase, MS SQL, LDAP, and ODBC.

A CGI search front-end is supplied, with a highly customisable appearance. It can also be customised to output results in XML or CSV, which is useful for dynamically generating pages (e.g. with PHP or mod_perl).
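
To give a feel for what "ranked probabilistic search" with a Boolean filter means in practice, here is a small, self-contained Python sketch. It does not use Xapian and is not Xapian's code: it is a toy BM25-style scorer, and the names `tokenize`, `bm25_rank`, and `required`, as well as the constants `k1` and `b`, are assumptions made up for this illustration.

```python
import math
from collections import Counter

PUNCT = ".,;:!?\"'()"

def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

def bm25_rank(docs, query, k1=1.2, b=0.75, required=None):
    """Return document indices ordered by a BM25-style score (best first).

    docs     -- list of plain-text documents
    query    -- free-text query; every query term contributes to the score
    required -- optional term used as a Boolean filter (hypothetical
                parameter): documents without it are dropped before ranking
    """
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    scored = []
    for idx, toks in enumerate(tokenized):
        if required and required.lower() not in toks:
            continue  # Boolean filter restricts the probabilistic search
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            # Rare terms get a larger weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalised by document length.
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        if score > 0:
            scored.append((score, idx))

    # Highest score first; ties broken by original position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored]

docs = [
    "the open source search engine library",
    "the cat sat on the mat",
    "search the web with the search engine",
]
print(bm25_rank(docs, "the search"))
print(bm25_rank(docs, "the search", required="engine"))
```

In this sketch the common word "the" contributes very little to a score while the rarer word "search" contributes more, so documents that are actually about searching come out ahead of the one that merely contains "the". Passing `required` removes non-matching documents before ranking, which mirrors the idea of a Boolean filter applied to a probabilistic search.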

***********************************************************************

Site of the Week:- http://www.livingwithoutmicrosoft.org/

This site (Living without Microsoft, or LWM) is for anyone who wishes to explore realistic alternatives to Microsoft software. The aim is to provide accurate information about, and analyses of, non-Microsoft software and to discuss the benefits and problems one can encounter by adopting it instead of a Microsoft solution. It also provides news on industry and legal developments which may be relevant to anyone making decisions about deploying non-Microsoft software. The goal is to build a resource that is informative, impartial and accessible to non-techies.


That's all for this week. See you next week.

Madhuresh Singhal