This is admittedly similar to (but not a duplicate of) Comparison of full text search engine - Lucene, Sphinx, Postgresql, MySQL?. However, what I am looking for are specific, supported recommendations based on experience with more than one of the available systems (there seems to be a lot of "I've used Lucene, but not Sphinx", and vice versa).
The setup: Standard LAMP (MySQL 5.0, PHP 5).
MySQL: tables use the InnoDB engine for foreign key constraints.
We are looking at indexing data, not pages. The data to be indexed may be in multiple languages (UTF-8 charset).
A number of the comparisons I've come across (like http://blog.evanweaver.com/articles/2008/03/17/rails-search-benchmarks/) are either not entirely applicable (Ferret is a Lucene port, but not the same as Zend_Search_Lucene) or are pushing their own systems/implementations (not exactly unbiased).
Some others I've come across (such as http://whatstheplot.com/blog/tag/lucene/ and http://pagetracer.com/2008/02/15/sphinx-and-lucene-search-engines-first-impressions/) provide very different results for performance of the two systems.
Also, all but ignored in much of what I've read is Xapian. Might this be worth consideration as well?
So... I'm hoping that some of you here on SO have some experience with this question and could help with some recommendations or point me in the right direction.