Solr Search Application

Last modified by Ecaterina Moraru (Valica) on 2017/09/04 14:44

Description

The main objective of this application is to use the Apache Solr search engine as the indexing and search engine for XWiki. XWiki currently uses Lucene as the core component for wiki search. Lucene is somewhat hard to configure and does not support features such as faceted search, hit highlighting, or relevancy tuning through index boosts out of the box. Solr stands out for the minimal configuration it needs: a few libraries and a couple of XML configuration files are sufficient to set up a capable search engine. Configuring multiple languages is also easier in Solr than in Lucene. With Solr, the indexing process can be customized by choosing analyzers with selected tokenizers and filters for the dataset, which produces a highly customizable relevancy index. Through the front end, the user can select the fields to search and configure the weight each field contributes to the document score.
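As an illustration of per-field weights, Solr's dismax-style query parsers accept a `qf` parameter of the form `field^boost`. The sketch below only formats such a string. The helper name `format_qf` and the field names `title` and `comment` are assumptions for illustration, and whether this module passes weights through `qf` is not confirmed by this page.

```python
def format_qf(weights):
    """Format {field: weight} as a Solr qf string such as 'title^3 comment^1.5'.

    Hypothetical helper: field names are illustrative, not taken from the
    application's schema.
    """
    for field, weight in weights.items():
        if weight <= 0:
            raise ValueError(f"weight for {field!r} must be positive")
    # Insertion order of the dict is kept, so the caller controls field order.
    return " ".join(f"{field}^{weight:g}" for field, weight in weights.items())


if __name__ == "__main__":
    print(format_qf({"title": 3, "comment": 1.5}))
```

A higher weight makes matches in that field count more toward the document score.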

Features

 The Solr search module uses an embedded Solr server for indexing, analysis, and retrieval of data. Its features are:

  1. Simple text search
  2. Advanced search using document fields such as title, name, comments, space, and language
  3. Search on objects and properties
  4. Search in attachments, including the text inside readable attachments
  5. Multilingual support
  6. Debug mode
  7. Search filters based on space, type, language, and boost values
  8. Sorting by relevancy, date, or author
  9. Quick search
  10. Admin module to index the whole wiki or a selected space
  11. Indexing status details

Set Up

  1. Get the latest XWiki 4.2-SNAPSHOT or later.

  2. XWiki 4.2-SNAPSHOT ships with the Lucene 3.5 libraries. Delete lucene-analyzer, lucene-core, and lucene-queryparser (version 3.5.0) and xwiki-platform-search-lucene-4.2-milestone-2.jar from xwiki-enterprise-jetty-hsqldb-4.2-SNAPSHOT/webapps/xwiki/WEB-INF/lib.

  3. Download Solr from http://www.apache.org/dyn/closer.cgi/lucene/solr/4.0.0-ALPHA and add the jar files below to XE/webapps/xwiki/WEB-INF/lib.
     Until Solr search is added to xwiki-platform, the files are also hosted on Dropbox for convenience: Solr Lib
    1. apache-solr-core-4.0.0-ALPHA.jar
    2. apache-solr-solrj-4.0.0-ALPHA.jar
    3. apache-solr-velocity-4.0.0-ALPHA.jar
    4. apache-solr-langid-4.0.0-ALPHA.jar
    5. apache-solr-analysis-extras-4.0.0-ALPHA.jar
    6. apache-solr-dataimporthandler-4.0.0-ALPHA.jar
    7. apache-solr-cell-4.0.0-ALPHA.jar
    8. apache-solr-dataimporthandler-extras-4.0.0-ALPHA.jar
    9. apache-solr-uima-4.0.0-ALPHA.jar
    10. apache-solr-clustering-4.0.0-ALPHA.jar

  4. Download Lucene 4.0.0-ALPHA and add the jar files below to XE/webapps/xwiki/WEB-INF/lib.
     Until Solr search is added to xwiki-platform, the files are also hosted on Dropbox for convenience: Lucene Lib
    1. lucene-analyzers-4.0.0-ALPHA.jar
    2. lucene-icu-4.0.0-ALPHA.jar
    3. lucene-phonetic-4.0.0-ALPHA.jar
    4. lucene-spellchecker-4.0.0-ALPHA.jar
    5. lucene-core-4.0.0-ALPHA.jar
    6. lucene-join-4.0.0-ALPHA.jar
    7. lucene-queries-4.0.0-ALPHA.jar
    8. lucene-stempel-4.0.0-ALPHA.jar
    9. lucene-facet-4.0.0-ALPHA.jar
    10. lucene-kuromoji-4.0.0-ALPHA.jar
    11. lucene-queryparser-4.0.0-ALPHA.jar
    12. lucene-grouping-4.0.0-ALPHA.jar
    13. lucene-memory-4.0.0-ALPHA.jar
    14. lucene-remote-4.0.0-ALPHA.jar
    15. lucene-highlighter-4.0.0-ALPHA.jar
    16. lucene-misc-4.0.0-ALPHA.jar
    17. lucene-spatial-4.0.0-ALPHA.jar

  5. Add the Google Gson library to XE/webapps/xwiki/WEB-INF/lib. Indexing uses JSON communication to update the progress bar. Download it from http://code.google.com/p/google-gson/downloads/list

  6. Add the spatial4j library file to XE/webapps/xwiki/WEB-INF/lib. Download it from http://repo1.maven.org/maven2/com/spatial4j/spatial4j/0.2/spatial4j-0.2.jar

  7. Add the following configuration to xwiki.properties. It selects the search backend that wiki search uses for indexing and query retrieval, and sets the Solr home directory:
       search.backend=solrj
       search.solr.home='/path/to/solr/home/'


  8. Build the code and copy xwiki-platform-search-api-4.2-SNAPSHOT.jar and xwiki-platform-search-solrj-4.2-SNAPSHOT.jar to the XWiki Enterprise lib directory.
    Note: the code is not yet stable and has a few Checkstyle errors, so build with -Dxwiki.checkstyle.skip=true

  9. Start the server using start_xwiki.sh or start_xwiki.bat
  10. Save and view the search page at http://localhost:8080/xwiki/bin/view/Main/AdvancedSearch and start searching.
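Because the setup involves deleting and copying many jars by hand, a quick check of the lib directory before starting the server can save a restart. The following is a minimal sketch, not part of the application: the function name `check_lib_dir` is hypothetical, and `REQUIRED_JARS` lists only a few of the jars from the steps above, so extend it with the full lists as needed.

```python
from pathlib import Path

# A subset of the jars the setup steps ask for (illustrative, not exhaustive).
REQUIRED_JARS = (
    "apache-solr-core-4.0.0-ALPHA.jar",
    "apache-solr-solrj-4.0.0-ALPHA.jar",
    "lucene-core-4.0.0-ALPHA.jar",
    "lucene-queryparser-4.0.0-ALPHA.jar",
    "spatial4j-0.2.jar",
)


def is_stale(jar_name):
    """True for jars that step 2 says to delete."""
    old_lucene = jar_name.startswith("lucene-") and "3.5.0" in jar_name
    old_search = jar_name.startswith("xwiki-platform-search-lucene-")
    return old_lucene or old_search


def check_lib_dir(lib_dir):
    """Return (missing, stale): required jars not found, old jars still present."""
    present = {p.name for p in Path(lib_dir).glob("*.jar")}
    missing = sorted(j for j in REQUIRED_JARS if j not in present)
    stale = sorted(n for n in present if is_stale(n))
    return missing, stale


if __name__ == "__main__":
    import sys
    missing, stale = check_lib_dir(sys.argv[1] if len(sys.argv) > 1 else ".")
    print("missing:", missing)
    print("stale:", stale)
```

Run it with the path to WEB-INF/lib as the only argument. Both lists should be empty before the server is started.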

Search Actions

Simple Search

Search for simple text: water

This returns the results with the matching text highlighted. Faceted search is enabled on space, lang, object, date, creation date, author, and creator.
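In Solr terms, highlighting and faceting are switched on with the standard request parameters `hl`, `facet`, and `facet.field`. The sketch below assembles such a parameter string for a simple search. The helper name `simple_search_params` and the facet field names are assumptions, since the actual names depend on the application's schema, and the embedded server may receive these parameters through its API instead of a URL.

```python
from urllib.parse import urlencode


def simple_search_params(text, facet_fields=("space", "lang", "author")):
    """Build Solr request parameters for a highlighted, faceted text search.

    Hypothetical helper: the facet field names are illustrative.
    """
    params = [("q", text), ("hl", "true"), ("facet", "true")]
    # facet.field may be repeated, once per field to facet on.
    params += [("facet.field", name) for name in facet_fields]
    return urlencode(params)


if __name__ == "__main__":
    print(simple_search_params("water"))
```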

Advanced Search

  1. Search for text in space health
  2. Search for text in comments: comment:ipsum
  3. Filtered search

The filtered search allows users to filter by space, language, type, and file type, and to set a query boost.

Filter Search

Multilingual Search

Multilingual search currently supports five languages: English (en), French (fr), Spanish (es), Czech (cs), and German (de).

  1. Search for a French word: Atmosph*
  2. Search for an English word when French is active: Water lang:en
  3. Search for a Spanish word: Agua lang:es
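The `lang:` restriction in the examples above can be added programmatically. This is a small sketch under the assumption that the query is plain text followed by an optional `lang:<code>` term, as shown in the list. The helper name `with_language` is hypothetical.

```python
# Language codes listed in this section.
SUPPORTED_LANGS = {"en", "fr", "es", "cs", "de"}


def with_language(text, lang=None):
    """Append a lang:<code> restriction to a query, e.g. 'Agua lang:es'."""
    if lang is None:
        # No restriction: the currently active language applies.
        return text
    if lang not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language code: {lang!r}")
    return f"{text} lang:{lang}"


if __name__ == "__main__":
    print(with_language("Water", "en"))
```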

Debug Search Queries

Advanced search displays debug information for search queries and their results. It gives insight into the query parser, the physical query, and the field boosts used for the search. For each result, it also shows how the score was calculated.

  1. Debug query
    debug_query.png
  2. Debug search result
    debug_search_result.png
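Solr itself exposes this kind of information when a request carries `debugQuery=true`: the JSON response then has a `debug` section with keys such as `QParser`, `parsedquery`, and `explain`. The sketch below extracts those parts from an already parsed response. It assumes that standard response layout, the helper name `summarize_debug` is hypothetical, and this page does not confirm that the application's debug view is built this way.

```python
def summarize_debug(response):
    """Extract parser name, parsed query, and score explanations.

    `response` is a Solr JSON response (already parsed into a dict) that was
    requested with debugQuery=true. Missing keys yield None or an empty dict.
    """
    debug = response.get("debug", {})
    return {
        "parser": debug.get("QParser"),
        "parsed_query": debug.get("parsedquery"),
        # One explanation per returned document, keyed by document id.
        "explain": dict(debug.get("explain", {})),
    }
```

The `explain` entries correspond to the per-result score calculation shown in the debug view.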

 

Administration

 The following links to the Admin module are available in the wiki administration:

Solr Search: information about Solr, Lucene, the Solr server core container, configuration files, update handlers, query handlers, cache, and highlighting.

Indexing: lets the admin index selected spaces and wikis, and shows the indexing status.

Schema.xml: shows the Schema.xml file.

SolrConfig.xml: shows the SolrConfig.xml file.

The credentials to log in to the running instance are given below.

http://savitha.hoplahup.net/xwiki/bin/admin/XWiki/XWikiPreferences

Login ID: Admin

Password: admin
