Author: Swatantra Kumar
Swatantra is an engineering leader with a successful record of building, nurturing, managing, and leading a multi-disciplinary, diverse, and distributed team of engineers and managers who develop and deliver solutions. Professionally, he oversees solution design, development, and delivery, cloud transition, IT strategies, technical and organizational leadership, TOM, IT governance, digital transformation, innovation, stakeholder management, management consulting, and technology vision and strategy. When he's not working, he enjoys reading about and working with new technologies, and trying to get his friends to adopt new web trends. He has written, co-written, and published many articles in international journals on topics including Open Source, Networks, Low-Code, Mobile Technologies, and Business Intelligence. During his graduation days, he proposed an information management system at the university level.
Solr
1. Web service: Solr exposes Lucene over HTTP, so programs written in any language can invoke Lucene
2. XML-based schema for managing indexed fields and their characteristics
3. System administration tools for configuration, data loading, index replication, statistics, logging and cache management
4. Large scale distributed search
5. Fixed/paid result list placement
6. Faceting: the dynamic clustering of items or search results into categories, which lets users drill into search results (or even skip searching entirely) by any value in any field, as seen on popular ecommerce sites such as Amazon
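To make the web-service and faceting points above concrete, here is a minimal Python sketch of a faceted request. The host, port, core name (`products`), and field names (`brand`, `category`) are assumptions made up for illustration; `q`, `rows`, `wt`, `facet`, and `facet.field` are standard Solr request parameters. The sample facet list at the end is invented data shaped like a flat value/count list, not real Solr output.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: host, port, and core name are assumptions.
SOLR_SELECT_URL = "http://localhost:8983/solr/products/select"


def build_facet_url(query, facet_fields, rows=10):
    """Build a select URL that asks Solr for facet counts on the given fields."""
    params = [
        ("q", query),
        ("rows", rows),
        ("wt", "json"),      # ask for a JSON response
        ("facet", "true"),   # turn faceting on
    ]
    # facet.field may be repeated, once per field to facet on.
    params.extend(("facet.field", field) for field in facet_fields)
    return SOLR_SELECT_URL + "?" + urlencode(params)


def facet_counts_to_dict(flat_counts):
    """Fold a flat [value, count, value, count, ...] list into a dict."""
    return dict(zip(flat_counts[0::2], flat_counts[1::2]))


url = build_facet_url("laptop", ["brand", "category"])
print(url)

# Invented sample data, only to show the shape of the conversion.
sample = ["acme", 12, "globex", 7]
print(facet_counts_to_dict(sample))
```

Because the request is just a URL, the same call could be made from any language with an HTTP client, which is the point of item 1 above.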
Feature List of Solr
1) Faceted search
2) Full-text search
3) Hit highlighting
4) Dynamic clustering
5) Sorting
6) Filtering
7) Spell checking
8) Elevation
9) Boosting at index and query time
10) “Did you mean” spell checking
11) Finding Documents that are “More like this”
12) Overriding search results based on editorial input (also known as paid placement)
13) Term
14) Term Frequency
15) Position (based on analysis)
16) Offset (character based)
17) IDF – Inverse Document Frequency
18) CopyField functionality allows indexing a single field multiple ways, or combining multiple fields into a single searchable field
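The term-level items in the list above (term, term frequency, position, offset, IDF) can be illustrated with a small Python sketch. This is a toy under stated assumptions, not how Solr or Lucene implement it: the corpus is made up, the "analysis" is plain whitespace splitting plus lowercasing, and the IDF formula is the textbook ln(N / df) rather than any specific Lucene similarity.

```python
import math
import re
from collections import Counter

# Made-up toy corpus for illustration only.
corpus = {
    "doc1": "Solr search server",
    "doc2": "Lucene search library search",
    "doc3": "Solr uses Lucene",
}


def analyze(text):
    """Toy analyzer: returns (term, position, start_offset, end_offset) tuples."""
    return [
        (match.group().lower(), position, match.start(), match.end())
        for position, match in enumerate(re.finditer(r"\S+", text))
    ]


def term_frequencies(text):
    """Count how often each term occurs in one document."""
    return Counter(term for term, *_ in analyze(text))


def inverse_document_frequency(term, docs):
    """Textbook IDF, ln(N / df); returns 0.0 for terms that occur nowhere."""
    df = sum(1 for text in docs.values() if term in term_frequencies(text))
    if df == 0:
        return 0.0
    return math.log(len(docs) / df)


for term, position, start, end in analyze(corpus["doc2"]):
    print(term, position, start, end)

print(term_frequencies(corpus["doc2"])["search"])
print(inverse_document_frequency("search", corpus))
```

Position counts tokens, while offsets count characters, which is why the list above describes position as "based on analysis" and offset as "character based".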
Query
1. HTTP interface with configurable response formats (XML/XSLT, JSON, Python, Ruby)
2. Sort by any number of fields
3. Advanced DisMax query parser for high relevancy results from user-entered queries
4. Highlighted context snippets
5. Faceted Searching based on unique field values and explicit queries
6. Spelling suggestions for user queries
7. More Like This suggestions for given document
8. Constant scoring range and prefix queries – no idf, coord, or lengthNorm factors, and no restriction on the number of terms the query matches.
9. Function Query – influence the score by a function of a field’s numeric value or ordinal
10. Date Math – specify dates relative to “NOW” in queries and updates
11. Performance Optimizations
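Several of the query features listed above can be combined in one request. The Python sketch below assembles such a parameter set; it is only an illustration, and the field names (`title`, `body`, `published`) are hypothetical. The parameter names (`defType`, `qf`, `fq`, `sort`, `hl`, `hl.fl`, `wt`) and the `NOW-7DAYS` date math syntax are standard Solr.

```python
from urllib.parse import urlencode


def build_dismax_params(user_text, days_back=7):
    """Assemble request parameters for a DisMax query over hypothetical fields."""
    return {
        "q": user_text,                      # raw user-entered text
        "defType": "dismax",                 # use the DisMax query parser
        "qf": "title^2.0 body",              # search both fields, weight title higher
        # Date math: restrict to documents published in the last N days.
        "fq": f"published:[NOW-{days_back}DAYS TO NOW]",
        "sort": "score desc, published desc",  # sort by more than one field
        "hl": "true",                        # highlighted context snippets
        "hl.fl": "body",                     # field to build snippets from
        "wt": "json",                        # response format
    }


params = build_dismax_params("solr faceting")
print(urlencode(params))
```

Keeping the parameters in a plain dictionary makes it easy to switch the response format or the date window without touching the rest of the request.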
Cache in Solr
Admin Interface
1. Comprehensive statistics on cache utilization, updates, and queries
2. Interactive schema browser that includes index statistics
3. Replication monitoring
4. Full logging control
5. Text analysis debugger, showing the result of every stage in an analyzer
6. Web query interface with debugging output:
   o parsed query output
   o Lucene explain() document score detailing
   o explain score for documents outside of the requested range, to debug why a given document wasn’t ranked higher