Planning for Search Project

Current Search Projects

  • Code to remove two-stage search should be available for 3.0. End results of this project:
    • More complete set of search results, particularly for large consortia.
    • Improved performance for search, particularly for narrow searches. Broad searches (1,000,000 hits or above) will not see much improvement, but also won't be slower.
  • Code to change the search and display infrastructure should be available in 3.1. End results of this project:
    • Removal of the keyword blob, which should improve field-by-field weighting of search results in relevance ranking for general keyword searches.
    • Highlighted search terms

Overriding Question

Do we need to revisit the question of what is used for our search infrastructure before we proceed with specific projects? Or can we proceed with spec development for one or two major quality-of-life improvements before evaluating the effect of the above projects?

  • In 2015-2016, as a community, we discussed search improvements we would like to see in Evergreen.
  • Part of the discussion raised the question of whether Evergreen should move to a Solr-based search, which is generally faster. Making such a move would be a long-term project that requires building a new search infrastructure from the ground up.
  • One reason for proceeding with the current projects was to do some major infrastructure work with our current PostgreSQL search to determine if we could get the type of performance gains that bring us close to a Solr-based search.
  • The ability to improve relevance-ranking in the current PostgreSQL search was also part of the question of moving to Solr. 
  • We will need to see the above code working in production for some time before we can determine if the improvements bring us to a level where we are confident that PostgreSQL can give us the search we need, even if it may take a little more tweaking over the next few Postgres releases before it's really there.
  • If we wait until we see that code working in production before proceeding with other search projects, it will significantly lengthen the time before our libraries can see something like Did You Mean? functionality.

Specific Search Projects to Explore

During the May 2016 Equinox meeting where the current search projects were planned, we laid out a search roadmap that consisted of the following next steps:

  • Incorporating PostgreSQL's string proximity into search to improve relevance ranking and to lay the groundwork for the next project.
  • Did You Mean? functionality
  • There was also talk of a possible project to incorporate record quality into the relevance-ranking algorithm.
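
To make the first two roadmap items more concrete, here is a minimal sketch of how string proximity could drive Did You Mean? suggestions. It assumes that "string proximity" refers to trigram similarity of the kind PostgreSQL's pg_trgm extension provides; this page does not say which feature was intended. The function names, the sample vocabulary, and the use of 0.3 as a cutoff (pg_trgm's default similarity threshold) are illustrative assumptions, and the Python below only approximates the idea rather than reproducing Evergreen or PostgreSQL code.

```python
import re


def trigrams(text):
    """Return the set of three-character sequences for each word in text.

    Words are lowercased and padded with two leading spaces and one
    trailing space, following the padding described in the pg_trgm docs.
    """
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def did_you_mean(query, vocabulary, threshold=0.3):
    """Return vocabulary terms similar to query, best match first.

    `vocabulary` is a hypothetical list of indexed terms; a real system
    would draw candidates from the search index, not a Python list.
    """
    scored = [(similarity(query, term), term) for term in vocabulary]
    candidates = [(s, t) for s, t in scored if s >= threshold and t != query]
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in candidates]


if __name__ == "__main__":
    terms = ["shakespeare", "shakers", "speare"]
    print(did_you_mean("shakespear", terms))
```

The same similarity score could also serve as one input to relevance ranking, which is why the roadmap treats proximity as groundwork for Did You Mean?.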

See also the search ideas that came out of the community focus groups held in 2016.


This work is licensed under an Attribution Share Alike Creative Commons license