In-doc search UI
Giving readers the ability to easily search the information
that they are looking for is important for us.
We have already upgraded to the latest version of Elasticsearch and
we plan to implement search as you type
feature for all the documentations hosted by us.
It will be designed to provide instant results as soon as the user starts
typing in the search bar with a clean and minimal frontend.
This design document aims to provides the details of it.
This is a GSoC’19 project.
Warning
This design document details future features that are not yet implemented. To discuss this document, please get in touch in the issue tracker.
The final result may look something like this:
Goals And non-Goals
Project goals
Support a search-as-you-type/autocomplete interface.
Support across all (or virtually all) Sphinx themes.
Support for the JavaScript user experience down to IE11 or graceful degradation where we can’t support it.
Project maintainers should have a way to opt-in/opt-out of this feature.
(Optional) Project maintainers should have the flexibility to change some of the styles using custom CSS and JS files.
Non-goals
For the initial release, we are targeting only Sphinx documentations as we don’t index MkDocs documentations to our Elasticsearch index.
Existing search implementation
We have a detailed documentation explaining the underlying architecture of our search backend and how we index documents to our Elasticsearch index. You can read about it here.
Proposed architecture for in-doc search UI
Frontend
Technologies
Frontend is to designed in a theme agnostics way. For that, we explored various libraries which may be of use but none of them fits our needs. So, we might be using vanilla JavaScript for this purpose. This will provide us some advantages over using any third party library:
Better control over the DOM.
Performance benefits.
Proposed architecture
We plan to select the search bar, which is present in every theme,
and use the querySelector() method of JavaScript.
Then add an event listener to it to listen for the changes and
fire a search query to our backend as soon as there is any change.
Our backend will then return the suggestions,
which will be shown to the user in a clean and minimal UI.
We will be using document.createElement() and node.removeChild() method
provided by JavaScript as we don’t want empty <div>
hanging out in the DOM.
We have a few ways to include the required JavaScript and CSS files in all the projects:
Add CSS into
readthedocs-doc-embed.css
and JS intoreadthedocs-doc-embed.js
and it will get included.Package the in-doc search into it’s own self-contained CSS and JS files and include them in a similar manner to
readthedocs-doc-embed.*
.It might be possible to package up the in-doc CSS/JS as a sphinx extension. This might be nice because then it’s easy to enable it on a per-project basis. When we are ready to roll it out to a wider audience, we can make a decision to just turn it on for everybody (put it in here) or we could enable it as an opt-in feature like the 404 extension.
UI/UX
We have two ways which can be used to show suggestions to the user.
Show suggestions below the search bar.
Open a full page search interface when the user click on search field.
Backend
We have a few options to support search as you type
feature,
but we need to decide that which option would be best for our use-case.
Edge NGram Tokenizer
Pros
More effective than Completion Suggester when it comes to autocompleting words that can appear in any order.
It is considerable fast because most of the work is being done at index time, hence the time taken for autocompletion is reduced.
Supports highlighting of the matching terms.
Cons
Requires greater disk space.
Completion suggester
Pros
Really fast as it is optimized for speed.
Does not require large disk space.
Cons
Matching always starts at the beginning of the text. So, for example, “Hel” will match “Hello, World” but not “World Hello”.
Highlighting of the matching words is not supported.
According to the official docs for Completion Suggester, fast lookups are costly to build and are stored in-memory.
Milestones
Milestone |
Due Date |
---|---|
A local implementation of the project. |
12th June, 2019 |
In-doc search on a test project hosted on Read the Docs using the RTD Search API. |
20th June, 2019 |
In-doc search on docs.readthedocs.io. |
20th June, 2019 |
Friendly user trial where users can add this on their own docs. |
5th July, 2019 |
Additional UX testing on the top-10 Sphinx themes. |
15th July, 2019 |
Finalize the UI. |
25th July, 2019 |
Improve the search backend for efficient and fast search results. |
10th August, 2019 |
Open questions
Should we rely on jQuery, any third party library or pure vanilla JavaScript?
Are the subprojects to be searched?
Is our existing Search API is sufficient?
Should we go for edge ngrams or completion suggester?