Search algorithm

How Search Works in Elium.

Updated over 4 weeks ago

🔍 How Search Works in Elium

Elium uses Elasticsearch, a powerful search engine that ranks results based on how relevant they are to your query. This page explains how the search scoring works, what influences result order, and how to get the most accurate results.

⚙️ How Is a Result Ranked?

When you run a search, each content item (article, comment, attachment, etc.) gets a relevance score. This score determines its position in the results list.

The final score of a document is the sum of scores from each matching field.

Each field score is calculated based on:

Field Score = Boost × IDF × TF

Here’s what each part means:

  • Boost: A manual importance factor applied to each field (e.g., title^2.0 makes title matches more important).

  • IDF (Inverse Document Frequency): Words that are rare across the entire knowledge base are treated as more meaningful and score higher than common words.

  • TF (Term Frequency): The more often a word appears in the field of the document, the more relevant it is.

Once each field is scored, the overall document score is the sum of all field scores. This favors documents that match across multiple fields, not just one.
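The scoring rule above can be sketched in a few lines of Python. This is a simplified model of the formula as stated on this page, not Elasticsearch's actual scoring code. The function name and the IDF and TF numbers are hypothetical, chosen only for illustration.

```python
def document_score(field_matches):
    """Sum Boost x IDF x TF over every match in the document.

    field_matches: list of (boost, idf, tf) tuples, one per matching field.
    """
    return sum(boost * idf * tf for boost, idf, tf in field_matches)


# Hypothetical values: one match in a title, one in a comment.
matches = [
    (2.0, 1.5, 1.0),  # title match: boost 2.0, word appears once
    (0.8, 1.5, 2.0),  # comment match: boost 0.8, word appears twice
]
score = document_score(matches)
print(f"{score:.1f}")
```

Because the field scores are added together, the comment match still raises the total even though its boost is low.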

🧪 Example

Let’s say the user searches for "refund card".

Assume a document contains:

  • “refund” once in the title (boost: 1.6)

  • “refund” and “card” in a comment (boost: 0.8)

The engine calculates:

  • title_score = 1.6 × IDF(refund) × TF(refund in title)

  • comment_score = 0.8 × [IDF(refund) × TF(refund in comment) + IDF(card) × TF(card in comment)]

  • total_score = title_score + comment_score

💡 So a document that includes the keywords in multiple places may outrank another that only contains them in one place, even if that one place is highly boosted.
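To make the point above concrete, here is a small sketch that compares the example document with a second, hypothetical document that contains both words in its title only. The IDF values are invented for illustration; real values depend on the whole knowledge base.

```python
# Hypothetical IDF values, not taken from any real index.
IDF = {"refund": 2.0, "card": 1.5}


def field_score(boost, term_counts):
    """Boost x the sum of IDF x TF for each query term found in the field."""
    return boost * sum(IDF[term] * tf for term, tf in term_counts.items())


# Document A (the example above): "refund" in the title, both words in a comment.
doc_a = field_score(1.6, {"refund": 1}) + field_score(0.8, {"refund": 1, "card": 1})

# Document B (hypothetical): both words in the title only.
doc_b = field_score(1.6, {"refund": 1, "card": 1})

print(f"A = {doc_a:.1f}, B = {doc_b:.1f}")
```

With these made-up numbers, document A scores 6.0 and document B scores 5.6, so the document matching in two fields ranks first even though the comment field has the lowest boost.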

🧩 Which Fields Are Searched?

When searching, Elasticsearch looks across multiple fields of the content and applies different boost levels to each. For example:

Field                  Boost
title                  1.6
title.unstemmed        2.0
tags.name              1.4
tags.name.unstemmed    1.7
comments               0.8
attachments            1.0

Unstemmed fields match only the exact word form, without word variations (e.g., "refund" will not match "refunding"). They carry a higher boost than their stemmed counterparts, so exact matches score higher.

Each matching field contributes to the total score of the document.
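The difference between a stemmed and an unstemmed field can be illustrated with a toy example. The `toy_stem` function below is a deliberately crude stand-in written for this sketch; it is not the stemmer Elasticsearch or Elium uses, and all function names are hypothetical.

```python
def toy_stem(word):
    """Rough stand-in for a stemmer: lowercase and strip a few English suffixes."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def matches_stemmed(query, text):
    """Compare word stems, as a stemmed field such as title would."""
    return toy_stem(query) in {toy_stem(w) for w in text.split()}


def matches_unstemmed(query, text):
    """Compare exact word forms, as an unstemmed field such as title.unstemmed would."""
    return query.lower() in {w.lower() for w in text.split()}


title = "Refunding a card payment"
print(matches_stemmed("refund", title), matches_unstemmed("refund", title))
```

Here "refund" matches the title through the stemmed comparison only, so the document would earn the title boost but not the higher title.unstemmed boost.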

⏳ Does Content Freshness Matter?

Yes. Elium also applies time-based scoring to prioritize more recently updated content, without fully ignoring older resources.

This is done through a decay function, which gradually reduces the relevance score of a document as its last update date gets older:

  • Content updated within the last 30 days keeps its full score.

  • After that, the score decays gradually, dropping to 33% of its original value after 1 year.

  • Older documents can still appear if they are very relevant, but newer content is generally favored.

This ensures users see up-to-date answers first, while preserving the value of long-lasting knowledge.
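One way to picture the freshness rule is as a multiplier applied to the relevance score. The sketch below assumes an exponential curve, which is only one shape that fits the two figures given above (full score up to 30 days, 33% at 1 year); the actual curve Elium uses is not stated on this page, and the function name is hypothetical.

```python
import math

FULL_SCORE_DAYS = 30      # from the text: no penalty within the last 30 days
ONE_YEAR_DAYS = 365
SCORE_AT_ONE_YEAR = 0.33  # from the text: 33% after 1 year


def freshness_multiplier(days_since_update):
    """Return the factor applied to the relevance score for a given age in days."""
    # Only the days beyond the 30-day grace period count toward the decay.
    overdue = max(0, days_since_update - FULL_SCORE_DAYS)
    # Assumed exponential rate, chosen so the multiplier is 0.33 at day 365.
    rate = math.log(SCORE_AT_ONE_YEAR) / (ONE_YEAR_DAYS - FULL_SCORE_DAYS)
    return math.exp(rate * overdue)


for days in (10, 30, 180, 365):
    print(days, round(freshness_multiplier(days), 2))
```

Under this assumption the multiplier never reaches zero, which is consistent with older documents still appearing when they are very relevant.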

🌐 What About Multiple Languages?

If your platform is multilingual, the search engine automatically searches translated fields based on your interface language.

For example, if you’re browsing in French, it will include fields like title_trans.fr and tags.translations.fr.

These translated fields also have their own boosts to ensure they're treated fairly in scoring.
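As a rough illustration, the list of searched fields could be extended per interface language as shown below. The sketch assumes that the naming pattern of the French fields mentioned above applies to other languages too, and it leaves out boosts for the translated fields because this page does not state them. The function and constant names are hypothetical.

```python
BASE_FIELDS = ["title", "tags.name"]


def fields_for_language(lang):
    """Return the base fields plus the translated variants for one interface language."""
    # Assumed pattern, generalized from title_trans.fr and tags.translations.fr.
    translated = [f"title_trans.{lang}", f"tags.translations.{lang}"]
    return BASE_FIELDS + translated


print(fields_for_language("fr"))
```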

🔄 How Are Keywords Interpreted?

Elium uses a query string parser, which allows for advanced matching and search flexibility:

  • It searches across multiple fields, each with its own weight

  • It uses AND by default: all words in the query must be present
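For readers familiar with Elasticsearch, a query of this kind might look like the request body below. It uses the standard query_string options (`fields` with the `field^boost` syntax and `default_operator`) together with the boosts listed on this page. The exact request Elium sends is not documented here, so treat this as an approximation; `build_query` is a hypothetical helper.

```python
import json

# Field boosts as listed on this page, in Elasticsearch's "field^boost" syntax.
FIELDS = [
    "title^1.6",
    "title.unstemmed^2.0",
    "tags.name^1.4",
    "tags.name.unstemmed^1.7",
    "comments^0.8",
    "attachments^1.0",
]


def build_query(user_input):
    """Build a query_string request body that requires every word to match."""
    return {
        "query": {
            "query_string": {
                "query": user_input,
                "fields": FIELDS,
                "default_operator": "AND",
            }
        }
    }


body = build_query("refund card")
print(json.dumps(body, indent=2))
```

With `default_operator` set to AND, a document containing only "refund" would not match the query "refund card".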

⚠️ Stop Words: Common Words May Be Ignored

Some very common words (like "the", "in", "on", "and", etc.) are automatically removed from queries. These are called stop words; removing them reduces noise and improves relevance.

However, this can be problematic when a user is intentionally searching for a word that is both:

  • common in the language, and

  • meaningful in context (e.g., “On” as a system name or acronym)

Note: There is no way to force the inclusion of a stop word, even when placing it in quotes.
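The effect can be sketched with a toy filter. The stop word list below contains only the examples named above, not Elium's full list, and the function is a hypothetical illustration rather than the real analyzer.

```python
STOP_WORDS = {"the", "in", "on", "and"}  # only the examples named above


def strip_stop_words(query):
    """Drop stop words from a query, ignoring case and surrounding quotes."""
    words = [w.strip('"') for w in query.split()]
    return [w for w in words if w.lower() not in STOP_WORDS]


print(strip_stop_words("refund on card"))
print(strip_stop_words('"On"'))
```

In this sketch a search for "On" alone, quoted or not, leaves no terms to match, which is why such a query cannot find a system named "On".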

🤔 Why Doesn't My Document Appear?

Here are some common reasons a document might not show up in search results:

  1. The keywords don’t appear in any of the indexed fields

  2. The words used are too common (low IDF)

  3. The matches are in low-boosted fields

  4. The word is ignored as a stop word

  5. The document is poorly structured (e.g., generic title)

✅ Best Practices to Improve Searchability

To ensure your content appears when it should:

  1. Use clear, descriptive titles

  2. Add meaningful tags

  3. Put key information in the main content, not just in comments

  4. Translate titles and tags when needed

  5. Avoid overly generic titles made up only of common words

  6. Use quotes when searching for exact words
