Getting random documents from ElasticSearch in a native way without code
I am using Elasticsearch (ES) as my backend NoSQL DB.
One of the tasks I had today was to fetch a number of random documents from a specific index/type.
My initial thought was to fetch the number of documents matching my criteria and then fetch a random page within that range, but this would require 2 round trips (although quick, under 50 ms) and some code.
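For comparison, the two-round-trip idea could look roughly like the sketch below. It only covers the client-side part: the helper name random_page_offset and the example count of 1042 are made up for illustration, and the count itself is assumed to come from a first request (for example the _count API) that is not shown.

```python
import random


def random_page_offset(total_hits, page_size, rng=random):
    """Pick the `from` offset of a random page among `total_hits` matches."""
    if total_hits <= 0 or page_size <= 0:
        return 0
    # Number of pages, counting a final partial page.
    page_count = (total_hits + page_size - 1) // page_size
    page = rng.randrange(page_count)
    return page * page_size


# Round trip 1 (not shown) would return the number of matching documents.
# Round trip 2 would then search with "from" set to the chosen offset.
offset = random_page_offset(total_hits=1042, page_size=10)
search_body = {"query": {"match_all": {}}, "from": offset, "size": 10}
```

Note that this returns a run of neighbouring documents from one random page, not documents picked independently of each other.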
That felt awkward; I wanted a way that uses the native capabilities of ES. After 15 minutes of searching and reading I found a way to do it!
The approach is based on the fact that the query syntax lets me tell ES how to calculate the score of each document (this is done using function_score).
The nice thing is that, by default, results are sorted by score in descending order, so when I ask for X documents I actually get X random documents :)
A sample query is below:
{
  "query": {
    "function_score": {
      "query": {
        "match_all": {}
      },
      "functions": [
        {
          "random_score": {}
        }
      ]
    }
  }
}
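If you do end up building the request from code, the same body can be assembled as a plain dictionary. The sketch below is only an illustration: build_random_search is a hypothetical helper, not part of any ES client, and it adds the standard top-level "size" field to ask for X documents.

```python
import json


def build_random_search(size, base_query=None):
    """Build a search body that returns `size` randomly scored documents."""
    # Fall back to match_all when no filtering criteria are given.
    base = base_query if base_query is not None else {"match_all": {}}
    return {
        "size": size,
        "query": {
            "function_score": {
                "query": base,
                "functions": [{"random_score": {}}],
            }
        },
    }


# Body to send to the _search endpoint of the index being queried.
body = build_random_search(5)
print(json.dumps(body, indent=2))
```

Passing a different base_query (any query that expresses your criteria) limits the random pick to the matching documents.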
The "random_score" function computes a random number between 0 and 1 for each document, so the scores will be in the range 0-1.
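To see why sorting by a random score gives a random pick, here is a small local simulation of the idea. It does not talk to ES at all; the function name sample_by_random_score and the document ids are invented for the example.

```python
import random


def sample_by_random_score(doc_ids, size, rng=random):
    """Give every document a random score, sort descending, keep the top `size`."""
    scored = [(rng.random(), doc_id) for doc_id in doc_ids]
    # Highest score first, mirroring the default sort by score.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:size]]


docs = ["doc-%d" % i for i in range(100)]
picked = sample_by_random_score(docs, 3)
```

Every run assigns fresh scores, so the top 3 change each time and no document is favoured.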
Be aware that using this feature loads field data for _uid, which can be a memory-intensive operation since the values are unique.
- In my scenario I can predict the maximum size of the _uid field data and can hold it in memory.
Thanks wikimedia for the image
Reviewed by Ran Davidovitz on 9:59 AM
