I am confused about the near-real-time search capabilities of Solr and Elasticsearch. Near-real-time search is often cited as one of the advantages Elasticsearch has over Solr. However, I have read Solr documentation saying that near-real-time search is also possible in Solr by using a soft commit, at the cost of opening a new searcher. With that approach, a new document becomes visible within 1 second. In Elasticsearch, a refresh likewise makes a new document searchable within one second. Did I miss or misunderstand anything? Which one does better at near-real-time search? Any answer would be appreciated. Thank you.
Solr vs Elasticsearch on near real time search
3.1k views Asked by Kuang Lu At
There is 1 answer
At the end of the day, they both use Lucene under the hood. Near-real-time search in Lucene means reopening the index reader, which Elasticsearch calls a refresh and exposes through the refresh API.
On the other hand, you also need to commit the Lucene index to get durability, and a commit is expensive and cannot be done every second. That is why Elasticsearch has a transaction log: it is what makes Elasticsearch "kill -9 safe", and it also allows for real-time get.
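To make the split between visibility and durability concrete, here is a toy in-memory model in Python. It is not Lucene or Elasticsearch code; the class `ToyIndex` and all of its method names are made up for illustration, and it assumes the simplest possible behaviour: a refresh only changes what searches can see, a flush (commit) only changes what survives a crash, and the transaction log covers everything in between.

```python
class ToyIndex:
    """Toy model of refresh vs. commit. Hypothetical, not a real library API."""

    def __init__(self):
        self.committed = {}  # stands in for segments safely committed to disk
        self.buffer = {}     # indexed but not yet committed (lost on a crash)
        self.visible = {}    # snapshot the current reader/searcher can see
        self.translog = []   # durable log of operations since the last commit

    def index(self, doc_id, doc):
        # Every write goes to the transaction log first, then to the buffer.
        self.translog.append((doc_id, doc))
        self.buffer[doc_id] = doc

    def get(self, doc_id):
        # "Real-time get": look up by id without waiting for a refresh.
        if doc_id in self.buffer:
            return self.buffer[doc_id]
        return self.committed.get(doc_id)

    def search(self, predicate):
        # Searches only see the snapshot taken at the last refresh.
        return sorted(i for i, d in self.visible.items() if predicate(d))

    def refresh(self):
        # Cheap: reopen the reader so buffered documents become searchable.
        self.visible = {**self.committed, **self.buffer}

    def flush(self):
        # Expensive: commit buffered documents, then truncate the log.
        self.committed.update(self.buffer)
        self.buffer.clear()
        self.translog.clear()

    def crash_and_recover(self):
        # The buffer is lost; committed data plus the replayed log remain.
        recovered = ToyIndex()
        recovered.committed = dict(self.committed)
        for doc_id, doc in self.translog:
            recovered.index(doc_id, doc)
        recovered.refresh()
        return recovered
```

In this sketch a document indexed but not yet refreshed is returned by `get` yet absent from `search`, and a document that was never flushed still comes back after `crash_and_recover` because the log is replayed.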
But the best part, to me, is that in Elasticsearch the user doesn't have to worry much about refreshes and commits, as everything happens automatically under the hood by default. At the same time, there are APIs (refresh and flush) as well as settings that let power users change the default behaviour.
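As a rough sketch of what those knobs look like over HTTP, the Python helpers below only build the requests (method, URL, body) and do not send anything. The host names, ports, the index name `articles`, and the helper function names are assumptions for illustration. The Elasticsearch endpoints `_refresh`, `_flush`, and the `index.refresh_interval` setting are the ones referred to above; the Solr `softCommit=true` update parameter is the counterpart mentioned in the question.

```python
import json
from urllib.parse import quote, urlencode

ES_BASE = "http://localhost:9200"         # assumed Elasticsearch address
SOLR_BASE = "http://localhost:8983/solr"  # assumed Solr address


def es_refresh(index):
    """Request that makes recently indexed documents searchable now."""
    return ("POST", f"{ES_BASE}/{quote(index)}/_refresh", None)


def es_flush(index):
    """Request that triggers a Lucene commit for the index."""
    return ("POST", f"{ES_BASE}/{quote(index)}/_flush", None)


def es_set_refresh_interval(index, interval):
    """Request that changes how often the automatic refresh runs."""
    body = json.dumps({"index": {"refresh_interval": interval}})
    return ("PUT", f"{ES_BASE}/{quote(index)}/_settings", body)


def solr_soft_commit(collection):
    """Solr counterpart: a soft commit that opens a new searcher."""
    query = urlencode({"softCommit": "true"})
    return ("POST", f"{SOLR_BASE}/{quote(collection)}/update?{query}", None)


if __name__ == "__main__":
    for request in (
        es_refresh("articles"),
        es_flush("articles"),
        es_set_refresh_interval("articles", "1s"),
        solr_soft_commit("articles"),
    ):
        print(request)
```

Any HTTP client can send these tuples as they are; for the `_settings` request, the body should go out with a JSON content type.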