May 01 2015

Why should I NOT use ElasticSearch as my primary datastore?

Answer by Mani Gandham:

As is the case with all database deployments, it really depends on your specific application.

ElasticSearch is a great open source search engine built on top of Apache Lucene. Its features allow it to function much like a schema-less JSON datastore that can be accessed using both search-specific methods and regular, database-style CRUD commands.
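To make the "CRUD plus search" idea concrete, here is a minimal sketch that only builds the HTTP requests and does not send them. The index name `comments` and the helper function names are made up for illustration. The `_doc` path segment follows the convention of recent releases; older versions, including those current when this answer was written, used a mapping type name in that position, so treat the paths as an assumption to check against your version.

```python
import json

# Hypothetical index name, used only for illustration.
INDEX = "comments"


def index_request(doc_id, doc):
    """CRUD-style write: create or replace a JSON document by id."""
    return ("PUT", f"/{INDEX}/_doc/{doc_id}", json.dumps(doc))


def get_request(doc_id):
    """CRUD-style read: fetch a document by id."""
    return ("GET", f"/{INDEX}/_doc/{doc_id}", None)


def delete_request(doc_id):
    """CRUD-style delete: remove a document by id."""
    return ("DELETE", f"/{INDEX}/_doc/{doc_id}", None)


def search_request(field, text):
    """Search-style access: a full-text match query on one field."""
    body = {"query": {"match": {field: text}}}
    return ("POST", f"/{INDEX}/_search", json.dumps(body))


if __name__ == "__main__":
    # Print the (method, path, body) triples instead of sending them.
    print(index_request(1, {"user": "alice", "text": "first comment"}))
    print(get_request(1))
    print(search_request("text", "comment"))
    print(delete_request(1))
```

The first three helpers look like any key-value or document store; only the last one uses the search side of the engine.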

Here are the main "disadvantages" I see:

  • Security – ElasticSearch does not provide any authentication or access control functionality.
  • Transactions – There is no support for transactions or for server-side processing during data manipulation.
  • Durability – ES is distributed and fairly stable, but backups and durability are not as high a priority as in other data stores.
  • Maturity of tools – ES is still relatively new and has not had time to develop mature client libraries and third-party tools, and that gap can make development much harder.
  • Large Computations – Commands for searching data are not suited to "large" scans of data and advanced computation on the db side.
  • Data Availability – ES makes data available in "near real-time", which may require additional considerations in your application (e.g., on a comments page where a user adds a new comment, refreshing the page might not actually show the new post because the index is still updating).
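The "near real-time" point in the last bullet can be pictured with a toy model. This is not the ElasticSearch API, only an in-memory sketch with invented class and method names: writes land in a buffer and become searchable only after a refresh. In ElasticSearch itself the refresh happens periodically (once per second by default, as far as I know), and the index API accepts a `refresh` parameter to force it for a given write.

```python
class ToyNearRealTimeIndex:
    """Toy in-memory model of near-real-time search (not the ElasticSearch API)."""

    def __init__(self):
        self._buffer = []      # accepted writes, not yet visible to search
        self._searchable = []  # documents visible to search after a refresh

    def index(self, doc, refresh=False):
        """Accept a document; make it visible immediately only if refresh=True."""
        self._buffer.append(doc)
        if refresh:
            self.refresh()

    def refresh(self):
        """Move buffered documents into the searchable set."""
        self._searchable.extend(self._buffer)
        self._buffer.clear()

    def search(self, term):
        """Return searchable documents whose text contains the term."""
        return [d for d in self._searchable if term in d["text"]]


if __name__ == "__main__":
    idx = ToyNearRealTimeIndex()
    idx.index({"user": "alice", "text": "first comment"})
    print(idx.search("comment"))  # empty: the write has not been refreshed yet
    idx.refresh()
    print(idx.search("comment"))  # the comment is now visible
```

In the comments-page scenario, the application either forces a refresh on that write, which costs indexing throughput, or shows the user their own comment from the write response instead of from a search.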

If you can deal with these issues then there's certainly no reason why you can't use ElasticSearch as your primary data store. Doing so can actually lower complexity and improve performance because you no longer have to duplicate your data, but again this depends on your specific use case.

As always, weigh the benefits, do some experimentation and see what works best for you.

UPDATE FOR 2015: ElasticSearch has come a long way in the few years since this answer was originally written and now has much more capable durability options, backup methods and even real-time indexing. Please review the official site for more information.

