A Review on Enterprise Data Lake Solutions

Authors

  • Aakash Aundhkar

DOI:

https://doi.org/10.46243/jst.2021.v6.i04.pp11-14

Keywords:

Data Analysis, Hadoop, Data lake

Abstract

A Data Lake is a highly flexible storage solution that holds both structured and unstructured data and follows a schema-on-read approach. It offers a potential answer to current Big Data storage problems. However, it has certain weaknesses, such as inadequate authentication and access control. This paper examines several current enterprise Data Lake solutions. Apache Hadoop is generally regarded as the data lake industry standard, and its parallel processing systems ensure high-speed processing of massive volumes of data. Many businesses have built wrappers around Hadoop to address concerns about its raw state and lack of data protection. Platforms such as Amazon Web Services (AWS) Data Lake and Azure Data Lake fall into this category. AWS Data Lake provides a more straightforward approach with failsafes against data failure, while Azure Data Lake offers much greater scalability and enterprise-level reliability. Data Lake systems are becoming increasingly common in a variety of sectors, including finance, business intelligence, engineering, and healthcare.

Published

2021-08-16

How to Cite

Aundhkar, A. (2021). A Review on Enterprise Data Lake Solutions. Journal of Science & Technology (JST), 6(Special Issue 1), 11–14. https://doi.org/10.46243/jst.2021.v6.i04.pp11-14