Understand the fundamentals of Delta Lake Concept
You might be hearing a lot about Delta Lake nowadays. Yes, it is because of it’s introduction of new features which was not there in Apache Spark earlier. Why is Delta Lake?If you check here, you can understand that Spark writes are not atomic and data consistency is not guaranteed. When metadata itself becomes big data it is difficult to manage data. If you have worked on lambda architecture you would understand how painful it is to have same aggregations to both hot layer as well as cold layer differently. Sometimes you make unwanted writes to a table or a location and it overwrites existing data then you wish to go back to your previous state of data which was tedious task. What is Delta Lake?Delta Lake is a project that was developed by Databricks and now open sourced with the Linux Foundation project. Delta Lake is an open source storage layer that sits on top of the Apache Spark services which brings reliability to data lakes. Delta Lake provides new features which includes ACID ...