File organization for Data Lake

A data lake is becoming a natural way to architect large datasets and the processing around them. As we saw in our earlier blog, a data lake hosts an enormous amount of data. Hence organizing [...]


Data Replication and Change Data Capture in AWS Data Lake

This blog is part of the technical blog series initiated by my colleagues at Persistent. Here are quick links to the previous blogs: Data Lake in AWS Cloud, Data Lake Architecture in AWS Cloud, [...]


Data Lake Architecture in AWS Cloud

In my last blog, I talked about why the cloud is the natural choice for implementing new-age data lakes. In this blog, I will try to double-click on the ‘how’ part of it. At Persistent, we have been [...]


To Data Lake or Not To Data Lake

Very often, we get requests from our customers for workshops on data lakes. We discuss whether to create a data lake, whether there are adequate use cases, the technology stack decisions for building the [...]


Data Lake in AWS Cloud

It has been three years since we (the architect community at Persistent) published a series of blogs detailing our views on Data Lakes – these blogs covered the why, the what and the how of the [...]


AWS re:Invent 2018 – My Key Takeaways

My reflections after the AWS re:Invent 2018 conference in Vegas convinced me of the need to share a few key takeaways. This blog covers the relevant insights gathered from the conference and [...]


Tags – First (read: FAST) step to Discover, Explore and Enrich your Data Lake!

Have you faced these or similar situations? You decided to build a distributed data warehouse containing terabytes (TBs) of data. Data consumers are now asking for different insights and the [...]