
Data Mesh: the new data paradigm set to rise.

| By Doichin Yordanov, CTO Cloud Software, HeleCloud |

The world of data has not seen a major shift since the rise of NoSQL and data lakes, some 15 years ago now. These came with the adoption of Cloud computing in the 2000s, extending the 40-year-old relational data warehouses with capabilities for easily exploring vast amounts of structured, semi-structured and unstructured data. A lot of tooling and architectural patterns have evolved around data lakes since then: lake-houses (un/semi-structured data lake + structured warehouse) and lake-house-marts (un/semi-structured data lake + structured warehouse + pre-calculated BI data). The classical data-warehouse and data-mart sub-systems still play an important role in these architectures, but it is, after all, the data-lake front end that enables rapid exploration, manually or through AI, of vast volumes of assorted data.

In the following years, as computing moved to the Cloud, so did the data. It was a natural evolution for Cloud vendors to start competing with the traditional database and data-analytics hegemons: AWS began to tackle players like Oracle and SAS, for example. Cloud vendors had an important advantage, though: cheap infrastructure and secure, robust, elastic compute and storage. New distributed computing paradigms, such as Serverless, contributed tremendously to that advantage. An AWS-native Cloud stack, for example one comprising S3 + Athena + EMR + Redshift + QuickSight, lets you fulfil complex data-analytics patterns through secure, robust and auto-scalable architectures, much like assembling Lego building blocks, and far more easily than traditional DB vendors allow. It is the Cloud now where computing lives and where data-intensive analytics happen.
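To make the building-block point concrete, here is a minimal sketch of the serverless S3 + Athena slice of such a stack: plain SQL run directly over files sitting in S3, with no servers to manage. The sales_lake database, the orders table and the results bucket below are hypothetical placeholders:

```python
import time

import boto3  # AWS SDK for Python

# Minimal sketch of the serverless S3 + Athena slice of the stack: run SQL
# directly over files in S3. The "sales_lake" database, "orders" table and
# results bucket are hypothetical placeholders.
athena = boto3.client("athena")

def run_query(sql: str) -> str:
    """Submit a query to Athena and poll until it reaches a final state."""
    execution = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": "sales_lake"},
        ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
    )
    query_id = execution["QueryExecutionId"]
    while True:
        status = athena.get_query_execution(QueryExecutionId=query_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        time.sleep(1)  # Athena runs the query serverlessly; we just wait

print(run_query("SELECT region, SUM(amount) FROM orders GROUP BY region"))
```

No clusters are provisioned or torn down here; the stack auto-scales behind a handful of API calls, which is exactly the Lego-like ease the paragraph above describes.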

The state-of-the-art Cloud data tooling led to an explosion in the number of Cloud “data lakes”. Still, analytics remained confined to a per-client, per-case basis: you build one lake for a medical-care facility, then another for a chain of pharmacies.

Data Mesh is about federating information dispersed across multiple company-specific data lakes, throughout an entire industry or even several industries, for far more powerful insights and smarter business decisions, achieved through metadata standardisation, information discovery, and efficient, secure data exchange. It is like leveraging what Cloud data architectures built, but at a bigger, global, cross-company scale. Martin Fowler’s site, as always, hosts perceptive articles on this mega-trend, notably Zhamak Dehghani’s seminal pieces, and Cloud vendors, of course, are all-in.
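To make the standardisation-and-discovery idea concrete, here is a purely illustrative sketch of how a domain could publish a self-describing data product into a shared catalogue. The descriptor fields and catalogue shape are assumptions for illustration, not a published standard:

```python
from dataclasses import dataclass, field

# Illustrative only: one way a data-mesh "data product" could describe itself
# so that other domains (or companies) can discover and consume it. The field
# names are hypothetical, not part of any published Data Mesh standard.
@dataclass
class DataProductDescriptor:
    domain: str          # owning business domain, e.g. "pharmacy-sales"
    name: str            # discoverable product name
    schema_ref: str      # pointer to the standardised (meta-)data schema
    endpoint: str        # where consumers fetch the data (S3 URI, API, ...)
    classification: str  # e.g. "public", "partner", "restricted"
    tags: list[str] = field(default_factory=list)  # discovery keywords

# Each domain registers its products; the mesh exposes the shared catalogue.
catalog = [
    DataProductDescriptor(
        domain="pharmacy-sales",
        name="daily-dispense-summary",
        schema_ref="https://example.org/schemas/dispense/v1",
        endpoint="s3://pharma-mesh/daily-dispense/",
        classification="partner",
        tags=["pharmacy", "sales", "daily"],
    ),
]

# Discovery then becomes a simple search over standardised metadata.
pharmacy_products = [p for p in catalog if "pharmacy" in p.tags]
```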

AWS, through its principal architect Richard Nicholson, provides an insightful glimpse of the AWS approach to Data Mesh architectures. The AWS Data Exchange service, along with the many AWS data processing and storage services, is excellently positioned to become a powerful Data Mesh orchestration tool, solving many of the riddles of distributed data-analytics systems, such as cross-domain, cross-company data affinity, sharing, and caching.
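For a flavour of the consumer side of such an exchange, here is a minimal sketch using the boto3 Data Exchange client: list the data sets the account is entitled to, then export one revision’s assets into the consumer’s own lake. It assumes at least one active subscription, and the destination bucket name is a hypothetical placeholder:

```python
import boto3

dx = boto3.client("dataexchange")

# 1. Discover: which third-party data sets is this account entitled to?
#    (Assumes at least one active AWS Data Exchange subscription.)
data_set = dx.list_data_sets(Origin="ENTITLED")["DataSets"][0]

# 2. Pick a published revision of that data set and list its assets.
revision = dx.list_data_set_revisions(DataSetId=data_set["Id"])["Revisions"][0]
assets = dx.list_revision_assets(
    DataSetId=data_set["Id"], RevisionId=revision["Id"])["Assets"]

# 3. Exchange: copy the assets into our own S3-based data lake.
#    "my-data-lake" is a hypothetical placeholder bucket.
job = dx.create_job(
    Type="EXPORT_ASSETS_TO_S3",
    Details={"ExportAssetsToS3": {
        "DataSetId": data_set["Id"],
        "RevisionId": revision["Id"],
        "AssetDestinations": [
            {"AssetId": a["Id"], "Bucket": "my-data-lake", "Key": a["Name"]}
            for a in assets
        ],
    }},
)
dx.start_job(JobId=job["Id"])  # asynchronous; poll progress with get_job()
```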

We at HeleCloud, an AWS Premier Consulting Partner with sound insight into AWS workings and deep experience in data analytics, believe that combining the AWS instrumentarium with industry-specific efforts for metadata standardisation, such as the Electronic Health Record (EHR) standards, is low-hanging fruit for leveraging the power of the Data Mesh paradigm.
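To illustrate why such standardisation pays off, consider HL7 FHIR, one widely adopted EHR interoperability standard: because it fixes resource types and field structure, every mesh participant can parse records from every other with trivial code. The record below is a tiny synthetic example:

```python
import json

# A tiny synthetic record in HL7 FHIR's Patient format (FHIR is one widely
# adopted EHR interoperability standard; the data itself is made up).
record = json.loads("""
{
  "resourceType": "Patient",
  "id": "example-123",
  "name": [{"family": "Doe", "given": ["Jane"]}],
  "birthDate": "1980-04-01"
}
""")

# Because the schema is standardised, field names and nesting are predictable
# regardless of which organisation in the mesh produced the record.
assert record["resourceType"] == "Patient"
family = record["name"][0]["family"]
print(f"Patient {record['id']}: {family}, born {record['birthDate']}")
```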

Data Mesh is not just some abstract long-term paradigm; it is an imminent digitalisation shift that will transform businesses into smarter, faster decision-makers and differentiate those that embrace it.

About the author: Doichin brings 22 years of IT experience in data-intensive application development. He has technically led game-changing R&D projects such as the core VMware vSphere data layer and SAP’s Environment, Health and Safety CRM modules.