KLake Virtual Data Center Platform

Organizations in healthcare, higher education, government, finance, and other industries often run hundreds of databases, which makes it difficult to form a unified view of their data. Traditionally, ETL jobs must first move the data into central storage before analysis and data governance can begin. With data assets at this scale, however, moving and centralizing data through ETL brings the following unavoidable pain points:

-> High cost

Centralizing data scattered across hundreds of database systems takes a huge amount of time, labor, and money.

-> Not real-time

Once data has been moved into the central database through ETL, it lags behind the production databases. In addition, data extracted from different sources reflects different points in time, which affects subsequent analysis and use.

-> Performance pressure

Each business database usually already handles tasks such as data reporting and synchronization with integration platforms. Adding ETL extraction and synchronization on top of these further increases the load on the business database and degrades the performance of business operations.

Data Asset Management Pain Points

KLake is designed on the Data Fabric architecture. Through real-time connections, it unifies all databases within an organization into a logically centralized virtual database system, enabling centralized, secure operation and sharing of the organization's data assets:

-> Unified access portal

Manage all database systems within the organization centrally to form a logically centralized single data source, and provide unified data sharing and access to external consumers.

-> Cross-database analysis

A single SQL statement in KLake can execute across multiple business databases and heterogeneous database systems. This avoids both the inconsistent snapshot times of copied data and the need to centralize data in advance.

-> Reduce the load on production databases

KLake first parses each SQL statement into several small sub-queries, each of which fetches data from one source database. Joins, aggregations, and other computations are then performed on the KLake cluster nodes. Analysis on the KLake platform can therefore go directly from the data sources to the report display or the final result, avoiding complex intermediate processes, and it greatly relieves the performance pressure on the production databases.
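The decompose-then-compute flow described above can be illustrated with a minimal sketch. This is not KLake's API or implementation: two in-memory SQLite databases stand in for separate source databases, and the table names, column names, and the `total_charges_by_dept` function are all hypothetical. The sketch only shows the general idea of sending a small sub-query to each source and doing the join and aggregation outside the sources.

```python
import sqlite3
from collections import defaultdict


def make_source(ddl, insert_sql, rows):
    """Create an in-memory SQLite database standing in for one source system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


# Hypothetical source 1: a registration system holding patient visits.
his = make_source(
    "CREATE TABLE visits (patient_id INTEGER, dept TEXT, visit_date TEXT)",
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (1, "cardiology", "2024-01-05"),
        (2, "cardiology", "2024-01-06"),
        (3, "oncology", "2024-01-06"),
        (4, "oncology", "2023-12-30"),
    ],
)

# Hypothetical source 2: a billing system holding charges.
billing = make_source(
    "CREATE TABLE charges (patient_id INTEGER, amount REAL)",
    "INSERT INTO charges VALUES (?, ?)",
    [(1, 120.0), (1, 30.0), (2, 80.0), (3, 200.0), (4, 999.0)],
)


def total_charges_by_dept(since):
    """Total charges per department for visits on or after `since`."""
    # Sub-query 1: the date filter runs at the source, and only the
    # columns needed for the join are returned.
    visits = his.execute(
        "SELECT patient_id, dept FROM visits WHERE visit_date >= ?",
        (since,),
    ).fetchall()

    # Sub-query 2: charges are pre-aggregated per patient at the source.
    charges = dict(
        billing.execute(
            "SELECT patient_id, SUM(amount) FROM charges GROUP BY patient_id"
        ).fetchall()
    )

    # The join and the final aggregation happen in this coordinating
    # process, not in either source database.
    totals = defaultdict(float)
    for patient_id, dept in visits:
        totals[dept] += charges.get(patient_id, 0.0)
    return dict(totals)


result = total_charges_by_dept("2024-01-01")
print(result)
```

Each source only runs a simple filtered or grouped query; the cross-source join is never executed inside a production database. The toy data has one visit per patient, so the per-patient totals are not double counted.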

KLake Use Cases: Healthcare Industry Example

Technical Architecture

Daily data reporting, data synchronization jobs, and similar workloads can overwhelm the core production database. KLake uses its own computing cluster to offload SQL computation from the business database to the KLake platform, which also improves SQL execution efficiency.

BI Reports

Reduce Operational Database Load

Evaluations such as electronic medical record ratings, interoperability ratings, and hospital grade accreditation involve many complex indicator queries and cross-database data comparisons. With the KLake platform, indicators from multiple data sources can be compared in a single SQL statement, avoiding the traditional process of querying each database separately, step by step.
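As a rough illustration of what a single-statement, cross-database indicator comparison looks like, the sketch below uses SQLite's `ATTACH DATABASE` as a stand-in for a virtual database layer. It is not KLake syntax. The schema names `emr` and `lis`, the tables, and the "ordered versus resulted" indicator are assumptions made up for the example.

```python
import sqlite3

# One connection with two attached in-memory databases, standing in for
# two separate hospital systems exposed under one logical entry point.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS emr")
conn.execute("ATTACH DATABASE ':memory:' AS lis")

# Hypothetical tables: test orders in one system, test results in the other.
conn.execute("CREATE TABLE emr.orders (order_id INTEGER, test_code TEXT)")
conn.execute("CREATE TABLE lis.results (order_id INTEGER, test_code TEXT)")
conn.executemany(
    "INSERT INTO emr.orders VALUES (?, ?)",
    [(1, "CBC"), (2, "CBC"), (3, "GLU"), (4, "GLU")],
)
conn.executemany(
    "INSERT INTO lis.results VALUES (?, ?)",
    [(1, "CBC"), (3, "GLU"), (4, "GLU")],
)

# A single statement compares the two sources: per test code, how many
# orders were placed and how many of them have a matching result.
COMPARE_SQL = """
SELECT o.test_code,
       COUNT(o.order_id) AS ordered,
       COUNT(r.order_id) AS resulted
FROM emr.orders AS o
LEFT JOIN lis.results AS r ON r.order_id = o.order_id
GROUP BY o.test_code
ORDER BY o.test_code
"""

comparison = conn.execute(COMPARE_SQL).fetchall()
for test_code, ordered, resulted in comparison:
    print(test_code, ordered, resulted)
```

The comparison that would otherwise require exporting from each system and reconciling the extracts by hand is expressed as one join across the two schemas.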

Comparison of Indicators in Various Ratings

In the traditional model, data must be aggregated layer by layer through ETL, a data warehouse, data marts, and so on before a BI report can finally be generated and presented.

In KLake mode, users can run KLake SQL cross-database queries from their BI reports to summarize and analyze the source data directly, and use BI reporting tools to display the results, skipping all of the intermediate layers.

In building smart hospitals, smart applications need a real-time, comprehensive supply of data, and traditional data middle platforms and integration platforms struggle to meet those needs. KLake unifies all database resources in the hospital into one virtual database, addressing both the timeliness and the completeness of data and serving as the hospital's core data infrastructure.