This article is included in my GitHub repository. Stars are welcome: https://github.com/bin392328206/six-finger
The best time to plant a tree was ten years ago. The second-best time is now.
I haven't written for a long time. For the past ten days Xiao Liuliu has been studying Alibaba Cloud, haha. Our company's data warehouse stack is DataWorks and MaxCompute, so that is what I have been learning recently. We happen to have a requirement: our BI data has to be pulled from HBase into MaxCompute for ETL and then written into Hologres, and our business teams query Hologres to support their reports. It is a fairly common scenario. I just want to write down what I have been doing lately, otherwise I feel like I am slacking off. Haha. Alibaba Cloud is not paying me for advertising; this is purely sharing and note-taking.
What is Hologres?
Hologres is an interactive analytics product developed in-house by Alibaba. It is compatible with the PostgreSQL 11 protocol, connects seamlessly with the big data ecosystem, and supports high-concurrency, low-latency analysis of PB-scale data.
As the ways of collecting data multiply and enterprises become more digitized, the data they hold grows at TB, PB, or even EB scale. At the same time, with the rapid development of the data middle platform, data applications are mainly core business services such as data support, user profiling, real-time audience selection, and precision ad delivery. Highly reliable, low-latency data services have become key to enterprise digital transformation.
Hologres aims to provide low-cost, high-performance, large-scale compute and storage with powerful query capabilities, offering real-time data warehouse solutions and real-time interactive query services for massive data.
- Accelerated queries on MaxCompute data
Hologres is connected with MaxCompute at the underlying layer. Without moving any data, you can use standard PostgreSQL statements to query and analyze massive data in MaxCompute and get results quickly.
- Quickly build a real-time data warehouse
Hologres is deeply integrated with Realtime Compute for Apache Flink. It supports high-concurrency real-time writes and real-time queries, helping you quickly build an enterprise real-time data warehouse.
- Data warehouse as a service
- Hologres delivers the data warehouse as a service in the following ways:
- It supports two storage modes: row storage and column storage.
- It is optimized for high-concurrency and complex query scenarios, providing second-level responses for queries over PB-scale data.
- It supports both interactive and serving-oriented query scenarios.
- Seamless connection with mainstream BI tools
Hologres is compatible with the PostgreSQL protocol and provides JDBC and ODBC interfaces. You can easily connect the massive data you query to all kinds of BI tools for exploratory multidimensional analysis, supporting richer application scenarios without migrating data.
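To make the "query MaxCompute without moving data" point above more concrete, here is a minimal Python sketch that only builds the SQL you would run in Hologres to map MaxCompute tables as foreign tables. It assumes the `IMPORT FOREIGN SCHEMA ... FROM SERVER odps_server` form and the built-in server name `odps_server` as I recall them from the Hologres documentation, so please check both against the current docs. The project and table names are hypothetical.

```python
import re

# Conservative identifier check so the generated SQL cannot be injected into.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_ident(name):
    """Return name unchanged if it looks like a plain SQL identifier."""
    if not _IDENT.match(name):
        raise ValueError(f"unsafe identifier: {name!r}")
    return name


def import_foreign_schema_sql(mc_project, tables, local_schema="public"):
    """Build a statement that maps MaxCompute tables into a Hologres schema.

    Assumption: 'odps_server' is the foreign server Hologres uses for
    MaxCompute; verify this in the official documentation.
    """
    if not tables:
        raise ValueError("at least one table name is required")
    table_list = ", ".join(_check_ident(t) for t in tables)
    return (
        f"IMPORT FOREIGN SCHEMA {_check_ident(mc_project)} "
        f"LIMIT TO ({table_list}) "
        f"FROM SERVER odps_server INTO {_check_ident(local_schema)};"
    )
```

Calling `import_foreign_schema_sql("my_project", ["ads_report"])` returns one statement that you could paste into any PostgreSQL client connected to Hologres; after it runs, `ads_report` can be queried with an ordinary `SELECT`.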
So what does this actually mean? In the past we built reports every morning over the previous day's data and wrote the results into the ADS layer of our data warehouse (this layer is highly aggregated, so there is not much data). Then we synced the ADS data to our MySQL, and the people in our business teams queried that database and exposed API endpoints. Hologres, by contrast, says it can be queried in real time: the data does not have to be cleaned and aggregated in advance, and you still get results quickly. This brings us to a new concept: HSAP.
HSAP: unified analytics and serving
HSAP stands for Hybrid Serving and Analytical Processing. The idea is to handle very high-QPS queries and writes for serving scenarios, together with complex analysis scenarios, in a single system. So what is at its core? First, there has to be a very powerful storage layer that can hold both real-time and offline data, giving unified storage. At the same time, there has to be an efficient query service that supports high-QPS queries, complex analysis, and federated queries. With that in place, offline and real-time data can both be imported into the system, and front-end data applications, such as BI reports and online services, connect to the same system. In this way, the architectural complexity described above can be resolved fairly easily. This is what is called the HSAP design concept.
With the HSAP design concept in place, a product is needed to implement it, hence Hologres. The name Hologres combines "holographic" and "Postgres". "Postgres" stands for compatibility with the PostgreSQL ecosystem, and "holographic" means holography, that is, all of the information: the hope is to analyze data holographically with Hologres while staying compatible with the PostgreSQL ecosystem. To sum up in one sentence, Hologres is based on the HSAP concept, is compatible with the PostgreSQL ecosystem, supports direct queries on MaxCompute data, supports real-time writes and real-time queries as well as federated analysis over real-time and offline data, and lets you build an enterprise real-time data warehouse quickly, at low cost and with timely data.
- Unified storage
Hologres can serve the storage needs of several scenarios: point queries (the HBase scenario), ad-hoc queries (the Druid scenario), and OLAP queries (the Impala scenario), among others.
- Designed around real-time analysis
Hologres is designed to be fast. Data is available for real-time analysis at second-level latency, query responses are very quick, and both real-time writes and bulk data import are supported with high import performance.
- Separation of storage and compute
Hologres uses an architecture that separates storage from compute, so users can scale up and down flexibly as needed: if you use more storage, buy more storage; if you use more CPU, buy more CPU. In addition, Hologres supports interactive analysis across heterogeneous data sources and federated queries over offline and real-time data. Hologres is already connected seamlessly with MaxCompute, so you can query MaxCompute tables directly in Hologres.
- PG ecosystem
Hologres is compatible with the PostgreSQL ecosystem and can connect to PG development tools and BI tools. Alibaba Cloud also provides a native development platform where you can develop SQL in a web IDE, with task scheduling supported.
You can find the demos on the official website and play with them yourself. Xiao Liuliu has only tested the PG ecosystem side: I wrote a simple CRUD with JDBC. As for accelerating queries on MaxCompute data, Xiao Liuliu works on business development, so I haven't tried that yet. As for the big claims on the official website, I can't say how impressive it really is, because I haven't tested it. But my current company uses ClickHouse, so the two can be compared in OLAP scenarios. How to put it? Hologres and ClickHouse are competitors; after all, both do real-time interactive analysis. (A quick explanation of real-time interactive analysis: suppose my database holds 4 billion rows and I want to run analysis over all of them. As we all know, that is hard to do with MySQL, but with ClickHouse you can just write SQL and query it directly. That really is impressive.)
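The CRUD test mentioned above was done with JDBC. Because Hologres speaks the PostgreSQL protocol, the same idea can be sketched in Python with any PostgreSQL driver. The sketch below is not the author's code: it assumes the third-party `psycopg2` driver is installed, a hypothetical table `demo_user(id, name)`, and port 80 as the Hologres endpoint port (check your instance's connection details).

```python
def build_dsn(host, dbname, user, password, port=80):
    """Build a libpq-style connection string (simple values without spaces)."""
    return f"host={host} port={port} dbname={dbname} user={user} password={password}"


# Parameterized statements for a hypothetical demo_user(id, name) table.
CRUD_STATEMENTS = {
    "create": "INSERT INTO demo_user (id, name) VALUES (%s, %s)",
    "read": "SELECT id, name FROM demo_user WHERE id = %s",
    "update": "UPDATE demo_user SET name = %s WHERE id = %s",
    "delete": "DELETE FROM demo_user WHERE id = %s",
}


def run_crud(dsn):
    """Run one insert/select/update/delete round trip and return the row read."""
    import psycopg2  # imported lazily so the helpers above work without the driver

    conn = psycopg2.connect(dsn)
    try:
        with conn:  # commits on success, rolls back on error
            with conn.cursor() as cur:
                cur.execute(CRUD_STATEMENTS["create"], (1, "alice"))
                cur.execute(CRUD_STATEMENTS["read"], (1,))
                row = cur.fetchone()
                cur.execute(CRUD_STATEMENTS["update"], ("bob", 1))
                cur.execute(CRUD_STATEMENTS["delete"], (1,))
        return row
    finally:
        conn.close()
```

To try it, pass `build_dsn(...)` with your own endpoint and credentials to `run_crud`. Nothing here is Hologres-specific, which is the point of the PostgreSQL compatibility.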
Comparison with open-source systems
- Storage and compute
Since my company uses ClickHouse, let me mention one of its shortcomings: it does not cope well with frequent writes. We currently write in batches at fixed intervals, and writes are also delayed, by seconds rather than milliseconds.
On the storage side, according to the official material, Hologres supports millisecond-level write visibility, while Druid and ClickHouse are at the level of seconds. Hologres also supports updating written data. On the compute side, Hologres, like Druid and ClickHouse, uses a vectorized/SIMD executor. It is also worth noting that, thanks to its underlying design, Hologres can support federated queries and high-QPS point queries, which the official material says Druid and ClickHouse cannot match.
This is the part that really impresses me: writes become visible in milliseconds, and queries over massive data can also respond in milliseconds. I don't know how the underlying layer can be that fast; it blows my mind.
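The "write in batches at fixed intervals" pattern described above for ClickHouse can be sketched as a small buffer that flushes when it holds enough rows or when the oldest buffered row has waited too long. This is a generic illustration, not our production code; the class name, thresholds, and the `flush` callback are all hypothetical.

```python
import time


class BatchBuffer:
    """Collect rows and hand them to `flush` in batches.

    A batch is flushed when it reaches `max_rows`, or when the oldest
    buffered row is older than `max_wait_seconds`. The age check only
    runs inside add(); a real writer would also flush from a timer.
    """

    def __init__(self, flush, max_rows=1000, max_wait_seconds=5.0,
                 clock=time.monotonic):
        self._flush = flush          # callback that writes one batch (list of rows)
        self._max_rows = max_rows
        self._max_wait = max_wait_seconds
        self._clock = clock          # injectable for testing
        self._rows = []
        self._first_row_at = None

    def add(self, row):
        if not self._rows:
            self._first_row_at = self._clock()
        self._rows.append(row)
        too_many = len(self._rows) >= self._max_rows
        too_old = self._clock() - self._first_row_at >= self._max_wait
        if too_many or too_old:
            self.flush()

    def flush(self):
        """Send whatever is buffered; does nothing when the buffer is empty."""
        if self._rows:
            batch, self._rows = self._rows, []
            self._flush(batch)
```

The trade-off is visible in the two thresholds: larger batches mean fewer, cheaper writes, but a row can wait up to `max_wait_seconds` before it becomes queryable, which is exactly the second-level delay mentioned above.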
- SQL, advanced features, and ecosystem
For us CRUD developers, a product that is not compatible with the SQL ecosystem basically comes with a very high learning cost. Fortunately, most OLAP tools do support SQL, including offline data warehouses; only the syntax differs slightly.
On the SQL side, Hologres is compatible with Postgres syntax, works out of the box, and is cheap to learn, whereas the syntax of Druid and ClickHouse is more complicated and harder to get started with. Druid and ClickHouse also offer only limited support for update, delete, join, and so on. In terms of advanced features, Hologres supports vector retrieval, spatial data, and more, which covers a wider range of business scenarios. For security, Hologres has strict permission management, for example RAM and IP whitelists. Its ecosystem support is also richer: Hologres provides a JDBC interface and can be written to through Flink, DataX, or StreamX, so data from many heterogeneous sources is easy to import into Hologres.
To be honest, there is nothing original here; it all comes from the official website and I simply copied it over. Going by the introduction, this database does sound like it is bragging, but everything has two sides, so there must be drawbacks. For example, its primary keys do not seem to support auto-increment. As for the rest, I have not dug in, so I don't know much. Please treat this as a beginner-level introduction, haha.
The daily ask for likes
All right, everyone, that is all for this article. If you have read this far, you are a true fan.
Writing is not easy. Your support and recognition are my biggest motivation. See you in the next article.
Six Meridian Divine Sword | Original article. If there are any mistakes in this blog, please point them out. Thank you!