Service Architecture and Optimization of Mogujie's E-Commerce Trading Platform (with PPT)
2020-11-08 10:35:48 【osc_x8s7voop】
Reading guide: On July 30, High Availability Architecture held a special salon in Shanghai, "The Cornerstone of Internet Architecture". It consisted of a closed-door private board discussion and talks on four topics open to the public, with the aim of promoting the construction and development of Internet infrastructure in the industry. This article is Pan Fujiang's talk on the architecture of Mogujie's e-commerce trading system.
Pan Fujiang is a senior R&D engineer at Mogujie. Before 2014 he worked at Alibaba, where he built vertical e-commerce business platforms and also worked on middleware. He joined Mogujie (now Meili United Group) in 2015 and is responsible for the service-oriented construction of Mogujie's basic e-commerce platforms, such as trading, funds, and the shopping cart.
I'm from Mogujie. Mogujie is an e-commerce platform mainly for female users, so men may use it less. That said, there are many good-looking models on Mogujie, so I suggest you download it and give it a try. When you are tired of writing code you can quietly open Mogujie and browse; I think it's quite nice.
My topic today is the service architecture of Mogujie's trading platform, and I will also share some of the transformation work we did while building out the services.
Business architecture in Mogujie's shopping-guide period
Mogujie started as a shopping guide. At that time all business revolved around users and content: the front-end business was mainly social shopping guidance, and the back-end business was mainly content management. In short, it was small and beautiful, and the business was not very complicated.
The technical architecture was that of a typical startup. The whole site was built in PHP, the system had only simple layering, and the infrastructure was based on off-the-shelf open source products. Mogujie transformed itself in 2013, mainly because a large number of shopping-guide sites were blocked during that period, so we became a social e-commerce platform.
A social e-commerce platform has two parts. For the social part, we had accumulated experience from the shopping-guide days. E-commerce was something we had never touched, so it was basically built from scratch. To be an e-commerce platform, the first step is to build a trading platform. The first version was simple: we rewrote a system, but its structure did not change in any essential way. All business logic was written in one huge project, which interacted with our infrastructure through a proxy layer.
Problems faced during the e-commerce transformation
- Business is growing fast, more than 3x every year (in 2015: over 100 million users, PV exceeding 10 billion)
- Peak traffic on the purchase path is 100 times the daily level (in early 2015 the system supported only 400 orders per second)
- The business is extremely complex, and business forms are expanding rapidly
- The historical burden is heavy, and the systems are tightly coupled
After Mogujie became an e-commerce platform, business grew by more than three times a year, and that is when problems began to surface. The problems an e-commerce platform encounters as it develops, especially mid-stage, are not unique to Mogujie; other platforms are likely to hit them too: bloated system code, tightly coupled modules, complex dependencies, poor business extensibility, and so on.
At that time Mogujie faced several problems:
First, the business was growing fast and system capacity could not keep up. The trading system could support only 400 orders per second, while traffic during big promotions is more than 100 times normal.
Second, the e-commerce business changed rapidly, and business support was neither flexible enough nor fast enough.
Third, there was the historical burden: system coupling was very serious.
The key to solving this series of problems comes down to one word: "split".
The process of system splitting
- Vertical splitting of the DB
- Vertical splitting of business systems (shopping cart, order placement, funds ...)
- Unified data and business models; service interfaces designed with clear logic and appropriate granularity
- Basic business logic sinks into services; the Web layer focuses on presentation logic and orchestration
- Service governance
System split: the trading shopping cart as an example
Let's use the trading shopping cart to illustrate the transformation. We used to have a single project with all the code written in it, and different terminals or businesses each maintained their own module code. Data access was also casual: each maintained its own set of data access code. This caused two real headaches:
On the one hand, all trading data sat in one pool, so any of these ad hoc SQL statements could unexpectedly produce a slow query, and instability in one piece of business code affected the others. It was hard to locate where such "wild SQL" came from. This made our DB very unstable, which was very bad for the subsequent transformation.
On the other hand, there was business support. When a product requirement arrived, it had to be implemented separately on every terminal. Reusability was poor, business support was inflexible, and the system was not extensible. Developers had a hard time, often worked overtime, and there were many bugs.
So we set out to split the system. How? There is actually some method to it.
Priority of system splitting
If you think of the DB as a barrel, the various businesses are like water poured into it. At first there may not be much water and the barrel is fine, but as the business grows, one day the barrel will no longer hold.
So first of all the barrel needs to be big enough and easy to expand; then there is nothing to worry about later. Business volume is sometimes hard to predict, and you cannot be sure when it will surge. If you do not make the barrel at the bottom strong enough and instead give priority to splitting and optimizing the business layer, the whole system will stall as soon as volume rises.
Therefore the DB is the foundation of system splitting and needs to be split first.
Once the DB is split, stability needs attention. As mentioned earlier, SQL at that time was messy and could easily destabilize the DB, so unifying data access and the data model is also key. We built a unified data access layer. With this layer in place, later DB transformation and expansion can be controlled effectively.
Once these foundations were in place, we could tackle the difficulty of supporting the business. Business models needed to be unified and abstracted, with support for custom extensions. The process transformation also incubated basic business frameworks such as an SPI business framework, a process engine, and a rule engine, which made business support flexible and extensible. The system was also layered sensibly, so that each layer only needs to care about its own capabilities.
The result of the system split
After the overall split of the trading system was complete, the company's SOA transformation had basically taken shape: the basic service framework, message middleware, data middleware, and configuration center were all in place. A series of infrastructure tools were also incubated, including the monitoring system, the scheduling system, log collection, and link tracing.
There was another background to the split: the company's overall strategy was to move to Java. This was a comprehensive decision. Java talent is plentiful, especially in Hangzhou, the technology ecosystem is relatively mature, and there are experts who can handle difficult problems, whereas PHP talent was relatively scarce at the time.
After the split and transformation, we paid more attention to the capacity, performance, and stability of the applications themselves, and we made some improvements and attempts in these areas.
- Split the DB vertically by business
- Read/write separation, so that reads can be scaled out freely
- Database and table sharding, to increase the write capacity of the central services
During the system split we had already split the DB vertically and separated reads from writes (based on MySQL).
The following focuses on the sharding work. Its main purpose was to raise the write capacity of the central services: even with read/write separation the DB still had a single-master structure, so writes were a bottleneck.
Take transaction creation as an example of the sharding process. Transaction creation is probably one of the most complex business scenarios in trading: creating an order writes a lot of other data at the same time. System capacity at that time was about 1,000 orders per second. The single DB master was a write bottleneck, and heavy writes caused serious master-slave replication lag. In addition, DB disk usage had exceeded 80%, which made it very unstable; it could have collapsed at any time.
So we decided to shard. The background was that our middleware had not yet been built out and there were no sharding components available internally, so we decided to start within our own team.
We compared some popular solutions in the industry, such as Alibaba's TDDL and Cobar and Google's Vitess. By comparison, these components are heavyweight, and their access and usage costs are relatively high. Our principle was to fit our business scenario and choose a component with relatively low access and usage costs. So we ended up implementing database and table sharding as a MyBatis plugin. The component is now open source: https://github.com/baihui212/tsharding
[Figure: comparison of industry solutions for database and table sharding]
TSharding, our self-developed sharding component, completes the database and table sharding
- Simple enough, requiring few resources
- Supports database and table sharding
- Supports data source routing
- Supports transactions
- Supports result set merging
- Supports read/write separation
The component is called TSharding. Its defining feature is that it is simple enough, which matched our expectations. It supports database and table sharding, data source routing, transactions, result set merging, and read/write separation, which covers all of our requirements.
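To make "sharding plus data source routing plus read/write separation" concrete, here is a minimal sketch of how a sharding key might be mapped to a physical database and table. It is not TSharding's actual algorithm or API: the modulo scheme, the shard counts, and the names `route`, `ShardTarget`, and `trade_NN_master` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

DB_COUNT = 4          # assumed number of physical databases
TABLES_PER_DB = 8     # assumed number of tables inside each database


@dataclass(frozen=True)
class ShardTarget:
    data_source: str   # which database (and master/slave role) to use
    table: str         # physical table name


def route(logical_table: str, sharding_key: int, is_write: bool) -> ShardTarget:
    """Map a logical table plus sharding key to a physical data source and table."""
    slot = sharding_key % (DB_COUNT * TABLES_PER_DB)
    db_index = slot // TABLES_PER_DB
    # Writes must go to the master; reads may be served by a slave.
    role = "master" if is_write else "slave"
    return ShardTarget(
        data_source=f"trade_{db_index:02d}_{role}",
        table=f"{logical_table}_{slot:04d}",
    )
```

Because the slot is derived only from the sharding key, every statement for the same key lands in the same database, which is what allows a local transaction to cover all writes for one order.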
We also made some attempts at performance optimization. Here are three scenarios:
- Distributed transactions
- Asynchronous parallelism on a single machine
- Preprocessing and caching
Distributed transactions: transaction creation as an example
Optimization idea: decoupling via asynchronous messages
- In the transaction creation flow, the states of the order, the coupon, and the inventory must stay consistent
- The marketing coupon service and the inventory center's inventory service are deployed separately from the order service
- When a call to the coupon or inventory service times out or fails, a message is sent asynchronously to trigger rollback; the complexity is manageable
- Retrying failed MQ sends, plus ACKs on the consumer side, ensures consistency
- This avoids the intrusiveness of distributed transaction frameworks such as two-phase commit
Let's start with distributed transactions, again using transaction creation as the example. Transaction creation interacts with multiple services, and some of them are strong dependencies, such as deducting inventory and locking coupons, where consistency has to be maintained. Two-phase and multi-phase protocols are heavyweight, so we did not adopt them at the time.
Instead we came up with an approach based on asynchronous message decoupling. The specific flow is:
When an order is placed, we do not expose it right away. We first create an invisible order (you can think of it as pre-creating the order), and then deduct inventory and lock the coupon. If any of these operations fails or raises an exception, the order system publishes a void-order message. Downstream systems (such as promotion and inventory) roll back on our behalf when they receive it. This is how we solve the distributed transaction problem.
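The flow above can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not Mogujie's implementation: the class names, the order states `INVISIBLE`/`VISIBLE`/`VOIDED`, and the `publish_void` callback (standing in for a message queue with send retries and consumer ACKs) are all hypothetical.

```python
from dataclasses import dataclass, field


class ServiceError(Exception):
    """Raised when a downstream call fails or times out."""


@dataclass
class Inventory:
    stock: int
    reserved: dict = field(default_factory=dict)  # order_id -> quantity

    def deduct(self, order_id: str, qty: int) -> None:
        if qty > self.stock:
            raise ServiceError("insufficient stock")
        self.stock -= qty
        self.reserved[order_id] = qty

    def on_order_voided(self, order_id: str) -> None:
        # Idempotent rollback: a repeated message finds nothing to restore.
        self.stock += self.reserved.pop(order_id, 0)


@dataclass
class Coupons:
    locked: set = field(default_factory=set)
    fail_next_lock: bool = False   # hook that simulates a timeout

    def lock(self, order_id: str) -> None:
        if self.fail_next_lock:
            raise ServiceError("coupon service timeout")
        self.locked.add(order_id)

    def on_order_voided(self, order_id: str) -> None:
        self.locked.discard(order_id)


def create_order(order_id, qty, orders, inventory, coupons, publish_void):
    orders[order_id] = "INVISIBLE"          # step 1: pre-create, hidden from the buyer
    try:
        inventory.deduct(order_id, qty)     # step 2: strongly dependent calls
        coupons.lock(order_id)
    except ServiceError:
        orders[order_id] = "VOIDED"
        publish_void(order_id)              # step 3: ask downstream systems to roll back
        return False
    orders[order_id] = "VISIBLE"            # step 4: expose the order
    return True
```

The rollback handlers are written to be idempotent because a queue that retries sends may deliver the same void-order message more than once.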
Distributed transactions: payment callback as an example
- In the payment callback flow, after the fund system calls back to trading, the order status is updated, inventory is reduced, and so on
- The fund system, as the initiator, guarantees retries so that the message is delivered; trading and the downstream systems make their handling idempotent
- Failed operations go into a task retry table and are retried through asynchronous compensation
- This avoids the intrusiveness of distributed transaction frameworks such as two-phase commit
Another scenario is the payment callback. After an order has been paid, the payment system notifies the trading system, which performs a series of operations such as updating the order status, reducing inventory, and issuing coupons. This is also a distributed transaction problem.
Our strategy is that when an operation fails, the request goes into a failure compensation table, and we keep retrying through asynchronous compensation (with stepped intervals) to guarantee eventual consistency.
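A retry table with stepped intervals could look roughly like the sketch below. The talk does not give the actual schedule or schema, so `RETRY_DELAYS`, the `RetryTask` fields, and the `FAILED` terminal state are assumptions; time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass

# Assumed step schedule in seconds; the talk only says the retries are "stepped".
RETRY_DELAYS = [1, 5, 30, 300]


@dataclass
class RetryTask:
    order_id: str
    attempts: int = 0         # number of retries scheduled so far
    next_run_at: float = 0.0
    status: str = "PENDING"   # PENDING -> DONE, or FAILED after the last step


def run_due_tasks(tasks, handler, now):
    """Run every pending task whose time has come; reschedule the ones that fail."""
    for task in tasks:
        if task.status != "PENDING" or task.next_run_at > now:
            continue
        try:
            handler(task.order_id)      # handler must be idempotent
            task.status = "DONE"
        except Exception:
            if task.attempts >= len(RETRY_DELAYS):
                task.status = "FAILED"  # give up; leave for alerting or manual handling
            else:
                task.next_run_at = now + RETRY_DELAYS[task.attempts]
                task.attempts += 1
```

In a real system the task list would be rows in a database table scanned by a scheduler, and the handler would be the same idempotent callback logic used on the normal path.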
Asynchronous parallelism on a single machine: the shopping cart as an example
- The shopping cart is a typical IO-intensive application
- Code executes serially, so a lot of time is spent waiting synchronously
- CPU utilization is low
The shopping cart is a typical IO-intensive application, and there are many applications like it that issue a large number of network IO requests. In addition, we are used to writing code serially, so a lot of time is spent waiting synchronously.
Every shopping cart query passes through many nodes. If two nodes do not depend on each other, can they run in parallel? In fact every query corresponds to a dependency tree, and nodes in the same layer do not depend on each other, so within a layer the calls can run in parallel. We optimized based on this idea, and the effect was quite good.
Concretely, we issue the independent queries asynchronously instead of waiting for each one in turn, let them run, and aggregate the results at the end. The effect was good: overall RT was reduced by more than half.
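The layered-parallel idea can be illustrated with `asyncio`. This is only a sketch: `fetch` is a stand-in for a remote call, and the four dependencies (cart, items, shops, promotions) and their layering are assumed for the example rather than taken from the talk.

```python
import asyncio


async def fetch(name: str, delay: float = 0.05) -> str:
    """Stand-in for one remote call (item, shop, promotion ...)."""
    await asyncio.sleep(delay)
    return f"{name}-data"


async def load_cart_serial(user_id: int) -> dict:
    # Each call waits for the previous one, so latencies add up.
    cart = await fetch(f"cart:{user_id}")
    items = await fetch("items")
    shops = await fetch("shops")
    promos = await fetch("promotions")
    return {"cart": cart, "items": items, "shops": shops, "promotions": promos}


async def load_cart_parallel(user_id: int) -> dict:
    # Layer 1: everything else depends on the cart's contents.
    cart = await fetch(f"cart:{user_id}")
    # Layer 2: these three do not depend on each other, so run them together.
    items, shops, promos = await asyncio.gather(
        fetch("items"), fetch("shops"), fetch("promotions")
    )
    return {"cart": cart, "items": items, "shops": shops, "promotions": promos}
```

With equal latencies, the serial version costs about four round trips and the parallel version about two (one per layer), which is in line with the "more than half" RT reduction mentioned above.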
Preprocessing and caching: the marketing pricing service as an example
- Use caches to reduce DB read pressure
- Cache as much of the needed data as possible
- When DB data changes, actively invalidate the cache (asynchronously, with low latency) to reduce inconsistency
- Turn on the local cache before peak promotions and warm up the caches, which helps raise the cache hit rate
- Preprocessing partially decouples the pricing interface: discounts of products registered for a promotion are synchronized to the product list in advance
Preprocessing and caching are common optimization methods. We use a multi-level cache strategy: local cache plus distributed cache. We read the local cache first; on a miss we go to the distributed cache; if the distributed cache also misses, we read from the DB. When data changes, a system refreshes the caches asynchronously so that cached data is updated in time.
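The read path just described (local cache, then distributed cache, then DB) and the invalidate-on-change rule can be sketched as follows. Plain dictionaries stand in for the distributed cache and the database, and the class and method names are hypothetical.

```python
class MultiLevelCache:
    """Read path: local cache -> distributed cache -> DB, filling caches on the way back."""

    def __init__(self, remote: dict, db: dict):
        self.local = {}        # in-process cache
        self.remote = remote   # stand-in for a distributed cache
        self.db = db           # stand-in for the database
        self.db_reads = 0      # counter used to observe cache effectiveness

    def get(self, key):
        if key in self.local:
            return self.local[key]
        if key in self.remote:
            value = self.remote[key]
        else:
            self.db_reads += 1
            value = self.db[key]
            self.remote[key] = value
        self.local[key] = value
        return value

    def on_db_change(self, key):
        # Called (asynchronously in a real system) when the DB row changes.
        self.local.pop(key, None)
        self.remote.pop(key, None)

    def warm_up(self, keys):
        # Load hot keys before a promotion so the first requests already hit.
        for key in keys:
            self.get(key)
```

One thing this single-process sketch hides: with several application servers, each has its own local cache, so an invalidation has to reach every instance, for example through a broadcast message.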
Service SLA guarantees
SLA stands for Service Level Agreement; it is a set of requirements on the service provider. An SLA is expressed as constraints on capacity (QPS), performance (RT), and degree (distribution, availability, error rate). Some of the tools for improving SLA are:
- Basic monitoring comes first: monitor the key metrics
- Dependency governance and logic optimization: reduce unnecessary dependencies
- Load balancing, service grouping, rate limiting
- Degradation plans, disaster recovery, load testing, online drills
This is our internal monitoring system. We monitor key metrics for every application and look at the whole call chain.
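Of the tools listed above, rate limiting is the easiest to show in code. The talk does not say which algorithm Mogujie uses; the token bucket below is one common choice, shown here purely as an illustration, with time passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Allow up to `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated_at = now

    def allow(self, now: float) -> bool:
        # Refill according to the time elapsed since the last call, capped at capacity.
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # caller should reject or degrade the request
```

A service would typically keep one bucket per interface or per caller group and feed `allow` a monotonic clock reading.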
Summary and next steps
- A service architecture is not static; it evolves as the business evolves
- There is no best solution, only the right solution for the right scenario
What we are working on now
- Service governance and systematic SLA guarantees
- Same-city and cross-region active-active deployment
Question: Suppose the consumer receives a message and tells the system the message can be deleted, and then my back end crashes while it is still processing that message, for example during a warehousing operation. Have you run into this situation, and how do you handle it?
Pan Fujiang: Not so far, because this really requires cooperation. We need the downstream systems to cooperate: they ACK the message only after the business processing has succeeded. It mainly comes down to coordination between the systems.
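The rule "ACK only after the business processing has succeeded" can be sketched independently of any particular message queue. Everything here is hypothetical: `handle` is the business logic, `ack` is whatever the queue client uses to confirm a message, and `processed_ids` stands in for a persistent deduplication record.

```python
def consume(message, handle, processed_ids, ack):
    """Process first, ACK last; duplicates are skipped so redelivery is harmless."""
    msg_id = message["id"]
    if msg_id not in processed_ids:
        # If this raises (or the process dies), no ACK is sent and the
        # broker is expected to redeliver the message later.
        handle(message)
        processed_ids.add(msg_id)
    ack(msg_id)
```

The deduplication check matters because a crash between `handle` and `ack` leads to redelivery of a message whose work was already done.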
Question: Most e-commerce distributed transaction solutions use message queues. Is there a more general solution, such as developing a set of distributed components, for example based on a two-phase protocol, to solve this kind of distributed transaction more efficiently?
Pan Fujiang: Distributed transactions depend on the scenario. Alipay (where I used to work) has a similar framework, but it is heavyweight, has a certain access cost, and requires cooperation from the participating systems. It is better to analyze the problem case by case. In some scenarios the consistency requirements are not that high, and there is no need for a two-phase protocol. It mainly depends on the business scenario.
Question: When you migrate a database, can you do it smoothly? I saw that the middleware you used was switched twice, so there must have been smooth online database migration. How did you do it?
Pan Fujiang: Internally we have a set of data synchronization tools, plus a switch system for gray (gradual) cutover. The switch system can push values dynamically into your application, so you can change them at runtime. Our data synchronization tool also supports backtracking, so in an emergency we can switch back quickly and trace the data back.
Question: After sharding you still need to keep the old database alongside the new one, because you release step by step. The old database is still running and in use, yet half of the traffic has already been switched to the new one. What if someone modifies data during this time?
Pan Fujiang: As mentioned above, there is a channel between the old database and the new one (a data synchronization channel), and the synchronization tool runs continuously. Data from the old database is synchronized to the new one in real time, and the gray percentage is pushed dynamically through the switch system, also in real time.
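A gray switch of the kind described can be as simple as a percentage that is pushed at runtime and a stable rule for assigning users to buckets. The sketch below is an assumption about how such a switch might work, not Mogujie's switch system; the modulo-100 bucketing and the names are hypothetical, and the data synchronization channel is outside its scope.

```python
class GraySwitch:
    """Holds a percentage an operator can push at runtime (0 = all old, 100 = all new)."""

    def __init__(self, percent: int = 0):
        self.percent = percent

    def push(self, percent: int) -> None:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self.percent = percent

    def target_db(self, user_id: int) -> str:
        # The same user always lands in the same bucket, so a user does not
        # flip between databases while the percentage stays unchanged.
        return "new" if user_id % 100 < self.percent else "old"
```

Rolling back is just `push(0)`; whether that is safe depends on the synchronization tool being able to carry data written to the new database back to the old one, which is the backtracking capability mentioned in the answer.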
Question: After sharding the databases and tables, how do you handle queries that join tables?
Pan Fujiang: We basically have no join queries, and we do not recommend them. Joins are very unfavorable to later horizontal splitting and to DB scaling. You can query the tables separately and then do the association in the application layer.
Question: If we query separately, will performance be affected?
Pan Fujiang: Performance may be affected to some extent, because you access the database a few more times, but DB scalability improves greatly. Internet services deal with large data volumes, and by comparison DB scalability matters more. There are many ways to optimize at the application layer (such as caching), whereas if you use JOIN it is very hard to split horizontally later.
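"Query separately, then associate in the application layer" usually means one query for the primary rows, one batched query for the related rows, and an in-memory lookup to stitch them together. A minimal sketch, with a hypothetical `fetch_items_by_ids` callback standing in for the second query:

```python
def orders_with_items(order_rows, fetch_items_by_ids):
    """Join orders to items in memory instead of with a SQL JOIN."""
    # Collect the distinct foreign keys so the second query is a single batch.
    item_ids = sorted({row["item_id"] for row in order_rows})
    items = {item["id"]: item for item in fetch_items_by_ids(item_ids)}
    # Missing items become None, similar to a LEFT JOIN.
    return [{**row, "item": items.get(row["item_id"])} for row in order_rows]
```

Batching the second lookup keeps the extra cost to one additional round trip, and because each query touches only one table, the two tables can live in different databases or shards.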
Question: Before splitting the database, did you have a good rollback plan?
Pan Fujiang: Our data synchronization tool supports backtracking. If something goes wrong, the switch system can switch back immediately and the data can be traced back, which keeps the impact within a controllable range.