As a web engineer, I care most about performance and architecture. I was lucky enough to attend the SD 2.0 conference this time and to talk at length with my peers about both. I don't want to keep what I have learned about architectural design to myself, so this article shares my takeaways from the conference and from those conversations.
Some thoughts on architectural design:
1. Never over-design
This topic comes up often, but its importance becomes obvious once you count how many features in your architecture are never used or end up abandoned. When you first get into architecture design, you tend toward grand, all-encompassing, flashy designs, hoping to build an extremely scalable architecture that can adapt to every requirement. But web development is highly dynamic. It is hard to predict what will change next week, and what we need is to respond to change as quickly and as effectively as possible.
eBay engineers have said that their architectural designs have never kept up with the growth of the system, so the system is always being torn down and rebuilt. Note that this is not a problem with the ability of eBay's architects. Each architecture they design targets the bottlenecks of the previous version in the hope of a breakthrough, but the headroom a new architecture brings is always overwhelmed by new demand within a short time, and they have to move to yet another architecture.
Web development is a very agile process. Change can happen at any time, user needs keep shifting, and much of it is unpredictable. Compared with traditional software development, it is even less realistic to expect one architecture to cover all future designs.
2. Web architecture life cycle
If we need to avoid over-design and still keep a degree of foresight, where is the balance? I hope the following web architecture life cycle helps.
An architecture should be able to absorb 1x to 10x growth simply by adding hardware capacity. Once growth reaches the 5x to 10x range, start designing the next version of the architecture so that it can carry the next 10x of growth.
Google's dominance is not entirely due to how advanced its search and ranking technology is. In fact, the technologies used by Baidu and Yahoo are by now broadly similar. What is genuinely hard to replicate is Google's ability to add tens of thousands of servers within a month so that system capacity is always sufficient.
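The 1x/5x/10x rule of thumb can be written down as a small check. This is only a sketch: the function name, the phase labels, and the idea of measuring growth as current load divided by the load the architecture was designed for are assumptions made for illustration, not something stated at the conference.

```python
def lifecycle_phase(baseline_load: float, current_load: float) -> str:
    """Classify where a system sits in the 1x-10x growth window.

    baseline_load: load the current architecture was designed for (assumed metric).
    current_load:  load observed today, in the same unit.
    """
    if baseline_load <= 0:
        raise ValueError("baseline_load must be positive")

    growth = current_load / baseline_load
    if growth < 5:
        # Below 5x: adding hardware should be enough.
        return "scale hardware"
    if growth < 10:
        # 5x-10x: keep adding hardware, but start the next design now.
        return "scale hardware and design next architecture"
    # 10x and beyond: the current architecture has run out of headroom.
    return "migrate to next architecture"
```

For example, a site designed for 1,000 requests per second that now serves 7,000 falls in the middle phase, where hardware is still being added while the next version is being designed.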
3. Cache
Caching trades space for time. It has always been a top priority in computer design, and from the CPU to I/O, caches are everywhere. Caching is just as essential in web architecture. On how to design a reasonable cache, the founder of JBossCache and the founder of Taobao put it this way: web caches and enterprise caches are designed very differently. An enterprise cache focuses on logic, while a web cache has to be simple and fast.
What problem does caching introduce? Increased program complexity. Because the data is spread across multiple processes, keeping it synchronized is troublesome, and clustering makes it more complex still. In practice, the choice of synchronization strategy usually has to be tied to the business requirements.
Lao Qian designed a linked-list cache for posts at Sohu that supports flexible insertion while still allowing fast reads. Other large communities often use similar structures to optimize their post lists. Memcache is another frequently used tool.
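To make the synchronization cost concrete, here is a minimal single-process sketch of a read-through cache that invalidates an entry whenever it is written. The class name and the `load`/`store` callbacks are hypothetical. A clustered cache would also have to propagate the invalidation to every other node, which is exactly where the extra complexity comes from.

```python
from typing import Any, Callable, Dict, Hashable


class ReadThroughCache:
    """Serve reads from memory; drop the cached copy on every write."""

    def __init__(
        self,
        load: Callable[[Hashable], Any],
        store: Callable[[Hashable, Any], None],
    ) -> None:
        self._load = load      # reads a value from the slow backing store
        self._store = store    # writes a value to the slow backing store
        self._data: Dict[Hashable, Any] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        if key in self._data:
            self.hits += 1
            return self._data[key]
        self.misses += 1
        value = self._load(key)
        self._data[key] = value
        return value

    def put(self, key: Hashable, value: Any) -> None:
        # Write to the backing store first, then invalidate so the next
        # read reloads the fresh value instead of serving a stale copy.
        self._store(key, value)
        self._data.pop(key, None)
```

Invalidate-on-write is only one possible strategy. A post list that tolerates slightly stale data might use expiry times instead, which is why the choice tends to follow the business rules.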
Link: Qian Hongwu’s video on architecture design http://211.100.26.82/CSDN_Live/140/qhw.flv
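The details of the Sohu design are not given here, so the following is only a guess at the general shape of such a structure: a doubly linked list of post ids, newest first, with a dictionary index so that inserting, bumping, and removing a post are all O(1), and reading the first page only walks as many nodes as the page holds. All names are hypothetical.

```python
from typing import Dict, List, Optional


class _Node:
    __slots__ = ("post_id", "prev", "next")

    def __init__(self, post_id: int) -> None:
        self.post_id = post_id
        self.prev: Optional["_Node"] = None
        self.next: Optional["_Node"] = None


class PostListCache:
    """Doubly linked list of post ids (newest first) plus an id -> node index."""

    def __init__(self) -> None:
        self._head: Optional[_Node] = None
        self._index: Dict[int, _Node] = {}

    def push_front(self, post_id: int) -> None:
        """Insert a new post, or bump an existing one, to the top in O(1)."""
        self.remove(post_id)
        node = _Node(post_id)
        node.next = self._head
        if self._head is not None:
            self._head.prev = node
        self._head = node
        self._index[post_id] = node

    def remove(self, post_id: int) -> None:
        """Unlink a post in O(1); do nothing if it is not cached."""
        node = self._index.pop(post_id, None)
        if node is None:
            return
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self._head = node.next
        if node.next is not None:
            node.next.prev = node.prev

    def first_page(self, size: int) -> List[int]:
        """Return up to `size` post ids from the top of the list."""
        page: List[int] = []
        node = self._head
        while node is not None and len(page) < size:
            page.append(node.post_id)
            node = node.next
        return page
```

A reply to an old thread becomes a single `push_front` call, which moves the thread to the top without re-sorting or re-querying the whole list.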
The usual caching strategy is to keep data in memory rather than on the much slower disk. Seen this way, the HEAP storage engine provided by MySQL is also worth considering: it keeps the data in memory while retaining the full query power of SQL. Doesn't that kill two birds with one stone?
So far we have only discussed read caching. There is also write caching. It is rarely used in content-oriented communities, because the main problem those communities need to solve is reads. Write caching appears when processing capacity falls below the request rate, or when individual requests are buffered into blocks and then processed in batches. It is easy to find this kind of cache in the design of highly interactive communities.
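The idea of memory-resident data that can still be queried with SQL can be tried out without a MySQL server. The sketch below deliberately swaps in SQLite's in-memory database from the Python standard library as a stand-in. It is not MySQL's heap engine, and the table name, columns, and sample rows are invented for illustration.

```python
import sqlite3
from typing import List, Tuple


def build_hot_posts() -> sqlite3.Connection:
    """Create an in-memory table; nothing is written to disk."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE hot_posts (id INTEGER PRIMARY KEY, board TEXT, replies INTEGER)"
    )
    conn.executemany(
        "INSERT INTO hot_posts (id, board, replies) VALUES (?, ?, ?)",
        [(1, "tech", 12), (2, "tech", 40), (3, "life", 7)],
    )
    return conn


def board_stats(conn: sqlite3.Connection) -> List[Tuple[str, int, int]]:
    """Aggregate with ordinary SQL even though the data lives only in memory."""
    cursor = conn.execute(
        "SELECT board, COUNT(*), MAX(replies) FROM hot_posts "
        "GROUP BY board ORDER BY board"
    )
    return cursor.fetchall()
```

As with any memory-only table, the contents vanish when the connection closes, so a store like this suits derived or easily rebuilt data rather than the system of record.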
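A write cache of the batching kind can be sketched as a small buffer that collects individual writes and hands them to the backend in blocks. The class name, the `flush_batch` callback, and the size-based trigger are assumptions for illustration. A production version would also need a timer-based flush and a plan for writes that are lost if the process dies before flushing.

```python
from typing import Any, Callable, List


class WriteBuffer:
    """Collect single writes and pass them on in batches."""

    def __init__(self, flush_batch: Callable[[List[Any]], None], max_pending: int) -> None:
        if max_pending < 1:
            raise ValueError("max_pending must be at least 1")
        self._flush_batch = flush_batch   # e.g. one bulk INSERT per batch
        self._max_pending = max_pending
        self._pending: List[Any] = []

    def write(self, item: Any) -> None:
        """Queue one write; flush automatically when the buffer is full."""
        self._pending.append(item)
        if len(self._pending) >= self._max_pending:
            self.flush()

    def flush(self) -> None:
        """Send everything queued so far as a single batch."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._flush_batch(batch)
```

With `max_pending=3`, seven votes or page-view increments reach the backend as two batches of three, and the seventh waits for the next explicit or automatic flush.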
4. Develop your core modules yourself
We have learned this the hard way, and Qian Hongwu and Yunfeng both mentioned it as well. We often lean toward open source modules. Where core modules are not involved, that is perfectly fine. Where they are, we must be careful, because once traffic reaches a certain level these modules tend to develop problems of one kind or another. We could blame the problems on unfamiliarity with the open source modules, but either way, when something goes wrong at the core, not having a full grasp of its code is a frightening position to be in.
5. Reasonable data storage
Do we have to use a database? Not necessarily. Lei Ming tells us that search does not necessarily need a database, and Yunfeng tells us that games do not necessarily need one either. So when do we need a database? Why not simply use files instead?
First, we have to admit that a database also operates on files underneath. We need a database mainly for two functions: data storage and data retrieval. With relational databases, what we really care about is the database's ability to run complex queries. Take a look at some T-SQL used for statistics (no need to read it carefully, a glance is enough):
select c.Class_name,d.Class_name_2,a.Creativity_Title,b.User_name,(select count(Id) from review where Reviewid=a.Id) as countNum from Creativity as a,User_info as b,class as c,class2 as d where a.user_id=b.id and a.Creativity_Class=c.Id and a.Creativity_Class_2=d.Id
select a.Id,max(c.Class_name),max(d.Class_name_2),max(a.Creativity_Title),max(b.User_name),count(e.Id) as countNum from Creativity as a,User_info as b,class as c,class2 as d,review as e where a.user_id=b.id and a.Creativity_Class=c.Id and a.Creativity_Class_2=d.Id and a.Id=e.Reviewid group by a.Id … ……………………………….