When you start a blog on your own domain name, the first post ought to carry a little weight to be worthy of the $4 domain. After ten years as a technical practitioner, I have found that some knowledge can only be pieced together by reading widely across many scattered sources. So let me explain it systematically, step by step, from the beginning: how can a small website, whether it gets a few thousand visits or one or two million visits a day, pass through this stage smoothly without inherent technical shortcomings? This article is written for technical people, but also for entrepreneurs who do not understand technology.
Everyone who knows the Internet has ideas of their own, and some people put them into practice: they build a website and start operating it. From a purely technical perspective, thanks to the open source model, building a small website is now easy and cheap. Once traffic reaches a certain level, however, costs begin to soar and problems begin to appear. The cost increases from more bandwidth, more hardware, and more staff are obvious, but a considerable part of the cost comes from rewriting code, reworking the architecture, and even replacing the underlying development language. The worst problem of all is data loss: after several years of hard work, you can be sent back overnight to where you were before you started the business.
Reducing costs means increasing profits. Many of these problems can be avoided at the outset. Laying a good foundation first saves a lot of energy and worry later.
Suppose you are a technical person involved in starting a business, and you are starting from nothing. You have to do everything yourself and pay for it yourself. With an initial budget in the hundreds of thousands, you set out to build a website whose application is not particularly complicated. In that case, pay attention to the following points:
1. Development language
Generally speaking, when technicians (programmers) start a business, they choose the language they know best. But since you will not always be writing the code alone, this deserves careful thought. Whatever the language, final code quality depends on management, so let us be practical and look at the languages themselves. The popular choices, Java, PHP, .NET, Python, and Ruby, all have their own advantages and disadvantages. For Python and Ruby, hiring is still relatively difficult, and performance tuning takes some effort. With .NET, Windows servers are more than you can afford. Java and PHP remain the most widely used. For websites whose workload in the early stage is almost entirely front-end, PHP has a slight edge: it is easy to learn, its design patterns are simple, it is fast to write, and its performance is sufficient. Its lack of emphasis on design patterns is also its weakness, though: the code easily becomes loosely organized, full of hidden bugs, and hard to maintain. Java's advantage is that many mature tools support the whole management process, and strong typing prevents some silly bugs. Most Java programmers pay more attention to design patterns, so whether or not the code is practical, it at least looks tidy. This is also a weakness: beginners may focus too much on patterns and struggle to meet actual requirements.
The front end is not just HTML and CSS. Everything responsible for interacting with the user is the front end, including the request handlers. I still recommend PHP for this kind of program, mainly because it is quick to develop in and has a large pool of practitioners. As for the back end, such as behavioral analysis, bank interfaces, and asynchronous message processing, no single language fits everything; choose the language according to each business need.
2. Code version management
If the developers all have a similar network connection to the repository, use SVN; if they are more dispersed, for example across countries, use hg (Mercurial). Most people still use svn.
Assuming you choose svn, there are several things to consider. One is the tree structure. In the early stage there may be only a trunk, but later you will need branches, such as a development branch and an online (production) branch, and later still perhaps one branch per team. While the team is small, I recommend starting with two branches: development and online. After each feature passes local testing, commit it to the development branch. Once the combined testing is done, merge into the online branch when you go live. If you like to treat svn as a portable hard disk and commit after every small change, that is fine, but merging becomes more work. Such people can create their own branch, or even a local repository, commit there as freely as they like, and after testing commit to the development branch.
Deployment can be manual or automatic. Manual deployment is relatively simple: usually you run svn update directly on the server, or run svn checkout into a new directory and then point the web root at it with ln -s. The more complex the application, the more complicated the deployment, and there is no unified standard. Just don't upload by ftp. First, during the upload the files on the server are temporarily inconsistent with one another, which raises the error rate. Second, it is easy for a developer's copy to differ from the online version, so that you meant to fix a typo and ended up rolling back someone else's change. If there are multiple servers, automatic deployment is still recommended: the machine whose code is being updated is temporarily removed from the service pool and rejoins after the update is complete.
No matter how small the project is, develop the good habit of using version management. At the very least it serves as your backup. Although my http://zhiyi.us is just a WordPress site, it is still kept in svn. Even if I have only changed a line or two of CSS, that is still the fruit of my labor.
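The "checkout into a new directory, then ln -s" approach can be sketched in a few lines of Python. This is only a minimal sketch under assumptions: the function name switch_release and the directory layout are made up for illustration, the web root is assumed to already be a symlink (or not to exist yet), and the server is assumed to be a POSIX system where renaming over an existing symlink is atomic.

```python
import os


def switch_release(release_dir: str, web_root_link: str) -> None:
    """Point the web root symlink at a freshly checked-out release directory.

    A temporary symlink is created next to the web root and then renamed over
    it, so requests see either the old release or the new one, never a
    half-updated tree.
    """
    link_parent = os.path.dirname(os.path.abspath(web_root_link))
    tmp_link = os.path.join(link_parent, ".tmp-" + os.path.basename(web_root_link))

    # Clean up a leftover temporary link from an interrupted earlier run.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)

    os.symlink(release_dir, tmp_link)
    # os.replace renames the link itself; it does not follow it.
    os.replace(tmp_link, web_root_link)
```

A call might look like switch_release("/srv/site/releases/r1234", "/srv/site/current"), where both paths are hypothetical. Rolling back is then just another call that points the link at the previous release directory.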
3. Server hardware
Don't envy the big customers and the rich. Take a look at the retail section of a data center, where a single server hosts countless websites. If you have sufficient funds, I recommend at least three machines as a standard setup: web, database, and backup. The web server needs at least 8 GB of memory and two SATA disks in RAID 1. If the budget allows, or if there are many static files or pictures, use 15k SAS disks in RAID 1+0. The database server needs at least 16 GB of memory and 15k SAS disks in RAID 1+0. The backup server is best configured the same as the database server. For hardware, you can buy a branded barebones system, that is, a chassis fitted with a motherboard and drive bays, and add the CPU, memory, and hard disks yourself. You can also buy a complete branded server, or a white-box compatible machine. For three machines, the market price is RMB 60,000 to RMB 70,000.
The web server runs the application and also serves as the memory cache, while the database server runs only the main database (if it is MySQL). The backup server does comparatively more work: its web configuration, cache configuration, and database configuration must all match those of the other two machines. That way, if either the web server or the database server has a problem, you change the backup server's IP address and switch over to it. For the backup strategy you can choose drbd, rsync, or one of many other open source backup solutions. rsync is the simplest: just put it in cron and let it run. I recommend testing backup and switchover repeatedly, choosing the approach that is safest and best suited to the business, and keeping off-site backups wherever possible.
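The "put rsync in cron" idea can be sketched as a small Python wrapper. This is a sketch under assumptions, not a complete backup solution: the host name, the paths, and the function names are hypothetical, and it is meant for ordinary file trees such as the web root or database dump files rather than the live data files of a running database.

```python
import subprocess


def build_rsync_command(source_dir, backup_host, dest_dir, delete=True):
    """Build an rsync command that mirrors source_dir to the backup server.

    -a preserves permissions, times and symlinks; -z compresses in transit;
    --delete removes files on the backup that no longer exist at the source.
    """
    # A trailing slash makes rsync copy the directory's contents, not the directory itself.
    source = source_dir.rstrip("/") + "/"
    command = ["rsync", "-a", "-z"]
    if delete:
        command.append("--delete")
    command += [source, f"{backup_host}:{dest_dir}"]
    return command


def run_backup(source_dir, backup_host, dest_dir):
    """Run the sync and raise CalledProcessError if rsync reports a failure."""
    subprocess.run(build_rsync_command(source_dir, backup_host, dest_dir), check=True)
```

A script that calls run_backup("/var/www", "backup.example", "/srv/backup/www") could then be scheduled with a crontab line such as `0 3 * * * python3 /opt/backup/sync.py`; every name and path here is made up. Because check=True raises on a non-zero exit status, cron can mail the failure instead of letting it pass silently.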
4. Data center
Try to avoid three kinds of data center: China Telecom data centers that are extremely slow for China Unicom users, China Unicom data centers that are extremely slow for China Telecom users, and China Mobile or China Railcom data centers that are extremely slow for China Unicom users. What about Netcom data centers? My friend, China Netcom merged with China Unicom long ago, and the merged company is called China Unicom. Search widely, visit in person, test a lot, and ask around. There are still many high-quality data centers in major node cities such as Beijing, Shanghai, and Guangzhou. Find one with good network quality and strict management. The management in particular must be strict. You do not want to discover that the website is unreachable, make a phone call, and learn that someone knocked out your network cable while doing maintenance on another machine. That is more of a headache than a DoS attack. As for the kind of "data center" where someone has simply pulled in a few fiber lines on their own, that depends on your risk tolerance and your nerves. The data center is very important. It directly determines the website's access speed, and access speed directly determines the user experience. I may climb over the wall to see the scenery, but I am hardly going to buy an online-gaming VPN just to open your not-so-famous website. Your website's Ajax may be excellent, but if the document never becomes ready, some of your code will never reach the user.
5. Architecture
The initial architecture is generally fairly simple: web load balancing + database master-slave + cache + distributed storage + queue. In broad terms there really are only these few things, and countless articles have gone over the details. Planning for the future just means N more web servers, N more master-slave groups, N more caches, and N more of everything else. The basic solutions are all ready-made. What makes you better than others is that your design takes into account the avalanche effect when the cache fails, the data consistency and replication lag of master-slave synchronization, the stability of the queue and the retry strategy after failure, the efficiency and backup method of file storage, and other unexpected situations. The cache will one day fail, database replication will one day break, the queue will one day refuse writes, and the power supply will one day burn out. By Murphy's Law, if you do not plan for these, the website will end in tragedy sooner or later.
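Two of the failure cases above, the cache avalanche and the queue retry strategy, have common mitigations that fit in a short Python sketch. These are swapped-in, widely used techniques (randomized expiry times and exponential backoff), not something this article prescribes; all function names and default values are assumptions chosen for illustration.

```python
import random
import time


def jittered_ttl(base_ttl, spread=0.1):
    """Vary a cache TTL by up to +/- spread so keys do not all expire at once."""
    delta = base_ttl * spread
    return base_ttl + random.uniform(-delta, delta)


def backoff_delays(count, base_delay=0.5, max_delay=30.0):
    """Return `count` delays that double each time, capped at max_delay."""
    return [min(max_delay, base_delay * 2 ** i) for i in range(count)]


def retry(operation, attempts=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call operation() up to `attempts` times, waiting longer after each failure.

    The last failure is re-raised so the caller can park the message
    somewhere (a file, a dead-letter table) instead of losing it.
    """
    delays = backoff_delays(attempts - 1, base_delay, max_delay)
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(delays[attempt])
```

For example, a cache write could use jittered_ttl(300) rather than a fixed 300 seconds, so that entries cached at the same moment expire spread over about a minute. A queue write could be wrapped as retry(lambda: send(message)), where send stands for whatever hypothetical client call the queue provides.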
6. Server software
Linux, nginx, PHP, and MySQL are almost the standard stack. Beyond the name, you also have to choose the version. There are many Linux distributions. Unless you have special requirements, choose the one with the most users, the most active community, the most convenient configuration, and the most complete and up-to-date software packages, such as Debian or Ubuntu. As for RHEL and the like, are you really using software that only runs on RHEL? For everything else, nginx, PHP, MySQL, ActiveMQ, and so on, unless you have modified the software yourself or your program genuinely cannot run on the new version, the newer the version, the better. A newer version means more features, fewer bugs, and better performance. There will always be people who tell you, on hearsay, that the old version is stable. That kind of stability matters for special businesses. For a website written in PHP, most people have never changed a line of server software source code, and in most cases it can be upgraded smoothly to a new version. Upgrades with relatively large changes, such as jdk5 to jdk6 or python2 to python3, are still fairly rare. Read the ChangeLog, read the upgrade notes, and evaluate them against your own situation. The sooner you upgrade, the better. You do not want others to be writing programs for php6 while you are still on php4. Good open source projects handle upgrades very responsibly. Read the documentation and don't be afraid.
With the six points above taken care of, we now have the operating environment, the basic architecture skeleton, and the backup and switchover plan, and it is time to start designing and developing. There are countless things to cover in development, and the next article will start with some of the key points.
Original address: http://zhiyi.us/internet/thinking-twice-before-building-your-site-one.html
When reprinting, please credit zhiyi.us as the source.