No matter what content management system or Web application framework you use to develop your Web site, some basic elements should be covered. A polished user interface and rich content are great, but before you get there, the first priority should be to provide basic documents that users can find and that clearly express the purpose of the site.

Introduction

Several standard files belong on every Web site, yet sites often neglect them. Most of these documents are matters of convention rather than technical requirements, but failing to provide them can lead a site astray. Users can find a URL that they are able to guess, but it is usually hard for them to find anything else by guessing. This article briefly describes each of these standard documents.

Exactly how a given resource is provided depends on which Web server layer and Web application layer are used. On a "traditional", nearly static server such as Apache, these resources are likely to be text files on the server. In other configurations, they may be entries in a database, lines in a configuration file, classes in the server process, and so on. This article focuses on what users end up seeing rather than on how to make it happen.

404.html

When users use your Web site, they will inevitably look for resources that don't exist. Such requests are most often caused by mistyped URLs, but outdated links, back-end misconfiguration, and URLs broken at various points along the way should not be underestimated either. When a resource is unavailable, it is good practice to provide some kind of fallback page that helps the user navigate to other useful pages. A plain "not found" tells the user that the resource is unavailable, but it does not help with the question of what to do next.
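
As a rough illustration of such a fallback page, the following Python sketch uses the standard library's http.server module. The PAGES table, the page text, and the names not_found_page, respond, SiteHandler, and serve are all hypothetical and chosen only for this example; a real site would do the equivalent with whatever its Web server or framework provides. Note that the response for a missing page carries a genuine 404 status code rather than 200.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical pages this toy site knows about (path -> HTML body).
PAGES = {
    "/index.html": "<h1>Home</h1>",
    "/about.html": "<h1>About this site</h1>",
    "/contact.html": "<h1>Contact</h1>",
}


def not_found_page(path):
    """Build a fallback page that points the user at pages that do exist."""
    links = "".join(
        f'<li><a href="{escape(p)}">{escape(p)}</a></li>' for p in sorted(PAGES)
    )
    return (
        "<h1>404: Not Found</h1>"
        f"<p>No resource exists at {escape(path)}.</p>"
        f"<p>You may be looking for one of these:</p><ul>{links}</ul>"
    )


def respond(path):
    """Return (status_code, html_body) for a requested path."""
    if path in PAGES:
        return 200, PAGES[path]
    return 404, not_found_page(path)


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = respond(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)  # a real 404 status for missing pages
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Run the toy site locally until interrupted (assumed port: 8000)."""
    HTTPServer(("127.0.0.1", port), SiteHandler).serve_forever()
```

Calling serve() and then requesting a mistyped path such as /abuot.html should return the list of known pages together with a 404 status, so both the user and the user's tools learn that the resource is missing.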

Warning: Too many Web sites are incorrectly configured to send "soft 404" messages when they create a custom 404.html (or whatever other mechanism the Web server uses to publish a custom "not found" message). In other words, they send a page with a regular "200 OK" header whose text merely states somewhere that the resource is unavailable and perhaps (though not often) mentions a "404 Error". This should be avoided. Instead, save users (and their Web browsers and other tools) the trouble and send the correct status header.

about.html

So, why does your Web site exist? Ideally, the home page answers this question. More likely, though, the home page does not provide this kind of information and simply lets users log in, highlights the site's "selling points", displays some bells and whistles, and so on. You will probably also let users navigate to the About page from the home page; even so, be sure to make that information available at http://mysite.example.com/about.html. Some people are used to looking for this type of information at this location.

A good about.html page should give an overview of what the site does, why it was created, and why users should care about it, and it may also include a few links that help users navigate back to the site's core functionality. This page doesn't need to be, and usually shouldn't be, flashy. Just keep it pragmatic and accurate so users can take advantage of all the site has to offer.

contact.html

So, how do users contact you? As with about.html, users can probably reach this information with a few clicks from your existing home page; provide it at this URL as well.

copyright.html

Who owns the copyright to the Web site? The content may well belong to you, but who are you? An individual? A company? A partnership? A government agency? If the content is in the public domain or covered by a free content license, you should tell users that as well.

Nowadays, almost everything carries its own copyright; if your content follows different principles, let users know. Too few Web sites bother to provide this kind of information, but why not add it to your own? Some users will always pay attention to it.

Obviously, different pages or resources may carry different copyright information. Use this page to tell users how to determine those individual differences. If trademark issues apply, state them here as well.

index.html (and index.htm)

Not every Web server actually uses an index.html file to describe its home page. Depending on the setup, techniques such as URL rewriting or dynamic generation based on path names may be used instead. But users don't care about these details! Just let http://www.aaa.com/index.html point to the home page, even if you have to use a simple HTML redirect to do that. While you are at it, make the old .htm extension work too. If that still doesn't feel like enough, do the same with index.cgi.

index.rss

Much Web content is available via RSS. Although this approach does not suit every Web site, it works for most. It can make perfect sense to keep content that depends on user-specific configuration options, logins, or payment out of the feed; RSS does not have to cover everything. That said, if something can be made available as RSS, go ahead and do it. Perhaps index.rss presents nothing more than "advertisement" content, possibly accompanied by some boilerplate on how to get the most out of RSS feeds. Or perhaps it simply explains why RSS isn't relevant to your Web site.

privacy.html

Whenever you collect user information (even just usernames or traffic logs), tell users what you plan to do with that information.

The legal issues surrounding the rights and responsibilities of Web site creators and users are complex. Still, users notice when a site takes their personal privacy into consideration. This is also a good point at which to talk to a lawyer about how to handle your user data.

robots.txt

If you do not want every resource on your Web site to be indexed by automated tools, say so in the robots.txt file. If you do want your content to be indexed, state that too. The Robots Exclusion Standard is advisory and forces no one to comply: if you really don't want something to be visible, don't put it on your site, or make sure it sits behind adequate permission protection. However, all major legitimate Web crawlers comply with the requirements in robots.txt, so be as clear as possible about your intentions.

security.html

A security.html page is not mandatory. But if the site has security concerns (for example, it collects any sensitive information from users), it's a good idea to document the security process, or at least give a rough outline of it. Provide contact information on this page in case users have questions or want to suggest improvements. How users find this information should follow the overall organization of the site's navigation options; even so, you might as well also put the resource at this URL.

sitemap

How to display a map of an entire Web site is not yet fully standardized. It is always useful to provide some kind of sitemap, but how detailed it should be depends on how dynamic your site is (and in what way it is dynamic). What you show users also depends on the intent of the site. For example, if a user does not have permission to use resource X, letting that user know that resource X exists may not be appropriate at all. Provide something based on your own judgment and the circumstances.

For many sites, providing a sitemap is mostly a courtesy to automated mechanisms such as search engines. Google has published a convention that builds on the robots.txt convention. In summary, you create an XML file that lists all the resources the site provides. This works somewhat like an "include list" that complements the "exclude list" of robots.txt.

email address

Thinking only about what is on the Web isn't enough. Sometimes a Web site's navigation tools aren't quite what users want (or some users may not understand your elegant design), so it's a good idea to let users contact you via email as well. Be sure to post your contact information prominently in contact.html or elsewhere on your Web site. But also make sure that messages sent to generic email addresses reach the right person. This includes at least [email protected], [email protected], and [email protected]. For the "older folks" out there, you may want email sent to [email protected] to be routed to the appropriate destination as well (but probably not to "root", for security reasons). Include a short note describing the email forwarding, in keeping with the purpose of the site. Such email addresses are as easy to provide as symbolic links in a Web server directory.
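
One way to check a site against the files described in this article is a small script that requests each conventional path and records the status code it gets back. The Python sketch below is only that, a sketch: STANDARD_PATHS is read off the section titles above, while fetch_status, audit, and the made-up path /no-such-page-for-audit.html are hypothetical names introduced here. The sitemap is left out because no fixed name for it is given.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

# Conventional paths taken from the section titles of this article.
STANDARD_PATHS = [
    "/about.html",
    "/contact.html",
    "/copyright.html",
    "/index.html",
    "/index.htm",
    "/index.rss",
    "/privacy.html",
    "/robots.txt",
    "/security.html",
]


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the host is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code  # error responses such as 404 still carry a status code
    except URLError:
        return None


def audit(base_url, fetch=fetch_status):
    """Map each standard path to the status code the site returns for it."""
    base = base_url.rstrip("/")
    report = {path: fetch(base + path) for path in STANDARD_PATHS}
    # A made-up path should yield a real 404, not a "soft 404" sent with 200.
    report["(missing page)"] = fetch(base + "/no-such-page-for-audit.html")
    return report
```

Calling audit("http://mysite.example.com") performs real network requests; passing a different fetch function makes it possible to try the logic without touching the network. Because urlopen follows redirects, a path that redirects to the home page is reported with the status of the final page.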