outline:
Introduction
1. Terms related to XML documents
2. Terms related to DTD
Introduction
The hardest part of learning XML for beginners is the large number of new terms and concepts. XML is itself a new technology that is still developing and changing, and standards bodies and major companies (Microsoft, IBM, Sun, and others) keep introducing their own proposals and standards, so it is no surprise that new concepts appear everywhere. No authoritative body in China has officially named these terms, and most Chinese textbooks on XML translate them according to each author's own understanding. Some translations are right and some are wrong, which further hinders our understanding and learning of these concepts.
The explanations of XML terms below are likewise the author's own understanding and translation. Ajie has based them on the XML 1.0 specification published by the W3C and on related official documentation, so they should be essentially correct. If you want to read further, the sources and links to relevant resources are listed at the end of this article. Now, down to business:
1. Terms related to XML documents
What is an XML document? You already know what an HTML source file is. An XML document is the XML counterpart: a source file written with XML tags. Like HTML files, XML documents are plain text files that you can create and edit with Notepad. XML documents use the file extension .xml, for example myfile.xml. You can also open an .xml file directly in IE 5.0 or later, but what you see is the XML source, not a rendered page. Try saving the following code as myfile.xml:
<?xml version="1.0" encoding="GB2312"?>
<myfile>
<title>XML Easy Learning Manual</title>
<author>ajie</author>
<email>[email protected]</email>
<date>20010115</date>
</myfile>
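As a rough illustration of how a program reads such a document, the Python sketch below parses the same kind of content with the standard library's xml.etree.ElementTree. It makes a few simplifying assumptions: the document is embedded as a string instead of being read from myfile.xml, the encoding attribute is left out because the string is already decoded, and only some of the fields are included.

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the sample document, embedded as a string for this sketch.
MYFILE_XML = """<?xml version="1.0"?>
<myfile>
<title>XML Easy Learning Manual</title>
<author>ajie</author>
<date>20010115</date>
</myfile>"""

root = ET.fromstring(MYFILE_XML)

# Collect each child element's tag name and text content.
fields = {child.tag: child.text for child in root}

print(root.tag)
for name, value in fields.items():
    print(f"{name}: {value}")
```

The tag names come straight from the document, which is the point of XML: the program discovers the structure the author invented.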
An XML document can contain three parts:
1. An XML declaration;
2. A document type declaration (optional);
3. The content, marked up with XML tags.
Example:
<?xml version="1.0"?>
<!DOCTYPE filelist SYSTEM "filelist.dtd">
<filelist>
<myfile>
<title>QUICK START OF XML</title>
<author>ajie</author>
</myfile>
...
</filelist>
The first line, <?xml version="1.0"?>, is the XML declaration. The second line states that this document uses filelist.dtd to define its document type. Everything from the third line onward is the content itself.
Let’s understand the relevant terms in XML documents:
1.Element:
We already know elements from HTML: they are the basic building blocks of an HTML document, and the same is true in XML. An element consists of a start tag, an end tag, and the content between them, like this: <author>ajie</author>
The only difference is that in HTML the set of tags is fixed, whereas in XML you create the tags yourself.
2.Tag
Tags are used to mark up elements. In XML, tags must appear in pairs, surrounding the data (an element with no content may instead be written as a single empty-element tag such as <br/>). The name in the tag is the name of the element. For example, in an element like this:
<author>ajie</author>
<author> is the start tag and </author> is the end tag.
3.Attribute:
What are attributes? Look at this HTML code: <font color="red">word</font>. Here color is one of the attributes of font.
Attributes give further information about a tag. A tag can have several attributes; font, for example, also has a size attribute. Attributes in XML work the same way as in HTML: each attribute has a name and a value, and it is written inside the start tag. Example:
<author sex="female">ajie</author>
As with tags, you define the attributes in XML yourself. We recommend using attributes sparingly and turning them into child elements instead. For example, the code above can be rewritten like this:
<author>ajie
<sex>female</sex>
</author>
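The reason is that attributes are harder to extend and harder for programs to manipulate: an attribute holds a single text value, while a child element can later gain structure of its own.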
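To make the two styles concrete, here is a small Python sketch (using the standard library's xml.etree.ElementTree, which is my choice for illustration and not something this article prescribes) that reads the same piece of information once from an attribute and once from a child element. The variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Variant 1: the extra information is stored as an attribute.
with_attribute = ET.fromstring('<author sex="female">ajie</author>')

# Variant 2: the same information is stored as a child element.
with_child = ET.fromstring("<author>ajie<sex>female</sex></author>")

sex_from_attribute = with_attribute.get("sex")
sex_from_child = with_child.findtext("sex")

# The author's name is the text that comes before the first child element.
name = with_child.text.strip()

print(sex_from_attribute, sex_from_child, name)
```

Both variants are equally easy to read here; the difference shows up later, when the child element needs to grow children or attributes of its own.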
4.Declaration
An XML document should begin with an XML declaration on its first line. The declaration states that the document is an XML document and which version of the XML specification it follows. An XML declaration looks like this:
<?xml version="1.0"?>
5.DTD (Document Type Definition)
A DTD defines the elements and attributes of an XML document and the relationships between the elements.
A DTD file can be used to check whether the structure of an XML document is correct, but an XML document does not have to have one. DTD files are described in detail in their own section below.
6.Well-formed XML
A document that follows the syntax rules of the XML specification is called "well-formed". If all your markup strictly follows the XML specification, your document does not need a DTD file to define it.
A well-formed document should start with an XML declaration, such as:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
The declaration does three things, and its parts must appear in this order. First, it states the XML version the document complies with, currently 1.0. Second, it names the character encoding used in the document; the default is UTF-8, and a Chinese document saved in GB2312 must declare encoding="GB2312". Third, standalone="yes" states that the document is "independent" and does not rely on an external DTD file.
A well-formed XML document must have exactly one root element, which is the first element after the declaration. All other elements are nested inside this root element.
The content of a well-formed XML document must follow XML syntax. (XML syntax is explained in detail in the next chapter.)
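A quick way to see what "well-formed" means in practice is to hand a few strings to a parser and watch which ones it rejects. The sketch below does this with Python's standard xml.etree.ElementTree; the helper name is_well_formed and the sample strings are my own, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True

good = "<myfile><title>XML Easy Learning Manual</title></myfile>"
unclosed = "<myfile><title>XML Easy Learning Manual</myfile>"  # <title> is never closed
two_roots = "<myfile/><myfile/>"                                # more than one root element

for sample in (good, unclosed, two_roots):
    print(is_well_formed(sample), sample)
```

Note that no DTD is involved anywhere: well-formedness is purely a matter of syntax.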
7.Valid XML
An XML document that follows the XML syntax rules and also conforms to the rules of its DTD is called a valid XML document. Comparing "well-formed XML" with "valid XML", the main difference is that a well-formed document only has to comply with the XML specification, while a valid document must in addition conform to its own Document Type Definition (DTD).
The process of comparing an XML document with its DTD to see whether it follows the DTD's rules is called validation. It is usually carried out by software called a parser.
A valid XML document should also start with an XML declaration, for example:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
Unlike the earlier example, standalone is set to "no" here because the document must be used together with its DTD. The DTD is referenced with a document type declaration of this form:
<!DOCTYPE type-of-doc SYSTEM/PUBLIC "dtd-name">
where:
"!DOCTYPE" means you are declaring the document type;
"type-of-doc" is the name of the document type. You choose it, but it must match the name of the document's root element, and it is usually also the name of the DTD file;
"SYSTEM/PUBLIC" means you use one of these two keywords, not both. SYSTEM refers to a private DTD file identified by its URL, while PUBLIC refers to a public DTD, identified by a public identifier that is followed by its URL;
"dtd-name" is the URL and name of the DTD file. DTD files conventionally have the suffix ".dtd".
Continuing the example above, the document should begin like this:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE filelist SYSTEM "filelist.dtd">
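If you are curious how a program sees the DOCTYPE line, the following Python sketch reads it back with the standard library's xml.dom.minidom. This is only an illustration under some assumptions: the document body is borrowed from the earlier filelist example, and minidom does not validate or fetch filelist.dtd; it merely records what the declaration says.

```python
from xml.dom import minidom

DOCUMENT = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE filelist SYSTEM "filelist.dtd">
<filelist>
<myfile>
<title>QUICK START OF XML</title>
<author>ajie</author>
</myfile>
</filelist>"""

doc = minidom.parseString(DOCUMENT)

doctype_name = doc.doctype.name          # the "type-of-doc" part
dtd_location = doc.doctype.systemId      # the "dtd-name" part
root_name = doc.documentElement.tagName  # the actual root element

print(doctype_name, dtd_location, root_name)
```

In this document the document type name and the root element name are both filelist.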
2. Terms related to DTD
We briefly mentioned above what a DTD is. A DTD is an effective way to make sure that an XML document is structured correctly: by comparing the XML document with the DTD, you can see whether the document conforms to the specification and whether its elements and tags are used correctly. A DTD contains the rules for defining elements, the rules for the relationships between elements, the attributes each element may use, and the entities or symbols that may be used.
A DTD file is also a plain text file, with the suffix .dtd. For example: myfile.dtd.
Why use DTD files? My understanding is that they serve sharing and data exchange over the network. The biggest benefit of a DTD is that it can be shared (this is what the PUBLIC keyword in the DOCTYPE declaration above is for). For example, if two people in the same industry but in different regions use the same DTD as the specification for their documents, their data can easily be exchanged and shared. Anyone else on the Internet who wants to contribute data only needs to create documents that follow the public DTD.
A large number of ready-made DTD files already exist. Aimed at different industries and applications, they establish common rules for elements and tags. You do not need to create these from scratch; just add the new tags you need on top of them.
Of course, if you prefer, you can create your own DTD, which may fit your documents better. Creating your own DTD is also simple; often you only need to define four or five elements.
There are two ways to use a DTD:
1. A DTD contained directly within the XML document
All you need to do is put the declarations inside the DOCTYPE declaration.
Suppose we have an XML document:
<?xml version="1.0" encoding="GB2312"?>
<myfile>
<title>XML Easy Learning Manual</title>
<author>ajie</author>
</myfile>
We just insert the following code after the first line:
<!DOCTYPE myfile [
<!ELEMENT myfile (title, author)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ENTITY copyright "Copyright 2001, Ajie.">
]>
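To see that a parser really does read these internal declarations, here is a Python sketch using the standard library's expat bindings and ElementTree. The sample combines the document and the internal DTD from above (including a declaration for the root element myfile) and adds an &copyright; reference inside author purely for illustration; that reference is my own addition, not part of the example document.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE myfile [
<!ELEMENT myfile (title, author)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
<!ENTITY copyright "Copyright 2001, Ajie.">
]>
<myfile>
<title>XML Easy Learning Manual</title>
<author>ajie &copyright;</author>
</myfile>"""

declared_elements = []

def on_element_decl(name, model):
    # expat calls this once for every <!ELEMENT ...> declaration it reads.
    declared_elements.append(name)

parser = expat.ParserCreate()
parser.ElementDeclHandler = on_element_decl
parser.Parse(DOCUMENT, True)

# ElementTree replaces the internal entity with its text while parsing.
author_text = ET.fromstring(DOCUMENT).findtext("author")

print(declared_elements)
print(author_text)
```

Neither parser used here validates the document against the declarations; they only read them and expand the entity.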
2. Referencing a separate DTD file
Save the DTD as a .dtd file, and then reference it in the DOCTYPE declaration. For example, save the following code as myfile.dtd:
<!ELEMENT myfile (title, author)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT author (#PCDATA)>
Then reference it from the XML document by inserting this line after the first line:
<!DOCTYPE myfile SYSTEM "myfile.dtd">
As you can see, referencing an external DTD works much like referencing an external JavaScript file from an HTML page. How to write DTDs is covered in the next chapter, together with the syntax of XML documents.
Let’s learn about the terminology related to DTD:
1.Schema
A schema is a description of the rules that data must follow. A schema does two things:
a. It defines the data types of elements and the relationships between elements;
b. It defines the types of content that an element can contain.
A DTD is one kind of schema for XML documents.
2.Document Tree
We already mentioned the "document tree" in Chapter 2. It is a visual representation of the hierarchical structure of a document's elements. A document tree has a root element, which is the top-level element (that is, the first element after the XML declaration). Look at this example:
<?xml version="1.0"?>
<filelist>
<myfile>
<title>...</title>
<author>...</author>
</myfile>
</filelist>
The example above is arranged in three levels that form a "tree", in which <filelist> is the root element. In an XML document, the first element is the root element; when a DTD is used, the root is the element named in the DOCTYPE declaration.
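The tree shape becomes visible if we walk the elements and indent each one by its depth. The Python sketch below does this with the standard library's xml.etree.ElementTree; the function name outline is invented for this example.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """<?xml version="1.0"?>
<filelist>
<myfile>
<title>...</title>
<author>...</author>
</myfile>
</filelist>"""

def outline(element, depth=0):
    """Return (depth, tag) pairs for element and all of its descendants."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows

tree_rows = outline(ET.fromstring(DOCUMENT))

# Print one element per line, indented by its level in the tree.
for depth, tag in tree_rows:
    print("  " * depth + tag)
```

Depth 0 is the root element, and each further level of nesting adds one to the depth.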
3.Parent Element/Child Element
A parent element is an element that contains other elements, and the elements it contains are called its child elements. In the tree above, <myfile> is a parent element whose child elements are <title> and <author>, and <myfile> is in turn a child element of <filelist>. Elements at the lowest level that contain no child elements, such as <title>, are also called "leaf elements".
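These relationships can also be computed. The following Python sketch, again using xml.etree.ElementTree as an assumed tool, builds a child-to-parent table and lists the elements that have no children. Keying the table by tag name only works because every tag in this small example is unique.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<filelist><myfile><title>...</title><author>...</author></myfile></filelist>"
)

# ElementTree elements do not store a link to their parent, so build one by hand.
parent_of = {child.tag: parent.tag for parent in root.iter() for child in parent}

# Elements with no child elements sit at the ends of the branches.
leaves = [element.tag for element in root.iter() if len(element) == 0]

print(parent_of)
print(leaves)
```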
4.Parser
A parser is a software tool that reads XML documents and checks whether they are well-formed and, optionally, whether they comply with a DTD.
XML parsers fall into two categories. A "non-validating parser" only checks whether the document follows the XML syntax rules and builds the document tree from the element tags. A "validating parser" not only checks the document's syntax and builds the tree, but also checks whether the elements and tags you use comply with the corresponding DTD.
A parser can be used on its own or as part of an editor or a browser. In the list of related resources below, I have listed some of the currently popular parsers.
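The difference between the two categories can be sketched in a few lines of Python. Please treat this as a toy, not as a real validating parser: the syntax check uses the standard library's xml.etree.ElementTree, while the "validation" step is a hand-written stand-in for the single rule <!ELEMENT myfile (title, author)>. The names EXPECTED_CHILDREN and check are invented for the example, and a real validating parser would read the rules from the DTD itself.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for the rule <!ELEMENT myfile (title, author)>.
EXPECTED_CHILDREN = {"myfile": ["title", "author"]}

def check(text):
    """Return 'not well-formed', 'well-formed only', or 'valid' (per the toy rule)."""
    try:
        root = ET.fromstring(text)  # roughly what a non-validating parser does
    except ET.ParseError:
        return "not well-formed"
    # The extra step a validating parser adds: compare structure with the rules.
    for element in root.iter():
        expected = EXPECTED_CHILDREN.get(element.tag)
        if expected is not None and [c.tag for c in element] != expected:
            return "well-formed only"
    return "valid"

result_valid = check("<myfile><title>t</title><author>a</author></myfile>")
result_wf = check("<myfile><author>a</author><title>t</title></myfile>")
result_broken = check("<myfile><title>t</myfile>")

print(result_valid, result_wf, result_broken, sep="\n")
```

The second sample shows the interesting case: its syntax is fine, so a non-validating parser accepts it, but the children are in the wrong order for the rule.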
That concludes Chapter 3. We have learned some basic XML and DTD terms, but we still do not know how to write these files or what syntax they must follow. The next chapter focuses on the syntax for writing XML documents and DTDs. Please read on, thank you!