The Future of XML
Now you know XML. True, the structure can get a bit complex, and the DTD has various options for defining what the document can contain. But that's not all.
Consider an industry for which data exchange is important, such as banking. Banks use proprietary systems to track transactions internally, but if they use a common XML format on the Web, they can describe transaction information to another institution or to an application (such as Quicken or MS Money). Of course, they can also present the data on Web pages. FYI: This tag set already exists. It's called OFX, Open Financial Exchange.
Under certain circumstances, if IE 4 on a PC encounters a <SOFTPKG> tag, it launches a function that gives the user the opportunity to update installed software. If you are using Windows 98, you may have seen this happen without knowing that it is an XML application.
Here we have three XML applications that look different from the adding machines, typewriters and pencils Andy Grove saw in the 1970s. But similar to the applications that eventually appeared on PCs, the benefits of XML can be described generally as: "When you use human- and machine-readable tags to describe your data, good things happen."
What are those good things? I have no idea. But I also don't know what the next generation of programs on my PC will look like. As long as the data is tagged in this way, all kinds of applications can be built around it.
Are you starting to think about how far it might expand?
We have a lot of practical applications of XML to talk about, and I'll be covering them in the near future. Since we are all Internet users, the first will be XSL (Extensible Style Language).
By the way, this recipe really is my mom's, and it's outstanding. If you make it, add another half cup of grated coconut.
I'm writing this because I genuinely care about what you think of me. My concern is this: you read my introduction to XML and are ready to start writing your own XML documents, so you start looking for an established DTD to represent your information. You find one, and it contains lines like these:
<!ATTLIST fn
%attr.lang;
value CDATA #FIXED "TEXT">
<!ENTITY % attr.img "
img.type CDATA #REQUIRED
img.data ENTITY #REQUIRED">
Right off the bat you think Jay must be an idiot. He didn't say anything about ATTLIST and ENTITY, whatever they are.
So let's talk about them. First, a little patience.
The lines above may look intimidating, but there's really nothing to them. They are used in DTDs to define attributes and entities for XML documents. Anyone who knows HTML already knows both. Attributes are the entries inside HTML tags that describe the tag more precisely. In the frequently seen <img src="my.gif" height="20" width="20">, there are three attributes: src, height, and width. As you'll see later, using attributes in XML documents is very similar.
There's nothing new about entities either. If you've used &amp;, you already know the basics. A string that starts with & and ends with a semicolon stands for another character or set of characters. (A complete list of ISO entities is available here.)
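To make the idea concrete, here is a minimal sketch, assuming Python and its standard xml.etree.ElementTree module (my choice for illustration, not something this article depends on). The <item> text and the function name are made up. It shows the parser swapping two of XML's built-in entities for the characters they stand for.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippet: &amp; and &lt; are two of XML's built-in entities.
SNIPPET = "<item>salt &amp; pepper, &lt; 1 pinch</item>"

def entity_text(xml_text):
    """Return the element text after the parser has replaced the entities."""
    return ET.fromstring(xml_text).text

print(entity_text(SNIPPET))  # salt & pepper, < 1 pinch
```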
Of course, attributes and entities in XML have other functions. This inevitably introduces syntax, although not too much. Once you know this, working with XML documents will be effortless.
Simplified Recipes
If you read my introduction to XML, you'll remember that the ingredients in a recipe are represented by simple tags, such as <item>2 cups flour</item>. After writing that article, I was roaming around the web and found another XML document about recipes. The recipe elements are as follows:
<ingredient quantity="2" units="cups">flour</ingredient>
This approach has a practical benefit: it makes the data easier to control. With the first approach, the <item> tag holds a bunch of different information. If I wanted to extract a list of ingredients without the amount of each one, I couldn't do it.
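As a rough sketch of that benefit, again assuming Python's standard xml.etree.ElementTree module: the <recipe> wrapper, the sugar line, and both function names are invented for the example. With the amounts held in attributes, the plain ingredient list falls out directly, and the full line can still be rebuilt when it is wanted.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe fragment; the <recipe> wrapper and the sugar line are
# made up for illustration.
RECIPE = """<recipe>
  <ingredient quantity="2" units="cups">flour</ingredient>
  <ingredient quantity="1" units="cup">sugar</ingredient>
</recipe>"""

def ingredient_names(xml_text):
    """List the ingredients only, ignoring quantity and units."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("ingredient")]

def ingredient_lines(xml_text):
    """Rebuild the full '2 cups flour' style line from the attributes."""
    root = ET.fromstring(xml_text)
    return [f"{el.get('quantity')} {el.get('units')} {el.text}"
            for el in root.iter("ingredient")]

print(ingredient_names(RECIPE))  # ['flour', 'sugar']
print(ingredient_lines(RECIPE))  # ['2 cups flour', '1 cup sugar']
```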
I can achieve similar functionality using the following structure:
<item>flour
<quantity>2</quantity>
<units>cups</units>
</item>
This can be handled, but there are two problems. First, the item element contains mixed content: text and other markup. I quickly discovered that this structure should be avoided whenever possible. Second, these tags have almost no meaning on their own. It's hard to imagine a situation where you would have units but no actual ingredient. Because these items simply describe the ingredient, I prefer to treat them as attributes.
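The mixed-content problem shows up as soon as a program reads the nested version. A small sketch, assuming Python's xml.etree.ElementTree and a closing </item> tag on the fragment; the function name is hypothetical. The ingredient name arrives as loose text tangled up with the line break before the first child element.

```python
import xml.etree.ElementTree as ET

ITEM = """<item>flour
<quantity>2</quantity>
<units>cups</units>
</item>"""

def describe(xml_text):
    """Pull the name, quantity, and units out of the nested structure."""
    item = ET.fromstring(xml_text)
    # The name is loose text mixed in with the child elements, so it comes
    # back as "flour\n" and has to be trimmed by hand.
    name = item.text.strip()
    quantity = item.findtext("quantity")
    units = item.findtext("units")
    return name, quantity, units

print(describe(ITEM))  # ('flour', '2', 'cups')
```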
The first thing to note is that the attribute names, quantity and units, are only meaningful when processed by an application that can interpret them.
Before attributes can be included in a valid document, the DTD must be told to allow them. For the ingredient element above, we only need to include the following code in the DTD:
<!ELEMENT ingredient (#PCDATA)>
<!ATTLIST ingredient quantity CDATA #REQUIRED>
<!ATTLIST ingredient units CDATA #REQUIRED>
The first line should look familiar: it is a standard element declaration of the kind you'll see in any DTD. Each ATTLIST line contains the following pieces of information, in order:
<!ATTLIST ingredient quantity CDATA #REQUIRED>
The first item, ingredient, is the element to which the attribute is attached.
<!ATTLIST ingredient quantity CDATA #REQUIRED>
The second item, quantity, defines the attribute name.
<!ATTLIST ingredient quantity CDATA #REQUIRED>
The third item sets the attribute type. CDATA stands for character data, meaning the processor takes the attribute's value as plain text.
<!ATTLIST ingredient quantity CDATA #REQUIRED>
The last part defines the default value of the attribute. You can use an actual value, such as "3". Then, if the attribute is left out of an element in the XML document, its value will be 3. A value entered in the document overrides the default.
In the example above I did not set a specific default, but used the XML keyword #REQUIRED. It tells the processor that this attribute must contain a value. If it is left out, a validating processor will reject the document.
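Here is a small sketch of a default value at work, assuming Python's standard xml.parsers.expat module. The document is hypothetical: the DTD is embedded in the DOCTYPE so the parser can see it, and quantity gets a default of "3" in place of #REQUIRED. Expat is a non-validating parser, so it fills in defaults but would not complain about a missing #REQUIRED attribute.

```python
import xml.parsers.expat

# Hypothetical document: quantity has a default of "3" instead of #REQUIRED.
DOC = """<?xml version="1.0"?>
<!DOCTYPE recipe [
<!ATTLIST ingredient quantity CDATA "3">
]>
<recipe>
<ingredient>eggs</ingredient>
<ingredient quantity="2">cups flour</ingredient>
</recipe>"""

def quantities(xml_text):
    """Collect the quantity the parser reports for each ingredient."""
    seen = []

    def start(name, attrs):
        if name == "ingredient":
            seen.append(attrs.get("quantity"))

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(xml_text, True)
    return seen

print(quantities(DOC))  # ['3', '2']
```

The first ingredient never mentions quantity and picks up the default; the second supplies its own value, which wins.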
Two more keywords can take the place of a default value. The first is #FIXED, for when the attribute keeps the same value throughout the document. Suppose I define attributes for an image tag, and all images are the same size, say 100 by 50 pixels. I can define the attributes like this in the DTD:
<!ATTLIST picture length CDATA #FIXED "100 px">
<!ATTLIST picture width CDATA #FIXED "50 px">
The other keyword is #IMPLIED, indicating that the attribute can contain a value or be left out.
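To see how a parser reads these keywords, here is a sketch that assumes Python's xml.parsers.expat module and its AttlistDeclHandler callback. The caption and source attributes, the sample document, and the function name are invented for the example. The callback receives the same four pieces of information an ATTLIST line carries, plus a flag from which the keyword can be worked out.

```python
import xml.parsers.expat

# Hypothetical DTD: caption and source are made up to show all three keywords.
DOC = """<!DOCTYPE picture [
<!ATTLIST picture length CDATA #FIXED "100 px">
<!ATTLIST picture width CDATA #FIXED "50 px">
<!ATTLIST picture caption CDATA #IMPLIED>
<!ATTLIST picture source CDATA #REQUIRED>
]>
<picture source="cake.gif"/>"""

def declared_attributes(xml_text):
    """Return (element, attribute, type, keyword, default) per declaration."""
    found = []

    def attlist(elname, attname, att_type, default, required):
        if required and default is not None:
            keyword = "#FIXED"      # a value is given and cannot be changed
        elif required:
            keyword = "#REQUIRED"   # no default; the document must supply one
        elif default is None:
            keyword = "#IMPLIED"    # no default; the attribute is optional
        else:
            keyword = "default"     # plain default value
        found.append((elname, attname, att_type, keyword, default))

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = attlist
    parser.Parse(xml_text, True)
    return found

for row in declared_attributes(DOC):
    print(row)
```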
Let's look at attribute types.
If you decide to write your own DTD, you will want a book that explains all the combinations possible in an ATTLIST statement. But if you borrow a DTD, you may only need to know CDATA and three other attribute types.
The first one is ID. It requires that the attribute's value is not repeated in the document. Anyone who has used a database knows the need for unique identifiers. The DTD ATTLIST statement looks like this:
<!ATTLIST element_name attribute_name ID #REQUIRED>
It is difficult to imagine the ID attribute type without the default value of #REQUIRED. In that case, any duplicate or empty IDs will force the processor to return an error. The ID must start with a letter or underscore and cannot contain any spaces.
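A validating processor does this check for you. Purely to illustrate the rule, here is a hand-rolled sketch in Python; the attribute name "id", the sample document, and the function name are all hypothetical, and the pattern is a simplified, ASCII-only version of the naming rule.

```python
import re
import xml.etree.ElementTree as ET

# Simplified, ASCII-only version of the naming rule: start with a letter or
# underscore, then letters, digits, periods, hyphens, or underscores.
ID_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*")

# Hypothetical document with one repeated ID and one malformed ID.
DOC = ('<recipes><recipe id="r1"/><recipe id="r2"/>'
       '<recipe id="r1"/><recipe id="2 go"/></recipes>')

def id_problems(xml_text, attribute="id"):
    """Return a message for every malformed or repeated ID value."""
    problems, seen = [], set()
    for el in ET.fromstring(xml_text).iter():
        value = el.get(attribute)
        if value is None:
            continue  # this sketch only looks at elements that carry the attribute
        if not ID_PATTERN.fullmatch(value):
            problems.append(f"bad ID: {value!r}")
        elif value in seen:
            problems.append(f"duplicate ID: {value!r}")
        seen.add(value)
    return problems

print(id_problems(DOC))  # ["duplicate ID: 'r1'", "bad ID: '2 go'"]
```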
The NMTOKEN type follows similar naming rules (no spaces), although a value may begin with a digit, and duplicates are allowed. It serves as a safeguard when passing data to an application. Most programming languages, including Java and JavaScript, cannot have spaces in identifiers. In most cases, it's best to make sure attribute values comply with those rules.
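As a rough illustration of what a name token looks like, here is a sketch in Python. The pattern is a simplified, ASCII-only approximation and the function name is hypothetical. Note that the XML specification lets a name token begin with any name character, including a digit.

```python
import re

# Simplified, ASCII-only approximation of an XML name token: letters, digits,
# periods, hyphens, underscores, and colons, with no whitespace.
NMTOKEN_PATTERN = re.compile(r"[A-Za-z0-9._:-]+")

def is_nmtoken(value):
    """Return True when the whole value is a single name token."""
    return NMTOKEN_PATTERN.fullmatch(value) is not None

print(is_nmtoken("cups"))      # True
print(is_nmtoken("2nd"))       # True
print(is_nmtoken("two cups"))  # False, because of the space
```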
Finally, there are enumerated types, which do not require a specific keyword. Instead, list the allowed values in parentheses, separated by the "|" symbol, for example:
<!ATTLIST element_name sibling (brother | sister) #REQUIRED>
This approach can be used if there are a limited number of possible attribute values.
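A sketch of how an application might read the allowed values, assuming Python's xml.parsers.expat module. The element name person, the sample document, and the function name are made up for illustration. Expat hands over the enumeration as a single string in parentheses, which this code splits apart.

```python
import xml.parsers.expat

# Hypothetical document; the element name "person" is made up.
DOC = """<!DOCTYPE person [
<!ATTLIST person sibling (brother | sister) #REQUIRED>
]>
<person sibling="sister"/>"""

def allowed_values(xml_text, attribute):
    """Read the enumerated values declared for the given attribute."""
    values = []

    def attlist(elname, attname, att_type, default, required):
        # Enumerated types arrive as one parenthesized, "|"-separated string.
        if attname == attribute and att_type.startswith("("):
            values.extend(v.strip() for v in att_type.strip("()").split("|"))

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = attlist
    parser.Parse(xml_text, True)
    return values

choices = allowed_values(DOC, "sibling")
print(choices)
print("cousin" in choices)
```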
If today's lesson hasn't bored you yet, keep reading!