DELPHI's Atomic World (2)

Author: Eve Cole  Update Time: 2025-02-07 04:36:01

Section 2 TClass Atom

In the System.pas unit, TClass is defined like this:

TClass = class of TObject;

It means that TClass is the class of TObject. Because TObject itself is a class, TClass is the so-called class of classes.

Conceptually, TClass is the type of a class, that is, a class type. However, we know that a class in DELPHI is represented by a piece of VMT data. A class type can therefore be regarded as the type defined for VMT data items. In fact, it is a pointer type that points to VMT data!

In traditional C++, a type for classes cannot be defined. Once a class is compiled, it is fixed: its structural information has been converted into absolute machine code, and complete class information no longer exists in memory. Some higher-level object-oriented languages support dynamic access to and invocation of class information, but they often require a complex internal interpretation mechanism and more system resources. DELPHI's Object Pascal absorbs some of the excellent features of those high-level object-oriented languages while retaining the traditional advantage of compiling programs directly into machine code, which reconciles advanced features with program efficiency.

It is precisely because DELPHI retains complete class information in the application that it can provide advanced object-oriented features such as the as and is operators, which convert and identify classes at runtime. The VMT data of the class plays the key role here. Interested readers can read the two assembly routines AsClass and IsClass in the System unit, which implement the as and is operators, to deepen their understanding of classes and VMT data.
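
As a minimal sketch of what the is and as operators do at runtime, the following console program may help. The classes TAnimal, TDog and TCat are hypothetical names invented for this illustration, and the program assumes a Delphi-compatible compiler with the SysUtils unit available.

```pascal
program IsAsDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical classes, used only for this illustration.
  TAnimal = class(TObject)
  end;

  TDog = class(TAnimal)
  public
    function Speak: string;
  end;

  TCat = class(TAnimal)
  end;

function TDog.Speak: string;
begin
  Result := 'Woof';
end;

var
  Animal: TAnimal;
begin
  // The variable's static type is TAnimal; the runtime class is TDog.
  Animal := TDog.Create;
  try
    // "is" consults the class information kept at runtime.
    if Animal is TDog then
      Writeln('Animal is a TDog');
    if not (Animal is TCat) then
      Writeln('Animal is not a TCat');

    // "as" is a checked cast: it succeeds here...
    Writeln((Animal as TDog).Speak);

    // ...and raises EInvalidCast when the classes are incompatible.
    try
      Writeln((Animal as TCat).ClassName);
    except
      on E: EInvalidCast do
        Writeln('Cast to TCat rejected');
    end;
  finally
    Animal.Free;
  end;
end.
```

The variable is declared as TAnimal, so the compiler alone cannot know that it holds a TDog. Both checks are answered from the class information that remains in the running program.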

With class types, you can use a class as a variable. A class variable can be understood as a special kind of object, and you can access the methods of a class variable just as you would those of an object. For example, take a look at the following program fragment:

type
  TSampleClass = class of TSampleObject;

  TSampleObject = class( TObject )
  public
    constructor Create;
    destructor Destroy; override;
    class function GetSampleObjectCount:Integer;
    function GetObjectIndex:Integer;
  end;

var
  aSampleClass : TSampleClass;
  aClass : TClass;

In this code, we define a class TSampleObject and its related class type TSampleClass, as well as two class variables aSampleClass and aClass. In addition, we also defined a constructor, destructor, a class method GetSampleObjectCount and an object method GetObjectIndex for the TSampleObject class.

First, let's understand the meaning of class variables aSampleClass and aClass.

Obviously, you can treat TSampleObject and TObject as constant values and assign them to the aClass variable, just as you would assign the constant 123 to an integer variable i. The relationship between class types, classes, and class variables is therefore the relationship between types, constants, and variables, only at the class level rather than the object level. Of course, it is not legal to assign TObject directly to aSampleClass, because aSampleClass is a class variable of the TObject-derived class TSampleObject, and TObject does not contain all the definitions required by the TSampleClass type. Conversely, it is legal to assign TSampleObject to aClass, because TSampleObject is derived from TObject and is compatible with the TClass type. This closely mirrors the assignment and type-compatibility rules for object variables.
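
The assignment rules just described can be tried out with a small sketch like the one below. It reuses the names TSampleObject, TSampleClass, aSampleClass and aClass from the fragment above, but strips the class down to an empty body so the program stays self-contained. The explicit cast at the end is an addition for illustration and is not part of the original fragment.

```pascal
program ClassRefAssign;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TSampleObject = class(TObject)
  end;

  TSampleClass = class of TSampleObject;

var
  aSampleClass: TSampleClass;
  aClass: TClass;
begin
  // Classes behave like constants of a class type.
  aClass := TObject;
  Writeln(aClass.ClassName);        // TObject

  // A derived class is compatible with the more general TClass.
  aClass := TSampleObject;
  Writeln(aClass.ClassName);        // TSampleObject

  aSampleClass := TSampleObject;
  Writeln(aSampleClass.ClassName);  // TSampleObject

  // aSampleClass := TObject;       // rejected by the compiler

  // Going from the general type back to the specific one needs a
  // runtime check followed by an explicit cast.
  if aClass.InheritsFrom(TSampleObject) then
  begin
    aSampleClass := TSampleClass(aClass);
    Writeln(aSampleClass.ClassName);  // TSampleObject
  end;
end.
```

Removing the comment marks from the rejected assignment should produce an incompatible-types error at compile time, which is the class-level counterpart of assigning a TObject variable to a TSampleObject variable.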

Then, let's take a look at what class methods are.

A class method is a method called at the class level, such as GetSampleObjectCount defined above, which is declared with the reserved word class. Class methods differ from object methods, which are called at the object level. Object methods are already familiar to us; class methods are typically used to access and control characteristics shared by all objects of a class and to manage those objects centrally. In the definition of TObject we can find a large number of class methods, such as ClassName, ClassInfo, and NewInstance. NewInstance is also declared virtual, making it a virtual class method. This means that you can override NewInstance in a derived class to construct instances of that class in a special way.

You can also use the identifier Self in class methods, but its meaning differs from Self in object methods. In a class method, Self represents the class itself, that is, a pointer to the VMT, while in an object method Self represents the object itself, that is, a pointer to the object's data space. Although class methods operate at the class level, you can still call them through an object. For example, the class method ClassName of the class TObject can be called with the statement aObject.ClassName, because the first 4 bytes of the object data space pointed to by the object pointer hold a pointer to the class VMT. Conversely, you cannot call object methods at the class level, so a statement like TObject.Free is illegal.
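
To make the two meanings of Self concrete, here is a small sketch. TBase, TDerived, WhoAmI and DescribeSelf are hypothetical names invented for this illustration. The last Writeln relies on the layout described above, namely that an object's first pointer-sized field is its class reference, and should be read as an assumption about the compiler rather than a guarantee.

```pascal
program SelfDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical classes, used only for this illustration.
  TBase = class(TObject)
  public
    class function WhoAmI: string;  // Self is a class reference here
    function DescribeSelf: string;  // Self is the object instance here
  end;

  TDerived = class(TBase)
  end;

class function TBase.WhoAmI: string;
begin
  // Self is whichever class the call was made through.
  Result := Self.ClassName;
end;

function TBase.DescribeSelf: string;
begin
  // Self is the instance; its class is reached through the VMT pointer.
  Result := 'an instance of ' + Self.ClassName;
end;

var
  Obj: TBase;
begin
  Writeln(TBase.WhoAmI);     // called on the class TBase
  Writeln(TDerived.WhoAmI);  // called on the class TDerived

  Obj := TDerived.Create;
  try
    Writeln(Obj.WhoAmI);        // class method called through an object
    Writeln(Obj.DescribeSelf);  // ordinary object method

    // Assumption: the object's first pointer-sized field is its class
    // reference (4 bytes on Win32, as the text describes).
    Writeln(TClass(PPointer(Obj)^).ClassName);
  finally
    Obj.Free;
  end;

  // Writeln(TBase.DescribeSelf);  // does not compile: no instance
end.
```

Although Obj is declared as TBase, the calls made through it report TDerived, because the class is found through the instance rather than through the declared type of the variable.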

It is worth noting that the constructor is a class method, and the destructor is an object method!

What? Constructors are class methods and destructors are object methods! Was there any mistake?

You see, when you create an object, you clearly use a statement similar to the following:

aObject := TObject.Create;

This clearly calls the Create method of the class TObject. To delete an object, you use the following statement:

aObject.Destroy;

Even if you use the Free method to release the object, the Destroy method of the object is indirectly called.

The reason is very simple. Before an object is constructed it does not exist yet; only the class exists, so an object can only be created through a class method. Conversely, deleting an object means deleting an object that already exists: it is the object that is released, not the class.

Finally, let's discuss virtual constructors.

In traditional C++, virtual destructors can be implemented, but virtual constructors are a hard problem, because traditional C++ has no class types. Instances of global objects are laid out in the global data space at compile time, and local objects of functions are likewise instances mapped into stack space at compile time. Even dynamically created objects are instances allocated in heap space by the new operator according to a fixed class structure, and the constructor is merely an object method that initializes the instance that has already been generated. Traditional C++ has no real class methods. Even though so-called static, class-based methods can be defined, they are ultimately implemented as special global functions, to say nothing of virtual class methods. Virtual methods are only effective on specific object instances. Traditional C++ therefore holds that, before a specific object instance exists, it is impossible to construct the object based on the object that is yet to be generated. That is indeed impossible, because it would be a logical paradox!

However, it is precisely the key concepts of dynamic class type information, truly virtual class methods, and class-based constructors that make virtual constructors possible in DELPHI. Objects are produced by classes. The object is like a growing baby and the class is its mother: the baby does not know what kind of person he will become, but mothers raise different kinds of people with their own ways of upbringing. The principle is the same.

It is in the definition of the TComponent class that the constructor Create is declared virtual, so that different kinds of controls can implement their own construction logic. This is the greatness of concepts such as the class types embodied by TClass, and also the greatness of DELPHI.
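
A minimal sketch of a virtual constructor called through a class variable might look like this. It does not use TComponent; TShape, TCircle, TSquare, TShapeClass and MakeShape are hypothetical names invented to keep the example self-contained.

```pascal
program VirtualCtorDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical classes, used only for this illustration.
  TShape = class(TObject)
  protected
    FName: string;
  public
    constructor Create; virtual;  // virtual constructor
    property Name: string read FName;
  end;

  TShapeClass = class of TShape;  // class type for TShape and descendants

  TCircle = class(TShape)
  public
    constructor Create; override;
  end;

  TSquare = class(TShape)
  public
    constructor Create; override;
  end;

constructor TShape.Create;
begin
  inherited Create;
  FName := 'shape';
end;

constructor TCircle.Create;
begin
  inherited Create;
  FName := 'circle';
end;

constructor TSquare.Create;
begin
  inherited Create;
  FName := 'square';
end;

// The class is only known at runtime; the constructor call is dispatched
// through the VMT of whatever class AClass holds.
function MakeShape(AClass: TShapeClass): TShape;
begin
  Result := AClass.Create;
end;

var
  Classes: array[0..2] of TShapeClass;
  Shape: TShape;
  I: Integer;
begin
  Classes[0] := TShape;
  Classes[1] := TCircle;
  Classes[2] := TSquare;
  for I := Low(Classes) to High(Classes) do
  begin
    Shape := MakeShape(Classes[I]);
    try
      Writeln(Shape.ClassName, ' -> ', Shape.Name);
    finally
      Shape.Free;
    end;
  end;
end.
```

MakeShape never names TCircle or TSquare, yet the matching constructor runs for each class. If the constructor in TShape were not virtual, the call through TShapeClass would always run TShape.Create and every shape would report the name 'shape'.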

....................................................

Chapter 3 The View of Time and Space in WIN32

My old father looked at his little grandson playing with toys on the ground, and then said to me: "This child is just like you when you were young. He likes to take things apart and only stops after seeing them to the end." Thinking back to when I was a child, I often dismantled toy cars, small alarm clocks, music boxes, etc., and was often scolded by my mother.

The first time I understood the basic principles of computers had to do with a music box I had taken apart. It was in a comic book I read in high school. An old man with a white beard was explaining the theory of intelligent machines, and an uncle with a mustache was talking about computers and music boxes. They said that the central processing unit of a computer is like the row of reeds that sound the notes in a music box, and that a computer program is like the densely packed bumps on the music box's small cylinder. The rotation of the cylinder corresponds to the natural advance of the CPU's instruction pointer, while the bumps that represent the music make the reeds vibrate and sound, which corresponds to the CPU executing the program's instructions. The music box plays a beautiful melody according to the score the craftsman engraved on the cylinder, and the computer completes complex processing according to the program the programmer wrote in advance. After I went to college, I learned that the old man with the white beard was the scientific giant Turing, whose theory of finite automata drove the whole information revolution, and that the uncle with the mustache was von Neumann, the father of the computer, whose architecture is still the dominant computer architecture today. The music box was not taken apart in vain; mother can rest assured.

Only with a simple and profound understanding can we create profound and concise creations.

In this chapter we will discuss the basic concepts related to our programming in the Windows 32-bit operating system and establish the correct view of time and space in WIN32. I hope that after reading this chapter, we can have a deeper understanding of programs, processes and threads, understand the principles of executable files, dynamic link libraries and runtime packages, and see clearly the truth about global data, local data and parameters in memory.

Section 1 Understanding the Process

For historical reasons, Windows grew out of DOS. In the DOS era we had only the concept of a program, not of a process. At that time only full-fledged operating systems such as UNIX and VMS had processes, and multiple processes meant minicomputers, terminals, and multiple users, which also meant money. Most of the time I could only use relatively cheap microcomputers running DOS, and I first came into contact with processes and minicomputers only when I studied operating systems.

That changed with Windows 3.x. Under DOS, only one program could run at a time, whereas under Windows multiple programs can run at once; this is multitasking. Under DOS you cannot run a second copy of a program that is already running, but under Windows two or more copies of the same program can run simultaneously, and each running copy is a process. More precisely, every run of any program creates a task, and each task is a process.

When programs and processes are considered together, the word program can be taken to refer to something static. A typical program is static code and data made up of an EXE file, or an EXE file plus several DLL files. A process is one run of a program: code running dynamically in memory together with dynamically changing data. When a static program is asked to run, the operating system provides a certain amount of memory for that run, loads the static program code and data into that memory, relocates and maps the code and data within that space, and executes the program inside it, thereby creating a dynamic process.

Two copies of the same program running at the same time means that there are two process spaces in system memory. Their program logic is identical, but each is in its own dynamically changing state.

In terms of running time, all processes appear to execute at the same time; the technical term is parallel or concurrent execution. But this is mostly an impression the operating system gives us. In fact, processes are executed by time-sharing: they take turns occupying the CPU to execute their instructions. A single CPU executes the instructions of only one process at any given moment. The operating system is the hand behind the scenes that schedules the processes. It constantly saves and switches the state of each process running on the CPU, so that each scheduled process believes it is running completely and continuously. Because time-sharing scheduling is very fast, it gives us the impression that the processes are all running at once. True simultaneous execution is only possible in a multi-CPU hardware environment. When we talk about threads later, we will find that threads are what really drive a process forward, while the more important role of the process is to provide the process space.

In terms of space, each process space is relatively independent, and each process runs in its own space. A program includes both code space and data space, and both occupy process space. Windows allocates actual memory for the data space each process needs, but generally shares code space, mapping one copy of a program's code into all the processes of that program. So if a program has 100K of code and needs 100K of data space, for a total of 200K of process space, the operating system allocates 200K of process space the first time the program is run. When the program is run a second time and a new process starts, the operating system allocates only 100K of data space, while the code space is shared with the earlier process.
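
The arithmetic in this example can be written down as a tiny sketch. The function TotalMemoryK and the constant names are hypothetical, and the model is deliberately simplified: code is counted once, and data is counted once per running copy.

```pascal
program ProcessSpaceEstimate;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  CodeSizeK = 100;  // shared code, from the example in the text
  DataSizeK = 100;  // private data per process, from the example in the text

// Simplified model: code is mapped once and shared, data is allocated
// separately for every running copy of the program.
function TotalMemoryK(ProcessCount: Integer): Integer;
begin
  if ProcessCount <= 0 then
    Result := 0
  else
    Result := CodeSizeK + ProcessCount * DataSizeK;
end;

var
  N: Integer;
begin
  for N := 1 to 3 do
    Writeln(N, ' process(es): ', TotalMemoryK(N), 'K');
end.
```

Under this model one copy needs 200K, two copies need 300K, and three copies need 400K, so each additional copy costs only its own data space.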

The above is the basic time and space view of the process in the Windows operating system. In fact, there is a big difference in the time and space view of the process between the 16-bit and 32-bit operating systems of Windows.

In terms of time, process management in 16-bit Windows, such as Windows 3.x, is very simple; it is really just a multitasking manager. Moreover, its task scheduling is passive: if a task does not give up control while processing a message, the operating system has to wait. Because of this flaw, a running process occupies the CPU completely. In those days, so that 16-bit Windows would get a chance to schedule other tasks, Microsoft praised Windows application developers as broad-minded programmers, hoping they would be willing to write a few extra lines of code as a gift to the operating system. In contrast, WIN32 operating systems such as Windows 95 and NT have real multi-process, multitasking capabilities. Processes in WIN32 are scheduled entirely by the operating system. Once a process's time slice ends, the operating system actively switches to the next process, whether or not the current one is still processing data. Strictly speaking, 16-bit Windows cannot be regarded as a complete operating system, whereas 32-bit WIN32 is a true one. Of course, Microsoft will not say that WIN32 makes up for the shortcomings of 16-bit Windows; it claims instead that WIN32 implements an advanced technology called "preemptive multitasking", which is the commercial way of putting it.

In terms of space, although process spaces in 16-bit Windows are relatively independent, processes can easily access each other's data. These processes are really just different data segments in the same physical space, so a careless address operation can easily read or write the wrong space and crash the operating system. In WIN32, however, each process space is completely independent. WIN32 gives each process a virtual, contiguous address space of up to 4G. Contiguous here means that each process has an address space running from $00000000 to $FFFFFFFF, rather than the segmented space of 16-bit Windows. In WIN32 you do not have to worry about your reads and writes unintentionally affecting data in other process spaces, nor about other processes interfering with your work. At the same time, the contiguous 4G virtual space that WIN32 provides is backed by physical memory that the operating system maps in for you with hardware support. Although you have such a vast virtual space, the system never wastes a byte of physical memory.

Section 2 Process Space

When we write WIN32 applications in DELPHI, we rarely care about the inner world of the process at run time. Because WIN32 provides our process with 4G of contiguous virtual space, perhaps even the largest application in the world uses only part of it. The process space may seem unlimited, but the 4G is virtual, and the actual memory of your machine may be far smaller. And despite such a vast space, some programs with complex algorithms still fail to run because of stack overflow, especially programs that rely heavily on recursion.

Therefore, an in-depth understanding of the structure of the 4G process space and its relationship to physical memory will help us understand the space-time world of WIN32 more clearly, so that in actual development work we can apply the right worldview and methodology to solve all kinds of difficult problems.

Next, we will use a simple experiment to explore the inner world of a WIN32 process space. This may require some knowledge of CPU registers and assembly language, but I will try to explain it in simple terms.

When DELPHI starts, it automatically generates a Project1 project, and we will start with that. Set a breakpoint anywhere in the source of Project1.dpr, for example on the begin statement. Then run the program; it will stop automatically when it reaches the breakpoint. At that point we can open the CPU window of the debugger to observe the internal structure of the process space.

The current instruction pointer register Eip has stopped at $0043E4B8. Because the top two hexadecimal digits of this address are both zero, we can see that the program sits near the bottom of the 4G process space and occupies only a tiny part of the address space from $00000000 to $FFFFFFFF.

In the command box of the CPU window you can scroll up through the contents of the process space. When you view addresses below $00400000, you will find a series of question marks "????" in place of the contents. That is because this address space has not been mapped to actual physical memory. If you look at the hexadecimal value of the global variable HInstance at this point, you will find that it is also $00400000. Although HInstance represents the handle of the process instance, it is in fact the starting address at which the program was loaded into memory, and the same is true in 16-bit Windows. We can therefore conclude that the process's program is loaded starting at $00400000; that is, the space starting at 4M in the 4G virtual space is where the program is loaded.

From $00400000 up to $0044D000 is mainly the address space of the program code and global data. In the stack box of the CPU window you can see the address of the current stack. You will find that the current stack occupies $0067B000 to $00680000, a length of $5000. In fact, the minimum stack size of the process is $5000, which is the Min stack size value set on the Linker page of Project Options when compiling the DELPHI program, plus $1000. The stack grows from high addresses toward low addresses. When the stack runs short at run time, the system automatically enlarges it toward lower addresses, mapping more actual memory into the process space. When compiling a DELPHI program, you can limit how far the stack may grow by setting Max stack size on the Linker page of Project Options. Especially in programs with deep call chains or recursive algorithms, Max stack size must be set sensibly, because every subroutine call consumes stack space, and once the stack is exhausted the system raises a "Stack overflow" error.
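
The downward growth of the stack can be observed with a short sketch such as the following. Recurse and HexAddr are hypothetical helper names, and the stack sizes in the $MINSTACKSIZE and $MAXSTACKSIZE directives are values assumed for this example, not settings taken from the article's project. The sketch also assumes a compiler that provides the NativeUInt type.

```pascal
program StackGrowthDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}
// Source-level counterparts of the Min/Max stack size linker options.
// The values shown are assumptions chosen for this sketch.
{$MINSTACKSIZE $00004000}
{$MAXSTACKSIZE $00100000}

uses
  SysUtils;

// Formats a pointer as a hexadecimal address.
function HexAddr(P: Pointer): string;
begin
  Result := '$' + IntToHex(Int64(NativeUInt(P)), 8);
end;

// Each call gets its own stack frame, so each level sees a different
// address for its own copy of Local.
procedure Recurse(Level, MaxLevel: Integer);
var
  Local: Integer;  // lives in this call's stack frame
begin
  Local := Level;
  Writeln('level ', Local, ': local variable at ', HexAddr(@Local));
  if Level < MaxLevel then
    Recurse(Level + 1, MaxLevel);
end;

begin
  Recurse(1, 4);
end.
```

On x86 each deeper level should report a lower address than the one before it, and the distance between two levels gives a rough idea of how much stack one call consumes. Multiplying that distance by the expected recursion depth is one simple way to judge whether the Max stack size setting is sufficient.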

It might seem that the process space after the stack should be free space. In fact, that is not the case. The WIN32 documentation says that the 2G of space from $80000000 upward is used by the system, so it seems a process can really own only 2G. In fact, the space a process can truly own is less than 2G, because the 4M from $00000000 to $00400000 is also off limits.

But no matter what, the addresses our process can use are still vast. The region after the stack and below $80000000 in particular is the main battlefield of the process space. Memory the process allocates from the system is mapped into this region, dynamic link libraries loaded by the process are mapped into this region, the stacks of new threads are mapped into this region, and almost every operation involving memory allocation maps into this region. Note that mapping here means the correspondence between actual memory and this virtual space. Process space that is not mapped to actual memory cannot be used, just like the string of "????" shown in the command box of the CPU window during debugging.
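
Instead of reading the layout from the CPU window, a rough map of the process space can also be printed from inside the program. The sketch below is only an approximation of the experiment above: Report, CodeMarker, HexAddr and GlobalCounter are hypothetical names, it assumes a Windows target where the global HInstance is available, and it assumes a compiler that provides the NativeUInt type. The addresses it prints depend on the compiler, linker settings and Windows version, so they will generally differ from the ones quoted in the text.

```pascal
program AddressMapDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

var
  GlobalCounter: Integer = 42;  // global data, stored in the program image

// Formats a pointer as a hexadecimal address.
function HexAddr(P: Pointer): string;
begin
  Result := '$' + IntToHex(Int64(NativeUInt(P)), 8);
end;

// Empty routine whose address stands in for "program code".
procedure CodeMarker;
begin
end;

procedure Report;
var
  LocalValue: Integer;  // lives on the stack
  HeapBlock: Pointer;   // will point into dynamically allocated memory
begin
  LocalValue := 0;
  GetMem(HeapBlock, 64);
  try
    Writeln('HInstance      : $', IntToHex(Int64(HInstance), 8));
    Writeln('program code   : ', HexAddr(@CodeMarker));
    Writeln('global variable: ', HexAddr(@GlobalCounter));
    Writeln('local variable : ', HexAddr(@LocalValue));
    Writeln('heap block     : ', HexAddr(HeapBlock));
  finally
    FreeMem(HeapBlock);
  end;
end;

begin
  Report;
end.
```

Comparing the printed values shows which region each kind of data lives in: code and global data close to the load address, the local variable inside the stack range, and the heap block somewhere in the dynamically mapped region.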

............
