Google Analytics and Baidu Statistics often report very different numbers for the same site. The fundamental reason is that the two tools collect data by different principles and mechanisms. This article analyzes how each of them works.
How Baidu Statistics works
The JS snippet provided by Baidu Statistics essentially loads the script hm.baidu.com/h.js into the page. The content of the script varies with the query string: the parameter following h.js? is your site's id in Baidu Statistics.
While the browser fetches h.js, Baidu Statistics writes a cookie named "HMACCOUNT" to your browser. This cookie expires in 2038, so as long as you do not clear your browser cookies, it will effectively never expire.
After h.js is downloaded, the script runs and collects browser-related information and the traffic source. The information collected includes screen size, color depth, Flash version, user language, etc.
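As a rough illustration of what such a long-lived cookie looks like on the wire, here is a minimal Python sketch that builds a Set-Cookie header. The visitor id value and the exact expiry date are assumptions made for the example: the article only states that the cookie is named HMACCOUNT and expires in 2038, so the sketch simply picks the largest 32-bit signed Unix timestamp, which falls in that year.

```python
from email.utils import formatdate
from http.cookies import SimpleCookie

# Illustrative expiry only: the largest 32-bit signed Unix timestamp (in 2038).
# The real date used by Baidu Statistics is not given in the article.
EXPIRES = formatdate(2**31 - 1, usegmt=True)


def build_set_cookie(visitor_id: str) -> str:
    """Return a Set-Cookie header line for a long-lived visitor cookie."""
    cookie = SimpleCookie()
    cookie["HMACCOUNT"] = visitor_id          # hypothetical id value
    cookie["HMACCOUNT"]["expires"] = EXPIRES  # far-future expiry date
    cookie["HMACCOUNT"]["path"] = "/"
    return cookie.output()


header = build_set_cookie("0123456789ABCDEF")
print(header)
```

Because the expiry date lies so far in the future, the browser sends the same id back on every later visit, which is what lets the server treat it as a user ID.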
From the JS code, the full set of parameters is: "cc, cf, ci, ck, cl, cm, cp, cw, ds, ep, et, fl, ja, ln, lo, lt, nv, rnd, sb, se, si, st, su, sw, sse, v". Their meanings are roughly as follows:
cc: unknown, usually 1
cf: value of the URL parameter hmsr
ci: value of the URL parameter hmci
ck: whether cookies are supported (1 or 0)
cl: color depth, such as "32-bit"
cm: value of the URL parameter hmmd
cp: value of the URL parameter hmpl
cw: value of the URL parameter hmkw
ds: screen size, such as "1024×768"
ep: initial value '0'; a time variable that reflects how long the visitor stays on the page. The format is roughly: current time - load time + "," + another small time value
et: initial value '0'; changes to another value once ep is no longer 0
fl: Flash version
ja: whether Java is supported (1 or 0)
ln: language, such as zh-cn
lo: unknown, usually 0
lt: a timestamp in the style of time.time(), such as "1327847756"; absent from the first request
nv: unknown, usually 1 or 0
rnd: a random number, ten digits long
sb: '17' if the browser is 360se
se: related to search engines
si: id of the statistics code
st:
su: the previous page, taken from document.referrer
sw: unknown; probably related to search engines, usually empty
sse: unknown; probably related to search engines, usually empty
v: version of the statistics code, currently "1.0.17"
Once all of these parameters have been set (some are left unassigned), the script filters out the ones that have values and appends them as the query string of hm.baidu.com/hm.gif to build a URL such as: http://hm.baidu.com/hm.gif?cc=1&ck=1&cl=32-bit&ds=1366×768&ep=0&et=0&fl=11.0&ja=1&ln=zh-cn. It then requests that image.
The Baidu Statistics server receives this request and reads the information from the parameters attached to the image URL to record the visit. When the user closes the page, another hm.gif request is triggered, but this is not supported by all browsers, nor for every way of closing a page.
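The URL-building step can be sketched in a few lines of Python. This is only an approximation of what the real script does in the browser: the function name build_beacon_url and the sample values are made up for illustration, and the screen size is written with a plain "x" to keep the example ASCII-only.

```python
from urllib.parse import urlencode

# Endpoint as named in the article.
BEACON_ENDPOINT = "http://hm.baidu.com/hm.gif"


def build_beacon_url(params: dict) -> str:
    """Drop unassigned parameters and append the rest as a query string."""
    # Keep 0 (a real value for ep/et); drop only None and empty strings.
    assigned = {k: v for k, v in params.items() if v not in (None, "")}
    return BEACON_ENDPOINT + "?" + urlencode(assigned)


params = {
    "cc": 1, "ck": 1, "cl": "32-bit", "ds": "1366x768",
    "ep": 0, "et": 0, "fl": "11.0", "ja": 1, "ln": "zh-cn",
    "se": "", "sw": None,  # unassigned, so they are left out of the URL
}
url = build_beacon_url(params)
print(url)
```

Note that the filter keeps parameters whose value is 0, since ep and et legitimately start at 0 and still appear in the request.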
Testing with Wireshark (a network packet capture tool) shows that the browser sent a total of 4 requests to the server:
Request the JS script.
When loading is complete, send a request with the parameters.
When exiting the page, send a request with the parameters. Compared with the previous request, the ep parameter has changed.
Baidu Statistics is cookie-based. When the JS script is requested, a permanent cookie is saved on your computer and serves as your user ID. The capture also shows that the ep parameter changed from its initial 0 to "7289%2C115" on exit. URL-decoded, this is "7289,115": two values in milliseconds, roughly 7.3 seconds and 0.1 seconds. Meanwhile, the lt parameter (a timestamp, in JavaScript: (new Date).getTime()) stays the same across the first two hm.gif requests, while the rnd random number changes on every request.
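The decoding of the ep value can be checked with a short Python sketch. The helper name parse_ep is hypothetical, and reading both numbers as milliseconds follows the article's interpretation rather than any official documentation.

```python
from urllib.parse import unquote


def parse_ep(raw: str) -> tuple:
    """URL-decode an ep value and return its two parts in seconds."""
    # "%2C" is the percent-encoded form of a comma.
    first_ms, second_ms = unquote(raw).split(",")
    return int(first_ms) / 1000, int(second_ms) / 1000


stay_s, other_s = parse_ep("7289%2C115")
print(stay_s, other_s)
```

For the captured value this gives 7.289 and 0.115 seconds.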
How Google Analytics works
When a user visits a page that contains the Google Analytics tracking code, the user's browser executes it. The code collects information about the visitor, such as the URL of the page viewed, browser type, operating system, system language, screen resolution, etc.
The GA code then stores this visitor information in a cookie. A cookie is a short piece of text that is stored locally and associated with the visited website. It is used to determine whether a user is visiting for the first time or is a returning visitor, and to record the referral source of the page, subsequent page views, and so on.
Finally, all collected information will be sent to Google Analytics' data servers. This process is quite clever. We know that the server's log file will record each file request information, and the way Google Analytics collects data is by requesting a transparent 1×1 GIF image file from the server. This file request and the request time will be recorded in the server log, and the file request information contains the data collected by the GA statistics code and cookie information. In this way, whenever this GIF image receives a request, the visitor's access information will be collected by the Google Analytics data server.
However, Google Analytics does not always send just one GIF request; it often sends several. If one GIF request fails to be counted, GA sends another.
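The idea of collecting data from logged GIF requests can be sketched as follows. This is not Google's implementation: the log line format, the /beacon.gif path, and the parameter names page, lang and res are all invented for illustration. The sketch only shows how a server could recover visitor data from the query string of a logged image request.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical log line; the format, path and parameter names are made up.
LOG_LINE = (
    '203.0.113.7 [29/Jan/2012:14:35:56 +0000] '
    '"GET /beacon.gif?page=%2Fhome&lang=zh-cn&res=1366x768 HTTP/1.1" 200'
)


def extract_hit(log_line: str) -> dict:
    """Pull the request target out of a log line and decode its query string."""
    request = log_line.split('"')[1]  # 'GET /beacon.gif?... HTTP/1.1'
    method, target, protocol = request.split(" ")
    query = urlsplit(target).query
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


hit = extract_hit(LOG_LINE)
print(hit)
```

The image itself carries no information; everything the analytics server needs travels in the request URL and the cookies sent along with it.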
Summary:
Web analytics expert Avinash once said that as long as the data is 90% accurate, you can act on it without delay. What matters is being able to see trends, take action, then test and keep optimizing.
Article source: Lu Songsong's blog, please indicate the address of this article when reprinting, thank you.
(Editor: Yang Yang)