[Foreword] Metrics are one of the introductory topics in web analytics. I have seen many readers ask questions about them, which shows that this is the area people most want to understand and find hardest to grasp. Getting metrics right is what makes it possible to get web analytics right.
[Text]
Today's topic returns to metrics, because metrics are the skeleton of web analytics. As the saying goes, without the skin the hair has nothing to cling to: without metrics, web analytics cannot become a science. Metrics are also what readers ask about most, as in the following question:
Mr. Song Xing:
One question has puzzled me for a long time: bounce rate and exit rate in GA.
We all know what each of them means, but when both are present, which number is the better one to look at?
It would be fine if each appeared on its own, but in GA they appear side by side.
This is a good question. It shows a real spirit of inquiry and a keen eye that goes straight to the heart of the matter. There are so many similar questions that a new series of posts is called for. Let's start with the most basic metric concepts, the ones that most easily confuse us. This article will not repeat earlier material (for past posts on metrics, see the blog's site map); it only adds the finishing touches and covers what everyone most needs to know.
Even the most basic traffic metrics have pitfalls
Page view, visit and visitor are the three most basic traffic metrics. Ranked by how difficult they are to monitor, from hardest to easiest:
Visit > Visitor > Page View
The reason is:
Page view is just a simple count: the web analytics tracking code in the page has run once, nothing more. It is the simplest.
Visitor is also a simple count: the tracking code has identified a distinct cookie, or a distinct IP address (some tools fall back on IP to identify visitors when no cookie is available), arriving at the website. But visitor is clearly more complicated than page view, because it involves recording and evaluating cookies or IPs.
Visit represents a series of actions by one visitor on the website, where the interval between consecutive actions does not exceed a set time (for example, 30 minutes). Counting it means judging several things: (1) there must be a visitor, and if the visitor cannot be identified the visit is meaningless; (2) the actions on the website must be identified, whether as page views or as other actions the web analytics tool can recognize; (3) the time between actions must be measured. Visit is therefore the most complicated of the three to determine. This is also why, when we first used log files for web analysis, we had no very clear concept of visit, only the concept of session.
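The three judgments behind a visit can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the hits are a hypothetical list of (visitor_id, timestamp) pairs, the 30-minute limit is the example value from the text, and `count_visits` and `minute` are names invented here. Real tools identify visitors and actions in their own ways.

```python
from datetime import datetime, timedelta

# Assumed inactivity limit; 30 minutes is the example in the text,
# and actual tools may use a different value or extra rules.
VISIT_TIMEOUT = timedelta(minutes=30)


def count_visits(hits):
    """Count visits from (visitor_id, timestamp) hits.

    A new visit starts when a visitor is seen for the first time, or when
    the gap since that visitor's previous hit exceeds VISIT_TIMEOUT.
    """
    last_seen = {}  # visitor_id -> timestamp of that visitor's latest hit
    visits = 0
    for visitor_id, ts in sorted(hits, key=lambda h: h[1]):
        previous = last_seen.get(visitor_id)
        if previous is None or ts - previous > VISIT_TIMEOUT:
            visits += 1
        last_seen[visitor_id] = ts
    return visits


def minute(m):
    """Hypothetical helper: a timestamp m minutes after an arbitrary start."""
    return datetime(2010, 11, 1, 9, 0) + timedelta(minutes=m)


# Hypothetical hits: visitor "a" comes back after 45 idle minutes, "b" does not.
hits = [("a", minute(0)), ("a", minute(10)), ("a", minute(55)),
        ("b", minute(5)), ("b", minute(20))]

page_views = len(hits)                 # simple count of hits
visitors = len({v for v, _ in hits})   # count of distinct identifiers
visits = count_visits(hits)            # needs visitor, actions and time gaps
print(page_views, visitors, visits)
```

Page views and visitors each need one line, while visits need per-visitor state and a time comparison, which is the sense in which visit is the hardest of the three to monitor.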
So, where is the trap?
There are no traps in visitor and page view. They are simple counts: when they are triggered, the tool just records them. But visit does have traps, and they lie in scenarios such as these:
I browse website A for 20 minutes. At the 21st minute, I follow a link on website A (say the CWA website: http://www.chinawebanalytics.cn ) that leads to website B, and at the 25th minute I follow a link on website B back to website A. The browser window is never closed. How many visits does website A record?
I browse website A for 20 minutes. At the 21st minute, I close the page of website A. At the 25th minute, I open a new browser window and type in A's URL to return to website A. How many visits does website A record?
I browse website A for 20 minutes. At the 21st minute, I close the page of website A (note that the browser itself stays open). At the 25th minute, I open a new browser tab and type in A's URL to return to website A. How many visits does website A record?
Picture: Tab, the great Tab
I don't want to discuss the answers to these three questions here; you are welcome to discuss them in the comments. One reminder: different web analytics tools define these situations differently. So when choosing a web analytics tool, we had better ask the vendor what its basic definitions and monitoring methods are for these basic metrics.
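Without settling the three questions, a small sketch can show why the answer depends on the tool's rules. Everything here is hypothetical: the `Hit` fields, the `count_visits` function and the three rule sets are invented for illustration and do not describe how any particular product works. The sketch also assumes that "a new browser window" in the second scenario means a fresh browser session, and that a new tab in the third does not.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    minute: int               # minutes since the first hit on website A
    external_referrer: bool   # arrived through a link on another website
    new_browser_session: bool # first hit after the browser was reopened


def count_visits(hits, timeout=30, split_on_referrer=False,
                 split_on_new_browser=False):
    """Count visits for one visitor under a configurable (hypothetical) rule set."""
    visits = 0
    previous = None
    for hit in hits:
        starts_new_visit = (
            previous is None
            or hit.minute - previous.minute > timeout
            or (split_on_referrer and hit.external_referrer)
            or (split_on_new_browser and hit.new_browser_session)
        )
        if starts_new_visit:
            visits += 1
        previous = hit
    return visits


# The three scenarios from the text, seen from website A's side.
scenarios = {
    "back via link on B": [Hit(0, False, False), Hit(20, False, False),
                           Hit(25, True, False)],
    "new browser window": [Hit(0, False, False), Hit(20, False, False),
                           Hit(25, False, True)],
    "new tab":            [Hit(0, False, False), Hit(20, False, False),
                           Hit(25, False, False)],
}

# Three hypothetical rule sets a tool might apply.
rule_sets = {
    "timeout only": {},
    "timeout + referrer": {"split_on_referrer": True},
    "timeout + browser session": {"split_on_new_browser": True},
}

results = {
    rule_name: {name: count_visits(hits, **rules)
                for name, hits in scenarios.items()}
    for rule_name, rules in rule_sets.items()
}
for rule_name, counts in results.items():
    print(rule_name, counts)
```

The same clicks produce different visit totals under different rule sets, which is why it is worth asking a vendor for its exact definitions.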
Still, these three scenarios directly explain questions such as the following:
(1) Why are the visits monitored by Omniture SiteCatalyst only 80% of those measured by Google Analytics?
(2) Why is the data from Google Analytics so different from the data from my server logs?
I would be surprised if their data were the same! Differences between tools are only to be expected (though an excessive gap between similar tools may of course mean that the tracking was implemented incorrectly). My point is that we should at least understand that visit is actually a very complex metric, and definitely not as simple as we tend to think.
So let us move beyond the everyday understanding of this metric to a more fundamental question: why have a "visit" metric at all? Why can't we simply use page view or visitor?
If you can think this question through, I believe you will truly understand visit.
Picture: It's not as easy as you thought!
The answer is actually very simple. What is web analytics in the narrow sense? It is the science of analyzing the behavior of website visitors, so the focus is behavior. Visitor alone is therefore not enough: a visitor with no behavior attached is meaningless. But behavior that is isolated and has no context means little either, so page view alone will not do. Visit was created for exactly this purpose: to measure a series of behaviors by one visitor, expressed as page views. It is the bridge that ties visitor to page view, and so ties the visitor to behavior and expresses that relationship as data.
It sounds like quite an artistic process. This is the beauty of web analytics: look closely at the why behind the why, and you will find a whole world in a single flower.
Even basic measurements do not all have uniform definitions
What is mass, what is length, what is speed? The measurements we use in everyday life have standard definitions and units that are unified worldwide. In the world of web analytics, however, not every metric has a uniform definition.
This is because web analytics is still a very young discipline. Even its name was unsettled at first. People initially used e-metrics, and later web metrics. Only when more and more people began to say web analytics did the discipline get a formal name.
Although the name of the discipline has settled, many of its metrics still have different interpretations. Bounce rate, for example, still has more than two common interpretations. Beyond differences in interpretation, different monitoring tools also compute some metrics differently. As mentioned above, different tools identify visitors in different ways, and the same is true of visits.
To resolve the conflicts caused by these inconsistencies, some smart web analytics vendors provide features for customizing metrics, so that users can adjust the definition and scope of a metric as needed. In practice this greatly increases the adaptability of web analytics, and it has worked well.
Still, inconsistent definitions are not a good thing, especially for basic metrics, so several industry organizations are working to establish international standards. They include Britain's Audit Bureau of Circulations ( www.abc.org.uk ), the Joint Industry Committee for Web Standards ( www.jicwebs.org ) and the Web Analytics Association ( www.webanalyticsassociation.org ).
Where definitions differ, the likely outcome is that the definitions used by the most people become the ones the industry agrees on, and eventually the implementation standards.
However, don't assume that one web analytics tool's definition speaks for the whole industry. It may be just one of countless definitions and conventions. The key is to understand what purpose a metric exists for and what real-world state of the website it corresponds to.
The most basic measures constitute composite measures
The most basic metrics are very simple and are not enough to describe more complex browsing behavior, so people began to introduce composite metrics. A composite metric is a new metric built from several basic metrics using the four arithmetic operations. Examples are bounce rate, exit rate and PV/visit.
Composite metrics cause newcomers a lot of trouble. I hope the following can clear things up.
First, Bounce Rate. It has two Chinese translations, the one used by Google Analytics and the one used on this blog. Use whichever you like; everyone will understand either. I prefer the latter, which I coined.
Keep the following points in mind about Bounce Rate:
Bounce Rate does not measure every view of a page; it measures a page only when that page serves as a landing page.
It is a special metric. It can measure the performance of the entire website, or the performance of a particular page as a landing page. That is, it is both a site-level metric and a page-level metric. We will come back to this later in this article.
Different web analytics tools define it differently.
Its formula is less important than its purpose and meaning.
Now let me talk about what its purpose is.
The purpose of Bounce Rate is very clear: to help us find out what first impression visitors get when they enter your website. Note that it is the first impression, the impression formed on entering the website from outside.
With that purpose in mind, people began to ask how a metric could describe it. The first idea was to use the time between entering the website and leaving it. For example, you arrive at Tencent, take a few casual glances, spit out "Damn, what a monopoly," and close the window. The whole thing may take only 5 seconds, which means the website made a bad impression on you. So describing first impressions in terms of time looks like a really good idea. This was the approach first envisioned, and the one Mr. Avinash originally advocated on his blog.
However, this method has a big problem, and the problem is time itself. You may hate Tencent, but thanks to browser tabs you may be in no hurry to close it. Instead you open a new page, say the homepage of 360 Anti-Virus, and read with relish Mr. Zhou Hongyi's "appeal" criticizing Tencent. Half an hour later you notice that the "disgusting" Tencent page is still open, and only then do you close it. Judging by time is now biased. Another big problem is that the time recorded by web analytics tools never matches exactly the time we actually spend browsing a page. The time-based method is therefore hard to put into practice as a measure of a website's first impression.
But the human brain is always clever. On a cosmic scale such cleverness is just a passing cloud, perhaps not much different from Sister Feng's beauty, but we are not afraid of difficulties. So another idea was born: if the first page you see on a website puts you off, you are unlikely to spend time browsing its other pages. That idea gave birth to bounce rate. Bounce rate measures the proportion of visits that viewed only one page out of all visits, or the proportion of visitors who viewed only one page out of all visitors. The exact mathematical definition does not matter much. The key is that people finally found a time-independent, easy-to-calculate way to measure the first impression a website makes.
That is the story of bounce rate. Bounce rate is therefore not used to measure all visits to all pages, but only the impression a page makes when it serves as a landing page, because the landing page is the first impression the website gives its visitors. You should also understand that every page of a website may be a landing page (because search engines can send traffic to any page of your website), yet for any given page only some of its visits count it as the landing page: those, and only those, in which it was the first page viewed on entering the website.
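The visit-based definition of bounce rate can be sketched as follows, both for the whole site and for each page in its role as a landing page. This is a minimal illustration, assuming a hypothetical list of visits where each visit is the ordered list of pages viewed; the function names and the sample data are invented here, and a given tool may define bounces differently.

```python
from collections import Counter

# Hypothetical visits: each visit is the ordered list of pages it viewed.
visits = [
    ["/home"],
    ["/home", "/products"],
    ["/products"],
    ["/home", "/products", "/contact"],
    ["/products", "/home"],
]


def site_bounce_rate(visits):
    """Share of all visits that viewed exactly one page."""
    bounces = sum(1 for pages in visits if len(pages) == 1)
    return bounces / len(visits)


def landing_page_bounce_rates(visits):
    """Bounce rate per page, counting only visits that entered on that page."""
    entrances = Counter(pages[0] for pages in visits)
    bounces = Counter(pages[0] for pages in visits if len(pages) == 1)
    return {page: bounces[page] / entrances[page] for page in entrances}


site_rate = site_bounce_rate(visits)
page_rates = landing_page_bounce_rates(visits)
print(site_rate)
print(page_rates)
```

Note that "/contact" gets no bounce rate at all in this data, because no visit entered the site through it: the denominator is entrances on a page, not all views of it.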
What about Exit Rate? That is another story. Exit Rate measures where people leave the website. People always have to leave a website eventually. I do think Guinness World Records should track the person who has stayed online the longest, but that person is mortal after all, so even if he could keep browsing a website for 100 years, he would eventually have to leave his beloved website. Besides, cookies don't last that long. So which pages people most often leave from becomes a matter of concern.
Exit rate measures exactly this. Put bluntly, exit rate is the probability that a page serves as the exit from the website. An exit rate of 87% means that, of all visits that viewed this page, 87% left the website from it. This page certainly bears some responsibility for failing to "retain" visitors.
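Exit rate as described here can be sketched in the same spirit. This is a minimal illustration, assuming hypothetical visits given as ordered page lists and taking the denominator to be the number of visits that viewed the page at least once, as the text puts it; some tools divide by page views instead, so treat the choice as an assumption. The name `exit_rates` is invented for this example.

```python
from collections import Counter

# Hypothetical visits: each visit is the ordered list of pages it viewed.
visits = [
    ["/home"],
    ["/home", "/products"],
    ["/products"],
    ["/home", "/products", "/contact"],
    ["/products", "/home"],
]


def exit_rates(visits):
    """For each page: visits that ended on it / visits that viewed it."""
    exits = Counter(pages[-1] for pages in visits)
    # set(pages) counts a page once per visit even if it was viewed repeatedly.
    visits_with_page = Counter(page for pages in visits for page in set(pages))
    return {page: exits[page] / visits_with_page[page]
            for page in visits_with_page}


rates = exit_rates(visits)
print(rates)
```

Every page that was viewed gets an exit rate, whether or not any visit entered through it, because the last page of a visit can be any page on the site.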
Seen this way, bounce rate and exit rate were invented for unrelated purposes. Each measures its own thing. They look very similar, but their logic is completely different. When I first learned web analytics, I was also very confused and tried desperately to work out the relationship between the two. Looking back, working out that relationship does not matter much. What matters is knowing when to use which.
So let's not let composite metrics confuse us mathematically. I believe that when Google Analytics was designed, no one expected people to check these composite metrics so precisely, which is why we now find so many numbers in Google Analytics that do not seem to agree. This does not hinder our analysis at all, because we already know which metric to use in which situation.
Count measures and composite measures
Now let's summarize what count metrics and composite metrics are. A count metric (count) is a single metric that needs no calculation and simply records a number, a frequency, a length of time and so on. Page view, visit and visitor are all count metrics, and total time on page is a count metric too. A count metric cannot be broken down any further.
A composite metric (calculated) is a metric built from several count metrics by a formula (usually the four arithmetic operations). For example, page view/visit, our commonly used measure of the breadth of pages a visitor browses, is calculated by dividing page views by visits.
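The split between count metrics and composite metrics maps naturally onto stored fields and derived properties. The sketch below is only an illustration with made-up numbers: the class name `TrafficCounts`, its fields and the sample values are all hypothetical, and the bounce rate uses the visit-based definition.

```python
from dataclasses import dataclass


@dataclass
class TrafficCounts:
    # Count metrics: recorded directly, not derived from anything else.
    page_views: int
    visits: int
    single_page_visits: int

    # Composite metrics: derived from count metrics by arithmetic.
    @property
    def page_views_per_visit(self) -> float:
        return self.page_views / self.visits if self.visits else 0.0

    @property
    def bounce_rate(self) -> float:
        return self.single_page_visits / self.visits if self.visits else 0.0


# Hypothetical month of traffic.
may = TrafficCounts(page_views=120_000, visits=40_000, single_page_visits=5_320)
print(may.page_views_per_visit)
print(f"{may.bounce_rate:.1%}")
```

Keeping only the counts as stored data and computing the ratios on demand means a composite metric can be redefined later without collecting anything again.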
Count metrics and composite metrics also raise the question of how data is presented. Web analytics usually states the value of a metric as a count. For example, the website had 34,567 visits and 23,456 visitors in May. Count metrics therefore usually appear in count-style reports.
Composite metrics are also presented in count reports. For example, the bounce rate of the website is 13.3%. Count reports are the most common kind of web analytics report. The following report is a typical count report:
The other type of report is the distribution report, which records how a metric is distributed across a statistical dimension. For example, Figure D is a typical distribution report, showing the number of visits for each path length.
The figure below is also a typical distribution report, showing how visits are distributed by visit duration:
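Both kinds of distribution report boil down to grouping visits by a dimension and counting each group. Here is a minimal sketch, assuming hypothetical per-visit path lengths and durations; the bucket edges and labels are arbitrary choices made for this example, not a standard.

```python
import bisect
from collections import Counter

# Hypothetical per-visit data.
path_lengths = [1, 1, 2, 3, 1, 2, 5, 3, 1, 2]   # pages viewed in each visit
durations = [5, 45, 200, 10, 61, 8, 600]        # seconds spent in each visit

# Assumed duration buckets; EDGES holds the upper bound of each closed bucket.
EDGES = [10, 60, 180]
LABELS = ["0-10s", "11-60s", "61-180s", "181s+"]


def duration_bucket(seconds):
    """Map a duration in seconds to its bucket label."""
    return LABELS[bisect.bisect_left(EDGES, seconds)]


# Distribution of visits by path length, sorted by path length.
visits_by_path_length = dict(sorted(Counter(path_lengths).items()))

# Distribution of visits by duration bucket, in bucket order.
bucket_counts = Counter(duration_bucket(s) for s in durations)
visits_by_duration = {label: bucket_counts[label] for label in LABELS}

print(visits_by_path_length)
print(visits_by_duration)
```

A count report would reduce the same data to one number, such as the total of 10 visits, while the distribution shows how those visits are spread across the dimension.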
Count reports and distribution reports are both commonly used data display forms in website analysis tools. When making website analysis reports, we also often use these two forms. Arguably, counts and distributions are the most common models we deal with every day.
Okay, that's it for today. If you have any thoughts, please leave a comment! Finally, I would like to recommend a film: "The Thirty-sixth Story", a Taiwanese art-house piece, light and thoroughly literary in tone, yet I think it has real power. It reminds me of my old days running a restaurant. I recommend it to the girls who like "Website Analysis in China". Of course, it would be best if you liked me too, by the way.
Author: Song Xing
Article source: http://www.chinawebanalytics.cn/metrics-and-its-back-story-1/