GitHub Repository: https://github.com/JSREI/js-cookie-monitor-debugger-hook
In an era where data is priceless, the contest between crawlers and anti-crawling measures has become intense. Cookie-based protection is one of the most common types of anti-crawling. Websites set cookies through JS code so heavily obfuscated that even its own mother would not recognize it (typically browser fingerprints, cookies that must be sent with every request, and so on). Faced with a cookie that every request must carry but whose origin you cannot find, you wade through tens of thousands of lines of obfuscated JS hoping to locate where the cookie is generated (and if your reverse-engineering approach is off, you may stumble more than once). You may even be tempted to give up, or to fall back on browser automation such as Selenium. If that sounds like you, this script is here to help! (You and I both know this paragraph is just scene-setting filler; feel free to skip it, if you have not already had the misfortune of reading it to the end...)
The function of this script is roughly divided into two parts:
This script injects its own JS code into the page and hooks document.cookie to provide its features. Therefore, before using it, first confirm that the cookie you are analyzing is actually generated by JS (a simple way to tell whether a cookie is generated by JS or returned by the server is described later).
At present, many hook scripts hook document.cookie incorrectly. This script uses a one-time hook that stays effective for repeated operations and has no impact on the browser's built-in cookie management:
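To make the idea concrete, here is a minimal sketch of hooking a cookie accessor while still delegating to the native behavior. It is not this script's actual implementation; the names installCookieHook and onSet are hypothetical and exist only for illustration.

```javascript
// Minimal sketch (assumption: this is NOT the script's real code).
// installCookieHook and onSet are hypothetical names.
function installCookieHook(doc, onSet) {
  // Walk the prototype chain to find the native "cookie" accessor
  // (in browsers it is defined on Document.prototype).
  let owner = doc;
  let descriptor;
  while (owner && !(descriptor = Object.getOwnPropertyDescriptor(owner, "cookie"))) {
    owner = Object.getPrototypeOf(owner);
  }
  if (!descriptor || !descriptor.get || !descriptor.set) {
    throw new Error("cookie accessor not found");
  }

  // Shadow it with an own property that observes each write and then
  // delegates to the native accessor, so cookie storage keeps working.
  Object.defineProperty(doc, "cookie", {
    configurable: true,
    enumerable: true,
    get() {
      return descriptor.get.call(doc);
    },
    set(value) {
      onSet(String(value)); // a real tool could log or pause here
      descriptor.set.call(doc, value);
    },
  });
}

// Browser usage (guarded so the sketch also loads outside a browser):
if (typeof document !== "undefined") {
  installCookieHook(document, (v) => console.log("cookie write:", v));
}
```

Because the wrapper always calls the native getter and setter, reads and writes behave as before; the hook only adds an observation point.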
In addition to the cookie breakpoint function, a cookie modification monitoring function has been added, which can analyze the cookies on the page from a more macro perspective:
(Forget it, give up coding...)
Color is used to distinguish operation types:
Each operation is followed by a code location. Click it to jump to the JS code that performed the operation.
Starting from v0.6, more powerful and more flexibly configurable breakpoint rules have been introduced, along with an event mechanism that subdivides cookie modifications into three events: add, delete, and update. This supports finer-grained breakpoints. For details about cookie events, please refer to Part 5 of this article.
Why is it designed this way? A fairly common situation is that the anti-crawling cookie on the target website is set by JS, but the code first deletes it over and over, many times, before finally adding the real value. Setting the cookie this way happens to defeat ordinary cookie hook debugging.
Here is one example. F5's cookie protection uses a cookie named TS51c47c46075, which is deleted many times and then added again. In this case, you can set a breakpoint on the add event of the cookie named TS51c47c46075 so that the red deletion events do not get in the way.
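Using the simplified rule syntax described later in this document, such a breakpoint might be configured roughly as follows. This is only a sketch; replace the cookie name with the one used on your target site.

```javascript
// Break only when the F5 cookie is added, ignoring the repeated deletions.
const debuggerRules = [{ "add": "TS51c47c46075" }];
```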
In theory, any method that can inject this script's JS code into the page will work. Here the Tampermonkey extension is used to inject it.
The Tampermonkey extension can be installed from the Chrome Web Store:
https://chrome.google.com/webstore/detail/tampermonkey/dhdgffkkebhmkfjojejmpbldmpobfkfo
If you cannot get past the firewall to reach the store, you can search for "Tampermonkey" on Baidu and download it from a third-party website. However, be careful not to install a fake malicious extension. Installing from the official store is recommended.
Other tools are also available, as long as the JS code of this script can be injected into the top of the page for execution.
You can install the userscript from the script store, or you can copy the code and create the script locally.
Installing from the store is recommended. A script installed from Greasy Fork is updated automatically when new versions are released. This script is published there:
https://greasyfork.org/zh-CN/scripts/419781-js-cookie-monitor-debugger-hook
If you find automatic updates too annoying, or have other concerns, you can copy the code of this script here:
https://github.com/CC11001100/js-cookie-monitor-debugger-hook/blob/main/js-cookie-monitor-debugger-hook.js
After reviewing the code and confirming there is no problem, you can add it in the Tampermonkey management panel.
Note that monitoring is meant to give you an overall, macro-level picture, not to pinpoint details (using a tool the right way improves efficiency; of course one person's knowledge is limited, so feedback on other interesting uses is welcome). For example, when opening a page:
Based on this picture, we can have a general understanding of which cookies on this website are operated by JS, when and how they are operated.
Another example is using the monitor to observe how a cookie changes over time. On this page, for instance, the timestamps show that the cookie is changed every half a minute:
(2021-1-7 18:27:49, v0.4 added this feature): If the console prints too much information, you can use Chrome's built-in console filter. The format of the printed logs has been unified, so you only need to filter by cookieName = <cookie name>, for example:
Please note that the text you search for must be URL-decoded, otherwise it may not match, because the console output is URL-decoded before it is printed.
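As a small illustration of why the decoded form matters, the sketch below decodes a hypothetical URL-encoded cookie name before building the filter text. It assumes the decoding behaves like the standard decodeURIComponent; the cookie name is made up.

```javascript
// Hypothetical cookie name as it appears URL-encoded in a request.
const encodedName = "user%20token";

// Decode it before searching, because the log shows the decoded form.
const decodedName = decodeURIComponent(encodedName);

// Text to paste into the console filter box.
const filterText = "cookieName = " + decodedName;
console.log(filterText); // cookieName = user token
```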
If you are not sure whether the cookie you are interested in is generated locally or returned by the server through a Set-Cookie response header, enable this script, refresh the target website, and search for the cookie name in the console, using the same method as in the previous section. If the cookie shows up in the log, it was set by JS; if it does not, it most likely came from the server. When the cookie name is short and not distinctive, you can add the cookieName prefix to narrow the search, for example:
cookieName = v
Sometimes the target website may set a cookie repeatedly with the same value. This variable is used to ignore such events:
Generally, just keep the default.
@since v0.6
This part of the document applies to v0.6 and later. If your local version is earlier than v0.6, please upgrade before reading on.
Starting from v0.6, setting breakpoints on cookie changes has become both more complex and simpler: more complex because an event mechanism was introduced, and simpler because breakpoint rule configuration is now simpler and more flexible.
Breakpoint rules are divided into standard rules and simplified rules. Standard rules exist to make the script's internal processing easy, while simplified rules exist to make configuration convenient for users. Normally you only need to understand the simplified rules. Come back to the standard rules only when the simplified rules cannot express what you need.
All rules are configured in the debuggerRules array, a variable at the top of the script. If you can't find it, press Ctrl+F and search for the variable name:
debuggerRules
This variable is an array that stores the rule conditions that determine when a breakpoint is entered.
Note that the rules in the array are in an OR relationship. When a cookie modification event is triggered, the rules are tried in order, and as soon as one rule matches, the breakpoint is entered.
Enter a breakpoint when the cookie named foo changes:
const debuggerRules = ["foo"];
A string rule like the one above matches when the cookie name is equal to the given string.
Note that if the name you want to match contains URL-encoded parts, URL-decode it first and then paste it here. The same applies everywhere else strings are used and will not be repeated.
If the cookie name contains a constantly changing part, such as a timestamp or a UUID, and cannot be matched by a fixed name, use a regular expression instead:
const debuggerRules = [/foo.+/];
In most cases, only these two configurations are enough.
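The sketch below shows how string and regular expression rules could be evaluated against a cookie name, with the rules combined by OR. It is an illustration only: nameMatches and shouldBreak are hypothetical helpers, and applying the regular expression with test() is an assumption about how matching works.

```javascript
// Hypothetical helper: does a cookie name satisfy one simplified rule?
function nameMatches(rule, cookieName) {
  if (typeof rule === "string") return rule === cookieName; // exact match
  if (rule instanceof RegExp) return rule.test(cookieName); // assumed: test()
  return false;
}

// Rules are ORed: a single matching rule is enough.
function shouldBreak(rules, cookieName) {
  return rules.some((rule) => nameMatches(rule, cookieName));
}

const debuggerRules = ["foo", /foo.+/];
console.log(shouldBreak(debuggerRules, "foo"));        // true (string rule)
console.log(shouldBreak(debuggerRules, "foo_169999")); // true (regex rule)
console.log(shouldBreak(debuggerRules, "bar"));        // false
```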
Let's try it in practice. Open this page:
https://www.ishumei.com/trial/captcha.html
You can see that the script has detected some cookie operations:
One of them, smidV2, looks suspicious, so we add a breakpoint for it:
After modifying the debuggerRules array, be sure to press Ctrl+S to save the script. Because Tampermonkey injects the JS code when the page loads, you need to refresh the page so that the script is injected again. After the refresh, the breakpoint is entered automatically:
Red box A in the picture above contains some variables that are passed in specifically for debugging. Hover over these variables to view their values and get a rough idea of the conditions of the current breakpoint:
Next is red box B. We set the cookie breakpoint in order to trace the call stack and locate where the cookie is generated. The frames in the red box belong to this script's own call stack and are clearly marked with userscript.html. Just ignore that part of the call stack.
Then trace the call stack and you can see where the cookie is set:
Of course, looking at this frame alone is not enough. What you need to do is walk back up the stack step by step until you locate where the cookie is actually generated. This script can only help you hit the breakpoint; the rest of the journey to the stars and the sea is up to you!
Enter a breakpoint when the cookie named foo is added:
const debuggerRules = [{"add": "foo"}];
Enter a breakpoint when the cookie named foo is deleted:
const debuggerRules = [{"delete": "foo"}];
Enter a breakpoint when the cookie named foo already exists and its value is updated:
const debuggerRules = [{"update": "foo"}];
Multiple events can be specified at the same time. The following rule enters a breakpoint on add and update, which is equivalent to excluding deletions:
const debuggerRules = [{"add|update": "foo"}];
Strings or regular expressions can be used wherever cookie name matching is involved:
const debuggerRules = [{"add": /foo_\d+/}];
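To show how the simplified forms relate to the standard rule format described in this document, here is a sketch of a normalizer. It is not the script's real code: normalizeRules is a hypothetical name, and the handling of objects with several keys is an assumption.

```javascript
// Hypothetical normalizer: turns simplified rules into standard rules.
function normalizeRules(rules) {
  return rules.flatMap((rule) => {
    // Bare string or regex: match the name for every event type.
    if (typeof rule === "string" || rule instanceof RegExp) {
      return [{ name: rule }];
    }
    // Already a standard rule (it has the required "name" field).
    if ("name" in rule) {
      return [rule];
    }
    // Simplified object form, e.g. { "add|update": "foo" }:
    // each key is an events string and each value is a name pattern.
    return Object.entries(rule).map(([events, name]) => ({ events, name }));
  });
}

console.log(normalizeRules(["foo", { "add|update": /foo_\d+/ }]));
```

Keeping a single internal format means the matching code only has to deal with one shape, whichever form the user wrote.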
The simplified rules above are converted into standard rules. You can also configure standard rules directly in the debuggerRules array. The format of a standard rule is:
{
"events": "{add|delete|update}",
"name": {"cookie-name" | /cookie-name-regex/},
"value": {"cookie-value" | /cookie-value-regex/}
}
events: a string indicating which event types this rule matches. It can be a single event, such as add, or several events separated by |, such as add|update. If that looks cramped, you can also put spaces on both sides of the |, such as add | update.
When events is configured, the rule matches only the given event types. When it is not configured, all event types are matched by default.
name: a string or a regular expression. The rule matches when the cookie name matches the given string or regular expression. This field is required.
value: a string or a regular expression. The rule matches when the cookie value matches the given string or regular expression. This field is optional; if it is not configured, the value is not checked.
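The three fields can be read as a conjunction: the event type, the name, and (if present) the value must all match. The following sketch illustrates that reading. It is not the script's actual code; ruleMatches, matchesPattern, and the event object shape { type, name, value } are assumptions, as is treating a string pattern as an exact match.

```javascript
// Hypothetical helper: strings match exactly, regexes via test().
function matchesPattern(pattern, text) {
  return typeof pattern === "string" ? pattern === text : pattern.test(text);
}

// Hypothetical matcher for one standard rule.
function ruleMatches(rule, event) {
  // "events" is optional; when absent, every event type matches.
  if (rule.events !== undefined) {
    const allowed = rule.events.split("|").map((e) => e.trim());
    if (!allowed.includes(event.type)) return false;
  }
  // "name" is required.
  if (!matchesPattern(rule.name, event.name)) return false;
  // "value" is optional; when absent, it is ignored.
  if (rule.value !== undefined && !matchesPattern(rule.value, event.value)) {
    return false;
  }
  return true;
}

const rule = { events: "add | update", name: /^foo_\d+$/, value: "bar" };
console.log(ruleMatches(rule, { type: "add", name: "foo_1", value: "bar" }));    // true
console.log(ruleMatches(rule, { type: "delete", name: "foo_1", value: "bar" })); // false
console.log(ruleMatches(rule, { type: "update", name: "foo_1", value: "baz" })); // false
```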
The configuration of breakpoint rules was introduced earlier, and event types were mentioned many times. So far we only know the name string of each event, not what each event means underneath. This section explains how each event is implemented.
Cookie changes are subdivided into adding cookies, deleting cookies, and updating existing cookie values. Each event corresponds to an event name:
The cookie did not exist locally before and is being added for the first time. This may be your first visit to the website, or you may have cleared cookies and visited again, or the website may generate a new cookie on every visit, or the website's own code may even delete the cookie and add it again. All of these trigger the cookie add event.
For example, execute the following code. In order to ensure that the cookie does not exist before, a timestamp is added to the name of the cookie:
document.cookie = "foo_" + new Date().getTime() + "=bar; expires=Fri, 31 Dec 9999 23:59:59 GMT; path=/";
When we run this line of code in the console, the Cookie Add event will be triggered:
When a cookie already exists locally and you try to set a value for it, the Update Cookie event will be triggered.
For example, the following code:
document.cookie = "foo_10086=blabla; expires=Fri, 31 Dec 9999 23:59:59 GMT; path=/";
document.cookie = "foo_10086=wuawua; expires=Fri, 31 Dec 9999 23:59:59 GMT; path=/";
The first statement triggers the cookie add event, and the second statement triggers the cookie update event because the cookie being set already exists.
If the front-end developer gives an expires earlier than the current time when setting the cookie, it means that the cookie needs to be deleted. For example, a common way to delete cookies is:
const expires = new Date(new Date().getTime() - 1000 * 30).toUTCString();
document.cookie = "foo=; expires=" + expires + "; path=/";
When we run this code in the console, the Cookie deletion event will be triggered:
As the example above shows, the cookie delete event is triggered purely by checking expires. The script does not actually check whether the cookie existed before.
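The three events described in this section can be summarized in a small classifier. This is a sketch under stated assumptions, not the script's real logic: classifyCookieEvent is a hypothetical name, the set of already known cookie names is passed in explicitly, and only the expires attribute is inspected, as the text above describes.

```javascript
// Hypothetical classifier for a string assigned to document.cookie.
// cookieString:  e.g. "foo=bar; expires=...; path=/" (name=value form assumed)
// existingNames: Set of cookie names that existed before the assignment
// now:           current time in milliseconds
function classifyCookieEvent(cookieString, existingNames, now = Date.now()) {
  const [pair, ...attributes] = cookieString.split(";");
  const name = pair.split("=")[0].trim();

  for (const attribute of attributes) {
    const [key, ...rest] = attribute.split("=");
    if (key.trim().toLowerCase() === "expires") {
      const expiresAt = Date.parse(rest.join("=").trim());
      // An expiry in the past means deletion, whether or not the
      // cookie existed before.
      if (!Number.isNaN(expiresAt) && expiresAt < now) {
        return { type: "delete", name };
      }
    }
  }
  return { type: existingNames.has(name) ? "update" : "add", name };
}

const known = new Set(["foo_10086"]);
const now = Date.UTC(2024, 0, 1);
console.log(classifyCookieEvent("foo_1=bar; expires=Fri, 31 Dec 9999 23:59:59 GMT; path=/", known, now).type); // add
console.log(classifyCookieEvent("foo_10086=wuawua; path=/", known, now).type);                                 // update
console.log(classifyCookieEvent("foo=; expires=Thu, 01 Jan 1970 00:00:00 GMT; path=/", known, now).type);      // delete
```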
As mentioned earlier, breakpoint rules can specify an event type. In addition, each event type has a flag that indicates whether breakpoints for that event type are enabled, and this flag has the highest priority. For example, if the cookie delete breakpoint is not enabled and a cookie delete event is triggered, the script first checks the flag. Because it is turned off, the event is ignored and no attempt is made to match the breakpoint rules (the log of this delete event is still printed in the developer tools console).
So now the situation has become very complicated. Let us walk through the process of this small Cookie breakpoint:
By default, only the breakpoints for cookie add events and cookie update events are enabled:
This is because, in most situations, adding a cookie and updating a cookie can be treated alike, since both assign a value to the cookie, and in most cases we do not care about cookie delete events. If this default does not meet your needs, you can change the corresponding values in enableEventDebugger yourself.
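The priority of the per-event flag can be pictured as follows. This is a sketch only: the object shape of enableEventDebugger shown here, the handleCookieEvent function, and the log text are assumptions for illustration, and rules are reduced to plain cookie names to keep the example short.

```javascript
// Assumed shape; the defaults mirror the text above (add and update on).
const enableEventDebugger = { add: true, update: true, delete: false };

// Hypothetical flow: always log, then check the flag, then try the rules.
// Returns true when a breakpoint would be entered.
function handleCookieEvent(event, ruleNames, log = console.log) {
  log(event.type + " cookieName = " + event.name); // logged even if the flag is off
  if (!enableEventDebugger[event.type]) {
    return false; // flag off: rules are not consulted at all
  }
  return ruleNames.some((ruleName) => ruleName === event.name);
}

console.log(handleCookieEvent({ type: "add", name: "foo" }, ["foo"]));    // true
console.log(handleCookieEvent({ type: "delete", name: "foo" }, ["foo"])); // false
```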
If you encounter any problems during use, you can report them in Issues on GitHub, leave feedback in the script's comment section on Greasy Fork, or send me an email. I will deal with it as soon as possible after seeing it.
Starting from version v0.6, a variable has been added to adjust the font size of the log printed by this script on the console, in px:
As the version iterates, it may no longer be at this location. If you can't find it at once, just search in the code:
consoleLogFontSize
Then modify the value of this variable.
Or as another solution, you can hold down Ctrl + mouse wheel in the developer tools console to zoom and adjust the overall size. This is a function that comes with the Chrome browser.
As explained at the beginning of this article, this script must be injected at the very start of the page and executed before the hook can take effect. Consider pages like the first layer of Accelerator, where the entire response is just a single script whose logic looks like this:
<script>
document.cookie = /* some odd JS here that computes the cookie */;
location.href = "(redirect target)";
</script>
The cookie is set and the page immediately redirects to a new page. In this case the hook may not take effect in time; this is a limitation of scripts injected by Tampermonkey. If you must hook it, you can route traffic through a proxy and inject this script at the head of the response for this URL.
Below this page is a summary of some practical examples of reverse engineering using this script:
Click me to enter the navigation page
This project was split out from: https://github.com/CC11001100/crawler-js-hook-framework-public/tree/master/001-cookie-hook#%E7%9B%91%E6%8E%A7%E5%AE%9A%E4%BD%8Djavascript%E6%93%8D%E4%BD%9Cookie
After changing the namespace, the installation count may be reset, so I took a screenshot to commemorate it. As of now (2022-7-29 21:40:01), the number of installations has exceeded 300. That feels like a lot for a small tool in such a niche field. It was not easy...
Thank you to the enthusiastic netizens for feedback, thank you for your support.
js-cookie-monitor-debugger-hook has now joined the 404 Star Chain Project
Scan the QR code to join the reverse-engineering discussion group:
If the group QR code has expired, you can add me on my personal WeChat and send [Reverse Group], and I will add you to the group:
Click here or scan the QR code to join the TG discussion group: