Using JavaScript to crack CAPTCHAs
Recently, a JavaScript script that can crack CAPTCHAs appeared on the Internet, in the form of a Greasemonkey user script. Written by Shaun Friedle, it easily solves the CAPTCHA on the Megaupload site. If you don't believe it, try it yourself at http://herecomethelizards.co.uk/mu_captcha/ !
Now, the CAPTCHA used by the Megaupload site has been defeated by the code above. To be honest, the CAPTCHA here is not particularly well designed. But what is more interesting is:
1. The HTML5 Canvas getImageData API can be used to obtain pixel data from the CAPTCHA image. With Canvas, we can not only draw an image onto a canvas but also extract the pixels back out of it later (see the sketch after this list).
2. The above script contains a neural network implemented entirely in JavaScript.
3. After Canvas extracts the pixel data from the image, it is fed into the neural network, and a simple form of optical character recognition infers which characters appear in the CAPTCHA.
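For instance, this is roughly how pixel data can be pulled out of an image with Canvas (a minimal sketch; the "captcha" element id is assumed for illustration):

var img = document.getElementById("captcha"); // hypothetical element id
var canvas = document.createElement("canvas");
canvas.width = img.width;
canvas.height = img.height;
var ctx = canvas.getContext("2d");
ctx.drawImage(img, 0, 0); // embed the image in the canvas
// Re-extract it as raw RGBA pixel data
var image_data = ctx.getImageData(0, 0, canvas.width, canvas.height);

Note that getImageData only works when the image comes from the same origin as the page, which a user script running on the site itself satisfies.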
By reading the source code, we can better understand not only how the script works but also how the CAPTCHA itself is built. As you saw above, the CAPTCHAs used here are not very complex: each consists of three characters, each character has its own color, only the 26 letters of the alphabet are used, and every character is drawn in the same font.
The first step is obvious: copy the CAPTCHA onto a canvas and convert it to grayscale.
function convert_grey(image_data){
    for (var x = 0; x < image_data.width; x++){
        for (var y = 0; y < image_data.height; y++){
            var i = x*4 + y*4*image_data.width;
            // Weighted RGB average (the standard Rec. 601 luma coefficients)
            var luma = Math.floor(image_data.data[i] * 299/1000 +
                image_data.data[i+1] * 587/1000 +
                image_data.data[i+2] * 114/1000);
            image_data.data[i] = luma;
            image_data.data[i+1] = luma;
            image_data.data[i+2] = luma;
            image_data.data[i+3] = 255; // fully opaque
        }
    }
}
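Called on an ImageData object pulled from the canvas, the conversion happens in place, and the result can be written back (a small usage sketch, reusing the canvas and ctx from the earlier step):

var image_data = ctx.getImageData(0, 0, canvas.width, canvas.height);
convert_grey(image_data); // grayscale the pixels in place
ctx.putImageData(image_data, 0, 0); // write the result back to the canvas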
Then the canvas is split into three separate pixel matrices, one per character. This step is easy to achieve because each character is drawn in its own color, so they can be told apart by color alone.
filter(image_data[0], 105);
filter(image_data[1], 120);
filter(image_data[2], 135);

function filter(image_data, color){
    for (var x = 0; x < image_data.width; x++){
        for (var y = 0; y < image_data.height; y++){
            var i = x*4 + y*4*image_data.width;
            // Turn all the pixels of the certain color to white
            if (image_data.data[i] == color) {
                image_data.data[i] = 255;
                image_data.data[i+1] = 255;
                image_data.data[i+2] = 255;
            // Everything else to black
            } else {
                image_data.data[i] = 0;
                image_data.data[i+1] = 0;
                image_data.data[i+2] = 0;
            }
        }
    }
}
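The three ImageData objects passed to filter are independent copies of the grayscale canvas; each call to getImageData returns a fresh copy, so the three filters cannot interfere with one another (a sketch):

var image_data = [
    ctx.getImageData(0, 0, canvas.width, canvas.height), // copy for character 1
    ctx.getImageData(0, 0, canvas.width, canvas.height), // copy for character 2
    ctx.getImageData(0, 0, canvas.width, canvas.height)  // copy for character 3
];

Presumably 105, 120, and 135 are the gray levels that the three character colors map to after the grayscale conversion.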
Finally, stray interfering pixels are eliminated. To do this, find white (matched) pixels that are sandwiched between black (unmatched) pixels directly above and below, and delete them.
function remove_noise(image_data){
    for (var x = 0; x < image_data.width; x++){
        // Skip the first and last rows, which have no pixel above/below
        for (var y = 1; y < image_data.height - 1; y++){
            var i = x*4 + y*4*image_data.width;
            var above = x*4 + (y-1)*4*image_data.width;
            var below = x*4 + (y+1)*4*image_data.width;
            // A lone white pixel with black directly above and below is noise
            if (image_data.data[i] == 255 &&
                image_data.data[above] == 0 &&
                image_data.data[below] == 0) {
                image_data.data[i] = 0;
                image_data.data[i+1] = 0;
                image_data.data[i+2] = 0;
            }
        }
    }
}
Now we have an approximate shape for each character, but before loading it into the neural network the script performs one more round of edge detection: it finds the leftmost, rightmost, topmost, and bottommost pixels of the shape, treats them as a bounding rectangle, and copies that rectangle onto a 20×25 pixel matrix.
cropped_canvas.getContext("2d").fillRect(0, 0, 20, 25);
var edges = find_edges(image_data[i]);
cropped_canvas.getContext("2d").drawImage(canvas, edges[0], edges[1],
edges[2]-edges[0], edges[3]-edges[1], 0, 0,
edges[2]-edges[0], edges[3]-edges[1]);
image_data[i] = cropped_canvas.getContext("2d").getImageData(0, 0,
cropped_canvas.width, cropped_canvas.height);
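The find_edges function is not shown above; a plausible minimal implementation, which scans the white pixels for their bounding box (a sketch, not the original script's code):

function find_edges(image_data){
    var left = -1, top = -1, right = -1, bottom = -1;
    for (var x = 0; x < image_data.width; x++){
        for (var y = 0; y < image_data.height; y++){
            var i = x*4 + y*4*image_data.width;
            if (image_data.data[i] == 255){ // part of the character
                if (left == -1 || x < left) left = x;
                if (top == -1 || y < top) top = y;
                if (right == -1 || x > right) right = x;
                if (bottom == -1 || y > bottom) bottom = y;
            }
        }
    }
    return [left, top, right, bottom];
}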
What do we get after all this processing? A 20×25 matrix containing a single rectangle filled with black and white pixels. Excellent!
Next, the rectangle is simplified even further. The script strategically samples points from the matrix as "photoreceptors" to feed into the neural network. For example, one photoreceptor might correspond to the pixel at position 9×6 and report whether a pixel is present there. The script extracts a series of such states (far fewer than processing the full 20×25 matrix would require; only 64 states are sampled) and feeds them into the neural network.
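The exact sampling positions are the script author's choice; conceptually the extraction looks something like this (a sketch, with receptor coordinates invented for illustration):

// Receptor coordinates are invented here for illustration;
// the real script uses 64 hand-picked positions.
var receptor_points = [[9, 6], [4, 12], [15, 3]];

function get_receptor_states(image_data){
    var states = [];
    for (var r = 0; r < receptor_points.length; r++){
        var x = receptor_points[r][0];
        var y = receptor_points[r][1];
        var i = x*4 + y*4*image_data.width;
        states.push(image_data.data[i] == 255); // true if a white pixel sits there
    }
    return states;
}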
You might ask: why not just compare pixels directly? Do we really need a neural network? The point is to handle ambiguous cases. If you tried the demo above, you will have noticed that direct pixel comparison is slightly more error-prone than going through the neural network, though it does not fail often. For most users, direct pixel comparison would probably be good enough.
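Direct comparison would amount to counting, for each letter, how many of the 64 receptor states match a stored template, and picking the best count (a sketch; the templates object is hypothetical):

function guess_letter_directly(states, templates){
    var best = null, best_matches = -1;
    for (var letter in templates){
        var matches = 0;
        for (var i = 0; i < states.length; i++){
            if (states[i] == templates[letter][i]) matches++;
        }
        if (matches > best_matches){
            best_matches = matches;
            best = letter;
        }
    }
    return best; // the letter whose template agrees with the most receptors
}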
The next step is to try to guess the letter. The 64 Boolean values obtained from one of the character images are fed into the neural network, along with a series of pre-computed weights. A key idea of neural networks is that the desired answers are known in advance, so the network can be trained against them: the script's author could run the script many times, collecting the best-scoring values and working backward from the values that produced correct answers. The scores themselves carry no special meaning.
When the neural network receives the 64 Boolean values for one letter of the CAPTCHA, it compares them against the pre-computed data for the alphabet and produces a match score for each letter. (The final result might look like: 98% likely to be an A, 36% likely to be a B, and so on.)
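In spirit, this scoring stage resembles a single-layer network: each letter has a set of pre-computed weights, and the 64 inputs are combined into a per-letter score (a simplified sketch; the actual script's network and weights differ):

function score_letters(states, weights){
    var scores = {};
    for (var letter in weights){
        var sum = 0;
        for (var i = 0; i < states.length; i++){
            // Treat each Boolean receptor state as a 0/1 input
            sum += (states[i] ? 1 : 0) * weights[letter][i];
        }
        // Squash the weighted sum into a 0..1 confidence value
        scores[letter] = 1 / (1 + Math.exp(-sum));
    }
    return scores; // e.g. { A: 0.98, B: 0.36, ... }
}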
After all three letters of the CAPTCHA have been processed, the final result comes out. Note that the script is not 100% accurate (I wonder whether the scores could be improved if the letters were not converted into rectangles at the start), but it is quite good, at least for its current purpose. And everything happens in the browser, using standard client-side technology!
As a final note, this script should be treated as a special case. The technique may work well on other simple CAPTCHAs, but complex CAPTCHAs are beyond its reach (especially for this kind of client-side analysis). I hope this project inspires more people to build marvelous things, because the potential here is huge.