I recently had to solve the Geetest slider captcha.
The captcha is simple enough: you slide a puzzle piece into its slot.
While doing this, I thought it would be a fun challenge to write a program that solves this captcha!
To begin, I'm going to use Puppeteer, which lets me control a Chrome browser with JavaScript.
Let's set everything up!
npm init && npm i puppeteer
touch index.js
This creates a new Node project, installs Puppeteer, and creates an index.js file.
Then I add the following to the index.js file.
I use this captcha test page that contains the puzzle slider captcha.
const puppeteer = require('puppeteer');

(async () => {
  // headless: false opens a visible browser window so we can watch it work.
  const browser = await puppeteer.launch({
    headless: false,
  });
  const page = await browser.newPage();
  await page.goto('https://scraperbox.com/captcha/geetest');
})();
Let's run the program:
node index.js
This spins up a Chrome browser and loads the Geetest captcha page, nice! 🎉
Alright, the next step is to solve the captcha.
I create a new function that waits for the "Click to verify" element to show up and then clicks it.
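async function clickVerifyButton(page) {
  await page.waitForSelector('[aria-label="Click to verify"]');
  await page.click('[aria-label="Click to verify"]');
  // Wait until the captcha canvases are visible.
  await page.waitForSelector('.geetest_canvas_img canvas', { visible: true });
  // Give the captcha a moment to finish rendering. A plain timer is used
  // because page.waitForTimeout was removed in newer Puppeteer releases.
  await new Promise((resolve) => setTimeout(resolve, 1000));
}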
Next, I simply call the function in the main piece of code.
(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto('https://scraperbox.com/captcha/geetest');
  // Call the function here
  await clickVerifyButton(page);
})();
When running this, you will see that the browser clicks the verify button. 🚀
Behind the scenes, Geetest generates 3 images. Let's extract them.
I'm going to use Jimp, a great image package for Node.
npm i jimp
And add the dependency at the top of your index.js
const puppeteer = require('puppeteer');
const Jimp = require('jimp');
// Rest of the code
Then, I create a new function to extract the images.
async function getCaptchaImages(page) {
  const images = await page.$$eval(
    '.geetest_canvas_img canvas',
    (canvases) => {
      return canvases.map((canvas) => {
        // This gets the base64 image data from the
        // HTML canvas. The replace call simply strips
        // the "data:image/png;base64," prefix.
        return canvas
          .toDataURL()
          .replace(/^data:image\/png;base64,/, '');
      });
    }
  );
  // For each base64 string, create a Node.js Buffer.
  // Buffer.from replaces the deprecated `new Buffer(...)` constructor.
  const buffers = images.map((img) => Buffer.from(img, 'base64'));
  // And read each buffer into a Jimp image.
  return {
    captcha: await Jimp.read(buffers[0]),
    puzzle: await Jimp.read(buffers[1]),
    original: await Jimp.read(buffers[2]),
  };
}
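To make the data URL handling concrete, here is a minimal standalone sketch of the decoding step. The helper names `dataUrlToBuffer` and `looksLikePng` are my own and not part of the original code; the sample data URL is synthetic and only contains the PNG signature bytes.

```javascript
// Hypothetical helper: turn a base64 PNG data URL into a Buffer of PNG bytes.
function dataUrlToBuffer(dataUrl) {
  const prefix = /^data:image\/png;base64,/;
  if (!prefix.test(dataUrl)) {
    throw new Error('Expected a base64-encoded PNG data URL');
  }
  return Buffer.from(dataUrl.replace(prefix, ''), 'base64');
}

// Every PNG file starts with this fixed 8-byte signature.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Hypothetical sanity check: does the buffer start with the PNG signature?
function looksLikePng(buffer) {
  return buffer.length >= 8 && buffer.subarray(0, 8).equals(PNG_SIGNATURE);
}

// Synthetic data URL built from the signature alone (not a full image).
const sample = 'data:image/png;base64,' + PNG_SIGNATURE.toString('base64');
console.log(looksLikePng(dataUrlToBuffer(sample))); // true
```

A check like this can be handy before handing the buffers to Jimp, since a canvas that has not rendered yet may not yield the image you expect.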
Let's call this function and write the result to 3 image files.
// Add this in the main function
const images = await getCaptchaImages(page);
images.captcha.write("./captcha.png");
images.original.write("./original.png");
images.puzzle.write("./puzzle.png");
When you run this, three files should be created that show the images! 🦾
We must find the location of the puzzle piece. To do this I'm going to calculate the difference between the captcha and the original image.
I'm going to use the pixelmatch and OpenCV packages to do this.
npm i pixelmatch opencv-wasm
Once again, don't forget to require the packages at the top of the index.js file.
const puppeteer = require('puppeteer');
const Jimp = require('jimp');
const pixelmatch = require('pixelmatch');
const { cv } = require('opencv-wasm');
// Rest of the code
Then, I create a new function that creates the diff image.
async function getDiffImage(images) {
  const { width, height } = images.original.bitmap;

  // Use the pixelmatch package to create an image diff
  const diffImage = new Jimp(width, height);
  pixelmatch(
    images.original.bitmap.data,
    images.captcha.bitmap.data,
    diffImage.bitmap.data,
    width,
    height,
    { includeAA: true, threshold: 0.2 }
  );

  // Use OpenCV to make the diff result clearer:
  // threshold to pure black/white, then erode and dilate to remove noise.
  const src = cv.matFromImageData(diffImage.bitmap);
  const dst = new cv.Mat();
  const kernel = cv.Mat.ones(5, 5, cv.CV_8UC1);
  const anchor = new cv.Point(-1, -1);
  cv.threshold(src, dst, 127, 255, cv.THRESH_BINARY);
  cv.erode(dst, dst, kernel, anchor, 1);
  cv.dilate(dst, dst, kernel, anchor, 1);

  return new Jimp({
    width: dst.cols,
    height: dst.rows,
    data: Buffer.from(dst.data),
  });
}
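For intuition about what the diff step does, here is a deliberately simplified stand-in. It is not how pixelmatch works internally (pixelmatch uses a perceptual color metric and anti-aliasing detection); the function name `diffMask` and the `tolerance` parameter are assumptions made for this sketch.

```javascript
// Simplified stand-in for an image diff: mark every pixel whose RGB
// channels differ by more than `tolerance` in total.
// `a` and `b` are flat RGBA arrays (4 bytes per pixel), like Jimp's bitmap.data.
function diffMask(a, b, width, height, tolerance = 30) {
  const mask = new Uint8Array(width * height);
  for (let i = 0; i < width * height; i++) {
    const o = i * 4; // byte offset of pixel i
    const delta =
      Math.abs(a[o] - b[o]) +
      Math.abs(a[o + 1] - b[o + 1]) +
      Math.abs(a[o + 2] - b[o + 2]);
    mask[i] = delta > tolerance ? 1 : 0;
  }
  return mask;
}

// Two 3x1 images that are identical except for the middle pixel.
const original = Uint8Array.from([10, 10, 10, 255, 200, 200, 200, 255, 10, 10, 10, 255]);
const captcha = Uint8Array.from([10, 10, 10, 255, 40, 40, 40, 255, 10, 10, 10, 255]);
console.log(Array.from(diffMask(original, captcha, 3, 1))); // [ 0, 1, 0 ]
```

The pixels that differ between the original and the captcha image are exactly the ones covered by the darkened slot, which is why the diff reveals its position.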
And I call the new getDiffImage function from the main function.
The whole main function now looks like this:
(async () => {
  const browser = await puppeteer.launch({
    headless: false,
  });
  const page = await browser.newPage();
  await page.goto('https://scraperbox.com/captcha/geetest');
  await clickVerifyButton(page);
  const images = await getCaptchaImages(page);
  const diffImage = await getDiffImage(images);
  diffImage.write("./diff.png");
})();
I run the program and it shows a nice bright red puzzle piece!
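The erode-then-dilate pair used on the diff is what removes stray speckles while keeping the large shape. Here is a minimal sketch of that idea on a tiny binary grid. It uses a 3x3 neighbourhood rather than the 5x5 kernel above, simply ignores pixels outside the grid, and the names `morph`, `erode`, `dilate`, and `open` are my own.

```javascript
// Apply a 3x3 neighbourhood filter to a binary grid (array of rows).
// `pick` is Math.min for erosion and Math.max for dilation.
function morph(grid, pick) {
  const h = grid.length;
  const w = grid[0].length;
  return grid.map((row, y) =>
    row.map((_, x) => {
      const neighbours = [];
      for (let dy = -1; dy <= 1; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          const ny = y + dy;
          const nx = x + dx;
          // Pixels outside the grid are skipped.
          if (ny >= 0 && ny < h && nx >= 0 && nx < w) neighbours.push(grid[ny][nx]);
        }
      }
      return pick(...neighbours);
    })
  );
}

const erode = (grid) => morph(grid, Math.min);   // shrinks shapes, deletes small specks
const dilate = (grid) => morph(grid, Math.max);  // grows the surviving shapes back
const open = (grid) => dilate(erode(grid));      // erode, then dilate

// A 3x3 block plus one noise pixel in the top-left corner.
const noisy = [
  [1, 0, 0, 0, 0, 0],
  [0, 0, 0, 0, 0, 0],
  [0, 0, 1, 1, 1, 0],
  [0, 0, 1, 1, 1, 0],
  [0, 0, 1, 1, 1, 0],
  [0, 0, 0, 0, 0, 0],
];
// After opening, the noise pixel is gone and the block is unchanged.
console.log(open(noisy).map((row) => row.join('')).join('\n'));
```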
Now we must do some image magic 🧙 to find the center coordinates of that puzzle piece.
I create a new function.
async function getPuzzlePieceSlotCenterPosition(diffImage) {
  const src = cv.matFromImageData(diffImage.bitmap);
  const dst = new cv.Mat();

  cv.cvtColor(src, src, cv.COLOR_BGR2GRAY);
  cv.threshold(src, dst, 150, 255, cv.THRESH_BINARY_INV);

  // This will find the contours of the image.
  const contours = new cv.MatVector();
  const hierarchy = new cv.Mat();
  cv.findContours(
    dst,
    contours,
    hierarchy,
    cv.RETR_EXTERNAL,
    cv.CHAIN_APPROX_SIMPLE
  );

  // Next, extract the center position from the first contour.
  const contour = contours.get(0);
  const moment = cv.moments(contour);
  const cx = Math.floor(moment.m10 / moment.m00);
  const cy = Math.floor(moment.m01 / moment.m00);

  // Just for fun, let's draw the contours and center on a new image.
  // Convert to RGBA (4 channels), because Jimp expects RGBA bitmap data.
  cv.cvtColor(dst, dst, cv.COLOR_GRAY2RGBA);
  const red = new cv.Scalar(255, 0, 0, 255);
  cv.drawContours(dst, contours, 0, red);
  cv.circle(dst, new cv.Point(cx, cy), 3, red);
  new Jimp({
    width: dst.cols,
    height: dst.rows,
    data: Buffer.from(dst.data),
  }).write('./contours.png');

  return {
    x: cx,
    y: cy,
  };
}
And I call the function from the main function.
const center = await getPuzzlePieceSlotCenterPosition(diffImage);
Running this writes contours.png, which shows the contour and its center!
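The center comes from image moments: m00 is the area, and m10 and m01 are the sums of the x and y coordinates, so m10 / m00 and m01 / m00 give the centroid. The sketch below computes the same quantities by summing over the pixels of a binary mask. Note that this is a simplification: cv.moments on a contour works from the contour polygon, so its values can differ slightly from a pixel sum. The function name `centroid` is my own.

```javascript
// Centroid of a binary mask via raw image moments.
// `mask` is a flat array of 0/1 values with `width` columns and `height` rows.
function centroid(mask, width, height) {
  let m00 = 0; // number of set pixels (area)
  let m10 = 0; // sum of x coordinates of set pixels
  let m01 = 0; // sum of y coordinates of set pixels
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (mask[y * width + x]) {
        m00 += 1;
        m10 += x;
        m01 += y;
      }
    }
  }
  if (m00 === 0) return null; // empty mask: no centroid
  return { x: Math.floor(m10 / m00), y: Math.floor(m01 / m00) };
}

// An 8x6 mask with a 3x3 block covering x = 2..4 and y = 1..3.
const width = 8;
const height = 6;
const mask = new Uint8Array(width * height);
for (let y = 1; y <= 3; y++) {
  for (let x = 2; x <= 4; x++) mask[y * width + x] = 1;
}
console.log(centroid(mask, width, height)); // { x: 3, y: 2 }
```

Returning `null` for an empty mask mirrors a case worth guarding against in the real code too: if no contour is found, m00 is zero and the division yields NaN.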
Now that we have the coordinates, we must move the puzzle piece slider.
Let's create a new function for that.
async function slidePuzzlePiece(page, center) {
  const sliderHandle = await page.$('.geetest_slider_button');
  const handle = await sliderHandle.boundingBox();

  // Start in the middle of the slider handle.
  let handleX = handle.x + handle.width / 2;
  let handleY = handle.y + handle.height / 2;

  await page.mouse.move(handleX, handleY, { steps: 25 });
  await page.mouse.down();
  // A plain timer is used because page.waitForTimeout was removed
  // in newer Puppeteer releases.
  await new Promise((resolve) => setTimeout(resolve, 250));

  let destX = handleX + center.x;
  let destY = handleY + 32;
  await page.mouse.move(destX, handleY, { steps: 25 });
  await new Promise((resolve) => setTimeout(resolve, 100));
}
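The `steps` option makes the mouse travel through intermediate positions instead of jumping straight to the destination. As a rough mental model, and assuming simple linear interpolation (a sketch of the idea, not Puppeteer's actual source), the intermediate points look like this. The function name `interpolateMove` is hypothetical.

```javascript
// Sketch: split a straight-line mouse move into `steps` evenly spaced points.
// The last point is always the destination itself.
function interpolateMove(from, to, steps) {
  const points = [];
  for (let i = 1; i <= steps; i++) {
    points.push({
      x: from.x + (to.x - from.x) * (i / steps),
      y: from.y + (to.y - from.y) * (i / steps),
    });
  }
  return points;
}

const path = interpolateMove({ x: 0, y: 10 }, { x: 100, y: 10 }, 4);
console.log(path.map((p) => p.x)); // [ 25, 50, 75, 100 ]
```

More steps means more mousemove events along the way, which looks closer to a real drag than a single jump.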
I call the function in the main program and run the program.
There is a problem though: it keeps missing the slot center.
The reason is that the puzzle piece starts at a random x position.
Without knowing the starting x position of the puzzle piece, we cannot know how far to move the slider to the right.
Let's once again use some OpenCV magic to find the location of the puzzle piece.
async function findMyPuzzlePiecePosition(page) {
  // Must call getCaptchaImages again, because we have changed the
  // slider position (and therefore the image)
  const images = await getCaptchaImages(page);
  const srcPuzzleImage = images.puzzle;
  const srcPuzzle = cv.matFromImageData(srcPuzzleImage.bitmap);
  const dstPuzzle = new cv.Mat();

  cv.cvtColor(srcPuzzle, srcPuzzle, cv.COLOR_BGR2GRAY);
  cv.threshold(srcPuzzle, dstPuzzle, 127, 255, cv.THRESH_BINARY);

  // Dilate, then erode, to close small holes in the piece.
  const kernel = cv.Mat.ones(5, 5, cv.CV_8UC1);
  const anchor = new cv.Point(-1, -1);
  cv.dilate(dstPuzzle, dstPuzzle, kernel, anchor, 1);
  cv.erode(dstPuzzle, dstPuzzle, kernel, anchor, 1);

  const contours = new cv.MatVector();
  const hierarchy = new cv.Mat();
  cv.findContours(
    dstPuzzle,
    contours,
    hierarchy,
    cv.RETR_EXTERNAL,
    cv.CHAIN_APPROX_SIMPLE
  );

  // The center of the first contour is the center of the puzzle piece.
  const contour = contours.get(0);
  const moment = cv.moments(contour);
  return {
    x: Math.floor(moment.m10 / moment.m00),
    y: Math.floor(moment.m01 / moment.m00),
  };
}
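Both functions rely on thresholding to turn a grayscale image into pure black and white. THRESH_BINARY sets a pixel to the maximum value when it is strictly above the threshold and to 0 otherwise; THRESH_BINARY_INV does the opposite. Here is a minimal sketch of the two rules on a plain array of grayscale values. The function names are my own.

```javascript
// THRESH_BINARY: value > thresh ? maxValue : 0
function thresholdBinary(gray, thresh, maxValue) {
  return gray.map((v) => (v > thresh ? maxValue : 0));
}

// THRESH_BINARY_INV: value > thresh ? 0 : maxValue
function thresholdBinaryInv(gray, thresh, maxValue) {
  return gray.map((v) => (v > thresh ? 0 : maxValue));
}

const gray = [0, 127, 128, 255];
console.log(thresholdBinary(gray, 127, 255));    // [ 0, 0, 255, 255 ]
console.log(thresholdBinaryInv(gray, 150, 255)); // [ 255, 255, 255, 0 ]
```

Note that a value equal to the threshold (127 in the first call) counts as "not above" and maps to 0.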
And let's complete the slidePuzzlePiece function.
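async function slidePuzzlePiece(page, center) {
  const sliderHandle = await page.$('.geetest_slider_button');
  const handle = await sliderHandle.boundingBox();

  // Start in the middle of the slider handle.
  let handleX = handle.x + handle.width / 2;
  let handleY = handle.y + handle.height / 2;

  await page.mouse.move(handleX, handleY, { steps: 25 });
  await page.mouse.down();
  // A plain timer is used because page.waitForTimeout was removed
  // in newer Puppeteer releases.
  await new Promise((resolve) => setTimeout(resolve, 250));

  // First move: drag the handle by the slot's x coordinate.
  let destX = handleX + center.x;
  let destY = handleY + 32;
  await page.mouse.move(destX, handleY, { steps: 25 });
  await new Promise((resolve) => setTimeout(resolve, 100));

  // Find the location of my puzzle piece.
  const puzzlePos = await findMyPuzzlePiecePosition(page);
  // Second move: correct by the remaining distance between slot and piece.
  destX = destX + center.x - puzzlePos.x;
  // The third argument of mouse.move is an options object, not a number.
  await page.mouse.move(destX, destY, { steps: 5 });
  await page.mouse.up();
}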
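The correction step can be checked with plain arithmetic. The sketch below assumes the piece moves one pixel for every pixel the pointer moves, and that the slot and piece centers are measured in the same canvas coordinates; the function name `correctedPointerX` and the numbers are made up for illustration.

```javascript
// Hypothetical helper: given the current pointer x, the slot center x and the
// measured piece center x, return the pointer x that aligns piece and slot.
function correctedPointerX(pointerX, slotCenterX, pieceCenterX) {
  return pointerX + (slotCenterX - pieceCenterX);
}

// Made-up numbers: the handle starts at x = 50, the slot center is at x = 180,
// and the piece center starts at x = 25 (unknown to us up front).
const handleX = 50;
const slotCenterX = 180;
const pieceStartX = 25;

// First move: drag the pointer by the full slot x coordinate.
const firstDestX = handleX + slotCenterX;                   // 230
// The piece moved by the same amount, so it overshoots the slot.
const pieceAfterFirstMove = pieceStartX + slotCenterX;      // 205

// Second move: apply the correction.
const finalDestX = correctedPointerX(firstDestX, slotCenterX, pieceAfterFirstMove);
console.log(finalDestX); // 205

// Total pointer travel is 155 px, which puts the piece center at 25 + 155 = 180.
console.log(pieceStartX + (finalDestX - handleX) === slotCenterX); // true
```

In other words, the first move is only a probe: once the piece's position after that move is known, the difference to the slot center is exactly the remaining distance.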
Putting it all together the main function looks like this.
(async () => {
  const browser = await puppeteer.launch({
    headless: false,
  });
  const page = await browser.newPage();
  await page.goto('https://scraperbox.com/captcha/geetest');
  await clickVerifyButton(page);
  const images = await getCaptchaImages(page);
  const diffImage = await getDiffImage(images);
  const center = await getPuzzlePieceSlotCenterPosition(diffImage);
  await slidePuzzlePiece(page, center);
})();
And when I run it, the program slides the puzzle piece into the slot! 🎉
We've set up a Chrome browser controlled by Puppeteer and solved the Geetest slider captcha.
So, basically, we've passed a Turing test! Albeit a very simple one. All it took was some basic OpenCV magic.
You can find the complete code on GitHub.
Happy coding!