Solving a Geetest Slider Captcha with Puppeteer

By Dirk Hoekstra on 01 Dec, 2020


I recently had to solve the Geetest slider captcha.

The captcha is basic enough: you must slide a puzzle piece into the slot.

While doing this I thought that it would be a fun challenge to try and create a program that will solve this captcha!

Setting up Puppeteer

To begin, I'm going to use Puppeteer, which lets me control a Chrome browser with JavaScript.

Let's set everything up!

npm init && npm i puppeteer
touch index.js

This will create a new Node project, install Puppeteer, and create an index.js file.

Then I add the following to the index.js file. I use this captcha test page that contains the puzzle slider captcha.

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
  });
  const page = await browser.newPage();
  await page.goto('https://scraperbox.com/captcha/geetest');
})();

Let's run the program:

node index.js

This spins up a Chrome browser and loads the Geetest captcha page, nice! 🎉

Solving the Captcha

Alright, the next step is to solve the captcha.

I create a new function that will wait for the Click to verify element to show up and then click it.

async function clickVerifyButton(page) {
    await page.waitForSelector('[aria-label="Click to verify"]');
    await page.click('[aria-label="Click to verify"]');
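    await page.waitForSelector('.geetest_canvas_img canvas', { visible: true });
    // Give the captcha a moment to finish rendering. A plain timer is used
    // because page.waitForTimeout is not available in recent Puppeteer versions.
    await new Promise((resolve) => setTimeout(resolve, 1000));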
}

Next, I simply call the function in the main piece of code.

(async () => {
    const browser = await puppeteer.launch({ headless: false });
    const page = await browser.newPage();
    await page.goto('https://scraperbox.com/captcha/geetest');

    // Call the function here
    await clickVerifyButton(page)
})();

When running this, you will see that the browser clicks the verify button. 🚀

The Captcha Images

Behind the scenes, Geetest generates 3 images. Let's extract them.

I'm going to use Jimp, a great image package for Node.

npm i jimp

Then add the dependency at the top of your index.js:

const puppeteer = require('puppeteer');
const Jimp = require('jimp');

// Rest of the code

Then, I create a new function to extract the images.

async function getCaptchaImages(page) {
	const images = await page.$$eval(
		'.geetest_canvas_img canvas',
		(canvases) => {
			return canvases.map((canvas) => {
				// Get the base64 image data from the HTML canvas.
				// The replace call strips the "data:image/png;base64,"
				// prefix so only the encoded bytes remain.
				return canvas
					.toDataURL()
					.replace(/^data:image\/png;base64,/, '')
			})
		}
	);

	// For each base64 string, create a Node.js Buffer.
	// (Buffer.from replaces the deprecated `new Buffer` constructor.)
	const buffers = images.map((img) => Buffer.from(img, 'base64'));

	// And read each buffer into a Jimp image.
	return {
		captcha: await Jimp.read(buffers[0]),
		puzzle: await Jimp.read(buffers[1]),
		original: await Jimp.read(buffers[2]),
	};
}
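To make the base64 step a bit more concrete, here is a minimal sketch of turning a canvas-style data URL into a Node Buffer outside the browser. The helper name dataUrlToBuffer and the sample data are my own and only there for illustration.

```javascript
// Hypothetical helper (not part of the solver itself): converts a PNG
// data URL, like the one canvas.toDataURL() returns, into a Node Buffer.
function dataUrlToBuffer(dataUrl) {
  const prefix = /^data:image\/png;base64,/;
  if (!prefix.test(dataUrl)) {
    throw new Error('Expected a base64-encoded PNG data URL');
  }
  return Buffer.from(dataUrl.replace(prefix, ''), 'base64');
}

// Made-up sample: the first four bytes of the PNG file signature.
const sample =
  'data:image/png;base64,' +
  Buffer.from([0x89, 0x50, 0x4e, 0x47]).toString('base64');

const decoded = dataUrlToBuffer(sample);
console.log(decoded.length); // 4
```

The decoded Buffer is the kind of value that gets passed to Jimp.read in the function above.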

Let's call this function and write the result to 3 image files.

// Add this in the main function
const images = await getCaptchaImages(page);

images.captcha.write("./captcha.png");
images.original.write("./original.png");
images.puzzle.write("./puzzle.png");

When I run this, 3 files are created that show the images! 🦾

Image Difference

We must find the location of the puzzle piece. To do this I'm going to calculate the difference between the captcha and the original image.
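To show the idea behind an image diff, here is a simplified sketch that compares two raw RGBA buffers pixel by pixel. This is not how pixelmatch works internally (it uses a perceptual color metric). The function name diffMask and the tolerance value are assumptions of mine.

```javascript
// Simplified stand-in for an image diff. Both inputs are RGBA byte arrays
// of length width * height * 4. Returns a mask with 1 where pixels differ.
function diffMask(a, b, width, height, tolerance = 32) {
  const mask = new Uint8Array(width * height);
  for (let i = 0; i < width * height; i++) {
    const o = i * 4;
    // Sum of absolute differences over the R, G and B channels.
    const delta =
      Math.abs(a[o] - b[o]) +
      Math.abs(a[o + 1] - b[o + 1]) +
      Math.abs(a[o + 2] - b[o + 2]);
    if (delta > tolerance) mask[i] = 1;
  }
  return mask;
}

// A made-up 3x1 image where only the middle pixel really changed.
const originalPixels = Uint8Array.from([
  10, 10, 10, 255, 200, 200, 200, 255, 10, 10, 10, 255,
]);
const captchaPixels = Uint8Array.from([
  10, 10, 10, 255, 90, 90, 90, 255, 12, 10, 10, 255,
]);

const mask = diffMask(originalPixels, captchaPixels, 3, 1);
console.log(Array.from(mask)); // [ 0, 1, 0 ]
```

The small change in the last pixel stays below the tolerance, so only the middle pixel is flagged. In the captcha, the flagged region is the puzzle slot.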

I'm going to use the pixelmatch and OpenCV packages to do this.

npm i pixelmatch opencv-wasm

Once again, don't forget to require the packages at the top of the index.js file:

const puppeteer = require('puppeteer');
const Jimp = require('jimp');
const pixelmatch = require('pixelmatch');
const { cv } = require('opencv-wasm');

// Rest of the code

Then, I create a new function that creates the diff image.

async function getDiffImage(images) {
	const { width, height } = images.original.bitmap

	// Use the pixelmatch package to create an image diff
	const diffImage = new Jimp(width, height)
	pixelmatch(
		images.original.bitmap.data,
		images.captcha.bitmap.data,
		diffImage.bitmap.data,
		width,
		height,
		{ includeAA: true, threshold: 0.2 }
	)

	// Use opencv to make the diff result more clear
	const src = cv.matFromImageData(diffImage.bitmap)
	const dst = new cv.Mat()
	const kernel = cv.Mat.ones(5, 5, cv.CV_8UC1)
	const anchor = new cv.Point(-1, -1)
	cv.threshold(src, dst, 127, 255, cv.THRESH_BINARY)
	cv.erode(dst, dst, kernel, anchor, 1)
	cv.dilate(dst, dst, kernel, anchor, 1)

	return new Jimp({
		width: dst.cols,
		height: dst.rows,
		data: Buffer.from(dst.data),
	})
}
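The erode and dilate calls are what clean up the diff. Eroding removes small specks, and dilating afterwards grows the surviving shape back. Here is a rough sketch of that idea on a plain binary mask with a 3x3 kernel. It treats pixels outside the image as 0, which is a simplification compared to OpenCV's border handling, and the function names are my own.

```javascript
// Shared loop for both operations: looks at the 3x3 neighbourhood of each
// pixel. Erosion keeps a pixel only if ALL neighbours are set, dilation
// sets a pixel if ANY neighbour is set.
function morph(mask, width, height, requireAll) {
  const out = new Uint8Array(width * height);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      let all = true;
      let any = false;
      for (let dy = -1; dy <= 1; dy++) {
        for (let dx = -1; dx <= 1; dx++) {
          const nx = x + dx;
          const ny = y + dy;
          const inside = nx >= 0 && nx < width && ny >= 0 && ny < height;
          const value = inside ? mask[ny * width + nx] : 0;
          if (value) any = true;
          else all = false;
        }
      }
      out[y * width + x] = (requireAll ? all : any) ? 1 : 0;
    }
  }
  return out;
}

const erode = (mask, w, h) => morph(mask, w, h, true);
const dilate = (mask, w, h) => morph(mask, w, h, false);

// A 6x5 mask: a 3x3 blob plus one noisy pixel in the top right corner.
const noisy = Uint8Array.from([
  0, 0, 0, 0, 0, 1,
  0, 1, 1, 1, 0, 0,
  0, 1, 1, 1, 0, 0,
  0, 1, 1, 1, 0, 0,
  0, 0, 0, 0, 0, 0,
]);

const opened = dilate(erode(noisy, 6, 5), 6, 5);
console.log(opened.reduce((sum, v) => sum + v, 0)); // 9
```

After the erode and dilate pass the 3x3 blob is intact, while the stray pixel in the corner is gone.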

And I call the new getDiffImage function from the main function.

The whole main function now looks like this:

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
  });
  const page = await browser.newPage();
  await page.goto('https://scraperbox.com/captcha/geetest');

  await clickVerifyButton(page);
  const images = await getCaptchaImages(page);
  const diffImage = await getDiffImage(images);

  diffImage.write("./diff.png");
})();

I run the program and it shows a nice bright red puzzle piece!

Finding the center

Now we must do some image magic 🧙 to find the center coordinates of that puzzle piece.

I create a new function.

async function getPuzzlePieceSlotCenterPosition(diffImage) {
	const src = cv.matFromImageData(diffImage.bitmap)
	const dst = new cv.Mat()

	cv.cvtColor(src, src, cv.COLOR_BGR2GRAY)
	cv.threshold(src, dst, 150, 255, cv.THRESH_BINARY_INV)

	// This will find the contours of the image.
	const contours = new cv.MatVector()
	const hierarchy = new cv.Mat()
	cv.findContours(
		dst,
		contours,
		hierarchy,
		cv.RETR_EXTERNAL,
		cv.CHAIN_APPROX_SIMPLE
	)

	// Next, extract the center position from these contours.
	const contour = contours.get(0)
	const moment = cv.moments(contour)
	const cx = Math.floor(moment.m10 / moment.m00)
	const cy = Math.floor(moment.m01 / moment.m00)

	// Just for fun, let's draw the contours and center on a new image.
	// Jimp expects 4-channel RGBA data, so convert to RGBA (not BGR)
	// and use a fully opaque red.
	cv.cvtColor(dst, dst, cv.COLOR_GRAY2RGBA);
	const red = new cv.Scalar(255, 0, 0, 255);
	cv.drawContours(dst, contours, 0, red);
	cv.circle(dst, new cv.Point(cx, cy), 3, red);
	new Jimp({
		width: dst.cols,
		height: dst.rows,
		data: Buffer.from(dst.data)
	}).write('./contours.png');

	return {
		x: cx,
		y: cy,
	}
}

And I call the function from the main function.

const center = await getPuzzlePieceSlotCenterPosition(diffImage);

When I run this, it writes contours.png with the center marked!
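The m10 / m00 and m01 / m00 divisions are a centroid calculation based on image moments. As a rough illustration, here is the same idea computed directly on a binary mask. Note that cv.moments works on the contour polygon, so this pixel-based version is a simplified sketch and not what OpenCV does internally. The function name centroid is my own.

```javascript
// Raster image moments on a binary mask (row-major array of 0/1 values).
// m00 counts the set pixels, m10 and m01 sum their x and y coordinates,
// so m10 / m00 and m01 / m00 give the center of mass.
function centroid(mask, width, height) {
  let m00 = 0;
  let m10 = 0;
  let m01 = 0;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (mask[y * width + x]) {
        m00 += 1;
        m10 += x;
        m01 += y;
      }
    }
  }
  if (m00 === 0) return null; // nothing found, avoid dividing by zero
  return { x: Math.floor(m10 / m00), y: Math.floor(m01 / m00) };
}

// A made-up 5x4 mask with a blob covering x = 1..3 and y = 1..2.
const blob = [
  0, 0, 0, 0, 0,
  0, 1, 1, 1, 0,
  0, 1, 1, 1, 0,
  0, 0, 0, 0, 0,
];

console.log(centroid(blob, 5, 4)); // { x: 2, y: 1 }
```

The null check matters in practice: if no contour is found, m00 is 0 and the division produces NaN.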

Moving the slider

Now that we have the coordinates, we must move the puzzle piece slider.

Let's create a new function for that.

async function slidePuzzlePiece(page, center) {
	const sliderHandle = await page.$('.geetest_slider_button')
	const handle = await sliderHandle.boundingBox()

	let handleX = handle.x + handle.width / 2;
	let handleY = handle.y + handle.height / 2;

	await page.mouse.move(handleX, handleY, { steps: 25} );
	await page.mouse.down();

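	// Plain timer, because page.waitForTimeout is not available in
	// recent Puppeteer versions.
	await new Promise((resolve) => setTimeout(resolve, 250));

	let destX = handleX + center.x;
	let destY = handleY + 32;
	await page.mouse.move(destX, handleY, { steps: 25 });
	await new Promise((resolve) => setTimeout(resolve, 100));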
}
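The steps option makes Puppeteer send intermediate mouse move events instead of jumping straight to the destination. Roughly, the positions are spread out linearly between the start and end point. Here is a small sketch of that interpolation. The function interpolateMove is a hypothetical helper of mine, not part of the Puppeteer API.

```javascript
// Returns the intermediate points of a straight-line move that is split
// into `steps` equal parts. The last point is always the destination.
function interpolateMove(from, to, steps) {
  const points = [];
  for (let i = 1; i <= steps; i++) {
    points.push({
      x: from.x + (to.x - from.x) * (i / steps),
      y: from.y + (to.y - from.y) * (i / steps),
    });
  }
  return points;
}

// Made-up coordinates: a horizontal drag of 100 pixels in 4 steps.
const path = interpolateMove({ x: 0, y: 10 }, { x: 100, y: 10 }, 4);
console.log(path.map((p) => p.x)); // [ 25, 50, 75, 100 ]
```

More steps means more events, which looks a bit more like a real drag than a single jump.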

I call the function in the main program and run the program.

There is a problem though: it keeps missing the puzzle center.

Locating the puzzle piece

The problem is that the puzzle piece begins at a random x position.

So we can't know how far to move the slider to the right unless we know the starting x position of the puzzle piece.

Let's once again use some OpenCV magic to find the location of the puzzle piece.

async function findMyPuzzlePiecePosition(page) {
	// Must call the getCaptchaImages again, because we have changed the
	// slider position (and therefore the image)
	const images = await getCaptchaImages(page)
	const srcPuzzleImage = images.puzzle
	const srcPuzzle = cv.matFromImageData(srcPuzzleImage.bitmap)
	const dstPuzzle = new cv.Mat()

	cv.cvtColor(srcPuzzle, srcPuzzle, cv.COLOR_BGR2GRAY)
	cv.threshold(srcPuzzle, dstPuzzle, 127, 255, cv.THRESH_BINARY)

	const kernel = cv.Mat.ones(5, 5, cv.CV_8UC1)
	const anchor = new cv.Point(-1, -1)
	cv.dilate(dstPuzzle, dstPuzzle, kernel, anchor, 1)
	cv.erode(dstPuzzle, dstPuzzle, kernel, anchor, 1)

	const contours = new cv.MatVector()
	const hierarchy = new cv.Mat()
	cv.findContours(
		dstPuzzle,
		contours,
		hierarchy,
		cv.RETR_EXTERNAL,
		cv.CHAIN_APPROX_SIMPLE
	)

	const contour = contours.get(0)
	const moment = cv.moments(contour)

	return {
		x: Math.floor(moment.m10 / moment.m00),
		y: Math.floor(moment.m01 / moment.m00),
	}
}
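Once the current position of the piece is known, the correction is plain arithmetic: the mouse still has to travel the difference between the slot center and the piece center. Here is a tiny sketch with made-up numbers. It assumes one image pixel equals one screen pixel and that the piece follows the mouse one-to-one. The helper name is my own.

```javascript
// Hypothetical helper: how many more pixels the mouse has to travel so the
// piece center lines up with the slot center. Both x values are measured
// in the captcha image. A negative result means the piece overshot.
function remainingSlideDistance(slotCenterX, pieceCenterX) {
  return slotCenterX - pieceCenterX;
}

// Made-up example: the piece ended up 25 pixels past the slot.
const slotCenterX = 180;
const pieceCenterX = 205;
const currentMouseX = 400;

const nextMouseX =
  currentMouseX + remainingSlideDistance(slotCenterX, pieceCenterX);
console.log(nextMouseX); // 375
```

This is the same calculation as destX + center.x - puzzlePos.x in the slide function.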

Now let's complete the slidePuzzlePiece function:

async function slidePuzzlePiece(page, center) {
	const sliderHandle = await page.$('.geetest_slider_button')
	const handle = await sliderHandle.boundingBox()

	let handleX = handle.x + handle.width / 2;
	let handleY = handle.y + handle.height / 2;

	await page.mouse.move(handleX, handleY, { steps: 25} );
	await page.mouse.down();

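	// Plain timer, because page.waitForTimeout is not available in
	// recent Puppeteer versions.
	await new Promise((resolve) => setTimeout(resolve, 250));

	let destX = handleX + center.x;
	let destY = handleY + 32;
	await page.mouse.move(destX, handleY, { steps: 25 });
	await new Promise((resolve) => setTimeout(resolve, 100));

	// Find the current location of my puzzle piece and correct for the
	// distance between it and the slot center.
	const puzzlePos = await findMyPuzzlePiecePosition(page)
	destX = destX + center.x - puzzlePos.x
	// The third argument of mouse.move must be an options object.
	await page.mouse.move(destX, destY, { steps: 5 })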
	await page.mouse.up()
}

Putting it all together, the main function looks like this:

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
  });
  const page = await browser.newPage();
  await page.goto('https://scraperbox.com/captcha/geetest');

  await clickVerifyButton(page);
  const images = await getCaptchaImages(page);
  const diffImage = await getDiffImage(images);
  const center = await getPuzzlePieceSlotCenterPosition(diffImage);
  await slidePuzzlePiece(page, center);
})();

And when I run it, the program slides the puzzle piece into the slot! 🎉

Conclusion

We've set up a Chrome browser controlled by Puppeteer and solved the Geetest slider captcha.

So, basically, we've passed a Turing test! Albeit a very simple one. All it took was some basic OpenCV magic.

You can find the complete code on GitHub.

Happy coding!


Dirk Hoekstra has a Computer Science and Artificial Intelligence degree. He is a technical author on Medium where his articles have been read over 100,000 times. Founder of multiple tech companies of which one was acquired in 2020.