Sudoku Solver using OpenCV & Python
Free Machine Learning courses with 130+ real-time projects Start Now!!
We all played sudoku at least once in our childhood. It is a very interesting puzzle game. In this project, we’re going to build a sudoku solver program using OpenCV. It detects a sudoku board from an image and then solves it.
First, the program will detect a sudoku board in an image then extract only the board. Then it detects each given number from that board and then it solves the board. After solving the board it puts the solved numbers in that original image at the same position where the board was empty. All these operations will be built with OpenCV python and Deep learning OCR detection.
What is OpenCV?
OpenCV is an Image processing library for python and it is written in C/C++. We can also perform some real-time operations with OpenCV because it is very fast and lightweight.
What is Deep Learning?
Deep learning is a subset of machine learning that is built with neural networks. Neural networks try to mimic human brains to solve a problem. Deep learning is used to solve complex problems that cannot be solved with typical machine learning algorithms.
To detect numbers from the sudoku board we’ll use a pre-trained OCR model. OCR or Optical Character Recognition is a technology that recognizes text within a digital image. The model that we’ll be using is specially trained for sudoku. It only detects a single digit in an image.
Prerequisites:
1. Python 3.x (we used 3.7.10)
2. OpenCV – 4.4.0
3. Numpy – 1.19.3
4. Tensorflow – 2.5.0
5. Imutils – 0.5.4
Download Sudoku Solver OpenCV Project
Download source code of python sudoku solver: Sudoku Solver Project Source Code
So let’s start with the solver problem first:
So what is Sudoku?
Sudoku is a combinational number-placement puzzle. The objective of sudoku is to fill a 9×9 grid with some digits in such a way so that each column, row, and nine 3×3 subgrids contain all of the digits from 0-9.
We’ll use an algorithm called backtracking. Backtracking is an algorithm for finding solutions to some computational problem. It tries to find a solution until it validates all criteria.
In sudoku, the algorithm will put a number in an empty space and then move on to the next empty space, if anywhere some criteria fail then it will turn back to some previous solutions and changes it according to the criteria. It continues until it finds a final solution that matches all the criteria.
solver.py
# DataFlair OpenCV Sudoku solver def find_empty(board): """checks where is an empty or unsolved block""" for i in range(len(board)): for j in range(len(board[0])): if board[i][j] == 0: return (i, j) # row, col return None
- The function takes a 9×9 2D matrix board and finds all the empty positions within the board. Here 0 means an empty space.
def valid(board, num, pos): # Check row for i in range(len(board[0])): if board[pos[0]][i] == num and pos[1] != i: return False # Check column for i in range(len(board)): if board[i][pos[1]] == num and pos[0] != i: return False # Check box box_x = pos[1] // 3 box_y = pos[0] // 3 for i in range(box_y*3, box_y*3 + 3): for j in range(box_x * 3, box_x*3 + 3): if board[i][j] == num and (i,j) != pos: return False return True
- The valid function also takes the same board as input and validates if the board can be solved or not.
- It takes a number from the board and checks if the same number is existing in the row or column or the same subgrid.
def solve(board): find = find_empty(board) if not find: return True else: row, col = find for i in range(1,10): if valid(board, i, (row, col)): board[row][col] = i if solve(board): return True board[row][col] = 0 return False
- The solve function loop through each empty and nonempty element.
- It validates the number and wherever it finds an empty space it puts a validated number between 1-9.
- The loop continues until it fills up all the empty spaces and validates each digit of the board.
- If the board is solved then it returns True, otherwise, it returns False.
def get_board(bo): if solve(bo): return bo else: raise ValueError
- And finally, the get_board function calls the solve function. If the board is solved then it returns the solved board, otherwise, it raises a ValueError.
Now the main program where we actually solve and preview the solved sudoku.
Steps :
1. Import necessary packages.
2. Find sudoku board in an image.
3. Split the board.
4. Predict digits of each box.
5. Solve the board.
Sudoku-Solver.py
Step 1 – Import necessary packages:
# DataFlair Sudoku solver import cv2 import numpy as np from tensorflow.keras.models import load_model import imutils from solver import *
- Import all the required packages for the project.
Step 2 – Find sudoku board in an image:
# Read image img = cv2.imread('sudoku1.jpg') cv2.imshow("Input image", img)
- First read an image using cv2.imread() function.
def find_board(img): """Takes an image as input and finds a sudoku board inside of the image""" gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) bfilter = cv2.bilateralFilter(gray, 13, 20, 20) edged = cv2.Canny(bfilter, 30, 180) keypoints = cv2.findContours(edged.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE) contours = imutils.grab_contours(keypoints) newimg = cv2.drawContours(img.copy(), contours, -1, (0, 255, 0), 3) cv2.imshow("Contour", newimg)
- We’ll try to find the board using contour detection.
- To detect contours we need to convert the image into the grayscale mode, so using cv2.cvtColor() we convert the image into grayscale mode.
- cv2.bilateralFilter() removes some noises from the image.
- Using cv2.Canny() we detect edges in the image.
- And then we detect all the continuous points within the edges using cv2.findContours. These continuous points are called contours.
Output:
contour.png contours = sorted(contours, key=cv2.contourArea, reverse=True)[:15] location = None # Finds rectangular contour for contour in contours: approx = cv2.approxPolyDP(contour, 15, True) if len(approx) == 4: location = approx break result = get_perspective(img, location) return result, location
- Now we sort the contours by their area size using the sorted function and take the first 10 detected contours.
- We know that sudoku boards are rectangular. So we loop through all the contours and check if the contour is rectangular or not. We do it using cv2.approxPolyDP() function.
- If the length of the contour is 4 then it is considered a rectangle.
- After that, we get a perspective transformation of that board and extract it from the main image.
def get_perspective(img, location, height = 900, width = 900): """Takes an image and location of an interesting region. And return the only selected region with a perspective transformation""" pts1 = np.float32([location[0], location[3], location[1], location[2]]) pts2 = np.float32([[0, 0], [width, 0], [0, height], [width, height]]) # Apply Perspective Transform Algorithm matrix = cv2.getPerspectiveTransform(pts1, pts2) result = cv2.warpPerspective(img, matrix, (width, height)) return result
- Perspective transformation is also known as a birds-eye view. Perspective transformation is used to get a better insight into a given image or video. In perspective transformation, a selected region from an image is projected into another plane.
- Pts1 points will be selected from the given image and pts2 is the point of another plane where the selected region will be projected.
- cv2.getPerspectiveTransformation() returns the matrix of the transformation and the cv2.warpPerspective() gives the actual transformed image.
board, location = find_board(img) cv2.imshow("Board", board)
Output:
Step 3 – Split the board:
# split the board into 81 individual images def split_boxes(board): """Takes a sudoku board and split it into 81 cells. each cell contains an element of that board either given or an empty cell.""" rows = np.vsplit(board,9) boxes = [] for r in rows: cols = np.hsplit(r,9) for box in cols: box = cv2.resize(box, (input_size, input_size))/255.0 cv2.imshow("Splitted block", box) cv2.waitKey(50) boxes.append(box) return boxes
- Sudoku contains a total of 81 numbers. And we want to classify each number individually.
- The split_boxes() function takes a board and splits it into 81 images or blocks.
- np.vsplit() splits an image vertically and np.hsplit() splits an image horizontally.
gray = cv2.cvtColor(board, cv2.COLOR_BGR2GRAY) rois = split_boxes(gray) rois = np.array(rois).reshape(-1, input_size, input_size, 1)
Step 4 – Predict digits of each box:
classes = np.arange(0, 10) model = load_model('model-OCR.h5') # get prediction prediction = model.predict(rois) # print(prediction) predicted_numbers = [] # get classes from prediction for i in prediction: index = (np.argmax(i)) predicted_number = classes[index] predicted_numbers.append(predicted_number) print(predicted_numbers)
- We have a total of 9 classes to predict. np.arange(0,10) creates an array containing 0 to 9.
- Using the load_model function we load the pre-trained model.
model.predict(rois) predicts all total 81 digits at once. It returns an array containing 81 predictions. Each prediction array contains 9 elements. - Then we iterate through each prediction and get the index of the max element using np.argmax().
- After that, we get the final predicted number from the classes list.
Step 5 – Solve the board:
# reshape the list board_num = np.array(predicted_numbers).astype('uint8').reshape(9, 9)
- The solver program takes a 9×9 2D matrix. But our predicted list is a 1D flat list. So we reshaped the list first.
# solve the board try: solved_board_nums = get_board(board_num)
- After that we fed the matrix to the get_board() function from the solver program. It returns a 9*9 solved matrix.
But we want to preview the solved board in the main input image. So let’s do that.
def displayNumbers(img, numbers, color=(0, 255, 0)): W = int(img.shape[1]/9) H = int(img.shape[0]/9) for i in range (9): for j in range (9): if numbers[(j*9)+i] !=0: cv2.putText(img, str(numbers[(j*9)+i]), (i*W+int(W/2)-int((W/4)),int((j+0.7)*H)), cv2.FONT_HERSHEY_COMPLEX, 2, color, 2, cv2.LINE_AA) return img
- The displayNumbers() function displays numbers to an image.
def get_InvPerspective(img, masked_num, location, height = 900, width = 900): """Takes original image as input""" pts1 = np.float32([[0, 0], [width, 0], [0, height], [width, height]]) pts2 = np.float32([location[0], location[3], location[1], location[2]]) # Apply Perspective Transform Algorithm matrix = cv2.getPerspectiveTransform(pts1, pts2) result = cv2.warpPerspective(masked_num, matrix, (img.shape[1], img.shape[0])) return result
- get_InvPerspective is exact the opposite of the get_perspective() function. It takes an image and projects it into another planes selected region.
binArr = np.where(np.array(predicted_numbers)>0, 0, 1) print(binArr) # get only solved numbers for the solved board flat_solved_board_nums = solved_board_nums.flatten()*binArr # create a mask mask = np.zeros_like(board) solved_board_mask = displayNumbers(mask, flat_solved_board_nums) cv2.imshow("Solved Mask", solved_board_mask)
- First, we create a binary array of 0 and 1 from the predicted unsolved array. If an element is greater than 0, that will be added to the binary array as 1 and the rest of the element as 0.
- Then we create a blank mask of the same size as the board using np.zeros_like().
- The displayNumbers() function displays only solved digits in the mask.
Output:
# Get inverse perspective inv = get_InvPerspective(img, solved_board_mask, location)
Output:
combined = cv2.addWeighted(img, 0.7, inv, 1, 0) cv2.imshow("Final result", combined) cv2.waitKey(0) except: print("Solution doesn't exist. Model misread digits.")
- Finally, we combined the inverse mask and the original image.
OpenCV Sudoku Solver Output:
Summary:
In this project, we built OpenCV sudoku solver algorithm that detects sudoku board from an image and then solves it. Through this project, we’ve learned about deep learning, image transformation, and some other image processing operations.
Did you like our efforts? If Yes, please give DataFlair 5 Stars on Google
That is a very interesting project, I would like to recreate it from scratch.
Can you tell me where you downloaded the model in h5 format? I am only able to deduce its architecture by calling model.summary() but I would like to know more about the dataset it was trained on and the hyperparameters. I am also wondering how the model is able to infer the digit “zero” from empty cells.
Thank you in advance,
greetings
Giorgio
Excellent explanation. Thank you very much.
I hope to recommend it for your audience.