In this project we designed, implemented and tested a spectrum of giant sudoku solvers. The serial algorithm we used is backtracking algorithm, based on which we derived our parallel algorithm using a queue scheme and implemented it with both OpenMP and MPI. We found that the main overhead of our parallelism comes from the extremely unbalanced workload of each queue elements. The OpenMP dynamic scheduling policy solves such problem perfectly and achieves sublinear to linear speedup. However, for MPI there is no built-in counterpart to the OpenMP dynamic scheduling policy. Therefore, we added a shuffling strategy and our own dynamic scheduling policy to the MPI implementation. Our test result suggests that both shuffling and dynamic scheduling significantly improve the MPI speedup to a sublinear level, but their combination doesn't seem to provide any further benefit and on the contrary increases the overhead.

One possible direction is to implement a shared stack for the OpenMP version, which stores
all the intermediate states. We can initialize the stack with the original board. Each
thread pops an intermediate board from the stack each time, and pushes the board back
with one more cell filled. Invalid solutions will be discarded, and solutions will be
recorded in a queue and output in the end.

Another possibility is to apply a better solving algorithm. The solver we implemented is a
brute force solver using backtracking algorihm. It guarantees to find all solutions if
they exist but is very slow. With a humanistic solver or constraint propagation alogrithm,
the execution time may be reduced.

Our project is inspired by a problem in AM225. The queue scheme for parallelism is learned from the files on this repo, but our structure and algorithm of the Sudoku solver are different.