The optimization researches

Chapter 1


Solving Sudoku puzzles has been an area of great interest in the field of optimization researches. It is known to be an NP-Complete problem [34] with discrete possible solutions. Since Sudoku has only one correct solution, it is impractical to use brute force trial and error or guessing. Various algorithms have been developed and tested to see how it solves the puzzle and how efficient and fast it is in finding the unique solution. Logic-based algorithm, an algorithm that mimics the human way of solving the puzzles, can efficiently solve easy puzzles. But, with harder puzzles that usually employ certain degree of guessing, logic-based algorithms are unable to come up with a solution. Thus, backtracking algorithms has been applied and used to solve these hard puzzles; however, the efficiency of the backtracking algorithm depends on the number of guesses required to solve the puzzles [19].

In the search for a fast and efficient algorithm in solving Sudoku puzzles, computational intelligence, such as Genetic Algorithm (GA), the Ant Colony Optimization (ACO), the Particle Swarm Optimization (PSO), the Artificial Bee Colony (ABC) and Simulated Annealing (SA), have been proposed in recent years. These optimization algorithms were inspired from studying the behavior of biological creatures and are known as Swarm Intelligence (SI) algorithms [1].

Through the idea of computational intelligence derived from observing the behavior of creatures, Cat Swarm Optimization (CSO) was proposed in 2006 by studying the behavior of the cats. Cats, as a feline with natural hunting skill, has a high level of alertness and strong curiosity of moving objects even when they spend most of their time inactive and resting that is why they appear to be lazy; however, behind this lazy behavior will show its awake large eyes observing and looking around its environment [5]. This hunting skill and lazy behavior can be modeled to find the solution space for Sudoku problems. Cat Swarm Optimization can be applied to optimization problems with greater success in finding the optimal solution than other algorithms [5]. Thus, this inspired the researchers to apply Cat Swarm Optimization in solving Sudoku puzzles.


Puzzles, as defined by the Random House Dictionary, are toys or other contrivance designed to amuse by presenting difficulties to be solved by ingenuity or patient effort [13]. This emphasizes that solving puzzles requires ingenuity and patience on the solver. Other definitions vary in some aspects depending on the one who defines it.

In Britain, puzzles cover all kinds of mental activities such as tricks, surprises, finding pretty patterns, and so on, but puzzles are something you do for fun [9]. Similarly, Stan Isaac, a puzzle collector, defines puzzles as "it is fun and it has a right answer". This definition emphasizes that puzzles are a form of play with an answer which differentiate puzzles from other forms of play.

In an abstract strategy game, there is an intimate relationship between the game and a puzzle, where good players present every board position to their opponent by a puzzle [32]. This puzzle must be discovered and solved to ensure a winning game.

Puzzles are mental challenges where manipulating it to find a way makes it fun and gives the solver a certain level of enjoyment. However, when we can't figure out how to solve them, these puzzles cause unplanned anguish and decrease the level of enjoyment.

Puzzles are solved using problem-solving skills such as pattern recognition, strategy, sequence solving, word completion, logic, and "common" sense which requires the player to apply a bit of real-world knowledge. Fast solvers have high inductive reasoning that enables them to identify patterns that would lead to a solution.

Types of Puzzles

There are many different types of puzzles. Some of them are popular which can usually be seen on newspapers, magazines, books and the Internet. These puzzles are categorized into several categories: guessing game, logic puzzles, mechanical puzzles, picture puzzles, tour puzzles, word puzzles, metapuzzle, and puzzle video game [22].

Guessing Game

A guessing game is a game in which guessing theobjectusing some kind of information, such as a word, a phrase, a title, or the location of an object. These games are played cooperatively. In some cases, playing this kind of a game means thatplayers know the answer, but should not tell the other players, instead they must help them to guess the answer. Riddles and situation puzzles fall under this category.

Riddles are statements, questions, or phrases having a double or veiled meaning. This form of a puzzle is generally expressed in metaphorical or allegorical language that requires ingenuity and careful thinking to guess the right answer.

Situation puzzles, often referred to as lateral thinking puzzles or yes/no puzzles, are puzzles usually played in a group where players are asking question with a "yes" or "no" answer to the one hosting the puzzle. The puzzle is solved when one of the players figures out what's on the mind of the host.

Other examples of this category includes Battleship, Charades, Hangman, I-spy, Mastermind, Name that tune, Pictionary, Protmuis, Quizbowl, Twenty questions, Ulam's Game, Who's closest, Guess who?, What's my line?, Yes and no, and Contact [22].

Logic Puzzles

Another category of puzzles is the so-called logic puzzles. These puzzles are derived from mathematics field of deduction, which requires deductive reasoning in solving the puzzles. One of the popular forms is the Sudoku puzzle, shown in Figure 1.1, a partially filled n x n grid that must completely filled by the solver abiding its rules. Some popular forms include Kakuro, Nonogram, also called "Paint by Numbers", and logic grid puzzles, which are shown in Figures 1.2 -1.4 [17].

Mechanical Puzzles

These types of puzzles are puzzles that are presented as a set of mechanical interlinked pieces. These puzzles are categorized into: Assembly puzzles, Disassembly puzzles, Interlocking puzzles (like Chinese wood knot), Disentanglement puzzle, Fold puzzles, Lock puzzles, Trick Vessels, Impossible objects (like Ship in a bottle), Dexterity puzzles, and Sequential movement puzzle (like Rubik's Cube, Tower of Hanoi shown in Figure 1.5 and Figure 1.6 respectively) [22].

Picture Puzzles

Picture puzzles are puzzles on images, usually popular to children. Spot the difference and Connect the dots are the most popular [17]. These include the following: Tiling puzzle, Jigsaw puzzles, and Sliding puzzle [22].

Tiling puzzles are puzzles in which flat shapes are rearrange to form a larger shape without overlapping. These puzzles are usually made from woods, metals, plastic, cardboard, and any other sheet-material. Puzzles like Tangram puzzles, as shown in Figure 1.7, and jigsaw puzzles are among of the famous [22].

Sliding puzzles, usually in two dimensional, are like tour puzzles which prohibits lifting any piece on the board. Pieces are mechanically interlinked on the board. Thus, solving this puzzle challenges the player by moving the pieces by sliding. TheFifteen Puzzle, as shown in Figure1.8,is the oldest type of sliding block puzzle [17].

Jigsaw puzzles, as shown in Figure 1.9, are tiling puzzle which originally created by painting a picture on a flat, rectangular piece of wood and cut into small pieces with a jigsaw, where the name comes. It has usually an oddly shaped, interlocking pieces. Nature, building, castle and mountains on traditional jigsaw puzzles, and repetitive design are the typical images found on jigsaw puzzles. In fact, any picture can be used as an image on jigsaw puzzles [15].

Tour Puzzles

In this type of a puzzle, the player acts as a traveler and moves to certain points on the board using a token. Puzzles under this category are Maze puzzles (like the labyrinth as shown in Figure 1.10), Transport puzzles (like Sokoban Puzzle), and Chess problems (like a Knight's Tour) [17].

Word Puzzles

Word puzzles are puzzles that maybe spoken or are on board games usually designed to test the language ability or to explore the language's properties. These puzzles become the source of entertainment and at the same time have an educational purpose. Solvingwordpuzzles requires familiarity with a largervocabulary and indeed, becomes a pastime to young people and adults in order to keep their minds sharp. Crossword and Anagrams are most popular under this type [17].

A crossword, as shown in Figure 1.11, is a word puzzle that is normally has white and shaded smaller squares over a bigger square or rectangular grid. Filling the white squares with letters, forming words or phrases which correspond to the answers of the given clues solves the puzzle.

Anagrams, also called Snatch, Snatch-words, or Grabscrab, is a word puzzle wherein the goal is to rearrange the letters to form words. Letters are written on tiles wherein tiles are turned over one by one and the players will combine the tiles of letters to form the words. Nowadays these word puzzles are usually played on boards, like Scrabble.


These puzzles are puzzles which are composed of several puzzles. In order to solve this puzzle, individual puzzles that comprise this puzzle must be solved first. The structure of a metapuzzle often makes it possible to guess, with a certain degree of certainty, the solutions to the puzzles that feed into it without solving them. This technique is called backsolving.

Puzzle Video Game

Puzzle video games are video games that focus on solving puzzles. These games involve a variety of logical and conceptual challenges, although occasionally the games add time-pressure or other action-element. Some of the popular examples of this include Tetris and Minesweeper.

Sudoku Puzzles

In the 18th century, Swiss mathematician, Leonhard Euler developed the concept of Latin Squares where numbers in a grid appear only once, across, up and down. The Sudoku puzzle is a special case of Latin squares. It is a popular logic-based combinatorial puzzle game played on a partially filled 9x9 grid. The Sudoku puzzle was designed by Howard Garns, an American retired architect and freelance puzzle constructor. Garns took the concept of Latin square and applied it to a 9x9 grid with the addition of 3x3 subgrids or boxes. The Sudoku puzzle was first published in New York in 1979 by Dell Pencil Puzzles and Word Games Magazine under the title "Number Place". However, it became known only after it was introduced in Japanese monthly paper called Nikolist in April 1984. The puzzle was then named as Suuji wa dokushin ni kagiru (means "digits must be single" or "the numbers must occur only once") by Kaji maki, the president of Nikoli. At a later date, it was abbreviated to Sudoku where "Su" means number in Japanese and "Doku" refers to the single place in puzzle board that each number can fit into. One way to describe the game is "Solitaire with numbers". In 1989, the first prototype of the computerized Sudoku puzzle generator was published by Loadstar/Softdisk Publishing. Later in 1997, retired Hong Kong judge, Wayne Gould, worked over 6 years and developed a computer program to produce puzzles quickly. And in November 2004, this puzzle game became an international phenomenon when it was printed by The Times of London [33].

The rules in playing Sudoku puzzles were simple. Each of the cells in a Sudoku puzzle should contain a number between one and nine. The goal of the game is to solve the puzzle by assigning every empty cell with numbers ranging from 1 to 9 such that the entries in each row, each column and each subgrid will have the following properties [19]:

  • Each row must contain each of the integers 1-9 exactly once.
  • Each column must contain each of the integers 1-9 exactly once.
  • Each sub-grid must contain each of the integers 1-9 exactly once.

Sudoku puzzle has a single, unique solution. The level of difficulty in solving a Sudoku puzzle depends on the number of unknown or missing entry and which of the squares are missing. Typical puzzles have 40 to 54 missing entries. Puzzles are usually labeled as easy, medium, and hard in relation to its difficulty level [11].

Variants of Sudoku

In 2004, Sudoku started to become world famous with a standard 9x9 Sudoku grid. From this simple but brilliant puzzle, many Sudoku variants were designed. Sudoku puzzles can be 4x4 grid with 2x2 sub-grid, 55 grids with pentomino regions, published under the name Logi-5, the World Puzzle Championship has previously featured a 66 grid with 23 regions and a 77 grid with six heptomino regions and a disjoint region; Daily SuDoku features new 44, 66, simpler 99 grids every day as Daily SuDoku for Kids, and et. al. [29]. Among the variants of Sudoku puzzles, the following are the most popular: Traditional Sudoku, Sudoku X, Jigsaw Sudoku, Killer Sudoku, Hyper Sudoku, and Samurai Sudoku.

Traditional Sudoku

This is the most popular Sudoku puzzle. Traditional Sudoku puzzles, as shown in Figure 1.13, uses numerals to completely fill in the puzzle. However, symbols, colors, letters and numbers can also be used. The objective is to fill the grid so that every column, every row and every 3x3 box or sub-grid contain each of the digits 1 to 9. With the given partially completed grid, typically, it has only one solution.

Sudoku X

Sudoku X, as shown in Figure 1.14, is another Sudoku variant, which is a traditional Sudoku puzzle with a special case of extra regions. These two extra regions, the main diagonals which give an X-shape, require the numbers 1 to 9 to appear exactly once.

Jigsaw Sudoku

Jigsaw Sudoku or Nonomino Sudoku, as shown in Figure 1.15, is one of the variants of Sudoku puzzles. This variant is the same with the traditional Sudoku, only that its 3x3 region is squeezed into an irregular shape, that is why it is also called Irregular Sudoku. This region must contain no duplication of numbers 1- 9.

Killer Sudoku

Another variant of Sudoku puzzle is the Killer Sudoku, as shown in Figure 1.16. The same 9x9 board with rows, columns and nine boxes is to be filled in with the numbers 1 to 9. Killer Sudoku, also called Sum-Sudoku, follows the same rules as Sudoku with usually an empty starting grid and the given clues are in the form of sum. The 9x9 grids are divided into regions or cages. Each cage contains a number corresponding to the sum of all the numbers in that region. With the use of these given numbers, you will be able to solve the puzzle.

Hyper Sudoku

Hyper Sudoku, as shown in Figure 1.17, is also known as Windoku because it looks like a window with glazing bars. This variant is simply a traditional Sudoku with an additional 4 regions overlapping with the nine region of the traditional Sudoku. The numbers contained in each of this region must also appear exactly once. Windoku is solved like a traditional Sudoku, only with 13 regions.

Samurai Sudoku

Samurai Sudokupuzzle, as shown in Figure 1.18, is another variation of the original Sudoku Puzzles made popular by the Times Newspaper. The game is essentially the same except it consists of 5 Sudoku puzzles that are connected to the center one, which will provide even the experienced Sudoku Addict with a good challenge [23].

Mathematics of Sudoku

A typical Sudoku is usually played in a 9 x 9 grid. Nowadays, variants of the game are played such as 4 x 4 and even 16 x 16. In general, an n x n Sudoku grid contains n x n sub-boxes or regions and n x n spaces for numbers in each of those boxes. Suppose it will take S(n) to solve the Sudoku puzzle and V(n) time to verify the solution, verifying it would be in V(n) = O(n2 x n2). Sudoku is an example of a polynomial-time verifiable problem. It is easy to verify the given solution whether it is correct but finding that solution can be non-solvable in polynomial time. This type of problem is known to be an NP (Nondeterministic Polynomial-Time) problem. Sudoku grids larger than 9 x 9 falls into the NP-complete class of computer problem, which means that it is a problem that will take forever when solving computationally [27].

The Sudoku is a puzzle that has only one unique solution. The constraints imposed on it make Sudoku one of the difficult problems in NP-complete set. NP includes problems which are of great practical importance such as scheduling (like classroom scheduling and scheduling jobs on machines), network routing, packing objects into beans, allocating variables to registers, decryption, gene sequencing, etc. Solving Sudoku is an ideal tool for investigating the rest of the NP problems. It was stated that finding solution to any NP-complete problem would imply P = NP which means that for every problem that has an efficiently verifiable solution, an efficient solution can also be found [10]. If derived through brute force computation, the number of valid Sudoku solution grids for the standard 9 x 9 grid is 6,670,903,752,021,072,936,960 which is equivalent to 9! 722 27 27,704,267,971, the last factor of which is a prime. Finding the shorter path to the solution, that is, an optimal solution, to Sudoku puzzles and other NP-complete problems has been the goal in recent studies. An efficient algorithm for any NP-complete problems automatically provides an efficient algorithm for solving all [33].

Swarm Intelligence

Swarm Intelligence is a study of nature-inspired algorithms that have been recently developed to handle real world problems. The term Swarm Intelligence (SI) denotes an artificial intelligent system wherein the collective behavior of simple agents generates a solution or pattern. SI based algorithms produce a low cost, fast and reasonably accurate solutions to problems [8]. Historically, Swarm Intelligence was introduced by Beny &Wang in 1989 using the context of cellular robotic systems. The term swarm refers to a group of agents interacting with each other to achieve a purpose. Fundamental properties of a swarm are self-organization, division of labor and ability to adapt within a given environment. SI algorithms are based on the study of the collective behavior of particular group of biological creatures [8].

One of the well-known SI algorithms is the Ant Colony Optimization (ACO). It mimics the behavior of a group of real ants. See an example of a collective behavior of ants in Figure 1.19. Ants looking for food deposit trails of chemical substance known as pheromones to help other members of its team to follow its trail. Basically, ants follow the trail with more pheromones thereby a shortest path to the food source can be determined. The artificial ant colony system was proposed by Dorigo and the algorithm has been found to be robust and versatile in handling combinatorial optimization problems [8].

Another well-known SI algorithm, which is based on the intelligent behavior of honey bee swarm, is the Artificial Bee Colony (ABC). Figure 1.20 shows a honey bee colony. In a colony, there are three group of bees namely, employed bees, onlooker bees and scout bees. Employed bees go to a food source, evaluate its nectar amount and come back to hive and dance in this area. Scouts are bees looking for a new food source and onlookers are bees watching the dances of employed bees and choosing food sources depending on the dances. The onlookers choose the food source with a high nectar amount. In ABC, the position of food source represents a possible solution to the optimization problem and the nectar amount of food source corresponds to the fitness of the associated solution. The process goes into a cycle from the employed bees, onlooker bees and scout bees, keeping the position of the best food source until a condition is satisfied [12].

Particle Swarm Optimization (PSO) is also a well-known SI Algorithm. It is a global optimization algorithm. Each of the particles is considered as a conceptual entity which flies through a multi-dimensional search space. The populations of such particles are then called swarms. Each particle is assigned a velocity and position and is evaluated according to some fitness criterion. The particles are expected to accelerate and move toward those particles with better fitness values [8].

Cat Swarm Optimization (CSO)

Cat Swarm Optimization is an optimization algorithm based on the common behavior of cats proposed by Chu, et al. [4] in 2006. This algorithm somehow belongs to the swarm intelligence category. The CSO uses cats as the solution set of the problem.

Behavior of Cats

Cats or domestic cats are most valued pets aside from dogs for humans. They are needed for companionship and for its ability to hunt household pests. Two of the major behavioral traits of cats are its ability to hunt an object and spending most its time resting. The cats are hunters by nature and express a strong curiosity for moving objects. However, rather than chasing objects, cats are seen resting most of the day but this does not mean they are inactive. Cats are known for its keen vision and hearing around its surrounding. They are very alert animals and this alertness is still seen in them even if they are resting. Hence, cats that appear to be lazy may actually be observing the environment with its large wide eyes. Figure 1.21 shows a cat observing its environment while resting [5].

Seeking Mode

This sub model represents the behavior of a cat wherein the cat appears to be lazy and resting but is awake and is observing its surroundings for its next move [5].

Seeking Mode has four essential factors and these are as follows: Seeking Memory Pool (SMP), Seeking Range of the Selected Dimension (SRD), Counts of Dimension to Change (CDC), and Self Position Consideration (SPC). SMP is used to define the size of the seeking memory of each cat. The cat would pick a point from the memory pool. SRD declares the mutative ratio for the selected dimension. The difference between the old and new values of the selected dimension may not be out of range defined by SRD. CDC tells how many of the dimensions will be varied. SPC is a Boolean valued variable indicating whether the point at which the cat is already standing will be one of the candidate points to move to. SPC cannot influence the value of SMP [5].

The seeking mode process is described with the following steps [5]:

  1. Make j copies of the present position of catk, where j = SMP. If the value of SPC is true, let j = (SMP - 1), then retain the present position as one of the candidates.
  2. For each copy, according to CDC, randomly plus or minus SRD percents the present values and replace the old ones.
  3. Calculate the fitness value (FS) of all candidate points.
  4. If all FS are not exactly equal, calculate the selecting probability of each candidate point by equation (1), otherwise set all the selecting probability of each candidate point by 1.
  5. Randomly pick the point to move to from the candidate points, and replace the position of catk.

CSO Algorithm

To include these two modes into the CSO algorithm, Mixture ratio (MR) dictates the joining of seeking mode with tracing mode. In the real world, a cat spends very little time chasing things as this leads to over use of energy resources. This means that the cat is in seeking mode most of the time. To model this behavior in CSO, MR is allocated a very small value to guarantee that the cats spend most of time in seeking mode [5].

The process of Cat Swarm Optimization (CSO) can be described in 6 steps as follows [5]:

Step 1. Create N cats in the process.

Step 2. Randomly sprinkle the cats into the M-dimensional solution space and randomly select values, which are in range of maximum velocity, to the velocities of each cat. Then haphazardly pick the number of cats and set them into tracing mode according to MR, and the others set into seeking mode.

Step 3. Evaluate the fitness value of each cat by applying the position of cats into the fitness function and keep the position of the best cat (xbest) into memory.

Step 4. Move the cats according to their flags, if catk , is in seeking mode, apply the cat to the seeking mode process, otherwise apply it to the tracing mode process.

Step 5. Re-pick number of cats and set them into tracing mode according to MR, the set the other cats into seeking mode.

Step 6. Check the termination condition, if satisfied, terminate the program, otherwise repeat step 3 to step 5.

The final solution would be the best position found to the fact that CSO keeps the best solution till it satisfies the termination conditions.

Chapter 2

Literature Review

Sudoku, being a subset of NP-Complete problems and the top most popular puzzle in the 21st century, is continually receiving lots of attention in researches involving optimization. In recent years, various techniques and algorithms have been applied to efficiently solve the Sudoku puzzle. Swordfish, and X-wing are among of the most common techniques used by the paper-and-pencil Sudoku solvers in solving Sudoku puzzles. This approach is proven to be effective in coming up with a solution to Sudoku puzzles seen in newspapers and puzzle books [31]. These techniques essentially define the constraint programming approach in solving Sudoku, where the Sudoku problem is modeled using all different constraints [26].

In the paper "Using Constraint Programming to solve Sudoku Puzzles" by Crawford, Aranda, Castro and Monfroy, Sudoku problems were viewed as a Constraint Satisfaction Problem [7]. They model the problem by means of declaring variables and constraints. The main goal of this paper is finding a value that would satisfy all defined constraints. The basic mechanisms in Constraint Programming are Constraint Propagation and Enumeration wherein searching for a solution is done by looking ahead using the constraints to prune the search space. Values that cannot participate in a solution are eliminated. Optimization approach is feasible from constraint satisfaction in form of branch and bound procedure which means that as soon as the solution is found, further constraint is added to find a much better value. Then backtracking is done until an optimum value is reached. The result of this strategy showed that it had failed to provide solution to harder level of Sudoku puzzles such as when the initial number of integers given is 17. Based on the results, the efficiency of the algorithm depends on the number of entries given to solve the puzzle.

On another paper, Moraglio, Togelios, and Lucas [16] used Genetic Algorithm with product geometric crossover in solving Sudoku puzzles. Their paper showed that using the abstract definition of geometric crossover, a new geometric crossover in any representation can be derived by geometricity-preserving transformations/combinations. By the use of Cartesian product of geometric crossover, this allows the researcher to create of new geometric crossovers combined with pre-existing geometric crossovers in such a simple way. These are the two-point crossover and whole-row crossover. In their paper, two search spaces were considered. The first search space called the Hamming space considers one constraint while the other search space called the Row-swap space considers two constraints at the same time. Considering the three constraints at the same time seems to be unreasonable even in generating random solution. Results where Hamming space was considered showed that the best crossover is the whole-row crossover, followed by two-point; however, the optimal solution was rarely found. In Swap space, the final fitness values are a lot higher than on the Hamming space. All crossovers efficiently find an optimal solution for three easy instances when subjected to some mutation. For the medium level Sudoku puzzle, an optimal solution cannot be reached. But for hard puzzles, there was one case where 15 out of 30 where solved. Smart-square mutation was also applied to get an optimum solution for Sudoku. The smart-square mutation considers the most obvious constraints to the possible values of a square. Example, if a column has a fixed "9" in it, then all the squares in that column have "9" removed from the set of possible values. The smart-square mutation solved all the easy puzzles even before applying the evolutionary process. In solving the medium and hard puzzles, no mutation operators managed to get even close to optimal solution.

Lewis [14] presented the first application of a metaheuristic technique in solving Sudoku puzzles. This stochastic search-based algorithm uses simulated annealing in finding the optimal solution of a Sudoku puzzle. In his paper, Lewis showed that Simulated Annealing (SA) Algorithm can solve the puzzles consistently in a short amount of time. Various degrees of order-3 (81 cells) Sudoku puzzles found in the UK's national newspapers were tested using the SA Algorithm and they were quickly solved for about half a second regardless of their difficulty. Order-4 (256 cells) puzzles from The Independent during 2005 were also tested. The algorithm is also consistent in finding the solutions to those puzzles but it takes a little bit longer for about 5 - 15 seconds. Since the SA Algorithm is consistent in solving Sudoku puzzles seen on newspapers, Lewis created a Sudoku instance generator to determine the instances which the algorithm will have difficulty in solving the puzzle. On the "easy-hard-easy" style phase-transition, the algorithm was observed on order-3, order-4, and order-5 using time limits 5, 30, and 350 seconds respectively. It was found out that SA Algorithm have difficulties in solving an order-4 puzzles with 102 - 115 given values and order-5 puzzles with 250 - 281 given values. Thus, this implies that SA Algorithm is efficient and consistent in solving the standard order-3 Sudoku puzzles in all instances.

Perez and Marwala in their paper entitled, "Stochastic Optimization Approaches For Solving Sudoku" [19] explore the implementation of different stochastic optimization techniques in solving Sudoku puzzles. These techniques are Cultural Genetic Algorithm (CGA), Repulsive Particle Swarm Optimization (RPSO), Quantum Simulated Annealing (QSA) and the Hybrid method that combines Genetic Algorithm with Simulated Annealing (HGASA). This paper is motivated to solve difficult Sudoku puzzles as efficiently as the simple puzzles by stochastically searching through the solution space the global optimum of the problem. In this paper, the solution space were represented in three ways: (1) Treating each one of the cells as a separate variable or particle with each one requiring its own population, (2) Treating a combination of integers ranging between 1 and 9 as one individual or particle and (3) Representing each particle as a puzzle. The CGA is implemented using the solution representation 2 and 3. QSA has been implemented with the third approach. RPSO used the second approach. The HGSA is implemented using the third approach. The results obtained showed that the CGA, QSA and HGASA were able to solve the Sudoku puzzle with CGA finding a solution in 28 seconds, while QSA finding a solution in 65 seconds and HGASA in 1.447 seconds. HGASA got the best result mainly because it combines the parallel searching of GA with the flexibility of SA. The RPSO was found to be unable to solve the puzzle.

David Mullaney in "Using Ant Systems to Solve Sudoku Problems" [18] presented an ant system approach for solving Sudoku puzzles using the behavior of ants in finding the shortest path between their colony and the food source. With the success of Ant Systems with other problems, like Travelling Salesman Problem and Latin Squares Problem, Mullaney reasoned that Ant Systems can be used to solve the Sudoku puzzle problem since the mathematical properties of a Sudoku Puzzle Problem and Travelling Salesman Problem are quite similar. Also, Mullaney looks at the Sudoku Puzzle Problem as a specific case of Latin Squares Problem in 9x9 grids. Operating under Tabu Search principle, each individual ant visits all possible nodes without violating the rules of the game. Calculation of each ant's fitness is determined by the number of nodes the ant has traversed, and the complete Sudoku Puzzle Problem will have a fitness of 81. The Top 95 Hardest Sudoku Puzzles compiled by Geunter Stertenbrink was tested using Shortest Tabu List (STL), Navigator Assisted Ant Colony Optimization (NACO), and Standard Ant Colony Optimization (ACO). Results showed that the STL, which uses the brute force trial and error, was able to solve the puzzles but it takes approximately 120 minutes to solve each puzzle. With most people, solving Sudoku puzzles take approximately less than 30 minutes; this is quite an ineffective algorithm for solving Sudoku puzzles. In 20-25 minutes, the NACO was able to solve 60 % of the puzzles. The ACO took only 3-7 minutes to solve each puzzle. This seemed to be the fastest among the 3 algorithms; however it only solved 20 % of the puzzles.

In 2006, Chu and Tsai [5] proposed a new idea of computational intelligence through inspecting the behavior of cats. Based on the idea of swarm intelligence proposed in recent years, where the intelligent behavior of animals is modeled in solving optimization problems, the potential algorithm, Cat Swarm Optimization (CSO) can possibly perform well when applied to optimization problems since CSO somehow belongs to swarm intelligence. Two major behaviors of cats were modeled and these are termed "seeking mode" and "tracing mode". Seeking mode refers to the behavior of cats where they are in a period of resting but their eyes are wide open, looking and observing the environment for its next move. The tracing mode refers to the behavior of cats when they are tracing their targets. CSO has been evaluated by six functions as well as the Particle Swarm Algorithm (PSO) and PSO with weighted factor and the results were compared. From the results obtained, CSO provides a better performance in finding the global best solution.

In a more recent paper of Chen, et al. [3], a parallel structure of Cat Swarm Optimization (CSO), called Parallel Cat Swarm Optimization (PCSO), was studied to advance its convergence and searching ability. In the structure of CSO, individual cats work independently in seeking mode but they share the same information, the global best solution. In this paper, the procedure in the tracing mode was modified to make it more cooperative. The cats were separated into several sub-populations. For this reason, the individuals in tracing mode do not move forward directly to the global best solution, instead they move forward to the local best solution of its own group. This procedure somehow achieves cooperation through exchanging of information within the group. To compare the performance of PCSO, CSO and PSO with weighted factor, these algorithms were evaluated using three test functions, namely Rosenbrock Function, Rastrigrin Function, and Griewank function. The experimental results showed that the convergence and searching ability of PCSO is higher than CSO and PSO with weighted factor in small population size and in less iteration. This implies that the convergence and searching ability of CSO was improved by the information exchange and cooperation between sub-populations.

On the paper, "Cat Swarm Optimization for Clustering", by Santosa and Ningrum, CSO has been applied to clustering problem since CSO algorithm has been proven to be effective when applied in function optimization problems [4]. The new CSO clustering algorithm was applied to group data/objects into clusters containing similar data without any information about the number of cluster and grouping pattern. The algorithm was tested on four different datasets: Iris, Soybean, Glass, and Balance Scale. In this paper, three major experiments were conducted: experiment of CSO Clustering with different number of iteration, experiment of CSO Clustering with modification in velocity updating formula, and CSO Clustering compared to other clustering methods. The first experiment result showed that there is no correlation between the difference in the number of iterations and the average cluster error value. In the second experiment, the value of clustering error was decreased after the modification, which implies that the accuracy level of clusters obtained from the modified CSO clustering is better that the clusters obtained from the non-modified CSO clustering. In the third experiment, where CSO clustering algorithm was compared to k-means clustering and PSO clustering, it takes more CPU time with CSO clustering than k-means and PSO clustering. However, CSO clustering has better accuracy than the other two methods on clustering data with small number of data. But, when clustering large data, the CSO clustering was not as good as the other methods.

Chapter 3

Statement of the Problem

Sudoku is a well known puzzle that has achieved international popularity for many years. Sudoku puzzles are created in the expectation that they will be solved. The completion rules are simple yet the completion of the puzzle requires more than "logical reasoning". Each puzzle has a unique solution and does not require the use of trial and error or guessing. Normally, they are solved by just looking for a permutation that fits the basic rules, and the given numbers. Others solved the puzzle using the common techniques such as Swordfish, and X-wing, while, others solved the puzzle by using optimization algorithms.

Existing algorithms were battling for the fastest solving algorithm for a Sudoku puzzle. Normally, without violating its rules, increasing the input size of the Sudoku puzzle will directly affect the time it takes to find the best solution. In fact the weakness of some of these algorithms is, as the size of the Sudoku puzzle increases, its efficiency decreases.

Thus, this study aims to show that using Cat Swarm Optimization algorithm will efficiently solve Sudoku Puzzles.

Chapter 4


The objectives of this study are as follows:

  1. To use Cat Swarm Optimization (CSO) algorithm in generating and solving Sudoku Puzzles.
  2. To evaluate the performance of CSO algorithm in generating and solving Sudoku Puzzles.
  3. To show the effects of varying parameters (e.g. Selected Range of Dimension, Counts of Dimension to Change, Mutative Ratio) to the efficiency of the CSO algorithm.
  4. To compare the performance of CSO algorithm against existing algorithms used in solving Sudoku Puzzles.

Chapter 5

Proposed Approach


The n x n Sudoku puzzle represents as an input where n = 6, 9 or 12 which corresponds to the size of the puzzle. Initially, each of the squares or cells of Sudoku board can be either blank or filled with a number between 1 and 9. The empty squares in the puzzle are called non-start squares and the squares with given number are called start squares. The position of given numbers are fixed which means it should not be altered all throughout the process of completing the puzzle. A sample Sudoku board with initial values is given in Figure 5.1.

The Sudoku board is represented in a three-dimensional array of integers as shown in Figure 5.2. The first and second indices correspond to the row and column of the board, respectively. The third index will hold two values: 1) current value of the square in range of {0, 1...9} and 2) flag in range of {0, 1} denoting if the square is a start square or not (0 if non-start square, 1 otherwise). Hence, the representation of the board as an array will be in [row] [column] [2]. The [row] [column] [0] holds the current value of the square and the [row] [column] [1] indicates if the square is a start square or not. Using Figure 5.1, the row 0 and column 1 is a non-empty square with the value 7. In a three-dimensional array, that will be denoted as [0] [1] [2] where [0] [1] [0] holds the value of the square which is 7 and the array [0] [1] [1] holds the flag 1 to denote that the square is a start square. Note that the value of the start squares are fixed and cannot be changed when solving the puzzle.

An empty square in the Sudoku puzzle is represented in an array with a value 0 and flag 0 to indicate that it is non-start square. See Figure 5.3.

Solving Sudoku Using Cat Swarm Optimization (CSO)

Setting of Parameters

The following parameters must be set first before doing the core process of the algorithm:

In CSO, cats represent the solution sets of the problem. The first parameter defines the population size of cats needed to solve the Sudoku puzzle. The second parameter will determine when to stop the entire solving process. The last seven parameters play a vital role to the core process of CSO algorithm. The first two parameters are defined by the user and the rest will be set accordingly to suit the process of solving the problem.


The pseudocode above shows the entire process of CSO algorithm. It starts by creating N cats. Using the process for solving Sudoku, the cats represent the solution set for Sudoku problem. In Line 02, each cat is given a copy of the incomplete puzzle and each will give values to the missing squares and its associated velocities. The flag of each cat is then set to determine what sub-mode process it will apply. In Line 03 to 12, all cats will go through a cycle of evaluating the fitness of its solution, keeping the best solution in memory and applying each cat into tracing or seeking process based on its flag until termination condition is reached. Line 05 evaluates the solution of each cat for the Sudoku problem according to a fitness criterion. The cat with best fitted solution for the problem is kept in memory. Line 06 and 07 will be applied to cats whose flag is set for seeking mode process of CSO. Seeking mode creates a number of copies of a cat's solution and all or number-1 of the copies will have its decision variables changed. The one with best fitted solution for the problem will become the cat's new solution. Line 08 and 09 will be applied to cats whose flag is set for tracing mode. Tracing mode updates the solution of a cat. Line 10 reset the cat that will move in tracing or seeking process. This way, cats will have the chance to apply the seeking or tracing mode in every cycle.

CSO Initialization

Every cat will store a copy of the given Sudoku puzzle and each will randomly assign numbers ranging between 1 and 9 in each non-start square satisfying the constraint that each subgrid must contain only one instance of each number. This copy of puzzle will represent the position of the cat and the squares of the puzzle represent the dimensions in that position. Figure 5.4 shows a cat's generated values for the sample Sudoku in Figure 5.1. The squares with numbers in bold are the start square and were not manipulated during the process. As shown in the figure, every subgrid contains the numbers {1...9} once.

Every cat is assigned a velocity for each dimension, and a seeking/tracing flag during the CSO initialization process. The velocity for each dimension is initially given a value within the given range through randomization. Also, assigning the seeking or tracing flag for each cat is done randomly but within the ratio defined by Mixture ratio or MR. MR gives the number of cats to enter the tracing mode process.

CSO Evaluation of Fitness Function

Fitness function determines whether the values applied to the non-start squares solved the given Sudoku puzzle. The fitness value of each cat is evaluated by applying the position of each cat to the fitness function. The position of the cat with the best fitness is kept in memory until the end of iteration. The cat with the best fitness is the cat with highest fitness value. The fitness value of each cat is determined by this fitness function:

where fiti is the fitness value of cat i and fi is the penalty value of the solution obtained of cat i. Penalty value is the total number of missing values in each row and column of the Sudoku puzzle. The smaller the penalty value, the higher the fitness values that will be obtain. A completed Sudoku puzzle will have a penalty value of zero since there will be no missing values for each row and column of the puzzle. Hence, a completed Sudoku will have a fitness value of 1.

The figure above calculates the missing numbers for each row and column and sums it up to obtain the penalty value of the generated solution. That is, the penalty value is equal to sum of missing numbers in each column plus sum of missing numbers in each row. Figure 5.5 shows a Sudoku Puzzle with a penalty value of 34 hence using the formula, the fitness value is 1/(1+34) = 0.0286.

Seeking Mode Process

Seeking mode aims to get a better fitness value for every cat that moves into this process. There are four parameters that need to be defined for the seeking mode. These are SMP (seeking memory pool), SRD (seeking range of selected dimension), SPC (self position considering), and CDC (counts of dimension to change). SMP represents how many copy of itself will be allowed for each cat. All or SMP-1 of these copies will be the candidate position that will replace the cat's current position. SRD declares the mutative ratio for updating the value of the selected dimension for each candidate. It is usually given a value between 0 and 1. SPC is a Boolean parameter which indicates whether the point at which the cat is standing will be one of the candidate points to move to. CDC tells how many of the dimensions will be varied for one candidate. The values that were assigned for the parameters will be set as constants on the entire CSO process. The seeking mode process for Sudoku is described below:

Make SMP copy of the present position of each cat. The position refers to the configuration of the puzzle stored in each cat. For example, if SMP is set to 3, then it means seeking mode will create 3 copies of the current position of the cat that enter this process. A sample is shown in Figure 5.6.

Step 2:

Define j value which represents the number of copies of catk that will be the candidates to move to. If SPC is true, the current position is also considered as one of the candidates and let j = (SMP - 1). Otherwise, let j = SMP and the current position not part of the candidates. Using again the example in Step 1, if SPC is true, then the current position will be consider as one of the candidates and the two copies of catk will undergo mutation. If SPC is false, all of the copies will undergo mutation hence the current position of catk will be likely replace with a new position.

Step 3:

For each candidate, determine the number of dimensions or cells that will experience mutation. This is done by multiplying the total number of dimensions or cells within the Sudoku puzzle with CDC. So a 9 x 9 puzzle has 81 total numbers of dimensions. If the copy 1 of catk in Figure 5.6 is a candidate, then according to CDC ratio, a certain number of dimensions will be chosen randomly to update its value. See Figure 5.7.

The cells with darker borders are the dimensions randomly selected to experience mutation.

Step 4:

For each selected dimension randomly plus or minus the current value it contains by SRD.

val = current value +/- (current value * SRD)

Every dimension or cells of Sudoku puzzle belongs to a particular subgrid. Updating the dimension must not violate the property of a subgrid, that is, it should not contain any duplication of numbers. Also, only the selected non-start square should be updated. To change the value of a selected dimension, it is swap with a number selected from the list of needed values for that subgrid. Needed values are the numbers that are required in that subgrid. For example, if a subgrid has start squares with values 7, 3, and 6, the needed values for that subgrid are 1, 2, 4, 5, 8 and 9. So, there are six needed numbers required for this subgrid. Choosing a value is done by using this formula: val%size of the list of needed values which gives the index or position from the list of needed values. Using the given example of list of needed numbers, if val = 14 and the numbers required for a subgrid is 6, 14%6 = 2. The needed number 4 is chosen to swap with the current value of selected dimension. See Figure 5.8.

The figure above shows an example of updating a dimension. The current value of selected dimension is 8 and it is swap with a value 4. The current value of the selected dimension is now 4.

Step 5:

Calculate the fitness value of each copy. If all FS are not exactly equal, calculate the selecting probability of each copy using the probability function, else set all selecting probability by 1. Section describes the process of finding the fitness value. After determining the fitness of each copy, the maximum and minimum fitness is needed for calculating the selecting probability of each one. The formula is given in

Figure 5.9 shows the new values of the selected candidates after performing Step 3 and Step 4. Using the fitness function discussed in the previous section, catk will have a fitness value of 0.0286 having a penalty value of 34. Here, catk is one of the candidate points to move to, so one of its copies retains its value. The other copies of catk now have a fitness value of 0.0244, 0.0278 and 0.025 respectively. The fitness values are not equal hence the probability function will be applied to set the selected probability of each cat. The minimum fitness among the candidates is 0.0244 and the maximum fitness is 0.0286 of catk. To determine the selected probability of copy 3, using the formula given in Section the probability will be:

Step 6:

Set the chosen candidate with highest probability to be the new position of catk. Instead of randomly choosing the new position, the candidate with highest probability is preferred since it has the best fitness value among other candidates. The higher the fitness value of a cat, the better its position which represents the solution for the puzzle. Using the example in Step 5, the candidate with highest probability is still catk, hence the position of catk is not changed. In cases where there are more than 1 copy of cats with highest probability, the new position of catk is randomly selected from these copies.

Tracing Mode Process

The tracing mode updates the position and velocities of each dimension of catk. The process starts by updating first the velocity of each dimension and then updates the position in that dimension. The tracing mode process for Sudoku is described below:

Step 1:

Update the velocities within the range of {-100...100} for every dimension (vk,d ) according to this equation:

where dimension d corresponds to the row-column indices of the board, that is d = [0][0].....[9][9]. xbest,d is the value of the cat with best fitness in corresponding dimension d, and xk,d is the position of catk in same dimension. The dimension refers to the square of the puzzle and the position to the configuration of the Sudoku. The velocity of each dimension is updated by adding the current velocity with the product of a random value between [0,1], a constant value, and the difference of the value of best cat with the value of catk in that dimension.

Figure 5.10 shows the configuration of the puzzle in catk and the cat with the best fitness so far. Suppose the associated velocity of dimension [0] [0] containing the value 5 in catk is -3, the new velocity will be determined by this equation: vcatk,[0][0] = -3+rand()*c1*(8-5). 8 is the value of dimension [0] [0] in catbest and 5 is the value of catk in same dimension. rand() will give a random value between 0 and 1 and c1 is some constant value.

Step 2:

Check if the new velocity is larger or smaller than the limitation. If it is greater than the maximum velocity, which is 5, set it equal to maximum velocity. If it is less than the minimum velocity, set it equal to minimum velocity, which is -5.

Step 3:

Update the position of catk according to the equation:

where xbest,d is the current position of the catk and vk,d is the new velocity of catk in that dimension. Again, the constraint that each subgrid will have a distinct values {1...9} and only the non-start squares should be manipulated must be satisfied. Then, follow the process of Step 4 in Seeking Mode wherein updating each dimension is done by swapping the current value with a selected number from the list of needed values. Choosing a value is done using xk,d % size of list of needed values to get the index or position from list of needed values. The needed value having this index will now be the current value of the selected dimension.

Probability Function

The probability function determines the possibility of being selected. The higher the probability, the better is the chance to be selected. FS refers to the fitness value of the solution. The formula is:

Mixture Ratio (MR)

The mixture ratio or MR determines the ratio of cats that will move to the tracing mode process of CSO. It is given a small value so that most of the cats will apply the seeking mode process. This will mimic then the behavior of cats spending more time resting and observing the environment rather than chasing things.

Termination Condition

The entire process will terminate once the fitness value is equal to 1 or the maximum number of cycle is reached.


The output is a complete Sudoku puzzle with every constraints satisfied if the fitness value obtained is 1. This means that every row, column and sub-grid of the puzzle contains the number from 1 to 9 once. Figure 5.11 shows the complete puzzle of Figure 5.1. Otherwise, the output is the best solution obtained after the maximum cycle is reached.



  1. Bonabeau, E., Dorigo, M., Theraulaz, G.: Swarm Intelligence: From Natural to Artificial Systems, New York, NY: Oxford University Press, (1999).
  2. Cats. From, (Accessed: January 2010)
  3. Chen, S.M., Hao, S.P., Liao, B.Y., Pan, J.S., and Tsai, P.W.: Parallel Cat Swarm Optimization, Proceedings of the Seventh International Conference on Machine Learning and Cybernetics, Kunming, Vol.6 (2008) 3328-3333
  4. Chu, S.C., Pan, J.S., and Tsai, P.W.: Cat Swarm Optimization, Proceedings of the 9th Pacific Rim International Conference on Artificial Intelligence, LNAI 4099, (2006) 854-858.
  5. Chu, S.C. and Tsai, P.W.: Computational Intelligence Based on the Behavior of Cat, International Journal of Innovative Computing, Information and Control, 3 (1), (2007) 163-173.
  6. Clarity Media - Buy Puzzles. From, (Accessed: January 2010)
  7. Crawford, B., Aranda, M., Castro, C., and Monfroy, E.: Using Constraint Programming to solve Sudoku Puzzles. ICCIT Proceedings of the 2008 Third International Conference on Convergence and Hybrid Information Technology, Volume 2. (2008) 926-931
  8. Das, S., Abraham, A., and Konar, A.: Swarm Intelligence Algorithms in Bioinformatics. Studies in Computational Intelligence (SCI) 94, (2008) 113-147.
  9. Eastaway, R.: What is the Difference Between a Puzzle and a Maths Question? Teaching Mathematics and its Applications, 16(1):1-4, (1997).
  10. Fortnow, L.: The Status of the P versus NP Problem. From, (Accessed: December 2009)
  11. Hereford, J. and Gerlach, H.: Integer-valued Particle Swarm Optimization applied to Sudoku Puzzles, in 2008 IEEE Swarm Intelligence Symposium, St. Loius MO USA, (2008).
  12. Karaboga, D.: An Idea Based On Honey Bee Swarm for Numerical Optimization. Erciyes University, Engineering Faculty, Computer Engineering Department, Turkey. (2005)
  13. Kim, S.: What is a Puzzle? From, (Accessed: December 2009)
  14. Lewis, R.: Metaheuristics can Solve Sudoku Puzzles, Journal of Heuristics, vol. 13, no. 4, (2007) 387- 401.
  15. McAdam, D.: History of Jigsaw Puzzles, American Jigsaw Puzzle Society. From, (Accessed: January 2010)
  16. Moraglio, A., Togelius, J., Lucas, S.: Product geometric crossover for the Sudoku puzzle, in 2006 IEEE Congress on Evolutionary Comp. (CEC2006), Vancouver, BC, Canada, (2006) 470-476.
  17. Mraz, D.: Types of Puzzles. From, (Accessed: January 2010)
  18. Mullaney, D., Using Ant Systems to Solve Sudoku Problems, Springer Publishing Company, (2006)
  19. Perez, M. and Marwala, T.: Stochastic Optimization Approaches for Solving Sudoku, arXiv: 0805.0697v1, (2008).
  20. Pest of the month: Bees. From, (Accessed: January 2010)
  21. Play Hyper Sudoku online. From, (Accessed: January 2010)
  22. Puzzle. From, (Accessed: January 2010)
  23. Samurai Sudoku. From, (Accessed: January 2010)
  24. Santos-Garca. G. and Palomino.M.: Solving Sudoku Puzzles with Rewriting Rules. Electronic Notes in Theoretical Computer Science, 17(4), (2007) 79-93.
  25. Santosa, B., Ningrum, M.K.: Cat Swarm Optimization for Clustering, 2009 International Conference of Soft Computing and Pattern Recognition, (2009).
  26. Simonis, H.: Sudoku as a constraint problem. CP Workshop on Modeling and Reformulating Constraint satisfaction Problems, (2005).
  27. Some Theoretical Ideas for Computer Science. From, (Accessed: December 2009)
  28. Sudoku variants and other puzzles. From, (Accessed: January 2010)
  29. Sudoku - Variants. From, (Accessed: December 2009)
  30. Swarm Intelligence Algorithms and Applications Symposium. From, (Accessed: January 2010)
  31. Techniques for Solving Sudoku. From, (Accessed: December 2009)
  32. Thompson, M.J.: Defining the Abstract. (2000) From, (Accessed: December 2009)
  33. What is Sudoku? From, (Accessed: December 2009)
  34. Yato, T. and Seta, T.: Complexity and Completeness of Finding Another Solution and Its Application to Puzzles. IEICE Trans. Fundamentals, E86-A (5):1052-1060, (2003)

Please be aware that the free essay that you were just reading was not written by us. This essay, and all of the others available to view on the website, were provided to us by students in exchange for services that we offer. This relationship helps our students to get an even better deal while also contributing to the biggest free essay resource in the UK!