Write up on Tech Geek History: Deep Blue

Literature Review
What Is Deep Blue Algorithm?
The Deep Blue algorithm was developed by IBM. Deep Blue was a chess-playing computer system designed to play regular chess games and matches against the reigning world champion under standard tournament time controls.
History of Deep Blue Algorithm
The early development of Deep Blue began in 1985 with the ChipTest project at Carnegie Mellon University. The American chess grandmaster Joel Benjamin was part of the development team hired by IBM. The project was originally named Deep Thought and was renamed Deep Blue in 1989. Deep Blue won its first game against world champion Garry Kasparov on 10 February 1996. However, Kasparov came back, winning three games and drawing two, so Deep Blue did not win the match.

Later, in May 1997, a heavily upgraded Deep Blue defeated the reigning world champion, winning the six-game match, although Kasparov accused IBM of cheating.
How Does Deep Blue Algorithm Work?
System Overview
Deep Blue was a massively parallel system built to carry out chess game tree searches – a representation of the possible sequences of moves branching out from a given position. The system was built around a 30-node (30-processor) IBM RS/6000 SP computer, with single-chip chess search engines attached at 16 chess chips per SP processor. The SP system had 28 nodes with 120 MHz P2SC processors and 2 nodes with 135 MHz P2SC processors. The nodes communicated with one another over a high-speed switch, and each contained 1 GB of RAM and 4 GB of disk space. Each chess chip was capable of searching 2 to 2.5 million chess positions per second, and communicated with its host node over a Micro Channel bus.
How Does the System Work?
The functioning of the SP processors
The Deep Blue search is split into three layers: one SP processor acts as the master, and the remaining processors act as workers. The master searches the top levels of the chess game tree and hands the resulting positions – often called "leaf positions" – to the workers for further examination. The workers carry out additional searches of these positions and distribute their own leaf positions to the chess chips. Finally, the chess chips extend the search through the last few levels of the game tree.
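To make this three-layer split concrete, here is a minimal, purely illustrative Python sketch. It is not Deep Blue's code: the legal_moves, apply_move, and chip_search helpers are invented toy stand-ins, the parallel distribution of work across SP nodes is omitted, and the hardware search is replaced by a random score.

```python
import random

# Toy stand-ins so the sketch runs; in Deep Blue these were real move
# generation, board updates, and a dedicated hardware search.
def legal_moves(position):
    return ["a", "b", "c"]

def apply_move(position, move):
    return position + move

def chip_search(position, depth):
    # Chess-chip layer: in Deep Blue this ran in hardware; here it is a stub
    # returning a made-up score from the point of view of the side to move.
    return random.uniform(-1, 1)

def worker_search(position, depth, chip_depth):
    # Worker layer: software search of the middle of the tree (negamax style).
    if depth == 0:
        return chip_search(position, chip_depth)
    return max(-worker_search(apply_move(position, m), depth - 1, chip_depth)
               for m in legal_moves(position))

def master_search(position, worker_depth=2, chip_depth=2):
    # Master layer: search the top of the tree and farm leaf positions out
    # to the workers; the actual distribution across SP nodes is not shown.
    best_score, best_move = float("-inf"), None
    for move in legal_moves(position):
        score = -worker_search(apply_move(position, move), worker_depth, chip_depth)
        if score > best_score:
            best_score, best_move = score, move
    return best_move

print(master_search("start"))
```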

While performing these searches, the speed of the system varies. In tactical positions – where long, forcing sequences of moves must be examined – Deep Blue explored roughly 100 million positions per second, while in quieter positions the search could reach about 200 million positions per second. In the 1997 match with Garry Kasparov, the search speed averaged 126 million positions per second, with an observed peak of 330 million positions per second.
Components of Deep Blue Algorithm
• Move Generation: The move generator provides a number of functions, including the generation of checking moves, check-evasion moves, and certain kinds of attacking moves, and it also makes the chess chips' search extensions possible. It is implemented as an 8 × 8 array of combinatorial logic that acts as a silicon chessboard. The move generator produces a single move at a time, but it computes all possible moves in parallel and selects the most significant one with the help of an arbitration network.
• Evaluation Function: The evaluation function is composed of a fast evaluation and a slow evaluation, a standard technique for avoiding an expensive computation when a simple approximation will do. The fast evaluation computes a score from piece placement in a single clock cycle. By contrast, the slow evaluation scans the chess board one column at a time, computing values for chess concepts such as square control, king safety, pawn structure, pawn majority, restraint, color complex, trapped pieces, and development (a software sketch of this split follows this list).
• Search Control: The Deep Blue chess chip implements a null-window alpha-beta search. Its best feature is that it avoids the need for a value stack, which simplifies the hardware design. There are disadvantages, too: some positions require multiple searches, and the chip has no transposition table, which would otherwise improve search efficiency. The search uses a move stack with a repetition detector that keeps track of up to the last 32 positions.
• Extendability: The chess chip also supports the use of an external field-programmable gate array (FPGA) to provide access to an external transposition table, more complex search control, and additional evaluation-function terms, the aim being to handle more complex mechanisms efficiently. Null-move search is also supported, but due to time constraints it was never used in Deep Blue.
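To illustrate the fast/slow evaluation split in software terms, here is a hedged Python sketch. Deep Blue's evaluation ran in hardware with thousands of tuned terms; the piece values, the doubled-pawn penalty, and the lazy-evaluation margin below are simplified assumptions for illustration only.

```python
# Simplified software analogue of Deep Blue's fast/slow evaluation split.
# The weights and terms are illustrative, not the chips' actual values.

PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

def fast_eval(board):
    """Cheap material count (the hardware computed piece placement in one cycle).
    board: {(file, rank): piece}; uppercase = white, lowercase = black."""
    score = 0
    for piece in board.values():
        value = PIECE_VALUES[piece.upper()]
        score += value if piece.isupper() else -value
    return score

def slow_eval(board):
    """Scan the board one file (column) at a time; a doubled-pawn penalty
    stands in for square control, king safety, pawn structure, and so on."""
    score = 0.0
    for file in range(8):
        white_pawns = sum(1 for rank in range(8) if board.get((file, rank)) == "P")
        black_pawns = sum(1 for rank in range(8) if board.get((file, rank)) == "p")
        if white_pawns > 1:
            score -= 0.5 * (white_pawns - 1)
        if black_pawns > 1:
            score += 0.5 * (black_pawns - 1)
    return score

def evaluate(board, alpha, beta, margin=2):
    """Lazy evaluation: only run the expensive scan when the cheap
    approximation says the position is close to the search window."""
    fast = fast_eval(board)
    if fast + margin < alpha or fast - margin > beta:
        return fast
    return fast + slow_eval(board)

# Example: a toy position with doubled white pawns on one file.
board = {(0, 1): "P", (0, 2): "P", (4, 6): "p", (3, 0): "K", (3, 7): "k"}
print(evaluate(board, alpha=-5, beta=5))
```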
https://www.professional-ai.com/deep-blue-algorithm.html
AI Techniques
Tree Search
The basic model of chess is that of a Tree Search problem, where each state is a particular arrangement of the pieces on the board and the available actions correspond to the legal chess moves for the current player in that arrangement. An example “slice” of such a tree is given in the following figure:

Once we have modeled the game in this way, we can begin applying our algorithms from this course to the problem!
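As a toy illustration of this modeling step, the sketch below represents one node of such a tree in Python. The class and the legal_moves/apply_move callbacks are invented for illustration; a real chess program would plug in an actual rules engine.

```python
# Minimal game-tree node: a state plus the legal moves available from it.
# The legal_moves and apply_move callbacks are placeholders for real chess rules.

class GameTreeNode:
    def __init__(self, state, to_move, legal_moves, apply_move):
        self.state = state                  # e.g. a board arrangement
        self.to_move = to_move              # "white" or "black"
        self._legal_moves = legal_moves
        self._apply_move = apply_move

    def children(self):
        """Expand one 'slice' of the tree: every position reachable in one move."""
        next_player = "black" if self.to_move == "white" else "white"
        for move in self._legal_moves(self.state, self.to_move):
            yield move, GameTreeNode(self._apply_move(self.state, move),
                                     next_player, self._legal_moves, self._apply_move)
```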
The Evaluation Function
As put forth in Shannon’s paper, the primary ingredient in a chess-playing program is the evaluation function. Since we can’t look forward all the way to the end of the game and see if a particular move will win (especially since we don’t know what the other player will do during their turns!), we must create a function which takes in a state of the game (in our case, a board arrangement) and boils it down to a real-number evaluation of the state. For example, the function could give higher scores to board states in which the player of interest has more of their pieces on the board than the opponent. In particular, we would probably want the function to assign an extremely high score (perhaps even infinity) to the board arrangement in which the opponent’s king is in checkmate, meaning that the player of interest is guaranteed to win the game.

The Minimax Algorithm
Given an evaluation, all that’s left is a way of actually choosing which move to take. Although looking ahead one step and simply choosing the move which leads to the board arrangement with the highest evaluation score would be a good baseline, we can be even smarter and take into account the actions our opponent could take once we’ve moved. This intuition leads to the “Minimax algorithm”, so-called because we choose the action which minimizes our maximum possible “loss” from making a particular move. Specifically, for each move we could make we look ahead as many steps as our computing power will allow and examine all the possible moves our opponent could make in each of their future turns, given that we’ve made our original move. We then take the maximum “loss” (equivalently, the minimum of our evaluation function) that our opponent could induce for us via their moves, and we choose the move we could make which minimizes this maximum.
Heuristics/Optimizations
Equipped with an evaluation function and an implementation of the minimax algorithm, one can already design an incredibly effective chess-playing program. However, the "big time" programs build even further upon these by implementing "heuristics", simple rules that cut down on computation time, along with optimizations of the minimax algorithm that exploit the specific structure of chess. An example heuristic could be that if a move leads to the player's own king being checkmated, the algorithm should not look any farther down that path of the game tree, since we know the player will never want to make that move. A popular optimization of minimax is known as alpha-beta pruning, wherein a move is eliminated as soon as another move has been discovered that is guaranteed to do better. For example, in the following tree we do not need to explore any of the paths whose edges are crossed out, since we've already found moves we know will perform better:
Decision Trees
A decision tree starts with a root node, which symbolizes the current state of the game. Every possible move we can make at that point becomes a child of that node. Then, for each child, there is a new set of possible moves for the opponent. The tree branches out until it covers every possible state of the game, and the game ends when it reaches a leaf node.
The very first artificial intelligence algorithms for games were based on brute-force search of these decision trees. The search algorithm tries to reach a leaf node in which the machine wins and makes its decisions so as to reach one of these winning nodes. We shall now see one of these algorithms in action.
Minimax Algorithm
Abstract yourself from the game now and assume we have assigned a score to every possible outcome of the game. The scores are assigned to the leaf nodes of the tree. A positive score indicates that the machine wins and a negative score indicates that you win. So, the goal of the AI is to maximize the score, and yours is to minimize it. Green arrows indicate the turns of the AI (maximizer) and red arrows indicate yours (minimizer) in the tree.

The minimax algorithm is very simple and it is a modified version of depth-first search. The AI (green) will always choose the move with the maximum possible outcome, assuming that its opponent (red) will play optimally and choose the minimum possible outcome all the time. It is intelligent to assume that your opponent plays optimally so that you will be prepared for the worst. Now take a while and follow the tree from the bottom to the top to see the moves each opponent made.
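Here is a minimal Python sketch of that recursion, assuming the game tree is given as nested lists with the leaf scores already assigned (a representation chosen purely for brevity):

```python
# Minimax over a toy game tree represented as nested lists; a number is a
# leaf score (positive favours the maximizer/AI, negative the minimizer/you).

def minimax(node, maximizing):
    if isinstance(node, (int, float)):            # leaf: the score is already assigned
        return node
    child_values = [minimax(child, not maximizing) for child in node]
    return max(child_values) if maximizing else min(child_values)

# The AI (maximizer) moves first on this tiny tree:
tree = [[3, 5], [-2, 9]]
print(minimax(tree, maximizing=True))             # -> 3, the best guaranteed outcome
```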

You can see that this algorithm pretty much searches all possible scenarios by brute force. If we assume that b is the branching factor and d is the depth of the decision tree, the algorithm runs in O(bᵈ) time, which is exponential.
If you are implementing a Tic-Tac-Toe game, this may not be that bad. After all, there are 9 possible moves on the first turn, 8 on the next, then 7, and so on, which makes up to 9! scenarios in total. However, if you were making a chess game, the number of possibilities grows by an insane amount! It would take millions of years for any computer to calculate all the possibilities.
Do I need to say there must be a better way?
Alpha-Beta Pruning
Alpha-beta pruning is the strategy of eliminating the branches that will not be contributing to the solution. I will explain this with an example. The red lines in the tree below mark the current state of our search. The maximizer (AI) has chosen 9 and 5, which are the maximum reachable values on the corresponding subtrees. At this point, the minimizer currently holds the value 5, which is the smaller one among 9 and 5.
There is still one branch left to search, and the first value the depth-first search sees is 6. Now we know that whatever the maximizer chooses will be at least 6. But we also know that the minimizer already holds 5, which is smaller than 6. At this point, we no longer need to check the remaining children (1 and 7) because we know for sure that the minimizer will select 5.

The reverse could also be true: if the maximizer has already chosen a value bigger than the value the minimizer chose, we wouldn't need to search the rest of the subtrees. In the tree below, the maximizer has already chosen 5 at the root node. Since -2 is smaller than 5, and anything the minimizer chooses will be at most -2, we no longer need to search the rest of the subtrees.
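Here is the same recursion with alpha-beta cutoffs added, as a Python sketch over the same nested-list toy representation; the leaf values are chosen to mirror the 9/5 versus 6-1-7 example above.

```python
# Minimax with alpha-beta pruning over nested-list trees.
# alpha = best value the maximizer is already guaranteed higher up the tree,
# beta  = best value the minimizer is already guaranteed.

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:          # the minimizer above will never allow this branch
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, True, alpha, beta))
            beta = min(beta, value)
            if beta <= alpha:          # the maximizer above already has something better
                break
        return value

# Mirrors the example above: once the minimizer holds 5 and the last subtree
# starts with 6 for the maximizer, the remaining leaves (1 and 7) are skipped.
tree = [[9, 4], [5, 3], [6, 1, 7]]
print(alphabeta(tree, maximizing=False))   # -> 5; the leaves 1 and 7 are never visited
```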

So go ahead, apply this strategy to your entire search and prune the c**p out of the decision tree. There is one last subtree you can prune out on the very right side; I'll leave it to you to figure out why we can skip it. The final result of the alpha-beta pruning algorithm shall be this:

We pruned the tree quite a bit. In the best case, alpha-beta pruning reduces the number of positions examined to roughly the square root of what plain minimax would examine. It may also provide no performance improvement at all, depending on how unlucky you are with the move ordering.
Depth-Limited Search
Even though alpha-beta pruning provides a great amount of performance improvement, searching the entire set of possible scenarios still can be overkill. We can employ intelligent strategies to avoid searching the entire tree and still get very good results.
One such strategy is depth-limited search, and it is exactly what it sounds like. Instead of searching the entire tree, you search it down to a predefined depth. For example, you can search just the next 5 moves in chess. But in order to do that, you need a deterministic way to score the current state of the game, since you no longer know who wins when you reach the end of your search. For this, we will use an evaluation function.
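A minimal sketch of how the depth limit changes the recursion, assuming evaluate, legal_moves, and apply_move are supplied by the caller (they are placeholders here, not a real chess engine):

```python
# Depth-limited minimax: stop after `depth` plies and fall back to a
# static evaluation of the position instead of playing to the end.

def depth_limited_minimax(state, depth, maximizing, evaluate, legal_moves, apply_move):
    moves = legal_moves(state)
    if depth == 0 or not moves:               # depth limit reached or game over
        return evaluate(state)
    values = [depth_limited_minimax(apply_move(state, m), depth - 1,
                                    not maximizing, evaluate, legal_moves, apply_move)
              for m in moves]
    return max(values) if maximizing else min(values)

# Usage, e.g. looking 5 plies ahead:
#   depth_limited_minimax(current_board, 5, True, evaluate, legal_moves, apply_move)
```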
Note: There are other alternatives to the depth-limited search, like iterative deepening but I preferred not to include them in order to keep the story short.
How deep is your search? Get it? That was a joke…
Evaluation Functions
So, you decided to limit your search to the next 5 moves you and your opponent will make in the game. Then you realize that most of the time the game still continues after 5 moves, and you are now stuck in this intermediate state. How do you feed the numbers into the minimax algorithm? What you need is an evaluation function.
An evaluation function is a way to deterministically score the current state of the game. If you are playing chess, for example, an evaluation function can be the numeric difference between the number of chess pieces you and your opponent have. The bigger the difference is, the better chance you have.
A better evaluation function can use a weighted calculation in which each piece has a weight depending on how important it is. This will probably produce better results than simply counting them. And an even better one may use the locations on the chessboard in addition to their weights.
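As a rough Python sketch of such a function: the 1/3/3/5/9 weights are the conventional piece values, and the small centre bonus is a made-up stand-in for a real piece-square table.

```python
# Weighted material evaluation with a toy positional bonus.
# Conventional piece values; the centre bonus is purely illustrative.

PIECE_WEIGHTS = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}
CENTRE = {(3, 3), (3, 4), (4, 3), (4, 4)}          # the four central squares

def weighted_eval(board):
    """board: {(file, rank): piece}; uppercase = ours, lowercase = opponent's."""
    score = 0.0
    for square, piece in board.items():
        value = PIECE_WEIGHTS[piece.upper()]
        if square in CENTRE:
            value += 0.1                            # small bonus for central pieces
        score += value if piece.isupper() else -value
    return score
```

Note that this function is deterministic and cheap to compute, which matters for the two requirements listed below.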
How you define your evaluation function is entirely up to you. Beware, though: this will be the most critical part of your algorithm. How well your evaluation function captures the state of the game will greatly affect the success of your algorithm. There are two other things you should be careful about when writing an evaluation function:

  1. The function should be deterministic: Given the same state, it should always produce the same result.
  2. The function should be fast: You will be making a lot of calls to your evaluation function. If it is slow, then your AI will respond slowly.
https://medium.com/data-science/algorithms-revisited-part-7-decision-trees-alpha-beta-pruning-9b711b6bf109
https://stanford.edu/~cpiech/cs221/apps/deepBlue.html
