### Introduction to Neural Networks

Artificial neural networks are one approach to creating artificial intelligence — programs that imitate the cognitive functions of humans. Artificial intelligence (AI) may recognize cats in a set of images, play a game, or learn to drive a car. With the invention of artificial neural networks, computers have become better at these tasks — even better than humans, in some cases.

The dream of building a machine that matches or exceeds our ability to solve problems preceded modern computers. Accounts of fantastical AI appear in age-old stories. Somehow, we imagine, intelligent machines will teach us something we don't already know about ourselves.

Humans' pursuit of AI continues today. To understand why neural networks are so useful in modern AI, we'll look first at some of the trouble early AI researchers had using traditional approaches.

# Can Computers Learn?

Some of the most common examples of AI appear in computer and video games.

Gaming and computing became entwined not long after the invention of the first computers in the 1940s. Board games with definite rules are straightforward to translate into code, so early programmers demonstrated what their machines could do by designing programs that mimic human players.

In board games, players take turns to advance toward the game's goal according to the game rules. On their turn, the objective of any player, human or computer, is to maximize their chance of winning.

If you are designing an AI that can win a board game against a human player, what must the AI be able to do?

The biggest challenge in designing a game-playing AI is developing an algorithm that ranks different moves. The rank of a particular move depends not only on how much it boosts your advantage, but also on how much it disrupts your opponent.

To illustrate, consider the game tic-tac-toe (or noughts and crosses, depending on where you're from). The goal of this game is to place your markers (either X's or O's) on the grid so that you can draw a straight line through three of them across the board ("three in a row").

Here's a board state from the middle of a game:

It's X's turn. Decide on the best move for X. Assuming neither X nor O makes a mistake on subsequent moves, which player will win?

To make a good move in most games, you need to anticipate how your opponent will move in the future, and consider how your move affects your opportunities later.

It's X's turn. What move guarantees X a win?

Assume X places the markers correctly on all subsequent moves, and O blocks X from winning whenever possible.
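
One common way to formalize "anticipating your opponent" is minimax search, which tries every move, assumes the opponent replies with their own best move, and scores the result. The sketch below is an illustration only, not the approach this lesson goes on to describe. The board encoding (a 9-character string read row by row, with `.` for an empty spot) and the function names are assumptions made for the example, and the sample board is hypothetical rather than the one pictured in the problem.

```python
from functools import lru_cache

# The eight ways to get three in a row (cells numbered 0..8, row by row).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Return (score, move) with `player` to move.

    The score is from X's point of view: +1 if X can force a win,
    0 if best play leads to a draw, -1 if O can force a win.
    """
    won = winner(board)
    if won is not None:
        return (1 if won == "X" else -1), None
    empty = [i for i, cell in enumerate(board) if cell == "."]
    if not empty:
        return 0, None  # board is full with no winner: a draw
    other = "O" if player == "X" else "X"
    best = None
    for i in empty:
        score, _ = minimax(board[:i] + player + board[i + 1:], other)
        better = best is None or (
            score > best[0] if player == "X" else score < best[0])
        if better:
            best = (score, i)
    return best


# Hypothetical position: X holds cells 0 and 1, O holds cells 3 and 4.
print(minimax("XX.OO....", "X"))  # (1, 2): X completes the top row
```

Calling `minimax("." * 9, "X")` scores the empty board; the first element of the result is `0`, which matches the fact that tic-tac-toe is a draw when neither player makes a mistake.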

If you design an AI that learns from its mistakes, building a tic-tac-toe player is actually easier than programming a computer to mimic the reasoning you used to answer the last two problems.

Around 1960, an AI researcher named Donald Michie had a revolutionary idea. He set out to design an AI that learns a strategy for tic-tac-toe from scratch. In his view, even if the AI couldn't play very well at first, learning to play in ways it was never explicitly told to was an improvement over programming it directly with the game logic. If his approach worked for tic-tac-toe, it could be extended to more complex games.

Michie was determined, and his invention is widely seen as an important step toward artificial neural networks. At the time, computers were rare and expensive, so he built his AI from something he had plenty of: matchboxes.

Here's how Michie's Matchbox AI works.

If you've never counted, there are exactly $304$ distinct tic-tac-toe board states (distinct ways to put X's, O's, and unclaimed spots on the $3\times3$ grid) that one player can encounter from the beginning of the game to the final move.
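
For readers who would like to count for themselves, here is a rough sketch of one way to enumerate board states. The counting convention is an assumption, not something stated above: boards that differ only by a rotation or reflection are treated as the same state, only unfinished games with the first player (X) to move are counted, and the forced final move is left out. If the figure of $304$ follows the same convention, the four counts printed at the end should add up to it.

```python
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def symmetries():
    """Return the 8 rotations/reflections of the grid as index permutations."""
    rotate = [6, 3, 0, 7, 4, 1, 8, 5, 2]  # quarter turn clockwise
    mirror = [2, 1, 0, 5, 4, 3, 8, 7, 6]  # left-right flip
    perms, p = [], list(range(9))
    for _ in range(4):
        perms.append(p)
        perms.append([p[i] for i in mirror])
        p = [p[i] for i in rotate]
    return perms


PERMS = symmetries()


def canonical(board):
    """Pick one representative among all symmetric copies of a board."""
    return min("".join(board[i] for i in perm) for perm in PERMS)


def has_winner(board):
    return any(board[a] != "." and board[a] == board[b] == board[c]
               for a, b, c in LINES)


def count_states():
    """Count unfinished states with X to move, keyed by markers on the board."""
    layer = {canonical("." * 9)}
    counts = {}
    for ply in range(7):
        live = {b for b in layer if not has_winner(b)}
        if ply % 2 == 0:  # X moves when an even number of markers is placed
            counts[ply] = len(live)
        mark = "X" if ply % 2 == 0 else "O"
        layer = {canonical(b[:i] + mark + b[i + 1:])
                 for b in live for i in range(9) if b[i] == "."}
    return counts


counts = count_states()
print(counts, sum(counts.values()))
```

The smallest cases are easy to check by hand: there is one empty board, and $12$ essentially different boards with one X and one O.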

The Matchbox AI has one matchbox for each one of these states so that it can learn what move to make in any situation it may find itself in. Each matchbox has the board state it corresponds to drawn on its exterior, so the human player can assist the AI in putting its markers on the board.

Once all this setup work is done, it's time to begin teaching the Matchbox AI to play. It's not the matchboxes that learn the game; it's what's inside them...

Each matchbox contains a set of beads. These beads may be different colors, and each different color corresponds to one of the empty spots on the grid.

When it's time for the AI to place an X, you follow these steps:

• Find the matchbox corresponding to the current board state.
• Draw a bead at random from the box.
• Match the color of the bead to an empty spot on the board, and place the AI's marker on it.
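
The steps above can be sketched in a few lines of Python. This is an illustration under assumptions: a matchbox is modeled as a dictionary from bead color to bead count, and the colors, counts, and the `COLOR_TO_CELL` mapping are hypothetical values chosen for the example.

```python
import random

# Hypothetical mapping from bead color to a grid cell (0..8, row by row).
COLOR_TO_CELL = {"pink": 2, "green": 5, "blue": 8}


def draw_bead(box, rng=random):
    """Draw one bead at random; colors with more beads are more likely."""
    colors = list(box)
    weights = [box[color] for color in colors]
    return rng.choices(colors, weights=weights, k=1)[0]


# Hypothetical contents of the matchbox for the current board state.
box = {"pink": 3, "green": 2, "blue": 1}
color = draw_bead(box)
print(color, "-> place the AI's marker on cell", COLOR_TO_CELL[color])
```

Because `random.choices` weights each color by its bead count, this behaves like reaching into the box and pulling out a single bead without looking.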

In the middle of a game, here's what's inside the box you found corresponding to the current board state:

The number of beads of each color doesn't have to be equal. If you pick a bead from the box at random, which move is the AI most likely to make?

Depending on how many beads of each color are in the matchbox, the likelihood of certain moves can be adjusted.

For example, if it's decided that the spot corresponding to pink is a move that usually leads to losing the game, what can be done to decrease the likelihood that the AI picks pink?
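
To make the link between bead counts and likelihood concrete, the sketch below computes the chance of drawing each color as its count divided by the total number of beads in the box. The bead counts are hypothetical, and the function name is an assumption made for this example.

```python
from fractions import Fraction


def move_probabilities(box):
    """Chance of drawing each color: its bead count over the box total."""
    total = sum(box.values())
    return {color: Fraction(count, total) for color, count in box.items()}


# Hypothetical box: pink is currently the most likely move.
box = {"pink": 3, "green": 2, "blue": 1}
before = move_probabilities(box)

box["pink"] -= 1  # take one pink bead out of the box
after = move_probabilities(box)

print(before["pink"], after["pink"])  # 1/2 2/5
```

Changing the count of one color shifts every probability in the box, since all of them share the same total.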

Here's how the Matchbox AI develops a strategy: if you add beads for moves that lead to winning positions and remove beads that correspond to bad moves, the Matchbox AI can gradually learn to be a formidable opponent.

Before the Matchbox AI plays its first game, every matchbox contains equal numbers of each bead color, so for each board state every possible move is equally likely — even the bad ones.

When every empty grid space is equally likely on the AI's turn, who is more likely to win the first round?

On the first round, the Matchbox AI makes some bad moves and you win handily after $3$ turns. Now, it's time to help the AI learn from its mistakes by adjusting the number of beads in just the matchboxes that were used on that round.

Since you won, the plays made by the AI weren't effective, so each bead that was randomly drawn from the matchboxes is thrown away.

Suppose that, after $10$ games, the Matchbox AI has lost all $10$ games. How has its likelihood of winning or playing to a draw changed?

At first, the Matchbox AI mostly loses. But when it eventually wins, the beads should be adjusted so that the moves it made are more likely in future rounds: each bead drawn during play is returned to its box, along with $2$ additional beads of the same color.
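
The win and loss rules described so far can be written as a short update function. This is a minimal sketch under assumptions: each box is a dictionary of bead counts, a drawn bead is modeled as staying in its box until the game ends, and `moves_played` is a hypothetical record of which bead was drawn from which box. Draws are deliberately left out here.

```python
def reinforce(boxes, moves_played, ai_won):
    """Adjust only the boxes used this round.

    boxes: maps a board state to its box, e.g. {"pink": 3, "green": 3}
    moves_played: list of (state, color) pairs drawn during the round
    ai_won: True after a win, False after a loss
    """
    for state, color in moves_played:
        box = boxes[state]
        if ai_won:
            box[color] += 2  # drawn bead goes back, plus 2 more
        else:
            box[color] = max(0, box[color] - 1)  # drawn bead is discarded


# Hypothetical round in which a single box was used.
boxes = {"state-1": {"pink": 3, "green": 3}}
reinforce(boxes, [("state-1", "pink")], ai_won=False)
print(boxes["state-1"])  # {'pink': 2, 'green': 3}
```

Only the boxes that were actually opened during the round change, which is what lets the AI assign credit or blame to specific moves.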

Of course, in tic-tac-toe, it's possible for neither player to get three in a row and the game ends in a draw.

In fact, if the second player makes no mistakes, they can always arrange for this to happen. In a game between two expert players who make no mistakes (this situation is called "optimal play"), a draw is the best possible outcome.

After a game ends in a draw, how should you modify the number of beads in the boxes that were used?

Under these rules, the Matchbox AI starts out losing every game, but after about $100$ games, beads corresponding to many of the blatantly bad moves have been eliminated entirely.

As the AI gains experience, most of the games it plays end in a draw, and it even wins sometimes when its human opponent makes a mistake. (It turns out that tic-tac-toe isn't a very interesting game because a skilled player can always force a draw.) Eventually, the Matchbox AI never loses, and it always gains beads at the end of each round.

The number of beads in the matchbox corresponding to the opening move is a good measure of how well the Matchbox AI has learned the game, since a bead is drawn from this box every game. You can see on the plot below that, after an initial decrease, the number of beads in this box increases steadily, which means the AI is not losing.
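
A small simulation can track the same measure. The sketch below makes several assumptions that go beyond the text: the opponent plays random legal moves rather than skilled ones, every box starts with `start_beads=4` beads per empty spot, a draw adds `1` bead, boxes are keyed by the exact board (no merging of rotated or mirrored boards), and an empty box is treated as the AI resigning. It is a rough illustration, not a reconstruction of Michie's experiment, and its numbers will not match the plot.

```python
import random

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]


def winner(board):
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


def play_game(boxes, rng, start_beads):
    """Play one game: the AI is X and moves first, O moves at random."""
    board, history, player = "." * 9, [], "X"
    while winner(board) is None and "." in board:
        empty = [i for i, cell in enumerate(board) if cell == "."]
        if player == "X":
            # Create the box on first use, with equal beads for every move.
            box = boxes.setdefault(board, {i: start_beads for i in empty})
            if sum(box.values()) == 0:
                return "O", history  # empty box: the AI resigns
            move = rng.choices(list(box), weights=list(box.values()))[0]
            history.append((board, move))
        else:
            move = rng.choice(empty)
        board = board[:move] + player + board[move + 1:]
        player = "O" if player == "X" else "X"
    return winner(board), history


def train(games, seed=0, start_beads=4, win=2, draw=1, loss=-1):
    """Return the bead total in the opening box after each game."""
    rng = random.Random(seed)
    boxes, opening_totals = {}, []
    for _ in range(games):
        result, history = play_game(boxes, rng, start_beads)
        delta = win if result == "X" else loss if result == "O" else draw
        for state, move in history:
            boxes[state][move] = max(0, boxes[state][move] + delta)
        opening_totals.append(sum(boxes["." * 9].values()))
    return opening_totals


totals = train(games=500)
print(totals[:10], totals[-1])
```

Plotting `totals` against the game number gives a curve that can be compared with the one described above, bearing in mind that a random opponent is much easier to beat than a careful human.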

This simple strategy of building AI — reinforcing strategies that lead to a desirable outcome — is nearly universal. Some form of this principle is at work whenever an entity learns — even your own brain.

But how would a Matchbox AI do in a game more complicated than tic-tac-toe?

Like tic-tac-toe, chess is a strategy game that's played on a grid, where the board is visible to both players and there is no element of chance (e.g. moves aren't determined by rolling a die).

The Matchbox AI learned tic-tac-toe over the course of a few hundred games. If you built a Matchbox AI that learns chess with an identical learning algorithm, and it could make one move per second, roughly how long would it take before the Matchbox AI plays chess to a win or draw every time?

Hints:

• Tic-tac-toe has $304$ possible board states.
• The number of ways you can put just $12$ different chess pieces (and no duplicate pieces) on an $8\times 8$ grid is about $10^{21}.$
• The current age of the universe is about $4\times 10^{17}$ seconds.

In games more complex than tic-tac-toe, the number of different ways a game plays out can be mind-boggling. To learn chess, a Matchbox AI would require a prohibitive amount of time. For many years, expert-level AI in games like chess was unattainable for this reason.
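
As a rough worked estimate, assume the AI has to encounter each of the roughly $10^{21}$ piece arrangements from the hint at least once, and that it meets one new arrangement per second. Both are simplifying assumptions that give only a lower bound:

$$\frac{10^{21}\ \text{s}}{4\times 10^{17}\ \text{s}} = 2.5\times 10^{3}$$

That is about $2{,}500$ times the current age of the universe, and the hint counts only $12$ pieces, so a full chess set would take far longer.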

Today, however, computers can follow billions of individual instructions each second, and AI for games like chess is now within reach. Computers can also tell you who is in a photo, scan x-rays for signs of cancer, and compose a human-readable news summary. The space of possibilities in these tasks is comparable to the number of board states in chess — or bigger. The gap between what computers can do and what humans can do is shrinking.

The goal of this course is to explore one advance that made this possible: a computational paradigm called an artificial neural network that learns much like the Matchbox AI. In the next quiz, we'll look at a different kind of problem that eluded classical approaches to AI: computer vision.
