Computers can make decisions, and they can make them very, very fast. Right now, a computer is deciding what the solution to a mathematical equation is. Somewhere else, a computer is deciding whether to suspend someone's credit card to protect them from fraud, and another computer is deciding whether an image represents a stop sign or a bird.
An important part of computer science is understanding how computers can make the right decisions, or at least pretty good ones.
In the first two explorations of this course, you'll be learning about how computers use lots of simple decisions in order to make complex decisions quickly and precisely.
One of the ways computers (and sometimes humans) make decisions is with a structure called a decision tree. Decision trees encode a series of simple yes-or-no questions that you can follow in order to answer a more complex question. Here is a silly decision tree that helps you decide which of eight different creatures you're dealing with:
If a computer were using this diagram while looking at a creature that happened to be an adult blue whale, the computer would start at the top box, the root of the decision tree. The computer would ask whether the creature was smaller than a bicycle. The answer is definitely no: a blue whale is not smaller than a bicycle.
Next, the computer would follow the "no" path, which is the black arrow to the right. The next box contains the question "Does it swim?" The answer is definitely "yes": blue whales swim. So the computer would continue down the "yes" path, which is the white arrow to the left. The computer then correctly concludes that it's looking at a blue whale.
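The walk described above can be sketched in a few lines of Python. This is only an illustration: the `Question` class, the `classify` function, and the dictionary keys are made-up names, and only the two questions and the blue whale come from the diagram. The other leaves are placeholders, because the rest of the diagram is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Question:
    """One box in the tree: a yes-or-no question with two outgoing paths."""
    text: str
    ask: Callable[[dict], bool]
    yes: "Node"
    no: "Node"


# A node is either another question or a final answer (a leaf).
Node = Union[Question, str]


def classify(node: Node, creature: dict):
    """Follow yes/no paths from the root until a leaf is reached."""
    asked = []
    while isinstance(node, Question):
        answer = node.ask(creature)
        asked.append((node.text, answer))
        node = node.yes if answer else node.no
    return node, asked


# Only the path to the blue whale is taken from the lesson; the other
# two leaves are placeholders for parts of the diagram not shown here.
root = Question(
    "Is it smaller than a bicycle?",
    lambda c: c["smaller_than_bicycle"],
    yes="(placeholder: some small creature)",
    no=Question(
        "Does it swim?",
        lambda c: c["swims"],
        yes="blue whale",
        no="(placeholder: some large land creature)",
    ),
)

blue_whale = {"smaller_than_bicycle": False, "swims": True}
label, asked = classify(root, blue_whale)
print(label)       # the leaf that was reached
print(len(asked))  # how many questions were needed
```

The computer never looks at the whole tree at once. It only answers one simple question at a time and follows the matching arrow.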
A computer is using this decision tree. Which of these creatures will require the computer to ask the most questions?
The first question at the very top of the decision tree is an especially important one.
Imagine that you want to make a decision tree to distinguish the four faces above. If you want the root of the decision tree to split the faces into two groups of two, which of these questions should you ask?
The first question in your decision tree will split the faces into two distinct groups. What is a single question that you could ask of both groups of two in order to identify any single face?
None of the choices you were given were able to further split up both of the groups of two faces. You can nevertheless make a decision tree that distinguishes all four faces, because the tree can ask a different follow-up question based on the answer to the first question. This flexibility makes decision trees quite powerful.
A computer scientist can use this decision tree to write a simple face recognizer. The computer scientist first writes three simpler tests to detect glasses, long hair, and smiling. The decision tree organizes these simple tests, allowing a computer to distinguish between all the faces.
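Here is a rough sketch of how such a recognizer might be organized. Everything specific in it is an assumption: the face names, the dictionary format, and which question comes first are invented for illustration, and the three tests simply read a stored value instead of analyzing a real image. The point is the shape of the code, where the follow-up question depends on the answer to the first one.

```python
# Three simple tests. In a real recognizer each one would analyze an
# image; here they just read a precomputed value (an assumption).
def has_glasses(face: dict) -> bool:
    return face["glasses"]


def has_long_hair(face: dict) -> bool:
    return face["long_hair"]


def is_smiling(face: dict) -> bool:
    return face["smiling"]


def identify(face: dict) -> str:
    """A hypothetical four-face tree: the second question differs by branch."""
    if has_glasses(face):
        # Among the faces with glasses, smiling tells them apart.
        return "Face A" if is_smiling(face) else "Face B"
    # Among the faces without glasses, hair length tells them apart.
    return "Face C" if has_long_hair(face) else "Face D"


face = {"glasses": False, "long_hair": True, "smiling": True}
print(identify(face))
```

Each face is identified with two questions, even though no single follow-up question would work for both groups.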
The shape of a decision tree can make a big difference in what happens when you use it.
If your computer program is using this decision tree, how many questions, on average, will the computer program need to ask and answer in order to distinguish a random face?
The decision tree from the previous page distinguishes between eight faces with exactly three questions. This decision tree distinguishes between the exact same set of eight faces; however, it takes an average of questions to identify a random face with this decision tree.
How many of the faces take more than three questions to identify with this decision tree?
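One way to check this kind of count is to measure how deep each face sits in the tree, since the depth of a leaf is the number of questions needed to reach it. The sketch below compares two made-up shapes for eight faces: a balanced tree and a lopsided "chain". Neither is the tree from this lesson, and the names `f1` through `f8` are placeholders. A tree is written as a pair `(yes_branch, no_branch)` and a leaf as a string.

```python
def leaf_depths(tree, depth=0):
    """Return {leaf: number of questions asked to reach it}."""
    if isinstance(tree, str):
        return {tree: depth}
    yes_branch, no_branch = tree
    depths = leaf_depths(yes_branch, depth + 1)
    depths.update(leaf_depths(no_branch, depth + 1))
    return depths


def average_questions(tree):
    """Average questions per face, assuming each face is equally likely."""
    depths = leaf_depths(tree)
    return sum(depths.values()) / len(depths)


# Hypothetical balanced tree: every face is three questions deep.
balanced = ((("f1", "f2"), ("f3", "f4")), (("f5", "f6"), ("f7", "f8")))

# Hypothetical chain: each question peels off just one face.
chain = ("f1", ("f2", ("f3", ("f4", ("f5", ("f6", ("f7", "f8")))))))

print(average_questions(balanced))  # 3.0
print(average_questions(chain))     # 4.375
print(sum(1 for d in leaf_depths(chain).values() if d > 3))  # 5
```

Both shapes identify the same eight faces, but the chain needs more questions on average because most faces sit far from the root.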
This decision tree distinguishes six faces. Which face will be chosen if the decision tree is used to identify this different face:
The decision tree on the previous page did not do a very good job classifying this face. These faces are almost as different as they can be! One reason that the decision tree failed so badly was that the first few questions focused on hair color and the decision tree contained no faces with blond hair. The lack of blond-haired faces in the decision tree made it easier to badly misclassify a blond-haired face.
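The failure is easy to reproduce in code. The tree below is a made-up stand-in, not the one from the previous page: its six faces, their hair colors, and the glasses question are all assumptions. What it shares with the lesson's tree is that it asks about hair color first and was built without any blond-haired faces.

```python
def identify(face: dict) -> str:
    """Hypothetical six-face tree built from black, brown, and red hair only."""
    if face["hair"] == "black":
        return "Face 1" if face["glasses"] else "Face 2"
    if face["hair"] == "brown":
        return "Face 3" if face["glasses"] else "Face 4"
    # Every remaining known face has red hair, so the tree never checks.
    # Any other hair color silently lands here too.
    return "Face 5" if face["glasses"] else "Face 6"


newcomer = {"hair": "blond", "glasses": False}
print(identify(newcomer))  # the tree still confidently names a known face
```

A decision tree always ends at one of its leaves, so a face unlike anything the designer considered is still assigned to one of the known faces rather than being reported as unknown.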
This was a simple example, but if you pay attention to the news, you'll see many real-world examples of exactly the same kind of failure.
It can be funny when computers make mistakes because they were designed with limited information. For example, computer programs designed to identify pictures sometimes "hallucinate" sheep in every field. This is because the programs are designed based on a bunch of pictures taken by people on vacation, and most of the fields in those pictures have sheep in them.
Other cases of computers failing to make good decisions are more worrying. Many computer programs that do real-life facial recognition don't work well for people with darker skin. This can happen when the facial recognition program is designed around a bunch of pictures the designers had easy access to, and those pictures mostly contained white people.
Decision trees are useful tools for computer scientists. They turn simple yes-or-no decisions into more complex decisions involving many different options. Computers are often better at answering simple yes-or-no questions, so decision trees help computers manage more complexity.
Some computer scientists use decision trees designed by human experts to help computers make smarter choices. Other computer scientists use computer programs that do machine learning in order to create the best decision trees for solving a new problem. When you're using decision trees, the order of questions can make a big difference in the number of questions you have to ask.
Decision trees aren't the right tool for every problem. For example, how would you use a decision tree to sort a deck of cards?
In the rest of Computer Science Essentials, you'll learn about many of the other tools computer scientists use to solve problems.