This file has been truncated. show original
# Introduction to Deep Learning
In 2016 Joel Grus, a well-known data scientist went for a [job interview](http://joelgrus.com/2016/05/23/fizz-buzz-in-tensorflow/) at a major internet company. As is common, one of the interviewers had the task to assess his programming skills. The goal was to implement a simple children's game - FizzBuzz. In it, the player counts up from 1, replacing numbers divisible by 3 by 'fizz' and those divisible by 5 by 'buzz'. Numbers divisible by 15 result in 'FizzBuzz'. That is, the player generates the sequence
```1 2 fizz 4 buzz fizz 7 8 fizz buzz 11 ...```
What happened was quite unexpected. Rather than solving the problem with a few lines of Python code *algorithmically*, he decided to solve it by programming with data. He used pairs of the form (3, fizz), (5, buzz), (7, 7), (2, 2), (15, fizzbuzz) as examples to train a classifier for what to do. Then he designed a small neural network and trained it using this data, achieving pretty high accuracy (the interviewer was nonplussed and he did not get the job).
Situations such as this interview are arguably watershed moments in computer science when program design is supplemented (and occasionally replaced) by programming with data. They are significant since they illustrate the ease with which it is now possible to accomplish these goals (arguably not in the context of a job interview). While nobody would seriously solve FizzBuzz in the way described above, it is an entirely different story when it comes to recognizing faces, to classify the sentiment in a human's voice or text, or to recognize speech. Due to good algorithms, plenty of computation and data, and due to good software tooling it is now within the reach of most software engineers to build sophisticated models, solving problems that only a decade ago were considered too challenging even for the best scientists.
This book aims to help engineers on this journey. We aim to make machine learning practical by combining mathematics, code and examples in a readily available package. The Jupyter notebooks are available online, they can be executed on laptops or on servers in the cloud. We hope that they will allow a new generation of programmers, entrepreneurs, statisticians, biologists, and anyone else who is interested to deploy advanced machine learning algorithms to solve their problems.
## Programming with Data
Let us delve into the distinction between programing with code and programming with data into a bit more detail, since it is more profound than it might seem. Most conventional programs do not require machine learning. For example, if we want to write a user interface for a microwave oven, it is possible to design a few buttons with little effort. Add some logic and rules that accurately describe the behavior of the microwave oven under various conditions and we're done. Likewise, a program checking the validity of a social security number needs to test whether a number of rules apply. For instance, such numbers must contain 9 digits and not start with 000.
It is worth noting that in the above two examples, we do not need to collect data in the real world to understand the logic of the program, nor do we need to extract the features of such data. As long as there is plenty of time, our common sense and algorithmic skills are enough for us to complete the tasks.
As we observed before, there are plenty of examples that are beyond the abilities of even the best programmers, yet many children, or even many animals are able to solve them with great ease. Consider the problem of detecting whether an image contains a cat. Where should we start? Let us further simplify this problem: if all images are assumed to have the same size (e.g. 400x400 pixels) and each pixel consists of values for red, green and blue, then an image is represented by 480,000 numbers. It is next to impossible to decide where the relevant information for our cat detector resides. Is it the average of all the values, the values of the four corners, or is it a particular point in the image? In fact, to interpret the content in an image, you need to look for features that only appear when you combine thousands of values, such as edges, textures, shapes, eyes, noses. Only then will one be able to determine whether the image contains a cat.