Submitted by zslotyi on Thu, 07/02/2020 - 11:08

Ever since computers have existed, we have kept hearing that computers are just machines, and as such they will only do as they are told. (If you’ve tried programming and your experience doesn’t quite confirm this notion, let me elaborate a little: computers will do exactly as they are told, and definitely will not do what we originally meant to tell them to do. Does it make sense now?)

Back to our original line of thought, for the computer to do a task for us, we not only have to tell it exactly what we expect as a result, but also how it can get there. In addition, we need to tell it all this in a manner it is capable of understanding.

Let's look at a simple task as an example!

Take a snippet of text (call it ‘string’ for simplicity): “XY”. For some mysterious reason, we want the computer to invert this snippet.

So how do we go about that?

If we have a command “invert” in our (or rather in the computer’s) vocabulary, meaning it knows what we mean by “inverting” a string, our job is extremely simple. We just give it the instruction: invert XY, then we sit back and take a big sip of the cold lemonade while waiting for the output.
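Python, for instance, happens to have something close to such an entry in its vocabulary. As a small illustration (the variable names are just placeholders chosen for this example), slicing a string with a step of -1 reads it back to front:

```python
# Python strings support slicing; a step of -1 walks the string backwards.
text = "XY"
inverted = text[::-1]
print(inverted)  # prints YX
```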

But what if this entry is not in the dictionary? What if we don't have the invert command at our fingertips? What if no one in the world has ever bothered to teach our computer to reverse strings before?

In this case, we have no choice but to take our machine by the hand and guide it through the incredibly difficult task all the way to its solution. 

(Note: inverting a string is a typical question on a test or, for example, at an entry-level job interview. As with most such tasks, there are several possible solutions; the one outlined below is only one of them.)

  1. Split the string into its components (say letters or characters for simplicity): X , Y
  2. Store these characters as separate elements in a structure that remembers both the elements themselves and their order. Let's say we create an ordered list, whose elements are (1) X, (2) Y
  3. Create another instance of the same structure, but this time empty
  4. Select the last element of the first list and store it in the first place of the second list
  5. Select the element of the first list that’s immediately before the one we selected last time, and store it in the first available, empty place of the second list.
  6. Repeat step (5) until you reach the beginning of the first list, that is, until every element has been copied.
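The steps above can be sketched in Python roughly as follows. This is only one possible reading of the list: the function name `invert_string` and the variable names are made up for this illustration, and the final joining of the characters back into a string is an extra step the list leaves implicit.

```python
def invert_string(text):
    """Invert a string by following the six steps literally."""
    # Steps 1-2: split the string into characters, kept in an ordered list.
    first_list = list(text)
    # Step 3: create a second, empty list.
    second_list = []
    if not first_list:
        return ""  # nothing to invert
    # Step 4: take the last element of the first list and store it first.
    position = len(first_list) - 1
    second_list.append(first_list[position])
    # Steps 5-6: keep stepping one element back until the first list is used up.
    while position > 0:
        position -= 1
        second_list.append(first_list[position])
    # Extra step (assumption): glue the characters back into one string.
    return "".join(second_list)


print(invert_string("XY"))   # prints YX
print(invert_string("xyn"))  # prints nyx
```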

If everything went well, we have just taught our computer how to invert strings in six steps. If you look closely, steps 4-6 are quite similar to each other, almost redundant, so with a little extra effort it would be easy to streamline the algorithm further.
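One way such streamlining might look, sketched in Python: since steps 4-6 all do the same thing ("take the element one position earlier and append it"), a single loop can cover them. The function name `invert_string_streamlined` is hypothetical, and this is just one of several possible simplifications.

```python
def invert_string_streamlined(text):
    # One loop replaces steps 4-6: walk the positions from last to first.
    result = []
    for position in range(len(text) - 1, -1, -1):
        result.append(text[position])
    return "".join(result)


print(invert_string_streamlined("XY"))  # prints YX
```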

And this algorithm is general and versatile, too. If we follow these same steps, the computer will faithfully invert any Hungarian, French or English text, whether it is a short snippet or War and Peace. (There are, of course, a few nuances here and there, but let’s ignore them to keep things simple.)


The underlying problem, however, remains. For the computer to solve our task, we had to teach it how to do it. And even for a simple task like this, the solution looked pretty complicated. What about more complex tasks?

Not to mention tasks that we ourselves cannot easily solve, or whose solutions cannot be easily described, such as recognizing a childhood friend or diagnosing a disease from an X-ray image.

This is where Machine Learning comes in handy.

Let’s look at the previous task again, but now from a different angle: as if it weren’t only the computer whose dictionary lacks the word “invert”. We don’t have it, either.

So, what do we do to show the computer what it is that we want?

Exactly that: show it.

We want our algorithm, whatever it may be, to output ‘YX’ every time we give it ‘XY’ as an input.

Instead of the previous step-by-step guide (which we can call the rules-based method for simplicity), we now know only the two endpoints of the procedure, and we want the computer to figure out what’s in between.

The good news is that machine learning allows us to accomplish just that.

However, there are a few pitfalls. Let’s have a look at some of them.


One example is never enough

Looking at the input XY and the resulting output, YX, any 12-year-old with an average IQ will tell you what the task was. What could possibly go wrong?

For one thing, a single example is not enough to define the task clearly. Think about some of the ways we could get the same expected output starting from the same input.

XY -> YX

Did we just swap the characters of the input string? But which ones? Only the ones in uppercase or all of them? Just the first and the last ones, or all of them? Maybe instead of swapping them, we put them in reverse-alphabetical order?

Based on our one example, these questions cannot be answered.

So let's find another example and show it to the computer.

xyn -> nyx

Okay, this has clarified some of our issues: we now know that we swap lowercase characters, too, and we can safely rule out reverse-alphabetical order. However, we still don't know whether we need to swap all the characters or only the first and last ones.

As trivial as these examples are, they illustrate that in order to describe a task accurately, and leave enough room for the computer to “learn”, we need a sufficiently large set of examples.
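The narrowing-down described in this section can be mimicked with a few lines of Python. This is not machine learning proper, only a sketch of how each new example rules out candidate explanations. The four candidate rules, their names, and the third example ("abcd" -> "dcba") are assumptions made up for this illustration; only the first two examples come from the text.

```python
def reverse_all(s):
    return s[::-1]

def swap_first_and_last(s):
    if len(s) < 2:
        return s
    return s[-1] + s[1:-1] + s[0]

def reverse_alphabetical(s):
    return "".join(sorted(s, reverse=True))

def reverse_uppercase_only(s):
    # Reverse the uppercase letters among themselves; leave the rest in place.
    uppercase = [c for c in s if c.isupper()]
    return "".join(uppercase.pop() if c.isupper() else c for c in s)

CANDIDATES = [reverse_all, swap_first_and_last,
              reverse_alphabetical, reverse_uppercase_only]

def consistent_with(examples):
    """Return the names of the candidate rules that reproduce every example."""
    return [
        rule.__name__
        for rule in CANDIDATES
        if all(rule(given) == expected for given, expected in examples)
    ]

examples = [("XY", "YX")]
print(consistent_with(examples))   # all four candidates still fit

examples.append(("xyn", "nyx"))
print(consistent_with(examples))   # reverse_all and swap_first_and_last remain

examples.append(("abcd", "dcba"))  # hypothetical third example, not from the text
print(consistent_with(examples))   # only reverse_all remains
```

Even in this toy setting it takes three examples to single out one rule among four; with a richer space of possible rules, the number of examples needed grows accordingly.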

This is especially true of the problems machine learning is used for in everyday practice, which are usually slightly more complex than swapping strings, to put it mildly. Accordingly, when we think about the volume of data we need, we don’t just add or multiply. We shift orders of magnitude.

For a relatively simple application, a set of a few hundred examples may yield good results, but for commercial applications such as the iPhone's face recognition or certain Natural Language Processing (NLP) applications, training on hundreds of millions or even billions of examples is not uncommon.


The structure still needs to be built

The computer can, within certain limits, figure out what it is essentially supposed to do even if it sees only the input and the desired output, provided it sees enough pairs of them. As for the “how”, we don't even necessarily have to worry about it. Unless there are deeper considerations (which of course there are), as long as the accuracy of the model satisfies our needs, we just go with it and leave it at that.

But the structure of the machine itself still needs to be built.

We need to place structures between the input and the output whose components allow the computer not only to solve but also to figure out the tasks at hand.

Although there is a wide variety of such structures in the field of machine learning, the last ten years have seen the rise of the ones that are commonly called ‘neural networks’.

Neural networks, much like the nervous system with its neurons, consist of separate but closely connected units. While these units operate separately, they by no means operate in isolation: when properly coordinated, a bunch of these neurons is capable of performing amazingly complex tasks.
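To make the idea of coordinated units a little more tangible, here is a toy sketch in Python. It assumes one common textbook model of a unit (a weighted sum of the inputs plus a bias, followed by a step threshold), and the weights are picked by hand for this illustration rather than learned from examples. The names `neuron` and `tiny_network` are hypothetical. Three such units together compute "either, but not both", something no single unit of this kind can do on its own.

```python
def neuron(inputs, weights, bias):
    """A single unit: weighted sum of the inputs plus a bias, then a step threshold."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def tiny_network(a, b):
    # Two units look at the same pair of inputs...
    either = neuron([a, b], weights=[1, 1], bias=-0.5)  # fires if a OR b is 1
    both = neuron([a, b], weights=[1, 1], bias=-1.5)    # fires if a AND b are 1
    # ...and a third unit combines their outputs: "either, but not both".
    return neuron([either, both], weights=[1, -2], bias=-0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", tiny_network(a, b))
```

In a real neural network the weights and biases are not set by hand; finding them from input-output examples is exactly the "learning" part.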

The degree of similarity between neural networks and neurons in the nervous system is a debate for another day (hint: the similarity is rather superficial and limited), but the name adequately reflects the structure and complexity of the networks themselves. It also helps illustrate the concept that while we explicitly see the input and the output, we can’t really directly access the details of what’s going on “inside the black box”.

Neural networks, like nervous systems in nature, come in a variety of forms, structures, and complexities. Their construction is usually optimized for the task they perform, so some of them consist of only a few neurons, while others consist of hundreds of thousands or millions.

Their research and development has now become a completely separate field of science, one that is expanding at an amazing pace.

The site you are reading is one of the gazillion websites on the internet inviting you on an adventure in this area.

If you are intrigued enough, read on.