Simplest intro to Artificial Intelligence

Miguel Cardoso
9 min read · Sep 10, 2023


Artificial Intelligence (AI) has become more and more of a buzzword. It impacts almost everything we do, and although it was already an increasingly popular topic, ChatGPT made it even more so; it seems that everyone is talking about it. In this post we deconstruct AI: we explain its origins and where it stands today, and we leave where it might go next to the reader's imagination.

Brief History of AI

The term “Artificial Intelligence” was coined by John McCarthy in 1955 [1], and back then, AI was defined as “the science and engineering of making intelligent machines”. For the sake of this post, let’s assume the following definition of intelligence:

“the ability to learn, understand, and make judgments or have opinions that are based on reason” [2]

Interestingly enough, what could be considered the first neural network, the fundamental building block of deep learning algorithms, dates back to 1943 [3]. Further ahead we will discuss exactly what neural networks are.

Distinguishing the Terms

ChatGPT, MidJourney and plenty of other contemporary advancements brought AI even closer to the world. Both technical and non-technical people are talking about it, and so is the media. AI was already present in our lives, but now we are more aware of it.

With that in mind, let’s first focus on understanding and distinguishing the terms that are often used interchangeably yet inaccurately: Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL).

🤖 Artificial Intelligence (AI): In practical terms, AI refers to the development of machines that can mimic human intelligence. It covers a spectrum of activities, ranging from simple automated actions to highly intricate decision-making. AI aims to enable machines to perform tasks and make judgments in ways that resemble human thinking processes. The key word here is resemble. By its definition, an AI algorithm does not need to reason or learn, as many might think. A straightforward ‘do A if this, else do B’, i.e., an if/else control flow, can be considered an AI algorithm if it gets complex enough to resemble human-like decisions. Likewise, local search algorithms [4] are examples of artificial intelligence algorithms with no learning capabilities; this family of algorithms performs generic optimization to find something, for example, the best path between two points.
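To make this concrete, here are two toy sketches in Python. Both are purely illustrative: the keywords, categories, and function being optimized are all made up, not taken from any real system.

import random

# 1) A rule-based "AI": plain if/else decisions that merely resemble reasoning.
#    (The keywords and categories are made up for illustration.)
def route_support_ticket(message: str) -> str:
    text = message.lower()
    if "refund" in text or "charge" in text:
        return "billing"
    if "crash" in text or "error" in text:
        return "technical"
    return "general"

print(route_support_ticket("The app shows an error on startup"))  # technical

# 2) A local search (hill climbing) that minimizes f(x) = (x - 3)^2 by trial
#    and error: no data, no learning, just iterative improvement.
def f(x):
    return (x - 3) ** 2

current = 0.0
for _ in range(1000):
    candidate = current + random.choice([-0.1, 0.1])
    if f(candidate) < f(current):  # keep the move only if it improves the result
        current = candidate

print(round(current, 2))  # should print a value close to 3.0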

📚 Machine Learning (ML): If we add learning capabilities, we have Machine Learning. Instead of explicitly defined rules or well-defined functions, ML systems learn from data, improving their decision-making over time based on what they experience. However, more often than not the data needs to be heavily pre-processed, which is time consuming and error prone. For example, if we want to detect spam emails, we first need to convert each email into its corresponding bag-of-words representation, i.e., a count of how many times each word appears.
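As a rough illustration of that pre-processing step, here is a minimal bag-of-words sketch in plain Python; the example email text is made up.

from collections import Counter

# Count how many times each word appears in a (made-up) email.
email = "win money now win prizes"
bag_of_words = Counter(email.lower().split())
print(bag_of_words)  # Counter({'win': 2, 'money': 1, 'now': 1, 'prizes': 1})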

🧠 Deep Learning (DL): Deep Learning, a specialized branch of Machine Learning, pushes learning from data even further. It revolves around “artificial neural pathways”, the so-called neural networks: multiple interconnected layers that mimic, in some sense, the synaptic connections within the human brain. These algorithms autonomously extract insights from raw data, enabling them to excel in tasks like image classification, text generation, speech recognition and much more, through pattern and feature recognition.

In summary, artificial intelligence involves creating systems that imitate human intelligence, while machine learning, a subset of artificial intelligence, enables machines to learn from data and enhance their decision-making. Deep learning, a subset of machine learning, employs complex neural networks, which allows these algorithms to substantially outperform their classic machine learning counterparts on many tasks.

Range of Tasks and Applications

AI, ML, and DL are on our phones, on our computers, in our schools, and in our public services; they are everywhere. From recognizing whether there is a dog in a picture, to saving lives, and even powering self-driving vehicles, they truly can be found everywhere, in many shapes and forms.

Learning Paradigms and Techniques 📚💡

Within the realm of machine learning, three fundamental paradigms drive innovation:

Supervised Learning 🧑‍🏫📊: This encompasses tasks like classification, where models assign labels to data, and regression, where models predict continuous values. These techniques empower spam detection, medical diagnoses, financial forecasting and so much more.

Self-Supervised Learning 🤖📘: An exciting approach, popularized further by transformer models, in which models learn the underlying data structure by solving pretext tasks. These models have revolutionized language translation and understanding, propelling cross-cultural communication forward. In a nutshell, the model learns using only raw data, without any labels whatsoever. This is part of what powers ChatGPT, a transformer-based model that was trained on an enormous amount of text, and whose job was simply to predict the next word given everything it had read before (a toy sketch of this idea follows this list).

Unsupervised Learning 🧩🔍: This paradigm is about uncovering hidden patterns in data without labeled outcomes. It plays a pivotal role in tasks like customer segmentation for targeted marketing campaigns.
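To give a feel for the self-supervised idea mentioned above, here is a toy sketch that “learns” to predict the next word purely from raw text, using simple bigram counts. Real models like ChatGPT are vastly more sophisticated, and the example text here is made up.

from collections import Counter, defaultdict

# Raw, unlabeled text: the "label" for each word is simply the word that follows it.
text = "the cat sat on the mat the cat slept on the sofa"
words = text.split()

next_word_counts = defaultdict(Counter)
for current_word, following_word in zip(words, words[1:]):
    next_word_counts[current_word][following_word] += 1

# Predict the most likely next word after "the".
print(next_word_counts["the"].most_common(1))  # [('cat', 2)]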

The crucial difference between supervised learning and unsupervised learning is that for supervised learning we need labels, i.e., examples of what we want to learn, while in unsupervised learning the patterns are uncovered automatically without any explicit goal in mind. In supervised learning we ‘train’ the models, which in a sense mimics how a human would learn: we give them data, we give them an objective and what each data point is supposed to mean, and we let them figure out how the input data correlates with the objective at hand. A practical example is detecting spam email: we give the algorithm a bunch of normal and spam emails, we tell it which emails are which, and its goal is to learn what defines a spam email. After it learns, we can feed it new emails and see what it says. Technology like this powers Gmail, Hotmail, and other mail providers, preventing us from being barraged with spam and scam emails. Yet, as you might suspect, they are not perfect, since we still get some weird emails now and then.
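Here is a minimal sketch of that supervised spam example, assuming scikit-learn is installed. The emails and labels are made up and far too few to train a useful model, but they show the train-then-predict flow.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up training data: 1 = spam, 0 = not spam.
emails = [
    "win money now claim your prize",
    "limited offer click to win",
    "meeting rescheduled to friday",
    "please review the attached report",
]
labels = [1, 1, 0, 0]

# Bag-of-words representation, as described earlier.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(emails)

# Train a classifier on the labeled examples.
model = MultinomialNB()
model.fit(X, labels)

# Ask the trained model about a new, unseen email.
new_email = vectorizer.transform(["claim your free prize now"])
print(model.predict(new_email))  # [1] -> flagged as spam

In real systems the training set contains millions of emails and far more careful pre-processing, but the overall loop is the same.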

Algorithms🔧

While sophisticated techniques grab the spotlight, the utility of simple algorithms can’t be overlooked. As a matter of fact, the following three algorithms are still widely used:

Naive Bayes 📊🔤: Built on probabilistic principles, it shines in text classification and spam filtering. In sentiment analysis, it gauges user emotions, aiding businesses in understanding customer opinions.

Logistic Regression 📈🔬: A versatile method for predicting the probability of an outcome. In healthcare, it helps flag the likelihood of medical conditions, while in finance it can help predict, for example, whether a stock will move up or down.

Decision Trees 🌳🔍: A visual and intuitive algorithm used for both classification and regression tasks that leverages information theory to learn. They work by recursively splitting the dataset into subsets based on the value of a chosen feature, aiming to create nodes [5] that best separate the data according to the target variable.

It is important to note that these are just examples; there is an enormous number of ML algorithms, as well as many variants of seemingly similar ones.
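As a quick, hedged illustration of two of these algorithms in practice, the sketch below fits a logistic regression and a decision tree on the small Iris dataset that ships with scikit-learn (assuming scikit-learn is installed):

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small, classic flower-classification dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Fit two of the algorithms mentioned above and compare their test accuracy.
for model in (LogisticRegression(max_iter=1000), DecisionTreeClassifier(random_state=42)):
    model.fit(X_train, y_train)
    print(type(model).__name__, round(model.score(X_test, y_test), 2))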

When it comes to Deep Learning, things do get a bit trickier; however, the fundamental building block of every deep learning algorithm is the neural network. These networks are composed of interconnected nodes, resembling the structure of the human brain. They process data through input, hidden, and output layers, learning patterns collectively. The weights and biases of the connections play a crucial role and are adjusted during training to minimize the difference between the network’s output and the desired output. Each layer consists of one or more nodes that apply the function:

f(x) = wx + b

Where w stands for the weights, which are what the neural network adjusts as it learns, b stands for the bias, and x is the input data of that node.
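As a tiny numerical example of that function, consider a single node with three made-up inputs; in practice a non-linearity such as ReLU is usually applied on top, as in the code examples further below.

# A single node with three (made-up) inputs.
w = [0.1, 0.4, -0.2]   # weights, adjusted as the network learns
x = [0.5, -1.0, 2.0]   # input data arriving at this node
b = 0.3                # bias

output = sum(wi * xi for wi, xi in zip(w, x)) + b   # f(x) = wx + b
activated = max(0.0, output)                        # ReLU keeps only positive signals

print(output)     # roughly -0.45
print(activated)  # 0.0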

Within Deep Learning there are several architectures, and each architecture caters to specific tasks. For example, Convolutional Neural Networks (CNNs) are normally used for image-related tasks, and quite successfully so, while Recurrent Neural Networks (RNNs) and Transformers are normally used for text-related tasks. It is relevant to note that there are several variations of each architecture.

Getting Started

All the theory might sound daunting and complicated, but getting started and building your first Deep Learning model can actually be quite simple, assuming you have at least some programming experience. When it comes to building Deep Learning models in Python, the go-to language for the field, there are two relevant frameworks to help you get started fast.

PyTorch [6]: A framework originally created by Meta (Facebook). It’s great for experimenting and learning, and many researchers use it because it’s easy to work with.

import torch
import torch.nn as nn

# Define the neural network class
class NeuralNetwork(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(NeuralNetwork, self).__init__()
        self.layer1 = nn.Linear(input_size, hidden_size)
        self.layer2 = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        x = torch.relu(self.layer1(x))
        x = self.layer2(x)
        return x


# Define the input size, hidden size, and output size
input_size = 10
hidden_size = 20
output_size = 5

# Create an instance of the neural network
model = NeuralNetwork(input_size, hidden_size, output_size)

# Create some random input data
input_data = torch.randn(3, input_size)

# Get the model's predictions
output_data = model(input_data)

print("Input data shape:", input_data.shape)
print("Output data shape:", output_data.shape)

TensorFlow [7]: A framework originally created by Google. With it, you tend to plan things out in advance and then build your project, which makes it a good fit for things that need to work reliably, like real-world applications.

import tensorflow as tf
from tensorflow.keras.layers import Dense

# Define the neural network model
class NeuralNetwork(tf.keras.Model):
    def __init__(self, input_size, hidden_size, output_size):
        super(NeuralNetwork, self).__init__()
        self.layer1 = Dense(hidden_size, activation='relu')
        self.layer2 = Dense(output_size)

    def call(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        return x


# Define the input size, hidden size, and output size
input_size = 10
hidden_size = 20
output_size = 5

# Create an instance of the neural network
model = NeuralNetwork(input_size, hidden_size, output_size)

# Create some random input data
input_data = tf.random.normal((3, input_size))

# Get the model's predictions
output_data = model(input_data)

print("Input data shape:", input_data.shape)
print("Output data shape:", output_data.shape)

Both frameworks do similar things but in slightly different ways. You can pick the one that feels more comfortable for your style of work or the kind of projects you want to create. In the code shown above we defined a simple neural network with one hidden layer, exactly as seen in the image below.

[Image: a simple neural network with one hidden layer]

Contemporary Hype and Impact

In recent years, the applicability of and hype around AI have reached new heights. The availability of large datasets and powerful computing resources has fueled breakthroughs in both research and industry. Powerful recommendation systems like the one used by TikTok, self-driving cars, advanced medical diagnostic tools, and much more have demonstrated the significant impact AI can have on various aspects of our lives. We are reaching a point where the main question is no longer “Can it be done?”, but instead, “Should it be done? What are the possible impacts?” Thankfully, there are researchers and entire companies focusing on AI safety, fairness, and ethics, but there is still much to be done, and that topic deserves its own couple of write-ups.

Conclusion

The world of Artificial Intelligence (AI) is captivating, thrilling, multifaceted, and booming, tracing its origins back to the visionary minds of the past. These innovations have found their way into countless applications, revolutionizing industries and reshaping the way we interact with technology. From theoretical research to tooling used directly both in industry and by hobbyists, AI is here to serve our needs, improving our day-to-day lives and increasing our productivity. The AI community is big, expanding, and teeming with open-source resources and free learning materials; a good example is Hugging Face [8], one of the most popular AI community hubs in the world.

However, as the AI wave surges forward, it’s crucial to ponder its ethical implications and potential impacts, ensuring responsible development for the betterment of society. This requires both a deep understanding of the technology and the mindset to recognize that it can affect real people, thinking deeply about what can go wrong, because something certainly will. This enormous topic requires its own set of posts.

As you delve into the world of AI, remember that understanding its history, mechanisms, and applications is a journey that promises both knowledge and transformative opportunities. So, seize the chance to explore, innovate, and contribute to the unfolding story of AI’s evolution.
