The field of artificial intelligence is gaining momentum in science and technology. Its main focus today is the development of computer systems (machines) capable of performing tasks that humans can do, such as recognizing objects, understanding and decoding human speech, and making decisions, capabilities that are relevant to organizations large and small.

Narrow AI: This is the field of machines designed to perform a single task. Once the model the machine runs is well trained for that task, it is not general and cannot be reused in other areas. This is the kind of AI we work with today, such as the Google Translate application.

Artificial General Intelligence (AGI): This is the field of machines designed to perform any intellectual task a person can perform. Such a system would be more broadly aware and would make decisions the way humans do. Scientists and researchers are currently wrestling with the very difficult work, involving complex computation, needed to develop and refine AGI. Various forecasts estimate that such machines may appear between 2029 and 2049, and there is even a chance we will never get there. The big hope is that AGI can be developed in about twenty years, but serious challenges, such as the high energy and compute demands of processing big data, make this difficult. Today's most powerful machines must also cope with problems such as catastrophic forgetting, a sudden loss of previously learned knowledge that can affect even the most advanced deep learning algorithms we know today.

Superintelligence: This is the field of machines designed to perform any intelligent task beyond human-level performance, in all areas, especially complex real-time decision making, general wisdom, problem solving, and creativity.

Classical Artificial Intelligence: This is the field of machines designed to perform relatively simple tasks, using rule-based systems and search algorithms. These include uninformed (blind) search as well as more sophisticated informed search, such as the A* algorithm. These algorithms created a strong foundation that helped develop more advanced approaches suited to large search spaces and large data systems.

Artificial intelligence receives heavy media attention and has been influential in recent times across all sectors: healthcare, financial services, retail, marketing, transportation, security, manufacturing, travel, and more.

The accelerated rise of Big Data, fueled by the Internet revolution, smart mobile devices, and social media, has enabled AI algorithms such as machine learning and deep learning to leverage that data and perform their tasks successfully. Combined with relatively inexpensive yet powerful and stable hardware, such as graphics processing units (GPUs), this has allowed AI to evolve into more complex architectures.


Machine learning is defined as the field of AI that applies statistical methods to enable computer systems (machines) to learn from data in order to perform a task.

Key Terms to Understand

Features / Attributes: Used to represent the data in a form that algorithms can understand and process.

Entropy: The amount of uncertainty in a random variable.

Information Gain: The reduction in uncertainty (entropy) obtained from observing an additional piece of information, such as the value of a feature.
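
These two quantities are easy to compute directly from their definitions. The sketch below (an illustration, not from the original text) measures the entropy of a set of labels and the information gain of a split:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(labels, groups):
    """Reduction in entropy after splitting `labels` into `groups`."""
    total = len(labels)
    weighted = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - weighted

# A fair coin toss carries one full bit of uncertainty.
print(entropy(["H", "T"]))                               # 1.0
# A perfect split removes all uncertainty: gain equals the original entropy.
print(information_gain([0, 0, 1, 1], [[0, 0], [1, 1]]))  # 1.0
```

This is exactly the criterion decision-tree learners use to pick the most informative feature to split on.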

Supervised Learning: A machine learning algorithm that works with labeled data.

Unsupervised Learning: A machine learning algorithm capable of detecting hidden patterns in unlabeled data.

Semi-Supervised Learning: A machine learning algorithm capable of identifying elements when only a small portion of the data is labeled.

Loss Function: The difference between ground truth and what the algorithm has learned. When using a machine learning algorithm, the goal is to minimize the loss function so that the algorithm generalizes and performs well in unseen scenarios.
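
As a concrete sketch (my own toy example, not from the original text), one common loss function is the mean squared error, the average squared gap between ground truth and predictions:

```python
def mse_loss(y_true, y_pred):
    """Mean squared error between ground-truth values and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

print(mse_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 0.0, perfect predictions
print(mse_loss([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))  # larger loss, worse fit
```

Training amounts to adjusting the model's parameters so that this number shrinks.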


Regression, Clustering and Classification are the 3 main areas of machine learning algorithms.

Linear Regression: An area of machine learning that models the relationship between two or more variables with continuous values.
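
A minimal sketch of fitting such a model with ordinary least squares, using hypothetical data generated from roughly y = 2x + 1:

```python
import numpy as np

# Toy data (hypothetical values with a little noise around y = 2x + 1).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Ordinary least squares: find the slope and intercept minimizing squared error.
A = np.column_stack([x, np.ones_like(x)])
(slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
print(slope, intercept)  # close to 2 and 1
```

The fitted line can then predict the continuous output for any new x.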

Logistic Regression: A classification technique that models the logit function as a linear combination of features. Binary logistic regression deals with situations where the predicted variable has two outcomes ('0' or '1'). Multinomial logistic regression deals with situations where the predicted variable can take several different values.
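
A minimal binary logistic regression can be fit by gradient descent on the log-loss. This sketch (toy one-dimensional data of my own, not from the original text) learns a threshold at zero:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 1-D binary problem: points below 0 are class 0, above 0 are class 1.
x = np.array([-3.0, -2.0, -1.0, 1.0, 2.0, 3.0])
y = np.array([0, 0, 0, 1, 1, 1])

# Fit logit(p) = w*x + b by gradient descent on the log-loss.
w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    p = sigmoid(w * x + b)
    w -= lr * np.mean((p - y) * x)   # gradient of the log-loss w.r.t. w
    b -= lr * np.mean(p - y)         # gradient w.r.t. b

preds = (sigmoid(w * x + b) >= 0.5).astype(int)
print(preds)  # matches y on this separable toy set
```

The model outputs a probability; thresholding it at 0.5 turns the regression into a classifier.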

Reinforcement Learning: An area that deals with modeling agents in an environment that continuously rewards the agent for making the right decision. This training is what turns the agent into a strong player; the agent can also be penalized for making a wrong move, much like in a game of chess. After training, the agent can play against a human in a real game.
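
The reward/penalty loop above can be sketched with tabular Q-learning on a hypothetical toy task (my own example): a five-state corridor where the agent earns a reward only on reaching the rightmost state.

```python
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.3
rng = np.random.default_rng(0)

for _ in range(200):                # training episodes
    s = 0
    while s != 4:
        # Epsilon-greedy: mostly exploit the best known action, sometimes explore.
        a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
        s2 = max(0, s - 1) if a == 0 else min(4, s + 1)
        r = 1.0 if s2 == 4 else 0.0
        # Bellman update: nudge Q toward reward plus discounted future value.
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2

# After training, the greedy policy in states 0-3 is to move right.
print(Q.argmax(axis=1)[:4])  # [1 1 1 1]
```

The table Q estimates the long-term value of each action in each state; acting greedily with respect to it is the learned behavior.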

Decision Trees: an algorithm to learn decision rules inferred from the data. These rules are then followed for decision making.

K-Means: An unsupervised approach to group (or cluster) different instances of data based on their similarity to each other. An example is grouping a population of people based on similarity.
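
A minimal sketch of the standard k-means procedure (Lloyd's algorithm) on toy one-dimensional data with two obvious groups (my own illustrative values):

```python
import numpy as np

data = np.array([1.0, 1.2, 0.8, 9.8, 10.0, 10.2]).reshape(-1, 1)
centroids = data[[0, 3]].copy()     # initialize from two points, one per group

for _ in range(10):
    # Assignment step: each point joins its nearest centroid.
    dists = np.abs(data - centroids.T)          # shape (n_points, k)
    labels = dists.argmin(axis=1)
    # Update step: each centroid moves to the mean of its members.
    centroids = np.array([data[labels == k].mean(axis=0) for k in range(2)])

print(centroids.ravel())  # close to [1.0, 10.0]
```

The two steps alternate until the cluster assignments stop changing.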

Support Vector Machines (SVM): A classification algorithm that draws a separating hyperplane between two classes of data. Once trained, an SVM model can be used as a classifier on unseen data.

Boosting and Ensemble: A method that takes several weak learners, each performing poorly on its own, and combines them into a strong classifier.
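
The simplest form of ensembling is majority voting, sketched below with three hypothetical weak threshold rules of my own (boosting goes further by reweighting the data between learners, which this sketch omits):

```python
from collections import Counter

# Three weak "decision stump" rules, each only partly right on its own.
def stump_a(x): return 1 if x > 2 else 0
def stump_b(x): return 1 if x > 4 else 0
def stump_c(x): return 1 if x > 6 else 0

def majority_vote(x):
    """Combine the weak learners by taking the most common prediction."""
    votes = [stump_a(x), stump_b(x), stump_c(x)]
    return Counter(votes).most_common(1)[0][0]

# True rule in this toy setup: label is 1 when x > 4. The vote recovers it
# even though two of the three stumps disagree near the boundary.
data = [(1, 0), (3, 0), (5, 1), (7, 1)]
accuracy = sum(majority_vote(x) == y for x, y in data) / len(data)
print(accuracy)  # 1.0
```

Combining many imperfect rules this way averages out their individual mistakes.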


Principal Component Analysis (PCA): A method of reducing the dimensionality of the data while preserving as much of its structure as possible. It is used to discard redundant and unnecessary information in the data while retaining the features that explain most of it.
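
A minimal PCA sketch via the singular value decomposition, on hypothetical 2-D data of my own that varies mostly along one direction:

```python
import numpy as np

# Toy 2-D data: the second coordinate is mostly a scaled copy of the first.
rng = np.random.default_rng(0)
t = rng.normal(size=100)
X = np.column_stack([t, 0.5 * t + 0.1 * rng.normal(size=100)])

# PCA via SVD of the centered data: rows of Vt are the principal directions.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)
print(explained)  # first component carries nearly all the variance

# Project onto the first component: each 2-D point becomes a single number.
X_reduced = Xc @ Vt[0]
print(X_reduced.shape)  # (100,)
```

Keeping only the leading components discards the low-variance directions, which is the "excess information" the definition refers to.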

Simultaneous Localization and Mapping (SLAM): An area that deals with the methods robots use to build a map of an unknown environment while simultaneously tracking their own location within it.

Random Forest: A member of the family of ensemble learning techniques that creates multiple decision trees during training, introducing randomness while building each tree. The output is the average prediction of the individual trees (or the majority class, for classification). This prevents the algorithm from overfitting, i.e., memorizing the training data.

Evolutionary Genetic Algorithms: Biologically inspired algorithms, drawn from evolutionary theory, implemented to solve optimization and search problems by applying bio-inspired operations including selection, crossover, and mutation.
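
The three operations can be sketched in a few lines. This toy example (my own, not from the original text) evolves a bit-string toward an all-ones target:

```python
import random

random.seed(0)
TARGET = [1] * 10   # toy goal: evolve a bit-string of all ones

def fitness(ind):
    return sum(g == t for g, t in zip(ind, TARGET))

# Start from a random population of bit-strings.
pop = [[random.randint(0, 1) for _ in range(10)] for _ in range(20)]

for _ in range(50):  # generations
    # Selection: keep the fitter half of the population as parents.
    pop.sort(key=fitness, reverse=True)
    parents = pop[:10]
    children = []
    while len(children) < 10:
        a, b = random.sample(parents, 2)
        cut = random.randrange(1, 10)
        child = a[:cut] + b[cut:]            # crossover: splice two parents
        if random.random() < 0.1:            # mutation: flip a random bit
            i = random.randrange(10)
            child[i] = 1 - child[i]
        children.append(child)
    pop = parents + children

best = max(pop, key=fitness)
print(fitness(best))  # best fitness found; approaches 10 as generations pass
```

The fitness function plays the role of the environment: individuals that match the target better are more likely to pass their "genes" on.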

Neural Networks: Biologically inspired networks that extract abstract features from the data hierarchically.


Deep learning is defined as the subfield of machine learning based on neural networks with a number of hidden layers; such a neural network is most often called a deep neural network.

These are the main types of deep neural networks used today:

Deep Reinforcement Learning: Deep reinforcement learning algorithms deal with modeling an agent that learns to interact with its environment in the best possible way. The agent continuously performs actions toward its goal, and the environment rewards or punishes the agent for performing a positive or negative action, respectively. In this way, the agent learns to behave optimally in order to achieve the goal. AlphaGo from DeepMind is one of the best examples: the agent learned to play the game of Go and was able to beat a professional human player.

Convolutional Neural Network (CNN): Convolutional neural networks are a type of neural network that uses convolution to extract patterns from input data hierarchically. These networks are mainly used on data with spatial relationships, such as images.
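
The core operation is just a small filter slid across the input. This sketch (my own toy example) applies a horizontal-difference filter that responds to a vertical edge in an image:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as in most deep learning libraries)."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge detector responds where intensity changes left-to-right.
image = np.zeros((5, 5))
image[:, 3:] = 1.0                   # dark left half, bright right half
kernel = np.array([[-1.0, 1.0]])     # simple horizontal-difference filter

edges = conv2d(image, kernel)
print(edges)  # nonzero only in the column where the edge sits
```

In a CNN, the kernel values are not hand-crafted like this but learned, and many such filters are stacked in layers so that later layers detect increasingly abstract patterns.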

Recurrent Neural Network (RNN): Recurrent neural networks, in particular LSTMs, are used to process sequential data such as time series, stock market data, speech, sensor signals, and energy data that are temporally dependent. LSTM networks are a more effective type of RNN that alleviates the vanishing gradient problem and provides memory over both the short and long term.

Restricted Boltzmann Machine (RBM): A neural network with stochastic properties. Restricted Boltzmann machines are trained with an approach called contrastive divergence. After training, the hidden layer is a latent representation of the input: RBMs learn a probabilistic representation of the input.

Deep Belief Network: A stack of restricted Boltzmann machines, in which the hidden layer of each RBM serves as the visible layer of the next. Each layer is trained before additional layers are added to the network, which helps in probabilistic reconstruction. The network is trained in an unsupervised, layer-by-layer fashion.

Variational Autoencoders (VAEs): A probabilistic version of autoencoders used to learn an optimal latent representation of the input data. A VAE consists of an encoder and a decoder trained with a loss function; it uses probabilistic approaches and relates to approximate inference in a latent Gaussian model.

GAN Networks: Generative adversarial networks are a framework that pits two neural networks, a generator and a discriminator, against each other. The generator continuously generates data while the discriminator learns to distinguish fake data from real data. As training progresses, the generator improves at creating fake data that looks real, while the discriminator improves at telling fake from real, which in turn pushes the generator to improve further. After training, the generator can be used to create fake data that looks real.

Capsule Networks: Still an active area of research. CNNs are known to learn data representations that in most cases are not interpretable. Capsule networks are designed to extract particular kinds of representations from the input, for example, preserving the hierarchical pose relationships between object parts. Another advantage of capsule networks is that they can learn representations from a fraction of the data that a CNN requires.

“I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted.”― Alan Turing