Artificial Intelligence (A.I.) will soon be at the heart of every major technological system in the world, including cyber and homeland security, payments, financial markets, biotech, healthcare, marketing, natural language processing, computer vision, electrical grids, nuclear power plants, air traffic control, and the Internet of Things (IoT).
While A.I. seems to have only recently captured the attention of humanity, it has in fact existed as a technological discipline for over 60 years. In the late 1950s, Arthur Samuel wrote a checkers-playing program that could learn from its mistakes and thus, over time, became better at playing the game. MYCIN, the first rule-based expert system, was developed in the early 1970s and was capable of diagnosing blood infections based on the results of various medical tests. The MYCIN system was able to outperform non-specialist doctors.
While Artificial Intelligence is becoming a major staple of technology, few people understand the benefits and shortcomings of A.I. and Machine Learning technologies.
- Machine Learning
Machine learning is the science of getting computers to act without being explicitly programmed. It is applied in various fields such as computer vision, speech recognition, NLP, web search, biotech, risk management, cyber security, and many others. The machine learning paradigm can be viewed as “programming by example”. Two types of learning are commonly used: supervised and unsupervised. In supervised learning, a collection of labeled patterns is provided, and the learning process is measured by the quality of labeling a newly encountered pattern; the labeled patterns are used to learn descriptions of the classes, which in turn are used to label new patterns. In unsupervised learning, the problem is to group a given collection of unlabeled patterns into meaningful categories. Within supervised learning, there are two different types of labels: classification and regression. In classification learning, the goal is to categorize objects into fixed, specific categories. Regression learning, on the other hand, tries to predict a real value. For instance, we may wish to predict changes in the price of a stock, and both methods can be applied to derive insights: the classification method determines whether the stock price will rise or fall, and the regression method predicts how much the stock will increase or decrease. This paper examines popular A.I. and machine learning techniques, and their limitations as they are used in industry.
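The stock example above can be sketched in a few lines. This is a toy illustration only: the features (previous return, volume change) and all data values are invented, the classifier is a simple nearest-neighbour rule, and the regressor is ordinary least squares.

```python
# Toy illustration of the two supervised settings: classification
# (will the stock rise or fall?) versus regression (by how much?).
# All features, labels, and values are invented for illustration.
import numpy as np

# Each row: [yesterday's return, volume change] -- hypothetical features.
X_train = np.array([[ 0.02,  0.1],
                    [ 0.01,  0.3],
                    [-0.03, -0.2],
                    [-0.01, -0.4]])
y_class = np.array([1, 1, 0, 0])                     # 1 = rise, 0 = fall
y_reg   = np.array([0.015, 0.012, -0.020, -0.008])   # next-day return

def classify(x):
    """Classification: copy the label of the nearest training example."""
    nearest = np.argmin(np.linalg.norm(X_train - x, axis=1))
    return y_class[nearest]

# Regression: fit weights w so that X_train @ w approximates y_reg.
w, *_ = np.linalg.lstsq(X_train, y_reg, rcond=None)

def predict_change(x):
    """Regression: predict a real-valued next-day return."""
    return float(x @ w)

x_new = np.array([0.015, 0.2])
print("direction:", "rise" if classify(x_new) == 1 else "fall")
print("predicted change:", predict_change(x_new))
```

The same input feeds both models; only the type of label differs, which is exactly the classification/regression distinction drawn above.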
- Business Rule Management System
A business rule management system (BRMS) enables companies to easily define, deploy, monitor, and maintain new regulations, procedures, policies, market opportunities, and workflows. One of the main advantages of business rules is that they can be written by business analysts without the need for IT resources. Rules can be stored in a central repository and accessed across the enterprise, and they can be specific to a context, a geographic region, a customer, or a process. Advanced BRMS products offer role-based management authority, testing, simulation, and reporting to ensure that rules are updated and deployed accurately.
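The core pattern behind a BRMS can be sketched minimally: rules are data (a condition plus an action) kept in a central collection and filtered by context before evaluation. The rule names, the region field, and the transactions below are all invented for illustration; a real BRMS adds authoring tools, versioning, and governance on top of this idea.

```python
# Minimal sketch of the BRMS pattern: rules as data in a central
# repository, selected by context (here, a region) before evaluation.
# Rule names, fields, and thresholds are hypothetical.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Rule:
    name: str
    region: str                         # context the rule applies to
    condition: Callable[[dict], bool]   # written against transaction fields
    action: str                         # decision to record when it fires

# Central repository of rules, maintainable without touching code logic.
rules = [
    Rule("large_wire", "EU", lambda t: t["amount"] > 10_000, "review"),
    Rule("any_wire",   "US", lambda t: t["type"] == "wire",  "log"),
]

def evaluate(transaction: dict, region: str) -> List[str]:
    """Apply every rule registered for this region; collect fired actions."""
    return [r.action for r in rules
            if r.region == region and r.condition(transaction)]

print(evaluate({"amount": 25_000, "type": "wire"}, "EU"))  # ['review']
```

Because each rule is a self-contained record, adding or retiring a policy is a data change rather than a program change, which is the property the paragraph above attributes to business rules.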
- Limits in Business Rule Management Systems
Business rules represent policies, procedures, and constraints regarding how an enterprise conducts business. Business rules can, for example, encode the organization's policies for considering a transaction suspicious: a fraud expert writes rules to detect suspicious transactions. However, the same rules are also used to monitor customers whose unique spending behavior is not properly accounted for in the rule set, which results in poor detection rates and high false positives. Additionally, risk systems based only on rules detect anomalous behavior associated only with the existing rules; they cannot identify new anomalies, which can emerge daily. As a result, rule-based systems are outdated almost as soon as they are implemented.
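The blind spot described above is easy to demonstrate: a static rule catches only the pattern it encodes. The threshold and the transactions below are invented purely for illustration.

```python
# Illustration of the limitation above: a fixed rule detects only the
# fraud pattern it was written for. Threshold and amounts are invented.
SUSPICIOUS_AMOUNT = 10_000  # the fraud expert's rule: flag large transfers

def is_suspicious(txn: dict) -> bool:
    return txn["amount"] > SUSPICIOUS_AMOUNT

known_fraud = {"amount": 50_000}                      # matches the rule
novel_fraud = [{"amount": 900} for _ in range(60)]    # many small transfers

print(is_suspicious(known_fraud))                     # True  -- caught
print(any(is_suspicious(t) for t in novel_fraud))     # False -- missed
```

The sixty small transfers are collectively anomalous, but no single one violates the rule, so the rule set stays silent until an expert writes a new rule, which is precisely why such systems lag behind new fraud patterns.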
- Neural Network
A neural network (NN) is a technology loosely inspired by the structure of the brain. A neural network consists of many simple elements called artificial neurons, each producing a sequence of activations. These elements are far simpler than biological neurons, and the number of elements and their interconnections is orders of magnitude smaller than the number of neurons and synapses in the human brain.
- Deep Learning
Backpropagation (BP) [Rumelhart, 1986] is the most popular supervised neural network learning algorithm. The network it trains is organized into layers with connections between them: the leftmost layer is called the input layer; the rightmost, or output, layer contains the output neurons; and the middle layers are called hidden layers.
The goal of backpropagation is to compute the gradient (a vector of partial derivatives) of an objective function with respect to the neural network parameters. Input neurons activate through sensors perceiving the environment, and other neurons activate through weighted connections from previously active neurons. Each element receives numeric inputs and transforms this input data by calculating a weighted sum over the inputs; a non-linear function is then applied to this transformation to calculate an intermediate state. While the design of the input and output layers of a neural network is straightforward, there is an art to the design of the hidden layers. Designing and training a neural network requires choosing the number and types of nodes, the layers, the learning rates, the training data, and the test sets.
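The computation described above can be sketched for a one-hidden-layer network: each unit takes a weighted sum of its inputs, applies a non-linearity (here tanh), and backpropagation applies the chain rule to obtain the gradient of the objective with respect to every weight. Layer sizes, weights, and data are invented for illustration.

```python
# Minimal sketch of forward pass + backpropagation for one hidden layer.
# Sizes, weights, and data are invented; tanh is an assumed non-linearity.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=3)        # input layer (3 "sensor" values)
y = np.array([1.0])           # target output

W1 = rng.normal(size=(4, 3))  # input -> hidden weights
W2 = rng.normal(size=(1, 4))  # hidden -> output weights

# Forward pass: weighted sums followed by a non-linearity.
h = np.tanh(W1 @ x)                        # hidden activations
y_hat = W2 @ h                             # linear output neuron
loss = 0.5 * np.sum((y_hat - y) ** 2)      # objective function

# Backward pass: chain rule yields the gradient w.r.t. each weight matrix.
d_out = y_hat - y                          # dL/dy_hat
dW2 = np.outer(d_out, h)                   # dL/dW2
d_h = (W2.T @ d_out) * (1 - h ** 2)        # dL/dh through tanh'
dW1 = np.outer(d_h, x)                     # dL/dW1

# One small gradient-descent step along the negative gradient.
lr = 0.01
W1_new, W2_new = W1 - lr * dW1, W2 - lr * dW2
new_y_hat = W2_new @ np.tanh(W1_new @ x)
loss_after = 0.5 * np.sum((new_y_hat - y) ** 2)
```

After the step the loss is lower, which is the whole point of computing the gradient: it tells each weight which direction reduces the objective.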
Recently, deep learning, a term describing a set of algorithms that use a neural network as the underlying architecture, has generated many headlines. The earliest deep learning-like algorithms possessed multiple layers of non-linear features and can be traced back to Ivakhnenko and Lapa in 1965, who used thin but deep models with polynomial activation functions that they analyzed using statistical methods.
Deep learning became more usable in recent years due to the availability of inexpensive parallel hardware (GPUs, computer clusters) and massive amounts of data. Deep neural networks learn hierarchical layers of representation from the input to perform pattern recognition. When the problem exhibits non-linear properties, deep networks are computationally more attractive than classical neural networks. A deep network can be viewed as a program in which the functions computed by the lower-layer neurons are subroutines. These subroutines are reused many times in the computation of the final program.
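The "subroutine" view above can be sketched directly: each layer is a function (weighted sum plus non-linearity), the network is their composition, and a lower layer's output is computed once and reused by everything above it. Layer sizes and weights are invented for illustration.

```python
# Sketch of a deep network as composed subroutines: lower-layer features
# are computed once and reused by higher layers. Sizes are invented.
import numpy as np

rng = np.random.default_rng(1)

def layer(W):
    """Return a function computing one layer: weighted sum + non-linearity."""
    return lambda v: np.tanh(W @ v)

f1 = layer(rng.normal(size=(8, 4)))   # low-level features ("subroutine")
f2 = layer(rng.normal(size=(8, 8)))   # mid-level features built from f1
f3 = layer(rng.normal(size=(2, 8)))   # output built from f2

def deep_net(x):
    return f3(f2(f1(x)))              # the program composes the subroutines

x = rng.normal(size=4)
shared = f1(x)                        # computed once ...
mid = f2(shared)                      # ... and reused by higher layers
print(deep_net(x))
```

Because `f1`'s output feeds every unit of `f2`, which in turn feeds every unit of `f3`, the low-level features are effectively shared subroutines of the final program, which is the hierarchical reuse the paragraph describes.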