This blog post was first published on our Norwegian blog, and you can read the original article here.
AI, Deep Learning, Machine Learning, and Data Science: there are many terms in circulation, and many people use them without knowing what they really entail.
The choice of term is often driven by which one has the greatest “hype” effect in the AI field. This is unfortunate, because most of the concepts have a distinct meaning and purpose of their own. However, they can be difficult to tell apart, as they overlap and are closely connected.
Our sister company in Denmark, Kapacity, has made an interpretation of some of the most important concepts within Data Science and Advanced Analytics. Here you get an overview of the most common concepts:
Databases and data processing
To use data to make informed decisions, it must first be retrieved, transformed, and stored. The terms “databases and data processing” refer to the collection, storage, and manipulation of data. One of the classic business intelligence disciplines, data warehousing, also falls under these terms.
Simply explained, data warehousing is the electronic storage of a large amount of information by a business or organisation. Data warehouses are used for business reporting and analytical purposes and are designed to run queries and analysis on historical data that has been derived from transactional sources for business intelligence and data mining purposes.
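To make the idea of running queries on historical data concrete, here is a minimal sketch using Python's built-in sqlite3 module. The sales table, its columns, and the figures are hypothetical and only serve as an illustration; a real data warehouse would be far larger and usually runs on a dedicated platform.

```python
import sqlite3

# Hypothetical table of historical sales, as if loaded from a transactional source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2023-01-15", "North", 120.0),
        ("2023-01-20", "South", 80.0),
        ("2023-02-03", "North", 200.0),
        ("2023-02-10", "South", 50.0),
    ],
)

def revenue_by_region(connection):
    """Aggregate historical sales per region, as a reporting query would."""
    rows = connection.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(rows.fetchall())

report = revenue_by_region(conn)
print(report)
```

The point is the shape of the work: data is stored once and then summarised with queries for reporting and analysis.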
Data visualisation
Visualisation is about presenting data as figures and graphs. It is typically used in two ways:
- As part of Visual Analytics where the purpose is to use the human capacity to recognise patterns in the analysis of data
- As a visualisation of results, where the purpose is to present results of analyses to others in a simple and visual way
Statistics
Classical statistics covers topics such as sampling techniques, distribution theory, hypothesis testing, and the design and analysis of experiments. These methods are widely used in academic research, but can also be used to investigate business-related questions.
They are most often used when it is not possible or desirable to collect all the data, and a sample of the data is used instead. When you do not have complete information (read: all data), you have to use specific methods to make reliable and valid analyses.
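As a rough illustration of working from a sample rather than all data, the sketch below estimates an average from 200 observations and attaches an approximate 95% confidence interval. The population is simulated, and the function name and the normal-approximation factor 1.96 are assumptions made for this example, not a method prescribed by the article.

```python
import math
import random
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Return (mean, lower, upper) for an approximate 95% confidence interval."""
    mean = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, mean - z * standard_error, mean + z * standard_error

random.seed(0)
# Hypothetical population that we pretend is too costly to measure in full.
population = [random.gauss(100, 15) for _ in range(10_000)]
sample = random.sample(population, 200)

mean, lower, upper = mean_confidence_interval(sample)
print(f"estimate: {mean:.1f}, 95% interval: [{lower:.1f}, {upper:.1f}]")
```

The interval expresses how uncertain the estimate is because only a sample was observed.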
Pattern recognition
As the name implies, pattern recognition is about recognising patterns in data. This can be done using visual, mathematical, or statistical techniques. Pattern recognition is often a combination of these techniques.
Neurocomputing
With a desire to be able to model human brain activity, researchers have developed mathematical models that simulate people’s neural activity.
These models (neural networks) have proven useful for solving many kinds of problems because of their flexibility and their capacity to capture complex relationships. They are used in Artificial Intelligence, Machine Learning, and Deep Learning, among other fields.
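The basic building block of such a model can be sketched in a few lines. The example below shows a single artificial neuron: it weighs its inputs, adds a bias, and passes the sum through an activation function. The sigmoid activation and the specific weights are assumptions chosen for illustration; in practice the weights are learned from data.

```python
import math

def sigmoid(x):
    """Squash any number into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of the inputs, then an activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)

# Hypothetical inputs and weights; the weighted sum here is 0.5 - 0.5 = 0.
output = neuron([1.0, 2.0], [0.5, -0.25], bias=0.0)
print(output)
```

A neural network connects many of these neurons so that the output of some becomes the input of others.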
Artificial Intelligence (AI)
The purpose of AI is to make devices perceive their environment/surroundings and act in ways that ensure that a goal is achieved. The term, therefore, covers the desire to create systems that are ultimately able to think and act by themselves.
AI covers an approach or philosophy, as well as techniques such as machine learning.
Machine learning (ML)
Machine learning covers a wide range of tools used to learn relationships in data. By repeatedly adjusting a model’s parameters, these tools search for the best solution to a problem.
We talk about machine learning, or ML, when the computer uses algorithms that can process large amounts of data, learn from that data, and use what they have learned to make decisions. This can solve tasks such as data classification, pattern recognition, and forecasting.
So, what problems can machine learning solve? Machine learning algorithms can mainly be used for two things:
- Supervised learning: To predict what will happen in the future, given historical data. You divide your data into training and test data, give the training data to the machine, and tell it what to look for. Then you check whether the computer has learned enough by showing it the test data and comparing its predictions with the known answers.
- Unsupervised learning: Finding patterns in data. With unsupervised learning, you leave your computer “to its own devices” to work out for itself which patterns are in the data.
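The supervised workflow described above can be sketched as follows. The data is simulated, and the nearest-centroid classifier is a deliberately simple stand-in chosen for this example (all function names are hypothetical); real projects typically use a machine learning library and more capable models.

```python
import random

def train_nearest_centroid(rows):
    """Learn one average point (centroid) per label from labelled training rows."""
    sums, counts = {}, {}
    for features, label in rows:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Predict the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

random.seed(1)
# Hypothetical labelled data: two well-separated groups of 2-D points.
data = [([random.gauss(0, 1), random.gauss(0, 1)], "low") for _ in range(50)]
data += [([random.gauss(10, 1), random.gauss(10, 1)], "high") for _ in range(50)]
random.shuffle(data)

train, test = data[:80], data[80:]      # split into training and test data
model = train_nearest_centroid(train)   # learn from the training data only
correct = sum(predict(model, x) == y for x, y in test)
accuracy = correct / len(test)
print(f"accuracy on unseen test data: {accuracy:.2f}")
```

Because the test data was never shown to the model during training, the accuracy gives an honest indication of how well it has learned.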
Deep Learning
Deep learning is a specific part of machine learning in which several models are stacked, each building on the output of the previous one, so that very complex relationships can be learned. In practice, these are usually advanced neural networks with many layers, where each layer passes its output on to the next.
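A tiny sketch can show why stacking layers helps. The network below has two layers, and the second builds on the output of the first to compute "exclusive or", a relationship that a single neuron of this kind cannot represent. The weights are hand-picked for illustration, and all names are assumptions; real deep learning models learn their weights from data and have many more layers.

```python
def step(x):
    """Threshold activation: fire (1) when the input is positive, else 0."""
    return 1 if x > 0 else 0

def dense_layer(inputs, weight_rows, biases):
    """Each row of weights defines one neuron that reads all inputs."""
    return [
        step(sum(i * w for i, w in zip(inputs, weights)) + bias)
        for weights, bias in zip(weight_rows, biases)
    ]

def xor_network(x1, x2):
    # Hidden layer: the first neuron acts like OR, the second like AND.
    hidden = dense_layer([x1, x2], [[1, 1], [1, 1]], [-0.5, -1.5])
    # Output layer builds on the hidden layer's output: OR but not AND.
    return dense_layer(hidden, [[1, -1]], [-0.5])[0]

results = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
print(results)
```

Each layer turns its input into a slightly more useful representation, and the final layer combines those into the answer.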
Data Mining
Data Mining entails several methods for working with machine learning and other techniques to generate knowledge and business value. It is therefore not a pure technique for data analysis, but rather about the principles and processes that one must follow to apply machine learning techniques to achieve the objectives of a business.
The best-known data mining process model is probably CRISP-DM (Cross-Industry Standard Process for Data Mining), which covers the data mining process from start to finish.
Data Lake
In its simplest sense, a Data Lake is a place where all raw data is collected in its original format. This might, for example, be a suitable database or file system that is fast and scalable enough to receive data as it is created and then make it available.
The purpose of a data lake is to make it easier to handle and process data. This is done by making sure that the person handling the data doesn’t have to deal with many different data sources, different locations of data, different security mechanisms in each system, and different data capture technologies.
Another advantage of a data lake is that you don’t need to have a clearly defined data model in advance. A data lake can be cloud-based, on-premise, or even a hybrid of the two.
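The idea of not needing a data model in advance can be illustrated with a small sketch. Raw records of different shapes are kept exactly as they arrived, and a structure is only applied at the moment someone reads them. The records, field names, and the read_orders function are hypothetical and exist only for this example.

```python
import json

# Hypothetical raw events, stored exactly as the source systems produced them.
raw_lines = [
    '{"source": "webshop", "customer": "A17", "total": 249.0}',
    '{"source": "support", "customer": "A17", "ticket": "login problem"}',
    '{"source": "webshop", "customer": "B02", "total": 99.5, "coupon": "SPRING"}',
]

def read_orders(lines):
    """Apply a structure at read time: keep only records that look like orders."""
    for line in lines:
        record = json.loads(line)
        if "total" in record:
            yield {"customer": record["customer"], "total": float(record["total"])}

orders = list(read_orders(raw_lines))
print(orders)
```

A different analysis could read the same raw lines with a different structure, for example to look only at support tickets.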