Glossarium on Artificial Intelligence: 2500 Terms. Volume 2
Editor Khadzhimurad Akhmedovich Magomedov
Proofreader Alexander Khafizovich Yuldashev
Illustrator Alexander Yurievich Chesalov
Cover designer Alexander Yurievich Chesalov
© Alexander Yurievich Chesalov, 2023
© Alexander Nikolaevich Vlaskin, 2023
© Matvey Olegovich Bakanach, 2023
© Alexander Yurievich Chesalov, illustrations, 2023
© Alexander Yurievich Chesalov, cover design, 2023
ISBN 978-5-0060-9410-9 (vol. 2)
ISBN 978-5-0060-9411-6
Created in the Ridero intelligent publishing system
From the Authors and Compilers
Alexander Yurievich Chesalov,
Alexander Nikolaevich Vlaskin,
Matvey Olegovich Bakanach
Experts in information technology and artificial intelligence, developers of the program of the Artificial Intelligence Center of Bauman Moscow State Technical University and of the «Artificial Intelligence» and «Deep Analytics» programs of the university's «Priority 2030» project in 2021–2022.
Dear Friends and Colleagues!
The authors and compilers of this book devoted two years to preparing and creating this glossary (a concise dictionary of specialized terms).
The «pilot» version of the book was prepared in just eight months and presented at the 35th Moscow International Book Fair in 2022.
At some point the book grew to eight hundred and sixty pages, and we had to prepare a two-volume edition.
We are now pleased to present the second volume of the book, which contains more than one thousand two hundred and fifty terms and definitions on artificial intelligence in English.
The cover image of the book was drawn in the Easy Diffusion generative artificial intelligence system.
The 35th Moscow International Book Fair, 2022. From left to right: Alexander Chesalov, Alexander Vlaskin and Matvey Bakanach
Why is the book called «Glossarium»?
«Glossarium» in Latin means a dictionary of narrowly specialized terms.
The idea of compiling «glossaries» belongs to one of the book's co-authors, Alexander Chesalov. His first experience in this area was a glossary on artificial intelligence and information technology, which he published in December 2021.1 Initially it contained only 400 terms. Then, in 2022, Alexander substantially expanded it to more than 1000 current terms and definitions. He subsequently published a whole series of books covering the fourth industrial revolution, the digital economy, digital healthcare and many other topics.
The idea of creating a large glossary on artificial intelligence was born in early 2022. The authors came to the unanimous decision to combine their efforts and their experience of recent years in the field of artificial intelligence, which had been reinforced by several notable and fateful events.
Undoubtedly, the most significant event, which took place somewhat earlier, in 2021, was the authors' participation (as experts) in the competition held by the Analytical Center for the Government of Russia to select recipients of support for research centers in the field of artificial intelligence, including «strong» artificial intelligence, trusted artificial intelligence systems and the ethical aspects of applying artificial intelligence. We faced the extraordinary task, at that time unsolved by anyone, of creating the Center for the Development and Implementation of Strong and Applied Artificial Intelligence at Bauman Moscow State Technical University. All the authors of this book took a most direct part in developing and writing the program and action plan of the new Center. More about this story can be found in Alexander Chesalov's book «How to Create an Artificial Intelligence Center in 100 Days».
We then took part in the First International Forum «Ethics of Artificial Intelligence: The Beginning of Trust», held on October 26, 2021, at which a ceremony was organized for the solemn signing of the National Code of Ethics for Artificial Intelligence, which establishes general ethical principles and standards of conduct to guide participants in artificial intelligence activities. In essence, the forum became the first specialized venue in Russia to bring together about one and a half thousand developers and users of artificial intelligence technologies.
In addition, we did not pass by the AI Journey international conference on artificial intelligence and data analysis, at which, on November 10, 2021, leaders of the IT market joined the signing of the National Code of Ethics for Artificial Intelligence. The number of conference speakers was staggering: there were more than two hundred of them, and the number of online visits to the website exceeded forty million.
In 2022 we took an active part in the International Military-Technical Forum «Army-2022» with the report «Development of Hardware and Software Systems for Solving a Wide Range of Applied Problems Using Machine Learning and Trusted Artificial Intelligence Technologies in the Defense-Industrial Complex of the Russian Federation».
Summing up all our active work over the past couple of years, we arrived at the need to systematize the accumulated knowledge and set it out in the new book you are holding in your hands.
We often hear the words «artificial intelligence».
But do we understand what it is?
For example, in this book we have recorded that Artificial intelligence is a computer system based on a body of scientific and engineering knowledge, as well as on technologies for creating intelligent machines, programs, services and applications, that imitates the thought processes of humans or living beings, is capable, with a certain degree of autonomy, of perceiving information, learning and making decisions based on the analysis of large volumes of data, and whose purpose is to help people solve their everyday routine tasks.
Or another example.
What is «trusted artificial intelligence»?
A trusted artificial intelligence system is an applied artificial intelligence system that performs the tasks assigned to it while meeting a number of additional requirements that take into account the ethical aspects of applying artificial intelligence and that ensure trust in the results of its work. These results, in turn, include: the reliability and interpretability of the conclusions and proposed decisions obtained with the help of the system and verified on validated test cases; safety, both in the sense that the system cannot harm its users throughout its entire life cycle and in the sense of protection against hacking, unauthorized access and other negative external influences; and the privacy and verifiability of the data that the artificial intelligence algorithms work with, including access control and other related issues.
And what, then, is «machine learning»?
Machine learning is one of the directions (subsets) of artificial intelligence through which the key property of intelligent computer systems is realized: self-learning based on the analysis and processing of large heterogeneous data. The greater the volume and variety of information, the easier it is for artificial intelligence to find patterns and the more accurate the result will be.
To pique the esteemed reader's interest, here are a few more «amusing» examples.
Have you ever heard of «transhumanists»?
On the one hand, as an idea, Transhumanism is the expansion of human capabilities with the help of science. On the other hand, it is a philosophical concept and an international movement whose adherents wish to become «posthumans» and to overcome all kinds of physical limitations, illness, mental suffering, old age and death through the use of nano- and biotechnologies, artificial intelligence and cognitive science.
In our view, the ideas of «transhumanism» intersect very closely with the ideas and concepts of «digital human immortality».
TEDx ForestersPark, 2019
You have undoubtedly heard of, and of course know, what a «Data Scientist» is: a scientist and specialist who works with data.
But have you ever heard of «datasatanists»? :-)
«Datasatanists» is a term coined by the authors, yet one that reflects today's reality (on a par, for example, with the Russian term «infotsygane», roughly «info-hustlers»), which took shape during the period of popularization and widespread implementation of artificial intelligence ideas in the modern information society. In essence, datasatanists are swindlers and criminals who very skillfully disguise themselves as scientists and specialists in AI and ML while exploiting other people's achievements, knowledge and experience for their own selfish ends and for unlawful enrichment.
And how about this term: «biblioclast»?
A biblioclast is a person who, owing to a warped worldview and an excessively inflated ego, out of envy or some other self-serving motive, seeks to destroy the books of other authors. You may not believe it, but there are now quite enough people like «datasatanists» and «biblioclasts».
And how about terms such as «artificial life», «artificial superintelligence», «neuromorphic artificial intelligence», «human-centered artificial intelligence», «synthetic intelligence», «distributed artificial intelligence», «friendly artificial intelligence», «augmented artificial intelligence», «composite artificial intelligence», «explainable artificial intelligence», «causal artificial intelligence», «symbolic artificial intelligence» and many others (they are all in this book)?
We could give many more examples of such «astonishing» terms. But in our work we chose not to dwell on «harsh reality» and shifted the emphasis to a constructive and positive attitude. In short, we have done a great deal of work for you and collected more than 2500 terms and definitions on machine learning and artificial intelligence, based on our own experience and on data from an enormous number of different sources.
2500 terms and definitions.
Is that a lot or a little?
Our experience suggests that, for mutual understanding, two interlocutors need to know a dozen or, at most, two dozen definitions. But when it comes to professional work, it may turn out that knowing even a few dozen terms is not enough.
This book presents the most current terms and definitions which, in our opinion, are most frequently used both in everyday work and in the professional activities of specialists in a wide variety of fields who are interested in the topic of «artificial intelligence».
We have tried very hard to make a necessary and useful «tool» for your work.
In conclusion, we would like to add and to inform the esteemed reader that this book is a completely open document, free to distribute. If you use it in your practical work, we ask you to cite it.
Many of the terms in this book, and the definitions given for them, can be found on the Internet. They are repeated dozens or hundreds of times on various information resources (mostly foreign ones). Nevertheless, we set ourselves the goal of collecting and systematizing the most relevant of them in one place from a wide variety of sources, translating the necessary ones into Russian and/or adapting them, and writing some anew on the basis of our own experience.
Given the above, we do not claim authorship or uniqueness of the terms and definitions presented, but we have undoubtedly made our own contribution to the systematization and adaptation of many of them.
This book was written, above all, for your enjoyment.
We are continuing to improve the quality and content of the text of this book, including supplementing it with new knowledge of the subject area. We will be grateful for any feedback, suggestions and corrections. Please send them to [email protected]
We wish you pleasant reading and productive work!
Yours, Alexander Chesalov, Alexander Vlaskin and Matvey Bakanach.
16.08.2022. First edition.
09.03.2023. Second edition, revised and expanded.
01.01.2024. Third edition, revised and expanded.
Artificial Intelligence glossary
«A»
A/B Testing is a statistical way of comparing two (or more) techniques, typically an incumbent against a new rival. A/B testing aims to determine not only which technique performs better but also to understand whether the difference is statistically significant. A/B testing usually considers only two techniques using one measurement, but it can be applied to any finite number of techniques and measures2.
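To make the idea of statistical significance concrete, here is a minimal sketch of comparing two conversion rates with a two-proportion z-test; the counts are made-up illustrative numbers, and the normal approximation is assumed to hold for samples of this size.

```python
# A minimal A/B-test sketch (hypothetical numbers): two-proportion z-test.
from math import sqrt, erf

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return the z statistic and two-sided p-value for H0: rate A == rate B."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)               # pooled conversion rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal tail
    return z, p_value

z, p = two_proportion_z_test(conv_a=120, n_a=2400, conv_b=155, n_b=2400)
print(f"z = {z:.2f}, p = {p:.4f}")  # a small p-value suggests the difference is significant
```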
Abductive logic programming (ALP) is a high-level knowledge-representation framework that can be used to solve problems declaratively based on abductive reasoning. It extends normal logic programming by allowing some predicates to be incompletely defined, declared as adducible predicates3.
Abductive reasoning (also abduction, abductive inference, or retroduction) is a form of logical inference which starts with an observation or set of observations and then seeks the simplest and most likely explanation. This process, unlike deductive reasoning, yields a plausible conclusion but does not positively verify it4.
Abstract data type is a mathematical model for data types, where a data type is defined by its behavior (semantics) from the point of view of a user of the data, specifically in terms of possible values, possible operations on data of this type, and the behavior of these operations5.
Abstraction — the process of removing physical, spatial, or temporal details or attributes in the study of objects or systems in order to more closely attend to other details of interest6.
Accelerating change is a perceived increase in the rate of technological change throughout history, which may suggest faster and more profound change in the future and may or may not be accompanied by equally profound social and cultural change7.
Access to information – the ability to obtain information and use it8.
Access to information constituting a commercial secret – familiarization of certain persons with information constituting a commercial secret, with the consent of its owner or on other legal grounds, provided that this information is kept confidential9.
Accuracy – the fraction of predictions that a classification model got right. In multi-class classification, accuracy is defined as follows:
Accuracy = Correct Predictions / Total Number of Predictions
In binary classification, accuracy has the following definition:
Accuracy = (True Positives + True Negatives) / Total Number of Examples
See true positive and true negative. Contrast accuracy with precision and recall10,11.
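As a quick illustration of the formulas above, here is a minimal sketch (with made-up labels) that computes both the general and the binary form of accuracy:

```python
# Accuracy sketch: general form and the TP/TN/FP/FN form for binary labels.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
binary_accuracy = (tp + tn) / (tp + tn + fp + fn)

print(accuracy, binary_accuracy)  # both give 0.75 on this toy example
```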
Action, in reinforcement learning, is the mechanism by which the agent transitions between states of the environment. The agent chooses the action by using a policy12.
Action language is a language for specifying state transition systems, and is commonly used to create formal models of the effects of actions on the world. Action languages are commonly used in the artificial intelligence and robotics domains, where they describe how actions affect the states of systems over time, and may be used for automated planning13.
Action model learning is an area of machine learning concerned with creation and modification of software agent’s knowledge about effects and preconditions of the actions that can be executed within its environment. This knowledge is usually represented in logic-based action description language and used as the input for automated planners14.
Action selection is a way of characterizing the most basic problem of intelligent systems: what to do next. In artificial intelligence and computational cognitive science, «the action selection problem» is typically associated with intelligent agents and animats – artificial systems that exhibit complex behaviour in an agent environment15.
Activation function, in the context of artificial neural networks, is a function that takes in the weighted sum of all of the inputs from the previous layer and generates an output value to ignite the next layer16.
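A small sketch of this idea follows: a weighted sum of inputs is passed through a few common activation functions. The inputs, weights and bias below are arbitrary illustrative values.

```python
# Common activation functions applied to a neuron's weighted sum.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

inputs = np.array([0.5, -1.2, 3.0])
weights = np.array([0.4, 0.1, -0.6])
bias = 0.05

z = np.dot(weights, inputs) + bias        # weighted sum from the previous layer
print(sigmoid(z), relu(z), np.tanh(z))    # value passed on to the next layer
```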
Active Learning/Active Learning Strategy is a special case of Semi-Supervised Machine Learning in which a learning agent is able to interactively query an oracle (usually, a human annotator) to obtain labels at new data points. A training approach in which the algorithm chooses some of the data it learns from. Active learning is particularly valuable when labeled examples are scarce or expensive to obtain. Instead of blindly seeking a diverse range of labeled examples, an active learning algorithm selectively seeks the particular range of examples it needs for learning17,18,19.
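Below is a minimal sketch of pool-based active learning with uncertainty sampling; the synthetic data, the simulated «oracle» labels and the use of scikit-learn's LogisticRegression are illustrative assumptions, not part of the definition above.

```python
# Pool-based active learning sketch: query the least-confident examples.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)            # hidden "oracle" labels

# start with a few labeled points from each class
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
pool = [i for i in range(len(X)) if i not in labeled]

for _ in range(20):                                 # query budget of 20 labels
    model = LogisticRegression().fit(X[labeled], y[labeled])
    proba = model.predict_proba(X[pool])[:, 1]
    uncertainty = np.abs(proba - 0.5)               # closest to 0.5 = least confident
    query = pool[int(np.argmin(uncertainty))]
    labeled.append(query)                           # ask the oracle for this label
    pool.remove(query)

print("accuracy on the full pool:", model.score(X, y))
```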
Adam optimization algorithm it is an extension of stochastic gradient descent which has recently gained wide acceptance for deep learning applications in computer vision and natural language processing20.
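For concreteness, here is a compact sketch of the standard Adam update applied to a toy quadratic objective; the hyperparameter values are the commonly used defaults and the objective is purely illustrative.

```python
# One-parameter-vector Adam update sketch on f(theta) = ||theta||^2.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad               # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2          # second-moment estimate
    m_hat = m / (1 - b1 ** t)                  # bias correction
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta = np.array([1.0, -2.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 101):
    grad = 2 * theta                           # gradient of ||theta||^2
    theta, m, v = adam_step(theta, grad, m, v, t)
print(theta)                                   # moves toward the minimum at [0, 0]
```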
Adaptive algorithm is an algorithm that changes its behavior at the time it is run, based on a priori defined reward mechanism or criterion21,22.
Adaptive Gradient Algorithm (AdaGrad) is a sophisticated gradient descent algorithm that rescales the gradients of each parameter, effectively giving each parameter an independent learning rate23.
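The per-parameter rescaling can be shown in a few lines; the following is a toy sketch of the AdaGrad update on the same kind of quadratic objective, with an arbitrary base learning rate.

```python
# AdaGrad sketch: accumulated squared gradients give each parameter its own step size.
import numpy as np

theta = np.array([1.0, -2.0])
accum = np.zeros_like(theta)                   # running sum of squared gradients
lr, eps = 0.5, 1e-8

for _ in range(100):                           # minimize f(theta) = ||theta||^2
    grad = 2 * theta
    accum += grad ** 2
    theta -= lr * grad / (np.sqrt(accum) + eps)   # per-parameter rescaling

print(theta)                                   # both coordinates shrink toward 0
```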
Adaptive neuro fuzzy inference system (ANFIS) (also adaptive network-based fuzzy inference system) is a kind of artificial neural network that is based on Takagi—Sugeno fuzzy inference system. The technique was developed in the early 1990s. Since it integrates both neural networks and fuzzy logic principles, it has potential to capture the benefits of both in a single framework. Its inference system corresponds to a set of fuzzy IF—THEN rules that have learning capability to approximate nonlinear functions. Hence, ANFIS is considered to be a universal estimator. For using the ANFIS in a more efficient and optimal way, one can use the best parameters obtained by genetic algorithm24.
Adaptive system is a system that automatically changes the data of its functioning algorithm and (sometimes) its structure in order to maintain or achieve an optimal state when external conditions change25.
Additive technologies are technologies for the layer-by-layer creation of three-dimensional objects based on their digital models («twins»), which make it possible to manufacture products of complex geometric shapes and profiles26.
Admissible heuristic in computer science, specifically in algorithms related to pathfinding, a heuristic function is said to be admissible if it never overestimates the cost of reaching the goal, i.e., the cost it estimates to reach the goal is not higher than the lowest possible cost from the current point in the path27.
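A standard illustration, sketched below: on a grid where only horizontal and vertical moves of cost 1 are allowed, the Manhattan distance never overestimates the true remaining cost, so it is admissible for A*-style pathfinding.

```python
# Manhattan distance as an admissible heuristic on a 4-connected grid.
def manhattan(cell, goal):
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

goal = (5, 5)
for cell in [(0, 0), (3, 4), (5, 2)]:
    # On an empty grid the true shortest-path cost equals the Manhattan distance;
    # obstacles can only make the true cost larger, never smaller.
    print(cell, "h =", manhattan(cell, goal))
```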
Affective computing (also artificial emotional intelligence or emotion AI) – the study and development of systems and devices that can recognize, interpret, process, and simulate human affects. Affective computing is an interdisciplinary field spanning computer science, psychology, and cognitive science28.
Agent architecture is a blueprint for software agents and intelligent control systems, depicting the arrangement of components. The architectures implemented by intelligent agents are referred to as cognitive architectures29.
Agent, in reinforcement learning, is the entity that uses a policy to maximize the expected return gained from transitioning between states of the environment30.
Agglomerative clustering (see hierarchical clustering) is one of the clustering algorithms, first assigns every example to its own cluster, and iteratively merges the closest clusters to create a hierarchical tree31.
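The merging process can be sketched directly; the following naive single-linkage implementation (illustrative only, quadratic and unsuited to real data) starts with one cluster per example and repeatedly merges the two closest clusters.

```python
# Naive bottom-up (agglomerative) clustering with single linkage.
import numpy as np

def agglomerative(points, n_clusters):
    clusters = [[i] for i in range(len(points))]           # one cluster per example
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # single linkage: distance between the closest pair of members
                d = min(np.linalg.norm(points[i] - points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])                     # merge the closest pair
        del clusters[b]
    return clusters

pts = np.array([[0, 0], [0, 1], [5, 5], [5, 6], [10, 0]])
print(agglomerative(pts, 3))    # [[0, 1], [2, 3], [4]]
```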
Aggregate is a total created from smaller units. For instance, the population of a county is an aggregate of the populations of the cities, rural areas, etc., that comprise the county. To total data from smaller units into a large unit32.
Aggregator is a type of software that brings together various types of Web content and provides it in an easily accessible list. Feed aggregators collect things like online articles from newspapers or digital publications, blog postings, videos, podcasts, etc. A feed aggregator is also known as a news aggregator, feed reader, content aggregator or an RSS reader33.
AI acceleration is the acceleration of AI-related computations; specialized AI hardware accelerators are used for this purpose (see also artificial intelligence accelerator, hardware acceleration)34,35.
AI accelerator is a class of microprocessor or computer system designed as hardware acceleration for artificial intelligence applications, especially artificial neural networks, machine vision, and machine learning36.
AI accelerator is a specialized chip that improves the speed and efficiency of training and testing neural networks. However, for semiconductor chips, including most AI accelerators, there is a theoretical minimum power consumption limit. Reducing consumption is possible only with the transition to optical neural networks and optical accelerators for them37.
AI benchmark is a benchmark test for AI systems: special benchmarks are created and standardized to evaluate the capabilities, efficiency and performance of, and to compare, artificial neural networks, machine learning (ML) models, architectures and algorithms when solving various AI problems. For example, Benchmarking Graph Neural Networks – benchmarking of graph neural networks (GNNs) – usually includes installing a specific benchmark, loading the initial datasets, testing the networks, adding a new dataset and repeating the iterations (see also artificial neural network benchmarks).
AI Building and Training Kits – applications and software development kits (SDKs) that abstract platforms, frameworks, analytics libraries, and data analysis appliances, allowing software developers to incorporate AI into new or existing applications.
AI camera is a camera with artificial intelligence; such new-generation digital cameras can analyze images by recognizing faces and their expressions, object contours, textures, gradients and lighting patterns, which is taken into account when the images are processed; some AI cameras are capable of taking pictures on their own, without human intervention, at the moments the camera finds most interesting, etc. (see also camera, software-defined camera).
AI chipset is a chipset for AI systems; for example, the AI chipset industry is the industry producing chipsets for AI systems, and the AI chipset market is the market for such chipsets.
AI chipset market is the market for chipsets for artificial intelligence (AI) systems (see also AI chipset).
AI cloud services – AI model building tools, APIs, and associated middleware that enable you to build/train, deploy, and consume machine learning models that run on a prebuilt infrastructure as cloud services. These services include automated machine learning, machine vision services, and language analysis services.
AI CPU is a central processing unit for AI tasks, synonymous with AI processor.
AI engineer – AI systems engineer.
AI engineering – the transfer of AI technologies from the level of R&D, experiments and prototypes to the engineering level, with the large-scale implementation of AI methods and tools in IT systems to solve real production problems of a company or organization. It is one of the strategic technological trends that can radically affect the state of the economy, production, finance, the environment and, in general, the quality of life of individuals and humanity.
AI hardware (also AI-enabled hardware) is the hardware of an artificial intelligence infrastructure system; AI infrastructure.
AI industry – the artificial intelligence industry; for example, the commercial AI industry is the commercial sector of the AI industry.
AI industry trends are promising directions for the development of the AI industry.
AI infrastructure (also AI-defined infrastructure, AI-enabled Infrastructure) – artificial intelligence infrastructure systems, for example, AI infrastructure research – research in the field of AI infrastructures (see also AI, AI hardware).
AI server (artificial intelligence server) is a server with (based on) AI; a server that supports solving AI problems.
AI shopper is a non-human economic entity that receives goods or services in exchange for payment. Examples include virtual personal assistants, smart appliances, connected cars, and IoT-enabled factory equipment. These AIs act on behalf of a human or organization client.
AI supercomputer is a supercomputer for artificial intelligence tasks, a supercomputer for AI, characterized by a focus on working with large amounts of data (see also artificial intelligence, supercomputer).
AI term is a term from the field of AI (from AI terminology, AI vocabulary); for example, in AI terms – in terms of AI (in the language of AI) (see also AI terminology).
AI terminology (artificial intelligence terminology) is the set of special terms related to the field of AI (see also AI term).
AI TRiSM (AI Trust, Risk and Security Management) is the management of an AI model to ensure trust, fairness, efficiency, security, and data protection38.
AI vendor is a supplier of AI tools (systems, solutions).
AI winter (Winter of artificial intelligence) is a period of reduced interest in the subject area, reduced research funding. The term was coined by analogy with the idea of nuclear winter. The field of artificial intelligence has gone through several cycles of hype, followed by disappointment and criticism, followed by a strong cooling off of interest, and then followed by renewed interest years or decades later39,40.
AI workstation is a workstation (PC) with (based on) AI tools; a specialized desktop computer for solving technical or scientific problems and AI tasks; usually connected to a LAN with multi-user operating systems and intended primarily for the individual work of one specialist.
AI-based management system – the process of creating policies, allocating decision-making rights and ensuring organizational responsibility for risk and investment decisions for an application, as well as using artificial intelligence methods.
AI-based systems are information processing technologies that include models and algorithms that provide the ability to learn and perform cognitive tasks, with results in the form of predictive assessment and decision making in a material and virtual environment. AI systems are designed to work with some degree of autonomy through modeling and representation of knowledge, as well as the use of data and the calculation of correlations. AI-based systems can use various methodologies, in particular: machine learning, including deep learning and reinforcement learning; automated reasoning, including planning, dispatching, knowledge representation and reasoning, search and optimization. AI-based systems can be used in cyber-physical systems, including equipment control systems via the Internet, robotic equipment, social robotics and human-machine interface systems that combine the functions of control, recognition, processing of data collected by sensors, as well as the operation of actuators in the environment of functioning of AI systems41.
AI-complete. In the field of artificial intelligence, the most difficult problems are informally known as AI-complete or AI-hard, implying that the difficulty of these computational problems is equivalent to that of solving the central artificial intelligence problem – making computers as intelligent as people, or strong AI. To call a problem AI-complete reflects an attitude that it would not be solved by a simple specific algorithm42.
AI-enabled – using AI and equipped with AI, for example, AI-enabled tools – tools with AI (see also AI-enabled device).
AI-enabled device is a device supported by an artificial intelligence (AI) system, such as an intelligent robot (see also AI-enabled healthcare device)43.
AI-enabled healthcare device is an AI-enabled device for healthcare (medical care) (see also AI-enabled device)44.
AI-enabled is hardware or software that uses or is equipped with AI, such as AI-enabled tools.
AI-optimized – optimized for AI tasks or optimized using AI tools; for example, an AI-optimized chip is a chip optimized for AI tasks (see also artificial intelligence).
AlexNet – the name of the neural network that won the ImageNet Large Scale Visual Recognition Challenge in 2012. It is named after Alex Krizhevsky, then a computer science PhD student at the University of Toronto. See ImageNet.45
Algorithm – an exact prescription for executing, in a certain order, a system of operations for solving any problem from some given class (set) of problems. The term «algorithm» comes from the name of the 9th-century mathematician Muhammad ibn Musa al-Khwarizmi, who proposed the simplest arithmetic algorithms. In mathematics and cybernetics, a class of problems of a certain type is considered solved when an algorithm has been established for solving it. Finding algorithms is a natural human goal in solving various classes of problems46.
Algorithmic Assessment is a technical evaluation that helps identify and address potential risks and unintended consequences of AI systems across your business, to engender trust and build supportive systems around AI decision making47.
AlphaGo is the first computer program that defeated a professional player at the board game Go, in October 2015. Later, in October 2017, AlphaGo’s team released a new version named AlphaGo Zero, which is stronger than any of the previous human-champion-defeating versions. Go is played on a 19 by 19 board, which allows for about 10^171 possible layouts (chess: about 10^50 configurations). It is estimated that there are 10^80 atoms in the universe48.
Physical media is the physical material that is used to store or transmit information in data transmission49.
Ambient intelligence (AmI) represents the future vision of intelligent computing where explicit input and output devices will not be required; instead, sensors and processors will be embedded into everyday devices and the environment will adapt to the user’s needs and desires seamlessly. AmI systems will use the contextual information gathered through these embedded sensors and apply Artificial Intelligence (AI) techniques to interpret and anticipate the users’ needs. The technology will be designed to be human-centric and easy to use50.
Analogical Reasoning – solving problems by using analogies, by comparing to past experiences51.
Analysis of algorithms (AofA) – the determination of the computational complexity of algorithms, that is the amount of time, storage and/or other resources necessary to execute them. Usually, this involves determining a function that relates the length of an algorithm’s input to the number of steps it takes (its time complexity) or the number of storage locations it uses (its space complexity)52.
Annotation is a metadatum attached to a piece of data, typically provided by a human annotator53.
Anokhin’s theory of functional systems – according to this theory, a functional system consists of a certain number of nodal mechanisms, each of which takes its place and has a specific purpose. The first of these is afferent synthesis, in which four obligatory components are distinguished: dominant motivation, situational and triggering afferentation, and memory. The interaction of these components leads to the decision-making process54.
Anomaly detection – the process of identifying outliers. For example, if the mean for a certain feature is 100 with a standard deviation of 10, then anomaly detection should flag a value of 200 as suspicious55,56.
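The example in the definition can be written out directly; below is a minimal sketch using a simple z-score rule, where the threshold of 3 standard deviations is an assumed convention.

```python
# Z-score anomaly flagging matching the definition's example (mean 100, std 10).
def is_anomaly(value, mean, std, threshold=3.0):
    z = abs(value - mean) / std
    return z > threshold

print(is_anomaly(200, mean=100, std=10))   # True  (z = 10, flagged as suspicious)
print(is_anomaly(115, mean=100, std=10))   # False (z = 1.5)
```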
Anonymization – the process in which data is de-identified as part of a mechanism to submit data for machine learning57.
Answer set programming (ASP) is a form of declarative programming oriented towards difficult (primarily NP-hard) search problems. It is based on the stable model (answer set) semantics of logic programming. In ASP, search problems are reduced to computing stable models, and answer set solvers – programs for generating stable models – are used to perform search58.
Antivirus software is a program or set of programs that are designed to prevent, search for, detect, and remove software viruses, and other malicious software like worms, trojans, adware, and more59.
Anytime algorithm is an algorithm that can return a valid solution to a problem even if it is interrupted before it ends60.
API-AS-a-service (AaaS) combines the API economy and software renting and provides application programming interfaces as a service61.
Application programming interface (API) is a set of subroutine definitions, communication protocols, and tools for building software. In general terms, it is a set of clearly defined methods of communication among various components. A good API makes it easier to develop a computer program by providing all the building blocks, which are then put together by the programmer. An API may be for a web-based system, operating system, database system, computer hardware, or software library62.
Application security is the process of making apps more secure by finding, fixing, and enhancing the security of apps. Much of this happens during the development phase, but it includes tools and methods to protect apps once they are deployed. This is becoming more important as hackers increasingly target applications with their attacks63.
Application-specific integrated circuit (ASIC) is a specialized integrated circuit for solving a specific problem64.
Approximate string matching (also fuzzy string searching) – the technique of finding strings that match a pattern approximately (rather than exactly). The problem of approximate string matching is typically divided into two sub-problems: finding approximate substring matches inside a given string and finding dictionary strings that match the pattern approximately65.
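One common way to make «approximately» precise is the edit (Levenshtein) distance; the sketch below computes it with the standard dynamic-programming recurrence, so two strings match approximately if the distance is small.

```python
# Levenshtein (edit) distance via dynamic programming.
def levenshtein(a, b):
    # dp[i][j] = edits needed to turn a[:i] into b[:j]
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i
    for j in range(len(b) + 1):
        dp[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[len(a)][len(b)]

print(levenshtein("glossary", "glosary"))   # 1: one edit away
```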
Approximation error – the discrepancy between an exact value and some approximation to it66.
Architectural description group (Architectural view) is a representation of the system as a whole in terms of a related set of interests67,68.
Architectural frameworks are high-level descriptions of an organization as a system; they capture the structure of its main components at varied levels, the interrelationships among these components, and the principles that guide their evolution69.
Architecture of a computer is a conceptual structure of a computer that determines the processing of information and includes methods for converting information into data and the principles of interaction between hardware and software70.
Architecture of a computing system is the configuration, composition and principles of interaction (including data exchange) of the elements of a computing system71.
Architecture of a system is the fundamental organization of a system, embodied in its elements, their relationships with each other and with the environment, as well as the principles that guide its design and evolution72.
Archival Information Collection (AIC) is information whose content is an aggregation of other archive information packages. The digital preservation function preserves the capability to regenerate the DIPs (Dissemination Information Packages) as needed over time73.
Archival Storage is a source for data that is not needed for an organization’s everyday operations, but may have to be accessed occasionally. By utilizing an archival storage, organizations can leverage to secondary sources, while still maintaining the protection of the data. Utilizing archival storage sources reduces primary storage costs required and allows an organization to maintain data that may be required for regulatory or other requirements74.
Area under curve (AUC) – the area under a curve between two points is calculated by performing the definite integral. In the context of a receiver operating characteristic for a binary classifier, the AUC represents the classifier’s accuracy75.
Area Under the ROC curve is the probability that a classifier will be more confident that a randomly chosen positive example is actually positive than that a randomly chosen negative example is positive76.
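This probabilistic reading can be computed directly by comparing every (positive, negative) pair of scores; the labels and scores below are made up for illustration, and ties are counted as half a win.

```python
# AUC as the fraction of correctly ranked (positive, negative) pairs.
def auc(labels, scores):
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
print(auc(labels, scores))   # 0.888...: one of the nine pairs is ranked wrongly
```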
Argumentation framework is a way to deal with contentious information and draw conclusions from it. In an abstract argumentation framework, entry-level information is a set of abstract arguments that, for instance, represent data or a proposition. Conflicts between arguments are represented by a binary relation on the set of arguments77.
Artifact is one of many kinds of tangible by-products produced during the development of software. Some artifacts (e.g., use cases, class diagrams, and other Unified Modeling Language (UML) models, requirements and design documents) help describe the function, architecture, and design of software. Other artifacts are concerned with the process of development itself – such as project plans, business cases, and risk assessments78.
Artificial General Intelligence (AGI), as opposed to narrow intelligence (also known as complete, strong, or super intelligence, or Human Level Machine Intelligence), denotes the ability of a machine to successfully perform any intellectual task the way a human being can. Artificial superintelligence is a term referring to the time when the capability of computers will surpass that of humans79,80.
Artificial Intelligence (AI) – (machine intelligence) refers to systems that display intelligent behavior by analyzing their environment and taking actions – with some degree of autonomy – to achieve specific goals. AI-based systems can be purely software-based, acting in the virtual world (e.g., voice assistants, image analysis software, search engines, speech and face recognition systems) or AI can be embedded in hardware devices (e.g., advanced robots, autonomous cars, drones, or Internet of Things applications). The term AI was first coined by John McCarthy in 195681.
Artificial Intelligence Automation Platforms – platforms that enable the automation and scaling of production-ready AI. Artificial Intelligence Platforms involves the use of machines to perform the tasks that are performed by human beings. The platforms simulate the cognitive function that human minds perform such as problem-solving, learning, reasoning, social intelligence as well as general intelligence. Top Artificial Intelligence Platforms: Google AI Platform, TensorFlow, Microsoft Azure, Rainbird, Infosys Nia, Wipro HOLMES, Dialogflow, Premonition, Ayasdi, MindMeld, Meya, KAI, Vital A.I, Wit, Receptiviti, Watson Studio, Lumiata, Infrrd82.
Artificial intelligence engine (also AI engine, AIE) is a hardware and software solution for increasing the speed and efficiency of artificial intelligence systems.
Artificial Intelligence for IT Operations (AIOps) is an emerging IT practice that applies artificial intelligence to IT operations to help organizations intelligently manage infrastructure, networks, and applications for performance, resilience, capacity, uptime, and, in some cases, security. By shifting traditional, threshold-based alerts and manual processes to systems that take advantage of AI and machine learning, AIOps enables organizations to better monitor IT assets and anticipate negative incidents and impacts before they take hold. AIOps is a term coined by Gartner in 2016 as an industry category for machine learning analytics technology that enhances IT operations analytics covering operational tasks include automation, performance monitoring and event correlations, among others. Gartner define an AIOps Platform thus: «An AIOps platform combines big data and machine learning functionality to support all primary IT operations functions through the scalable ingestion and analysis of the ever-increasing volume, variety and velocity of data generated by IT. The platform enables the concurrent use of multiple data sources, data collection methods, and analytical and presentation technologies»83,84,85.
Artificial Intelligence Markup Language (AIML) is an XML dialect for creating natural language software agents86.
Artificial Intelligence of Commonsense Knowledge is one of the areas of development of artificial intelligence concerned with modeling the human ability to analyze various life situations and to be guided by common sense in one’s actions87.
Artificial Intelligence Open Library is a set of algorithms designed to develop technological solutions based on artificial intelligence, described using programming languages and posted on the Internet88.
Artificial intelligence system (AIS) is a programmed or digital mathematical model (implemented using computer computing systems) of human intellectual capabilities, the main purpose of which is to search, analyze and synthesize large amounts of data about the world around us in order to obtain new knowledge about it and to solve, on that basis, various vital tasks. The discipline «Artificial Intelligence Systems» includes consideration of the main issues of the modern theory and practice of building intelligent systems.
Artificial intelligence technologies – technologies based on the use of artificial intelligence, including computer vision, natural language processing, speech recognition and synthesis, intelligent decision support and advanced methods of artificial intelligence89.
Artificial life (Alife, A-Life) is a field of study wherein researchers examine systems related to natural life, its processes, and its evolution, through the use of simulations with computer models, robotics, and biochemistry. The discipline was named by Christopher Langton, an American theoretical biologist, in 1986. In 1987 Langton organized the first conference on the field, in Los Alamos, New Mexico. There are three main kinds of alife, named for their approaches: soft, from software; hard, from hardware; and wet, from biochemistry. Artificial life researchers study traditional biology by trying to recreate aspects of biological phenomena90.
Artificial Narrow Intelligence (ANI), also known as weak or applied intelligence, represents most of the current artificial intelligent systems which usually focus on a specific task. Narrow AIs are mostly much better than humans at the task they were made for: for example, look at face recognition, chess computers, calculus, and translation. The definition of artificial narrow intelligence is in contrast to that of strong AI or artificial general intelligence, which aims at providing a system with consciousness or the ability to solve any problems. Virtual assistants and AlphaGo are examples of artificial narrow intelligence systems91.
Artificial Neural Network (ANN) is a computational model in machine learning, which is inspired by the biological structures and functions of the mammalian brain. Such a model consists of multiple units called artificial neurons which build connections between each other to pass information. The advantage of such a model is that it progressively «learns» the tasks from the given data without specific programing for a single task92.
Artificial neuron is a mathematical function conceived as a model of biological neurons in a neural network. The difference between an artificial neuron and a biological neuron is shown in the figure. Artificial neurons are the elementary units of an artificial neural network. An artificial neuron receives one or more inputs (representing excitatory postsynaptic potentials and inhibitory postsynaptic potentials on nerve dendrites) and sums them to produce an output signal (or activation, representing the action potential of the neuron that is transmitted down its axon). Typically, each input is weighted separately, and the sum is passed through a non-linear function known as an activation function or transfer function. Transfer functions are usually sigmoid, but they can also take the form of other non-linear functions, piecewise linear functions, or step functions. They are also often monotonically increasing, continuous, differentiable, and bounded93,94.
Artificial Superintelligence (ASI) is a term referring to the time when the capability of computers will surpass humans. «Artificial intelligence,» which has been much used since the 1970s, refers to the ability of computers to mimic human thought. Artificial superintelligence goes a step beyond and posits a world in which a computer’s cognitive ability is superior to a human’s95.
Assistive intelligence refers to AI-based systems that help make decisions or perform actions.
Association for the Advancement of Artificial Intelligence (AAAI) is an international, nonprofit, scientific society devoted to promote research in, and responsible use of, artificial intelligence. AAAI also aims to increase public understanding of artificial intelligence (AI), improve the teaching and training of AI practitioners, and provide guidance for research planners and funders concerning the importance and potential of current AI developments and future directions96.
Association is another type of unsupervised learning method that uses different rules to find relationships between variables in a given dataset. These methods are frequently used for market basket analysis and recommendation engines, along the lines of «Customers Who Bought This Item Also Bought» recommendations97.
Association Rule Learning is a rule-based Machine Learning method for discovering interesting relations between variables in large data sets98.
Asymptotic computational complexity in computational complexity theory, asymptotic computational complexity is the usage of asymptotic analysis for the estimation of computational complexity of algorithms and computational problems, commonly associated with the usage of the big O notation99.
Asynchronous inter-chip protocols are protocols for data exchange in low-speed devices; instead of frames, individual characters are used to control the exchange of data100.
Attention mechanism is one of the key innovations in the field of neural machine translation. Attention allowed neural machine translation models to outperform classical machine translation systems based on phrase translation. The main bottleneck in sequence-to-sequence learning is that the entire content of the original sequence needs to be compressed into a vector of a fixed size. The attention mechanism facilitates this task by allowing the decoder to look back at the hidden states of the original sequence, which are then provided as a weighted average as additional input to the decoder101.
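The core computation described above fits in a few lines; here is a minimal numpy sketch with random vectors standing in for the encoder hidden states and the decoder state, purely to show how the weighted average (context vector) is formed.

```python
# Dot-product attention sketch: score, softmax, weighted average.
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

encoder_states = np.random.randn(6, 8)     # 6 source positions, hidden size 8
decoder_state = np.random.randn(8)

scores = encoder_states @ decoder_state    # alignment scores for each position
weights = softmax(scores)                  # attention weights sum to 1
context = weights @ encoder_states         # weighted average of hidden states

print(weights.round(3), context.shape)     # (8,) context vector fed to the decoder
```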
Attributional calculus (AC) is a logic and representation system defined by Ryszard S. Michalski. It combines elements of predicate logic, propositional calculus, and multi-valued logic. Attributional calculus provides a formal language for natural induction, an inductive learning process whose results are in forms natural to people102.
Augmented Intelligence is the intersection of machine learning and advanced applications, where clinical knowledge and medical data converge on a single platform. The potential benefits of Augmented Intelligence are realized when it is used in the context of workflows and systems that healthcare practitioners operate and interact with. Unlike Artificial Intelligence, which tries to replicate human intelligence, Augmented Intelligence works with and amplifies human intelligence103.
Augmented reality (AR) is an interactive experience of a real-world environment where the objects that reside in the real-world are «augmented» by computer-generated perceptual information, sometimes across multiple sensory modalities, including visual, auditory, haptic, somatosensory, and olfactory104.
Augmented reality technologies are visualization technologies based on adding information or visual effects to the physical world by overlaying graphic and/or sound content to improve user experience and interactive features105.
Auto Associative Memory is a single-layer neural network in which the input training vector and the output target vectors are the same. The weights are determined so that the network stores a set of patterns. As shown in the following figure, the architecture of an Auto Associative Memory network has «n» input training vectors and a similar number «n» of output target vectors106.
Autoencoder (AE) is a type of Artificial Neural Network used to produce efficient representations of data in an unsupervised and non-linear manner, typically to reduce dimensionality107.
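To illustrate the encode-bottleneck-decode idea, here is a tiny linear autoencoder trained with plain gradient descent on random data; this is only a sketch (the data, sizes and learning rate are arbitrary), not a practical implementation.

```python
# Linear autoencoder sketch: compress 4-dimensional data to 2 dimensions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))

W_enc = rng.normal(scale=0.1, size=(4, 2))   # encoder: 4 -> 2
W_dec = rng.normal(scale=0.1, size=(2, 4))   # decoder: 2 -> 4
lr = 0.01

for _ in range(2000):
    H = X @ W_enc                     # low-dimensional code
    X_hat = H @ W_dec                 # reconstruction
    err = X_hat - X
    # gradients (up to a constant factor) of the mean squared reconstruction error
    grad_dec = H.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec
    W_enc -= lr * grad_enc

print("reconstruction MSE:", np.mean((X - (X @ W_enc) @ W_dec) ** 2))
```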
Automata theory – the study of abstract machines and automata, as well as the computational problems that can be solved using them. It is a theory in theoretical computer science and discrete mathematics (a subject of study in both mathematics and computer science). Automata theory (part of the theory of computation) is a theoretical branch of Computer Science and Mathematics, which mainly deals with the logic of computation with respect to simple machines, referred to as automata108,109.
Automated control system – a set of software and hardware designed to control technological and (or) production equipment (executive devices) and the processes they produce, as well as to control such equipment and processes110.
Automated planning and scheduling (also simply AI planning) is a branch of artificial intelligence that concerns the realization of strategies or action sequences, typically for execution by intelligent agents, autonomous robots and unmanned vehicles. Unlike classical control and classification problems, the solutions are complex and must be discovered and optimized in multidimensional space. Planning is also related to decision theory111.
Automated processing of personal data – processing of personal data using computer technology112.
Automated reasoning is an area of computer science and mathematical logic dedicated to understanding different aspects of reasoning. The study of automated reasoning helps produce computer programs that allow computers to reason completely, or nearly completely, automatically. Although automated reasoning is considered a sub-field of artificial intelligence, it also has connections with theoretical computer science, and even philosophy113.
Automated system is an organizational and technical system that guarantees the development of solutions based on the automation of information processes in various fields of activity114.
Automation bias is when a human decision maker favors recommendations made by an automated decision-making system over information produced without automation, even when the automated decision-making system makes errors115.
Automation is a technology by which a process or procedure is performed with minimal human intervention116.
Autonomic computing is the ability of a system to adaptively self-manage its own resources for high-level computing functions without user input117.
Autonomous artificial intelligence is a biologically inspired system that tries to reproduce the structure of the brain, the principles of its operation with all the properties that follow from this118,119.
Autonomous car (also self-driving car, robot car, and driverless car) is a vehicle that is capable of sensing its environment and moving with little or no human input120.
Autonomous – a machine is described as autonomous if it can perform its task or tasks without needing human intervention121.
Autonomous robot is a robot that performs behaviors or tasks with a high degree of autonomy. Autonomous robotics is usually considered to be a subfield of artificial intelligence, robotics, and information engineering122.
Autonomous vehicle is a mode of transport based on an autonomous driving system. The control of an autonomous vehicle is fully automated and carried out without a driver using optical sensors, radar and computer algorithms123.
Autoregressive Model is a time series model that uses observations from previous time steps as input to a regression equation to predict the value at the next time step. In statistics and signal processing, an autoregressive model is a representation of a type of random process. It is used to describe certain time-varying processes in nature, economics, etc.124.
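A minimal sketch follows: an AR(2) process is simulated with known coefficients, the coefficients are re-estimated by least squares, and a one-step forecast is made; all numbers are illustrative.

```python
# AR(2) sketch: simulate, fit by least squares, forecast one step ahead.
import numpy as np

rng = np.random.default_rng(1)
n = 300
series = np.zeros(n)
for t in range(2, n):                          # known AR(2) process plus noise
    series[t] = 0.6 * series[t - 1] - 0.3 * series[t - 2] + rng.normal(scale=0.1)

# regression design: y_t explained by y_{t-1} and y_{t-2}
X = np.column_stack([series[1:-1], series[:-2]])
y = series[2:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("estimated coefficients:", coef.round(2))            # close to [0.6, -0.3]

next_value = coef[0] * series[-1] + coef[1] * series[-2]   # one-step forecast
print("forecast:", round(next_value, 3))
```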
Auxiliary intelligence – systems based on artificial intelligence that complement human decisions and are able to learn in the process of interacting with people and the environment.
Average precision is a metric for summarizing the performance of a ranked sequence of results. Average precision is calculated by taking the average of the precision values for each relevant result (each result in the ranked list where the recall increases relative to the previous result)125.
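The definition translates almost directly into code; the sketch below takes a ranked list of relevance flags (made-up for illustration) and averages the precision at each relevant rank.

```python
# Average precision over a ranked list of relevance judgments.
def average_precision(relevance):           # 1 = relevant, 0 = not, in ranked order
    hits, precisions = 0, []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)  # precision at this relevant result
    return sum(precisions) / len(precisions) if precisions else 0.0

print(average_precision([1, 0, 1, 0, 0, 1]))   # (1/1 + 2/3 + 3/6) / 3 ≈ 0.722
```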
Ayasdi is an enterprise-scale machine intelligence platform that delivers the automation that is needed to gain competitive advantage from the company’s big and complex data. Ayasdi supports large numbers of business analysts, data scientists, end users, developers and operational systems across the organization, simultaneously creating, validating, using and deploying sophisticated analyses and mathematical models at scale126.
«B»
Backpropagation through time (BPTT) is a gradient-based technique for training certain types of recurrent neural networks. It can be used to train Elman networks. The algorithm was independently derived by numerous researchers127.
Backpropagation, also called «backward propagation of errors,» is an approach that is commonly used in the training process of the deep neural network to reduce errors128.
Backward Chaining, also called goal-driven inference technique, is an inference approach that reasons backward from the goal to the conditions used to get the goal. Backward chaining inference is applied in many different fields, including game theory, automated theorem proving, and artificial intelligence129.
Bag-of-words model in computer vision. In computer vision, the bag-of-words model (BoW model) can be applied to image classification by treating image features as words. In document classification, a bag of words is a sparse vector of occurrence counts of words; that is, a sparse histogram over the vocabulary. In computer vision, a bag of visual words is a vector of occurrence counts of a vocabulary of local image features130.
Bag-of-words model is a simplifying representation used in natural language processing and information retrieval (IR). In this model, a text (such as a sentence or a document) is represented as the bag (multiset) of its words, disregarding grammar and even word order but keeping multiplicity. The bag-of-words model has also been used for computer vision. The bag-of-words model is commonly used in methods of document classification where the (frequency of) occurrence of each word is used as a feature for training a classifier131.
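A minimal sketch of the representation: build a vocabulary from a couple of toy sentences and turn each sentence into a vector of word counts, discarding grammar and word order.

```python
# Bag-of-words sketch: vocabulary plus per-text word-count vectors.
texts = ["the cat sat on the mat", "the dog sat"]

vocab = sorted({word for text in texts for word in text.split()})
vectors = [[text.split().count(word) for word in vocab] for text in texts]

print(vocab)       # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(vectors)     # [[1, 0, 1, 1, 1, 2], [0, 1, 0, 0, 1, 1]]
```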
Baldwin effect – the effect whereby skills acquired by organisms during their lifetime as a result of learning become, after a certain number of generations, recorded in the genome132.
Baseline is a model used as a reference point for comparing how well another model (typically, a more complex one) is performing. For example, a logistic regression model might serve as a good baseline for a deep model. For a particular problem, the baseline helps model developers quantify the minimal expected performance that a new model must achieve for the new model to be useful133.
Batch – the set of examples used in one gradient update of model training134.
Batch Normalization is a preprocessing step where the data are centered around zero, and often the standard deviation is set to unity135.
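The normalization step itself is simple to sketch; the learnable scale and shift used in practice are omitted here, and the numbers are arbitrary.

```python
# Per-feature normalization of a batch: zero mean, unit standard deviation.
import numpy as np

batch = np.array([[1.0, 200.0],
                  [2.0, 300.0],
                  [3.0, 400.0]])

mean = batch.mean(axis=0)
std = batch.std(axis=0)
normalized = (batch - mean) / (std + 1e-8)

print(normalized.round(3))   # each column now has mean ~0 and std ~1
```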
Batch size – the number of examples in a batch. For example, the batch size of SGD is 1, while the batch size of a mini-batch is usually between 10 and 1000. Batch size is usually fixed during training and inference; however, TensorFlow does permit dynamic batch sizes136,137.
Bayes’s Theorem is a famous theorem used by statisticians to describe the probability of an event based on prior knowledge of conditions that might be related to an occurrence138.
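A worked toy example of the theorem with hypothetical numbers: the probability of a condition given a positive test is P(A|B) = P(B|A)·P(A) / P(B).

```python
# Hypothetical numbers, chosen only to illustrate Bayes's theorem.
p_disease = 0.01             # prior P(A)
p_pos_given_disease = 0.95   # P(B|A), test sensitivity
p_pos_given_healthy = 0.05   # false positive rate

# Total probability of a positive test, P(B).
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Posterior P(A|B) by Bayes's theorem.
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))  # ~0.161
```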
Bayesian classifier in machine learning is a family of simple probabilistic classifiers based on the use of the Bayes theorem and the «naive» assumption of the independence of the features of the objects being classified139.
Bayesian Filter is a program using Bayesian logic. It is used to evaluate the header and content of email messages and determine whether or not it constitutes spam – unsolicited email or the electronic equivalent of hard copy bulk mail or junk mail. A Bayesian filter works with probabilities of specific words appearing in the header or content of an email. Certain words indicate a high probability that the email is spam, such as Viagra and refinance140.
Bayesian Network, also called Bayes Network, belief network, or probabilistic directed acyclic graphical model, is a probabilistic graphical model (a statistical model) that represents a set of variables and their conditional dependencies via a directed acyclic graph141.
Bayesian optimization is a probabilistic regression model technique for optimizing computationally expensive objective functions by instead optimizing a surrogate that quantifies the uncertainty via a Bayesian learning technique. Since Bayesian optimization is itself very expensive, it is usually used to optimize expensive-to-evaluate tasks that have a small number of parameters, such as selecting hyperparameters142.
Bayesian programming is a formalism and a methodology for having a technique to specify probabilistic models and solve problems when less than the necessary information is available143,144.
Bees’ algorithm is a population-based search algorithm which was developed by Pham, Ghanbarzadeh et al. in 2005. It mimics the food foraging behaviour of honey bee colonies. In its basic version the algorithm performs a kind of neighbourhood search combined with global search, and can be used for both combinatorial optimization and continuous optimization. The only condition for the application of the bees’ algorithm is that some measure of distance between the solutions is defined. The effectiveness and specific abilities of the bees’ algorithm have been proven in a number of studies145.
Behavior informatics (BI) — the informatics of behaviors so as to obtain behavior intelligence and behavior insights146.
Behavior tree (BT) is a mathematical model of plan execution used in computer science, robotics, control systems and video games. They describe switchings between a finite set of tasks in a modular fashion. Their strength comes from their ability to create very complex tasks composed of simple tasks, without worrying how the simple tasks are implemented. BTs present some similarities to hierarchical state machines with the key difference that the main building block of a behavior is a task rather than a state. Their ease of human understanding makes BTs less error-prone and very popular in the game developer community. BTs have been shown to generalize several other control architectures147.
Belief-desire-intention software model (BDI) is a software model developed for programming intelligent agents. Superficially characterized by the implementation of an agent’s beliefs, desires and intentions, it actually uses these concepts to solve a particular problem in agent programming. In essence, it provides a mechanism for separating the activity of selecting a plan (from a plan library or an external planner application) from the execution of currently active plans. Consequently, BDI agents are able to balance the time spent on deliberating about plans (choosing what to do) and executing those plans (doing it). A third activity, creating the plans in the first place (planning), is not within the scope of the model, and is left to the system designer and programmer148.
Bellman equation – named after Richard E. Bellman, is a necessary condition for optimality associated with the mathematical optimization method known as dynamic programming. It writes the «value» of a decision problem at a certain point in time in terms of the payoff from some initial choices and the «value» of the remaining decision problem that results from those initial choices. This breaks a dynamic optimization problem into a sequence of simpler subproblems, as Bellman’s «principle of optimality» prescribes149.
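A minimal value-iteration sketch on a hypothetical two-state problem (the transition table and discount factor are assumptions for the example): each state's value is repeatedly rewritten, per the Bellman equation, as the best immediate payoff plus the discounted value of the resulting subproblem.

```python
# Hypothetical 2-state, 2-action problem: transitions[s][a] = (next_state, reward).
transitions = {
    0: {"stay": (0, 0.0), "go": (1, 1.0)},
    1: {"stay": (1, 2.0), "go": (0, 0.0)},
}
gamma = 0.9  # discount factor
V = {0: 0.0, 1: 0.0}

for _ in range(100):  # value iteration: apply the Bellman update repeatedly
    V = {s: max(r + gamma * V[s2] for (s2, r) in acts.values())
         for s, acts in transitions.items()}

print(V)  # state 1 is worth ~20 (2 / (1 - 0.9)), state 0 ~19 (1 + 0.9 * 20)
```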
Benchmark (also benchmark program, benchmarking program, benchmark test) – a test program or package for evaluating (measuring and/or comparing) various aspects of the performance of a processor, individual devices, a computer, a system or a specific application or piece of software; a benchmark allows products from different manufacturers to be compared against each other or against some standard. Examples: online benchmark; standard benchmark; benchmark time comparison (comparison of benchmark execution times)150.
Benchmarking is a set of techniques that allow you to study the experience of competitors and implement best practices in your company151.
BETA refers to a phase in online service development in which the service is coming together functionality-wise but genuine user experiences are required before the service can be finished in a user-centered way. In online service development, the aim of the beta phase is to recognize both programming issues and usability-enhancing procedures. The beta phase is particularly often used in connection with online services and it can be either freely available (open beta) or restricted to a specific target group (closed beta)152.
Bias is a systematic trend that causes differences between results and facts. Bias can enter at any point of the data analysis process, including the source of the data, the estimator chosen, and how the data are analyzed. Bias can seriously affect the results, for example, when studying people’s shopping habits: if the sample size is not large enough, the results may not reflect the buying habits of all people, that is, there may be discrepancies between survey results and actual results153.
Biased algorithm – systematic and repetitive errors in a computer system that lead to unfair results, such as privileging one group of users over others; examples include sexist and racist algorithms154,155.
Bidirectional (BiDi) is a term used to describe a system that evaluates the text that both precedes and follows a target section of text. In contrast, a unidirectional system only evaluates the text that precedes a target section of text156.
Bidirectional Encoder Representations from Transformers (BERT) is a model architecture for text representation. A trained BERT model can act as part of a larger model for text classification or other ML tasks. BERT has the following characteristics: Uses the Transformer architecture, and therefore relies on self-attention. Uses the encoder part of the Transformer. The encoder’s job is to produce good text representations, rather than to perform a specific task like classification. Is bidirectional. Uses masking for unsupervised training157,158.
Bidirectional language model is a language model that determines the probability that a given token is present at a given location in an excerpt of text based on the preceding and following text159.
Big data is a term for sets of digital data whose large size, rate of increase or complexity requires significant computing power for processing and special software tools for analysis and presentation in the form of human-perceptible results160.
Big O notation is a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. It is a member of a family of notations invented by Paul Bachmann, Edmund Landau, and others, collectively called Bachmann—Landau notation or asymptotic notation161.
Bigram – an N-gram in which N=2162.
Binary choice regression model is a regression model in which the dependent variable is dichotomous or binary. Dependent variable can take only two values and mean, for example, belonging to a particular group163.
Binary classification is a type of classification task that outputs one of two mutually exclusive classes. For example, a machine learning model that evaluates email messages and outputs either «spam» or «not spam» is a binary classifier164.
Binary format is any file format in which information is encoded in some format other than a standard character-encoding scheme. A file written in binary format contains information that is not displayable as characters. Software capable of understanding the particular binary format method of encoding information must be used to interpret the information in a binary-formatted file. Binary formats are often used to store more information in less space than possible in a character format file. They can also be searched and analyzed more quickly by appropriate software. A file written in binary format could store the number «7» as a binary number (instead of as a character) in as little as 3 bits (i.e., 111), but would more typically use 4 bits (i.e., 0111). Binary formats are not normally portable, however. Software program files are written in binary format. Examples of numeric data files distributed in binary format include the IBM-binary versions of the Center for Research in Security Prices files and the U.S. Department of Commerce’s National Trade Data Bank on CD-ROM. The International Monetary Fund distributes International Financial Statistics in a mixed-character format and binary (packed-decimal) format. SAS and SPSS store their system files in binary format165.
Binary number is a number written using binary notation which only uses zeros and ones. Example: Decimal number 7 in binary notation is: 111166.
Binary tree is a tree data structure in which each node has at most two children, which are referred to as the left child and the right child. A recursive definition using just set theory notions is that a (non-empty) binary tree is a tuple (L, S, R), where L and R are binary trees or the empty set and S is a singleton set. Some authors allow the binary tree to be the empty set as well167.
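A minimal sketch of the data structure: each node holds at most a left and a right child; the in-order traversal below visits the left subtree, then the node, then the right subtree.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def in_order(node):
    """Yield values of a binary tree: left subtree, node, right subtree."""
    if node is not None:
        yield from in_order(node.left)
        yield node.value
        yield from in_order(node.right)

root = Node(2, Node(1), Node(3))
print(list(in_order(root)))  # [1, 2, 3]
```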
Binning is the process of combining charge from neighboring pixels in a CCD during readout. This process is performed prior to digitization in the CCD chip using dedicated serial and parallel register control. The two main benefits of binning are improved signal-to-noise ratio (SNR) and the ability to increase frame rates, albeit at the cost of reduced spatial resolution.
Bioconservatism (a portmanteau of biology and conservatism) is a stance of hesitancy and skepticism regarding radical technological advances, especially those that seek to modify or enhance the human condition. Bioconservatism is characterized by a belief that technological trends in today’s society risk compromising human dignity, and by opposition to movements and technologies including transhumanism, human genetic modification, «strong» artificial intelligence, and the technological singularity. Many bioconservatives also oppose the use of technologies such as life extension and preimplantation genetic screening168,169.
Biometrics is a system for recognizing people based on one or more of their physical or behavioral traits170,171.
Black box is a description of some deep learning systems: they take an input and provide an output, but the calculations that occur in between are not easy for humans to interpret172,173.
Blackboard system is an artificial intelligence approach based on the blackboard architectural model, where a common knowledge base, the «blackboard», is iteratively updated by a diverse group of specialist knowledge sources, starting with a problem specification and ending with a solution. Each knowledge source updates the blackboard with a partial solution when its internal constraints match the blackboard state. In this way, the specialists work together to solve the problem174.
BLEU (Bilingual Evaluation Understudy) is a text quality evaluation algorithm between 0.0 and 1.0, inclusive, indicating the quality of a translation between two human languages (for example, between English and Russian). A BLEU score of 1.0 indicates a perfect translation; a BLEU score of 0.0 indicates a terrible translation175.
Blockchain is algorithms and protocols for decentralized storage and processing of transactions structured as a sequence of linked blocks without the possibility of their subsequent change176.
Boltzmann machine (also stochastic Hopfield network with hidden units) is a type of stochastic recurrent neural network and Markov random field. Boltzmann machines can be seen as the stochastic, generative counterpart of Hopfield networks177.
Boolean neural network is an artificial neural network approach which only consists of Boolean neurons (and, or, not). Such an approach reduces the use of memory space and computation time. It can be implemented on programmable circuits such as FPGAs (Field-Programmable Gate Arrays).
Boolean satisfiability problem (also propositional satisfiability problem; abbreviated SATISFIABILITY or SAT) is the problem of determining if there exists an interpretation that satisfies a given Boolean formula. In other words, it asks whether the variables of a given Boolean formula can be consistently replaced by the values TRUE or FALSE in such a way that the formula evaluates to TRUE. If this is the case, the formula is called satisfiable. On the other hand, if no such assignment exists, the function expressed by the formula is FALSE for all possible variable assignments and the formula is unsatisfiable178.
Boosting is a Machine Learning ensemble meta-algorithm for primarily reducing bias and variance in supervised learning, and a family of Machine Learning algorithms that convert weak learners to strong ones179.
Bounding Box is an imaginary box drawn on visual information, commonly used in image or video tagging. The contents of the box are labeled to help a model recognize it as a distinct type of object.
Brain technology (also self-learning know-how system) is a technology that employs the latest findings in neuroscience. The term was first introduced by the Artificial Intelligence Laboratory in Zurich, Switzerland, in the context of the ROBOY project. Brain Technology can be employed in robots, know-how management systems and any other application with self-learning capabilities. In particular, Brain Technology applications allow the visualization of the underlying learning architecture often coined as «know-how maps»180.
Brain—computer interface (BCI), sometimes called a brain—machine interface (BMI), is a direct communication pathway between the brain’s electrical activity and an external device, most commonly a computer or robotic limb. Research on brain—computer interface began in the 1970s by Jacques Vidal at the University of California, Los Angeles (UCLA) under a grant from the National Science Foundation, followed by a contract from DARPA. Vidal’s 1973 paper marks the first appearance of the expression brain—computer interface in scientific literature181.
Brain-inspired computing – computing on brain-like structures; brain-like computation using the principles of the brain (see also neurocomputing, neuromorphic engineering).
Branching factor in computing, tree data structures, and game theory, the number of children at each node, the outdegree. If this value is not uniform, an average branching factor can be calculated182,183.
Broadband refers to various high-capacity transmission technologies that transmit data, voice, and video across long distances and at high speeds. Common mediums of transmission include coaxial cables, fiber optic cables, and radio waves184.
Brute-force search (also exhaustive search or generate and test) is a very general problem-solving technique and algorithmic paradigm that consists of systematically enumerating all possible candidates for the solution and checking whether each candidate satisfies the problem’s statement185.
Bucketing – converting a (usually continuous) feature into multiple binary features called buckets or bins, typically based on value range186.
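A small sketch of bucketing with NumPy (the age feature and boundaries are hypothetical): a continuous value is replaced by the index of the value range it falls into, which can then be one-hot encoded into binary features.

```python
import numpy as np

ages = np.array([3, 17, 25, 42, 68])       # continuous feature (hypothetical)
edges = np.array([18, 35, 60])             # bucket boundaries

bucket_index = np.digitize(ages, edges)    # 0: <18, 1: 18-34, 2: 35-59, 3: >=60
one_hot = np.eye(len(edges) + 1)[bucket_index]  # binary "bucket" features
print(bucket_index)  # [0 0 1 2 3]
print(one_hot)
```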
Byte – eight bits. A byte is simply a chunk of 8 ones and zeros. For example: 01000001 is a byte. A computer often works with groups of bits rather than individual bits and the smallest group of bits that a computer usually works with is a byte. A byte is equal to one column in a file written in character format187.
«C»
CAFFE is short for Convolutional Architecture for Fast Feature Embedding, which is an open-source deep learning framework developed at Berkeley AI Research. It supports many different deep learning architectures and GPU-based acceleration computation kernels188,189.
Calibration layer is a post-prediction adjustment, typically to account for prediction bias. The adjusted predictions and probabilities should match the distribution of an observed set of labels190.
Candidate generation — the initial set of recommendations chosen by a recommendation system191.
Candidate sampling is a training-time optimization in which a probability is calculated for all the positive labels, using, for example, softmax, but only for a random sample of negative labels. For example, if we have an example labeled beagle and dog, candidate sampling computes the predicted probabilities and corresponding loss terms for the beagle and dog class outputs in addition to a random subset of the remaining classes (cat, lollipop, fence). The idea is that the negative classes can learn from less frequent negative reinforcement as long as positive classes always get proper positive reinforcement, and this is indeed observed empirically. The motivation for candidate sampling is a computational efficiency win from not computing predictions for all negatives192.
Canonical Formats – in information technology, canonicalization is the process of making something conform with some specification and be in an approved format. Canonicalization may sometimes mean generating canonical data from noncanonical data. Canonical formats are widely supported and considered to be optimal for long-term preservation193.
Capsule neural network (CapsNet) is a machine learning system that is a type of artificial neural network (ANN) that can be used to better model hierarchical relationships. The approach is an attempt to more closely mimic biological neural organization194,195.
Case-Based Reasoning (CBR) is a way to solve a new problem by using solutions to similar problems. It has been formalized to a process consisting of case retrieve, solution reuse, solution revise, and case retention196.
Categorical data — features having a discrete set of possible values. For example, consider a categorical feature named house style, which has a discrete set of three possible values: Tudor, ranch, colonial. By representing house style as categorical data, the model can learn the separate impacts of Tudor, ranch, and colonial on house price. Sometimes, values in the discrete set are mutually exclusive, and only one value can be applied to a given example. For example, a car maker categorical feature would probably permit only a single value (Toyota) per example. Other times, more than one value may be applicable. A single car could be painted more than one different color, so a car color categorical feature would likely permit a single example to have multiple values (for example, red and white). Categorical features are sometimes called discrete features. Contrast with numerical data197.
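A minimal one-hot encoding sketch for the house-style example above (pure Python): each discrete value becomes its own binary feature.

```python
styles = ["Tudor", "ranch", "colonial"]          # the discrete set of values
examples = ["ranch", "Tudor", "ranch", "colonial"]

def one_hot(value, categories):
    return [1 if value == c else 0 for c in categories]

for e in examples:
    print(e, one_hot(e, styles))
# ranch    [0, 1, 0]
# Tudor    [1, 0, 0]
# ...
```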
Center for Technological Competence is an organization that owns the results, tools for conducting fundamental research and platform solutions available to market participants to create applied solutions (products) on their basis. The Technology Competence Center can be a separate organization or be part of an application technology holding company198.
Central Processing Unit (CPU) is a von Neumann cyclic processor designed to execute complex computer programs199.
Centralized control is a process in which control signals are generated in a single control center and transmitted from it to numerous control objects200.
Centroid – the center of a cluster as determined by a k-means or k-median algorithm. For instance, if k is 3, then the k-means or k-median algorithm finds 3 centroids201.
Centroid-based clustering is a category of clustering algorithms that organizes data into nonhierarchical clusters. k-means is the most widely used centroid-based clustering algorithm. Contrast with hierarchical clustering algorithms202.
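A compact k-means sketch in NumPy (the 2-D points and k=2 are assumptions for the example): points are assigned to the nearest centroid, and each centroid is then moved to the mean of its assigned points.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two hypothetical blobs of 2-D points.
points = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])

k = 2
centroids = points[rng.choice(len(points), k, replace=False)]

for _ in range(10):
    # Assign each point to its nearest centroid.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # Move each centroid to the mean of the points assigned to it.
    centroids = np.array([points[labels == j].mean(axis=0) for j in range(k)])

print(centroids)  # roughly the centers of the two blobs
```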
Character format is any file format in which information is encoded as characters using only a standard character-encoding scheme. A file written in «character format» contains only those bytes that are prescribed in the encoding scheme as corresponding to the characters in the scheme (e.g., alphabetic and numeric characters, punctuation marks, and spaces)203.
Chatbot is a software application designed to simulate human conversation with users via text or speech. Also referred to as virtual agents, interactive agents, digital assistants, or conversational AI, chatbots are often integrated into applications, websites, or messaging platforms to provide support to users without the use of live human agents. Chatbots originally started out by offering users simple menus of choices, and then evolved to react to particular keywords. «But humans are very inventive in their use of language,» says Forrester’s McKeon-White. Someone looking for a password reset might say they’ve forgotten their access code, or are having problems getting into their account. «There are a lot of different ways to say the same thing,» he says. This is where AI comes in. Natural language processing is a subset of machine learning that enables a system to understand the meaning of written or even spoken language, even where there is a lot of variation in the phrasing. To succeed, a chatbot that relies on AI or machine learning needs first to be trained using a data set. In general, the bigger the training data set, and the narrower the domain, the more accurate and helpful a chatbot will be204.
Checkpoint — data that captures the state of the variables of a model at a particular time. Checkpoints enable exporting model weights, as well as performing training across multiple sessions. Checkpoints also enable training to continue past errors (for example, job preemption). Note that the graph itself is not included in a checkpoint205.
Chip is an electronic microcircuit of arbitrary complexity, made on a semiconductor substrate and placed in a non-separable case or without it, if included in the micro assembly206,207.
Class — one of a set of enumerated target values for a label. For example, in a binary classification model that detects spam, the two classes are spam and not spam. In a multi-class classification model that identifies dog breeds, the classes would be poodle, beagle, pug, and so on208.
Classification model is a type of machine learning model for distinguishing among two or more discrete classes. For example, a natural language processing classification model could determine whether an input sentence was in French, Spanish, or Italian209.
Classification threshold is a scalar-value criterion that is applied to a model’s predicted score in order to separate the positive class from the negative class. Used when mapping logistic regression results to binary classification210.
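A tiny sketch of applying a threshold to predicted scores (the probabilities below are hypothetical outputs of a logistic regression model):

```python
predicted_scores = [0.12, 0.47, 0.51, 0.86, 0.33]  # model outputs (hypothetical)
threshold = 0.5

labels = ["positive" if s >= threshold else "negative" for s in predicted_scores]
print(labels)  # ['negative', 'negative', 'positive', 'positive', 'negative']
```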
Classification. Classification problems use an algorithm to accurately assign test data into specific categories, such as separating apples from oranges. Or, in the real world, supervised learning algorithms can be used to classify spam in a separate folder from your inbox. Linear classifiers, support vector machines, decision trees and random forest are all common types of classification algorithms211.
Cloud robotics is a field of robotics that attempts to invoke cloud technologies such as cloud computing, cloud storage, and other Internet technologies centred on the benefits of converged infrastructure and shared services for robotics. When connected to the cloud, robots can benefit from the powerful computation, storage, and communication resources of modern data centers in the cloud, which can process and share information from various robots or agents (other machines, smart objects, humans, etc.). Humans can also delegate tasks to robots remotely through networks. Cloud computing technologies enable robot systems to be endowed with powerful capability whilst reducing costs through cloud technologies. Thus, it is possible to build lightweight, low-cost, smarter robots that have an intelligent «brain» in the cloud. The «brain» consists of data centers, knowledge bases, task planners, deep learning, information processing, environment models, communication support, etc.212.
Clinical Decision Support (CDS) is a health information technology system that is designed to provide physicians and other health professionals with clinical decision support, that is, assistance with clinical decision- making tasks213.
Clipping is a technique for handling outliers. Specifically, reducing feature values that are greater than a set maximum value down to that maximum value. Also, increasing feature values that are less than a specific minimum value up to that minimum value. For example, suppose that only a few feature values fall outside the range 40—60. In this case, you could do the following: Clip all values over 60 to be exactly 60. Clip all values under 40 to be exactly 40. In addition to bringing input values within a designated range, clipping can also be used to force gradient values within a designated range during training214.
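The 40—60 example from the definition, sketched with NumPy:

```python
import numpy as np

values = np.array([12, 45, 58, 73, 40, 99])
clipped = np.clip(values, 40, 60)   # values below 40 become 40, above 60 become 60
print(clipped)                      # [40 45 58 60 40 60]
```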
Closed dictionary in speech recognition systems, a dictionary with a limited number of words, to which the recognition system is configured and which cannot be replenished by the user215.
Cloud computing is an information technology model for providing ubiquitous and convenient access using the Internet to a common set of configurable computing resources («cloud»), data storage devices, applications and services that can be quickly provided and released from the load with minimal operating costs or with little or no involvement of the provider216.
Cloud is a general metaphor that is used to refer to the Internet. Initially, the Internet was seen as a distributed network and then with the invention of the World Wide Web as a tangle of interlinked media. As the Internet continued to grow in both size and the range of activities it encompassed, it came to be known as «the cloud.» The use of the word cloud may be an attempt to capture both the size and nebulous nature of the Internet217.
Cloud TPU is a specialized hardware accelerator designed to speed up machine learning workloads on Google Cloud Platform218.
Cluster analysis is a type of unsupervised learning used for exploratory data analysis to find hidden patterns or groupings in the data; clusters are modeled with a similarity measure defined by metrics such as Euclidean or probability distance.
Clustering is a data mining technique for grouping unlabeled data based on their similarities or differences. For example, K-means clustering algorithms assign similar data points into groups, where the K value represents the size of the grouping and granularity. This technique is helpful for market segmentation, image compression, etc219.
Co-adaptation is when neurons predict patterns in training data by relying almost exclusively on outputs of specific other neurons instead of relying on the network’s behavior as a whole. When the patterns that cause co-adaptation are not present in validation data, then co-adaptation causes overfitting. Dropout regularization reduces co-adaptation because dropout ensures neurons cannot rely solely on specific other neurons220.
COBWEB is an incremental system for hierarchical conceptual clustering. COBWEB was invented by Professor Douglas H. Fisher, currently at Vanderbilt University. COBWEB incrementally organizes observations into a classification tree. Each node in a classification tree represents a class (concept) and is labeled by a probabilistic concept that summarizes the attribute-value distributions of objects classified under the node. This classification tree can be used to predict missing attributes or the class of a new object221.
Code is a one-to-one mapping of a finite ordered set of symbols belonging to some finite alphabet222.
Codec is the means by which sound and video files are compressed for storage and transmission purposes. There are various forms of compression, ’lossy’ and ’lossless’, but most codecs perform lossy compression because of the much larger data reduction ratios it offers. Most codecs are software, although in some areas codecs are hardware components of image and sound systems. Codecs are necessary for playback, since they uncompress or decompress the moving image and sound files and allow them to be rendered223.
Cognitive architecture – the Institute of Creative Technologies defines cognitive architecture as: «hypothesis about the fixed structures that provide a mind, whether in natural or artificial systems, and how they work together – in conjunction with knowledge and skills embodied within the architecture – to yield intelligent behavior in a diversity of complex environments»224.
Cognitive computing is used to refer to the systems that simulate the human brain to help with decision-making. It uses self-learning algorithms that perform tasks such as natural language processing, image analysis, reasoning, and human—computer interaction. Examples of cognitive systems are IBM’s Watson and Google DeepMind225.
Cognitive Maps are structured representations of decisions depicted in graphical format (variations of cognitive maps are cause maps, influence diagrams, or belief nets). Basic cognitive maps include nodes connected by arcs, where the nodes represent constructs (or states) and the arcs represent relationships. Cognitive maps have been used to understand decision situations, to analyze complex cause-effect representations and to support communication226.
Cognitive science – the interdisciplinary scientific study of the mind and its processes227.
Cohort is a sample in a study (conducted, for example, to evaluate a machine learning algorithm) that is followed prospectively or retrospectively; subsequent status evaluations with respect to a disease or outcome are conducted to determine which initial participants’ exposure characteristics (risk factors) are associated with it.
Cold-Start is a potential issue arising from the fact that a system cannot infer anything for users or items for which it has not gathered a sufficient amount of information yet228.
Collaborative filtering – making predictions about the interests of one user based on the interests of many other users. Collaborative filtering is often used in recommendation systems229.
Combinatorial optimization in operations research, applied mathematics and theoretical computer science, combinatorial optimization is a topic that consists of finding an optimal object from a finite set of objects230.
Committee machine is a type of artificial neural network using a divide and conquer strategy in which the responses of multiple neural networks (experts) are combined into a single response. The combined response of the committee machine is supposed to be superior to those of its constituent experts. Compare ensembles of classifiers231.
Commoditization is the process of transforming a product from an elite one into a generally available (comparatively cheap) commodity of mass consumption232.
Common Data Element (CDE) is a tool to support data management for clinical research233.
Commonsense reasoning is a branch of artificial intelligence concerned with simulating the human ability to make presumptions about the type and essence of ordinary situations they encounter every day234.
Compiler is a program that translates text written in a programming language into a set of machine codes. AI framework compilers collect the computational data of the frameworks and try to optimize the code of each of them, regardless of the hardware of the accelerator. The compiler contains programs and blocks with which the framework performs several tasks. The computer memory resource allocator, for example, allocates power individually for each accelerator235.
Composite AI is a combined application of various artificial intelligence methods (deep machine learning, computer vision, natural language processing, contextual analysis, knowledge graphs, data visualization, forecasting methods, etc.) to increase the efficiency of model training, in order to achieve a synergistic effect from their use and the best results of the work of artificial intelligence systems. One of the ideas behind composite artificial intelligence is to obtain an artificial intelligence that is able to understand the essence of problems and solve a wide range of them, offering optimal solutions236,237,238.
Compression is a method of reducing the size of computer files. There are several compression programs available, such as gzip and WinZip239.
Computation is any type of arithmetic or non-arithmetic calculation that follows a well-defined model (e.g., an algorithm)240.
Computational chemistry is a discipline using mathematical methods for the calculation of molecular properties or for the simulation of molecular behaviour. It also includes, e.g., synthesis planning, database searching, combinatorial library manipulation241,242,243.
Computational complexity theory – focuses on classifying computational problems according to their inherent difficulty, and relating these classes to each other. A computational problem is a task solved by a computer. A computation problem is solvable by mechanical application of mathematical steps, such as an algorithm244.
Computational creativity (also artificial creativity, mechanical creativity, creative computing, or creative computation) is a multidisciplinary endeavour that includes the fields of artificial intelligence, cognitive psychology, philosophy, and the arts245.
Computational cybernetics is the integration of cybernetics and computational intelligence techniques246.
Computational efficiency of an agent or a trained model is the amount of computational resources required by the agent to solve a problem at the inference stage247.
Computational efficiency of an intelligent system is the amount of computing resources required to train an intelligent system with a certain level of performance on a given volume of tasks248.
Computational Graphics Processing Unit (computational GPU, cGPU) – a multi-core graphics processor used in hybrid supercomputers to perform parallel mathematical calculations; for example, one of the first GPUs in this category contained more than 3 billion transistors – 512 CUDA cores and up to 6 GB of memory249.
Computational humor is a branch of computational linguistics and artificial intelligence which uses computers in humor research250.
Computational intelligence (CI) usually refers to the ability of a computer to learn a specific task from data or experimental observation251.
Computational learning theory (COLT) in computer science, is a subfield of artificial intelligence devoted to studying the design and analysis of machine learning algorithms252.
Computational linguistics is an interdisciplinary field concerned with the statistical or rule-based modeling of natural language from a computational perspective, as well as the study of appropriate computational approaches to linguistic questions253.
Computational mathematics is the mathematical research in areas of science where computing plays an essential role254.
Computational neuroscience (also known as theoretical neuroscience or mathematical neuroscience) is a branch of neuroscience which employs mathematical models, theoretical analysis and abstractions of the brain to understand the principles that govern the development, structure, physiology, and cognitive abilities of the nervous system255,256.
Computational number theory (also algorithmic number theory) – the study of algorithms for performing number theoretic computations257,258.
Computational problem in theoretical computer science is a mathematical object representing a collection of questions that computers might be able to solve259.
Computational statistics (or statistical computing) is the application of computer science and software engineering principles to solving scientific problems. It involves the use of computing hardware, networking, algorithms, programming, databases and other domain-specific knowledge to design simulations of physical phenomena to run on computers. Computational science crosses disciplines and can even involve the humanities260,261.
Computer engineering — technologies for digital modeling and design of objects and production processes throughout the life cycle262.
Computer incident is a fact of violation and (or) cessation of the operation of a critical information infrastructure object, a telecommunication network used to organize the interaction of such objects, and (or) a violation of the security of information processed by such an object, including as a result of a computer attack263.
Computer science – the theory, experimentation, and engineering that form the basis for the design and use of computers. It involves the study of algorithms that process, store, and communicate digital information. A computer scientist specializes in the theory of computation and the design of computational systems. Computer science is the study of computation, automation, and information. Computer science spans theoretical disciplines (such as algorithms, theory of computation, and information theory) to practical disciplines (including the design and implementation of hardware and software). Computer science is generally considered an area of academic research and distinct from computer programming264.
Computer simulation is the process of mathematical modelling, performed on a computer, which is designed to predict the behaviour of, or the outcome of, a real-world or physical system. The reliability of some mathematical models can be determined by comparing their results to the real-world outcomes they aim to predict. Computer simulations have become a useful tool for the mathematical modeling of many natural systems in physics (computational physics), astrophysics, climatology, chemistry, biology and manufacturing, as well as human systems in economics, psychology, social science, health care and engineering265.
Computer vision (CV) is a scientific discipline, field of technology and direction of artificial intelligence (AI) which deals with computer processing, recognition, analysis and classification of dynamic images of reality. It is widely used in video surveillance systems, in robotics and in modern industry to improve product quality and production efficiency, comply with legal requirements, etc. In computer vision, the following areas are distinguished: face recognition, image recognition, augmented reality (AR) and optical character recognition (OCR). Synonyms – artificial vision, machine vision266.
Computer vision processing (CVP) is the processing of images (signals) in a computer vision system; the term also covers the algorithms (computer vision processing algorithms), processors (computer vision processing unit, CVPU) and convolutional neural networks that are used for image processing and for implementing visual functions in robotics, real-time systems, smart video surveillance systems, etc.267.
Computer-Aided Detection/Diagnosis (CAD) uses computer programs to assist radiologists in the interpretation of medical images. CAD systems process digital images for typical appearances and highlight suspicious regions in order to support a decision taken by a professional268.
Computer-automated design (CAutoD) – design automation usually refers to electronic design automation, or Design Automation which is a Product Configurator. Extending Computer-Aided Design (CAD), automated design and computer-automated design are concerned with a broader range of applications, such as automotive engineering, civil engineering, composite material design, control engineering, dynamic system identification and optimization, financial systems, industrial equipment, mechatronic systems, steel construction, structural optimisation, and the invention of novel systems. More recently, traditional CAD simulation is seen to be transformed to CAutoD by biologically inspired machine learning, including heuristic search techniques such as evolutionary computation, and swarm intelligence algorithms269.
Computing modules are plug-in specialized computers designed to solve narrowly focused tasks, such as accelerating the work of artificial neural networks algorithms, computer vision, voice recognition, machine learning and other artificial intelligence methods, built on the basis of a neural processor – a specialized class of microprocessors and coprocessors (processor, memory, data transfer).
Computing system is a software and hardware complex intended for solving problems and processing data (including calculations) or several interconnected complexes that form a single infrastructure270.
Computing units are blocks that work like a filter that transforms packets according to certain rules. The instruction set of the calculator can be limited, which guarantees a simple internal structure and a sufficiently high speed of operation271.
Concept drift in predictive analytics and machine learning means that the statistical properties of the target variable, which the model is trying to predict, change over time in unforeseen ways. This causes problems because the predictions become less accurate as time passes272.
Connectionism is an approach in the field of cognitive science that hopes to explain mental phenomena using artificial neural networks273,274.
Consistent heuristic in the study of path-finding problems in artificial intelligence, a heuristic function is said to be consistent, or monotone, if its estimate is always less than or equal to the estimated distance from any neighboring vertex to the goal, plus the cost of reaching that neighbor275.
Constrained conditional model (CCM) is a machine learning and inference framework that augments the learning of conditional (probabilistic or discriminative) models with declarative constraints276.
Constraint logic programming is a form of constraint programming, in which logic programming is extended to include concepts from constraint satisfaction. A constraint logic program is a logic program that contains constraints in the body of clauses277.
Constraint programming is a programming paradigm wherein relations between variables are stated in the form of constraints. Constraints differ from the common primitives of imperative programming languages in that they do not specify a step or sequence of steps to execute, but rather the properties of a solution to be found278.
Constructed language (also conlang) is a language whose phonology, grammar, and vocabulary are consciously devised, instead of having developed naturally. Constructed languages may also be referred to as artificial, planned, or invented languages279.
Control theory in control systems engineering, is a subfield of mathematics that deals with the control of continuously operating dynamical systems in engineered processes and machines. The objective is to develop a control model for controlling such systems using a control action in an optimum manner without delay or overshoot and ensuring control stability280.
Convolutional neural network (CNN, or ConvNet) in deep learning is a class of deep neural networks most commonly applied to analyzing visual imagery. CNNs use a variation of multilayer perceptrons designed to require minimal preprocessing. They are also known as shift invariant or space invariant artificial neural networks (SIANN), based on their shared-weights architecture and translation invariance characteristics: convolution kernels or filters slide over input features and provide translation-equivariant responses known as feature maps281.
Confidentiality of information is a mandatory requirement for a person who has access to certain information not to transfer such information to third parties without the consent of its owner282.
Confirmation Bias – the tendency to search for, interpret, favor, and recall information in a way that confirms one’s own beliefs or hypotheses while giving disproportionately less attention to information that contradicts it283.
Confusion matrix is a situational analysis table that summarizes the prediction results of a classification model in machine learning. The records in the dataset are summarized in a matrix according to the real category and the classification score made by the classification model284,285.
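A minimal confusion matrix for a binary classifier (the labels are hypothetical), counting how often each real class is predicted as each class:

```python
actual    = [1, 0, 1, 1, 0, 0, 1]
predicted = [1, 0, 0, 1, 0, 1, 1]

# matrix[i][j] = number of examples of real class i predicted as class j
matrix = [[0, 0], [0, 0]]
for a, p in zip(actual, predicted):
    matrix[a][p] += 1

print(matrix)  # [[TN, FP], [FN, TP]] = [[2, 1], [1, 3]]
```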
Consumer artificial intelligence is specialized artificial intelligence programs embedded in consumer devices and processes286.
Continuous feature is a floating-point feature with an infinite range of possible values. Contrast with discrete feature287,288.
Contributor is a human worker providing annotations (judgments) on the Appen data annotation platform289.
Convenience sampling – using a dataset not gathered scientifically in order to run quick experiments. Later on, it’s essential to switch to a scientifically gathered dataset290.
Convergence – informally, often refers to a state reached during training in which training loss and validation loss change very little or not at all with each iteration after a certain number of iterations. In other words, a model reaches convergence when additional training on the current data will not improve the model. In deep learning, loss values sometimes stay constant or nearly so for many iterations before finally descending, temporarily producing a false sense of convergence. See also early stopping291,292.
Convex function is a function in which the region above the graph of the function is a convex set. The prototypical convex function is shaped something like the letter U; by contrast, a function whose graph dips and rises so that the region above it is not a convex set is not convex.
A strictly convex function has exactly one local minimum point, which is also the global minimum point. The classic U-shaped functions are strictly convex functions. However, some convex functions (for example, straight lines) are not U-shaped. A lot of the common loss functions, including the following, are convex functions: L2 loss; Log Loss; L1 regularization; L2 regularization. Many variations of gradient descent are guaranteed to find a point close to the minimum of a strictly convex function. Similarly, many variations of stochastic gradient descent have a high probability (though not a guarantee) of finding a point close to the minimum of a strictly convex function. The sum of two convex functions (for example, L2 loss + L1 regularization) is a convex function. Deep models are never convex functions. Remarkably, algorithms designed for convex optimization tend to find reasonably good solutions on deep networks anyway, even though those solutions are not guaranteed to be a global minimum293,294.
Convex optimization – the process of using mathematical techniques such as gradient descent to find the minimum of a convex function. A great deal of research in machine learning has focused on formulating various problems as convex optimization problems and in solving those problems more efficiently. For complete details, see Boyd and Vandenberghe, Convex Optimization295.
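A minimal gradient-descent sketch on a strictly convex function, f(x) = (x - 3)^2 (a hypothetical example, not tied to any particular library):

```python
def grad(x):
    return 2 * (x - 3)   # derivative of f(x) = (x - 3)**2

x, learning_rate = 10.0, 0.1
for _ in range(100):
    x -= learning_rate * grad(x)   # step against the gradient

print(round(x, 4))  # converges to the global minimum at x = 3
```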
Convex set is a subset of Euclidean space such that a line drawn between any two points in the subset remains completely within the subset296.
Convolution — the process of filtering. A filter (or equivalently: a kernel or a template) is shifted over an input image. The pixels of the output image are the summed product of the values in the filter pixels and the corresponding values in the underlying image297.
Convolutional filter – one of the two actors in a convolutional operation. (The other actor is a slice of an input matrix). A convolutional filter is a matrix having the same rank as the input matrix, but a smaller shape298.
Convolutional layer is a layer of a deep neural network in which a convolutional filter passes along an input matrix299.
Convolutional neural network (CNN) is a type of neural network that identifies and interprets images300,301.
Convolutional operation – the following two-step mathematical operation: Element-wise multiplication of the convolutional filter and a slice of an input matrix. (The slice of the input matrix has the same rank and size as the convolutional filter); Summation of all the values in the resulting product matrix302.
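A sketch of the two-step operation described above, using NumPy with a hypothetical 3x3 input and 2x2 filter: element-wise multiplication of the filter with a slice of the input, then summation, repeated for every position of the slice.

```python
import numpy as np

image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])
kernel = np.array([[1, 0],
                   [0, -1]])

out_h = image.shape[0] - kernel.shape[0] + 1
out_w = image.shape[1] - kernel.shape[1] + 1
output = np.zeros((out_h, out_w))

for i in range(out_h):
    for j in range(out_w):
        patch = image[i:i + 2, j:j + 2]          # slice of the input matrix
        output[i, j] = np.sum(patch * kernel)    # element-wise product, then sum

print(output)  # [[-4. -4.], [-4. -4.]]
```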
Corelet programming environment (CPE) is a scalable environment that allows programmers to set the functional behavior of a neural network by adjusting its parameters and communication characteristics303.
Corpus of texts is a large dataset of written or spoken material that can be used to train a machine to perform linguistic tasks304.
Correlation analysis is a statistical data processing method that measures the strength of the relationship between two or more variables. Thus, it determines whether there is a connection between the phenomena and how strong the connection between these phenomena is305.
Correlation is a statistical relationship between two or more random variables306.
Cost – synonym for loss. A measure of how far a model’s predictions are from its label. Or, to put it more pessimistically, a measure of how bad a model is. To determine this value, the model must define a loss function. For example, linear regression models typically use the standard error for the loss function, while logistic regression models use the log loss307,308.
Co-training essentially amplifies independent signals into a stronger signal. For instance, consider a classification model that categorizes individual used cars as either Good or Bad. One set of predictive features might focus on aggregate characteristics such as the year, make, and model of the car; another set of predictive features might focus on the previous owner’s driving record and the car’s maintenance history. The seminal paper on co-training is Combining Labeled and Unlabeled Data with Co-Training by Blum and Mitchell309.
Counterfactual fairness is a fairness metric that checks whether a classifier produces the same result for one individual as it does for another individual who is identical to the first, except with respect to one or more sensitive attributes. Evaluating a classifier for counterfactual fairness is one method for surfacing potential sources of bias in a model. See «When Worlds Collide: Integrating Different Counterfactual Assumptions in Fairness» for a more detailed discussion of counterfactual fairness310.
Coverage bias – this bias means that the study sample is not representative and that some of the data has zero chance of being included in the sample311.
Crash blossom is a sentence or phrase with an ambiguous meaning. Crash blossoms present a significant problem in natural language understanding. For example, the headline Red Tape Holds Up Skyscraper is a crash blossom because an NLU model could interpret the headline literally or figuratively312.
Critic – synonym for Deep Q-Network313.
Critical information infrastructure – objects of critical information infrastructure, as well as telecommunication networks used to organize the interaction of such objects314.
Critical information infrastructure of the Russian Federation is a set of critical information infrastructure objects, as well as telecommunication networks used to organize the interaction of critical information infrastructure objects with each other315.
Cross-entropy is a generalization of Log Loss to multi-class classification problems. Cross-entropy quantifies the difference between two probability distributions. See also perplexity316.
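A small sketch of cross-entropy for a single multi-class example (the probabilities are hypothetical): when the true distribution is one-hot, only the predicted probability of the true class contributes.

```python
import math

true_dist = [0, 0, 1]            # one-hot label: class 2 is correct
predicted = [0.1, 0.2, 0.7]      # model's predicted probabilities (hypothetical)

cross_entropy = -sum(t * math.log(p) for t, p in zip(true_dist, predicted) if p > 0)
print(round(cross_entropy, 4))   # -log(0.7) ≈ 0.3567
```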
Crossover (also recombination) in genetic algorithms and evolutionary computation, a genetic operator used to combine the genetic information of two parents to generate new offspring. It is one way to stochastically generate new solutions from an existing population, and analogous to the crossover that happens during sexual reproduction in biological organisms. Solutions can also be generated by cloning an existing solution, which is analogous to asexual reproduction. Newly generated solutions are typically mutated before being added to the population317.
Cross-Validation (k-fold Cross-Validation, Leave-p-out Cross-Validation) is a collection of processes designed to evaluate how the results of a predictive model will generalize to new data sets. k-fold Cross-Validation; Leave-p-out Cross-Validation318.
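A minimal k-fold split sketch in pure Python (the data and k=3 are hypothetical): each fold is held out once as the evaluation set while the rest is used for training.

```python
data = list(range(9))   # hypothetical dataset of 9 examples
k = 3
fold_size = len(data) // k

for i in range(k):
    validation = data[i * fold_size:(i + 1) * fold_size]
    training = data[:i * fold_size] + data[(i + 1) * fold_size:]
    print(f"fold {i}: train={training} validate={validation}")
```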
Cryogenic freezing (cryonics, human cryopreservation) is a technology of preserving in a state of deep cooling (using liquid nitrogen) the head or body of a person after his death with the intention to revive them in the future319.
Cyber-physical systems are intelligent networked systems with built-in sensors, processors and drives that are designed to interact with the physical environment and support the operation of computer information systems in real time320.
«D»
Darkforest is a computer go program, based on deep learning techniques using a convolutional neural network. Its updated version Darkforest2 combines the techniques of its predecessor with Monte Carlo tree search. The MCTS effectively takes tree search methods commonly seen in computer chess programs and randomizes them. With the update, the system is known as Darkforest3321.
Dartmouth workshop – the Dartmouth Summer Research Project on Artificial Intelligence was the name of a 1956 summer workshop now considered by many (though not all) to be the seminal event for artificial intelligence as a field322.
Data analysis is obtaining an understanding of data by considering samples, measurement, and visualization. Data analysis can be particularly useful when a dataset is first received, before one builds the first model. It is also crucial in understanding experiments and debugging problems with the system323.
Data analytics is the science of analyzing raw data to make conclusions about that information. Many of the techniques and processes of data analytics have been automated into mechanical processes and algorithms that work over raw data for human consumption324.
Data augmentation in data analysis refers to techniques used to increase the amount of data. It helps reduce overfitting when training a machine learning model325.
Data Cleaning is the process of identifying, correcting, or removing inaccurate or corrupt data records326.
Data Curation – includes the processes related to the organization and management of data which is collected from various sources327.
Data entry – the process of converting verbal or written responses to electronic form328.
Data fusion — the process of integrating multiple data sources to produce more consistent, accurate, and useful information than that provided by any individual data source329.
Data Integration involves the combination of data residing in different resources and then the supply in a unified view to the users. Data integration is in high demand for both commercial and scientific domains in which they need to merge the data and research results from different repositories330.
Data is a collection of qualitative and quantitative variables. It contains the information that is represented numerically and needs to be analyzed.
Data Lake is a type of data repository that stores data in its natural format and relies on various schemata and structure to index the data331.
Data markup is the stage of processing structured and unstructured data during which data (including text documents, photo and video images) are assigned identifiers that reflect the type of data (data classification), and (or) data is interpreted to solve a specific problem, including using machine learning methods (National Strategy for the Development of Artificial Intelligence for the period up to 2030)332.
Data Mining is the process of data analysis and information extraction from large datasets using machine learning, statistical approaches, and many other methods333.
Data parallelism is a way of scaling training or inference that replicates an entire model onto multiple devices and then passes a subset of the input data to each device. Data parallelism can enable training and inference on very large batch sizes; however, data parallelism requires that the model be small enough to fit on all devices. See also model parallelism334.
Data Processing Unit (DPU) is a programmable specialized electronic circuit with hardware accelerated data processing for data-oriented computing335.
Data protection is the process of protecting data; it involves the relationship between the collection and dissemination of data and technology, the public perception and expectation of privacy, and the political and legal underpinnings surrounding that data. It aims to strike a balance between individual privacy rights and the use of data for business purposes336.
Data Refinement is used to convert an abstract data model (for example, one expressed in terms of sets) into implementable data structures such as arrays337.
Data Science is a broad grouping of mathematics, statistics, probability, computing and data visualization to extract knowledge from a heterogeneous set of data (images, sound, text, genomic data, social network links, physical measurements, etc.). The methods and tools derived from artificial intelligence are part of this family338,339.
Data set is a set of data that has undergone preliminary preparation (processing) in accordance with the requirements of the legislation of the Russian Federation on information, information technology and information protection and is necessary for the development of software based on artificial intelligence (National strategy for the development of artificial intelligence for the period up to 2030)340.
Data Streaming Accelerator (DSA) is a device that performs a specific task, which in this case is transferring data in less time than the CPU would. What makes a DSA special is that it is designed around one of the capabilities that Compute Express Link brings on top of PCI Express 5.0: providing coherent access to RAM for all peripherals connected to a PCI Express port, i.e., they use the same memory addresses.
Data variability describes how far apart data points lie from each other and from the center of a distribution. Along with measures of central tendency, measures of variability give you descriptive statistics that summarize your data341.
Data veracity is the degree of accuracy or truthfulness of a data set. In the context of big data, it’s not just the quality of the data that is important, but how trustworthy the source, the type, and the processing of the data are342.
Data Warehouse is typically an offline copy of production databases and copies of files in a non-production environment343.
Database is a «container» storing data such as numbers, dates or words, which can be reprocessed by computer means to produce information; for example, numbers and names assembled and sorted to form a directory344.
DataFrame is a popular datatype for representing datasets in pandas. A DataFrame is analogous to a table. Each column of the DataFrame has a name (a header), and each row is identified by a number345.
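A minimal pandas sketch for illustration; the column names and values are arbitrary example data.

# A small pandas DataFrame: columns have names, rows are indexed by number.
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "age": [34, 29, 41],
})
print(df.head())         # first rows of the table
print(df["age"].mean())  # column-wise operations, here the average age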
Datalog is a declarative logic programming language that syntactically is a subset of Prolog. It is often used as a query language for deductive databases. In recent years, Datalog has found new application in data integration, information extraction, networking, program analysis, security, and cloud computing346.
Datamining – the discovery, interpretation, and communication of meaningful patterns in data347.
Dataset API (tf.data) is a high-level TensorFlow API for reading data and transforming it into a form that a machine learning algorithm requires. A tf.data.Dataset object represents a sequence of elements, in which each element contains one or more Tensors. A tf.data.Iterator object provides access to the elements of a Dataset. For details about the Dataset API, see Importing Data in the TensorFlow Programmer’s Guide348.
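An illustrative sketch; the pipeline below uses the eager iteration style of TensorFlow 2.x, while the explicit tf.data.Iterator object mentioned above belongs to the older TensorFlow 1.x API. The toy values and transformations are arbitrary.

# Reading and transforming data with the tf.data Dataset API (simplified sketch).
import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5])
dataset = dataset.map(lambda x: x * 2).shuffle(buffer_size=5).batch(2)
for batch in dataset:        # in TensorFlow 2.x datasets are iterated eagerly
    print(batch.numpy())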
Debugging is the process of finding and resolving bugs (defects or problems that prevent correct operation) within computer programs, software, or systems. Debugging tactics can involve interactive debugging, control flow analysis, unit testing, integration testing, log file analysis, monitoring at the application or system level, memory dumps, and profiling. Many programming languages and software development tools also offer programs to aid in debugging, known as debuggers349.
Decentralized applications (dApps) are digital applications or programs that exist and run on a blockchain or peer-to-peer (P2P) network of computers instead of a single computer. DApps (also called «dapps») are outside the purview and control of a single authority. DApps – which are often built on the Ethereum platform – can be developed for a variety of purposes including gaming, finance, and social media350.
Decentralized control is a process in which a significant number of control actions related to a given object are generated by the object itself on the basis of self-government351.
Decision boundary – the separator between classes learned by a model in binary or multi-class classification problems352.
Decision boundary – in the case of backpropagation-based artificial neural networks or perceptrons, the type of decision boundary that the network can learn is determined by the number of hidden layers the network has. If it has no hidden layers, then it can only learn linear problems. If it has one hidden layer, then it can learn any continuous function on compact subsets of Rn, as shown by the universal approximation theorem, and thus it can have an arbitrary decision boundary.
Decision intelligence (DI) is a practical discipline used to improve the decision making process by clearly understanding and programmatically developing how decisions are made and how the outcomes are evaluated, managed and improved through feedback.
Decision intelligence is a discipline that offers a framework to assist data and analytics practitioners in developing, modeling, aligning, implementing, tracking, and modifying decision models and processes related to business results and performance353.
Decision support system (DSS) is an information system that supports business or organizational decision-making activities. DSSs serve the management, operations and planning levels of an organization (usually mid and higher management) and help people make decisions about problems that may be rapidly changing and not easily specified in advance – i.e., unstructured and semi-structured decision problems. Decision support systems can be either fully computerized or human-powered, or a combination of both354.
Decision theory (also theory of choice) – the study of the reasoning underlying an agent’s choices. Decision theory can be broken into two branches: normative decision theory, which gives advice on how to make the best decisions given a set of uncertain beliefs and a set of values, and descriptive decision theory which analyzes how existing, possibly irrational agents actually make decisions355.
Decision threshold – the cut-off point used to classify observations. Observations with predicted values greater than the classification cutoff are classified as positive, and those with predicted values less than the cutoff are classified as negative356.
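For illustration, a minimal numpy sketch; the probabilities and the 0.5 cutoff are arbitrary example values.

# Applying a decision threshold to predicted probabilities (illustrative).
import numpy as np

probabilities = np.array([0.15, 0.62, 0.48, 0.91])
threshold = 0.5
labels = (probabilities >= threshold).astype(int)  # 1 = positive, 0 = negative
print(labels)  # [0 1 0 1]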
Decision tree is a tree-and-branch model used to represent decisions and their possible consequences, similar to a flowchart357.
Decision tree learning – uses a decision tree (as a predictive model) to go from observations about an item (represented in the branches) to conclusions about the item’s target value (represented in the leaves). It is one of the predictive modeling approaches used in statistics, data mining and machine learning358.
Decision Tree uses a tree-like graph or model as a structure to perform decision analysis. It uses each node to represent a test on an attribute, each branch to represent the outcome of the test, and each leaf node to represent a class label359,360,361.
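As a hedged illustration, a minimal scikit-learn sketch; the iris dataset and the depth limit are placeholder choices.

# Training a decision tree classifier (minimal sketch).
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=3)  # each internal node tests one attribute
tree.fit(X, y)
print(tree.predict(X[:5]))  # leaf nodes assign class labels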
Declarative programming is a programming paradigm – a style of building the structure and elements of computer programs – that expresses the logic of a computation without describing its control flow362,363.
Decoder in general, any ML system that converts from a processed, dense, or internal representation to a more raw, sparse, or external representation. Decoders are often a component of a larger model, where they are frequently paired with an encoder. In sequence-to-sequence tasks, a decoder starts with the internal state generated by the encoder to predict the next sequence. Refer to Transformer for the definition of a decoder within the Transformer architecture364.
Decompression – used to restore data to uncompressed form after compression365.
Deductive classifier is a type of artificial intelligence inference engine. It takes as input a set of declarations in a frame language about a domain such as medical research or molecular biology366.
Deductive Reasoning, also known as logical deduction, is a reasoning method that relies on premises to reach a logical conclusion. It works in a top-down manner, in which the final conclusion is obtained by reducing the general rules that hold the entire domain until only the conclusion is left367.
Deep Blue was a chess supercomputer developed by IBM. It was the first computer chess player to beat the world champion Garry Kasparov, after a six-game match in 1997368.
Deep Learning (DL) is a subfield of machine learning concerned with algorithms inspired by the hierarchical way the human brain works. Deep Learning models, which are mostly based on (artificial) neural networks, have been applied to different fields, such as speech recognition, computer vision, and natural language processing369.
Deep model is a type of neural network containing multiple hidden layers. Contrast with wide model370.
Deep neural network is a multilayer network containing several (many) hidden layers of neurons between the input and output layers, which allows modeling complex nonlinear relationships. Deep neural networks are now increasingly used to solve such artificial intelligence problems as speech recognition, natural language processing, computer vision, etc., including in robotics371.
Deep Q-Network (DQN) in Q-learning, is a deep neural network that predicts Q-functions. Critic is a synonym for Deep Q-Network372.
DeepMind is an artificial intelligence company founded in 2010 and acquired by Google in 2014. DeepMind developed the AlphaGo program, which beat a professional human Go player for the first time373,374.
Default logic is a non-monotonic logic proposed by Raymond Reiter to formalize reasoning with default assumptions375.
Degree of maturity is the degree of clarity (precision) with which a specific technological process is defined, managed, measured, controlled and implemented376.
Demographic parity is a fairness metric that is satisfied if the results of a model’s classification are not dependent on a given sensitive attribute377.
Denoising is the machine vision task of removing noise from an image. It is a common supervised learning approach in which noise is artificially added to the dataset and the system learns to remove it on its own378.
Dense feature is a feature in which most values are non-zero, typically a Tensor of floating-point values. Contrast with sparse feature379.
Dense layer – synonym for fully connected layer380.
Depersonalization of personal data – actions, as a result of which it becomes impossible, without the use of additional information, to determine the ownership of personal data by a specific subject of personal data381,382.
Depth – the number of layers (including any embedding layers) in a neural network that learn weights. For example, a neural network with 5 hidden layers and 1 output layer has a depth of 6383.
Depthwise separable convolutional neural network (sepCNN) is a convolutional neural network architecture based on Inception, but where Inception modules are replaced with depthwise separable convolutions. Also known as Xception. A depthwise separable convolution (also abbreviated as separable convolution) factors a standard 3-D convolution into two separate convolution operations that are more computationally efficient: first, a depthwise convolution, with a depth of 1 (n ✕ n ✕ 1), and then second, a pointwise convolution, with length and width of 1 (1 ✕ 1 ✕ n). To learn more, see Xception: Deep Learning with Depthwise Separable Convolutions384.
Description logic is a family of formal knowledge representation languages. Many DLs are more expressive than propositional logic but less expressive than first-order logic. In contrast to the latter, the core reasoning problems for DLs are (usually) decidable, and efficient decision procedures have been designed and implemented for these problems. There are general, spatial, temporal, spatiotemporal, and fuzzy description logics, and each description logic features a different balance between DL expressivity and reasoning complexity by supporting different sets of mathematical constructors385.
Design Center is an organizational unit (the entire organization or its subdivision) that performs a full range or part of the work on creating products up to the stage of its mass production, and also has the necessary personnel, equipment and technologies for this386.
Developmental robotics (DevRob) (also epigenetic robotics) is a scientific field which aims at studying the developmental mechanisms, architectures, and constraints that allow lifelong and open-ended learning of new skills and new knowledge in embodied machines387.
Device is a category of hardware that can run a TensorFlow session, including CPUs, GPUs, and TPUs388.
DevOps (development & operations) is a set of practices, tools, and cultural philosophies that automate and integrate the processes of software development teams and IT teams. DevOps emphasizes team empowerment, collaboration and communication, and technology automation. The term DevOps is also understood as a special approach to organizing development teams. Its essence is that developers, testers and administrators work as a single stream – they are not each responsible for their own stage, but work together on the release of the product and try to automate the tasks of their departments so that the code moves between stages without delay. In DevOps, responsibility for the result is distributed among the entire team389,390.
Diagnosis is concerned with the development of algorithms and techniques that are able to determine whether the behaviour of a system is correct. If the system is not functioning correctly, the algorithm should be able to determine, as accurately as possible, which part of the system is failing and which kind of fault it is facing. The computation is based on observations, which provide information on the current behaviour391.
Dialogflow (formerly API.AI) is a platform that allows users to build brand-unique, natural language interactions for bots, applications, services, and devices. It features Natural Language Understanding tools to design unique conversation scenarios, design corresponding actions and analyze interactions with users392.
Dialogue system (also conversational agent (CA)) is a computer system intended to converse with a human with a coherent structure. Dialogue systems have employed text, speech, graphics, haptics, gestures, and other modes for communication on both the input and output channel393.
Dice coefficient is a measure to compare the similarity of two segmentations, e.g., by expert and by machine. It is the ratio of twice the number of common pixels to the sum of all pixels in both sets.
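In formula form, Dice = 2|A∩B| / (|A| + |B|). A minimal numpy sketch; the two binary masks are toy examples.

# Dice coefficient for two binary segmentation masks (illustrative).
import numpy as np

def dice(mask_a, mask_b):
    intersection = np.logical_and(mask_a, mask_b).sum()
    return 2.0 * intersection / (mask_a.sum() + mask_b.sum())

a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
print(dice(a, b))  # 2*2 / (3 + 3) = 0.666...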
Dictation – speech (voice) text input.
Dictation system is a system for speech text input.
Digital Body Language encompasses all the digital activities performed by an individual. Every time a person performs a Google search, visits a web page, opens a newsletter or downloads a guide, they contribute to their digital body language. Digital body language is used in building marketing automation394.
Digital divide is a concept that has become especially widespread in the last decade due to the increased importance of introducing new digital technologies in society and overcoming existing differences in the field of information and knowledge that hinder the development of basic economic and social infrastructures, in particular the energy sector, telecommunications and education395.
Digital educational environment is an open set of information systems designed to support various tasks of the educational process. The word «open» means the ability and the right to use different information systems as part of the digital educational environment, replace them or add new ones at your own discretion396.
Digital ethics is a form of ethics that includes systems of values and moral principles of electronic interaction between people, organizations and things.
Digital platform is a group of technologies that are used as a basis for creating a specific and specialized system of digital interaction397.
Digital rights are the rights of individuals as it pertains to computer access and the ability to use, create and publish digital media. Digital rights can also refer to allowed permissions for fair use of digital copyrighted materials. Digital rights are extensions of human rights like freedom of expression and the right to privacy. The extent to which digital rights are recognized varies from country to country, but Internet access is a recognized right in several countries398.
Digital Social Innovation (DSI) is innovation that uses digital technologies to enable or help carry out social innovation399.
Digital society (Global information society) is a new world knowledge society that exists and interacts, and is also closely integrated into a fundamentally and qualitatively new digital social, economic and cultural ecosystem, in which the free exchange of information and knowledge is implemented using artificial intelligence, augmented and virtual reality, which are additional interfaces for the interaction of people and machines (computers, robots, wearable devices, etc.)400.
Digital transformation is the process of integrating digital technologies into all aspects of activity, requiring fundamental changes in technology, culture, operations and the principles of creating new products and services401.
Digital transformation of the economy is a continuous and dynamically changing process of development, implementation and development of innovations and new technologies in all its sectors, which fundamentally affects the socio-economic and cultural development of the information society.
Digitalization is a new stage in the automation and informatization of economic activity and public administration, the process of transition to digital technologies, which is based not only on the use of information and communication technologies to solve production or management problems, but also on the accumulation and analysis of big data with their help in order to predict situation, optimization of processes and costs, attraction of new contractors, etc.402.
Dimension reduction – decreasing the number of dimensions used to represent a particular feature in a feature vector, typically by converting to an embedding403.
Dimensionality reduction (also dimension reduction) – the process of reducing the number of random variables under consideration by obtaining a set of principal variables. It can be divided into feature selection and feature extraction404.
Dimensionality reduction is a learning technique used when the number of features (or dimensions) in a given dataset is too high. It reduces the number of data inputs to a manageable size while also preserving the data integrity. Often, this technique is used in the preprocessing data stage, such as when autoencoders remove noise from visual data to improve picture quality405.
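As a hedged illustration, a minimal PCA sketch in scikit-learn; PCA is only one of many dimensionality-reduction techniques, and the iris dataset is a placeholder.

# Reducing 4 input features to 2 principal components (illustrative).
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)     # 4 features per sample
pca = PCA(n_components=2)             # keep the 2 principal components
X_reduced = pca.fit_transform(X)      # shape (150, 2)
print(pca.explained_variance_ratio_)  # how much variance each component preserves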
Dimensions – the maximum number of linearly independent vectors contained in the space406,407.
Directed Acyclic Graph (DAG) in computer science and mathematics, a directed acyclic graph is a finite directed graph with no directed cycles. It consists of finitely many vertices and edges, with each edge directed from one vertex to another, such that there is no way to start at any vertex and follow a consistently directed sequence of edges that eventually loops back to that starting vertex again408.
Disaster tolerance is the ability of a system to restore an application on an alternate cluster when the primary cluster fails. Disaster tolerance is based on data replication and failover. Data replication is the copying of data from a primary cluster to a backup or secondary cluster409.
Disclosure of information constituting a commercial secret is an action or inaction as a result of which information constituting a commercial secret, in any possible form (oral, written, other form, including using technical means) becomes known to third parties without the consent of the owner of such information, or contrary to an employment or civil law contract410.
Discrete feature is a feature with a finite set of possible values. For example, a feature whose values may only be animal, vegetable, or mineral is a discrete (or categorical) feature. Contrast with continuous feature411.
Discrete system is any system with a countable number of states. Discrete systems may be contrasted with continuous systems, which may also be called analog systems. A final discrete system is often modeled with a directed graph and is analyzed for correctness and complexity according to computational theory. Because discrete systems have a countable number of states, they may be described in precise mathematical models. A computer is a finite state machine that may be viewed as a discrete system. Because computers are often used to model not only other discrete systems but continuous systems as well, methods have been developed to represent real-world continuous systems as discrete systems. One such method involves sampling a continuous signal at discrete time intervals412.
Discriminative model is a model that predicts labels from a set of one or more features. More formally, discriminative models define the conditional probability of an output given the features and weights; that is, P(output | features, weights). For example, a model that predicts whether an email is spam from features and weights is a discriminative model. The vast majority of supervised learning models, including classification and regression models, are discriminative models. Contrast with generative model413.
Discriminator is a system that determines whether examples are real or fake. The subsystem within a generative adversarial network that determines whether the examples created by the generator are real or fake414.
Disparate impact – making decisions about people that impact different population subgroups disproportionately. This usually refers to situations where an algorithmic decision-making process harms or benefits some subgroups more than others415.
Disparate treatment – factoring subjects’ sensitive attributes into an algorithmic decision-making process such that different subgroups of people are treated differently416.
Dissemination of information – actions aimed at obtaining information by an indefinite circle of persons or transferring information to an indefinite circle of persons417.
Dissemination of personal data – actions aimed at disclosing personal data to an indefinite circle of persons418.
Distributed artificial intelligence (DAI) (also decentralized artificial intelligence) is a subfield of artificial intelligence research dedicated to the development of distributed solutions for problems. DAI is closely related to and a predecessor of the field of multi-agent systems419.
Distributed registry technologies (Blockchain) are algorithms and protocols for decentralized storage and processing of transactions structured as a sequence of linked blocks without the possibility of their subsequent change420.
Distribution series are series of absolute and relative numbers that characterize the distribution of population units according to a qualitative (attributive) or quantitative attribute. Distribution series built on a quantitative basis are called variational421.
Divisive clustering – see hierarchical clustering422,423.
Documentation generically, any information on the structure, contents, and layout of a data file. Sometimes called «technical documentation» or «a codebook». Documentation may be considered a specialized form of metadata424.
Documented information – information recorded on a material carrier by means of documentation with details that make it possible to determine such information, or, in cases established by the legislation of the Russian Federation, its material carrier425.
Downsampling – an overloaded term that can mean either of the following: (1) reducing the amount of information in a feature in order to train a model more efficiently, for example, downsampling high-resolution images to a lower-resolution format before training an image recognition model; (2) training on a disproportionately low percentage of over-represented class examples in order to improve model training on under-represented classes. For example, in a class-imbalanced dataset, models tend to learn a lot about the majority class and not enough about the minority class; downsampling helps balance the amount of training on the majority and minority classes426.
Driver is computer software that allows other software (the operating system) to access the hardware of a device427.
Drone – unmanned aerial vehicle (unmanned aerial system)428.
Dropout regularization is a form of regularization useful in training neural networks. Dropout regularization works by removing a random selection of a fixed number of the units in a network layer for a single gradient step. The more units dropped out, the stronger the regularization429.
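A hedged sketch of how this is typically expressed in Keras; the layer sizes and the 0.5 rate are arbitrary example values.

# A small model with a Dropout layer between two Dense layers (illustrative).
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),   # randomly zeroes 50% of the units each step
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")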
Dynamic epistemic logic (DEL) is a logical framework dealing with knowledge and information change. Typically, DEL focuses on situations involving multiple agents and studies how their knowledge changes when events occur430.
Dynamic model is a model that is trained online in a continuously updating fashion. That is, data is continuously entering the model431,432.
«E»
Eager execution is a TensorFlow programming environment in which operations run immediately. By contrast, operations called in graph execution don’t run until they are explicitly evaluated. Eager execution is an imperative interface, much like the code in most programming languages. Eager execution programs are generally far easier to debug than graph execution programs433.
Eager learning is a learning method in which the system tries to construct a general, input-independent target function during training of the system, as opposed to lazy learning, where generalization beyond the training data is delayed until a query is made to the system434.
Early stopping is a method for regularization that involves ending model training before training loss finishes decreasing. In early stopping, you end model training when the loss on a validation dataset starts to increase, that is, when generalization performance worsens435.
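A minimal sketch with the Keras EarlyStopping callback; the monitored metric, the patience value and the commented-out training call are illustrative assumptions.

# Stop training when the validation loss stops improving (sketch).
import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",         # watch the validation loss
    patience=3,                 # stop after 3 epochs without improvement
    restore_best_weights=True,  # roll back to the best epoch
)
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=100, callbacks=[early_stop])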
Earth mover’s distance (EMD) is a measure of the relative similarity between two documents. The lower the value, the more similar the documents436.
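For the general one-dimensional case, SciPy provides a ready-made function; the document-similarity use mentioned above applies the same distance in a word-embedding space, which is not shown in this sketch.

# Earth mover's (Wasserstein) distance between two 1-D samples (illustrative).
from scipy.stats import wasserstein_distance

d = wasserstein_distance([0.0, 1.0, 3.0], [5.0, 6.0, 8.0])
print(d)  # the smaller the value, the more similar the distributions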
Ebert test is a test which gauges whether a computer-based synthesized voice can tell a joke with sufficient skill to cause people to laugh. It was proposed by film critic Roger Ebert at the 2011 TED conference as a challenge to software developers to have a computerized voice master the inflections, delivery, timing, and intonations of a speaking human. The test is similar to the Turing test proposed by Alan Turing in 1950 as a way to gauge a computer’s ability to exhibit intelligent behavior by generating performance indistinguishable from a human being437.
Echo state network (ESN) is a recurrent neural network with a sparsely connected hidden layer (typically about 1% connectivity). The connectivity and weights of hidden neurons are fixed and randomly assigned. The weights of output neurons can be learned so that the network can (re)produce specific temporal patterns. The main interest of this network is that although its behaviour is non-linear, the only weights modified during training are the synapses that connect the hidden neurons to the output neurons. Thus, the error function is quadratic with respect to the parameter vector and can be minimized easily by solving a linear system438.
Ecosystem of the digital economy is a partnership of organizations that ensures the constant interaction of their technological platforms, applied Internet services, analytical systems, information systems of state authorities of the Russian Federation, organizations and citizens439.
Edge computing is a subspecies of distributed computing in which information processing takes place in close proximity to the place where the data was received and will be consumed (for example, using phones and other consumer devices)440.
Electronic circuit is a product, a combination of individual electronic components, such as resistors, capacitors, diodes, transistors and integrated circuits, interconnected to perform a task, or a circuit diagram drawn with conventional symbols441,442.
Electronic Data Interchange (EDI) is a series of standards and conventions for the transfer of structured digital information between organizations, based on certain regulations and formats of transmitted messages443.
Electronic government (e-Government) is a package of technologies and a set of related organizational measures, regulatory and legal support for organizing digital interaction between public authorities of various branches of government, citizens, organizations and other economic entities444.
Electronic industry is a set of organizations that perform scientific, technological and other work in the field of development, production, maintenance of operation, as well as providing services related to electronic and microelectronic products, respectively445.
Electronic Medical Record (EMR) is electronic health record, is the systematized collection of patient and population electronically stored health information in a digital format. These records can be shared across different healthcare settings446.
Electronic state is a way of implementing the information aspects of state activity based on the use of IT systems, as well as a new type of state based on the use of this technology. In the Russian Federation, activities to create an «electronic state» are carried out within the framework of the federal target program «Electronic Russia»447,448.
Eli5 is a Python library that is used to debug and visualize machine learning models. By default, it supports several machine learning frameworks – Scikit-learn, XGBoost, LightGBM, CatBoost, lightning, Keras and so on. Eli5 also provides LIME and Permutation Importance models to test machine learning pipelines as black boxes449.
ELIZA effect is a term used to discuss progressive artificial intelligence. It is the idea that people may falsely attach meanings of symbols or words that they ascribe to artificial intelligence in technologies450.
Embedding (Word Embedding) is one instance of some mathematical structure contained within another instance, such as a group that is a subgroup of another group451.
Embedding space – the d-dimensional vector space that features from a higher-dimensional vector space are mapped to. Ideally, the embedding space contains a structure that yields meaningful mathematical results; for example, in an ideal embedding space, addition and subtraction of embeddings can solve word analogy tasks. The dot product of two embeddings is a measure of their similarity452.
Embeddings – a categorical feature represented as a continuous-valued feature. Typically, an embedding is a translation of a high-dimensional vector into a low-dimensional space453.
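A hedged sketch of an embedding lookup with a Keras Embedding layer; the vocabulary size, vector dimension and token ids are arbitrary example values.

# Map integer token ids to dense low-dimensional vectors (illustrative).
import tensorflow as tf

embedding = tf.keras.layers.Embedding(input_dim=10000, output_dim=64)
word_ids = tf.constant([[3, 17, 256]])   # a toy sequence of token ids
vectors = embedding(word_ids)            # shape (1, 3, 64)
print(vectors.shape)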
Embodied agent (also interface agent) is an intelligent agent that interacts with the environment through a physical body within that environment. Agents that are represented graphically with a body, for example a human or a cartoon animal, are also called embodied agents, although they have only virtual, not physical, embodiment454.
Embodied cognitive science is an interdisciplinary field of research, the aim of which is to explain the mechanisms underlying intelligent behavior. It comprises three main methodologies: 1) the modeling of psychological and biological systems in a holistic manner that considers the mind and body as a single entity, 2) the formation of a common set of general principles of intelligent behavior, and 3) the experimental use of robotic agents in controlled environments455.
Empirical risk minimization (ERM) – choosing the function that minimizes loss on the training set. Contrast with structural risk minimization456,457.
Encoder in general, is any system that converts from a raw, sparse, or external representation into a more processed, denser, or more internal representation. Encoders are often a component of a larger model, where they are frequently paired with a decoder. Some Transformers pair encoders with decoders, though other Transformers use only the encoder or only the decoder. Some systems use the encoder’s output as the input to a classification or regression network. In sequence-to-sequence tasks, an encoder takes an input sequence and returns an internal state (a vector). Then, the decoder uses that internal state to predict the next sequence. Refer to Transformer for the definition of an encoder in the Transformer architecture458.
Encryption is the reversible transformation of information in order to hide it from unauthorized persons while at the same time providing authorized users with access to it459,460.
End-to-end digital technologies is a set of technologies that are part of the digital economy: big data, neurotechnologies and artificial intelligence, distributed registry systems, quantum technologies, new production technologies, industrial Internet, robotics and sensor components, wireless communication technologies, virtual and augmented reality technologies461.
Energy Efficiency – from both economic and environmental points of view, it is important to minimize the energy costs of both training and running an agent or model.
Ensemble averaging in machine learning, particularly in the creation of artificial neural networks, is the process of creating multiple models and combining them to produce a desired output, as opposed to creating just one model462.
Ensemble is a merger of the predictions of multiple models. You can create an ensemble via one or more of the following: different initializations; different hyperparameters; different overall structure. Deep and wide models are a kind of ensemble463.
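A hedged sketch of one way to build such an ensemble with scikit-learn; the member models, the soft-voting choice and the iris dataset are placeholder assumptions.

# A simple voting ensemble of differently configured models (illustrative).
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
ensemble = VotingClassifier(
    estimators=[
        ("logreg", LogisticRegression(max_iter=1000)),
        ("tree", DecisionTreeClassifier(max_depth=3)),
    ],
    voting="soft",  # average the predicted probabilities of the members
)
ensemble.fit(X, y)
print(ensemble.predict(X[:5]))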
Enterprise Imaging has been defined as «a set of strategies, initiatives and workflows implemented across a healthcare enterprise to consistently and optimally capture, index, manage, store, distribute, view, exchange, and analyze all clinical imaging and multimedia content to enhance the electronic health record» by members of the HIMSS-SIIM Enterprise Imaging Workgroup464.
Entity annotation – the process of labeling unstructured sentences with information so that a machine can read them. This could involve labeling all people, organizations and locations in a document, for example465.
Entity extraction is an umbrella term referring to the process of adding structure to data so that a machine can read it. Entity extraction may be done by humans or by a machine learning model466.
Entropy — the average amount of information conveyed by a stochastic source of data467.
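In formula form, H(X) = -Σ p(x) log2 p(x). A minimal numpy sketch; the example distributions are arbitrary.

# Shannon entropy of a discrete probability distribution (illustrative).
import numpy as np

def entropy(probabilities):
    p = np.asarray(probabilities)
    p = p[p > 0]                  # ignore zero-probability outcomes
    return -np.sum(p * np.log2(p))

print(entropy([0.5, 0.5]))        # 1.0 bit for a fair coin
print(entropy([0.9, 0.1]))        # about 0.469 bits: a biased coin is less surprising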
Environment in reinforcement learning, the world that contains the agent and allows the agent to observe that world’s state. For example, the represented world can be a game like chess, or a physical world like a maze. When the agent applies an action to the environment, then the environment transitions between states468.
Episode in reinforcement learning, is each of the repeated attempts by the agent to learn an environment469.
Epoch in the context of training Deep Learning models, is one pass of the full training data set470,471.
Epsilon greedy policy in reinforcement learning, is a policy that either follows a random policy with epsilon probability or a greedy policy otherwise. For example, if epsilon is 0.9, then the policy follows a random policy 90% of the time and a greedy policy 10% of the time472.
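A minimal sketch of this policy; the Q-values and the epsilon value are arbitrary example numbers.

# Epsilon-greedy action selection (illustrative).
import numpy as np

def epsilon_greedy(q_values, epsilon):
    if np.random.rand() < epsilon:          # explore with probability epsilon
        return np.random.randint(len(q_values))
    return int(np.argmax(q_values))         # otherwise act greedily

action = epsilon_greedy(q_values=[0.1, 0.5, 0.2], epsilon=0.9)
print(action)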
Equality of opportunity is a fairness metric that checks whether, for a preferred label (one that confers an advantage or benefit to a person) and a given attribute, a classifier predicts that preferred label equally well for all values of that attribute. In other words, equality of opportunity measures whether the people who should qualify for an opportunity are equally likely to do so regardless of their group membership. For example, suppose Glubbdubdrib University admits both Lilliputians and Brobdingnagians to a rigorous mathematics program. Lilliputians’ secondary schools offer a robust curriculum of math classes, and the vast majority of students are qualified for the university program. Brobdingnagians’ secondary schools don’t offer math classes at all, and as a result, far fewer of their students are qualified. Equality of opportunity is satisfied for the preferred label of «admitted» with respect to nationality (Lilliputian or Brobdingnagian) if qualified students are equally likely to be admitted irrespective of whether they’re a Lilliputian or a Brobdingnagian473.
Equalized odds is a fairness metric that checks if, for any particular label and attribute, a classifier predicts that label equally well for all values of that attribute474.
Ergatic system is a production scheme in which one of the elements is a person or a group of people together with a technical device through which the person carries out their activities. A defining feature of such systems is their socio-psychological aspects. Along with drawbacks (the presence of the «human factor»), ergatic systems have a number of advantages, such as fuzzy logic, evolution, and decision-making in non-standard situations475.
Error backpropagation – the process of adjusting the weights in a neural network by minimizing the error at the output. It involves a large number of iteration cycles with the training data476.
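As a hedged illustration, a numpy sketch of a gradient-descent loop for a single linear unit, where the chain rule of backpropagation reduces to one step; the data, learning rate and iteration count are arbitrary.

# Adjusting weights by propagating the output error back to the inputs (toy sketch).
import numpy as np

X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])  # 3 samples, 2 inputs
y = np.array([[1.0], [1.0], [0.0]])                 # targets
W = np.zeros((2, 1))                                # weights of one linear unit
lr = 0.1

for _ in range(100):
    pred = X @ W                   # forward pass
    error = pred - y               # output error
    grad = X.T @ error / len(X)    # gradient of the squared loss w.r.t. the weights
    W -= lr * grad                 # weight update
print(W.ravel())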
Error-driven learning is a sub-area of machine learning concerned with how an agent ought to take actions in an environment so as to minimize some error feedback. It is a type of reinforcement learning477.
Ethical use of artificial intelligence is a systematic normative understanding of the ethical aspects of AI based on an evolving complex, comprehensive and multicultural system of interrelated values, principles and procedures that can guide societies in matters of responsible consideration of the known and unknown consequences of the use of AI technologies for people, communities, the natural environment and ecosystems, as well as serve as a basis for decision-making regarding the use or non-use of AI-based technologies478.
Ethics of Artificial Intelligence is the ethics of technology specific to robots and other artificial intelligence beings, which is divided into robot ethics and machine ethics. The former one is about the concern with the moral behavior of humans as they design, construct, use, and treat artificially intelligent beings, and the latter one is about the moral behavior of artificial moral agents479.
Evolutionary algorithm (EA) is a subset of evolutionary computation, a generic population-based metaheuristic optimization algorithm. An EA uses mechanisms inspired by biological evolution, such as reproduction, mutation, recombination, and selection. Candidate solutions to the optimization problem play the role of individuals in a population, and the fitness function determines the quality of the solutions (see also loss function). Evolution of the population then takes place after the repeated application of the above operators480.
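A hedged toy sketch of these mechanisms; the fitness function, population size, mutation scale and selection scheme are arbitrary illustrative choices.

# A minimal evolutionary algorithm maximizing f(x) = -(x - 3)^2 (illustrative).
import random

def fitness(x):
    return -(x - 3.0) ** 2

population = [random.uniform(-10, 10) for _ in range(20)]
for _ in range(50):
    # selection: keep the fitter half of the population
    population.sort(key=fitness, reverse=True)
    parents = population[:10]
    # reproduction with mutation: each child is a perturbed copy of a parent
    children = [p + random.gauss(0, 0.5) for p in parents]
    population = parents + children
print(max(population, key=fitness))  # converges towards x = 3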
Evolutionary computation is a family of algorithms for global optimization inspired by biological evolution, and the subfield of artificial intelligence and soft computing studying these algorithms. In technical terms, they are a family of population-based trial and error problem solvers with a metaheuristic or stochastic optimization character481.
Evolving classification function (ECF) – evolving classifier functions or evolving classifiers are used for classifying and clustering in the field of machine learning and artificial intelligence, typically employed for data stream mining tasks in dynamic and changing environments482.
Example – one row of a dataset. An example contains one or more features and possibly a label. See also labeled example and unlabeled example483.
Executable – executable code, an executable file, or an executable program, sometimes simply referred to as an executable or binary, causes a computer «to perform indicated tasks according to encoded instructions», as opposed to a data file that must be interpreted (parsed) by a program to be meaningful484.
Existential risk – the hypothesis that substantial progress in artificial general intelligence (AGI) could someday result in human extinction or some other unrecoverable global catastrophe485.
Experience replay in reinforcement learning, a DQN technique used to reduce temporal correlations in training data. The agent stores state transitions in a replay buffer, and then samples transitions from the replay buffer to create training data486.
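A minimal sketch of such a replay buffer; the capacity and the tuple layout are illustrative assumptions, not tied to any particular DQN implementation.

# Store transitions and sample them uniformly to break temporal correlations.
import random
from collections import deque

class ReplayBuffer:
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # uniform random sampling of stored transitions for training
        return random.sample(self.buffer, batch_size)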
Experimenter’s bias it is the tester’s tendency to seek and interpret information, or give preference to one or another information, that is consistent with his point of view, belief or hypothesis. A kind of cognitive distortion and bias in inductive thinking487.
Expert system is a computer system that emulates the decision-making ability of a human expert. Expert systems are designed to solve complex problems by reasoning through bodies of knowledge, represented mainly as if—then rules rather than through conventional procedural code488,489.
Expert systems are systems that use industry knowledge (from medicine, chemistry, law) combined with sets of rules that describe how to apply the knowledge490.
Explainable artificial intelligence (XAI) is a key term in AI design and in the tech community as a whole. It refers to efforts to make sure that artificial intelligence programs are transparent in their purposes and how they work. Explainable AI is a common goal and objective for engineers and others trying to move forward with artificial intelligence progress491.