General AI Glossary
A Glossary of General AI Terms for Using the Clarifai Platform Effectively
A
A/B Testing
A statistical way of comparing two (or more) techniques, typically an incumbent against a new rival. It aims to determine which technique performs better, and whether the difference is statistically significant.
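As a rough illustration of the "statistically significant" part, the sketch below runs a two-proportion z-test on made-up conversion counts for an incumbent (A) and a rival (B). The function name, the counts, and the choice of this particular test are assumptions for illustration; other tests may suit other metrics better.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) under H0: A and B share one success rate."""
    rate_a = successes_a / n_a
    rate_b = successes_b / n_b
    # Pooled rate: the best estimate of the shared rate if H0 were true.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 200/1000 successes for A, 250/1000 for B.
z, p_value = two_proportion_z_test(200, 1000, 250, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value (commonly below 0.05) suggests the observed difference is unlikely to be due to chance alone.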
Accuracy
The fraction of predictions that a model got right. The goal of any model is to get it to see the world as you see it.
- In Multi-class classification, accuracy is determined by the number of correct predictions divided by the total number of examples.
- In Binary classification, or for two mutually exclusive classes, accuracy is determined by the number of true positives added to the number of true negatives, divided by the total number of examples.
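The two formulas above can be sketched in a few lines of Python. The function names and the sample labels are hypothetical and only meant to make the arithmetic concrete.

```python
def accuracy(true_labels, predicted_labels):
    """Multi-class accuracy: correct predictions / total examples."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def binary_accuracy(tp, tn, fp, fn):
    """Binary accuracy: (true positives + true negatives) / total examples."""
    return (tp + tn) / (tp + tn + fp + fn)


# Three of the four predictions below match the true labels.
print(accuracy(["cat", "dog", "bird", "cat"], ["cat", "dog", "cat", "cat"]))
print(binary_accuracy(tp=40, tn=50, fp=5, fn=5))
```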
Activation Function
In artificial neural networks, the activation function of a node defines the output of that node given an input or set of inputs.
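As a minimal sketch, a node's output can be modeled as an activation function applied to the weighted sum of its inputs plus a bias. ReLU and sigmoid are shown here as two common choices; the helper `node_output` and the sample numbers are illustrative assumptions.

```python
import math


def relu(z):
    """Rectified linear unit: passes positive values, zeroes out the rest."""
    return max(0.0, z)


def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def node_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


print(node_output([1.0, 2.0], [0.5, -1.0], 0.0, relu))
print(node_output([1.0, 2.0], [0.5, 0.25], -1.0, sigmoid))
```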
Active Learning
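A machine learning approach in which the learning algorithm selects the examples it would most benefit from having labeled, typically those it is least certain about, so that model performance improves with less labeling effort.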
Adversarial Example
Adversarial examples are specialized inputs created with the purpose of confusing a neural network, resulting in the misclassification of a given input. To the human eye, these inputs are often nearly indistinguishable from the originals, but they cause the network to fail to identify the contents of the image.
Adversarial Machine Learning
A research field that lies at the intersection of machine learning (ML) and computer security. It enables the safe adoption of ML techniques in adversarial settings like spam filtering, malware detection, etc.
Agents
In the context of AI, agents are software that can independently perform specific tasks without human intervention. They often employ various tools, like calculators or web browsing, to process data and develop solutions.
Agent System Operators
Agent system operators are "non-trainable," or "fixed function," models that help you connect, route, and control the inputs and outputs that you send through your workflows. Operator models are critical building blocks for creating more advanced workflows.
- Concept Thresholder allows you to threshold input concepts according to both a threshold and an operator (>, >=, =, <=, or <). For example, if you use the " > " threshold type and set the threshold value to 0.9, only concepts that have been predicted with a confidence score greater than 0.9 will be sent as outputs from the concept thresholder, and other concepts will be ignored.
- Region Thresholder allows you to threshold regions based on the concepts that they contain using a threshold per concept and an overall operator (>, >=, =, <=, or <).
- Random Sampler lets an input pass through to the output at random.
- Image Cropper allows you to crop the input image according to each input region that is present in the input.
- Image Align allows you to align images using key points.
- Annotation Writer allows you to write the input data to the database in the form of an annotation with a specified status as if a specific user created the annotation.
- Regex-Based Classifier allows you to classify text using regular expressions. When the specified regex pattern matches the text, the text is assigned to one of the predefined concepts.
- Concept Synonym Mapper allows you to map the input concepts to output concepts by following synonym concept relations in the knowledge graph of your app.
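To make the behavior of two of these operators concrete, here is a plain-Python sketch of the logic described for the Concept Thresholder and the Regex-Based Classifier. It does not use the Clarifai API; the function names, the dictionary shapes, and the first-match rule in `regex_classify` are assumptions made for illustration.

```python
import operator
import re

# The five comparison operators listed for the thresholder.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "=": operator.eq,
    "<=": operator.le,
    "<": operator.lt,
}


def threshold_concepts(concepts, op, threshold):
    """Keep only the concepts whose score passes `score <op> threshold`."""
    compare = OPERATORS[op]
    return {name: score for name, score in concepts.items() if compare(score, threshold)}


def regex_classify(text, concept_patterns):
    """Return the first concept whose pattern matches the text, else None."""
    for concept, pattern in concept_patterns.items():
        if re.search(pattern, text):
            return concept
    return None


predictions = {"dog": 0.97, "cat": 0.42, "tree": 0.9}
print(threshold_concepts(predictions, ">", 0.9))

patterns = {"invoice": r"\binvoice\b", "refund": r"\brefund(ed|s)?\b"}
print(regex_classify("Please send the invoice today", patterns))
```

With the ">" operator and a threshold of 0.9, a concept scored at exactly 0.9 is dropped, which matches the "greater than" wording in the example above.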
AI Algorithms
Sets of rules or instructions, often drawn from machine learning, that tell the computer how to learn to operate on its own.
AI Ethics in Generative Models
With the advancement of generative AI, the urgency of addressing ethical concerns such as deepfakes, data privacy, and bias within AI has intensified. There are increasing calls for meticulous oversight to guarantee their responsible development and application.
AI-Generated Art and Copyright
The rise of AI in generating art has led to discussions about copyright, ownership, and the definition of creativity.
AI Lake
A centralized platform designed to consolidate, organize, and manage all your AI assets, including models, annotations, datasets, workflows, and user interfaces. It enables seamless collaboration between teams, fostering AI adoption and reusability across the enterprise. With AI-powered indexing, it automatically organizes massive amounts of data objects and makes them easily searchable.
The platform supports dataset versioning and lineage tracking for all AI assets, ensuring control over access, modifications, and deletions. AI Lake aims to make AI applications reproducible by allowing users to recreate results using input data, code, and configurations.
Built on enterprise-grade infrastructure with 99.999% uptime, it integrates seamlessly with major cloud providers like AWS, GCP, and Azure, as well as on-premises and air-gapped systems. AI Lake accelerates AI development by providing data scientists with the necessary tools to build accurate models without redundant efforts, promoting collaboration and making AI assets easily findable and reusable. Furthermore, AI Lake enhances AI governance by offering auditable and reproducible AI solutions with comprehensive provenance and change history tracking.
Anchor Box
The archetypal location, size, and shape for finding bounding boxes in an object detection problem. For example, square anchor boxes are typically used in face detection models.
Annotation
The "answer key" for each image. Annotations are markups placed on an image (bounding boxes for object detection, polygons or a segmentation map for segmentation) to teach the model the ground truth.
Annotation Format
The particular way of encoding an annotation. There are many ways to describe a bounding box's size and position (JSON, XML, TXT, etc.) and to delineate which annotation goes with which image.
Annotation Group
Describes what types of objects you are identifying. For example, "chess pieces" or "vehicles."
API Key
An API key is essentially a “password” for accessing the API. Accounts are billed for API calls, and this helps us keep track of activity.
Application
An application is literally what it sounds like: an application of AI to an existing challenge. It’s a self-contained project for storing and handling data, annotations, models, concepts, datasets, workflows (chaining of models together), and searches.
An operation performed in one application will return results from data within that application, but will be blind to data in other applications. You can create as many applications as you like and can divide your use among them to segment data into collections and manage access accordingly. Usually, you would create a new application for each new set of related tasks you want to accomplish.
Application Programming Interface (API)
A set of commands, functions, protocols, and objects that programmers can use to create software or interact with an external system.
Clarifai’s API allows users to access the Clarifai platform through four request types:
- POST - Upload inputs and information
- PATCH - Update or modify existing information
- GET - Request information
- DELETE - Delete existing information
Application Template
Clarifai app templates are pre-built blueprints that provide a starting point for creating your own applications. They are apps with their contents grouped by some use case — enabling you to easily get started building your applications.
Architecture
A specific neural network layout (layers, neurons, blocks, etc.). Architectures often come in multiple sizes whose design is similar except for the number of parameters.
Artificial General Intelligence
Computational system that can perform any intellectual task a human can. Also called "Strong AI." At this point, AGI is fictional.
Artificial Intelligence
The simulation of human intelligence processes by machines, especially computer systems. These processes include learning (the acquisition of information and rules for using it), reasoning (using rules to reach approximate or definite conclusions), and self-correction.
Artificial Neural Network
A learning model created to act like a human brain that solves tasks that are too difficult for traditional computer systems to solve.
Artificial Super Intelligence
Artificial Super Intelligence (ASI) refers to a level of AI that surpasses human intelligence across all domains, including creativity, problem-solving, and emotional intelligence.
AUC Score
The AUC, or area under the ROC curve, is a metric used to measure the performance of a binary classifier, such as a spam filter or fraud detector. It’s a numerical value between 0 and 1 that represents the overall performance of the classifier and its degree of separability, where 1 means the classifier is perfect at distinguishing between two classes, and 0.5 means it’s no better than a coin flip.
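One way to read the AUC is as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that pairwise definition, counting ties as half. The function name and sample scores are hypothetical, and the quadratic loop is only sensible for small datasets.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0   # positive correctly ranked above negative
            elif pos == neg:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Labels: 1 = spam, 0 = not spam. Scores: the classifier's spam confidence.
print(auc_score([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```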
Audio Speech Recognition (ASR)
A technology that processes human speech into readable text.
These models take audio containing speech and convert it into text. This can be extremely useful: audio can be searched for key terms, and AI models can transmit text over networks instead of audio, which is much smaller and faster to send.
Authentication
Authentication is the process of verifying someone's claimed identity. It's essentially confirming that a user trying to access a system or resource is who they say they are.
Two-factor authentication (2FA) is an optional sign-in security feature that provides an additional layer of security to your account.
Authorization
Authorization, following authentication, determines what a user is allowed to do with a system or resource after their identity has been verified. It's about granting specific permissions based on a user's role or privileges.
Auto-Annotation
Auto-annotation, also known as automatic annotation or automated labeling, refers to the use of machine learning and artificial intelligence techniques to automatically generate annotations or labels for data.
Automation Bias
When a human decision maker favors recommendations made by an automated decision-making system over information produced without automation, even when the automated decision-making system makes errors.
AutoML
Automates each step of the ML workflow so that users can build models with minimal effort and machine learning expertise.
Autonomous AI
The most advanced form of AI is autonomous artificial intelligence, in which processes are automated to generate the intelligence that allows machines, bots and systems to act on their own, independent of human intervention. It is often used in autonomous vehicles.
B
Backpropagation
The main algorithm used for performing gradient descent on neural networks. Short for "backward propagation of errors," it’s an algorithm for supervised learning of artificial neural networks using gradient descent. Given an artificial neural network and an error function, the method calculates the gradient of the error function with respect to the neural network weights.
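As a minimal sketch of the idea, consider a toy network with one hidden unit, `y_hat = w2 * tanh(w1 * x)`, and a squared-error loss. The backward pass applies the chain rule from the loss back to each weight. The network shape, names, and numbers are assumptions chosen to keep the example small.

```python
import math


def forward(w1, w2, x):
    """Forward pass: hidden activation h and prediction y_hat."""
    h = math.tanh(w1 * x)
    return h, w2 * h


def loss(w1, w2, x, y):
    """Squared error between the prediction and the target."""
    _, y_hat = forward(w1, w2, x)
    return 0.5 * (y_hat - y) ** 2


def backward(w1, w2, x, y):
    """Gradient of the loss with respect to (w1, w2) via the chain rule."""
    h, y_hat = forward(w1, w2, x)
    d_y_hat = y_hat - y              # dL/dy_hat
    d_w2 = d_y_hat * h               # dL/dw2, since y_hat = w2 * h
    d_h = d_y_hat * w2               # dL/dh, error propagated backward
    d_w1 = d_h * (1 - h ** 2) * x    # tanh'(z) = 1 - tanh(z)^2
    return d_w1, d_w2


# One gradient descent step with a hypothetical learning rate of 0.1.
w1, w2, x, y = 0.5, -0.3, 1.5, 1.0
g1, g2 = backward(w1, w2, x, y)
new_w1, new_w2 = w1 - 0.1 * g1, w2 - 0.1 * g2
print(loss(w1, w2, x, y), loss(new_w1, new_w2, x, y))
```

Real networks repeat the same chain-rule step layer by layer, usually through an automatic differentiation library rather than hand-written derivatives.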
Backward Chaining
A method where the model starts with the desired output and works in reverse to find data that might support it.
Base Workflow
One of Clarifai's pre-built workflows that can be built upon to create a custom model. It pre-indexes inputs for search and provides a default embedding space.
The base workflow acts as the default knowledge base for your app and provides the basic structure for indexing your data. It gives you a "head start" when working with your data — by pre-indexing your inputs for search and by providing a default embedding for your custom models.
Baseline
A model used as a reference point for comparing how well another model (typically, a more complex one) is performing. Baseline models help developers quantify the minimal expected performance that a new model must achieve to be useful.
Batch
The set of examples used in one iteration (that is, one gradient update) of model training.
Batch Inference
An asynchronous process that executes predictions based on existing models and observations, and then stores the output.
Batch Size
The number of training examples utilized in one iteration.
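The relationship between a batch and the batch size can be sketched as a generator that slices a dataset into consecutive chunks, one chunk per training iteration. The function name is hypothetical, and real training loops usually shuffle the examples first.

```python
def iterate_batches(examples, batch_size):
    """Yield consecutive batches; the last one may be smaller than batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(examples), batch_size):
        yield examples[start:start + batch_size]


# Ten examples with a batch size of 4 give three iterations per pass.
for batch in iterate_batches(list(range(10)), batch_size=4):
    print(batch)
```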
Bayes's Theorem
A famous theorem used by statisticians to describe the probability of an event based on prior knowledge of conditions that might be related to an occurrence.
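In symbols, the theorem states that P(A|B) = P(B|A) × P(A) / P(B). The sketch below applies it to a made-up diagnostic test; the function name and all of the probabilities are illustrative assumptions.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(A|B) = P(B|A) * P(A) / P(B), with P(B) expanded over A and not-A."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% of people have the condition (prior), the test
# detects it 95% of the time (likelihood), and it wrongly flags 5% of
# healthy people (false positive rate).
print(round(posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05), 3))
```

Even with a fairly accurate test, the low prior keeps the probability of having the condition after a positive result well below 50% in this example.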
Bias
When an AI algorithm produces results that are systematically prejudiced due to erroneous assumptions in the machine learning process.
Bias can occur when the scope of your training data is too narrow. If you only see green apples, you’ll assume that all apples are green and think red apples are another kind of fruit. If the training data contains only a small number of examples, the model will react accordingly, taking them as truth. Small datasets make for a smaller worldview.
Big Data
Big data refers to data that is so large, fast or complex that it's difficult or impossible to process using traditional methods.
Binary Classification/Mutually Exclusive
The task of classifying elements of a set into two groups on the basis of a classification rule. For example, a model that evaluates email messages and outputs either spam or not spam is a binary classifier.
Outcomes are mutually exclusive (disjoint) if they cannot both be true. When classes are referred to as “mutually exclusive,” this means that the neural network will only predict an input as a single concept, and no other classes or concepts.
In this case, there is no intersection between any of the classes for a model. For instance, a network may classify an image as a cat or dog, but not both. If the goal of a model is to recognize only ONE concept for an input, making the concepts in your model mutually exclusive will give you stronger, more accurate predictions.
Black Box AI
An AI system whose inputs and operations are not visible to the user. A black box, in a general sense, is an impenetrable system.
Boosting
A machine learning technique that iteratively combines a set of simple and not very accurate classifiers (referred to as "weak" classifiers) into a classifier with high accuracy (a "strong" classifier) by upweighting the examples that the model is currently misclassifying.
Bootstrapping
Bootstrapping is any test or metric that uses random sampling with replacement and falls under the broader class of resampling methods. Bootstrapping assigns measures of accuracy (bias, variance, confidence intervals, prediction error, etc.) to sample estimates.
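As one concrete instance, the sketch below estimates a confidence interval for a sample mean using the percentile bootstrap: resample the data with replacement many times, recompute the statistic each time, and read the interval off the sorted results. The function name, defaults, and sample data are assumptions for illustration.

```python
import random
import statistics


def bootstrap_ci(data, statistic=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `statistic` of `data`."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(data)
    # Each resample draws n values from the data WITH replacement.
    estimates = sorted(
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lower_index = int((alpha / 2) * n_resamples)
    upper_index = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return estimates[lower_index], estimates[upper_index]


sample = [2.1, 2.5, 2.8, 3.0, 3.3, 3.7, 4.1, 4.4, 5.0, 5.6]
low, high = bootstrap_ci(sample)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```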
Bounding Box
In an image, the (x, y) coordinates of a rectangle around an area of interest.
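A bounding box is often stored as the coordinates of two opposite corners. The sketch below uses that convention, with (x_min, y_min) as the top-left corner and (x_max, y_max) as the bottom-right corner in pixels. The class and field names are hypothetical; other formats, such as center point plus width and height or coordinates normalized to the image size, are also common.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned rectangle given by its top-left and bottom-right corners."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def width(self):
        return self.x_max - self.x_min

    @property
    def height(self):
        return self.y_max - self.y_min

    @property
    def area(self):
        return self.width * self.height

    def contains(self, x, y):
        """True if the point (x, y) lies inside or on the edge of the box."""
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


box = BoundingBox(x_min=10, y_min=20, x_max=110, y_max=70)
print(box.width, box.height, box.area, box.contains(50, 50))
```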