Generative AI Glossary
A Glossary of Generative AI Terms for Using the Clarifai Platform Effectively
A
Adversarial Autoencoder (AAE)
An autoencoder that combines the adversarial loss central to GANs with the standard autoencoder architecture. This combination empowers the model to learn complex data distributions effectively.
Audio Synthesis
This involves using AI to create new, artificial sounds or voice outputs. Such sounds can be as simple as a specific tone or as complex as a mimicked form of speech.
Autoregressive Models
These are generative models that produce data by conditioning each element's probability on previous elements in a sequence. For example, WaveNet and PixelCNN are autoregressive models for generating audio and images, respectively.
Autoencoder
An autoencoder is an artificial neural network utilized for learning efficient encodings of input data. It has two crucial components: an encoder that compresses the input data and a decoder that reconstructs the data from its reduced form.
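The encoder/decoder split described above can be sketched with a minimal linear autoencoder. The weights below are random and untrained, and the dimensions (an 8-dimensional input compressed to a 3-dimensional code) are hypothetical; a practical autoencoder would use deep nonlinear networks trained to minimize reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, code_dim = 8, 3

W_enc = rng.standard_normal((input_dim, code_dim)) * 0.1  # encoder weights
W_dec = rng.standard_normal((code_dim, input_dim)) * 0.1  # decoder weights

def encode(x):
    return x @ W_enc   # compress the input into a smaller latent code

def decode(z):
    return z @ W_dec   # reconstruct the input from the latent code

x = rng.standard_normal(input_dim)
z = encode(x)          # shape (3,): the compressed representation
x_hat = decode(z)      # shape (8,): the reconstruction
```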
Autoregressive Generative Models
These models predict the distribution of each sequence element from the elements that precede it, implicitly defining a distribution over whole sequences via the chain rule of conditional probability. The main architectures for autoregressive models are causal convolutional networks and recurrent neural networks.
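The chain-rule factorization P(x₁…xₙ) = Π P(xₜ | x₍<ₜ₎) can be illustrated with a toy example. The conditional table below is a made-up bigram model over a two-symbol alphabet, not from any real system:

```python
# P(next | previous); "^" marks the start of the sequence.
# These probabilities are invented purely for illustration.
cond = {
    "^": {"a": 0.6, "b": 0.4},
    "a": {"a": 0.3, "b": 0.7},
    "b": {"a": 0.5, "b": 0.5},
}

def sequence_prob(seq):
    prob, prev = 1.0, "^"
    for sym in seq:
        prob *= cond[prev][sym]  # condition each element on its predecessor
        prev = sym
    return prob

p = sequence_prob("ab")  # P("a" | start) * P("b" | "a") = 0.6 * 0.7
```

Real autoregressive models replace the lookup table with a neural network, but the factorization is the same.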
B
BERT (Bidirectional Encoder Representations from Transformers)
BERT, developed by Google, is a pre-trained transformer-based language model. It stands out for its bidirectional training approach, which allows it to understand the context of a word based on all of its surroundings (left and right of the word).
BLOOM
Developed by the BigScience research collaboration, BLOOM is a large-scale multilingual language model that can perform a wide range of natural language understanding and generation tasks.
C
ChatGPT
Developed by OpenAI, ChatGPT is a specialized large-scale language model that generates human-like text. It's a popular choice for developing AI-powered chatbots due to its convincing conversation-generation capabilities.
CLIP (Contrastive Language–Image Pre-training)
Developed by OpenAI, CLIP is a model trained on large numbers of image–text pairs to learn a shared embedding space for images and text. Because it can score how well a piece of text matches an image, it enables zero-shot image classification and underpins many text-guided image generation and search systems.
Closed-Book QA
Closed-book QA refers to the ability of an LLM to answer questions without access to any additional information or context beyond the knowledge internalized during training.
Closed-book QA stands in contrast to open-book QA, where the LLM can access and process external sources of information, such as documents, web pages, or knowledge bases.
Conditional GANs (cGANs)
These are a type of GAN in which a conditioning variable (such as a class label) is added to the input, letting the model generate data conditioned on certain factors. This augmentation gives the model the ability to produce data with desired characteristics.
Cross-modal
Cross-modal learning refers to using information from one modality to understand or make predictions in another modality. This could involve translating or transforming the data in some way. For example, a cross-modal learning system might be designed to accept text input and output a related image or vice versa.
CycleGAN
A type of GAN that can translate an image from a source domain to a target domain without paired examples. It's particularly useful in tasks like photo enhancement, image colorization, and style transfer for unpaired photo-to-photo translation.
D
DALL-E 2
This is an updated version of DALL-E, an AI model developed by OpenAI to generate images from textual descriptions. It's an excellent example of a multi-modal AI system.
Data Distribution
In machine learning, data distribution refers to the overall layout or spread of data points within a dataset. In the case of generative models such as GANs, the generator seeks to mimic the actual data distribution.
Deepfake
Synthetic media in which a person in an existing image or video is replaced with someone else's likeness using machine learning techniques. While deepfakes can serve entertainment purposes, they can also mislead viewers, sometimes with harmful consequences.
Diffusion
In AI, 'diffusion' refers to a generative technique in which training data is gradually corrupted by adding random noise over many steps. A neural network is then trained to reverse this noising process, so new data can be generated by starting from pure noise and denoising it step by step.
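The forward (noising) half of this process can be sketched in a few lines. The closed-form step below, x_t = sqrt(ᾱ_t)·x₀ + sqrt(1 − ᾱ_t)·ε, follows the standard DDPM formulation; the linear noise schedule and step count here are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 10
betas = np.linspace(1e-4, 0.2, T)       # hypothetical linear noise schedule
alpha_bars = np.cumprod(1.0 - betas)    # cumulative signal-retention factor

def noise_step(x0, t):
    """Return a noisy version of x0 at diffusion step t."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1 - alpha_bars[t]) * eps

x0 = np.ones(4)
x_late = noise_step(x0, T - 1)  # heavily noised: mostly random noise remains
```

Generation runs this in reverse: a trained network predicts the noise at each step and subtracts it, walking from pure noise back to a clean sample.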
Discriminator
In a GAN, the discriminator is the component that tries to differentiate real data instances from the fictitious ones fabricated by the generator. It helps refine the generator's ability to create realistic data.
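At its core, the discriminator is a binary classifier that outputs the probability that its input is real. The sketch below reduces it to a single logistic unit with hypothetical, untrained weights; a real discriminator is a deep network trained jointly against the generator.

```python
import math

w, b = [0.5, -0.3], 0.1  # made-up weights for illustration

def discriminator(x):
    """Score a data point: probability (0..1) that it is real."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-score))  # sigmoid squashes to a probability

p_real = discriminator([1.0, 2.0])
```

During training, the discriminator's loss pushes this probability toward 1 for real samples and 0 for generated ones, and the generator is updated to fool it.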