
Upload Inputs

Add structured or unstructured data to the Clarifai platform


An input is any piece of structured or unstructured data added to the Clarifai platform. This includes images, text, videos, and more — you can add as many inputs as you want.

Whether your data is hosted online via URLs, stored locally as file paths, or represented as bytes, our platform supports a wide range of formats, ensuring flexibility and ease of use. You can also upload zipped archive files (ZIP format) containing mixed data types, such as text and images.

Once uploaded, you can organize your inputs into datasets to support a wide range of tasks.

info

As each input is uploaded, it is automatically indexed using the specified base workflow for your app. This indexing enables you to perform searches over the uploaded inputs, leveraging Clarifai’s custom-built vector database for fast and efficient search capabilities.

Upload Limits

When uploading data to the Clarifai platform, your inputs should meet the conditions outlined below.

Note that if an input, such as a video or an audio file, exceeds these limits, you will need to split it into smaller chunks. Otherwise, the processing will time out, and you will receive an error response.

Images

  • The supported image formats include JPEG, PNG, TIFF, BMP, WEBP, and GIF.
  • Each request can include up to 128 image inputs per batch.
  • Each image file must be a maximum of 85 megapixels and less than 20MB in size.
  • The total batch size (in bytes) for each request must be less than 128MB.

Videos

  • The supported video formats include AVI, MP4, WMV, MOV, and 3GPP.
  • Each request can include only 1 video input.
  • If uploading via URL, the video can be up to 300MB or 10 minutes long.
  • If uploading via direct file upload (byte data), the video must be less than 128MB.

Text Files

  • The supported text formats include plain texts, CSV files, and TSV files.
  • Each request can include up to 128 text files per batch.
  • Each text file must be less than 20MB.
  • The total batch size (in bytes) must be less than 128MB.

Audio Files

  • The supported audio format is WAV.
  • Each request can include up to 128 audio files per batch.
  • Each audio file must be less than 20MB in size (suitable for a 48kHz, 60-second, 16-bit recording).
  • The total batch size (in bytes) must be less than 128MB.
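
The image, text, and audio limits above can be checked locally before a request is sent. The following is a minimal sketch, not part of the Clarifai SDK: `check_batch` and `LIMITS` are hypothetical names, file sizes are passed in as plain integers, and 1MB is assumed to mean 1,000,000 bytes. It does not cover the megapixel limit for images or the single-video rule.

```python
from typing import Dict, List

MB = 1_000_000  # assumption: the documented limits use decimal megabytes

# Image, text, and audio share the same numeric limits in the lists above.
_SHARED = {"max_items": 128, "max_item_bytes": 20 * MB, "max_batch_bytes": 128 * MB}
LIMITS: Dict[str, Dict[str, int]] = {
    "image": dict(_SHARED),
    "text": dict(_SHARED),
    "audio": dict(_SHARED),
}


def check_batch(input_type: str, sizes_in_bytes: List[int]) -> List[str]:
    """Return a list of problems; an empty list means the batch fits the limits."""
    limits = LIMITS[input_type]
    problems = []
    if len(sizes_in_bytes) > limits["max_items"]:
        problems.append(f"too many inputs: {len(sizes_in_bytes)} > {limits['max_items']}")
    for index, size in enumerate(sizes_in_bytes):
        # Each file must be strictly smaller than the per-file limit.
        if size >= limits["max_item_bytes"]:
            problems.append(f"input {index} is too large: {size} bytes")
    total = sum(sizes_in_bytes)
    if total >= limits["max_batch_bytes"]:
        problems.append(f"batch is too large: {total} bytes")
    return problems
```

For example, `check_batch("image", [5 * MB] * 10)` returns an empty list, while a batch of 129 tiny files reports one problem.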
bypass upload limits

When uploading data to the Clarifai platform using the Python SDK — such as with upload_from_bytes() or upload_from_url() methods (demonstrated below) — the standard upload limits apply. However, you can bypass these limits by using the upload_from_folder() method from the Dataset class, which efficiently handles larger volumes of inputs by automatically batching them while adhering to upload restrictions.

For example, when uploading images in bulk, the method incrementally processes and uploads them in multiple batches, ensuring that each batch contains a maximum of 128 images and does not exceed 128MB in size.

You can also customize the batch_size variable, which allows for concurrent upload of inputs and annotations. For example, if your folder exceeds 128MB, you can set the variable to ensure that each batch contains an appropriate number of images while staying within the 128MB per batch limit.
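
To make the batching idea concrete, here is a rough sketch of how a list of files could be grouped so that every batch respects both the 128-input and the 128MB limit. It only illustrates the concept and is not how `upload_from_folder()` is implemented; `plan_batches` is a hypothetical helper, and 1MB is assumed to mean 1,000,000 bytes.

```python
from typing import List, Tuple

MB = 1_000_000  # assumption: decimal megabytes


def plan_batches(
    files: List[Tuple[str, int]],
    max_items: int = 128,
    max_bytes: int = 128 * MB,
) -> List[List[str]]:
    """Group (name, size_in_bytes) pairs into batches that stay within both limits."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_bytes = 0
    for name, size in files:
        # Start a new batch when adding this file would break either limit.
        if current and (len(current) >= max_items or current_bytes + size >= max_bytes):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(name)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

A file that is larger than the batch limit on its own still ends up alone in a batch here, so per-file limits need a separate check.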

Upload via the UI

Upload Inputs

To upload inputs, navigate to your individual app's page and select the Inputs option in the collapsible left sidebar. Then, click the Upload Inputs button located in the upper-right corner of the page.

The inputs uploader window that pops up allows you to upload any type of input data — files from your local directory, texts, or publicly accessible URLs.

Note that you can also use the inputs uploader modal to:

  • Organize your inputs into datasets.

  • Add concepts to inputs.

  • Attach JSON metadata for additional context. Metadata is additional information you attach to your inputs when uploading them to the Clarifai platform, such as product IDs, user IDs, or any other details you need for a specific outcome.

This modal simplifies the process of managing and enriching your inputs on the platform.

Upload Files

tip

Click here to learn how to upload data using files in .csv or .tsv formats.

To upload files containing any supported data type from your local directory, select the Files tab in the inputs uploader window. Then, click the upload button to select your files or drag and drop them directly into the designated area.

If you want to make multiple uploads without closing the uploader window, select the Keep window open for multiple uploads checkbox.

After uploading your inputs, click the Upload Inputs button located in the lower section of the uploader window.

Upload Texts

To upload text data directly through the UI, select the Text tab in the inputs uploader window and enter your text. Each input can contain a maximum of 500 words.
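
If a document is longer than 500 words, it can be split before it is pasted in or uploaded. The sketch below is one simple way to do that. It assumes that a "word" is any whitespace-separated token, which may not match exactly how the UI counts words, and `split_into_word_chunks` is a hypothetical helper name.

```python
from typing import List


def split_into_word_chunks(text: str, max_words: int = 500) -> List[str]:
    """Split text into pieces of at most max_words whitespace-separated words."""
    words = text.split()
    # Rejoining with single spaces normalizes the original whitespace.
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]
```

Each returned chunk can then be uploaded as its own text input.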

Upload URLs

To upload a URL containing any supported data type, select the URL tab in the inputs uploader window and enter each URL on a separate line.

If you want to allow uploading identical URLs, toggle the Allow duplicate URLs button on.

Uploads Status

The Uploads status window appears in the lower-right section of the page, enabling you to monitor the percentage progress of your upload. You can check the progress of any of your active uploads in the Active tab.

Once the upload process is complete, the Inactive tab will display a Complete status.

note

If you select the Refresh when jobs finish checkbox, the window will automatically refresh to display the status as soon as the upload process is finished. You'll also be notified if there is an issue with any of your inputs during uploading.

Upload via the API

Data Utils

Clarifai's Data Utils library allows you to extract, transform, and load unstructured data, such as images, videos, and text, into the Clarifai platform.

info

Before using the Python SDK, Node.js SDK, or any of our gRPC clients, ensure they are properly installed on your machine. Refer to their respective installation guides for instructions on how to install and initialize them.

Note that input uploads are processed asynchronously. Your files will be indexed in the background via your app's default base workflow, which may take some time depending on volume and file types.

To verify successful indexing, you can check the input status for code 30000 (INPUT_IMAGE_DOWNLOAD_SUCCESS). This confirms the input is fully processed and ready for use.
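
Because indexing happens in the background, a client typically polls until the status code reaches 30000. The sketch below shows only the polling loop. `fetch_status_code` is a placeholder callable that you would wire to whatever call returns the input's current status code; it is an assumption for illustration and not an SDK function.

```python
import time
from typing import Callable

INDEXED_STATUS_CODE = 30000  # the success code named in the text above


def wait_until_indexed(
    fetch_status_code: Callable[[], int],
    timeout_s: float = 60.0,
    poll_interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the input reports the success code or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if fetch_status_code() == INDEXED_STATUS_CODE:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval_s)
```

The function returns `True` once the input is ready and `False` if the timeout passes first, so the caller can decide whether to retry or report an error.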

Upload Image Data

Below is an example of how to upload image data.

from clarifai.client.input import Inputs

img_url = "https://samples.clarifai.com/metro-north.jpg"
input_obj = Inputs(user_id="user_id", app_id="test_app", pat="YOUR_PAT")
# You can also upload data through bytes and file paths.

# Upload from file
# input_obj.upload_from_file(input_id='demo', image_file='image_filepath')

# Upload from bytes
# input_obj.upload_from_bytes(input_id='demo', image_bytes=image)

input_obj.upload_from_url(input_id="demo", image_url=img_url)
Output

2024-01-15 16:38:49 INFO clarifai.client.input: input.py:669

Inputs Uploaded

code: SUCCESS

description: "Ok"

details: "All inputs successfully added"

req_id: "a14eda72951b06cd25561381d70ced74"

Upload Text Data

Below is an example of how to upload text data.

from clarifai.client.input import Inputs

input_text = b"Write a tweet on future of AI"
input_obj = Inputs(user_id="user_id", app_id="test_app", pat="YOUR_PAT")

# You can also upload data through a URL or a file path.

# Upload from file
# input_obj.upload_from_file(input_id='text_data', text_file='text_filepath')

# Upload from url
# input_obj.upload_from_url(input_id='text_data', text_url='text_url')

input_obj.upload_from_bytes(input_id="text_data", text_bytes=input_text)
Output
2024-01-16 14:14:41 INFO     clarifai.client.input:                                                    input.py:669

Inputs Uploaded

code: SUCCESS

description: "Ok"

details: "All inputs successfully added"

req_id: "80d2454a1dea0411e20fb03b2fe0c8b1"

Write Custom Functions for Data Processing

You can add your own custom functions for data processing with ease.

Below is an example of how to clean text data by removing non-ASCII characters before uploading it to the Clarifai platform.

from clarifai.client.input import Inputs

# Initialize the Inputs object with user and app IDs
input_object = Inputs(user_id="YOUR_USER_ID_HERE", app_id="YOUR_APP_ID_HERE", pat="YOUR_PAT_HERE")

# Strip non-ASCII characters from the text, then upload it
def remove_unicode_and_upload(input_id, text):
    string_encode = text.encode("ascii", "ignore")
    string_decode = string_encode.decode()
    input_object.upload_text(input_id=input_id, raw_text=string_decode)

remove_unicode_and_upload(input_id='test', text="This is a test \u200c example. ")
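
The cleaning step can be tested on its own before any upload is attempted. The variant below is a sketch with two additions that are our own choices, not part of the example above: it applies NFKD normalization first so that accented letters keep their base letter, and it collapses the extra spaces left behind by removed characters. `to_ascii` is a hypothetical helper name.

```python
import unicodedata


def to_ascii(text: str) -> str:
    """Return an ASCII-only version of text with whitespace collapsed."""
    # Decompose characters such as "é" into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop everything that cannot be represented in ASCII.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse runs of whitespace and trim both ends.
    return " ".join(ascii_only.split())
```

The returned string can be passed as `raw_text` in the same way as `string_decode` in the example above.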

Upload Audio Data

Below is an example of how to upload audio data.

from clarifai.client.input import Inputs

audio_url = "https://s3.amazonaws.com/samples.clarifai.com/GoodMorning.wav"
input_obj = Inputs(user_id="user_id", app_id="test_app", pat="YOUR_PAT")

# You can also upload data through bytes and file paths.

# Upload from file
# input_obj.upload_from_file(input_id='audio_data', audio_file='audio_filepath')

# Upload from bytes
# input_obj.upload_from_bytes(input_id='audio_data', audio_bytes=audio)

input_obj.upload_from_url(
    input_id="audio_data",
    audio_url=audio_url,
)
Output

2024-01-16 14:18:58 INFO clarifai.client.input: input.py:669

Inputs Uploaded

code: SUCCESS

description: "Ok"

details: "All inputs successfully added"

req_id: "c16d3dd066d7ee48d038744daacef6e8"

Upload Video Data

Below is an example of how to upload video data.

from clarifai.client.input import Inputs

video_url = "https://samples.clarifai.com/beer.mp4"
input_obj = Inputs(user_id="user_id", app_id="test_app", pat="YOUR_PAT")

# You can also upload data through bytes and file paths.

# Upload from file
# input_obj.upload_from_file(input_id='video_data', video_file='video_filepath')

# Upload from bytes
# input_obj.upload_from_bytes(input_id='video_data', video_bytes=video)

input_obj.upload_from_url(
    input_id="video_data", video_url=video_url
)
Output
2024-01-16 14:25:26 INFO     clarifai.client.input:                                                    input.py:669

Inputs Uploaded

code: SUCCESS

description: "Ok"

details: "All inputs successfully added"

req_id: "00576d040a6254019942ab4eceb306ad"

Upload Multimodal Data

Below is an example of how to upload a combination of different input types, such as images and text, to the Clarifai platform.

Currently, Clarifai supports specific multimodal input combinations, such as [Image, Text] -> Text. This allows you to process and analyze interconnected data types for advanced use cases.

from clarifai.client.input import Inputs

input_obj = Inputs(user_id="user_id", app_id="test_app", pat="YOUR_PAT")

# Initialize inputs of different types
prompt = "What time of day is it?"
image_url = "https://samples.clarifai.com/metro-north.jpg"

# Combine the image URL and the text prompt into a single input.
# The call returns the input proto shown in the output below.
input_obj.get_multimodal_input(
    input_id="multimodal_data", image_url=image_url, raw_text=prompt
)
Output
id: "multimodal_data"
data {
  image {
    url: "https://samples.clarifai.com/metro-north.jpg"
  }
  text {
    raw: "What time of day is it?"
  }
}

Upload Custom Metadata

When using the Clarifai SDKs, you can enhance your inputs by attaching custom metadata alongside concepts. This feature enables you to include additional contextual information, such as categorization, filtering criteria, or reference data, making it easier to organize and retrieve your inputs later.

Below are examples of how to upload inputs with custom metadata. In these examples, the metadata includes details about the filename and the dataset split (e.g., train, validate, or test) to which the input belongs.

Image With Metadata

# Import necessary modules
from google.protobuf.struct_pb2 import Struct
from clarifai.client.input import Inputs

# Create an Inputs object with user_id and app_id
input_object = Inputs(user_id="user_id", app_id="app_id", pat="YOUR_PAT")

# Create a Struct object for metadata
metadata = Struct()

# Update metadata with filename and split information
metadata.update({"filename": "XiJinping.jpg", "split": "train"})

# URL of the image to upload
url = "https://samples.clarifai.com/XiJinping.jpg"

# Upload the image from the URL with associated metadata
input_object.upload_from_url(input_id="metadata", image_url=url, metadata=metadata)
Output
2024-04-05 13:03:24 INFO     clarifai.client.input:                                                    input.py:674
Inputs Uploaded
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "951a64b950cccf05c8d274c8acc1f0f6"

INFO:clarifai.client.input:
Inputs Uploaded
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "951a64b950cccf05c8d274c8acc1f0f6"

('8557e0f57f464c22b3483de76757fb4f',
status {
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "951a64b950cccf05c8d274c8acc1f0f6"
}
inputs {
id: "metadata"
data {
image {
url: "https://samples.clarifai.com/XiJinping.jpg"
image_info {
format: "UnknownImageFormat"
color_mode: "UnknownColorMode"
}
}
metadata {
fields {
key: "filename"
value {
string_value: "XiJinping.jpg"
}
}
fields {
key: "split"
value {
string_value: "train"
}
}
}
}
created_at {
seconds: 1712322204
nanos: 737881425
}
modified_at {
seconds: 1712322204
nanos: 737881425
}
status {
code: INPUT_DOWNLOAD_PENDING
description: "Download pending"
}
}
inputs_add_job {
id: "8557e0f57f464c22b3483de76757fb4f"
progress {
pending_count: 1
}
created_at {
seconds: 1712322204
nanos: 714751000
}
modified_at {
seconds: 1712322204
nanos: 714751000
}
status {
code: JOB_QUEUED
description: "Job is queued to be ran."
}
})

Video With Metadata

from google.protobuf.struct_pb2 import Struct
from clarifai.client.input import Inputs

# Initialize an Inputs object with specified user_id and app_id
input_object = Inputs(user_id="user_id", app_id="app_id", pat="YOUR_PAT")

# Define the URL of the video to upload
video_url = "https://samples.clarifai.com/beer.mp4"

# Create a Struct object to hold metadata
metadata = Struct()

# Update the metadata with filename and split information
metadata.update({"filename": "drinks.jpg", "split": "train"})

# Upload the video from the specified URL with the provided metadata
input_object.upload_from_url(
    input_id="video_data_metadata", video_url=video_url, metadata=metadata
)
Output
2024-04-05 13:05:49 INFO     clarifai.client.input:                                                    input.py:674
Inputs Uploaded
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "72c9820d805efb9f3ee7f0508778c1f3"

INFO:clarifai.client.input:
Inputs Uploaded
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "72c9820d805efb9f3ee7f0508778c1f3"

('7fdc30b9c2a24f31b6a41b32bd9fea02',
status {
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "72c9820d805efb9f3ee7f0508778c1f3"
}
inputs {
id: "video_data_metadata"
data {
video {
url: "https://samples.clarifai.com/beer.mp4"
video_info {
video_format: "UnknownVideoFormat"
}
}
metadata {
fields {
key: "filename"
value {
string_value: "drinks.jpg"
}
}
fields {
key: "split"
value {
string_value: "train"
}
}
}
}
created_at {
seconds: 1712322349
nanos: 628288634
}
modified_at {
seconds: 1712322349
nanos: 628288634
}
status {
code: INPUT_DOWNLOAD_PENDING
description: "Download pending"
}
}
inputs_add_job {
id: "7fdc30b9c2a24f31b6a41b32bd9fea02"
progress {
pending_count: 1
}
created_at {
seconds: 1712322349
nanos: 602487000
}
modified_at {
seconds: 1712322349
nanos: 602487000
}
status {
code: JOB_QUEUED
description: "Job is queued to be ran."
}
})

Text With Metadata

# Import necessary modules
from google.protobuf.struct_pb2 import Struct
from clarifai.client.input import Inputs

# Define the input object with user_id and app_id
input_object = Inputs(user_id="user_id", app_id="app_id", pat="YOUR_PAT")

# Define the input text
input_text = b"Write a tweet on future of AI"

# Create a Struct object for metadata
metadata = Struct()

# Update metadata with filename and split information
metadata.update({"filename": "tweet.txt", "split": "train"})

# Upload the input from bytes with custom metadata
input_object.upload_from_bytes(input_id="text_data_metadata", text_bytes=input_text, metadata=metadata)
Output
2024-04-05 13:07:04 INFO     clarifai.client.input:                                                    input.py:674
Inputs Uploaded
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "835f6c736f032947d1f4067e39c10b72"

INFO:clarifai.client.input:
Inputs Uploaded
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "835f6c736f032947d1f4067e39c10b72"

('e3de274f644a4e98a488e7c85f94c0d1',
status {
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "835f6c736f032947d1f4067e39c10b72"
}
inputs {
id: "text_data_metadata"
data {
metadata {
fields {
key: "filename"
value {
string_value: "tweet.txt"
}
}
fields {
key: "split"
value {
string_value: "train"
}
}
}
text {
url: "https://data.clarifai.com/orig/users/8tzpjy1a841y/apps/visual_classifier_eval/inputs/text/c439598b04d8112867eec70097aa00c2"
text_info {
encoding: "UnknownTextEnc"
}
}
}
created_at {
seconds: 1712322424
nanos: 56818659
}
modified_at {
seconds: 1712322424
nanos: 56818659
}
status {
code: INPUT_DOWNLOAD_PENDING
description: "Download pending"
}
}
inputs_add_job {
id: "e3de274f644a4e98a488e7c85f94c0d1"
progress {
pending_count: 1
}
created_at {
seconds: 1712322423
nanos: 941401000
}
modified_at {
seconds: 1712322423
nanos: 941401000
}
status {
code: JOB_QUEUED
description: "Job is queued to be ran."
}
})

Audio With Metadata

# Import necessary modules
from clarifai.client.input import Inputs
from google.protobuf.struct_pb2 import Struct


# Define the input object with user_id and app_id
input_object = Inputs(user_id="user_id", app_id="app_id", pat="YOUR_PAT")

# Define the URL of the audio file
audio_url = "https://s3.amazonaws.com/samples.clarifai.com/GoodMorning.wav"

# Create a new Struct to hold metadata
metadata = Struct()

# Update the metadata with filename and split information
metadata.update({"filename": "goodmorning.wav", "split": "test"})

# Upload the input from the specified URL with metadata
input_object.upload_from_url(
    input_id="audio_data_metadata",  # Specify an ID for the input
    audio_url=audio_url,  # URL of the audio file
    metadata=metadata,  # Custom metadata associated with the input
)
Output
2024-04-08 06:39:32 INFO     clarifai.client.input:                                                    input.py:674
Inputs Uploaded
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "4c96e4167170c174838c7987101f3478"

INFO:clarifai.client.input:
Inputs Uploaded
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "4c96e4167170c174838c7987101f3478"

('109349aa790a404db39f6324415a47a5',
status {
code: SUCCESS
description: "Ok"
details: "All inputs successfully added"
req_id: "4c96e4167170c174838c7987101f3478"
}
inputs {
id: "audio_data_metadata"
data {
metadata {
fields {
key: "filename"
value {
string_value: "goodmorning.wav"
}
}
fields {
key: "split"
value {
string_value: "test"
}
}
}
audio {
url: "https://s3.amazonaws.com/samples.clarifai.com/GoodMorning.wav"
audio_info {
audio_format: "UnknownAudioFormat"
}
}
}
created_at {
seconds: 1712558372
nanos: 764691920
}
modified_at {
seconds: 1712558372
nanos: 764691920
}
status {
code: INPUT_DOWNLOAD_PENDING
description: "Download pending"
}
}
inputs_add_job {
id: "109349aa790a404db39f6324415a47a5"
progress {
pending_count: 1
}
created_at {
seconds: 1712558372
nanos: 751997000
}
modified_at {
seconds: 1712558372
nanos: 751997000
}
status {
code: JOB_QUEUED
description: "Job is queued to be ran."
}
})

Upload Inputs with Geospatial Information

When uploading inputs to Clarifai, you can enrich them by including geospatial data, such as longitude and latitude coordinates obtained from GPS.

This allows you to associate each input with a specific geographic location. Note that each input can have at most one geospatial point associated with it.

from clarifai.client.input import Inputs

# URL of the image to upload
image_url = "https://samples.clarifai.com/Ferrari.jpg"

# Provide the Geoinfo to be added to the input
# geo_info=[longitude, latitude]
geo_points = [102,73]

# Create an Inputs object with user_id and app_id
input_object = Inputs(user_id="YOUR_USER_ID_HERE", app_id="YOUR_APP_ID_HERE", pat="YOUR_PAT_HERE")

# Upload the image from the URL with associated GeoInfo
input_object.upload_from_url(input_id="geo_info", image_url=image_url, geo_info=geo_points)
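
The `geo_info` list is ordered `[longitude, latitude]`, and the two values are easy to swap by accident. The helper below is a small sketch that builds the list and rejects values outside the usual ranges (longitude from -180 to 180, latitude from -90 to 90). `make_geo_info` is a hypothetical name and not part of the SDK.

```python
from typing import List


def make_geo_info(longitude: float, latitude: float) -> List[float]:
    """Return [longitude, latitude] after checking both values are in range."""
    if not -180.0 <= longitude <= 180.0:
        raise ValueError(f"longitude out of range: {longitude}")
    if not -90.0 <= latitude <= 90.0:
        raise ValueError(f"latitude out of range: {latitude}")
    return [longitude, latitude]
```

With the values from the example above, `make_geo_info(102, 73)` returns `[102, 73]`, while the swapped call `make_geo_info(73, 102)` raises a `ValueError` because 102 is not a valid latitude.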

Upload Inputs With Annotations

You can upload inputs along with their corresponding annotations, such as bounding boxes or polygons.

Bounding Box Annotations

Below is an example of how to label a new rectangular bounding box for a specific region within an image. The bounding box coordinates should be normalized to the image dimensions, with values scaled to the range of [0, 1.0].

This ensures that the coordinates are independent of the image resolution, making the annotations consistent across different image sizes.

# Start by uploading the image with a specific input ID as described earlier
# For example, you can upload this image: https://samples.clarifai.com/BarackObama.jpg
# Then, after successfully uploading it, apply the bounding box annotations

from clarifai.client.input import Inputs

# Initialize the Inputs object with user and app IDs
input_object = Inputs(user_id="YOUR_USER_ID_HERE", app_id="YOUR_APP_ID_HERE", pat="YOUR_PAT_HERE")

# Upload bounding box annotations
bbox_points = [.1, .1, .8, .9] # Coordinates of the bounding box
annotation = input_object.get_bbox_proto(input_id="bbox", label="face", bbox=bbox_points, label_id="id-face", annot_id="demo")
input_object.upload_annotations([annotation])
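
Annotation tools often report boxes in pixels, so they need to be scaled to the [0, 1.0] range first. The sketch below shows the arithmetic. It assumes the four values are ordered `[x_min, y_min, x_max, y_max]`; that ordering is our assumption and should be checked against the SDK reference. `normalize_bbox` is a hypothetical helper name.

```python
from typing import List


def normalize_bbox(
    x_min: float, y_min: float, x_max: float, y_max: float,
    image_width: int, image_height: int,
) -> List[float]:
    """Scale pixel coordinates to the [0, 1.0] range used for annotations."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    if not (0 <= x_min < x_max <= image_width and 0 <= y_min < y_max <= image_height):
        raise ValueError("bounding box must lie inside the image")
    # Assumed order: [x_min, y_min, x_max, y_max]
    return [
        x_min / image_width,
        y_min / image_height,
        x_max / image_width,
        y_max / image_height,
    ]
```

For a 1000x500 image, the pixel box (100, 50, 800, 450) maps to the same values used in the example above, `[.1, .1, .8, .9]`.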

Polygon Annotations

Below is an example of how to annotate any polygon-shaped region within an image.

A polygon is defined by a list of points, each specified by:

  • row — The row position of the point, represented as a value between 0.0 and 1.0, where 0.0 corresponds to the top row and 1.0 corresponds to the bottom.
  • col — The column position of the point, represented as a value between 0.0 and 1.0, where 0.0 corresponds to the left column of the image and 1.0 corresponds to the right column.
# Start by uploading the image with a specific input ID as described earlier
# For example, you can upload this image: https://samples.clarifai.com/airplane.jpeg
# Then, after successfully uploading it, apply the polygon annotations

from clarifai.client.input import Inputs

# Initialize the Inputs object with user and app IDs
input_object = Inputs(user_id="YOUR_USER_ID_HERE", app_id="YOUR_APP_ID_HERE", pat="YOUR_PAT_HERE")

# Upload polygon annotations
#polygons=[[[x,y],...,[x,y]],...]
polygon_pts = [[.15,.24],[.4,.78],[.77,.62],[.65,.15]]
annotation = input_object.get_mask_proto(input_id="mask", label="airplane", polygons=polygon_pts)
input_object.upload_annotations([annotation])
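
Polygon points given in pixels can be normalized in the same spirit. The sketch below assumes each point is an `[x, y]` pair, following the comment in the example above, where `x` corresponds to the column position and `y` to the row position. `normalize_polygon` is a hypothetical helper and not an SDK function.

```python
from typing import List, Sequence, Tuple


def normalize_polygon(
    pixel_points: Sequence[Tuple[float, float]],
    image_width: int,
    image_height: int,
) -> List[List[float]]:
    """Convert (x, y) pixel points into [x, y] pairs in the 0.0 to 1.0 range."""
    if len(pixel_points) < 3:
        raise ValueError("a polygon needs at least three points")
    normalized = []
    for x, y in pixel_points:
        if not (0 <= x <= image_width and 0 <= y <= image_height):
            raise ValueError(f"point ({x}, {y}) lies outside the image")
        # x scales with the width (column), y with the height (row).
        normalized.append([x / image_width, y / image_height])
    return normalized
```

The result can be used as the `polygons` argument in the same way as `polygon_pts` above.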

Concepts Annotations

Below is an example of how to annotate different types of inputs with concepts.

from clarifai.client.input import Inputs

url = "https://samples.clarifai.com/featured-models/Llama2_Conversational-agent.txt"

# Change this depending on the type of input you want to annotate
concepts = ["mobile","camera"]

# Initialize the Inputs object with user and app IDs
input_object = Inputs(user_id="YOUR_USER_ID_HERE", app_id="YOUR_APP_ID_HERE", pat="YOUR_PAT_HERE")

# Upload text data with concepts
input_object.upload_from_url(input_id="text1", text_url=url, labels=concepts)

# Upload image data with concepts
#input_object.upload_from_url(input_id="image1", image_url="ADD_URL_HERE", labels=concepts)

# Upload video data with concepts
#input_object.upload_from_url(input_id="video1", video_url="ADD_URL_HERE", labels=concepts)

# Upload audio data with concepts
#input_object.upload_from_url(input_id="audio1", audio_url="ADD_URL_HERE", labels=concepts)

Upload Inputs Options (via URLs, bytes, concepts, metadata, etc)

You can add inputs one by one or in bulk. If you send them in bulk, you are limited to sending 128 inputs at a time, as mentioned above.

Upload Inputs via URL

Below is an example of how to add inputs via a publicly accessible URL.

##########################################################################
# In this section, we set the user authentication, app ID, and input URL.
# Change these strings to run your own example.
##########################################################################

USER_ID = 'YOUR_USER_ID_HERE'
# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = 'YOUR_PAT_HERE'
APP_ID = 'YOUR_APP_ID_HERE'
# Change this to whatever image input you want to add
IMAGE_URL = 'https://samples.clarifai.com/metro-north.jpg'

##########################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##########################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (('authorization', 'Key ' + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

post_inputs_response = stub.PostInputs(
    service_pb2.PostInputsRequest(
        user_app_id=userDataObject,
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(
                    image=resources_pb2.Image(
                        url=IMAGE_URL,
                        allow_duplicate_url=True
                    )
                )
            )
        ]
    ),
    metadata=metadata
)

if post_inputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_inputs_response.status)
    raise Exception("Post inputs failed, status: " + post_inputs_response.status.description)

Upload Inputs via Bytes

Below is an example of how to add inputs via bytes.

Note

The data must be base64 encoded. When you add a base64 image to our servers, a copy will be stored and hosted on our servers. If you already have an image hosting service, we recommend using it and adding images via the url parameter.

##################################################################################
# In this section, we set the user authentication, app ID, and the location
# of the image we want as an input. Change these strings to run your own example.
##################################################################################

USER_ID = 'YOUR_USER_ID_HERE'
# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = 'YOUR_PAT_HERE'
APP_ID = 'YOUR_APP_ID_HERE'
# Change this to whatever image input you want to add
IMAGE_FILE_LOCATION = 'YOUR_IMAGE_FILE_LOCATION'

##########################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##########################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (('authorization', 'Key ' + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

with open(IMAGE_FILE_LOCATION, "rb") as f:
    file_bytes = f.read()

post_inputs_response = stub.PostInputs(
    service_pb2.PostInputsRequest(
        user_app_id=userDataObject,
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(
                    image=resources_pb2.Image(
                        base64=file_bytes
                    )
                )
            )
        ]
    ),
    metadata=metadata
)

if post_inputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_inputs_response.status)
    raise Exception("Post inputs failed, status: " + post_inputs_response.status.description)
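
The gRPC example above passes the raw file bytes directly. If you need the base64 text form of the same data, for instance to embed it in a JSON payload, Python's standard `base64` module produces it. This is a minimal sketch; `bytes_to_base64` is a hypothetical helper name.

```python
import base64


def bytes_to_base64(raw: bytes) -> str:
    """Encode raw bytes as an ASCII base64 string."""
    return base64.b64encode(raw).decode("ascii")


def base64_to_bytes(encoded: str) -> bytes:
    """Decode a base64 string back into the original bytes."""
    return base64.b64decode(encoded)
```

Applied to the `file_bytes` read in the example above, `bytes_to_base64(file_bytes)` gives a string that decodes back to the identical bytes.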

Upload Multiple Inputs With IDs

In cases where you have your own id and you only have one item per image, you are encouraged to send inputs with your own id. This will help you later match the input to your own database.

If you do not send an id, one will be created for you. If you have more than one item per image, it is recommended that you put the product id in the metadata.

##################################################################################
# In this section, we set the user authentication, app ID, and the URLs and IDs
# of the images we want as inputs. Change these strings to run your own example.
##################################################################################

USER_ID = 'YOUR_USER_ID_HERE'
# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = 'YOUR_PAT_HERE'
APP_ID = 'YOUR_APP_ID_HERE'
# Change these to whatever inputs you want to add
IMAGE_URL_1 = 'https://samples.clarifai.com/metro-north.jpg'
IMAGE_URL_2 = 'https://samples.clarifai.com/puppy.jpeg'
INPUT_ID_1 = 'mytrain'
INPUT_ID_2 = 'mypuppy'

##########################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##########################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (('authorization', 'Key ' + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

post_inputs_response = stub.PostInputs(
    service_pb2.PostInputsRequest(
        user_app_id=userDataObject,
        inputs=[
            resources_pb2.Input(
                id=INPUT_ID_1,
                data=resources_pb2.Data(
                    image=resources_pb2.Image(
                        url=IMAGE_URL_1,
                        allow_duplicate_url=True
                    )
                )
            ),
            resources_pb2.Input(
                id=INPUT_ID_2,
                data=resources_pb2.Data(
                    image=resources_pb2.Image(
                        url=IMAGE_URL_2,
                        allow_duplicate_url=True
                    )
                )
            ),
        ]
    ),
    metadata=metadata
)

if post_inputs_response.status.code != status_code_pb2.SUCCESS:
    print("There was an error with your request!")
    for input_object in post_inputs_response.inputs:
        print("Input " + input_object.id + " status:")
        print(input_object.status)
    raise Exception("Post inputs failed, status: " + post_inputs_response.status.description)
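
Sending your own IDs pays off when an input has to be matched back to a local record. The sketch below builds such a lookup and refuses duplicate IDs up front. It assumes that each local record is a dictionary with `id` and `url` keys and that IDs should be unique within one upload; both are assumptions made for illustration, and `build_id_lookup` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict, List


def build_id_lookup(records: List[Dict[str, str]]) -> Dict[str, Dict[str, str]]:
    """Map each record's own ID to the record, rejecting duplicate IDs."""
    counts = Counter(record["id"] for record in records)
    duplicates = sorted(input_id for input_id, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate input IDs: {duplicates}")
    return {record["id"]: record for record in records}


records = [
    {"id": "mytrain", "url": "https://samples.clarifai.com/metro-north.jpg"},
    {"id": "mypuppy", "url": "https://samples.clarifai.com/puppy.jpeg"},
]
lookup = build_id_lookup(records)
```

After an upload, `lookup["mypuppy"]` returns the local record for the input that was sent with that ID.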

Upload Inputs With Concepts

You can add inputs with concepts via URLs or bytes. Concepts play an important role in creating your own models. Concepts also help you search for inputs.

When you add a concept to an input, you need to indicate whether the concept is present in it or not.

##################################################################################
# In this section, we set the user authentication, app ID, and the input to add
# with concept. Change these strings to run your own example.
##################################################################################

USER_ID = 'YOUR_USER_ID_HERE'
# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = 'YOUR_PAT_HERE'
APP_ID = 'YOUR_APP_ID_HERE'
# Change these to whatever input and concept you want to add
IMAGE_URL = 'https://samples.clarifai.com/puppy.jpeg'
CONCEPT_ID = 'charlie'

##########################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##########################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (('authorization', 'Key ' + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

post_inputs_response = stub.PostInputs(
    service_pb2.PostInputsRequest(
        user_app_id=userDataObject,
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(
                    image=resources_pb2.Image(
                        url=IMAGE_URL,
                        allow_duplicate_url=True
                    ),
                    concepts=[resources_pb2.Concept(id=CONCEPT_ID, value=1.)]
                )
            )
        ]
    ),
    metadata=metadata
)

if post_inputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_inputs_response.status)
    raise Exception("Post inputs failed, status: " + post_inputs_response.status.description)
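
The example above marks a concept as present with a value of 1. As a rough sketch of how present and absent labels could be prepared together, the helper below builds plain keyword-argument dictionaries. It assumes that a value of 0 marks a concept as not present; `concept_labels` and the concept ID `cat` are hypothetical and used only for illustration.

```python
from typing import Dict, Iterable, List, Union

ConceptKwargs = Dict[str, Union[str, float]]


def concept_labels(present: Iterable[str], absent: Iterable[str] = ()) -> List[ConceptKwargs]:
    """Build concept keyword arguments: value 1.0 = present, 0.0 = not present (assumed)."""
    labels: List[ConceptKwargs] = [{"id": cid, "value": 1.0} for cid in present]
    labels += [{"id": cid, "value": 0.0} for cid in absent]
    return labels


labels = concept_labels(present=["charlie"], absent=["cat"])
print(labels)
```

With `clarifai_grpc` installed, each dictionary can be expanded into a message with `resources_pb2.Concept(**label)` and passed in the `concepts` list shown above.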

Upload Inputs With Multiple Concepts

You can also add an input with multiple concepts in a single API call. You can provide the concepts in a list and iterate through it.

You can add the inputs via URLs or bytes.

##################################################################################
# In this section, we set the user authentication, app ID, and the input to add
# with concepts. Change these strings to run your own example.
##################################################################################

USER_ID = 'YOUR_USER_ID_HERE'
# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = 'YOUR_PAT_HERE'
APP_ID = 'YOUR_APP_ID_HERE'
# Change these to whatever input and concepts you want to add
IMAGE_URL = 'https://samples.clarifai.com/puppy.jpeg'
CONCEPT_IDS_LIST = ['one', 'two', 'three', 'four', 'five', 'six']

##########################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##########################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (('authorization', 'Key ' + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

post_inputs_response = stub.PostInputs(
    service_pb2.PostInputsRequest(
        user_app_id=userDataObject,
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(
                    image=resources_pb2.Image(
                        url=IMAGE_URL,
                        allow_duplicate_url=True
                    ),
                    # We use Python list comprehension to iterate through the list of concepts
                    concepts=[resources_pb2.Concept(id=str(i), value=1.) for i in CONCEPT_IDS_LIST]
                )
            )
        ]
    ),
    metadata=metadata
)

if post_inputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_inputs_response.status)
    raise Exception("Post inputs failed, status: " + post_inputs_response.status.description)

Upload Inputs With Custom Metadata

In addition to adding an input with concepts, you can also add an input with custom metadata. This metadata will then be searchable. Metadata can be any arbitrary JSON.

If you have more than one item per image, we recommend putting the ID in the metadata, like this:

{
  "product_id": "xyz"
}

####################################################################################
# In this section, we set the user authentication, app ID, and the custom metadata
# and input we want to add. Change these strings to run your own example.
####################################################################################

USER_ID = 'YOUR_USER_ID_HERE'
# Your PAT (Personal Access Token) can be found in the Account's Security section
PAT = 'YOUR_PAT_HERE'
APP_ID = 'YOUR_APP_ID_HERE'
# Change these to whatever input and custom metadata you want to add
CUSTOM_METADATA = {"id": "id001", "type": "animal", "size": 100}
IMAGE_URL = 'https://samples.clarifai.com/puppy.jpeg'

##########################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##########################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2
from google.protobuf.struct_pb2 import Struct

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (('authorization', 'Key ' + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

input_metadata = Struct()

input_metadata.update(CUSTOM_METADATA)

post_inputs_response = stub.PostInputs(
    service_pb2.PostInputsRequest(
        user_app_id=userDataObject,
        inputs=[
            resources_pb2.Input(
                data=resources_pb2.Data(
                    image=resources_pb2.Image(
                        url=IMAGE_URL,
                        allow_duplicate_url=True
                    ),
                    metadata=input_metadata
                )
            )
        ]
    ),
    metadata=metadata
)

if post_inputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_inputs_response.status)
    raise Exception("Post inputs failed, status: " + post_inputs_response.status.description)
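
Because the metadata must be valid JSON, it can help to check a dictionary locally before sending it. The sketch below uses only the standard library; `check_metadata` is a hypothetical helper, not part of the Clarifai client, and it checks JSON validity only, not any service-side rules.

```python
import json
from typing import Any, Dict


def check_metadata(metadata: Any) -> Dict[str, Any]:
    """Return `metadata` unchanged if it serializes to a JSON object, else raise ValueError."""
    if not isinstance(metadata, dict):
        raise ValueError("metadata must be a dict (a JSON object)")
    if not all(isinstance(key, str) for key in metadata):
        raise ValueError("top-level metadata keys must be strings")
    try:
        # allow_nan=False rejects NaN and Infinity, which are not valid JSON.
        json.dumps(metadata, allow_nan=False)
    except (TypeError, ValueError) as exc:
        raise ValueError(f"metadata is not valid JSON: {exc}") from exc
    return metadata


CUSTOM_METADATA = {"id": "id001", "type": "animal", "size": 100}
checked = check_metadata(CUSTOM_METADATA)
print(checked)
```

The checked dictionary can then be passed to `Struct.update()` as in the example above. Note that protobuf's `Struct` stores every number as a double, so an integer such as 100 is read back as 100.0.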

Upload Inputs From Cloud Storage

You can add inputs from various cloud storage platforms, such as S3 (Amazon Simple Storage Service) and GCP (Google Cloud Platform), by simply providing their corresponding URLs. In cases where access credentials are necessary, you can include them as part of the request.

For users whose data is already stored on a cloud platform, this is a simpler and more efficient alternative to uploading each input through the PostInputs endpoint.

note

This functionality is available starting with the 10.1 release.

info
  • Image files stored in the cloud platforms will be treated as image inputs, video files as video inputs, etc. Archives will be extracted, and their contents will also be processed like this.

  • We do not support extraction of archives located inside other archives.

  • The cloud URL serves as a filter prefix. For instance, given an S3 URL like s3://bucket/images_folder/abc, the files processed are those in images_folder whose names start with abc, or that sit in a subfolder whose name starts with abc. For example, bucket/images_folder/abcImage.png and bucket/images_folder/abc-1/Data.zip would both be processed.
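
The prefix rule can be illustrated with a small local sketch. It only models the behavior described above using string prefix matching; the actual selection happens on the service side and may differ in details. The helper names and the bucket contents are hypothetical.

```python
from typing import Iterable, List, Tuple


def split_storage_url(url: str) -> Tuple[str, str]:
    """Split 's3://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    _, sep, rest = url.partition("://")
    if not sep or not rest:
        raise ValueError(f"expected a URL like s3://bucket/prefix, got {url!r}")
    bucket, _, prefix = rest.partition("/")
    return bucket, prefix


def keys_under_prefix(url: str, keys: Iterable[str]) -> List[str]:
    """Return the object keys that start with the prefix part of `url`."""
    _, prefix = split_storage_url(url)
    return [key for key in keys if key.startswith(prefix)]


# Hypothetical bucket contents, used only to illustrate the filter.
keys = [
    "images_folder/abcImage.png",
    "images_folder/abc-1/Data.zip",
    "images_folder/xyz.png",
    "other_folder/abc.png",
]
matched = keys_under_prefix("s3://bucket/images_folder/abc", keys)
print(matched)  # → ['images_folder/abcImage.png', 'images_folder/abc-1/Data.zip']
```

A URL that ends with a slash, such as s3://samples.clarifai.com/storage/, would under this model select everything inside that folder.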

Upload Inputs via Cloud Storage URLs

Below is an example of pulling inputs from a subfolder of an S3 bucket.

######################################################################################################
# In this section, we set the user authentication, app ID, cloud storage URL, and an optional ID
# for the inputs job to be created. Change these strings to run your own example.
######################################################################################################

USER_ID = "YOUR_USER_ID_HERE"
# Your PAT (Personal Access Token) can be found in the Portal under Account > Security
PAT = "YOUR_PAT_HERE"
APP_ID = "YOUR_APP_ID_HERE"
# Change these to create your own extraction job
INPUTS_JOB_ID = "" # If empty, ID will be autogenerated; if non-empty, the given ID will be used
CLOUD_STORAGE_URL = "s3://samples.clarifai.com/storage/"

##########################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##########################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (("authorization", "Key " + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

post_inputs_response = stub.PostInputsDataSources(
    service_pb2.PostInputsDataSourcesRequest(
        user_app_id=userDataObject,
        app_pat=PAT,
        data_sources=[
            resources_pb2.InputsDataSource(
                inputs_add_job_id=INPUTS_JOB_ID,
                url=resources_pb2.DataSourceURL(
                    url=CLOUD_STORAGE_URL,
                    # Uncomment to add credentials
                    # credentials=resources_pb2.DataSourceCredentials(
                    #     s3_creds=resources_pb2.AWSCreds(
                    #         id="ADD_ACCESS_ID_HERE",
                    #         secret="ADD_SECRET_HERE",
                    #         region="ADD_AWS_REGION_HERE"
                    #     )
                    #     # If using GCP, replace s3_creds with gcp_creds:
                    #     # gcp_creds=b""  # GCP uses service account key data (creds.json) as a byte array for authentication
                    # ),
                ),
            )
        ],
    ),
    metadata=metadata,
)

if post_inputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_inputs_response.status)
    raise Exception(
        "Post inputs failed, status: " + post_inputs_response.status.description
    )

print(post_inputs_response)

Output Example
status {
code: SUCCESS
description: "Ok"
req_id: "8759d87e31403bbd838794fe6016f36d"
}
inputs_add_jobs {
id: "2581ebd8d7cd42e7ac0da2bec14d5426"
progress {
}
created_at {
seconds: 1708361354
nanos: 820114719
}
modified_at {
seconds: 1708361354
nanos: 847655746
}
extraction_jobs {
status {
code: JOB_QUEUED
description: "Job is queued to be ran."
}
id: "2a6f1f69cced42029986a72009e7d4da"
url: "s3://samples.clarifai.com/storage/"
progress {
}
created_at {
seconds: 1708361354
nanos: 835105396
}
modified_at {
seconds: 1708361354
nanos: 835105396
}
}
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
}

Track Upload Process

After you start pulling inputs from a cloud storage service, you can track the progress of the extraction job. To do so, use the inputs_extraction_job_id, which is the id of the extraction_jobs entry returned when the job was created.

###################################################################################################
# In this section, we set the user authentication, app ID, and the inputs extraction job ID.
# Change these strings to run your own example.
###################################################################################################

USER_ID = "YOUR_USER_ID_HERE"
# Your PAT (Personal Access Token) can be found in the Portal under Account > Security
PAT = "YOUR_PAT_HERE"
APP_ID = "YOUR_APP_ID_HERE"
# Change this to the ID of the extraction job whose upload progress you want to track
INPUTS_EXTRACTION_JOB_ID = "2a6f1f69cced42029986a72009e7d4da"

##########################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##########################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (("authorization", "Key " + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

get_inputs_extraction_response = stub.GetInputsExtractionJob(
    service_pb2.GetInputsExtractionJobRequest(
        user_app_id=userDataObject,
        inputs_extraction_job_id=INPUTS_EXTRACTION_JOB_ID
    ),
    metadata=metadata,
)

if get_inputs_extraction_response.status.code != status_code_pb2.SUCCESS:
    print(get_inputs_extraction_response.status)
    raise Exception(
        "Get input failed, status: " + get_inputs_extraction_response.status.description
    )

print(get_inputs_extraction_response)

Output Example
status {
code: SUCCESS
description: "Ok"
req_id: "bae1f832c8931d47388f875653e7035d"
}
inputs_extraction_job {
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
id: "2a6f1f69cced42029986a72009e7d4da"
url: "s3://samples.clarifai.com/storage/"
progress {
image_inputs_count: 3
video_inputs_count: 1
}
created_at {
seconds: 1708361354
nanos: 835105000
}
modified_at {
seconds: 1708361355
nanos: 386004000
}
}
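
Since a job starts as JOB_QUEUED and later reports JOB_COMPLETED, tracking usually means repeating the status request until a final code appears. The sketch below shows one possible polling loop. `wait_for_job` is a hypothetical helper, and it treats only JOB_COMPLETED as final; failure or cancellation codes are not shown in this section and would need to be added to `done_codes`.

```python
import time
from typing import Callable, Hashable, Iterable


def wait_for_job(
    fetch_status: Callable[[], Hashable],
    done_codes: Iterable[Hashable] = ("JOB_COMPLETED",),
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
) -> Hashable:
    """Call `fetch_status` until it returns a code in `done_codes`, then return that code."""
    done = set(done_codes)
    for attempt in range(max_attempts):
        code = fetch_status()
        if code in done:
            return code
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)  # wait before asking again
    raise TimeoutError(f"job did not finish after {max_attempts} checks")


# Stand-in for the real request: replays a fixed sequence of status names.
fake_statuses = iter(["JOB_QUEUED", "JOB_QUEUED", "JOB_COMPLETED"])
result = wait_for_job(lambda: next(fake_statuses), interval_seconds=0)
print(result)  # → JOB_COMPLETED
```

In real use, `fetch_status` would wrap the GetInputsExtractionJob call above and return `response.inputs_extraction_job.status.code`, with `done_codes` set to the matching numeric constants such as `status_code_pb2.JOB_COMPLETED`.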

List Inputs Extraction Jobs

You can list all your inputs extraction jobs and get their details.

##################################################################
# In this section, we set the user authentication and app ID.
# Change these strings to run your own example.
###################################################################

USER_ID = "YOUR_USER_ID_HERE"
# Your PAT (Personal Access Token) can be found in the Portal under Account > Security
PAT = "YOUR_PAT_HERE"
APP_ID = "YOUR_APP_ID_HERE"

##########################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##########################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (("authorization", "Key " + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

list_inputs_extraction_jobs = stub.ListInputsExtractionJobs(
    service_pb2.ListInputsExtractionJobsRequest(
        user_app_id=userDataObject, per_page=1000, page=1
    ),
    metadata=metadata,
)

if list_inputs_extraction_jobs.status.code != status_code_pb2.SUCCESS:
    print(list_inputs_extraction_jobs.status)
    raise Exception(
        "List input failed, status: " + list_inputs_extraction_jobs.status.description
    )

print(list_inputs_extraction_jobs)

Output Example
----
inputs_extraction_jobs {
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
id: "487d863784804390a92e1108ee1ae1fb"
url: "s3://samples.clarifai.com/storage/"
progress {
image_inputs_count: 3
video_inputs_count: 1
}
created_at {
seconds: 1708406450
nanos: 685101000
}
modified_at {
seconds: 1708406451
nanos: 191007000
}
}
inputs_extraction_jobs {
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
id: "16d65cdff5d64ae8ba94ae59f5d7f43c"
url: "s3://samples.clarifai.com/storage/"
progress {
image_inputs_count: 3
video_inputs_count: 1
}
created_at {
seconds: 1708406156
nanos: 2926000
}
modified_at {
seconds: 1708406156
nanos: 560108000
}
}
inputs_extraction_jobs {
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
id: "423b4dfa36f64fffbe79cf845918d4c0"
url: "s3://samples.clarifai.com/storage/"
progress {
image_inputs_count: 3
video_inputs_count: 1
}
created_at {
seconds: 1708405684
nanos: 297689000
}
modified_at {
seconds: 1708405684
nanos: 778885000
}
}
inputs_extraction_jobs {
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
id: "a5af6a185ab148d4b7eb02e713d3340d"
url: "s3://samples.clarifai.com/storage/"
progress {
image_inputs_count: 3
video_inputs_count: 1
}
created_at {
seconds: 1708405639
nanos: 186106000
}
modified_at {
seconds: 1708405639
nanos: 696943000
}
}
inputs_extraction_jobs {
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
id: "1c10da09706d40448bf11fc5aaa8664b"
url: "s3://samples.clarifai.com/storage/"
progress {
image_inputs_count: 3
video_inputs_count: 1
}
created_at {
seconds: 1708405297
nanos: 953730000
}
modified_at {
seconds: 1708405298
nanos: 506209000
}
}
inputs_extraction_jobs {
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
id: "10ad7ba72e5e49899a042637178c9452"
url: "s3://samples.clarifai.com/storage/"
progress {
image_inputs_count: 3
video_inputs_count: 1
}
created_at {
seconds: 1708404787
nanos: 575667000
}
modified_at {
seconds: 1708404788
nanos: 141744000
}
}
inputs_extraction_jobs {
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
id: "8d7a240f39494ce18c3a5f4aeea687c1"
url: "s3://samples.clarifai.com/storage/"
progress {
image_inputs_count: 3
video_inputs_count: 1
}
created_at {
seconds: 1708403207
nanos: 89134000
}
modified_at {
seconds: 1708403207
nanos: 729276000
}
}
inputs_extraction_jobs {
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
id: "2a6f1f69cced42029986a72009e7d4da"
url: "s3://samples.clarifai.com/storage/"
progress {
image_inputs_count: 3
video_inputs_count: 1
}
created_at {
seconds: 1708361354
nanos: 835105000
}
modified_at {
seconds: 1708361355
nanos: 386004000
}
}
inputs_extraction_jobs {
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
id: "6db64516daf04abd97852407f9076e42"
url: "s3://samples.clarifai.com/storage/"
progress {
image_inputs_count: 3
video_inputs_count: 1
}
created_at {
seconds: 1708361312
nanos: 309789000
}
modified_at {
seconds: 1708361313
nanos: 435552000
}
}
inputs_extraction_jobs {
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
id: "7e4bd42e84294e8f9423e0a01783e3b1"
url: "s3://samples.clarifai.com/storage/"
progress {
image_inputs_count: 3
video_inputs_count: 1
}
created_at {
seconds: 1708354769
nanos: 17131000
}
modified_at {
seconds: 1708354769
nanos: 473323000
}
input_template {
data {
concepts {
id: "lamborghini23_A"
value: 1
}
concepts {
id: "spiderman_a"
value: 1
}
metadata {
fields {
key: "id"
value {
string_value: "id001"
}
}
}
}
dataset_ids: "dataset-1"
}
}
-----
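
The request above asks for a single page of up to 1000 jobs. If an app has more jobs than fit on one page, the `page` parameter can be advanced in a loop. The following sketch assumes that pages are numbered from 1 and that a page shorter than `per_page` is the last one; `iter_all_pages` and `fake_fetch` are hypothetical names, and the fake data stands in for real responses.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_all_pages(fetch_page: Callable[[int, int], List[T]], per_page: int = 1000) -> Iterator[T]:
    """Yield items from page 1 onward until a page comes back shorter than `per_page`."""
    page = 1
    while True:
        items = fetch_page(page, per_page)
        yield from items
        if len(items) < per_page:
            break
        page += 1


# Stand-in for the real request: serves slices of a fixed list of 25 job IDs.
all_jobs = [f"job-{n}" for n in range(25)]


def fake_fetch(page: int, per_page: int) -> List[str]:
    start = (page - 1) * per_page
    return all_jobs[start:start + per_page]


collected = list(iter_all_pages(fake_fetch, per_page=10))
print(len(collected))  # → 25
```

In real use, `fetch_page` would issue the ListInputsExtractionJobs request with the given `page` and `per_page` values and return `list(response.inputs_extraction_jobs)`.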

Cancel Extraction Jobs

You can cancel an extraction job that is pulling inputs from a cloud storage service. This uses the inputs_extraction_job_id returned when the extraction job was created.

#####################################################################################################
# In this section, we set the user authentication, app ID, and the inputs extraction job ID.
# Change these strings to run your own example.
#####################################################################################################

USER_ID = "YOUR_USER_ID_HERE"
# Your PAT (Personal Access Token) can be found in the Portal under Account > Security
PAT = "YOUR_PAT_HERE"
APP_ID = "YOUR_APP_ID_HERE"
# Change this to the ID of the extraction job you want to cancel
INPUTS_EXTRACTION_JOB_ID = "2a6f1f69cced42029986a72009e7d4da"

##########################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##########################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (("authorization", "Key " + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

cancel_inputs_extraction_response = stub.CancelInputsExtractionJobs(
    service_pb2.CancelInputsExtractionJobsRequest(
        user_app_id=userDataObject, ids=[INPUTS_EXTRACTION_JOB_ID]
    ),
    metadata=metadata,
)

if cancel_inputs_extraction_response.status.code != status_code_pb2.SUCCESS:
    print(cancel_inputs_extraction_response.status)
    raise Exception(
        "Cancel input failed, status: "
        + cancel_inputs_extraction_response.status.description
    )

print(cancel_inputs_extraction_response)

Upload Inputs With Concepts and Datasets

You can also add inputs from cloud storage platforms while attaching concepts to them, assigning them to an existing dataset, or adding custom metadata.

The input_template parameter allows you to do that.

#####################################################################################################
# In this section, we set the user authentication, app ID, and the details of the extraction job.
# Change these strings to run your own example.
####################################################################################################

USER_ID = "YOUR_USER_ID_HERE"
# Your PAT (Personal Access Token) can be found in the Portal under Account > Security
PAT = "YOUR_PAT_HERE"
APP_ID = "YOUR_APP_ID_HERE"
# Change these to make your own extraction
INPUTS_JOB_ID = ""
CLOUD_STORAGE_URL = "s3://samples.clarifai.com/storage/"
CUSTOM_METADATA = {"id": "id001"}
DATASET_ID_1 = "dataset-1"
CONCEPT_ID_1 = "lamborghini23_A"
CONCEPT_ID_2 = "spiderman_a"

##############################################################################
# YOU DO NOT NEED TO CHANGE ANYTHING BELOW THIS LINE TO RUN THIS EXAMPLE
##############################################################################

from clarifai_grpc.channel.clarifai_channel import ClarifaiChannel
from clarifai_grpc.grpc.api import resources_pb2, service_pb2, service_pb2_grpc
from clarifai_grpc.grpc.api.status import status_code_pb2
from google.protobuf.struct_pb2 import Struct

channel = ClarifaiChannel.get_grpc_channel()
stub = service_pb2_grpc.V2Stub(channel)

metadata = (("authorization", "Key " + PAT),)

userDataObject = resources_pb2.UserAppIDSet(user_id=USER_ID, app_id=APP_ID)

input_metadata = Struct()

input_metadata.update(CUSTOM_METADATA)

post_inputs_response = stub.PostInputsDataSources(
    service_pb2.PostInputsDataSourcesRequest(
        user_app_id=userDataObject,
        app_pat=PAT,
        data_sources=[
            resources_pb2.InputsDataSource(
                inputs_add_job_id=INPUTS_JOB_ID,
                url=resources_pb2.DataSourceURL(url=CLOUD_STORAGE_URL),
                input_template=resources_pb2.Input(
                    dataset_ids=[DATASET_ID_1],  # List of dataset IDs that this input is part of
                    data=resources_pb2.Data(
                        metadata=input_metadata,
                        concepts=[
                            resources_pb2.Concept(id=CONCEPT_ID_1, value=1),
                            resources_pb2.Concept(id=CONCEPT_ID_2, value=1),
                        ],
                    ),
                ),
            )
        ],
    ),
    metadata=metadata,
)

if post_inputs_response.status.code != status_code_pb2.SUCCESS:
    print(post_inputs_response.status)
    raise Exception(
        "Post inputs failed, status: " + post_inputs_response.status.description
    )

print(post_inputs_response)

Output Example
status {
code: SUCCESS
description: "Ok"
req_id: "32694c6a3ef8fe3f6704502c0b053734"
}
inputs_add_jobs {
id: "66b5ca001e754111a81c4839cdabed10"
progress {
}
created_at {
seconds: 1708500170
nanos: 508992497
}
modified_at {
seconds: 1708500170
nanos: 582792601
}
extraction_jobs {
status {
code: JOB_QUEUED
description: "Job is queued to be ran."
}
id: "7e9b139f65fb4426a3d273d609758d34"
url: "s3://samples.clarifai.com/storage/"
progress {
}
created_at {
seconds: 1708500170
nanos: 550291872
}
modified_at {
seconds: 1708500170
nanos: 550291872
}
input_template {
data {
concepts {
id: "lamborghini23_A"
value: 1
}
concepts {
id: "spiderman_a"
value: 1
}
metadata {
fields {
key: "id"
value {
string_value: "id001"
}
}
}
}
dataset_ids: "dataset-1"
}
}
status {
code: JOB_COMPLETED
description: "Job successfully ran."
}
}
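
One way to think about input_template is as a set of shared fields that is attached to every input created from the storage location. The sketch below models that idea with plain dictionaries. It is only an illustration of the described effect, not how the service implements it; `apply_template` and the file names are hypothetical.

```python
import copy
from typing import Any, Dict, List


def apply_template(template: Dict[str, Any], file_urls: List[str]) -> List[Dict[str, Any]]:
    """Return one input description per URL, each carrying its own copy of the template fields."""
    inputs = []
    for url in file_urls:
        item = copy.deepcopy(template)  # deep copy so inputs do not share nested lists
        item["url"] = url
        inputs.append(item)
    return inputs


template = {
    "dataset_ids": ["dataset-1"],
    "concepts": [{"id": "lamborghini23_A", "value": 1}, {"id": "spiderman_a", "value": 1}],
    "metadata": {"id": "id001"},
}
# Hypothetical file names under the storage URL used in the example above.
files = [
    "s3://samples.clarifai.com/storage/a.jpg",
    "s3://samples.clarifai.com/storage/b.mp4",
]
templated = apply_template(template, files)
for item in templated:
    print(item["url"], item["dataset_ids"], [c["id"] for c in item["concepts"]])
```

Under this model, every file found at the storage URL ends up in dataset-1 with both concepts and the same metadata, which matches the input_template echoed back in the output example.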