# Prerequisites

> Supported data types, tasks, and requirements for creating a use case on tracebloc.

tracebloc is continually expanding the supported data types and tasks. If your use case is not yet supported, please reach out to us at [support@tracebloc.io](mailto:support@tracebloc.io).

Before you can create a use case on the [tracebloc website](https://ai.tracebloc.io/my-use-cases), make sure the following requirements are met:

* You are registered as a user on the tracebloc platform
* You have a client deployed using Kubernetes, either [locally or in the cloud](/environment-setup/setup-guide)
* Your dataset is cleaned and preprocessed
* You are familiar with the supported data types and tasks

Once these requirements are met, proceed with:

* [Preparing and ingesting the dataset](/create-use-case/prepare-dataset)
* [Defining the use case](/create-use-case/define)
* [Setting evaluation metrics](/create-use-case/define#supported-metrics-per-data-type-and-task)
* [Evaluating models](/create-use-case/evaluate-models)

## Supported Data Types and Tasks

### Image Data

**Requirements for all image data tasks:** All images in a dataset must have the same size and the same file type, for example 256x256 RGB `.jpg` files. Convert files where necessary, and crop or resize images that do not fit the supported size.

<Note>
  **Filenames in the label CSV:** For all image and text tasks, the `filename` column in the label CSV **must not include the file extension** (e.g. use `cat01`, not `cat01.jpg`). The extension is configured once on the ingestor side via `file_options.extension` in the template and applied to every row at ingestion time.
</Note>

The data ingestor validates all images before ingestion. Ingestion starts only when every file meets the requirements. Fix or remove any invalid images, then retry.
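A local pre-check can catch size problems before the ingestor rejects a dataset. The following is a minimal sketch, not the ingestor's own validation: it reads the width and height from a PNG header using only the standard library and flags files that are not square or that differ in size from the first file. The names `png_size` and `find_size_problems` are hypothetical, and JPG files would need an imaging library instead of the header parsing shown here.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(header: bytes) -> tuple:
    """Return (width, height) from the first 24 bytes of a PNG file."""
    if len(header) < 24 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers at the start of IHDR.
    return struct.unpack(">II", header[16:24])


def find_size_problems(sizes: dict) -> list:
    """List files that are not square or differ from the first file's size."""
    problems = []
    expected = next(iter(sizes.values()), None)
    for name, (width, height) in sizes.items():
        if width != height:
            problems.append(f"{name}: not square ({width}x{height})")
        elif (width, height) != expected:
            problems.append(f"{name}: size differs from {expected[0]}x{expected[1]}")
    return problems


# In practice the sizes would come from files, for example:
#   with open(path, "rb") as handle: sizes[path.name] = png_size(handle.read(24))
sizes = {"cat01.png": (256, 256), "dog02.png": (256, 200), "owl03.png": (128, 128)}
print(find_size_problems(sizes))
```

An empty list means every file is square and matches the first file's dimensions.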

For object detection, images and annotations are automatically up- or downsized to 448x448 pixels.

| Task                  | Input file type | Color mode                                                   | Supported image size                                                                        | Label file type | Requirements                                                            | Links                                                                                                                                                        |
| --------------------- | --------------- | ------------------------------------------------------------ | ------------------------------------------------------------------------------------------- | --------------- | ----------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Classification        | PNG, JPG, JPEG  | RGB (3 channels) or grayscale (1 channel), 8-bit per channel | Square (height = width)                                                                     | CSV             | Uniform image size and file type per dataset                            | [Detailed structure](#image-classification) <br />[Example](https://github.com/tracebloc/data-ingestors/tree/develop/templates/image_classification)         |
| Keypoint Detection    | PNG, JPG, JPEG  | RGB (3 channels) or grayscale (1 channel), 8-bit per channel | Square (height = width)                                                                     | CSV             | Uniform image sizes <br /> Same number of keypoints per image and class | [Detailed structure](#image-keypoint-detection) <br />[Example](https://github.com/tracebloc/data-ingestors/tree/develop/templates/keypoint_detection)       |
| Object Detection      | PNG, JPG, JPEG  | RGB (3 channels) or grayscale (1 channel), 8-bit per channel | Square (height = width). Images and annotations will be resized to 448x448 px automatically | Pascal VOC      | Uniform image sizes, one XML per image                                  | [Detailed structure](#image-object-detection) <br />[Example](https://github.com/tracebloc/data-ingestors/tree/develop/templates/object_detection)           |
| Semantic Segmentation | PNG, JPG, JPEG  | RGB (3 channels) or grayscale (1 channel), 8-bit per channel | Square (height = width)                                                                     | PNG, JPG, JPEG  | Uniform image and mask sizes                                            | [Detailed structure](#image-semantic-segmentation) <br />[Example](https://github.com/tracebloc/data-ingestors/tree/develop/templates/semantic_segmentation) |

### Image Classification

```structure theme={null}
train/
    labels.csv
    images/
        cat01.jpg
        dog02.jpg
        ...
test/
    ...
```

```labels.csv theme={null}
filename,label
cat01,cat
dog02,dog
...
```

The `filename` column must not include the file extension. Set the expected extension once via `file_options.extension` in the ingestor template.
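If your labels currently reference full file names, the extension can be stripped while writing `labels.csv`. This is a minimal sketch under the assumption that you hold a list of `(image_path, label)` pairs; the helper name `write_labels_csv` is hypothetical and not part of the tracebloc tooling.

```python
import csv
import io
from pathlib import PurePath


def write_labels_csv(items, out):
    """Write (image_path, label) pairs as filename,label rows without extensions."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["filename", "label"])
    for image_path, label in items:
        # PurePath.stem drops the final suffix: "cat01.jpg" -> "cat01"
        writer.writerow([PurePath(image_path).stem, label])


buffer = io.StringIO()
write_labels_csv([("images/cat01.jpg", "cat"), ("images/dog02.jpg", "dog")], buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open("labels.csv", "w", newline="")` in place of the `StringIO` buffer.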

### Image Keypoint Detection

The number of keypoints must be the same for every class and every image. For example, in a person/car keypoint detection project, both classes must define the same number of keypoints, and every image must contain the full set for its class. You cannot mix classes with different keypoint counts (e.g., 16 for person and 32 for car) or annotate some images with fewer keypoints for the same class.

```structure theme={null}
train/
    annotations.csv
    images/
        image01.png
        image02.png
        ...
test/
    ...
```

```annotations.csv theme={null}
filename,label,x,y,visibility
image01,person,100,150,1
image01,car,120,140,1
image02,person,95,155,0
image02,car,115,145,1
```

* **`x` and `y`** are the X and Y coordinates of a keypoint
* **`visibility`** indicates whether a keypoint is visible in the image: 0 = not visible (the point lies outside the image, or inside it but occluded), 1 = visible
* **`filename`** must not include the file extension
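The rules above can be checked before ingestion. The sketch below is an illustration rather than the ingestor's validation logic: the function name `check_keypoints` is hypothetical, and it assumes one row per keypoint and at most one instance of each class per image, so that every `(filename, label)` pair should appear the same number of times.

```python
import csv
import io
from collections import Counter


def check_keypoints(csv_text: str) -> list:
    """Return a list of problems found in an annotations.csv string."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    problems = []
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        if row["visibility"] not in {"0", "1"}:
            problems.append(f"line {line_no}: visibility must be 0 or 1")
    # Count keypoints per image and class; all counts must be identical.
    counts = Counter((row["filename"], row["label"]) for row in rows)
    if len(set(counts.values())) > 1:
        problems.append(f"keypoint counts differ: {dict(counts)}")
    return problems


annotations = (
    "filename,label,x,y,visibility\n"
    "image01,person,100,150,1\n"
    "image01,car,120,140,1\n"
    "image02,person,95,155,0\n"
    "image02,car,115,145,1\n"
)
print(check_keypoints(annotations))
```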

### Image Object Detection

Images and annotations are linked by their base filename: `street01.png` is annotated by `street01.xml`. The XML annotations use the Pascal VOC format, and `labels.csv` contains a global list of all images and objects. Currently, images and annotations are resized to 448x448 pixels.

```structure theme={null}
train/
    labels.csv
    images/
        street01.png
        street02.png
        ...
    annotations/
        street01.xml
        street02.xml
        ...
test/
    ...
```

```xml theme={null}
<annotation>
    <filename>street01.png</filename>
    <object>
        <name>car</name>
        <bndbox>
            <xmin>100</xmin>
            <ymin>200</ymin>
            <xmax>300</xmax>
            <ymax>400</ymax>
        </bndbox>
    </object>
    <object>
        ...
    </object>
</annotation>
```

Each row in `labels.csv` represents one annotated object, not one image. An image with multiple objects has multiple rows.

```labels.csv theme={null}
filename,image_label
street01,car
street01,car
street01,person
street02,car
...
```

The `filename` column links each row to its image and to the matching XML annotation file. The `image_label` column holds the class name of each object instance, with one row per object. Filenames must not include the file extension.
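Because `labels.csv` and the XML files describe the same objects, the two can be compared before ingestion. The following sketch assumes that the class names in the CSV should match the `<name>` elements of the corresponding Pascal VOC file one for one; the helper names `object_names` and `labels_match` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def object_names(voc_xml: str) -> Counter:
    """Count the class names of all <object> elements in a Pascal VOC annotation."""
    root = ET.fromstring(voc_xml)
    return Counter(obj.findtext("name") for obj in root.iter("object"))


def labels_match(csv_rows, filename: str, voc_xml: str) -> bool:
    """True if the CSV rows for `filename` list the same objects as the XML."""
    from_csv = Counter(label for name, label in csv_rows if name == filename)
    return from_csv == object_names(voc_xml)


voc_xml = """
<annotation>
    <filename>street01.png</filename>
    <object><name>car</name></object>
    <object><name>car</name></object>
    <object><name>person</name></object>
</annotation>
"""
rows = [("street01", "car"), ("street01", "car"), ("street01", "person"), ("street02", "car")]
print(labels_match(rows, "street01", voc_xml))
```

The bounding boxes are omitted from the sample XML for brevity; the comparison only looks at class names.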

### Image Semantic Segmentation

Each mask is an image whose pixel values map to the classes defined in `labels.csv`, which lists every class for every image. Each mask must exactly match its corresponding image in size and file name.
For binary segmentation (two classes), provide a single-channel grayscale mask where background pixels are black (0) and foreground pixels are white (255).
For three or more classes, supply an RGB mask where each class is represented by a unique color (or pixel value). Filenames in `labels.csv` must not include the file extension.

```structure theme={null}
train/
    labels.csv
    images/
        scene01.png
        scene02.png
        ...
    masks/
        scene01.png
        scene02.png
        ...
test/
    ...
```

```labels.csv theme={null}
filename,mask_filename,label,colour
scene01,scene01,road,#FFFFFF
scene01,scene01,background,#000000
scene02,scene02,background,#000000
scene02,scene02,road,#FFFFFF
...
```
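The `colour` column ties each class to a mask pixel value, so it helps to confirm that the mapping is consistent across rows. This sketch assumes six-digit hex colours as in the example above; the helpers `hex_to_rgb` and `class_colours` are hypothetical.

```python
def hex_to_rgb(colour: str) -> tuple:
    """Convert a '#RRGGBB' string to an (R, G, B) tuple of integers."""
    value = colour.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected #RRGGBB, got {colour!r}")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


def class_colours(rows) -> dict:
    """Build a label -> RGB mapping and reject inconsistent or shared colours."""
    mapping = {}
    for label, colour in rows:
        rgb = hex_to_rgb(colour)
        # setdefault returns the colour stored first for this label.
        if mapping.setdefault(label, rgb) != rgb:
            raise ValueError(f"{label} has more than one colour")
    if len(set(mapping.values())) != len(mapping):
        raise ValueError("two classes share a colour")
    return mapping


rows = [("road", "#FFFFFF"), ("background", "#000000"), ("road", "#FFFFFF")]
print(class_colours(rows))
```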

### Tabular Data

**Requirements for all tabular data tasks:** Each dataset must be provided as a single CSV file with a header row. Each column must hold a single data type, for example numeric values for features and a categorical or alphanumeric column for labels. Use UTF-8 encoding with comma separators, and validate that your schema matches the expected types. The ingestor skips invalid rows.

| Task                     | Data file type                                          | Requirements                                                                                                                                                                      | Links                                                                                                                                                        |
| ------------------------ | ------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| Classification           | CSV (features and label in one single file)             | Uniform data formats per column. <br />Feature columns: Numeric <br />Label column: Alphanumeric                                                                                  | [Detailed structure](#tabular-classification) <br />[Example](https://github.com/tracebloc/data-ingestors/tree/develop/templates/tabular_classification)     |
| Regression               | CSV (features and label in one single file)             | Uniform data formats per column. <br />Feature columns: Numeric <br />Label column: Numeric (continuous target)                                                                   | [Detailed structure](#tabular-regression) <br />[Example](https://github.com/tracebloc/data-ingestors/tree/develop/templates/tabular_regression)             |
| Time Series Forecasting  | CSV (timestamp, features and target in one single file) | A timestamp column in a parsable format (e.g. `YYYY-MM-DD` or ISO 8601). <br />Feature columns: Numeric <br />Target column: Numeric                                              | [Detailed structure](#time-series-forecasting) <br />[Example](https://github.com/tracebloc/data-ingestors/tree/develop/templates/time_series_forecasting)   |
| Time to Event Prediction | CSV (features, time and event in one single file)       | A `time` column (duration until event or censoring, integer or numeric). <br />An event column (binary 0/1 indicating whether the event occurred). <br />Feature columns: Numeric | [Detailed structure](#time-to-event-prediction) <br />[Example](https://github.com/tracebloc/data-ingestors/tree/develop/templates/time_to_event_prediction) |

### Tabular Classification

Include a header row with clear column names, using a dedicated column for the labels. An `id` column is recommended but not required.

```csv theme={null}
id,feature1,feature2,feature3,label
1,1.5,2.3,0.8,class_a
2,2.1,1.9,1.2,class_b
3,0.9,3.1,0.5,class_a
...
```
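Since the ingestor skips invalid rows, it can be useful to find non-numeric feature values yourself first. The sketch below treats every column other than the label and the optional `id` column as a feature; the function name `non_numeric_cells` and its parameters are hypothetical and only illustrate the check.

```python
import csv
import io


def non_numeric_cells(csv_text: str, label_column: str = "label", id_column: str = "id") -> list:
    """Return (line, column, value) for every feature cell that is not numeric."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for column, value in row.items():
            if column in (label_column, id_column):
                continue
            try:
                float(value)
            except (TypeError, ValueError):
                # TypeError covers missing cells, which DictReader reports as None.
                problems.append((line_no, column, value))
    return problems


data = (
    "id,feature1,feature2,feature3,label\n"
    "1,1.5,2.3,0.8,class_a\n"
    "2,abc,1.9,1.2,class_b\n"
)
print(non_numeric_cells(data))
```

For regression datasets, where the target is also numeric, pass a `label_column` name that does not exist in the file so that the target column is checked as well.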

### Tabular Regression

Same structure as Tabular Classification, but the label column holds a continuous numeric target (not a class).

```csv theme={null}
id,square_feet,bedrooms,age,price
1,1668.08,3,15,285.50
2,1701.78,4,12,320.75
3,1697.01,2,8,245.30
...
```

### Time Series Forecasting

Provide a single CSV with a timestamp column, one or more numeric feature columns, and the numeric target column you want to forecast. Rows must be ordered by time and use a consistent timestamp format.

```csv theme={null}
timestamp,feature_1,feature_2,target
2023-10-01,7,1,125.50
2023-10-02,1,0,132.30
2023-10-03,2,0,128.75
...
```
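The ordering and format requirements can be verified with the standard library. This sketch assumes ISO 8601 timestamps such as `2023-10-01`, which `datetime.fromisoformat` parses; the function name `out_of_order_rows` is hypothetical.

```python
from datetime import datetime


def out_of_order_rows(timestamps) -> list:
    """Return 0-based indices of timestamps that are earlier than the one before.

    Raises ValueError if a timestamp is not in ISO 8601 format.
    """
    parsed = [datetime.fromisoformat(value) for value in timestamps]
    return [i for i in range(1, len(parsed)) if parsed[i] < parsed[i - 1]]


print(out_of_order_rows(["2023-10-01", "2023-10-02", "2023-10-03"]))
print(out_of_order_rows(["2023-10-02", "2023-10-01", "2023-10-03"]))
```

An empty list means the rows are already sorted by time.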

### Time to Event Prediction

Provide a single CSV with feature columns, a `time` column (duration to event or censoring), and a binary event column (1 = event occurred, 0 = censored).

```csv theme={null}
age,feature_1,feature_2,time,event
75,0,1.9,4,1
55,0,1.1,6,1
65,0,1.3,7,0
...
```
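A quick check of the `time` and event columns might look like the sketch below. The function name `check_time_to_event` is hypothetical, the event column is assumed to be named `event` as in the example above, and the rule that durations cannot be negative is an assumption rather than a documented requirement.

```python
import csv
import io


def check_time_to_event(csv_text: str, time_column: str = "time", event_column: str = "event") -> list:
    """Return a list of problems found in the time and event columns."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        if row[event_column] not in {"0", "1"}:
            problems.append(f"line {line_no}: {event_column} must be 0 or 1")
        try:
            duration = float(row[time_column])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: {time_column} is not numeric")
            continue
        if duration < 0:  # assumption: durations are non-negative
            problems.append(f"line {line_no}: {time_column} is negative")
    return problems


data = (
    "age,feature_1,feature_2,time,event\n"
    "75,0,1.9,4,1\n"
    "55,0,1.1,-6,2\n"
)
print(check_time_to_event(data))
```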

### Text Data

| Task           | Input files | Label file type | Requirements                 | Links                                                                                                                                              |
| -------------- | ----------- | --------------- | ---------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
| Classification | TXT         | CSV             | Text files must not be empty | [Detailed structure](#text-classification) <br />[Example](https://github.com/tracebloc/data-ingestors/tree/develop/templates/text_classification) |

### Text Classification

The `filename` column must not include the file extension. The extension is set once via `file_options.extension` in the ingestor template (e.g. `FileExtension.TXT`).

```structure theme={null}
train/
    labels.csv
    texts/
        review01.txt
        review02.txt
        ...
test/
    ...
```

```labels.csv theme={null}
filename,label
review01,positive
review02,negative
...
```

```text file example theme={null}
# review01.txt
This product is amazing! I love it.
```
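Before ingestion, you may want to confirm that every row in `labels.csv` points to an existing, non-empty text file. The sketch below assumes the directory layout shown above; the function name `find_text_problems` is hypothetical, and treating whitespace-only files as empty is an assumption.

```python
import csv
from pathlib import Path


def find_text_problems(split_dir: Path, extension: str = ".txt") -> list:
    """Check that each filename in labels.csv has a non-empty file in texts/."""
    problems = []
    with open(split_dir / "labels.csv", newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            # labels.csv stores names without extension, so add it back here.
            path = split_dir / "texts" / (row["filename"] + extension)
            if not path.is_file():
                problems.append(f"{path.name}: missing")
            elif not path.read_text(encoding="utf-8").strip():
                problems.append(f"{path.name}: empty")
    return problems
```

Calling `find_text_problems(Path("train"))` returns an empty list when every labelled file exists and contains text.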

***

## Next Steps

* Prepare and ingest your dataset: [Prepare Data](/create-use-case/prepare-dataset)

***

## Need Help?

* Email us at [support@tracebloc.io](mailto:support@tracebloc.io)
