Dataset Parameters
The Dataset Parameters allow you to customize the dataset to your needs. All of the parameters below are supported for both TensorFlow and PyTorch.
1. datasetId
It is set by default when you link the dataset with a model. You don't need to re-enter this value.
2. totalDatasetSize
This parameter is set by default based on the selected dataset. It describes the total number of images in the dataset.
3. allClasses
This parameter is set by default based on the selected dataset. It describes the total number of images per class in the dataset.
4. trainingDatasetSize
This parameter follows from the dataset customisation made with the trainingClasses parameter. It describes the total number of images used for training and evaluating the model after customisation. By default it is equal to totalDatasetSize.
5. trainingClasses
This parameter is used to customise the dataset. It takes a dictionary as input, with the class name as key and the number of images to select as value. The dictionary must contain every class, and each value (the number of images for that class) must be greater than one.
Example: the selected dataset contains two classes, 'car' and 'person', with 65 and 42 images respectively: {'car': 65, 'person': 42}. A sub-dataset can be created like this:
trainingObject.trainingClasses({'car': 30, 'person': 30})
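The rules for the trainingClasses dictionary can be checked locally before calling the method. The following is a minimal sketch, not part of the platform: the helper name check_training_classes is hypothetical, and both the upper limit per class and the idea that the selected counts sum to trainingDatasetSize are assumptions rather than documented behaviour.

```python
def check_training_classes(all_classes, training_classes):
    """Validate a trainingClasses-style dict and return the total images selected.

    all_classes: images available per class, e.g. {'car': 65, 'person': 42}
    training_classes: images requested per class
    """
    # Every class in the dataset must appear in the dictionary.
    missing = sorted(set(all_classes) - set(training_classes))
    if missing:
        raise ValueError(f"missing classes: {missing}")
    unknown = sorted(set(training_classes) - set(all_classes))
    if unknown:
        raise ValueError(f"unknown classes: {unknown}")
    for name, count in training_classes.items():
        # Each value must be an integer greater than one.
        if not isinstance(count, int) or count <= 1:
            raise ValueError(f"{name}: count must be an integer greater than one")
        # Assumption: a class cannot contribute more images than it holds.
        if count > all_classes[name]:
            raise ValueError(f"{name}: only {all_classes[name]} images available")
    # Assumption: the selected counts add up to the training dataset size.
    return sum(training_classes.values())


available = {'car': 65, 'person': 42}
selected = {'car': 30, 'person': 30}
total_selected = check_training_classes(available, selected)
print(total_selected)  # 60
```

A dictionary that leaves out a class, such as {'car': 30}, raises a ValueError in this sketch.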
6. imageShape
This parameter specifies the image shape to be used for training. The value must be an integer between 48 and 224. The default value is 224. Set this parameter like this:
trainingObject.imageShape(124)
7. imageType
This parameter specifies the image type to be used for training. The supported formats are rgb and grayscale. The default value is rgb.
Set this parameter like this:
trainingObject.imageType('rgb')
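The constraints on imageShape and imageType described above can be sketched as a small local check. This is an illustration only: check_image_settings is a hypothetical helper, and treating the 48 and 224 bounds as inclusive is an assumption (the default of 224 suggests the upper bound is).

```python
SUPPORTED_IMAGE_TYPES = ('rgb', 'grayscale')


def check_image_settings(image_shape=224, image_type='rgb'):
    """Return (image_shape, image_type) if both values respect the documented limits."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(image_shape, bool) or not isinstance(image_shape, int):
        raise ValueError("imageShape must be an integer")
    # Assumption: both bounds are inclusive.
    if not 48 <= image_shape <= 224:
        raise ValueError("imageShape must be between 48 and 224")
    if image_type not in SUPPORTED_IMAGE_TYPES:
        raise ValueError(f"imageType must be one of {SUPPORTED_IMAGE_TYPES}")
    return image_shape, image_type


settings = check_image_settings(124, 'grayscale')
defaults = check_image_settings()
```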
8. seed
This parameter controls whether a global random seed is set. The default value is False. Set this parameter like this:
trainingObject.seed(True)
Special Dataset Parameters
There are a few methods that are specific to the tabular/generic classification use case.
1. get_features
This method returns a list of all the features and the interaction methods available in the dataset.
trainingObject.get_features()
2. feature_interaction
This method lets the user create additional features using the methods listed by get_features. For each method, the user has to specify the features in a dictionary; get_features gives an example for each. This method can be called any number of times to create new features. Each call adds a new feature interaction if it is unique. The default value is [].
trainingObject.feature_interaction({'feature1': 'feature1', 'feature2': 'feature2', 'method':'product'})
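To make the behaviour described above concrete, here is a rough local sketch of how unique interactions might be collected and what a 'product' interaction could compute. It does not show the platform's implementation: add_interaction, apply_product, and the derived column name are all hypothetical, and the element-wise multiplication is an assumption about what 'product' means.

```python
def add_interaction(interactions, spec):
    """Record spec unless an identical interaction is already in the list."""
    if spec not in interactions:
        interactions.append(spec)
    return interactions


def apply_product(rows, spec):
    """Return copies of rows with one extra column: feature1 * feature2."""
    first, second = spec['feature1'], spec['feature2']
    # Hypothetical naming scheme for the derived column.
    new_name = f"{first}_product_{second}"
    result = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        new_row[new_name] = row[first] * row[second]
        result.append(new_row)
    return result


interactions = []  # mirrors the documented default of []
spec = {'feature1': 'feature1', 'feature2': 'feature2', 'method': 'product'}
add_interaction(interactions, spec)
add_interaction(interactions, dict(spec))  # duplicate, so it is ignored

rows = [{'feature1': 2.0, 'feature2': 3.0}]
augmented = apply_product(rows, spec)
```

Here interactions ends up with a single entry, and each row in augmented carries the extra product column alongside the original features.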
3. feature_selection (Coming Soon)
This method allows the user to select features from the feature list for training.
trainingObject.feature_selection(['feature1', 'feature2', 'feature3'])