Job Configuration API
After you start a job, you can call these optional APIs to view and update the job's configuration before it runs:
Sheet Selection
The Sheet Selection API returns the currently selected sheet (in an Excel workbook) and lets you configure which sheet to select.
getSelectedSheet
To get the selected sheet and a list of available sheets:
- JavaScript
- API
const sheetSelectionResponse = segna.getSheetSelection(jobId);
curl -X GET https://backend.segna.io/public/client-side/v1/selected-sheet/{jobId} \
-H 'x-api-key: YOUR_API_KEY'
Example Response:
{
"sheetSelection": {
"selected": "Sheet Name 1",
"confidence": 0.5,
"sheets": [
"Sheet Name 1",
"Sheet Name 2",
"Sheet Name 3"
]
}
}
updateSelectedSheet
To configure the selected sheet:
- JavaScript
- API
segna.updateSheetSelection(jobId, {
sheetSelection: {
selected: "Sheet Name 2"
}
});
warning
You can only select from the sheets listed under the sheets key in the response of getSheetSelection.
curl -X POST https://backend.segna.io/public/client-side/v1/selected-sheet/{jobId} \
-H 'x-api-key: YOUR_API_KEY' \
-d '{
"sheetSelection": {
"selected": "Sheet Name 1"
}
}'
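The constraint in the warning above can be checked on the client before an update is sent. The following is a minimal sketch rather than part of the Segna SDK: buildSheetSelectionUpdate is a hypothetical helper that takes a response shaped like the example above and returns an update payload, throwing if the requested sheet is not listed.

```javascript
// Hypothetical helper (not part of the Segna SDK): builds the payload for a
// sheet selection update after checking the name against the listed sheets.
function buildSheetSelectionUpdate(sheetSelectionResponse, sheetName) {
  const { sheets } = sheetSelectionResponse.sheetSelection;
  if (!sheets.includes(sheetName)) {
    throw new Error(`Unknown sheet "${sheetName}". Available: ${sheets.join(", ")}`);
  }
  return { sheetSelection: { selected: sheetName } };
}

// Response shaped like the example response shown earlier.
const exampleSheetResponse = {
  sheetSelection: {
    selected: "Sheet Name 1",
    confidence: 0.5,
    sheets: ["Sheet Name 1", "Sheet Name 2", "Sheet Name 3"],
  },
};

const sheetUpdate = buildSheetSelectionUpdate(exampleSheetResponse, "Sheet Name 2");
console.log(JSON.stringify(sheetUpdate));
```

The returned object has the same shape as the second argument passed to segna.updateSheetSelection in the JavaScript example.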
Fields
The Fields API returns the current mapping between the input data columns and the desired output columns (also known as fields).
getFields
- JavaScript
- API
const fieldsResponse = segna.getFields(jobId);
curl -X GET https://backend.segna.io/public/client-side/v1/fields/{jobId} \
-H 'x-api-key: YOUR_API_KEY'
Example Response:
{
"expectedFields": ["field 1", "field 2"],
"optionalFields": ["optionalField1"],
"inputColumns": ["column 1", "column 2", "column 3"],
"columnFieldMapping": {
"dataSource 1": {
"column 1": {
"field": "field 1",
"confidence": 0.7
},
"column 2": {
"field": "field 2",
"confidence": 0.6
}
}
}
}
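Each mapping in the response carries a confidence value. As a rough sketch, and assuming that a higher value means a more certain mapping (the scale is not described here), the hypothetical helper below lists the mappings that fall under a threshold so they can be reviewed before the job runs. The threshold of 0.65 is arbitrary.

```javascript
// Hypothetical helper: lists mappings whose confidence is below a threshold.
function findLowConfidenceMappings(fieldsResponse, threshold) {
  const flagged = [];
  for (const [dataSource, columns] of Object.entries(fieldsResponse.columnFieldMapping)) {
    for (const [column, { field, confidence }] of Object.entries(columns)) {
      if (confidence < threshold) {
        flagged.push({ dataSource, column, field, confidence });
      }
    }
  }
  return flagged;
}

// Response shaped like the example response above.
const exampleFieldsResponse = {
  expectedFields: ["field 1", "field 2"],
  optionalFields: ["optionalField1"],
  inputColumns: ["column 1", "column 2", "column 3"],
  columnFieldMapping: {
    "dataSource 1": {
      "column 1": { field: "field 1", confidence: 0.7 },
      "column 2": { field: "field 2", confidence: 0.6 },
    },
  },
};

const needsReview = findLowConfidenceMappings(exampleFieldsResponse, 0.65);
console.log(needsReview);
```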
updateFields
- JavaScript
- API
segna.updateFields(jobId, {
columnFieldMapping: {
"dataSource 1": {
"column 1": {
field: "field 2", // Correct the field mapping
},
"column 2": {
field: "field 1",
}
}
},
allowColumnCombining: false
});
curl -X POST https://backend.segna.io/public/client-side/v1/fields/{jobId} \
-H 'x-api-key: YOUR_API_KEY' \
-d '{
"columnFieldMapping": {
"dataSource 1": {
"column 1": {
"field": "field 2"
},
"column 2": {
"field": "field 1"
}
}
},
"allowColumnCombining": false
}'
Optional Parameters:
allowColumnCombining is an optional flag that can be passed when updating the fields mapping. By enabling allowColumnCombining, you are allowed to map multiple input columns to a field. These input columns will then get combined with a separator.
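Whether the flag is needed can be derived from the mapping itself. The sketch below uses a hypothetical helper, findCombinedFields, to find fields that receive more than one input column and sets allowColumnCombining accordingly. It groups columns across all data sources, which is an assumption about how combining is scoped.

```javascript
// Hypothetical helper: returns the fields that receive more than one input column.
function findCombinedFields(columnFieldMapping) {
  const columnsByField = new Map();
  for (const columns of Object.values(columnFieldMapping)) {
    for (const [column, { field }] of Object.entries(columns)) {
      if (!columnsByField.has(field)) columnsByField.set(field, []);
      columnsByField.get(field).push(column);
    }
  }
  return [...columnsByField]
    .filter(([, columns]) => columns.length > 1)
    .map(([field, columns]) => ({ field, columns }));
}

const mapping = {
  "dataSource 1": {
    "column 1": { field: "field 1" },
    "column 2": { field: "field 2" },
    "column 3": { field: "field 1" },
  },
};

const combined = findCombinedFields(mapping);

// Payload in the shape used by the updateFields example above.
const updatePayload = {
  columnFieldMapping: mapping,
  allowColumnCombining: combined.length > 0,
};
console.log(combined, updatePayload.allowColumnCombining);
```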
Column Combining Methods
info
This API is only useful if there are multiple columns mapped to a field. Please refer to the Fields API above.
The Column Combining Methods API defines the method, configuration, and order of columns to be combined into a field.
getColumnCombiningMethods
- JavaScript
- API
const columnCombiningMethodsResponse = segna.getColumnCombiningMethods(jobId);
curl -X GET https://backend.segna.io/public/client-side/v1/column-combining-methods/{jobId} \
-H 'x-api-key: YOUR_API_KEY'
Example Response:
{
"columnsBeingCombined": {
"field 1": {
"method": "stringAppend",
"methodConfig": {
"separator": "_"
},
"combiningOrder": [
"column 1", "column 3"
]
},
"field 2": {
"method": "json",
"methodConfig": {},
"combiningOrder": [
"column 2", "column 4"
]
}
}
}
updateColumnCombiningMethods
- JavaScript
- API
segna.updateColumnCombiningMethods(jobId, {
columnsBeingCombined: {
"field 1": {
method: "stringAppend",
methodConfig: {
separator: "_"
},
},
"field 2": {
method: "json",
combiningOrder: [
"column 2", "column 4"
],
},
}
});
curl -X POST https://backend.segna.io/public/client-side/v1/column-combining-methods/{jobId} \
-H 'x-api-key: YOUR_API_KEY' \
-d '{
"columnsBeingCombined": {
"field 1": {
"method": "stringAppend",
"methodConfig": {
"separator": "_"
}
},
"field 2": {
"method": "json",
"combiningOrder": [
"column 2", "column 4"
]
}
}
}'
The following column combining methods are allowed:
- stringAppend
- json
methodConfig is an optional argument which can be used to, for example, specify a separator when combining values.
combiningOrder is an optional argument which specifies the order in which columns are combined together.
stringAppend
This is the default column combining method for when two columns are mapped to the same field.
stringAppend will cast the contents of the specified columns as strings and concatenate them using the separator defined in the methodConfig. The default separator is an empty string. For example, consider the following dataframe:
col1 | col2 |
---|---|
Apples | 1.99 |
Bananas | 1.89 |
Mapping col1 and col2 to field1 in the updateFields command will result in the following dataframe:
field1 |
---|
Apples1.99 |
Bananas1.89 |
Specifying combiningOrder as ["col2", "col1"] and the methodConfig separator as "_" would result in:
field1 |
---|
1.99_Apples |
1.89_Bananas |
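The behaviour described above can be reproduced locally with a few lines of JavaScript. This is only an illustration of the documented result, not Segna's implementation. The stringAppend function and the row objects are stand-ins for the dataframe.

```javascript
// Local illustration of stringAppend: cast each column value to a string and
// join the values in combiningOrder using the separator (default: empty string).
function stringAppend(rows, combiningOrder, separator = "") {
  return rows.map((row) =>
    combiningOrder.map((column) => String(row[column])).join(separator)
  );
}

const fruitRows = [
  { col1: "Apples", col2: 1.99 },
  { col1: "Bananas", col2: 1.89 },
];

// Default separator and order: "Apples1.99", "Bananas1.89"
console.log(stringAppend(fruitRows, ["col1", "col2"]));

// Reversed order with "_" as separator: "1.99_Apples", "1.89_Bananas"
console.log(stringAppend(fruitRows, ["col2", "col1"], "_"));
```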
JSON
Columns can be combined into a JSON object. The same dataframe above with the json combining method would be:
field1 |
---|
{ "col1": "Apples", "col2": 1.99 } |
{ "col1": "Bananas", "col2": 1.89 } |
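A comparable local sketch for the json method follows. It assumes the combined value is stored as a serialized JSON string with keys in combiningOrder. Whether Segna stores a string or a structured object is not stated here, and combineAsJson is a hypothetical name.

```javascript
// Local illustration of the json method: build one object per row from the
// listed columns and serialize it (serialization is an assumption).
function combineAsJson(rows, combiningOrder) {
  return rows.map((row) =>
    JSON.stringify(
      Object.fromEntries(combiningOrder.map((column) => [column, row[column]]))
    )
  );
}

const fruitRows = [
  { col1: "Apples", col2: 1.99 },
  { col1: "Bananas", col2: 1.89 },
];

const jsonField = combineAsJson(fruitRows, ["col1", "col2"]);
console.log(jsonField);
```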
info
All parameters (method, methodConfig, combiningOrder) are optional and if unspecified the default values (retrieved from getColumnCombiningMethods()) will be used.
Data Types
The Data Types API defines the current data type of each field.
getDataTypes
- JavaScript
- API
const dataTypesResponse = segna.getDataTypes(jobId);
curl -X GET https://backend.segna.io/public/client-side/v1/field-data-types/{jobId} \
-H 'x-api-key: YOUR_API_KEY'
Example Response:
{
"fieldDataTypes": {
"field 1": {
"dataType": "rich_text",
"confidence": 0.6
},
"field 2": {
"dataType": "number",
"confidence": 0.8
}
},
"expectedFieldDataTypes": {
"field 1": "number",
"field 2": "rich_text"
}
}
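The response lists both the current data type of each field and an expected data type. Assuming that expectedFieldDataTypes holds the types the job expects (an interpretation of the key names, not something stated here), a hypothetical helper such as findDataTypeMismatches can point out the fields that may need an update.

```javascript
// Hypothetical helper: compares current data types with the expected ones.
function findDataTypeMismatches(dataTypesResponse) {
  const { fieldDataTypes, expectedFieldDataTypes } = dataTypesResponse;
  return Object.entries(fieldDataTypes)
    .filter(
      ([field, { dataType }]) =>
        field in expectedFieldDataTypes && expectedFieldDataTypes[field] !== dataType
    )
    .map(([field, { dataType }]) => ({
      field,
      detected: dataType,
      expected: expectedFieldDataTypes[field],
    }));
}

// Response shaped like the example response above.
const exampleDataTypesResponse = {
  fieldDataTypes: {
    "field 1": { dataType: "rich_text", confidence: 0.6 },
    "field 2": { dataType: "number", confidence: 0.8 },
  },
  expectedFieldDataTypes: {
    "field 1": "number",
    "field 2": "rich_text",
  },
};

const mismatches = findDataTypeMismatches(exampleDataTypesResponse);
console.log(mismatches);
```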
updateDataTypes
- JavaScript
- API
segna.updateDataTypes(jobId, {
fieldDataTypes: {
"field 1": {
dataType: DATATYPE.CATEGORY,
},
"field 2": {
dataType: DATATYPE.DATE_TIME,
}
}
});
curl -X POST https://backend.segna.io/public/client-side/v1/field-data-types/{jobId} \
-H 'x-api-key: YOUR_API_KEY' \
-d '{
"fieldDataTypes": {
"field 1": {
"dataType": "category"
},
"field 2": {
"dataType": "date_time"
}
}
}'
Valid data types are:
- DATATYPE.NUMBER or "number"
- DATATYPE.RICH_TEXT or "rich_text"
- DATATYPE.CATEGORY or "category"
- DATATYPE.DATE_TIME or "date_time"
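A payload for updateDataTypes can be assembled and checked against the list above before it is sent. The sketch below uses a local set of the four string values. VALID_DATA_TYPES and buildDataTypesUpdate are hypothetical names, not SDK exports.

```javascript
// Local copy of the string values listed above (not an SDK constant).
const VALID_DATA_TYPES = new Set(["number", "rich_text", "category", "date_time"]);

// Hypothetical helper: turns { field: dataType } into an updateDataTypes payload.
function buildDataTypesUpdate(typesByField) {
  const fieldDataTypes = {};
  for (const [field, dataType] of Object.entries(typesByField)) {
    if (!VALID_DATA_TYPES.has(dataType)) {
      throw new Error(`Invalid data type "${dataType}" for field "${field}"`);
    }
    fieldDataTypes[field] = { dataType };
  }
  return { fieldDataTypes };
}

const dataTypesUpdate = buildDataTypesUpdate({
  "field 1": "category",
  "field 2": "date_time",
});
console.log(JSON.stringify(dataTypesUpdate));
```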
Unit Conversion
The Unit Conversion API defines the desired units for each field. For each field, both from and to units need to be specified to perform the conversion.
Note: Unit Conversion is not set by default.
getUnits
- JavaScript
- API
const unitsResponse = segna.getUnits(jobId);
curl -X GET https://backend.segna.io/public/client-side/v1/units/{jobId} \
-H 'x-api-key: YOUR_API_KEY'
Example Response:
{
"fieldUnitConversions": {
"field 1": {
"columnName": "column 1",
"to": null,
"from": null,
"confidence": 1.0
},
"field 2": {
"columnName": "column 2",
"to": null,
"from": null,
"confidence": 1.0
}
},
"expectedFieldUnits": {
"field 1": null,
"field 2": null
}
}
updateUnits
- JavaScript
- API
segna.updateUnits(jobId, {
fieldUnitConversions: {
"field 1": {
to: "UTC_p6",
from: "UTC_m4"
},
"field 2": {
to: null,
from: null
}
}
});
curl -X POST https://backend.segna.io/public/client-side/v1/units/{jobId} \
-H 'x-api-key: YOUR_API_KEY' \
-d '{
"fieldUnitConversions": {
"field 1": {
"to": "UTC_p6",
"from": "UTC_m4"
},
"field 2": {
"to": null,
"from": null
}
}
}'
The units API supports converting between different timezones by specifying UTC deltas. For example, a unit string of "UTC_p6" means +6 hours relative to UTC time. "UTC_m6" is -6 hours relative to UTC time.
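To make the unit strings concrete, the sketch below parses them into signed hour offsets and shifts a clock reading from one offset to another. The format (UTC_, then p or m, then whole hours) is inferred from the examples above. parseUtcUnit and convertWallClock are hypothetical helpers, and how Segna applies the conversion internally is not described here.

```javascript
// Parses a unit string such as "UTC_p6" or "UTC_m4" into a signed hour offset.
function parseUtcUnit(unit) {
  const match = /^UTC_([pm])(\d+)$/.exec(unit);
  if (!match) throw new Error(`Unrecognised unit string: ${unit}`);
  const hours = Number(match[2]);
  const sign = match[1] === "p" ? 1 : -1;
  return hours === 0 ? 0 : sign * hours;
}

// Shifts a clock reading given in the `from` offset to the `to` offset.
// isoLocal is a timestamp without zone information, e.g. "2021-03-01T08:00:00".
function convertWallClock(isoLocal, from, to) {
  const deltaHours = parseUtcUnit(to) - parseUtcUnit(from);
  const shifted = new Date(`${isoLocal}Z`); // treat the reading as a plain clock value
  shifted.setUTCHours(shifted.getUTCHours() + deltaHours);
  return shifted.toISOString().slice(0, 19);
}

// 08:00 at UTC-4 is 12:00 UTC, which is 18:00 at UTC+6.
console.log(convertWallClock("2021-03-01T08:00:00", "UTC_m4", "UTC_p6"));
```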
Impute Methods
The Impute Methods API defines the imputation method for each field. An imputation method can be specified to fill in any missing values.
Note: Impute Methods is not set by default.
getImputeMethods
- JavaScript
- API
const imputeMethodsResponse = segna.getImputeMethods(jobId);
curl -X GET https://backend.segna.io/public/client-side/v1/impute-methods/{jobId} \
-H 'x-api-key: YOUR_API_KEY'
Example Response:
{
"imputeMethods": {
"field 1": {
"imputeMethod": null,
"confidence": 1.0
},
"field 2": {
"imputeMethod": null,
"confidence": 1.0
}
},
"expectedImputeMethods": {
"field 1": null,
"field 2": null
}
}
updateImputeMethods
- JavaScript
- API
segna.updateImputeMethods(jobId, {
imputeMethods: {
"field 1": {
imputeMethod: "fill",
imputeMethodConfig: {
fillValue: "No Answer"
}
},
"field 2": {
imputeMethod: null
}
}
});
curl -X POST https://backend.segna.io/public/client-side/v1/impute-methods/{jobId} \
-H 'x-api-key: YOUR_API_KEY' \
-d '{
"imputeMethods": {
"field 1": {
"imputeMethod": "fill",
"imputeMethodConfig": {
"fillValue": "No Answer"
}
},
"field 2": {
"imputeMethod": null
}
}
}'
Valid impute methods are:
- fill
- dropMissing
- null
fill
The fill impute method replaces missing values in the given field with the value specified in the fillValue argument of the imputeMethodConfig.
dropMissing
The dropMissing impute method drops rows where missing values occur in the given field.
null
No changes are made to the data for this field.
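The three methods can be illustrated on a small array of rows. This is a local sketch, not Segna's implementation. It assumes that null, undefined, and empty strings count as missing, which may differ from what the service treats as missing. applyImputeMethod is a hypothetical name.

```javascript
// Assumption: these values count as "missing".
const isMissing = (value) => value === null || value === undefined || value === "";

// Local illustration of the fill, dropMissing, and null impute methods.
function applyImputeMethod(rows, field, imputeMethod, imputeMethodConfig = {}) {
  switch (imputeMethod) {
    case "fill":
      return rows.map((row) =>
        isMissing(row[field]) ? { ...row, [field]: imputeMethodConfig.fillValue } : row
      );
    case "dropMissing":
      return rows.filter((row) => !isMissing(row[field]));
    case null:
      return rows; // no changes
    default:
      throw new Error(`Unsupported impute method: ${imputeMethod}`);
  }
}

const surveyRows = [
  { name: "Ann", answer: "Yes" },
  { name: "Bob", answer: null },
  { name: "Cy", answer: "" },
];

console.log(applyImputeMethod(surveyRows, "answer", "fill", { fillValue: "No Answer" }));
console.log(applyImputeMethod(surveyRows, "answer", "dropMissing"));
```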
Pivot Methods
The Pivot API defines the pivot method, as well as the pivot, pivot value, and index columns:
- pivotColumns defines the pivot column whose values will be the column headers once the data has been pivoted
- valueColumns defines the columns used to populate the values of the pivoted data
- indexColumns defines the columns that will make a unique row index once the data has been pivoted
aggregationMethod determines how duplicate values in the indexColumns are handled. Currently the only supported value of aggregationMethod is "first", which takes the first row where duplicate values are present.
Note: Pivot Methods is not set by default.
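The roles of the three column groups are easier to see on a small example. The sketch below is a local approximation of a pivot with the "first" aggregation method, simplified to one pivot column and one value column. pivotFirst is a hypothetical function, it is not Segna's implementation, and it assumes pivot values never collide with index column names.

```javascript
// Local illustration: pivot rows so that each distinct value of pivotColumn
// becomes a column header, keeping the first value seen per index/header pair.
function pivotFirst(rows, pivotColumn, valueColumn, indexColumns) {
  const byIndex = new Map();
  for (const row of rows) {
    const key = JSON.stringify(indexColumns.map((column) => row[column]));
    if (!byIndex.has(key)) {
      byIndex.set(key, Object.fromEntries(indexColumns.map((column) => [column, row[column]])));
    }
    const pivoted = byIndex.get(key);
    const header = String(row[pivotColumn]);
    // "first": later rows with the same index and header are ignored.
    if (!(header in pivoted)) pivoted[header] = row[valueColumn];
  }
  return [...byIndex.values()];
}

const measurements = [
  { id: 1, metric: "height", value: 180 },
  { id: 1, metric: "weight", value: 75 },
  { id: 1, metric: "height", value: 999 }, // duplicate, dropped by "first"
  { id: 2, metric: "height", value: 165 },
];

const pivotedRows = pivotFirst(measurements, "metric", "value", ["id"]);
console.log(pivotedRows);
```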
getPivot
- API
curl -X GET https://backend.segna.io/public/client-side/v1/pivot/{jobId} \
-H 'x-api-key: YOUR_API_KEY'
Example Response:
{
"pivot": {
"pivotColumns": {"columnNames": ["col to pivot"], "confidence": 0},
"valueColumns": {"columnNames": ["pivot value column"], "confidence": 0},
"indexColumns": {"columnNames": ["col2", "col3"], "aggregationMethod": "first", "confidence": 0}
}
}
updatePivot
- API
curl -X POST https://backend.segna.io/public/client-side/v1/pivot/{jobId} \
-H 'x-api-key: YOUR_API_KEY' \
-d '{
"pivot": {
"pivotColumns": {
"columnNames": ["col to pivot"]
},
"valueColumns": {
"columnNames": ["pivot value column"]
},
"indexColumns": {
"columnNames": ["col2", "col3"],
"aggregationMethod": "first"
}
}
}'
Row Dropping Methods
The Row Dropping Methods API can be used to specify constraints on the values of a field. By specifying fields you may drop rows for which there are:
- Duplicate values
- Missing values
getRowDroppingMethods
- API
curl -X GET https://backend.segna.io/public/client-side/v1/row-dropping-methods/{jobId} \
-H 'x-api-key: YOUR_API_KEY'
Example Response:
{
"presetRowDroppingMethods": [
{
"method": "duplicateValues",
"methodConfig": {
"fields": ["field1"]
}
},
{
"method": "missingValues",
"methodConfig": {
"fields": ["field5", "field2"]
}
}
],
"additionalRowDroppingMethods": [
{
"method": "missingValues",
"methodConfig": {
"fields": ["field4", "field3"]
},
"confidence": 1
}
]
}
updateRowDroppingMethods
- API
curl -X POST https://backend.segna.io/public/client-side/v1/row-dropping-methods/{jobId} \
-H 'x-api-key: YOUR_API_KEY' \
-d '{
"additionalRowDroppingMethods": [
{
"method": "missingValues",
"methodConfig": {
"fields": ["field7", "field8"]
}
},
{
"method": "duplicateValues",
"methodConfig": {
"fields": ["field9"]
}
}
]
}'
Valid row dropping methods are:
- missingValues
- duplicateValues
missingValues
This row dropping method will drop rows where missing values occur in the set of fields specified in the methodConfig. A potential use of this could be to enforce a NOT NULL database constraint.
duplicateValues
This row dropping method will drop rows where duplicate values occur in the set of fields specified in the methodConfig. A potential use of this could be to enforce a PRIMARY KEY or UNIQUE database constraint.
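Both methods can be sketched locally as array filters. The details below are assumptions rather than documented behaviour: a row is dropped by missingValues if any listed field is missing, and duplicateValues compares the combination of the listed fields and keeps the first occurrence. The function names are hypothetical.

```javascript
// Assumption: these values count as "missing".
const isMissing = (value) => value === null || value === undefined || value === "";

// Drops rows where any of the listed fields is missing.
function dropMissingValues(rows, fields) {
  return rows.filter((row) => fields.every((field) => !isMissing(row[field])));
}

// Drops rows whose combination of listed fields was already seen (keeps the first).
function dropDuplicateValues(rows, fields) {
  const seen = new Set();
  return rows.filter((row) => {
    const key = JSON.stringify(fields.map((field) => row[field]));
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const customerRows = [
  { id: 1, name: "a" },
  { id: 1, name: "b" },
  { id: 2, name: null },
];

console.log(dropDuplicateValues(customerRows, ["id"]));
console.log(dropMissingValues(customerRows, ["name"]));
```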
Text Processing Methods
Text processing methods can be used to perform string operations on columns containing text. Two types of operations are possible: enforcing the case of text, and text replacement based on regular expressions.
getTextProcessingMethods
- API
curl -X GET https://backend.segna.io/public/client-side/v1/text-processing-methods/{jobId} \
-H 'x-api-key: YOUR_API_KEY'
Example Response:
{
"textProcessingMethods": {
"regexReplacement": {
"field1": {
"^[M|m]$": "Male",
"^[F|f]$": "Female"
},
"field2": {
"^[N.?A|n.?a|.issing]$": ""
}
},
"casing": {
"field3": "upper",
"field4": "lower",
"field5": "capitalize"
}
}
}
updateTextProcessingMethods
- API
curl -X POST https://backend.segna.io/public/client-side/v1/text-processing-methods/{jobId} \
-H 'x-api-key: YOUR_API_KEY' \
-d '{
"textProcessingMethods": {
"regexReplacement": {
"field1": {
"^[M|m]$": "male"
}
},
"casing": {
"field3": "title",
"field4": "swap"
}
}
}'
Two methods of text processing are available:
- regexReplacement
- casing
regexReplacement
For a given field, specify a regular expression to be matched and the text to replace it with. For example, the configuration below replaces values in field1 that consist of a single "M" or "m" with "Male".
{
"textProcessingMethods": {
"regexReplacement": {"field1": {"^[M|m]$": "Male"}}}
}
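The effect of a pattern-to-replacement map can be tried out locally with JavaScript's RegExp. This is only an approximation, since the regular expression dialect used by the service is not stated here, and applyRegexReplacements is a hypothetical helper. The sketch uses [Mm] rather than [M|m]: in JavaScript (as in most dialects) a | inside a character class is a literal character, so [M|m] would also match the value "|".

```javascript
// Local illustration: apply each pattern/replacement pair to every value in turn.
function applyRegexReplacements(values, replacements) {
  const compiled = Object.entries(replacements).map(
    ([pattern, replacement]) => [new RegExp(pattern), replacement]
  );
  return values.map((value) =>
    compiled.reduce(
      (text, [regex, replacement]) => text.replace(regex, replacement),
      String(value)
    )
  );
}

const genderColumn = ["M", "m", "F", "Mary"];
const cleaned = applyRegexReplacements(genderColumn, {
  "^[Mm]$": "Male",
  "^[Ff]$": "Female",
});

// "Mary" is untouched because the pattern is anchored to a single character.
console.log(cleaned);
```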
casing
The casing method converts the case of a field. Supported values for case are:
- "upper": UPPER CASE
- "lower": lower case
- "title": Title Case (first letter of every word is capitalised)
- "swap": swaps the case between upper and lower depending on what was in the data
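The four casing values can be approximated locally as follows. This is a sketch under assumptions: words are taken to be whitespace-separated, and "title" lowercases the remainder of each word, neither of which is specified above. CASING and applyCasing are hypothetical names.

```javascript
// Local approximations of the supported casing values.
const CASING = {
  upper: (text) => text.toUpperCase(),
  lower: (text) => text.toLowerCase(),
  // Assumption: words are whitespace-separated and the rest of each word is lowercased.
  title: (text) =>
    text.replace(/\S+/g, (word) => word[0].toUpperCase() + word.slice(1).toLowerCase()),
  // Swap each character: upper becomes lower and lower becomes upper.
  swap: (text) =>
    [...text]
      .map((ch) => (ch === ch.toUpperCase() ? ch.toLowerCase() : ch.toUpperCase()))
      .join(""),
};

function applyCasing(values, casing) {
  const convert = CASING[casing];
  if (!convert) throw new Error(`Unsupported casing: ${casing}`);
  return values.map((value) => convert(String(value)));
}

console.log(applyCasing(["hello world"], "upper")); // "HELLO WORLD"
console.log(applyCasing(["hello world"], "title")); // "Hello World"
console.log(applyCasing(["Hello World"], "swap"));  // "hELLO wORLD"
```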
Scripts
After you add scripts to a job in the start job configuration, you can view them at any time:
getScripts
- JavaScript
- API
const jobScripts = segna.getScripts(jobId);
curl -X GET https://backend.segna.io/public/client-side/v1/scripts/{jobId} \
-H 'x-api-key: YOUR_API_KEY'
Example Response:
{
"preColumnMapping": ["scriptId1", "scriptId2"],
"postClean": ["scriptId3"]
}
See the Scripts section for more information on using scripts in your job.