Running a Job

Only two API calls are required to run a job and move your user's data to your database: startJob and runJob. Between these two calls, several optional API methods can be called to configure the job or retrieve its metadata. This gives you the flexibility to match whatever experience you want to provide to your users.

Start Job

A jobId identifies an upload session and is passed to all subsequent API calls. Each upload gets a unique jobId, which is obtained by calling startJob.

startJob

const jobId = segna.startJob({
  // A single pipeline ID. If you want Segna to choose from a list of pipelines,
  // provide pipelineId as ['PIPELINE_ID1', 'PIPELINE_ID2']
  pipelineId: 'YOUR_PIPELINE_ID',
  jobName: 'A Job Name', // Optional
  // Data input parameters: exactly one of file, gsUrl (with gsSheetName),
  // or jsonData must be specified. The alternatives are commented out here.
  file: selectedFile, // A File object
  // gsUrl: 'https://docs.google.com/spreadsheets/...',
  // gsSheetName: 'Sheet1', // Must be specified along with gsUrl
  // jsonData: [
  //   {"name": "hannibal", "age": 20},
  //   {"name": "scipio", "age": 21}
  // ],
  destinationFileName: 'File name for output destination', // Optional
  useFullData: false, // Optional
  scripts: { // Optional
    preColumnMapping: ['scriptId1', 'scriptId2']
  },
  webhookUrl: 'https://your-server-to-pass-metadata-to/', // Optional
  outputFileType: 'excel' // Optional
});

Optional Parameters:

  • The API accepts input data from a local file, from the jsonData parameter, or directly from Google Sheets. To upload a local file, specify file. To import from Google Sheets, specify both gsUrl and gsSheetName. To pass JSON data, specify jsonData. Exactly one of these data input methods must be specified per job.
  • destinationFileName is only relevant when the output is written as a named file, e.g. to an S3 bucket.
  • If useFullData is set to true, all metadata (before the job runs) is computed from the full data instead of a sample.
  • scripts injects custom scripts created on the Segna platform. Currently, the only script injection position is "preColumnMapping", which runs the scripts after the data has been imported but before columns are mapped to fields.
  • If webhookUrl is provided to startJob, metadata is sent as a POST request to that URL once the file has been uploaded to our service and the job has started.
  • outputFileType currently accepts the value excel. If omitted, the output defaults to csv.
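
The one-input-method rule above can be checked on the client before calling startJob. The following is a minimal sketch; detectDataInput is a hypothetical helper and not part of the Segna SDK, and it only inspects the parameter names documented above.

```javascript
// Hypothetical helper (not part of the Segna SDK): returns which data input
// method a startJob parameter object uses, or throws if the rule is violated.
function detectDataInput(params) {
  const methods = [];
  if (params.file !== undefined) methods.push('file');
  if (params.gsUrl !== undefined || params.gsSheetName !== undefined) {
    methods.push('googleSheets');
  }
  if (params.jsonData !== undefined) methods.push('jsonData');

  // Exactly one data input method may be specified at a time.
  if (methods.length !== 1) {
    throw new Error(
      `Expected exactly one data input method, got ${methods.length}: [${methods.join(', ')}]`
    );
  }
  // gsUrl and gsSheetName only make sense as a pair.
  if (methods[0] === 'googleSheets' && !(params.gsUrl && params.gsSheetName)) {
    throw new Error('gsUrl and gsSheetName must be specified together');
  }
  return methods[0];
}
```

Calling detectDataInput on the parameter object just before startJob surfaces a misconfigured upload as a local error rather than a failed request.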
Example Webhook Metadata
{
"jobId": "YOUR_JOB_ID",
"dataLength": {
"file.csv": 120
},
"inputColumns": {
"file.csv": [
"Id_Number",
"Name",
"Role"
]
},
"expectedFields": [
"Id",
"Name",
"Role"
],
"columnFieldMapping": {
"file.csv": {
"Id_Number": {
"field": "Id",
"confidence": 0.9
},
"Name": {
"field": "Name",
"confidence": 1
},
"Role": {
"field": "Role",
"confidence": 1
}
}
},
"expectedFieldDataTypes": {
"Id": "number",
"Name": "rich_text",
"Role": "category"
},
"fieldDataTypes": {
"Id": {
"dataType": "number",
"confidence": 1
},
"Name": {
"dataType": "rich_text",
"confidence": 1
},
"Role": {
"dataType": "category",
"confidence": 1
}
},
"uniqueRatio": {
"file.csv": {
"Id_Number": 1,
"Name": 0.9,
"Role": 0.05
}
},
"missingRowIndices": {
"file.csv": {
"Id_Number": [],
"Name": [],
"Role": []
}
}
}
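
One possible use of this metadata is to flag column mappings that deserve a manual check before the job runs. The sketch below assumes the payload has the shape shown above; findLowConfidenceMappings and the 0.95 default threshold are illustrative choices, not part of the Segna API.

```javascript
// Hypothetical helper: lists column-to-field mappings whose confidence
// falls below a threshold, based on the columnFieldMapping section.
function findLowConfidenceMappings(metadata, threshold = 0.95) {
  const flagged = [];
  const byFile = metadata.columnFieldMapping || {};
  for (const [fileName, columns] of Object.entries(byFile)) {
    for (const [column, mapping] of Object.entries(columns)) {
      if (mapping.confidence < threshold) {
        flagged.push({
          fileName,
          column,
          field: mapping.field,
          confidence: mapping.confidence,
        });
      }
    }
  }
  return flagged;
}
```

For the example payload, only the Id_Number column (confidence 0.9) falls below the default threshold, so it would be the single entry returned.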


Job Configuration

Once a job has started and you have a jobId, you can call any of the optional job configuration methods, which provide both control over and visibility into the data before it moves to your output destination.

Once the job is configured and ready to move to your output destination, call runJob.

runJob

segna.runJob(jobId, {
webhookUrl?: string,
returnWhenComplete?: boolean
});

Optional Parameters:

  • If webhookUrl is provided to runJob, a payload containing the processed data is sent as a POST request to that URL once the job has run.
  • By default, the runJob promise resolves as soon as we validate the configuration and begin processing the data. If returnWhenComplete is enabled, the promise resolves once the full job is complete (data has been pushed to the output database).
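
Putting the two required calls together, a complete import might look like the sketch below. It is written against any client object that exposes startJob and runJob as documented here; importJsonData is a hypothetical wrapper, and awaiting startJob is an assumption that is harmless if the method returns the jobId synchronously.

```javascript
// Hypothetical wrapper: `client` is any object exposing startJob and runJob
// with the parameters described in this section (for example, the segna SDK).
async function importJsonData(client, pipelineId, jsonData) {
  // Start the upload session and obtain the jobId.
  // `await` also works if startJob returns a plain value.
  const jobId = await client.startJob({ pipelineId, jsonData });

  // Optional job configuration calls would go here.

  // Resolve only once the data has been pushed to the output destination.
  await client.runJob(jobId, { returnWhenComplete: true });
  return jobId;
}
```

Leaving returnWhenComplete unset instead would let the function return as soon as processing begins, which suits flows that rely on the webhook for completion.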
Example Run Job Webhook
{
"jobId": "YOUR_JOB_ID",
"dataTable": [
["Id_Number", "Name", "Role"],
[1, "Simba", "Engineer"],
[2, "Nala", "Engineer"],
[3, "Mufasa", "Manager"],
...
],
"dataTypes": {
"Id_Number": "number",
"Name": "rich_text",
"Role": "category"
}
}
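
A webhook receiver will often want the rows as objects rather than arrays. The sketch below assumes, as in the example payload, that the first row of dataTable holds the column names; dataTableToRecords is a hypothetical helper, not part of the Segna API.

```javascript
// Hypothetical helper: converts a dataTable (header row followed by data rows)
// into an array of objects keyed by column name.
function dataTableToRecords(dataTable) {
  const [header, ...rows] = dataTable;
  return rows.map((row) =>
    Object.fromEntries(header.map((column, i) => [column, row[i]]))
  );
}

const records = dataTableToRecords([
  ['Id_Number', 'Name', 'Role'],
  [1, 'Simba', 'Engineer'],
  [2, 'Nala', 'Engineer'],
]);
console.log(records[0]); // { Id_Number: 1, Name: 'Simba', Role: 'Engineer' }
```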