Running a Job
Only two API calls are required to run a job and move your user's data to your database: start and run. Between these two calls, there are also several optional API calls that can be used to configure the job or view its metadata. This gives you the flexibility to match whatever experience you wish to provide to your users.
Start Job
A jobId identifies an upload session and is used across all subsequent API calls. Each upload has its own unique jobId, which is obtained by calling the start method.
- JavaScript
- API
const jobId = segna.startJob({
pipelineId: 'YOUR_PIPELINE_ID',
// If you want Segna to choose from a list of pipelines, provide pipelineId as
// ['PIPELINE_ID1', 'PIPELINE_ID2']
jobName : 'A Job Name',
file?: File, // Data input parameter: one must be specified.
gsUrl?: 'https://docs.google.com/spreadsheets/...', // Data input parameter: one must be specified.
gsSheetName?: 'Sheet1', // Must be specified along with gsUrl
jsonData?: [ // Data input parameter: one must be specified.
{"name": "hannibal", "age": 20},
{"name": "scipio", "age": 21}
],
destinationFileName? : 'File name for output destination', // Optional
useFullData? : false, // Optional
scripts?: { // Optional
preColumnMapping: ['scriptId1', 'scriptId2']
},
webhookUrl? : 'https://your-server-to-pass-metadata-to/', // Optional
outputFileType?: 'excel' // Optional
});
curl -X POST https://backend.segna.io/public/client-side/v1/start \
-H 'x-api-key: YOUR_API_KEY' \
-F 'pipelineId=YOUR_PIPELINE_ID' \
-F 'jobName=A Job Name' \
-F "files=@path/to/file.csv" \
-F "gsUrl=https://docs.google..." \
-F "gsSheetName=Sheet1" \
-F 'destinationFileName=folder/filename' \
-F 'useVariableFields=false' \
-F 'useFullData=false' \
-F 'scripts={"preColumnMapping": ["scriptid1", "scriptid2"]}' \
-F 'webhookUrl=https://your-server-to-pass-metadata-to/' \
-F 'outputFileType=excel'
Optional Parameters:
- Our API can take input data from a local file, through the jsonData parameter, or directly from Google Sheets. If uploading a local file, file must be specified. If importing from Google Sheets, gsUrl and gsSheetName must be specified. If passing JSON data, jsonData must be specified. Note that only one method of data ingestion can be specified at a time.
- destinationFileName is only relevant if the data is written to the output as a named file, e.g. for an S3 bucket.
- If useFullData is set to true, all metadata (before the run of the job) will be obtained from the full data instead of a sample.
- scripts is used to inject custom scripts created on the Segna platform. Currently, the only script injection position is "preColumnMapping". Scripts in this position run after the data has been imported but before columns have been mapped to fields.
- If a webhookUrl is provided in the above method, metadata will be sent as a POST request to the URL once the file has been uploaded to our service and the job has started.
- outputFileType currently supports the value excel. Defaults to csv.
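The rule that exactly one data input may be specified can be checked on the client before calling start. The following is a minimal sketch only: validateDataInput is a hypothetical helper, not part of the Segna SDK, and it assumes the option names shown in the startJob example above.

```javascript
// Hypothetical helper (not part of the Segna SDK): checks that exactly one
// data input method is present in the options intended for startJob.
function validateDataInput(options) {
  const sources = [];
  if (options.file !== undefined) sources.push('file');
  if (options.gsUrl !== undefined) sources.push('gsUrl');
  if (options.jsonData !== undefined) sources.push('jsonData');

  if (sources.length !== 1) {
    throw new Error(
      `Expected exactly one data input (file, gsUrl, or jsonData), got ${sources.length}`
    );
  }
  // gsSheetName must be specified along with gsUrl.
  if (sources[0] === 'gsUrl' && !options.gsSheetName) {
    throw new Error('gsSheetName must be specified along with gsUrl');
  }
  // Return the name of the single input method that was supplied.
  return sources[0];
}
```

Calling this helper just before startJob surfaces a misconfigured request locally rather than after a round trip to the service.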
Example Webhook Metadata
{
"jobId": "YOUR_JOB_ID",
"dataLength": {
"file.csv": 120
},
"inputColumns": {
"file.csv": [
"Id_Number",
"Name",
"Role"
]
},
"expectedFields": [
"Id",
"Name",
"Role"
],
"columnFieldMapping": {
"file.csv": {
"Id_Number": {
"field": "Id",
"confidence": 0.9
},
"Name": {
"field": "Name",
"confidence": 1
},
"Role": {
"field": "Role",
"confidence": 1
}
}
},
"expectedFieldDataTypes": {
"Id": "number",
"Name": "rich_text",
"Role": "category"
},
"fieldDataTypes": {
"Id": {
"dataType": "number",
"confidence": 1
},
"Name": {
"dataType": "rich_text",
"confidence": 1
},
"Role": {
"dataType": "category",
"confidence": 1
}
},
"uniqueRatio": {
"file.csv": {
"Id_Number": 1,
"Name": 0.9,
"Role": 0.05
}
},
"missingRowIndices": {
"file.csv": {
"Id_Number": [],
"Name": [],
"Role": []
}
}
}
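One possible use of this metadata is to flag column mappings that may need a manual check before the job is run. The sketch below assumes the payload has the shape of the example above; findLowConfidenceMappings and the 0.95 threshold are illustrative choices, not part of the Segna API.

```javascript
// Hypothetical helper: lists column-to-field mappings whose confidence is
// below a threshold chosen by the caller.
function findLowConfidenceMappings(metadata, threshold) {
  const flagged = [];
  // columnFieldMapping is keyed by file name, then by input column name.
  for (const [fileName, columns] of Object.entries(metadata.columnFieldMapping)) {
    for (const [column, mapping] of Object.entries(columns)) {
      if (mapping.confidence < threshold) {
        flagged.push({
          fileName,
          column,
          field: mapping.field,
          confidence: mapping.confidence,
        });
      }
    }
  }
  return flagged;
}

// Usage with the mapping section of the example payload.
const exampleMetadata = {
  columnFieldMapping: {
    'file.csv': {
      Id_Number: { field: 'Id', confidence: 0.9 },
      Name: { field: 'Name', confidence: 1 },
      Role: { field: 'Role', confidence: 1 },
    },
  },
};
console.log(findLowConfidenceMappings(exampleMetadata, 0.95));
```

With the example values, only the Id_Number column falls below the 0.95 threshold.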
Job Configuration
Once a job has started and you have a jobId, you can make calls to multiple Optional Job Configuration Methods that provide both control over and visibility of the data before it moves to your output destination.
Run Job
Once the job is configured and ready to move to your output destination, simply call run.
- JavaScript
- API
segna.runJob(jobId, {
webhookUrl?: string,
returnWhenComplete?: boolean
});
curl -X POST https://backend.segna.io/public/client-side/v1/run/{jobId} \
-H 'x-api-key: YOUR_API_KEY' \
-d '{
"webhookUrl": "https://YOUR-WEBHOOK-SERVER",
"returnWhenComplete": true
}'
Optional Parameters:
- If a webhookUrl is provided in the above method, a payload containing the processed data will be sent as a POST request to the URL once the job has run.
- By default, the runJob promise returns as soon as we validate the configuration and begin processing the data. If returnWhenComplete is enabled, the runJob promise will return once the full job is complete (data has been pushed to the output database).
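Putting the two required calls together, the overall flow might look like the sketch below. It assumes a client object exposing startJob and runJob as shown above; the client is passed in as a parameter because how it is created is not covered here. startAndRunJob is a hypothetical wrapper, and the value that runJob resolves with is returned unchanged since its shape is not specified in this section.

```javascript
// Hypothetical wrapper around the two required calls. `segna` is assumed to be
// a client object exposing startJob and runJob as documented above.
async function startAndRunJob(segna, pipelineId, jsonData) {
  // `await` is safe whether startJob returns the jobId directly or a promise.
  const jobId = await segna.startJob({
    pipelineId,
    jobName: 'A Job Name',
    jsonData,
  });

  // Optional job configuration calls would go here, between start and run.

  // With returnWhenComplete enabled, the promise settles only after the data
  // has been pushed to the output destination.
  const result = await segna.runJob(jobId, { returnWhenComplete: true });
  return { jobId, result };
}
```

Omitting returnWhenComplete would let the wrapper return as soon as processing begins, which suits flows that rely on the webhook for completion instead.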
Example Run Job Webhook
{
"jobId": "YOUR_JOB_ID",
"dataTable": [
["Id_Number", "Name", "Role"],
[1, "Simba", "Engineer"],
[2, "Nala", "Engineer"],
[3, "Mufasa", "Manager"],
...
],
"dataTypes": {
"Id_Number": "number",
"Name": "rich_text",
"Role": "category"
}
}
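A webhook receiver will often want rows as objects rather than arrays. As a minimal sketch, assuming the first entry of dataTable is the header row as in the example above, the hypothetical helper dataTableToRecords (not part of the Segna SDK) converts the payload:

```javascript
// Hypothetical helper: turns the webhook's dataTable (a header row followed by
// data rows) into an array of objects keyed by column name.
function dataTableToRecords(payload) {
  const [header, ...rows] = payload.dataTable;
  return rows.map((row) =>
    Object.fromEntries(header.map((column, i) => [column, row[i]]))
  );
}

// Usage with rows taken from the example payload.
const examplePayload = {
  jobId: 'YOUR_JOB_ID',
  dataTable: [
    ['Id_Number', 'Name', 'Role'],
    [1, 'Simba', 'Engineer'],
    [3, 'Mufasa', 'Manager'],
  ],
};
const records = dataTableToRecords(examplePayload);
console.log(records[0].Name); // prints "Simba"
```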