Pipeline Setup


1. Create an account

Reach out to us at [email protected] or book a demo here to get started!

👍 14-day free trial - no obligations, no credit card required 🥳


2. Create a Pipeline

A Pipeline defines what you want your output schema, or final cleaned data, to look like.

Create a Pipeline by clicking the "Build new Pipeline" button.


3. [OPTIONAL] Add an output destination for your data

You can configure CSVs and other files imported through this Pipeline to be written to a database or data-lake of your choice.

  1. Select "Add Database"
  2. Select "Add New Connector"
  3. Select your desired database or data-lake
  4. Enter the relevant credentials, then select "Add Connector"
  5. Once the connector has been selected, you may need to add further details, such as the table name or destination folder

4. Define your output schema

Add fields to your output schema, i.e. the columns you want once the CSV file has been uploaded, cleaned and processed.


You can configure the expected data type of each field and, in some cases, the expected unit and format. Segna will automatically convert uploaded values to match when a CSV is uploaded!
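
To make the idea of a typed field with a unit and format more concrete, here is a minimal Python sketch. It is not Segna's implementation or API: the `OutputField` class, the `convert` function, the unit table and the list of accepted input date formats are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical field definition; not Segna's actual schema format.
@dataclass
class OutputField:
    name: str
    data_type: str                     # "string", "number" or "date"
    unit: Optional[str] = None         # target unit for numbers, e.g. "kg"
    date_format: Optional[str] = None  # strftime pattern for dates

# Assumed conversion factors into kilograms (illustrative only).
TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}

# Assumed input date formats, tried in order.
INPUT_DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y"]

def convert(field: OutputField, raw: str, raw_unit: Optional[str] = None):
    """Convert one raw CSV cell to the type, unit and format of `field`."""
    raw = raw.strip()
    if field.data_type == "number":
        value = float(raw.replace(",", ""))  # tolerate thousands separators
        if field.unit == "kg" and raw_unit is not None:
            value *= TO_KG[raw_unit]         # rescale into the target unit
        return value
    if field.data_type == "date":
        for fmt in INPUT_DATE_FORMATS:
            try:
                parsed = datetime.strptime(raw, fmt)
            except ValueError:
                continue                     # try the next known format
            return parsed.strftime(field.date_format or "%Y-%m-%d")
        raise ValueError(f"unrecognised date: {raw!r}")
    return raw                               # strings pass through unchanged

weight = OutputField("weight", "number", unit="kg")
signup = OutputField("signup_date", "date", date_format="%Y-%m-%d")
weight_kg = convert(weight, "1,500", raw_unit="g")
signup_iso = convert(signup, "31/12/2023")
```

In this sketch the pipeline owner declares the target once, and every uploaded value is normalised towards it, which is the behaviour the paragraph above describes at a high level.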

Additional pipeline configuration

  • Additional fields: To allow extra fields to be added when a CSV is uploaded, select "Allow additional fields on job run". This also enables the option of combining all unmapped columns, as JSON, into one of the fields defined on the pipeline.
  • Duplicate rows: You can specify a set of fields to deduplicate on. Rows that repeat the same values across those fields are treated as duplicates and dropped automatically. This can be useful for enforcing, for example, primary key constraints. For more information, see the row dropping feature page.
  • Email validation: You can specify a set of fields containing email addresses. The format and domain of each email are checked for validity, and any rows with invalid emails are dropped automatically. This can be useful for removing rows of bad data. For more information, see the email validation feature page.
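
The three options above can be sketched in plain Python to show the kind of row-level behaviour they describe. This is an illustration under stated assumptions, not Segna's code: the function names are hypothetical, the sketch assumes the first occurrence of a duplicate is the one kept, and the email check covers format only (a domain check would need a network lookup and is omitted).

```python
import json
import re

# Deliberately simple format check; production email validation is stricter.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def drop_duplicates(rows, key_fields):
    """Keep the first row for each combination of values in key_fields."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(row.get(f) for f in key_fields)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept

def drop_invalid_emails(rows, email_fields):
    """Drop any row where one of the email fields fails the format check."""
    return [
        row for row in rows
        if all(EMAIL_RE.match(row.get(f) or "") for f in email_fields)
    ]

def fold_unmapped(row, schema_fields, target_field):
    """Combine columns outside the schema into one JSON string in target_field."""
    mapped = {k: v for k, v in row.items() if k in schema_fields}
    extra = {k: v for k, v in row.items() if k not in schema_fields}
    mapped[target_field] = json.dumps(extra, sort_keys=True)
    return mapped

rows = [
    {"id": "1", "email": "ana@example.com", "colour": "red"},
    {"id": "1", "email": "ana@example.com", "colour": "blue"},  # duplicate id
    {"id": "2", "email": "not-an-email", "colour": "green"},    # bad email
]
cleaned = drop_invalid_emails(drop_duplicates(rows, ["id"]), ["email"])
folded = [fold_unmapped(r, {"id", "email", "extras"}, "extras") for r in cleaned]
```

Deduplicating on `["id"]` here plays the role of a primary key constraint: only one row per `id` survives.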

5. Publish and test your Pipeline

Click "Publish".

You can now test your Pipeline by clicking on "Run Test Job". Upload a CSV file like the ones you expect to be uploaded. Remember, your input columns don't need to match the output schema! Segna will also try to clean the file.
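
As a rough illustration of what it means for input columns not to match the output schema, the Python sketch below maps differently named input columns onto schema fields using fuzzy string matching from the standard library. How Segna actually matches columns is not described here, so `map_columns`, `normalise` and the `cutoff` value are hypothetical and stand in for whatever the product does.

```python
import difflib

def normalise(name):
    """Lower-case a column name and strip everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())

def map_columns(input_columns, schema_fields, cutoff=0.6):
    """Suggest a schema field for each input column, or None if nothing is close."""
    targets = {normalise(f): f for f in schema_fields}
    mapping = {}
    for col in input_columns:
        match = difflib.get_close_matches(
            normalise(col), list(targets), n=1, cutoff=cutoff
        )
        mapping[col] = targets[match[0]] if match else None
    return mapping

suggested = map_columns(
    ["E-mail", "First Name", "Zip"],
    ["email", "first_name", "phone"],
)
```

In this sketch an unmatched column such as `Zip` maps to `None`; it could then be left for manual mapping or treated as an additional field. Note that two input columns may be suggested for the same schema field, so a real importer would need a confirmation step.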

Once you have completed the upload process, if you supplied an output destination in Step 3, you will see the data in your specified destination.



You've now finished setting up the Pipeline. Head on over to the Integration page to see how you can get a Data Importer in your app in minutes!


What's Next