Loading Data Using SFTP
Custom connectors and industry data connectors leverage Nitro’s integrated SFTP server to get data into Nitro’s database.
Files are removed from the FTP directory 180 days after file creation. Related jobs and IDs are retained, but the underlying file is removed.
Connecting to the SFTP Server
SFTP user credentials are issued when Nitro Support provisions the connector. These credentials are unique for each connector and contain the following components:
- User Name – for each specific connector
- API Key – 128-character reusable key placed in the .CTL control file to initiate Nitro ETL jobs
Hostname and Location
The hostname for the SFTP connection depends on the location of the Nitro cluster:
All locations require the FTP protocol 'SFTP - SSH File Transfer Protocol.'
Customers with firewall rules defining where data from their corporate network can be sent must contact Nitro Support for the IP address to allowlist.
After configuring the connection to the connector-specific SFTP location, load data files into the root directory of the connector SFTP site:
/upload/<Cluster Name>/<Instance Name>/<Connector Name>/
Data should be loaded via a .csv data file – preferably fully quoted, fully escaped, CSV RFC 4180 Standard Compliant, UTF8 file.
Nitro supports the following compression for uploading individual data files via SFTP:
Starting the Data Load Process
There are multiple ways to initiate the data load in Nitro:
Select Run Job from Home page in the Nitro Admin Console to run a job or job orchestration
Set a File Trigger to run a specific job or job orchestration
Schedule the job or job orchestration to run at a specific day/time regularly
Changing file layouts for data being loaded into Nitro requires redshift tables to be adjusted to match the new file layout. Users can remove columns from a table and add columns to the end of existing redshift tables in FTP connectors without manually manipulating the redshift tables. The removed columns remain in the ODS, without new updates to these fields.
Data is processed by adding a control file to the ctl directory within the Connector root directory:
/upload/<Cluster Name>/<Instance Name>/<Connector Name>/ctlfile
Nitro performs preliminary validation on the data before loading it into the database. Validation includes:
- Ensuring the number of columns in the inbound raw data matches what is expected in the FTP yml definition
- Ensuring the data types as defined in the FTP yml definition are compatible with what is coming in from the raw data
- Ensuring required fields are present in the raw data
- Ensuring the field lengths in the raw data are supported by what is specified in the FTP yml definition
Nitro identifies errors and continues processing until the error threshold is reached, defined by the maxErrorCount control file attribute.
Nitro validates table YAMLs to detect invalid table, column name, and data type when the metadata is stored. If any of these are found during validation, an error occurs and the metadata is not stored. Since admins are notified in advance of loading table YAML files into Nitro if there is an issue with the file, they can correct any issues before the table is created.
The following validations apply to the table YAMLs.
<Table/ Column Name>
- Contain only ASCII letters, digits, underscore characters (_), or dollar signs ($)
- Begin with an alphabetic character or underscore character
- Be between 1 and 127 characters in length
- Contain no quotation marks and no spaces. Not be a reserved SQL keyword.
- Data type should be supported by RS