CSV Connector

Additional Documentation

This Connector is based on the CSV ADO.NET Provider by CData. For additional help, including Connection String parameters, please see: https://cdn.cdata.com/help/RVG/ado/

CSV Via Upload (in Cloud Cartographer)

When you upload a new file, you must clear the connector cache before the file will appear in the Query Builder and before Origin Preview will work.

If you run into any issues with the data being read in unexpected ways, it is likely because the automated type detector has guessed the wrong data type for one or more columns. You can turn type detection off and force Starfish to treat all columns as text by adding the following to the Connection String setting in your connector settings:

         TypeDetectionScheme=None
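
For example, if the Connection String box already contains other settings, append the property with a semicolon separator (the rest of the string below is just a placeholder, not a value from this article):

         <your existing connection string>;TypeDetectionScheme=None;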

Using the On-Prem CSV Connector

Origin

We've seen some display issues with the CSV connector. The Connection String sample is not quite correct. It should read:

         Data Source="C:\Starfish\Folder";Extended Properties="Text;HDR=Yes;"
         

Your SQL Selection Statement then selects from the filename:

         SELECT * FROM districts.csv

In addition, there should be a "Select File..." button just above the Connection String box and all the way to the right. Sometimes, this doesn't display.

In version 5 and higher, you need quotes around the filename in the selection statement:

         SELECT * FROM "districts.csv"

The file needs to be in a location that the Starfish app pool can read from. Either grant the "IIS_IUSRS" group read access to the directory, or put the CSV file under C:\inetpub\wwwroot\StarfishEngine.
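
For example, to grant IIS_IUSRS read access to the folder from the earlier Connection String sample, you can run something like the following from an elevated command prompt (a sketch; substitute your own folder path):

         icacls "C:\Starfish\Folder" /grant "IIS_IUSRS:(OI)(CI)RX"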

Issues

Type mismatch: a column contains some alphanumeric entries, but Starfish believes the data is numeric.

If a column contains numeric data for the first few rows and alphanumeric data later on, Starfish will expect numeric data in every row and will fail to parse the alphanumeric values. By default, the Microsoft Jet driver used for this functionality scans only the first 16 rows and guesses each column's type from that sample, so the problem only appears when the data switches from numeric to alphanumeric partway through the file.

To fix this, open the registry and navigate to:

         HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Text

Find the MaxScanRows value and change it to 0. Restart Starfish. It may also be necessary to perform an IISRESET.
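
If you prefer to script the change, a reg.exe command along these lines should work (a sketch; run it from an elevated prompt, and note that the key path above is for the 32-bit Jet driver on 64-bit Windows):

         reg add "HKLM\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Text" /v MaxScanRows /t REG_DWORD /d 0 /f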

Value too long to parse

Depending on what values Starfish is expecting to see in a column, it may generate an error for values that do not match the expected data type. You can create a schema.ini file to tell Starfish what data types to expect, rather than relying on Starfish's guess based on scanning the first few rows of data.

First, use the ODBC Data Sources tool to generate a schema file:

1. Open the tool, go to the System DSN tab, and click Add.
2. Select Microsoft Text Driver and click Finish.
3. Uncheck the 'Use Current Directory' setting and click the Select Directory button.
4. Navigate to the directory where your file is saved and click OK.
5. Click the Options button, then click the Define Format button.
6. Select your file from the list, check the Column Name Header box, and click Guess. (You may also have to adjust the 'Rows to Scan' setting.)
7. Click OK, then cancel out of the ODBC Data Sources tool.

You should now see a schema.ini file in the directory. You can edit this file with a text editor to change the expected data type of a column; for example, to treat a column as varchar, set it to Char Width 255. You will need to remove the line CharacterSet=OEM from the schema.ini file if it is present. Also remove the entries for any other files in the same directory for which you do not wish to define a schema.
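
As a rough sketch, a schema.ini entry for the districts.csv file used in the earlier examples might look like this after editing (the column names here are placeholders; keep the ones the tool generated for your file):

         [districts.csv]
         ColNameHeader=True
         Format=CSVDelimited
         Col1=DistrictID Char Width 255
         Col2=DistrictName Char Width 255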

Use ODBC instead of CSV?

I had a 111MB CSV with 442,427 rows in it. The CSV Connector could not read it. I downloaded the 64-bit Microsoft ODBC CSV Engine and it read the file without issue.

The download is labelled "Microsoft Access Database Engine 2010 Redistributable" and I found it here: https://www.microsoft.com/en-us/download/details.aspx?id=13255.
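
If you go this route, the connection string would presumably use the ACE OLE DB provider rather than Jet. A sketch, assuming the same folder and header settings as the earlier sample (verify the exact string against your Starfish version):

         Provider=Microsoft.ACE.OLEDB.12.0;Data Source="C:\Starfish\Folder";Extended Properties="Text;HDR=Yes;FMT=Delimited";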


Parameter    Description
Folder       Path to the folder containing the CSV file(s)

CSV Connection Edit Screen (screenshot)