
How to Calculate Basic Statistics of a Dataset or Database Table in Workflow

In addition to calculating statistics for an entire text file or database table with this Cloud Connect application, you can automate these analysis jobs so that they can be scheduled, added to a business process or workflow, or run as part of an ETL/ELT data pipeline.

This is achieved via an HTTP request "query string", which can then be embedded directly into any process, batch file, scheduler, or series of commands.

For example, the following statistics request can be tested against a demo file (no credits used):


    https://connect.interzoid.com/run?function=statistics&apikey=use-your-own-api-key-here&source=CSV&connection=https://dl.interzoid.com/csv/numbers.csv&table=CSV&column=1&html=true
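When the request is generated by a script rather than typed by hand, the query string can be assembled from its individual parameters. The following is a minimal Python sketch of that idea. The function name `build_statistics_url` is hypothetical and not part of any Interzoid library; the endpoint and parameter names are taken from the example above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://connect.interzoid.com/run"  # endpoint shown in the example above


def build_statistics_url(apikey, source, connection, table, column, **extra):
    """Assemble the statistics query string from its individual parameters.

    Hypothetical helper for illustration only.
    """
    params = {
        "function": "statistics",
        "apikey": apikey,
        "source": source,
        "connection": connection,
        "table": table,
        "column": str(column),
    }
    params.update(extra)  # optional flags such as html="true"
    # quote() with safe=":/" leaves the file URL readable and encodes spaces as %20.
    return BASE_URL + "?" + urlencode(params, quote_via=quote, safe=":/")


demo_url = build_statistics_url(
    apikey="use-your-own-api-key-here",
    source="CSV",
    connection="https://dl.interzoid.com/csv/numbers.csv",
    table="CSV",
    column=1,
    html="true",
)
print(demo_url)
```

With the demo values, the printed URL is identical to the one shown above, so it can be pasted into a browser or passed to any HTTP client.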
                

Running with 'Curl'

You can also run this request from a Linux, Windows, or Macintosh command line using "Curl" (on Windows, the URL must be wrapped in double quotes). Curl (also known as cURL) is a command-line HTTP client that is available by default on most computers:

Linux & Mac

    curl 'https://connect.interzoid.com/run?function=statistics&apikey=use-your-own-api-key-here&source=CSV&connection=https://dl.interzoid.com/csv/numbers.csv&table=CSV&column=1'
                
Windows

    curl "https://connect.interzoid.com/run?function=statistics&apikey=use-your-own-api-key-here&source=CSV&connection=https://dl.interzoid.com/csv/numbers.csv&table=CSV&column=1"
                

Redirecting Output

Output from these curl commands can be redirected to an output file for further processing using the greater-than symbol (>) on Linux, Mac, and Windows.

Linux & Mac

    $ curl '[HTTP query string]' > output.csv
                
Windows

    > curl "[HTTP query string]" > output.csv
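The same redirect can be done from a script when curl is not available. This is a sketch using only the Python standard library; the function names `fetch_text` and `save_output` are hypothetical, and it assumes the response body is UTF-8 text.

```python
from urllib.request import urlopen


def fetch_text(url):
    """Issue the HTTP GET and return the response body as text.

    Assumes the service responds with UTF-8 text.
    """
    with urlopen(url, timeout=60) as response:
        return response.read().decode("utf-8")


def save_output(url, path, fetch=fetch_text):
    """Write the response for `url` to `path`, like `curl ... > output.csv`.

    `fetch` can be swapped out, for example with a stub when testing offline.
    Returns the number of characters written.
    """
    body = fetch(url)
    # newline="" writes the line endings exactly as they were received.
    with open(path, "w", encoding="utf-8", newline="") as out:
        out.write(body)
    return len(body)
```

Calling `save_output(query_url, "output.csv")`, where `query_url` holds the full HTTP query string, sends the request and writes the result to `output.csv` in the current directory.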
                

Connecting to Cloud SQL Data Tables

Here are some examples of using the same HTTP query string to calculate statistics on columns in an entire database table or view. See more about connection strings.


    (Snowflake example) https://connect.interzoid.com/run?function=statistics&apikey=use-your-own-api-key-here&source=Snowflake&connection=your-specific-connection-string&table=numbers&column=number
    (Azure SQL example) https://connect.interzoid.com/run?function=statistics&apikey=use-your-own-api-key-here&source=azure%20sql&connection=your-specific-connection-string&table=numbers&column=number
    (AWS RDS example) https://connect.interzoid.com/run?function=statistics&apikey=use-your-own-api-key-here&source=aws%20rds%20postgres&connection=your-specific-connection-string&table=numbers&column=number
    (Google Cloud SQL example) https://connect.interzoid.com/run?function=statistics&apikey=use-your-own-api-key-here&source=postgres&connection=your-specific-connection-string&table=numbers&column=number
    (Postgres example) https://connect.interzoid.com/run?function=statistics&apikey=use-your-own-api-key-here&source=postgres&connection=your-specific-connection-string&table=numbers&column=number
    (MySQL example) https://connect.interzoid.com/run?function=statistics&apikey=use-your-own-api-key-here&source=mysql&connection=your-specific-connection-string&table=numbers&column=number
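Source names that contain spaces, and connection strings that contain characters such as `&`, `=`, or `@`, need to be percent-encoded before they are placed in a URL. The sketch below generates one request per source from the examples above and encodes every value. The helper name `statistics_url` is hypothetical, and the sketch assumes the service decodes standard percent-encoding, as HTTP servers normally do.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://connect.interzoid.com/run"

# Source values copied from the examples above.
SOURCES = ["Snowflake", "azure sql", "aws rds postgres", "postgres", "mysql"]


def statistics_url(source, connection, table, column, apikey):
    """Build a statistics request for a database table or view (illustrative helper)."""
    params = {
        "function": "statistics",
        "apikey": apikey,
        "source": source,
        "connection": connection,
        "table": table,
        "column": column,
    }
    # safe="" percent-encodes every reserved character, so a connection string
    # containing '&', '=', '@', or spaces cannot break the query string.
    return BASE_URL + "?" + urlencode(params, quote_via=quote, safe="")


urls = {
    source: statistics_url(
        source,
        connection="your-specific-connection-string",  # placeholder from the examples
        table="numbers",
        column="number",
        apikey="use-your-own-api-key-here",
    )
    for source in SOURCES
}

for source, url in urls.items():
    print(source, "->", url)
```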
                


Statistics Parameters


    Parameters specific to calculating statistics on a column to be set as part of the HTTP query string:

    function	    Required. Use 'statistics' to calculate basic statistics for a column.

                    


Additional Parameters


    Additional parameters that can be set as part of the HTTP query string:

    apikey	    Required. Log in to www.interzoid.com to obtain your API Key. It is how we track and manage usage.
                    If you do not yet have one, register at www.interzoid.com/register-api-account

    source	    Required. Source of data, such as 'CSV', 'Snowflake', 'Postgres', etc.
                    See source list on interactive page for entire list.

    connection	    Required. Connection string to access database, or in the case of a CSV or TSV file,
                    use the full URL of the location of the file.

    table	    Required. Table name to access the source data. Use "CSV" or "TSV" for delimited text files.

    column	    Required. Column name within the table to access the source data. This is a number for CSV or TSV files,
                    starting with number 1 from the left side of the file.

    reference	    An additional column from the source table to display in the output results, such as a primary key.

    newtable	    The name of the new table if the output results are written to a new table.

    json	    Set to true (&json=true) to display the output formatted as JSON.

    html	    Set to true (&html=true) to insert line breaks into the output results for better readability in
                    a browser when run from the address bar.
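Before a job is scheduled, it can help to check the parameters locally so that a missing value is caught before the request is sent. The following sketch checks only the rules stated in the parameter list above: the six required parameters, and a numeric column starting at 1 for CSV or TSV files. The function name `check_parameters` is hypothetical, and the service may apply further validation of its own.

```python
REQUIRED = ("function", "apikey", "source", "connection", "table", "column")


def check_parameters(params):
    """Return a list of problems found in a parameter dict (empty if none)."""
    problems = [
        f"missing required parameter: {name}"
        for name in REQUIRED
        if not params.get(name)
    ]
    # For delimited text files the column is a position counted from 1, not a name.
    if str(params.get("table", "")).upper() in ("CSV", "TSV"):
        column = str(params.get("column", ""))
        if column and not (column.isdecimal() and int(column) >= 1):
            problems.append("column must be a number starting at 1 for CSV/TSV files")
    return problems


demo = {
    "function": "statistics",
    "apikey": "use-your-own-api-key-here",
    "source": "CSV",
    "connection": "https://dl.interzoid.com/csv/numbers.csv",
    "table": "CSV",
    "column": "1",
}
print(check_parameters(demo))
```

An empty list means that the parameters satisfy these checks; any other result lists what needs to be corrected before the query string is built.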
    

Questions? Contact support@interzoid.com - we are happy to help.
