
AI-Powered Batch File Processing API

Automate and integrate AI-driven data validation and data enrichment capabilities via an API call for an entire dataset or file

Overview

This guide explains how to leverage our high-performance, parallel-processing cloud architecture to run, and optionally automate, data validation and enrichment jobs for entire datasets and files with a single API call. Beyond validating and enriching data quickly and easily, you can schedule processing, embed jobs in business processes, enrich multiple datasets, and integrate with ETL/ELT pipelines.

Automation

Schedule and integrate validation and enrichment jobs via API into your ETL/ELT processes, workflows, DevOps pipelines, or other operations, continuously gauging levels of data quality and data value across all of your data assets.

Multiple Data Sources

Support for various dataset formats, including local files, cloud files, and other data sources.

Single Command

Execute or schedule powerful, highly-performant, API-driven data enrichment and validation capabilities with a single HTTP API request.

How It Works

Data processing is initiated via an HTTP request query string, which can be embedded into any process, batch file, scheduler, or scripted series of commands. The call can be made from a browser address bar, from a command line using cURL, or via any other method that can issue an HTTP request.
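Because the entire job is expressed as a query string, the request URL can be assembled programmatically before being handed to cURL or any HTTP client. A minimal sketch in Python, where the API key and file location are placeholders; `urlencode` percent-encodes the embedded file URL so it travels safely inside the outer query string:

```python
from urllib.parse import urlencode

# Job parameters for the batch run; apikey and connection are placeholders.
params = {
    "function": "email-info",
    "apikey": "your-api-key",
    "source": "csv",
    "connection": "https://your-file-location/myfile.csv",
    "table": "csv",
    "column": 1,
}

# urlencode percent-encodes the embedded file URL so it survives
# inside the outer query string.
url = "https://connect.interzoid.com/run?" + urlencode(params)
print(url)
```

The resulting string can then be passed to cURL, a scheduler, or any HTTP library.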

CSV Data Source Example: Validating and Enriching Email Addresses
Description:

The API provides validation information for email addresses to aid in deliverability and to prevent sending email to bad addresses. It also offers additional demographic and descriptive data useful for marketing, personalization, and segmentation purposes. Examples include descriptions, revenue, number of employees, Twitter/X handles, location data, generic/disposable email indicators, and more. The source file is referenced by URL, as the sample file is stored on AWS S3.

Example API Call:
Try it out with cURL from the command line.

curl "https://connect.interzoid.com/run?function=email-info&apikey=your-api-key&source=csv&connection=https://your-file-location&table=csv&column=1"

API Parameters

Use these parameters in your HTTP query string/API call.

function (Required)
Which function (API) to call for the processing. Options:
  • email-info: Validate and enrich email addresses
  • city-standard: Standardize city names, including globally
  • state-standard: Standardize US state and Canadian province names
  • country-standard: Standardize country names
  • country-standard-info: Standardize country names with additional enrichment data
  • phone-info: Append and enrich global phone numbers with location data
  • entity-type: Determine/label the entity type of data (company, person, location, email, etc.)
  • name-origin: Determine likely country of origin based on a name
  • gender: Determine likely gender based on a name
  • translate-to-english: Translate column data from any language to English
  • translate-to-any: Translate column data from any language to any target language

apikey (Required)
Your API key. Log in to www.interzoid.com to obtain one. New users can register at www.interzoid.com/register-api-account

source (Required)
Source of the delimited data file, either 'CSV' or 'TSV'.

connection (Required)
Location of the CSV or TSV file: the full URL of the raw file location (S3, Azure Storage, Google Storage, GitHub, etc.).

table (Required)
Table name to access the source data. Use "CSV", "TSV", etc. for delimited text files.

column (Required)
Column number for CSV or TSV files, starting with 1 for the leftmost column (or only column) of the file.

reference (Optional)
An additional column from the source file to display in the output results, such as a primary key. This is also a column number.

target (Optional)
The delimited text format for the output file, such as "CSV" (comma-delimited) or "TSV" (tab-delimited). Default is CSV.

showall (Optional)
Set to true (&showall=true) to output all source columns with the new columns appended to the right.
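The Required/Optional split above can be checked before a job is submitted. A small sketch of such a pre-flight check (the parameter names come from the table; the validation helper itself is an illustration, not part of the API):

```python
# Parameters the table above marks as Required; the rest are Optional.
REQUIRED = {"function", "apikey", "source", "connection", "table", "column"}

def missing_params(params: dict) -> list:
    """Return the sorted list of required parameters absent from a job spec."""
    return sorted(REQUIRED - params.keys())

job = {
    "function": "phone-info",
    "apikey": "your-api-key",
    "source": "tsv",
    "connection": "https://your-file-location.tsv",
    "table": "tsv",
    # "column" accidentally omitted
    "showall": "true",
}
print(missing_params(job))  # reports the missing 'column' parameter
```

Catching a missing required parameter locally avoids submitting a job that the service would reject.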

Supported Data Sources: Connection Strings

Values to use for the API source and connection parameters

Source value: csv
Description: URL path of a CSV file
Example connection string: https://www.mywebaddress.com/files/myfile.csv

Source value: tsv
Description: URL path of a TSV file
Example connection string: https://www.mywebaddress.com/files/myfile.tsv

Running with cURL Example

You can run the command from a Linux, Windows, or macOS command line using cURL:

Linux & Mac

curl 'https://connect.interzoid.com/run?function=email-info&apikey=your-api-key&source=csv&connection=https://your-file-location&table=csv&column=1'

Windows

curl "https://connect.interzoid.com/run?function=email-info&apikey=your-api-key&source=csv&connection=https://your-file-location&table=csv&column=1"

Redirecting Output

Output from these cURL commands can be redirected to files for further processing using the greater-than symbol (>) on Linux, macOS, and Windows.

Linux & Mac

$ curl '[HTTP query string]' > output.csv

Windows

curl "[HTTP query string]" > output.csv
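Once redirected to output.csv, the results can be consumed by any standard CSV tooling. A minimal downstream sketch using Python's csv module, with an inline sample standing in for a real results file (the column names and status values shown are illustrative, not the API's exact output schema):

```python
import csv
import io

# Inline stand-in for the output.csv produced by a redirected cURL call.
sample = (
    "email,status\n"
    "jane@example.com,valid\n"
    "bad@nowhere.invalid,undeliverable\n"
)

with io.StringIO(sample) as f:
    rows = list(csv.DictReader(f))

# Keep only the rows flagged as deliverable.
valid = [r["email"] for r in rows if r["status"] == "valid"]
print(valid)
```

In a real pipeline, `io.StringIO(sample)` would be replaced by `open("output.csv")`.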

Examples

Here is an additional example demonstrating a batch API call and mass data processing from file sources.

TSV Data Source Example: Appending Phone Geographic data
Description:

This API analyzes international phone numbers and returns corresponding geographic information. The call uses the fourth column of the tab-delimited file as the global phone number to analyze. Note that showall is set to true, so all columns in the input source file are included in the output results file, with the new columns appended to the right.

API Call:

curl "https://connect.interzoid.com/run?function=phone-info&apikey=your-api-key&source=tsv&connection=https://your-file-location.tsv&table=tsv&column=4&showall=true"
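The effect of showall=true can be illustrated with plain row manipulation: every input column is carried through unchanged, and the enrichment columns are appended on the right. A sketch with made-up values (the enrichment column values are hypothetical, not actual phone-info output):

```python
# A tab-delimited input row; column 4 (1-based) holds the phone number.
input_row = ["1001", "Acme Ltd", "UK", "+44 20 7946 0958"]

# Hypothetical geographic values a phone-info lookup might append.
enrichment = ["London", "England", "GB"]

# With showall=true, the output row is the full input row plus the
# new columns appended to the right.
output_row = input_row + enrichment
print("\t".join(output_row))
```

Without showall, only the analyzed column (and any reference column) would appear alongside the appended data.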

Need help with your query/API call or other batch/bulk processing? Contact us at support@interzoid.com