Azure Blob Storage

IMPORTANT: This article covers setting up a warehouse to load data from Improvado, not the customer data warehouse from which data is extracted. It also does not cover setting up a customer data warehouse for Data Prep.

Required information

  • Title
  • Account URL
  • ~Account URL must satisfy the following regular expression: https://[a-z0-9]*\.blob\.core\.windows\.net$
  • ~The [a-z0-9] part of Account URL must be between 3 and 24 characters in length
  • SAS Token (see the instructions in How to Generate an Azure SAS Token)
  • Container Name
  • ~Container Name must be between 3 and 63 characters in length and must satisfy the following regular expression: ^(?!.*--.*)[a-z0-9][-a-z0-9]*[a-z0-9]$
  • Encryption type
  • Encryption key
  • Folder
  • ~The maximum length is 254 characters
  • File format
  • File name
  • Separator (optional)
  • ~The maximum length of the separator is 2 characters
  • Partition by
  • Use static IP
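
The constraints listed above can be checked before the form is submitted. The following is a minimal sketch, not Improvado's own validation: the function names are hypothetical, and only the regular expressions and length limits are taken from the list.

```python
import re

# Patterns copied from the requirements above; the capture group around the
# account name is added here so that its length can be checked.
ACCOUNT_URL_RE = re.compile(r"https://([a-z0-9]*)\.blob\.core\.windows\.net$")
CONTAINER_RE = re.compile(r"^(?!.*--.*)[a-z0-9][-a-z0-9]*[a-z0-9]$")


def validate_account_url(url: str) -> bool:
    """True if the URL matches the pattern and the account part is 3-24 chars."""
    match = ACCOUNT_URL_RE.fullmatch(url)
    return match is not None and 3 <= len(match.group(1)) <= 24


def validate_container_name(name: str) -> bool:
    """True if the name is 3-63 chars long and matches the container pattern."""
    return 3 <= len(name) <= 63 and CONTAINER_RE.fullmatch(name) is not None


def validate_folder(folder: str) -> bool:
    """True if the folder template is at most 254 characters long."""
    return len(folder) <= 254


def validate_separator(separator: str) -> bool:
    """True if the (optional) separator is at most 2 characters long."""
    return len(separator) <= 2
```

For example, `validate_container_name("bad--name")` is rejected because of the double hyphen, while `validate_container_name("improvado-data")` passes.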

Encryption

Possible options:

  • No encryption (default cloud storage encryption is still enabled)
  • Customer-provided keys

Encryption Key

If you selected No encryption (default cloud storage encryption), you will not be able to edit this field.

Otherwise, enter your AES-256 key encoded in standard Base64, or the resource name of the Cloud KMS key used to encrypt the blob’s contents. For more information, see the Azure Blob Storage encryption docs.
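
If you do not yet have a key, one way to produce an AES-256 key in standard Base64 is sketched below. This is only an illustration using the Python standard library. The function names are hypothetical, and how you store and rotate the key is up to your own security policy.

```python
import base64
import secrets


def generate_aes256_key_b64() -> str:
    """Return 32 random bytes (256 bits) encoded in standard Base64."""
    return base64.b64encode(secrets.token_bytes(32)).decode("ascii")


def is_valid_aes256_key_b64(value: str) -> bool:
    """Check that the value decodes from standard Base64 to exactly 32 bytes."""
    try:
        raw = base64.b64decode(value, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return len(raw) == 32


if __name__ == "__main__":
    key = generate_aes256_key_b64()
    print(key)  # paste this value into the Encryption key field
```

A 32-byte key always encodes to 44 Base64 characters, so a shorter or longer value is a quick sign that the key was truncated or is of the wrong size.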

Folder

Possible parameters:

  • /{{ data_source }}/{{ data_table_title }}/{{ report_type }}/{{ YYYY }}/{{ MM }}/{{ DD }}
  • ~{{ data_source }} is a data provider, integration, or connector
  • ~{{ data_table_title }} is an object that contains all extraction orders with the same granularity (dimensional schema)
  • ~{{ report_type }} is a set of fields such as metrics, properties, dimensions, etc.

If you use the /{{ YYYY }}/{{ MM }}/{{ DD }} settings, data will be added to a new folder each day. New files do not delete previously loaded ones, even for data that contains no date.
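
To see how a folder template expands into a path, consider the sketch below. It is an illustration only, not Improvado's implementation: the `render_template` and `date_values` helpers, the sample values, and the zero-padding of month and day are all assumptions.

```python
import re
from datetime import date

# Matches {{ name }} with or without spaces inside the braces.
PLACEHOLDER_RE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def date_values(day: date) -> dict:
    """Date placeholders; zero-padding is an assumption of this sketch."""
    return {
        "YYYY": f"{day.year:04d}",
        "MM": f"{day.month:02d}",
        "DD": f"{day.day:02d}",
    }


def render_template(template: str, values: dict) -> str:
    """Replace every {{ name }} placeholder with values[name]."""
    return PLACEHOLDER_RE.sub(lambda m: str(values[m.group(1)]), template)


if __name__ == "__main__":
    template = "/{{ data_source }}/{{ data_table_title }}/{{report_type}}/{{ YYYY }}/{{ MM }}/{{ DD }}"
    values = {
        "data_source": "facebook",       # hypothetical sample values
        "data_table_title": "ads_daily",
        "report_type": "ad_insights",
        **date_values(date(2024, 3, 5)),
    }
    print(render_template(template, values))
```

With these sample values the template renders as /facebook/ads_daily/ad_insights/2024/03/05, so each day's load lands in its own folder.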

Data structure of S3 storage by Improvado

On request, our support team can set up a different folder structure in the container.

File format

Possible formats:

  • csv
  • csv+gzip
  • json
  • json+gzip
  • parquet

File name

Possible parameters:

  • {{filename}}-{{YYYY}}-{{MM}}-{{DD}}
  • ~{{ filename }} is the same as the destination table name

IMPORTANT: You cannot use {{ DD }} when partitioning by month.

  • ~{{filename}}-{{YYYY}}-{{MM}}-{{DD}}: for partition by day
  • ~{{filename}}-{{YYYY}}-{{MM}}: for partition by month

You can also use “_” instead of “-”, or omit the separator entirely, for example:

  • {{filename}}_{{YYYY}}-{{MM}}-{{DD}}
  • {{filename}}{{YYYY}}{{MM}}{{DD}}
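
The rule that {{ DD }} cannot be combined with monthly partitioning can be checked mechanically. The sketch below is a hypothetical helper, not part of Improvado, and implements only that one rule as stated in this section.

```python
import re

# Matches {{ name }} with or without spaces inside the braces.
PLACEHOLDER_RE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def placeholders(template: str) -> set:
    """Return the set of placeholder names used in a file name template."""
    return set(PLACEHOLDER_RE.findall(template))


def is_allowed_for_partition(template: str, partition_by: str) -> bool:
    """False if the template uses DD while partitioning by month."""
    if partition_by.lower() == "month":
        return "DD" not in placeholders(template)
    return True
```

For example, `is_allowed_for_partition("{{filename}}_{{YYYY}}-{{MM}}-{{DD}}", "Month")` returns `False`, while the same template is accepted for `"Day"`.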

Partition by

Possible ways of splitting data:

  • Day
  • Month

Use static IP

Select Yes for the Use static IP option if you allow Improvado to connect to your database only from the following static IPs:

  • 34.226.37.150
  • 18.213.72.135
  • 54.146.15.122
  • 3.86.170.178
  • 23.21.191.65

Select No if you permit access to your database from any IP. In this case, Improvado will connect to your database using dynamic IPs not listed above.

How to connect

Improvado needs permission to write and update data in the container.
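
Before entering the SAS token, you can check which permissions it carries. An Azure SAS token is a query string whose `sp` field lists the signed permissions as letters (for example `r` for read, `w` for write, `c` for create). The sketch below only parses the token locally. The helper names are hypothetical, and treating `w` as the minimum required permission is an assumption, since this article does not list the exact letters.

```python
from urllib.parse import parse_qs


def sas_permissions(sas_token: str) -> set:
    """Return the permission letters from the token's `sp` field."""
    params = parse_qs(sas_token.lstrip("?"))
    return set(params.get("sp", [""])[0])


def has_permissions(sas_token: str, required: str = "w") -> bool:
    """True if every letter in `required` is present in the token.

    The default of "w" (write) is an assumption of this sketch.
    """
    return set(required) <= sas_permissions(sas_token)


if __name__ == "__main__":
    # Placeholder token for illustration; the signature is not real.
    token = "?sv=2022-11-02&sr=c&sp=rcw&se=2030-01-01T00:00:00Z&sig=PLACEHOLDER"
    print(sorted(sas_permissions(token)))
    print(has_permissions(token, "w"))
```

This check does not contact Azure, so it cannot tell whether the token has expired or been revoked. It only catches a token that was generated without write access.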
