Updated on Sep 5, 2024
Note: Google Cloud Storage is currently supported only as a Destination. This guide doesn’t cover the DataPrep setup for GCS.
Google Cloud Storage is a highly available and durable object storage service offered by Google Cloud Platform, designed to store and access large, unstructured data sets with high reliability, scalability, and performance.
Follow our setup guide to connect Google Cloud Storage to Improvado.
To use Service Account Key authentication, first generate a JSON key file in the Google Cloud Console by following the official documentation or the interactive step-by-step guide provided by Google.
Alternatively, you can follow the instructions below:
On the Google Cloud Storage connection page, fill in the following fields:
{%dropdown-body name="bucket-name"%}
Preferred bucket for GCS uploading.
Bucket Name can only contain letters, numbers, dots, and underscores and must start and end with a letter or number.
Bucket Name length must be between 3 and 222 characters.
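The two rules above can be checked before you fill in the field. The snippet below is a minimal sketch that encodes only the constraints stated in this section; the function name `is_valid_bucket_name` is hypothetical, and Google Cloud Storage's own naming requirements may differ, so check them before creating a bucket.

```python
import re

# Rules as stated in this guide (an illustration, not Improvado's validator):
# letters, numbers, dots, and underscores only; the first and last
# characters must be a letter or a number.
_BUCKET_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9._]*[A-Za-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if `name` satisfies the rules listed in this section."""
    # Length must be between 3 and 222 characters, inclusive.
    if not 3 <= len(name) <= 222:
        return False
    return _BUCKET_NAME.fullmatch(name) is not None


print(is_valid_bucket_name("marketing_exports.2024"))  # True
print(is_valid_bucket_name("_exports"))                # False
```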
{%dropdown-end%}
{%dropdown-body name="filename"%}
Possible parameters:
```{{filename}}-{{dataclass}}-{{YYYY}}-{{MM}}-{{DD}}```
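To illustrate how such a pattern turns into a file name, here is a small sketch that substitutes the placeholders. It is not Improvado's implementation: the `render_template` helper, the sample values `report` and `campaigns`, and the zero-padded month and day are all assumptions made for the example.

```python
import re
from datetime import date


def render_template(template: str, values: dict[str, str], day: date) -> str:
    """Replace {{name}} placeholders with sample values (illustrative only)."""
    fields = {
        **values,
        # Assumption: date parts are zero-padded.
        "YYYY": f"{day.year:04d}",
        "MM": f"{day.month:02d}",
        "DD": f"{day.day:02d}",
    }
    # Placeholders without a known value are left unchanged.
    return re.sub(
        r"\{\{(\w+)\}\}",
        lambda m: fields.get(m.group(1), m.group(0)),
        template,
    )


name = render_template(
    "{{filename}}-{{dataclass}}-{{YYYY}}-{{MM}}-{{DD}}",
    {"filename": "report", "dataclass": "campaigns"},  # hypothetical values
    date(2024, 9, 5),
)
print(name)  # report-campaigns-2024-09-05
```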
{%dropdown-end%}
{%dropdown-body name="file-format"%}
Possible formats:
{%dropdown-end%}
{%dropdown-body name="separator"%}
Possible delimiters that can separate data in your file:
{%dropdown-end%}
{%dropdown-body name="partition-by"%}
Possible ways of splitting data:
{%dropdown-end%}
{%dropdown-body name="encryption"%}
Possible options:
{%dropdown-end%}
{%dropdown-body name="root-name"%}
Possible parameters:
```/{{data_source}}/{{data_table_title}}/{{report_type}}/{{YYYY}}/{{MM}}/{{DD}}/{{timestamp}}```
If you use the ```/{{YYYY}}/{{MM}}/{{DD}}``` settings, data is added to a separate folder for each day. A new record does not delete the previous one, even when the data contains no date. On request to the support team, Improvado can support a different root structure in a bucket.
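The daily-folder behavior can be pictured with a small in-memory model. This is only a sketch of the behavior described above, not how Improvado writes to a bucket: the `daily_folder` and `load` helpers, the zero-padded date parts, and the object names are assumptions.

```python
from datetime import date

# In-memory stand-in for a bucket: folder path -> objects stored under it.
bucket: dict[str, list[str]] = {}


def daily_folder(day: date) -> str:
    """Build a /YYYY/MM/DD folder path (zero-padding is an assumption)."""
    return f"/{day.year:04d}/{day.month:02d}/{day.day:02d}"


def load(day: date, object_name: str) -> None:
    """Append an object to the folder for `day`; nothing is overwritten."""
    bucket.setdefault(daily_folder(day), []).append(object_name)


load(date(2024, 9, 4), "export-a")  # hypothetical object names
load(date(2024, 9, 5), "export-b")
load(date(2024, 9, 5), "export-c")

for folder, objects in bucket.items():
    print(folder, objects)
# /2024/09/04 ['export-a']
# /2024/09/05 ['export-b', 'export-c']
```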
{%dropdown-end%}
{%dropdown-body name="use-static-ip"%}
Select Yes for the Use static IP option if you allow Improvado to connect to your database from the static IPs listed on the Destination connection page.
Select No if you permit access to your database from any IP. In this case, Improvado connects to your database using dynamic IPs that are not listed on the Destination connection page.
{%dropdown-end%}
{%dropdown-body name="workload-pool-id"%}
Pool IDs are used as identifiers in IAM.
{%dropdown-end%}
{%dropdown-body name="aws-provider-id"%}
Providers manage and verify identities.
{%dropdown-end%}
{%dropdown-body name="aws-provider-id"%}
A service account is identified by its email address, which is unique to the account.
{%dropdown-end%}
Note: We recommend using the Service Account Key as an authentication method.
With identity federation, you can use Identity and Access Management (IAM) to grant external identities IAM roles, including the ability to impersonate service accounts. This approach eliminates the maintenance and security burden associated with service account keys.
Learn more about identity federation in the Google Cloud IAM documentation: Workload identity federation.
You need to grant the Improvado Google service account, improvado-gcs-loader@green-post-223109.iam.gserviceaccount.com, access to your Google Cloud Storage bucket by assigning it the Storage Object Admin role on that bucket.
Learn more here.
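If you prefer to grant the role programmatically rather than in the console, the sketch below shows one way to do it with the `google-cloud-storage` Python client. It assumes the library is installed, that your own credentials are allowed to change the bucket's IAM policy, and that `my-bucket` stands in for your bucket name; the helper names are hypothetical. `roles/storage.objectAdmin` is the role ID for Storage Object Admin.

```python
# Service account named in this guide, in IAM member syntax.
IMPROVADO_MEMBER = (
    "serviceAccount:improvado-gcs-loader@green-post-223109.iam.gserviceaccount.com"
)
ROLE = "roles/storage.objectAdmin"  # role ID for Storage Object Admin


def improvado_binding() -> dict:
    """Build the IAM binding that gives Improvado the role on a bucket."""
    return {"role": ROLE, "members": {IMPROVADO_MEMBER}}


def grant_access(bucket_name: str) -> None:
    """Add the binding to the bucket's IAM policy (makes real API calls)."""
    # Imported here so the rest of the sketch runs without the library.
    from google.cloud import storage  # pip install google-cloud-storage

    client = storage.Client()
    bucket = client.bucket(bucket_name)
    policy = bucket.get_iam_policy(requested_policy_version=3)
    policy.bindings.append(improvado_binding())
    bucket.set_iam_policy(policy)


# Example call; "my-bucket" is a placeholder, so it is left commented out.
# grant_access("my-bucket")
```

The binding is scoped to a single bucket, so the service account gets no access to other buckets in the project.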
The Improvado team is always happy to help with any other questions you might have! Send us an email.
Contact your Customer Success Manager or raise a request in the Improvado Service Desk.