The Filter Data operator is a versatile tool in Self-Serve Data Transformation, designed to apply filters to any dataset or transformation result.
How Does It Work?
Select the Table or Dataset Start by choosing the data you want to filter. You can select any table in your database. Alternatively, you can filter results from previous transformation steps, enabling seamless multi-step workflows.
Apply Filters Add filters to define the subset of data you need: ou can create simple or compound conditions for filtering. You can also combine multiple conditions using AND/OR logic to refine the filtering criteria.
Add Table-Specific Filters You can apply filters at two levels:
~Dataset Level: Filter the entire dataset from the previous step using global conditions.
~Table Level: Add specific filters to individual tables within the dataset by clicking «Add Table Filter» and defining conditions for the selected table.
Run and Save Once your filters are configured, click «Run and Save» to apply the filters and update the model. The resulting dataset will reflect the specified filters.
Use Case
Filtering Cross-Channel Data with Google Ads Performance Max Campaigns
To illustrate, let’s consider a practical example:
Goal
Prepare Cross-Channel data that includes:
Google Ads data filtered by account.
Performance Max campaigns filtered by Campaign Type.
Steps
Select the Dataset: Choose the result from the previous step (e.g., Cross-Channel data).
Apply Dataset-Level Filters: Add a filter to the entire dataset to include only data for a specific account_name = Improvado.
Add Table-Level Filters: Click Add Table Filter to apply a filter specifically to the Google Ads table, and add a condition: campaign_advertiser_type = Performance Max.
Run and Save: Execute the filter operation and update the model.
As a result we have entire dataset is filtered by account name. The Google Ads table includes only Performance Max campaigns. This ensures a clean, non-duplicated dataset, ready for downstream analysis or transformations.
How to apply filters via AI Agent for Transformation
In Recipes, you can use the Transform Agent to automatically configure the Filter Data step by providing a raw SQL `WHERE``` clause. This allows for quick and precise filtering without manually using the UI constructor.
Below are examples of SQL filter snippets you can paste directly into the Transform Agent. Remember to specify the target table or step name in your prompt.
Example 1: Simple Equality Filter
SQL Snippet:
WHERE country = 'USA'
Transform Agent Prompt:
"On the Customers table, apply the following SQL filter: WHERE country = 'USA'"
Example 2: Multiple Conditions (AND)
SQL Snippet:
WHERE category = 'Electronics' AND price > 1000
Transform Agent Prompt:
"For the Sales_Data step, apply the following SQL filter: WHERE category = 'Electronics' AND price > 1000"
Example 3: Multiple Conditions (OR)
SQL Snippet:
WHERE status = 'shipped' OR status = 'delivered'
Transform Agent Prompt:
"On the Orders table, apply the following SQL filter: WHERE status = 'shipped' OR status = 'delivered'"
Example 4: Using the IN Operator
SQL Snippet:
WHERE country IN ('USA', 'Canada', 'Mexico')
Transform Agent Prompt:
"For the Enriched_Users step, apply the following SQL filter: WHERE country IN ('USA', 'Canada', 'Mexico')"
Example 5: Combining AND and OR
SQL Snippet:
WHERE (city = 'New York' OR city = 'London') AND age > 30
Transform Agent Prompt:
"On the User_Profiles table, apply the following SQL filter: WHERE (city = 'New York' OR city = 'London') AND age > 30"
Example 6: Using LIKE for Pattern Matching
SQL Snippet:
WHERE email LIKE '%@gmail.com'
Transform Agent Prompt:
"For the Marketing_Leads step, apply the following SQL filter: WHERE email LIKE '%@gmail.com'"
Example 7: Checking for NULL Values
SQL Snippet:
WHERE source NOT IN ('internal', 'test') AND campaign_id IS NOT NULL
Transform Agent Prompt:
"On the Contacts table, apply the following SQL filter: WHERE phone_number IS NOT NULL"
Example 8: Using NOT and NOT IN
SQL Snippet:
WHERE source NOT IN ('internal', 'test') AND campaign_id IS NOT NULL
Transform Agent Prompt:
"For the Attribution_Data step, apply the following SQL filter: WHERE source NOT IN ('internal', 'test') AND campaign_id IS NOT NULL"
Example 9: Complex Multi-Level Filtering
SQL Snippet:
WHERE (registration_date > '2023-01-01' AND (country = 'Germany' OR country = 'France')) OR (total_spent > 5000 AND last_seen_days < 30)
Transform Agent Prompt:
"On the Final_Customer_List table, apply the following SQL filter: WHERE (registration_date > '2023-01-01' AND (country = 'Germany' OR country = 'France')) OR (total_spent > 5000 AND last_seen_days < 30)"
By providing clear SQL snippets and prompts, you can significantly speed up the process of building complex filters in the Filter Data step.
Conclusion
The Filter Data operator brings precision and flexibility to your data transformation workflows. By allowing you to apply filters at both the dataset and table levels, it ensures your data is clean, targeted, and ready for deeper analysis.
Setup guide
Schema information
Settings
No items found.
Troubleshooting
Troubleshooting guides
Check out troubleshooting guides for
Filter Data
here:
Limits
Filter Data API changes
Frequently asked questions
No items found.
Thank you for your feedback!
Oops! Something went wrong while submitting the form.