# Tableau — SA Migration Guide

## Platform Architecture

```mermaid
graph TB
  subgraph Authors["Authoring Layer"]
    TD["Tableau Desktop<br>(.twb / .twbx)"]
    TP["Tableau Prep Builder<br>(flow files)"]
    TWC["Tableau Web Authoring<br>(browser-based)"]
  end
  subgraph Server["Tableau Server / Tableau Cloud"]
    GW["Gateway / Load Balancer"]
    APP["Application Server<br>(VizQL + portal UI)"]
    VizQL["VizQL Server<br>(query engine + renderer)"]
    BGW["Backgrounder<br>(scheduler + subscriptions)"]
    DS["Data Server<br>(published data sources)"]
    FM["File Store<br>(extract cache .hyper)"]
    PG[("PostgreSQL<br>Repository DB")]
    META["Metadata API<br>(Tableau Catalog)"]
  end
  subgraph Clients["Consumers"]
    Browser["Browser / Portal"]
    Email["Email Subscriptions"]
    EMB["Embedded Application"]
    API["REST API / JS API"]
  end
  subgraph Sources["Data Sources"]
    SQL[("Relational DB")]
    CLOUD[("Cloud DW / Databricks")]
    FILE["Flat Files / Excel"]
    CUBE["OLAP / Cubes"]
  end
  TD -->|"Publish .twbx / .twb"| GW
  TWC --> GW
  GW --> APP
  APP --> VizQL
  VizQL --> DS
  VizQL --> FM
  DS --> SQL
  DS --> CLOUD
  DS --> FILE
  DS --> CUBE
  BGW --> VizQL
  BGW --> Email
  APP --> PG
  PG --> META
  APP --> Browser
  APP --> EMB
  APP --> API
```
## Data Flow

```mermaid
flowchart LR
  SRC[("Source Systems<br>DB / Cloud DW / Files")]
  DS["Data Server<br>Published Data Sources"]
  EXT["Hyper Extract<br>.hyper file"]
  VizQL["VizQL Engine<br>Query + Render"]
  BGW["Backgrounder<br>Refresh / Schedule"]
  CACHE["View Cache<br>Pre-rendered VIZ"]
  SRC -->|"Live query or extract refresh"| DS
  DS -->|Extract| EXT
  EXT --> VizQL
  DS -->|Live| VizQL
  BGW -->|"Scheduled refresh"| DS
  BGW -->|"Subscription render"| VizQL
  VizQL -->|"Cached image"| CACHE
  CACHE -->|Portal| Browser["Browser / Portal"]
  VizQL -->|"On-demand render"| Browser
  VizQL -->|"PDF / PNG / Excel"| Email["Email Subscription"]
  VizQL -->|"Embedded view"| EMB["Embedded App<br>JS API / Embedding v3"]
  VizQL -->|"JSON / Image"| RESTAPI["REST API Consumer"]
```
## 1. Ecosystem Overview
Tableau is a visual analytics platform owned by Salesforce. It sits at the self-service end of the BI spectrum — designed so business analysts can build and explore dashboards without writing SQL. It competes directly with Power BI and Looker, and is often found in organizations that have a strong culture of data exploration rather than centralized IT-published reports.
Product variants:
| Variant | Description | Common in customer estates |
|---|---|---|
| Tableau Desktop | Windows/Mac authoring tool — creates .twb/.twbx files | Always present where Tableau is used |
| Tableau Server | On-premises server deployment | Large enterprises, regulated industries |
| Tableau Cloud (formerly Online) | SaaS-hosted version of Tableau Server | Mid-market, modern data teams |
| Tableau Prep Builder | ETL/data prep tool, separate from core Tableau | Present in ~30% of estates |
| Tableau Public | Free public-facing publishing — not used in enterprise | Ignore for migration scoping |
| Tableau Embedded Analytics | Tableau views embedded in custom apps via JS API | High migration risk when present |
| Tableau CRM (formerly Einstein Analytics) | Salesforce-native analytics product — separate product | Different migration path entirely |
Why customers use it:
- Self-service analytics: Business users build their own dashboards without IT involvement
- Visual exploration: Drag-and-drop interface with instant visual feedback
- Embedded analytics: White-labeled Tableau views inside custom applications
- Executive dashboards: Live-connected or extract-based summary dashboards
- Ad hoc analysis: Analysts exploring data without a pre-defined report structure
Key discovery questions:
- How many workbooks are published to the server? How many have been accessed in the last 90 days?
- Are users on Tableau Server (on-prem) or Tableau Cloud (SaaS)?
- What version of Tableau Server is running? (Older versions may lack API coverage)
- What data sources are live connections vs. extracts (.hyper files)?
- Are any workbooks embedded in customer-facing or internal applications using the JS API or Embedding API v3?
- Is Tableau Prep being used for data preparation, and are those flows published to the server?
- How are subscriptions and alerts configured — who owns the recipient lists?
- Is Tableau Catalog (the governance/lineage add-on) licensed and in use?
- How is row-level security enforced — username() functions, entitlement tables, or data source filters?
## 2. Component Architecture
| Component | Role | Migration Equivalent | SA Note |
|---|---|---|---|
| Tableau Desktop | Windows/Mac authoring tool — creates .twb and .twbx workbook files | Databricks SQL notebook / Lakeview dashboard editor / Power BI Desktop | Authors must adopt a new tool; no automated conversion of .twbx to target format |
| Tableau Server / Cloud | Hosts, renders, and serves published workbooks and data sources | Databricks SQL workspace / Power BI Service / Looker instance | The "server" is the migration target, not just a deployment mechanism |
| VizQL Server | Query engine that translates Tableau's visual grammar into SQL/MDX, executes against the data source, and renders the visualization | Databricks SQL compute / partner BI rendering engine | VizQL is proprietary — no open equivalent. Migrating a viz means rebuilding it, not porting VizQL logic |
| Backgrounder | Executes scheduled extract refreshes, subscriptions, alerts, and Prep flows | Databricks Workflows / partner BI native scheduling | If extracts are business-critical, the refresh cadence must be replicated in Databricks Workflows |
| Data Server | Manages published data sources — centralized, reusable connections with embedded credentials | Unity Catalog + Databricks SQL warehouses | Published data sources are the closest analog to Unity Catalog assets; this is the most valuable migration parallel |
| File Store | Stores .hyper extract files and cached rendered views | Delta tables on Databricks / partner BI import cache | .hyper files are proprietary binary format — data must be re-ingested to Delta, not migrated as-is |
| PostgreSQL Repository | Internal content store — stores metadata about workbooks, users, groups, permissions, subscriptions, and usage history | No direct equivalent — mine this DB for inventory data | This is the most valuable single artifact for migration scoping. Query it directly for usage, ownership, and lineage data |
| Metadata API (Catalog) | GraphQL API over Tableau Catalog — provides lineage, column-level metadata, and data quality warnings | Unity Catalog lineage | Only available if Tableau Catalog (Data Management add-on) is licensed |
| Gateway / Load Balancer | Routes requests across a multi-node Tableau Server cluster | Handled by target platform | Only relevant for large on-prem deployments with multiple nodes |
| Tableau Prep | Browser/desktop ETL tool for data preparation flows | Databricks notebooks / dbt / Delta Live Tables | Prep flows (.tflx) are separate artifacts with a separate migration path |
## 3. Artifact Lifecycle
| Stage | What happens | Where (server/client) | Artifact involved | Migration risk |
|---|---|---|---|---|
| Author | Analyst uses Tableau Desktop (or web authoring) to drag dimensions/measures onto a canvas. The workbook stores both the viz definition and optionally an embedded data extract | Client (Desktop) or Server (web authoring) | .twb (XML, no data) or .twbx (zip with embedded .hyper extract) | High — vizzes are defined in Tableau's proprietary VizQL grammar, not SQL. No automated export to standard format |
| Publish | Author publishes the workbook to Tableau Server/Cloud via Desktop or REST API. The workbook file is stored in the PostgreSQL repository as a blob | Server (published via client connection) | .twb or .twbx blob stored in repository DB | Medium — publishing can be scripted via Tableau Server Client (TSC) library or REST API |
| Compile | No pre-compilation step. Tableau translates visual specs to queries at render time using VizQL | Server (at query time) | VizQL query plan (internal, not inspectable) | High — VizQL query generation is opaque; understanding what SQL is being issued requires query logs or Performance Recorder |
| Execute | VizQL generates SQL (or MDX for OLAP sources) and sends it to the data source. For extracts, VizQL queries the local .hyper file instead | Server | SQL sent to data source / query against .hyper file | Medium — live queries will naturally migrate to Databricks SQL; extract-backed workbooks require a Delta table ingestion strategy |
| Render | VizQL renders the result as SVG/HTML in the browser or as a rasterized image (PNG/PDF) for subscriptions and snapshots | Server (image generation) + Client (interactive SVG) | Rendered view (cached in File Store), subscription output (PDF/PNG/CSV) | Low — rendering is handled by the target platform after migration |
| Deliver | Consumer accesses via browser portal, REST API, embedded JS API, or receives scheduled subscription via email | Server (portal) / Client (browser renders interactive SVG) | Rendered HTML/SVG (interactive), PDF/PNG (subscription), JSON (REST API) | High for embedded — Tableau JS Embedding API v3 has no standard replacement; embedded workloads need a redesign decision |
## 4. Data Sources and Dataset Model
Tableau separates data connectivity into two layers:
- Published Data Sources (.tds/.tdsx): Centralized, server-hosted connections that multiple workbooks can share. The closest analog to a governed dataset in Unity Catalog.
- Embedded Data Sources: Connection info baked into a single workbook — common in analyst-built content, harder to govern.
Live vs. Extract:
- Live connection: Every render triggers a query against the upstream source. Latency depends on source system performance.
- Extract (.hyper file): A snapshot of data stored in Tableau's columnar format. Fast query performance, stale by design. Requires scheduled refresh via Backgrounder.
SA Tip: Customers often don't know which workbooks use live vs. extract. This matters because extract-backed workbooks require a data ingestion strategy into Delta, not just a query redirection.
Parameters and filters:
Tableau supports user-facing parameters (typed inputs) and filters (dimension/measure selections). Parameters can be static or dynamic (Tableau 2020.1+). Dynamic parameters pull their allowable values from a query — these require rebuild in the target platform.
Cascading filters (where selecting one filter narrows another) are common in self-service content and require careful review — the dependency logic is embedded in the workbook and must be rebuilt.
| Data Source Type | Frequency in customer estates | Migration Path | Risk |
|---|---|---|---|
| Tableau Hyper Extract (.hyper) | Very High | Re-ingest source data into Delta; do not attempt to migrate .hyper files | High — .hyper is proprietary binary; source system must be re-tapped |
| Direct SQL (Redshift, Snowflake, SQL Server, Postgres) | High | Redirect connection to Databricks SQL warehouse | Low — Tableau supports Databricks native connector |
| Google BigQuery | Medium | Redirect to Databricks SQL warehouse | Low |
| Databricks (already connected) | Medium (growing) | Retain or migrate viz layer only | Low |
| Salesforce / CRM data | Medium | Establish Databricks ingestion pipeline first | Medium |
| Excel / CSV flat files | Medium | Migrate to Delta tables; remove file dependency | Medium — files often live on individual laptops |
| SAP BW / OLAP cubes | Low | No native equivalent in Databricks SQL — must flatten or model in dbt | High — MDX queries must be rewritten as SQL |
| Google Sheets | Low | Migrate to Delta tables | Low |
| JSON / REST API (Web Data Connector) | Low | Build ingestion pipeline in Databricks | High — Web Data Connectors are deprecated in Tableau Cloud |
## 5. Rendering and Delivery Model
Rendering modes:
| Mode | How it works | Common usage |
|---|---|---|
| On-demand (Live) | VizQL queries source and renders on each page load | Executive dashboards, frequently updated data |
| Extract-backed | VizQL queries .hyper file; refreshed on schedule | Self-service analytics where query speed matters |
| Cached view | Previously rendered view served from cache if data hasn't changed | Portal browsing, high-concurrency dashboards |
| Subscription snapshot | Backgrounder renders at scheduled time; output is PDF/PNG/CSV sent by email | Operational reporting, scheduled email delivery |
Delivery modes:
| Delivery Mode | How it works | Business use case | Migration equivalent | Risk |
|---|---|---|---|---|
| Browser portal | User navigates Tableau Server/Cloud portal | Self-service analytics | Databricks SQL portal / Power BI Service / Tableau Cloud retained | Low |
| Direct URL | Shareable URL to a specific view with optional filter parameters in URL | Ad hoc sharing | Partner BI native share links | Low |
| Email subscription | Backgrounder renders a snapshot (PDF/PNG/CSV) and emails it on schedule | Operational delivery, executive digest | Databricks Workflows + email / partner BI subscription | Medium |
| Subscription with custom views | User saves a filtered state; subscription delivers their personal view | Personalized distribution | Partner BI data-driven subscriptions or separate delivery logic | High |
| Embedded via JS API / Embedding v3 | Tableau viz rendered inside a custom web app using Tableau's JavaScript SDK | Customer-facing portals, internal apps | Power BI Embedded / Looker Embedded / Databricks SQL REST API | High — requires application redesign |
| REST API image/PDF export | Programmatic image export via REST API (/views/{id}/image) | Automated report generation | Partner BI REST export API | Medium |
| Data alerts | User sets a threshold; Tableau sends an email when data crosses it | Operational monitoring | Databricks SQL alerts / partner BI native alerts | Low |
| Ask Data / Tableau Pulse (NL query) | Natural language querying of published data sources | Self-service discovery | Databricks Genie / partner NL analytics | Medium — feature parity varies by target |
SA Tip: Embedded deployments are the highest-risk delivery mode. Ask early: "Are any Tableau views embedded in internal or customer-facing applications?" If yes, that workload needs an application-level redesign decision before migration starts.
## 6. Project Structure and Version Control
Server-side organization:
Tableau Server organizes content into Projects (top-level containers) with nested sub-projects. Workbooks, data sources, and flows live inside projects. Projects map loosely to business domains or teams.
Tableau does not have native Git integration. Version history is minimal — Tableau Server keeps a limited revision history (configurable, default 25 revisions), accessible via the REST API.
Environment promotion:
There is no built-in Dev/QA/Prod lifecycle in Tableau Server. Common approaches customers use:
- Separate Tableau Server instances per environment
- Separate Projects on the same server (Dev / Staging / Prod projects)
- Manual promotion via Tableau Desktop publish or REST API scripts
SA Tip: Ask the customer: "Is the server the source of truth, or do analysts have local .twbx files on their laptops?" In many Tableau estates, the authoritative version of a workbook is on someone's laptop, not the server. This directly affects how you scope the inventory.
Command-line / scripting tools:
| Tool | Purpose | SA relevance |
|---|---|---|
| `tabcmd` | CLI for common server admin tasks — publish, export, create sites | Useful for bulk export during migration scoping |
| Tableau Server Client (TSC) | Python library wrapping the REST API — workbook/datasource CRUD, permissions, schedules | Primary scripting tool for migration automation |
| REST API | Full CRUD over workbooks, data sources, projects, users, schedules, subscriptions | Best source for programmatic inventory |
| Tableau Content Migration Tool (CMT) | GUI/CLI tool for moving content between sites or servers | Useful for Tableau-to-Tableau migration, not cross-platform |
## 7. Orchestration and Scheduling
Built-in scheduler:
Tableau Server's Backgrounder service handles all scheduled work: extract refreshes, subscriptions, and Prep flow runs. Schedules are defined in the Tableau Server admin UI or via REST API.
Schedule types:
- Extract refresh schedules: Refresh a published data source or embedded extract on a time interval
- Subscription schedules: Render and email a workbook view on a schedule
- Flow schedules (Prep): Run a Prep flow to output data to a published data source or file
External trigger via API:
Schedules can be triggered on demand via the REST API (`POST /api/{version}/sites/{site-id}/tasks/extractRefreshes/{task-id}/runNow`). This is commonly used when an ETL pipeline completes and needs to trigger a downstream Tableau refresh.
SA Tip: Customers often have a hidden dependency here — their ETL job triggers a Tableau extract refresh via API after loading data. If that ETL migrates to Databricks, the trigger mechanism must move too. Look for Tableau REST API calls in their pipeline code.
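A minimal sketch of what that trigger call looks like from pipeline code. The server URL, API version, site/task IDs, and token below are placeholders, and the request is only built, never sent — in real code the token comes from a prior sign-in call:

```python
import urllib.request

def build_run_now_request(server, api_version, site_id, task_id, auth_token):
    """Build (but do not send) the runNow request an ETL job would issue
    after loading data. All argument values here are placeholders."""
    url = (f"{server}/api/{api_version}/sites/{site_id}"
           f"/tasks/extractRefreshes/{task_id}/runNow")
    return urllib.request.Request(
        url,
        data=b"<tsRequest/>",  # Tableau expects an XML request body
        headers={
            "X-Tableau-Auth": auth_token,  # token from a prior sign-in call
            "Content-Type": "application/xml",
        },
        method="POST",
    )

req = build_run_now_request(
    "https://tableau.example.com", "3.21", "9f9e9d", "task-123", "TOKEN")
print(req.get_method(), req.full_url)
```

Searching pipeline repos for the string `runNow` (or for `X-Tableau-Auth`) is a quick way to surface these hidden dependencies.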
Migration target:
| Tableau scheduling function | Migration target |
|---|---|
| Extract refresh schedule | Databricks Workflows (trigger SQL warehouse query or dbt job) |
| Subscription (email delivery) | Partner BI native subscriptions / Databricks Workflows + notification |
| API-triggered refresh | Databricks Workflows triggered by pipeline completion |
| Prep flow schedule | Databricks notebook / Delta Live Tables pipeline |
## 8. Metadata, Lineage, and Impact Analysis
Where inventory data lives:
The PostgreSQL Repository (internal Tableau DB) is the most valuable single artifact for migration scoping. It contains usage events, workbook metadata, data source connections, schedules, and permissions.
SA Tip: On Tableau Server, the repository is directly queryable if the customer enables the `readonly` user. On Tableau Cloud, the REST API and Metadata API are the only access paths.
Key tables in the PostgreSQL Repository:
| Table | What it contains |
|---|---|
| `workbooks` | All published workbooks — name, project, owner, site |
| `views` | Individual sheets/dashboards within workbooks |
| `hist_views` (via `_views_stats`) | View-level access history — use for 90-day usage analysis |
| `datasources` | Published data sources — type, connection string |
| `subscriptions` | Subscription definitions — owner, schedule, workbook |
| `users`, `groups` | User/group assignments |
| `permissions` | Project and workbook-level permissions |
REST API endpoints for inventory:
```text
GET /api/{version}/sites/{site-id}/workbooks                 # list all workbooks
GET /api/{version}/sites/{site-id}/datasources               # list all published data sources
GET /api/{version}/sites/{site-id}/tasks/extractRefreshes    # list refresh schedules
GET /api/{version}/sites/{site-id}/views                     # list all views with usage stats
```
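These list endpoints are paginated (`pageSize` / `pageNumber` query parameters, with a `totalAvailable` count reported in the response). A small helper — sketched here against a stub fetcher rather than a live server — shows the iteration pattern an inventory script needs:

```python
def paged_items(fetch_page, page_size=100):
    """Iterate every item from a paginated Tableau REST list endpoint.

    `fetch_page(page_number, page_size)` must return
    (items_on_page, total_available) — e.g. parsed from the
    pagination element of the API response.
    """
    page, seen = 1, 0
    while True:
        items, total = fetch_page(page, page_size)
        yield from items
        seen += len(items)
        if seen >= total or not items:
            return
        page += 1

# Stub standing in for a real call such as
# GET /api/{version}/sites/{site-id}/workbooks?pageSize=2&pageNumber=N
inventory = [f"wb-{i}" for i in range(5)]

def fake_fetch(page, size):
    start = (page - 1) * size
    return inventory[start:start + size], len(inventory)

all_workbooks = list(paged_items(fake_fetch, page_size=2))
print(all_workbooks)  # all five workbook names, across three pages
```

Forgetting pagination is the classic cause of undercounted inventories on large estates.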
Lineage:
Tableau Catalog (Data Management add-on) provides column-level lineage via the Metadata GraphQL API. Without Catalog, lineage must be inferred by parsing workbook XML (field references in .twb files point back to data source columns).
Sample inventory queries (PostgreSQL Repository):
```sql
-- Count workbooks by project
SELECT p.name AS project, COUNT(*) AS workbook_count
FROM workbooks w
JOIN projects p ON w.project_id = p.id
GROUP BY p.name ORDER BY workbook_count DESC;

-- Identify unused workbooks (no views in 90 days)
SELECT w.name, w.project_id, MAX(hv.time) AS last_viewed
FROM workbooks w
LEFT JOIN hist_views hv ON hv.workbook_id = w.id
  AND hv.time > NOW() - INTERVAL '90 days'
GROUP BY w.id, w.name, w.project_id
HAVING MAX(hv.time) IS NULL;

-- List data sources and their types
SELECT name, datasource_type, db_name, table_name
FROM datasources
ORDER BY datasource_type;

-- List subscriptions with workbook and owner
SELECT s.id, s.subject, u.friendly_name AS owner, w.name AS workbook, sch.name AS schedule
FROM subscriptions s
JOIN users u ON s.user_id = u.id
JOIN views v ON s.view_id = v.id
JOIN workbooks w ON v.workbook_id = w.id
JOIN schedules sch ON s.schedule_id = sch.id;
```
## 9. Data Quality and Governance
Row-Level Security (RLS):
Tableau implements RLS primarily through User Filters — calculated fields that use Tableau's USERNAME() or USERDOMAIN() functions to filter data to the current viewer. There are several common patterns:
| RLS Pattern | How it works | Migration complexity |
|---|---|---|
| `USERNAME()` filter | Calculated field checks whether a column equals `USERNAME()` | Medium — logic must move to Unity Catalog row filters |
| Entitlement table join | Workbook joins to a separate permission table keyed on username | High — entitlement table and join logic must be migrated |
| Data source filter (published DS) | Filter applied at the published data source level — all workbooks using the DS inherit it | Medium — maps to Unity Catalog row filter on the underlying table |
| Row-level security via groups | Tableau groups mapped to data segments | High — custom logic in workbook |
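To make the entitlement-table pattern concrete, here is a toy sqlite3 sketch (table and column names are hypothetical) of the join Tableau performs inside the workbook. In Unity Catalog, this predicate moves out of the workbook and into a row-filter function bound to the fact table:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sales (region TEXT, amount INT);
    INSERT INTO sales VALUES ('east', 100), ('west', 200), ('west', 50);
    -- Entitlement table: which user may see which region
    CREATE TABLE entitlements (user_name TEXT, region TEXT);
    INSERT INTO entitlements VALUES ('alice', 'west'), ('bob', 'east');
""")

def rows_for(user):
    # The join Tableau evaluates per viewer; in Unity Catalog the same
    # predicate becomes a row filter so every client inherits it.
    return con.execute("""
        SELECT s.region, s.amount
        FROM sales s
        JOIN entitlements e ON e.region = s.region
        WHERE e.user_name = ?
        ORDER BY s.amount
    """, (user,)).fetchall()

print(rows_for("alice"))  # only 'west' rows are visible to alice
```

The migration task is to move the entitlement table into the catalog and rebind the predicate there, so security no longer depends on every workbook author remembering the join.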
Column-level security:
Tableau does not have native column-level security. Columns are hidden by excluding them from the published data source or by using calculated fields that return null for unauthorized users. This is fragile and should be replaced with Unity Catalog column masks.
Caching and freshness:
- Extract refresh frequency is the primary freshness control for extract-backed workbooks
- Live-connected workbooks are always current but impose load on the source
- Tableau Server's view cache can serve stale renders if data hasn't changed — customers may not realize their "live" dashboard is cached
Custom code / scripting:
| Type | Description | Migration risk |
|---|---|---|
| Calculated fields | Tableau formula language (similar to Excel) — used for derived metrics | Medium — must be rebuilt in SQL or dbt models |
| Table calculations | Post-aggregation calculations (running totals, rank, window functions) — run in VizQL, not in SQL | High — no direct SQL equivalent for some patterns; must be modeled in Databricks SQL |
| Level of Detail (LOD) expressions | `FIXED`, `INCLUDE`, `EXCLUDE` — compute aggregates at a different granularity than the view | High — powerful but proprietary; must be rewritten as SQL window functions or subqueries |
| Parameters | User-controlled inputs used in calculations and filters | Medium — must be rebuilt in target BI tool |
| Extensions (Analytics Extensions) | Tableau can call Python/R models via TabPy or Rserve | High — extensions are custom integrations requiring redesign |
| Web Data Connectors (WDC) | Custom JavaScript connectors to REST APIs | High — deprecated in Tableau Cloud; must be replaced with ingestion pipelines |
SA Tip: LOD expressions are the single most common migration blocker in Tableau estates. Ask the customer: "Do your analysts use FIXED LOD expressions?" If yes, budget significant rebuild time — these must become SQL subqueries or dbt models, and the logic is often tribal knowledge in the workbook.
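A sketch of the standard rewrite — a `FIXED` LOD becomes a window aggregate partitioned on the FIXED dimensions. sqlite3 is used here only so the example runs anywhere; the same SQL works on Databricks SQL. Table and column names are invented:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sales (customer TEXT, order_id INT, amount INT);
    INSERT INTO sales VALUES ('a', 1, 10), ('a', 2, 30), ('b', 3, 5);
""")

# Tableau: { FIXED [customer] : SUM([amount]) } — a per-customer total
# that ignores the view's granularity. Typical SQL rewrite: a window
# aggregate partitioned on the FIXED dimension(s).
rows = con.execute("""
    SELECT customer, order_id, amount,
           SUM(amount) OVER (PARTITION BY customer) AS customer_total
    FROM sales
    ORDER BY order_id
""").fetchall()
print(rows)
```

`INCLUDE` and `EXCLUDE` are harder: they are relative to the view's dimensions, so the rewrite depends on what the viz displays, which is exactly why this logic has to be documented workbook by workbook.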
## 10. File Formats and Artifact Reference

### .twb (Tableau Workbook)
What it is: XML file describing the workbook — data connections, calculated fields, parameters, vizzes, dashboards, and filters. Contains no data.
| Property | Value |
|---|---|
| Created by | Tableau Desktop, Tableau Web Authoring |
| Stored in | PostgreSQL repository (as blob) + optional local copy |
| Contains | Viz definitions, connection metadata, calculations, parameters, layouts |
| Human-readable | Yes — XML, but verbose and complex |
| Migration target | Rebuild in target BI tool (Databricks SQL / Power BI / Looker) |
SA Tip: .twb files can be parsed with any XML parser to extract data source connection strings, calculated field definitions, and parameter lists — valuable for automated inventory.
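For example, a minimal parser using only the standard library. The XML below is a trimmed-down stand-in for a real .twb (actual files are versioned and far more deeply nested), but connections and calculated fields follow this general shape:

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-in for a real .twb file.
TWB = """<workbook>
  <datasources>
    <datasource name="Orders">
      <connection class="postgres" server="db.internal" dbname="sales"/>
      <column caption="Profit Ratio" name="[Calculation_1]">
        <calculation class="tableau" formula="SUM([Profit])/SUM([Sales])"/>
      </column>
    </datasource>
  </datasources>
</workbook>"""

root = ET.fromstring(TWB)

# Connection strings: one dict of attributes per <connection> element.
connections = [c.attrib for c in root.iter("connection")]

# Calculated fields: caption -> formula, from <column><calculation> pairs.
calcs = {col.get("caption"): col.find("calculation").get("formula")
         for col in root.iter("column")
         if col.find("calculation") is not None}

print(connections)
print(calcs)
```

Run across an exported estate, this yields a first-pass inventory of data source types and the calculated-field logic that must be rebuilt in SQL.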
### .twbx (Tableau Packaged Workbook)
What it is: ZIP archive containing a .twb file plus any embedded extract data (.hyper) and image resources.
| Property | Value |
|---|---|
| Created by | Tableau Desktop (File → Export Packaged Workbook) |
| Stored in | Local filesystem or uploaded to server |
| Contains | .twb + .hyper extract + background images + custom fonts |
| Human-readable | No (ZIP) — but unzip to access .twb |
| Migration target | Unpack, migrate .twb XML to target BI; re-ingest .hyper data from source system |
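Unpacking is plain ZIP handling. This sketch builds a toy .twbx in memory (member names are illustrative) and pulls the .twb back out for XML parsing:

```python
import io
import zipfile

# Build a toy .twbx: a ZIP holding a .twb plus an embedded extract.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("workbook.twb", "<workbook/>")
    z.writestr("Data/Extracts/orders.hyper", b"\x00binary")

# Unpack: list members, then read the .twb for inventory parsing.
with zipfile.ZipFile(buf) as z:
    names = z.namelist()
    twb_xml = z.read("workbook.twb").decode()

print(names)
print(twb_xml)
```

The presence of a `.hyper` member is itself a useful signal: it tells you the workbook is extract-backed and needs a re-ingestion plan, not just a connection redirect.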
### .tds (Tableau Data Source)
What it is: XML file defining a connection to a data source — connection parameters, field aliases, calculated fields, filters applied at the connection level.
| Property | Value |
|---|---|
| Created by | Tableau Desktop, published from Tableau Server |
| Stored in | PostgreSQL repository (as blob) or local file |
| Contains | Connection string, credentials reference, field metadata, RLS filters |
| Human-readable | Yes — XML |
| Migration target | Unity Catalog data source / Databricks SQL warehouse connection |
### .tdsx (Tableau Packaged Data Source)
What it is: ZIP archive containing a .tds plus an embedded .hyper extract.
| Property | Value |
|---|---|
| Created by | Tableau Desktop |
| Stored in | Local filesystem or uploaded to server |
| Contains | .tds + .hyper extract data |
| Human-readable | No (ZIP) |
| Migration target | Re-ingest source data to Delta; rebuild .tds in target |
### .hyper (Tableau Extract)
What it is: Tableau's proprietary columnar data store. A snapshot of source data optimized for VizQL queries.
| Property | Value |
|---|---|
| Created by | Tableau Desktop / Backgrounder (on refresh) |
| Stored in | File Store on Tableau Server / local filesystem |
| Contains | Column-oriented data snapshot — proprietary binary format |
| Human-readable | No |
| Migration target | Delta table on Databricks (data must be re-ingested from source) |
SA Tip: There is a Hyper API (Python/C++) from Tableau that can read .hyper files. In a pinch, this can extract data for migration, but the preferred path is always to re-tap the original source system and write to Delta.
### .tflx (Tableau Prep Flow)
What it is: Tableau Prep Builder flow definition — a JSON-based pipeline describing data preparation steps.
| Property | Value |
|---|---|
| Created by | Tableau Prep Builder |
| Stored in | PostgreSQL repository (if published) or local file |
| Contains | Flow steps, transformations, output destinations |
| Human-readable | Yes — JSON |
| Migration target | Databricks notebook / Delta Live Tables / dbt |
### PostgreSQL Repository Tables (Content Store)
What it is: Tableau Server's internal database — the definitive record of all server content, usage, permissions, and schedules.
| Property | Value |
|---|---|
| Created by | Tableau Server (auto-managed) |
| Stored in | PostgreSQL on the Tableau Server node |
| Contains | Workbooks, views, users, groups, permissions, subscriptions, usage events |
| Human-readable | Yes — relational tables |
| Migration target | Mine for inventory; no migration target needed |
### Quick-Reference Summary
| Artifact | Extension or location | Human-readable | Where stored | Migration target | Risk level |
|---|---|---|---|---|---|
| Tableau Workbook | .twb | Yes (XML) | PostgreSQL repo blob / local | Rebuild in target BI | High |
| Packaged Workbook | .twbx | No (ZIP) | Local / uploaded | Unpack + rebuild | High |
| Published Data Source | .tds | Yes (XML) | PostgreSQL repo blob | Unity Catalog + Databricks SQL | Medium |
| Packaged Data Source | .tdsx | No (ZIP) | Local / uploaded | Re-ingest + rebuild | High |
| Hyper Extract | .hyper | No (binary) | File Store on server | Re-ingest to Delta | High |
| Prep Flow | .tflx | Yes (JSON) | PostgreSQL repo / local | Databricks / dbt | Medium |
| PostgreSQL Repository | Internal DB tables | Yes (SQL) | Tableau Server node | Mine for inventory | N/A |
## 11. Migration Assessment and Artifact Inventory
Preferred inventory method:
Query the PostgreSQL Repository directly (requires readonly user access on Tableau Server). This is faster and more complete than the REST API for large estates, and avoids UI pagination limits.
For Tableau Cloud estates, use the REST API (no direct DB access) or the Metadata GraphQL API if Catalog is licensed.
SA Tip: Never scope a Tableau migration by counting workbooks in the portal UI. The UI hides system workbooks, shows content from all projects including archived ones, and gives no usage context. Always query the repository or REST API.
Complexity scoring:
| Dimension | Low | Medium | High | Critical |
|---|---|---|---|---|
| Data source type | Direct SQL to cloud DW | Relational DB requiring credential migration | Flat files / Excel on user machines | OLAP/SAP BW — MDX must be rewritten |
| Query/dataset complexity | Simple SQL select | Stored procedures, multi-table joins | LOD expressions, table calculations | TabPy/R extensions calling ML models |
| Parameter complexity | Static parameters | Dynamic parameters (2020.1+) | Cascading filters with data-driven values | Parameters driving RLS logic |
| Custom code | None | Calculated fields only | LOD expressions, table calcs | Web Data Connectors, Analytics Extensions |
| Dashboard nesting | Single-level | Multiple sheets with actions | Embedded dashboards, URL actions | Embedded in external apps via JS API |
| Delivery complexity | Portal browsing only | Email subscriptions | Subscription with custom views / bursting | Embedded JS API in custom applications |
| Security model | No RLS | USERNAME() filter on single table | Entitlement table join | Multi-table dynamic RLS + group-based filters |
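One way to operationalize the rubric above is a simple scorer. The numeric weights and verdict thresholds below are illustrative choices, not a standard — tune them to the estate:

```python
LEVELS = {"low": 1, "medium": 2, "high": 3, "critical": 4}

def score_workbook(ratings):
    """ratings: {dimension: level}, levels drawn from the table above.
    Returns (total_score, verdict). Thresholds are illustrative."""
    total = sum(LEVELS[level] for level in ratings.values())
    worst = max(LEVELS[level] for level in ratings.values())
    if worst == LEVELS["critical"]:
        verdict = "redesign required"      # any Critical dimension dominates
    elif total >= 2 * len(ratings):
        verdict = "complex rebuild"        # averaging Medium or worse
    else:
        verdict = "straightforward rebuild"
    return total, verdict

result = score_workbook({
    "data_source": "low",
    "custom_code": "high",
    "delivery": "medium",
    "security": "medium",
})
print(result)
```

Scoring every workbook this way (from the inventory query below plus a manual pass on custom code) turns the rubric into a sortable backlog.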
Top migration risks for Tableau:
- LOD expressions — `FIXED`, `INCLUDE`, and `EXCLUDE` logic must be rewritten as SQL. No tooling automates this.
- Embedded deployments — Tableau JS Embedding API v3 has no plug-in replacement. Applications must be refactored.
- Web Data Connectors — Deprecated in Tableau Cloud. Any WDC-based data source needs a new ingestion pipeline before the BI layer can migrate.
- Hyper extracts as the "source of truth" — Some customers have lost access to the original source system. The .hyper file is the only copy of the data. These cases require Hyper API extraction before migration.
- Tribal knowledge in workbooks — Tableau's drag-and-drop model means business logic (LOD expressions, complex filters, calculated fields) lives in workbooks rather than in a governed semantic layer. It must be documented and rebuilt.
- TabPy / Analytics Extensions — Workbooks that call Python or R for in-visualization ML scoring require both the model and the integration to be migrated to Databricks ML endpoints.
Sample inventory query:
```sql
-- Comprehensive workbook inventory for migration scoping
SELECT
  p.name AS project,
  w.name AS workbook_name,
  w.id AS workbook_id,
  u.friendly_name AS owner,
  w.created_at,
  w.updated_at,
  MAX(hv.time) AS last_viewed,
  COUNT(DISTINCT hv.user_id) AS unique_viewers_90d,
  COUNT(DISTINCT hv.id) AS view_events_90d,
  COUNT(DISTINCT ds.id) AS datasource_count,
  STRING_AGG(DISTINCT ds.datasource_type, ', ') AS datasource_types,
  (SELECT COUNT(*) FROM subscriptions s
   JOIN views v2 ON s.view_id = v2.id
   WHERE v2.workbook_id = w.id) AS subscription_count
FROM workbooks w
JOIN projects p ON w.project_id = p.id
JOIN users u ON w.owner_id = u.id
LEFT JOIN hist_views hv
  ON hv.workbook_id = w.id
  AND hv.time > NOW() - INTERVAL '90 days'
LEFT JOIN workbook_datasource_connections wdc ON wdc.workbook_id = w.id
LEFT JOIN datasources ds ON ds.id = wdc.datasource_id
GROUP BY p.name, w.id, w.name, u.friendly_name, w.created_at, w.updated_at
ORDER BY view_events_90d DESC NULLS LAST;
```
## 12. Migration Mapping to Databricks

### Core Mapping Tables
Report types:
| Tableau artifact | Databricks / partner BI equivalent |
|---|---|
| Dashboard (multiple sheets) | Databricks SQL dashboard / Lakeview dashboard / Power BI report / Tableau Cloud (retain) |
| Single sheet / viz | Databricks SQL visualization / Lakeview chart |
| Story (guided analytics) | Databricks SQL dashboard with narrative text / Power BI report |
| Tableau Pulse (NL query) | Databricks Genie |
| Prep flow output | Delta Live Tables pipeline / dbt model |
Data sources:
| Tableau data source | Databricks equivalent |
|---|---|
| Published data source (.tds) | Unity Catalog table / view with Databricks SQL warehouse connection |
| Embedded extract (.hyper) | Delta table (data re-ingested from original source) |
| Live connection to Databricks | Retain — redirect to Databricks SQL warehouse |
| Live connection to other SQL | Redirect to Databricks SQL warehouse after data migration |
| Web Data Connector | Databricks ingestion pipeline (Autoloader / partner ETL) |
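As a sketch of the ".hyper extract" row above: the extract file itself is not migrated. Instead, re-ingest the data from the original source into a governed Delta table. All connection, catalog, and table names below are hypothetical; the example assumes Lakehouse Federation to a PostgreSQL source and secrets stored in a scope named `mig`:

```sql
-- Federate to the original source system (hypothetical names throughout).
CREATE CONNECTION IF NOT EXISTS pg_sales
  TYPE postgresql
  OPTIONS (
    host 'pg.example.com',
    port '5432',
    user secret('mig', 'pg_user'),
    password secret('mig', 'pg_pw')
  );

CREATE FOREIGN CATALOG IF NOT EXISTS source_pg
  USING CONNECTION pg_sales
  OPTIONS (database 'sales');

-- Materialize as a Delta table in Unity Catalog, replacing the embedded extract.
CREATE OR REPLACE TABLE main.analytics.orders AS
SELECT * FROM source_pg.public.orders;
```

If the original source is no longer accessible (the "Hyper extract trap" in the cheat sheet below), the .hyper file must first be exported to CSV/Parquet via Tableau's Hyper API before any of this applies.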
Dataset queries:
| Tableau pattern | Databricks equivalent |
|---|---|
| Simple SELECT from table | Databricks SQL query / Unity Catalog view |
| Custom SQL in data source | Databricks SQL view or dbt model |
| Calculated field | SQL column expression / dbt metric |
| LOD expression (FIXED) | SQL subquery / window function in dbt |
| Table calculation | Databricks SQL window function |
| Stored procedure call | Databricks SQL stored procedure / dbt macro |
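To illustrate the LOD row above: a Tableau FIXED expression such as `{ FIXED [Customer ID] : SUM([Sales]) }` can be sketched as a Databricks SQL window function. Table and column names here are hypothetical:

```sql
-- Tableau calculated field: { FIXED [Customer ID] : SUM([Sales]) }
-- The window computes the per-customer total at a granularity independent
-- of the other columns in the query, mirroring FIXED semantics.
SELECT
  order_id,
  customer_id,
  sales,
  SUM(sales) OVER (PARTITION BY customer_id) AS customer_total_sales,
  sales / SUM(sales) OVER (PARTITION BY customer_id) AS pct_of_customer_total
FROM main.analytics.orders;
```

One caveat when rebuilding: FIXED LODs in Tableau ignore most viz-level filters (only context filters apply before them), whereas a window function is evaluated after the query's WHERE clause, so filter placement must be validated case by case.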
Security:
| Tableau security | Databricks equivalent |
|---|---|
| USERNAME() row filter | Unity Catalog row filter (current_user() predicate) |
| Entitlement table join | Unity Catalog row filter with entitlement table in catalog |
| Published data source filter | Unity Catalog row filter on underlying table |
| Hidden columns (no native CLS) | Unity Catalog column masks |
| Project-level permissions | Databricks SQL workspace permissions + UC catalog/schema grants |
| Group-based content access | Databricks groups + Unity Catalog grants |
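The first two security rows above can be sketched with Unity Catalog DDL. This is a minimal example assuming a hypothetical entitlement table `main.security.entitlements` with `user_email` and `region` columns, and a PII group named `pii_readers`:

```sql
-- Entitlement-table row filter: replaces a Tableau USERNAME() join filter.
CREATE OR REPLACE FUNCTION main.security.region_filter(region STRING)
RETURN EXISTS (
  SELECT 1
  FROM main.security.entitlements e
  WHERE e.user_email = current_user()        -- analogous to Tableau USERNAME()
    AND e.region = region_filter.region
);

ALTER TABLE main.analytics.orders
  SET ROW FILTER main.security.region_filter ON (region);

-- Column mask: replaces Tableau's "hidden column" workaround with real CLS.
CREATE OR REPLACE FUNCTION main.security.mask_email(email STRING)
RETURN CASE
  WHEN is_account_group_member('pii_readers') THEN email
  ELSE '***'
END;

ALTER TABLE main.analytics.customers
  ALTER COLUMN email SET MASK main.security.mask_email;
```

Unlike Tableau's workbook-level filters, these policies are enforced on the table for every consumer (BI tool, notebook, or API), which is usually a governance improvement worth highlighting to the customer.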
Scheduling:
| Tableau scheduling | Databricks / partner BI equivalent |
|---|---|
| Extract refresh schedule | Databricks Workflows (trigger SQL or dbt job) |
| Subscription (email delivery) | Partner BI native subscription / Databricks notification |
| API-triggered refresh | Databricks Workflows REST API trigger |
| Prep flow schedule | Databricks Workflows + DLT pipeline |
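One common pattern for the "Extract refresh schedule" row: replace the extract plus its refresh schedule with a scheduled materialized view, which the target BI tool then queries live. Names and aggregation logic below are hypothetical; the cron expression uses Quartz syntax:

```sql
-- Replaces a Tableau .hyper extract + nightly refresh schedule.
CREATE OR REPLACE MATERIALIZED VIEW main.analytics.sales_summary
SCHEDULE CRON '0 0 6 * * ?'   -- daily at 06:00, like a nightly extract refresh
AS
SELECT
  order_date,
  region,
  SUM(sales) AS total_sales
FROM main.analytics.orders
GROUP BY order_date, region;
```

This removes the extract entirely: the BI layer always sees the latest refreshed result, and refresh monitoring moves to Databricks rather than the Tableau Backgrounder.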
Portal and browsing:
| Tableau | Databricks / partner BI equivalent |
|---|---|
| Tableau Server / Cloud portal | Databricks SQL workspace / Power BI workspace / Tableau Cloud |
| Tableau Projects (content folders) | Databricks SQL folders / Power BI workspaces |
| Tableau favorites | Partner BI native bookmark/favorite |
Embedded integration:
| Tableau | Databricks / partner BI equivalent |
|---|---|
| Tableau JS Embedding API v3 | Power BI Embedded / Looker Embed / Databricks SQL REST API + custom UI |
| Tableau Connected Apps (auth) | Power BI service principal / Looker API token |
| Tableau REST API image export | Partner BI REST API / Databricks SQL REST API |
Governance:
| Tableau | Databricks equivalent |
|---|---|
| Tableau Catalog lineage | Unity Catalog lineage |
| Tableau data quality warnings | Unity Catalog data quality tags |
| Tableau site admin permissions | Databricks workspace admin |
| Tableau row-level security | Unity Catalog row filters |
What Doesn't Map Cleanly¶
| Tableau capability | Why it doesn't map | SA recommendation |
|---|---|---|
| LOD Expressions (FIXED / INCLUDE / EXCLUDE) | These compute aggregations at a granularity independent of the viz. No other mainstream BI tool has a direct equivalent; the logic must become SQL window functions or subqueries. | Treat LOD-heavy workbooks as full rebuilds, not migrations. Budget analyst time to reconstruct the intent in dbt or Databricks SQL. |
| Tableau JS Embedding API v3 | Tableau's embedding API is proprietary and tightly coupled to Tableau Server/Cloud authentication. There is no drop-in SDK replacement. | Make an explicit embed strategy decision (Power BI Embedded, Looker Embed, or custom UI on Databricks SQL REST API) before starting the migration. |
| Analytics Extensions (TabPy / Rserve) | These execute Python or R inside a Tableau viz at render time — a tight coupling between viz rendering and model scoring that no other BI tool replicates. | Decouple the ML scoring: run models in Databricks Model Serving or a batch scoring job, write scores to a Delta table, and consume the results as a regular column in the target BI tool. |
| Web Data Connectors (WDC) | Custom JavaScript connectors that pull from REST APIs at viz load time. They are deprecated in Tableau Cloud and have no equivalent in Databricks SQL or most partner BI tools. | Build proper ingestion pipelines in Databricks (Autoloader, partner ETL) before migrating the BI layer. WDC workbooks cannot be migrated until the data source is replaced. |
| Tableau Stories | A guided, slide-based narrative format that sequences vizzes with annotations. No equivalent format exists in Databricks SQL, Power BI, or Looker. | Rebuild as a dashboard with text objects and navigation actions, or accept that the narrative format is lost and deliver a standard dashboard. |
| Data-driven subscriptions with custom views | Users can save personal filtered views and receive subscriptions scoped to their saved view. This combines personalization + delivery in a way that partner BI tools handle inconsistently. | Map to Power BI data-driven subscriptions (if target is Power BI) or accept a simplified subscription model and communicate the gap to stakeholders early. |
13. Quick-Reference Cheat Sheet¶
| Topic | Key fact | SA question to ask |
|---|---|---|
| Report count heuristic | Tableau estates typically contain 3–5x more workbooks than are actively used; a 90-day usage filter is essential | "Can we get read-only access to the PostgreSQL repository, or will you run an inventory query for us?" |
| Where usage data lives | PostgreSQL Repository (hist_views table) on Tableau Server; REST API /views endpoint for Tableau Cloud | "Is the Tableau Server repository accessible to your DBA for a query?" |
| Biggest migration risk | LOD expressions (FIXED/INCLUDE/EXCLUDE) — must be rewritten as SQL, no automation | "Do your analysts use LOD expressions in their workbooks?" |
| Second biggest risk | Embedded deployments via Tableau JS Embedding API — requires application redesign | "Are any Tableau views embedded in internal or customer-facing apps?" |
| Hyper extract trap | Some customers have no access to original source system — .hyper file is the only data copy | "For extract-backed workbooks, can you confirm the source system is still accessible?" |
| Source control gap | Tableau has no native Git integration — the server may not have the latest version of a workbook | "Do analysts publish directly from Desktop, or is there a source-controlled project structure?" |
| TabPy / Analytics Extensions | Any workbook calling TabPy/Rserve couples ML scoring to viz rendering — must be decoupled | "Do any workbooks use Analytics Extensions (TabPy, Rserve)?" |
| Web Data Connectors | WDC data sources are deprecated in Tableau Cloud — blockers if customer is mid-cloud-migration | "Are any data sources built on Web Data Connectors?" |
| RLS approach | Most commonly USERNAME() filter or entitlement table join — both require rebuild as Unity Catalog row filters | "How is row-level security enforced — username functions in workbooks, or at the data source level?" |
| Tableau Catalog / lineage | Only available with Data Management add-on license — many customers don't have it | "Is Tableau Catalog licensed? If not, lineage will need to be inferred from workbook XML parsing." |