ETL Tools

A reference library for Solution Architects working on ETL migration and modernization engagements. Each tool page is designed to help you quickly understand the building blocks, artifacts, and orchestration model of an ETL platform — so you can assess scope, map concepts, and guide customers toward a modern data stack on Databricks.


Why This Exists

Enterprise ETL migrations are complex. Customers have years of logic embedded in tools like Ab Initio, Informatica, or Talend — and they need help understanding what they have, how it fits together, and how to move it. This section helps SAs answer:

  • What are the core artifacts in this tool and how are they organized?
  • How do orchestration and scheduling work?
  • How do I inventory and scope a migration engagement?
  • How do the tool's concepts map to Databricks equivalents?

Available Tools

Tool                      Description
Ab Initio                 Enterprise parallel ETL — building blocks, orchestration, and Databricks migration mapping
Talend                    Open-core Java ETL — Jobs, tMap logic, orchestration, and Databricks migration mapping
IBM DataStage             IBM enterprise ETL — APT parallel engine, Sequences, xMeta inventory, and Databricks migration mapping
Informatica               Market-leading enterprise ETL — mappings, sessions, workflows, IDQ, and Databricks migration mapping
Informatica BDM           Informatica Big Data Management — Spark/Hive execution, mappings, mapplets, IDQ, and Databricks migration mapping
Matillion                 Cloud-native ELT — Transformation/Orchestration pipelines, push-down SQL, and Databricks migration mapping
SSIS                      Microsoft SQL Server Integration Services — packages, Control Flow, Data Flow, SQL Agent orchestration, and Databricks migration mapping
Pentaho                   Open-core Kettle ETL — Transformations, Jobs, Carte clustering, repository inventory, and Databricks migration mapping
Oracle Data Integrator    Oracle ELT platform — Mappings, Knowledge Modules, Load Plans, repository inventory, and Databricks migration mapping

More tools will be added over time. Each page follows the same structure: ecosystem overview → building blocks → orchestration → migration assessment → Databricks mapping.