An introduction to Talend Open Studio for Big Data


One of my favorite things about being a developer is the fact that we have so many free resources available to help us do our jobs. One of these is Talend Open Studio (TOS), and in this first blog post I wanted to introduce this amazing little tool and talk a bit about how we use it at Kenosis Coders.

If you are not familiar, Talend Open Studio for Big Data is an Extract-Transform-Load (ETL) tool. This type of software extracts data from a source such as a flat file, database, or application like Salesforce, transforms that data, and then loads it into a target system.
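
To make the three ETL stages concrete, here is a minimal sketch in plain Java. It is not code generated by Talend, and every name in it (EtlSketch, extract, transform, load) is made up for illustration. It assumes a two-column "name,email" input and uses an in-memory list as a stand-in for the target system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative only: hypothetical names, not Talend-generated code.
class EtlSketch {

    // Extract: split each non-blank "name,email" line into fields.
    // The list of lines stands in for a flat file.
    static List<String[]> extract(List<String> lines) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(",", -1);
            if (fields.length >= 2) {
                rows.add(fields); // keep only rows that have both columns
            }
        }
        return rows;
    }

    // Transform: trim whitespace and lower-case the email column.
    static List<String[]> transform(List<String[]> rows) {
        List<String[]> cleaned = new ArrayList<>();
        for (String[] row : rows) {
            String name = row[0].trim();
            String email = row[1].trim().toLowerCase(Locale.ROOT);
            cleaned.add(new String[] { name, email });
        }
        return cleaned;
    }

    // Load: append each row to the target and return the number of rows written.
    // The target list stands in for a database table or a Salesforce object.
    static int load(List<String[]> rows, List<String> target) {
        for (String[] row : rows) {
            target.add(row[0] + "|" + row[1]);
        }
        return rows.size();
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(" Ada ,ADA@Example.com", "", "Bob,bob@example.com");
        List<String> target = new ArrayList<>();
        int loaded = load(transform(extract(source)), target);
        System.out.println(loaded + " rows loaded: " + target);
    }
}
```

In a real TOS job, each of these stages would be a component on the design canvas rather than a hand-written method.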


Talend Open Studio can be installed on Windows, Mac, or Linux, and the latest version we have run is 6.4. While it can be a little tricky to install, once you get it going it is a powerhouse of a tool. Building an integration typically consists of defining your input and output sources, validating your connection to them, and then creating a job to pull all the components together.


Perhaps the greatest feature of the software is that it is so customizable. TOS is a Java code generator, and as such you can ‘dig under the hood’ anytime to troubleshoot a job, or to add custom functionality that might not otherwise be easily achieved through the interface.
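
As a rough idea of what that custom functionality can look like, here is a small helper written as a plain Java class with a public static method, which is the general shape Talend uses for reusable custom code. The class name PhoneRoutines and the method normalizePhone are hypothetical, and this is a sketch rather than anything taken from one of our jobs.

```java
// Hypothetical helper: a plain Java class exposing a public static method.
class PhoneRoutines {

    // Keeps only the digits of a phone number.
    // Returns null when the input is null or contains no digits.
    public static String normalizePhone(String raw) {
        if (raw == null) {
            return null;
        }
        String digits = raw.replaceAll("[^0-9]", "");
        return digits.isEmpty() ? null : digits;
    }

    public static void main(String[] args) {
        System.out.println(normalizePhone("(555) 123-4567")); // prints 5551234567
    }
}
```

Inside a job, a method like this could then be called from a component's expression field, for example PhoneRoutines.normalizePhone(row1.Phone), where row1.Phone is an assumed column name.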

TOS is also compatible with Git for version control, and we have been able to compile a job into a JAR file and run it via a scheduled process on Heroku. We have created quick jobs that could just as easily have been done with Salesforce Data Loader, and we have performed weekend migrations in which we loaded several million records from flat files into a Salesforce org.

Next Steps

TOS can be downloaded from the Talend website. Talend also offers a paid version that seems to have some great features, and it may be worth checking out depending on your organization’s needs.

If you would like to read more about Talend, this Guru99 article has some good information.

Well, that’s it for this time around. Next week’s post is going to be on one of the other tools we use here at the shop, the IntelliJ Java IDE and the Illuminated Cloud plugin for interacting with Salesforce. We’ll compare it to some other tools we’ve used in the past. Also, coming up soon is a post on Salesforce DX.