As explained in the presentation article, Stambia can work with Hadoop distributions and perform operations on their various technologies.
To do so, you first have to install Stambia's Hadoop connector and prepare your Designer and environment.
This article gives the explanations and instructions needed to install and set up everything so that you are ready to work with Hadoop.
Prerequisites:
- Stambia DI Designer S18.3.6 or higher
- Stambia DI Runtime S17.4.6 or higher
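The version prerequisites above can be checked with a small script. This is only a minimal sketch: it assumes version labels follow the "S<major>.<minor>.<patch>" pattern shown in the list, and the names `parse_version` and `meets_minimum` are hypothetical helpers, not part of Stambia.

```python
import re

# Minimum versions taken from the prerequisites list.
MIN_DESIGNER = "S18.3.6"
MIN_RUNTIME = "S17.4.6"


def parse_version(text):
    """Turn a label such as 'S18.3.6' into a tuple of integers (18, 3, 6).

    Assumption: the label starts with 'S' followed by three dot-separated
    numbers. Anything after the third number is ignored.
    """
    match = re.match(r"S(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"Unrecognized version label: {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(installed, minimum):
    """Return True when the installed version is equal to or higher than the minimum."""
    # Tuples compare number by number, so 17.10.0 is correctly higher than 17.4.6.
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    print(meets_minimum("S18.3.6", MIN_DESIGNER))
    print(meets_minimum("S17.4.5", MIN_RUNTIME))
```

Comparing the parsed numbers rather than the raw strings avoids ranking "S17.10.0" below "S17.4.6".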
Installation
Introduction
We strongly advise reading and carefully following the installation instructions when you start using Hadoop with Stambia.
Make sure everything is set up correctly and you don't miss any of the steps presented below.
When the installation is finished, we suggest having a look at the dedicated articles for each technology to learn how to use and configure them.
The installation is composed of the following steps:
- Download of the connector, templates, and libraries
- Installation of the Stambia connector
- Installation of the Hadoop libraries
- Installation of the generic and Hadoop templates
- Configuration of the JDBC Drivers
Download
The first step is to download all the necessary materials that you'll need to perform the installation.
Go to the presentation article and retrieve the following items from the download section:
- Stambia's Hadoop Connector plugin
- Generic templates
- Hadoop templates
- Additional libraries
Connector Installation
Once you have downloaded the Hadoop Connector, install the plugin as usual.
Refer to the article on installing plugins in the Designer for the detailed procedure.
The Hadoop plugin is installed the same way as any other additional plugin.
When the installation is finished, start the Designer and check in the Installation Details that the plugin is correctly installed.
Click on Help > About Stambia Designer > Installation Details > Plug-ins tab.
You should find the Hadoop plugin in the list.
Hadoop additional libraries
Once the connector is installed, the next step is to install the additional libraries.
The required Stambia libraries are all deployed by the plugin installed previously, but you also need to install the Hadoop libraries, which contain the APIs required to communicate with Hadoop technologies.
The exact list of libraries may depend on the Hadoop distribution and version you are using, such as Cloudera, Hortonworks, or MapR.
We therefore cannot currently provide a complete list for each of them, but we provide a standard package of libraries that should work in most cases.
You can find it in the download section.
All the additional libraries must be copied under the following path in your Designer:
<Designer Installation Directory>/stambiaRuntime/lib/addons/hadoop
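The copy step can also be scripted. The following is a minimal sketch, assuming the additional libraries have already been extracted as .jar files into a single folder. The function name `copy_hadoop_libraries` and both of its arguments are hypothetical, to be adapted to your installation.

```python
import shutil
from pathlib import Path


def copy_hadoop_libraries(source_dir, designer_dir):
    """Copy every .jar file from source_dir into the Designer's Hadoop addons folder.

    Returns the sorted list of copied file names.
    """
    designer = Path(designer_dir)
    if not designer.is_dir():
        raise FileNotFoundError(f"Designer directory not found: {designer}")

    # Path given in this article, relative to the Designer installation directory.
    target = designer / "stambiaRuntime" / "lib" / "addons" / "hadoop"
    # Assumption: creating the folder when it is missing is acceptable.
    target.mkdir(parents=True, exist_ok=True)

    copied = []
    for jar in sorted(Path(source_dir).glob("*.jar")):
        shutil.copy2(jar, target / jar.name)
        copied.append(jar.name)
    return copied
```

The returned list can be printed to confirm which libraries ended up in the folder. Files other than .jar are left out on purpose.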
Templates Installation
The next step consists of importing the generic and Hadoop templates into your workspace.
Start your Designer and import them into your workspace as usual.
JDBC Drivers Configuration
The final part is the configuration of the JDBC Drivers.
This configuration is required if you want to connect to, reverse, or consult data through the Designer on technologies such as Hive, Impala, or HBase.
The entries in the Designer are already pre-configured; all you have to do is edit them to add all the Hadoop libraries installed previously.
Open Window > Preferences > SQL Explorer > JDBC Drivers, select "Hive" for instance in the list, and click on Edit.
Then, go to the "Extra Class Path" tab and click on "Add JARs".
Use the dialog to select all the Hadoop libraries you placed into <Designer Installation Directory>/stambiaRuntime/lib/addons/hadoop.
Then, click on OK to save your modifications.
The screenshot shows only four of the added libraries as an example; you should add all the Hadoop libraries here.
Repeat this operation for each of the following Hadoop technologies that you plan to use: Hive, Impala, HBase.
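Since every Hadoop library has to be added for each driver, it is easy to miss one. The sketch below compares the content of the addons folder with a list of class path entries that you have noted down by hand from the "Extra Class Path" tab. It does not read the Designer preferences itself, and the function name `missing_from_class_path` is hypothetical.

```python
from pathlib import Path, PureWindowsPath


def missing_from_class_path(designer_dir, class_path_entries):
    """Return the names of the .jar files present in the Hadoop addons folder
    that do not appear in the given class path entries."""
    addons = Path(designer_dir) / "stambiaRuntime" / "lib" / "addons" / "hadoop"
    installed = {jar.name for jar in addons.glob("*.jar")}

    # PureWindowsPath splits on both "/" and "\", so entries copied from a
    # Windows or a Unix Designer are reduced to their file name the same way.
    configured = {PureWindowsPath(entry).name for entry in class_path_entries}

    return sorted(installed - configured)
```

An empty result means every library in the folder is listed for that driver. Run the comparison once per driver you configured (Hive, Impala, HBase).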
Troubleshooting
If you have any issue during the installation, do not hesitate to contact us.
We can help and guide you to perform this installation.
Next steps
Demonstration Project and articles
A demonstration project presenting common and advanced usage of Hadoop technologies in Stambia can be found on the download page.
You can download and import it in your workspace and then have a look at the Mappings and Processes examples.
It is a good way to familiarize yourself with its usage and to see how the Metadata are configured, for instance.
We also advise referring to the articles dedicated to each technology, which explain the basics of each one.
Technology | Getting started |
HDFS | Getting started article |
Hive | Getting started article |
HBase | Getting started article |
Impala | Getting started article |
Sqoop | Getting started article |