Docker - Hop Server Container Setup


Here I describe my setup of the Docker Apache Hop Server container. Apache Hop is an open source data integration platform and a fork of Pentaho Data Integration. In recent years I have enjoyed using Pentaho both at home and professionally and I’m very excited about Hop.

For those who know Pentaho, much will be familiar. You can read about the differences (and similarities) here.

Setup

I will use a long-lived Apache Hop server container to experiment with. With this container setup it is possible to run pipelines and workflows remotely, but also to run web services using a project and environments. Thanks to the Hop team by the way.

IMPORTANT: If you want to use projects and environments, make sure to use the Hop Server container with version 1.2.0 or higher. At the time of writing, 1.2.0 is still under development and can be tested by using the image tag apache/hop:Development.

I run this Docker container on an Ubuntu VM that runs on Proxmox VE. I perform the following steps on the CLI.

Docker pull command:

docker pull apache/hop

Within my home folder I have created a Docker folder where I create a subfolder for each container:

cd ~
mkdir -p docker/hop
cd docker/hop

Also create the folders where the Hop files can be stored:

mkdir -p data/env
mkdir -p data/metadata
mkdir -p data/jdbc
mkdir -p data/log
mkdir -p data/projects

I use the Nano text editor to create a shell script:

nano hop_run.sh

With this shell script we are going to create the container. Copy the following into hop_run.sh:

docker run -d \
 --name=hop \
 --hostname=hop \
 -p 8182:8182 \
 -v "$PWD/data:/files" \
 -e TZ=Europe/Amsterdam \
 -e HOP_SERVER_USER="admin" \
 -e HOP_SERVER_PASS="admin" \
 -e HOP_SERVER_PORT=8182 \
 -e HOP_SERVER_HOSTNAME=0.0.0.0 \
 -e HOP_PROJECT_NAME="PROJECT_NAME" \
 -e HOP_PROJECT_FOLDER="/files/projects/PROJECT_FOLDER_NAME" \
 -e HOP_ENVIRONMENT_NAME="ENV_NAME" \
 -e HOP_ENVIRONMENT_CONFIG_FILE_NAME_PATHS="/files/env/ENV_PATH.json" \
 -e HOP_SERVER_METADATA_FOLDER="METADATA_LOCATION" \
 -e HOP_SHARED_JDBC_FOLDER="/files/jdbc" \
 -e HOP_LOG_PATH="/files/log/hop.err.log" \
 --restart unless-stopped \
 apache/hop

If necessary, adjust the following:

-p 8182:8182
Choose a host port that is still available. You can check this with netstat. The first number is the port on the Docker host, the second is the port inside the container.

-v "$PWD/data:/files"
Choose the location for the Hop files. In this example it is the data folder we created. This can also be a shared folder.

-e TZ=Europe/Amsterdam
Pick the right timezone.

-e HOP_SERVER_USER="admin"
Replace admin with your username.

-e HOP_SERVER_PASS="admin"
Replace admin with your password.

-e HOP_SERVER_PORT=8182
Replace with the port you set in the -p mapping above. It has to match the container side of the mapping (the second number).

-e HOP_PROJECT_NAME="PROJECT_NAME"
Replace PROJECT_NAME with the name of your project. If you only use Hop Server to run pipelines or workflows remotely, leave out this Docker environment variable.

-e HOP_PROJECT_FOLDER="/files/projects/PROJECT_FOLDER_NAME"
Replace PROJECT_FOLDER_NAME with the name of your project folder. If you only use Hop Server to run pipelines or workflows remotely, leave out this Docker environment variable.

-e HOP_ENVIRONMENT_NAME="ENV_NAME"
Replace ENV_NAME with the name of your Hop environment, for example development-config. If you only use Hop Server to run pipelines or workflows remotely, leave out this Docker environment variable.

-e HOP_ENVIRONMENT_CONFIG_FILE_NAME_PATHS="/files/env/ENV_PATH.json"
Replace ENV_PATH with the file name of your Hop environment configuration, so that the full path becomes, for example, /files/env/hop-server-test-development-config.json. If you only use Hop Server to run pipelines or workflows remotely, leave out this Docker environment variable.

-e HOP_SERVER_METADATA_FOLDER="METADATA_LOCATION"
Choose the location for the metadata. If you only use Hop Server to run pipelines or workflows remotely, this can be /files/metadata, but when using a project this could be /files/projects/PROJECT_FOLDER_NAME/metadata.

-e HOP_SHARED_JDBC_FOLDER="/files/jdbc"
Choose the location of the JDBC drivers that are not included by default.

-e HOP_LOG_PATH="/files/log/hop.err.log"
Replace with the location of the error log. This environment variable is not necessary, but I like to be able to easily consult the log files via a shared folder.
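
Several of the items above say to leave out the project and environment variables when the server only runs pipelines and workflows remotely. As a rough sketch, hop_run.sh could then look like the following. The function name hop_run_remote_only is made up for this example, and /files/metadata is assumed as the metadata folder. All other values are taken from the script above.

```shell
#!/bin/sh
# Sketch: remote-execution-only variant of hop_run.sh.
# The HOP_PROJECT_* and HOP_ENVIRONMENT_* variables are left out on purpose.
# hop_run_remote_only is a hypothetical helper name, not part of Hop or Docker.
hop_run_remote_only() {
  docker run -d \
    --name=hop \
    --hostname=hop \
    -p 8182:8182 \
    -v "$PWD/data:/files" \
    -e TZ=Europe/Amsterdam \
    -e HOP_SERVER_USER="admin" \
    -e HOP_SERVER_PASS="admin" \
    -e HOP_SERVER_PORT=8182 \
    -e HOP_SERVER_HOSTNAME=0.0.0.0 \
    -e HOP_SERVER_METADATA_FOLDER="/files/metadata" \
    -e HOP_SHARED_JDBC_FOLDER="/files/jdbc" \
    -e HOP_LOG_PATH="/files/log/hop.err.log" \
    --restart unless-stopped \
    apache/hop
}
```

To actually create the container, add a line containing only hop_run_remote_only at the end of the script.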

Exit Nano (CTRL-X) and save the changes.
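
Before creating the container it can be worth confirming that the chosen host port is really free. The sketch below is one way to do that with netstat. The helper name port_listed is invented for this example, and it assumes the local address is in the fourth column, which is the case for the output of netstat -tln and ss -ltn on Linux.

```shell
#!/bin/sh
# port_listed PORT: succeed if PORT shows up as a local listening port in
# `netstat -tln` or `ss -ltn` style output read from standard input.
# port_listed is a hypothetical helper name used for illustration only.
port_listed() {
  awk -v port="$1" '
    # Column 4 holds the local address, e.g. 0.0.0.0:8182 or :::8182
    $4 ~ (":" port "$") { found = 1 }
    END { exit found ? 0 : 1 }
  '
}
```

For example, netstat -tln | port_listed 8182 && echo "port 8182 is already in use" prints the message only when something is already listening on 8182. If netstat is not installed, ss -ltn can be piped in instead.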

Now create the container:

sudo sh hop_run.sh

Check if the container is running properly.
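
One possible way to do this check from the CLI is to ask Docker for the container state. The helper name hop_container_status below is made up for this sketch. It relies on docker inspect with a Go template, and it prints "missing" when Docker does not know the container.

```shell
#!/bin/sh
# hop_container_status NAME: print the state Docker reports for the container
# (for example "running" or "exited"), or "missing" if inspect fails.
# hop_container_status is a hypothetical helper name for this example.
hop_container_status() {
  if status=$(docker inspect --format '{{.State.Status}}' "$1" 2>/dev/null); then
    echo "$status"
  else
    echo "missing"
  fi
}
```

With the container name from the script, hop_container_status hop should print running. If it prints something else, docker logs hop usually shows why the server did not start.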

See also my notes about updating containers with Portainer or via the CLI. With Synology’s Docker Application, updating a container is also very easy.

Hop Server Status

The Hop server status can now be accessed via the following URL:

http://<IP DOCKER HOST>:8182/hop/status

Log in with your username and password. This will give you an overview of the pipelines and workflows after these are executed through the server.
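
The same check can be scripted, which is handy right after creating or updating the container. This is only a sketch: it assumes the server accepts HTTP basic authentication with the HOP_SERVER_USER and HOP_SERVER_PASS values, and the helper names as well as the shell variables HOP_USER and HOP_PASS are invented for this example.

```shell
#!/bin/sh
# hop_status_url HOST [PORT]: build the status URL shown above.
# PORT defaults to 8182, the port used in the example script.
hop_status_url() {
  printf 'http://%s:%s/hop/status\n' "$1" "${2:-8182}"
}

# hop_status_code HOST [PORT]: print only the HTTP status code of the status page.
# Assumption: basic authentication; HOP_USER and HOP_PASS are hypothetical
# shell variables that default to the admin/admin values from the example.
hop_status_code() {
  curl -s -o /dev/null -w '%{http_code}\n' \
    -u "${HOP_USER:-admin}:${HOP_PASS:-admin}" \
    "$(hop_status_url "$@")"
}
```

Calling hop_status_code with the IP address of the Docker host should normally print 200 when the server is up and the credentials are accepted. A 401 would point to a wrong username or password.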

Hop GUI

Here you can read about the Hop GUI and the Remote Pipeline Engine which you can use in combination with the Hop Server.

Via the Hop GUI you can create a project, environments, pipelines and workflows that you can use to run Hop Server as a web service.

