Camel – Unmarshal Csv to Java POJO

This document shows how to do the following:

  1. Pick up a CSV file from a local folder
  2. Unmarshal it into a list of Java POJOs
  3. Split the list into individual objects
  4. Handle each POJO with custom code

The following is the POJO definition, annotated with Camel Bindy CSV annotations:


package com.zzyan.domain;

import org.apache.camel.dataformat.bindy.annotation.CsvRecord;
import org.apache.camel.dataformat.bindy.annotation.DataField;

@CsvRecord(separator = ",", skipFirstLine = true)
public class Item {

    @DataField(pos=1)
    private String transactionType;

    @DataField(pos=2)
    private String sku;

    @DataField(pos=3)
    private String itemDescrition;

    @DataField(pos=4)
    private String price;

    // Getters and setters are omitted here for brevity

    @Override
    public String toString() {
        return "Item{" +
                "transactionType='" + transactionType + '\'' +
                ", sku='" + sku + '\'' +
                ", itemDescrition='" + itemDescrition + '\'' +
                ", price='" + price + '\'' +
                '}';
    }
}

We also define a custom processor. It takes in one Item object at a time and handles it; here it does nothing more than print the item's description.


package com.zzyan.processor;

import com.zzyan.domain.Item;
import lombok.extern.slf4j.Slf4j;
import org.apache.camel.Exchange;
import org.springframework.stereotype.Component;

@Component
@Slf4j
public class ItemProcessor implements org.apache.camel.Processor {
    @Override
    public void process(Exchange exchange) throws Exception {
        // After the split, each exchange body is a single Item
        Item item = exchange.getIn().getBody(Item.class);
        log.info("Item description: {}", item.getItemDescrition());
    }
}

Finally, we define the Camel route that links these together.


package com.zzyan.route;

import com.zzyan.domain.Item;
import com.zzyan.processor.ItemProcessor;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.dataformat.bindy.csv.BindyCsvDataFormat;
import org.apache.camel.spi.DataFormat;
import org.springframework.stereotype.Component;

//@Component
public class UnmarshalCsvRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {

        DataFormat bindy = new BindyCsvDataFormat(Item.class);

        from("timer:hello?period=10s")
                .log("Timer Invoked and the body is ${body}")
                .pollEnrich("file:c:/data/input?noop=true&readLock=none")
                .log("body: ${body}")
                .unmarshal(bindy)
                .log("The unmarshaled object is ${body}")
                .split(body())
                  .log("new Item ${body}")
                  .process(new ItemProcessor())
                .end();
    }
}

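Put a CSV file whose columns match the POJO's field order into the c:/data/input folder (the header line is skipped) and run the Spring Boot project. The route will pick up the file and perform the transformation. Note that the @Component annotation on UnmarshalCsvRoute is commented out in the listing above; unless the route is registered some other way, it has to be enabled for Spring Boot to load the route.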
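To make the positional mapping more concrete, here is a plain-Java sketch of roughly what the unmarshal and split steps do with the file contents: skip the header line, split each remaining line on the separator, and assign column N to the field declared with pos=N. It does not use Camel or Bindy at all. The class names CsvMappingSketch and ItemRow and the sample rows are made up for illustration, and the naive split ignores quoted fields, which Bindy handles through its own options.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; the real route delegates this work to Bindy.
class CsvMappingSketch {

    // Hypothetical stand-in for the Item POJO, reduced to its four fields.
    static class ItemRow {
        final String transactionType; // corresponds to pos = 1
        final String sku;             // corresponds to pos = 2
        final String itemDescription; // corresponds to pos = 3
        final String price;           // corresponds to pos = 4

        ItemRow(String transactionType, String sku, String itemDescription, String price) {
            this.transactionType = transactionType;
            this.sku = sku;
            this.itemDescription = itemDescription;
            this.price = price;
        }
    }

    // Skips the first line (like skipFirstLine = true) and maps every
    // remaining non-blank line to one ItemRow by column position.
    static List<ItemRow> parse(String csv) {
        List<ItemRow> rows = new ArrayList<>();
        String[] lines = csv.split("\\r?\\n");
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].trim().isEmpty()) {
                continue; // ignore blank lines
            }
            // Naive split: does not handle quoted fields that contain commas.
            String[] cols = lines[i].split(",", -1);
            if (cols.length < 4) {
                throw new IllegalArgumentException("Expected 4 columns on line " + (i + 1));
            }
            rows.add(new ItemRow(cols[0], cols[1], cols[2], cols[3]));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data with a header row followed by two records.
        String csv = "transactionType,sku,itemDescription,price\n"
                + "ADD,100,Samsung TV,500\n"
                + "DELETE,101,LG TV,300\n";
        // This loop plays the role of split(body()): one object at a time.
        for (ItemRow row : parse(csv)) {
            System.out.println(row.sku + " -> " + row.itemDescription);
        }
    }
}
```

In the Camel route, the loop in main corresponds to the split step, and the body of the loop corresponds to the processor.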

 

 

Camel – Fetch file from sftp and unzip

In real-world integrations, especially file-based ones, we normally start by fetching a file from an SFTP server.

It’s common for the file on the SFTP server to be a zip archive with multiple files compressed together.

The following shows how to use a Camel route to fetch a zip file from an SFTP server, unzip it, and write the extracted files to a local folder.

Please refer to “Spring boot + Camel hello world” to see how to set up the project before creating the route.

The code covers both fetching the file from SFTP and fetching it from a local folder (commented out). It also includes a choice() block that filters on the file name and can switch to different sub-routes if required.


package com.zzyan.route;

import org.apache.camel.Predicate;
import org.apache.camel.builder.PredicateBuilder;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.model.dataformat.ZipFileDataFormat;
import org.springframework.stereotype.Component;

import java.util.Iterator;

//@Component
public class UnzipRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {

        // Define a zip file data format; the iterator is needed because our zip file contains multiple files
        ZipFileDataFormat zipFile = new ZipFileDataFormat();
        zipFile.setUsingIterator(true);

        from("timer:hello?period=10s") // Triggered by Timer every 10 seconds
                .log("Timer Invoked and the body is ${body}")
                .pollEnrich("sftp://USERNAME@HOST/FILEPATH?password=PASSWORD&noop=true")
                // Alternatively, pull the file from the local folder c:/data/zipInput
                //.pollEnrich("file:c:/data/zipInput?noop=true&readLock=none")
                .log("${file:name}")
                .unmarshal(zipFile)
                .split(bodyAs(Iterator.class)).streaming()
                  .log("${file:name}")
                  .choice()
                    .when(header("CamelFileName").startsWith("Headers"))
                      .log("This is the Header")
                      .to("file:c:/data/output")
                    .otherwise()
                      .to("file:c:/data/output")
                  .endChoice()
                .end();
    }
}
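
For readers who want to see the idea behind the iterator-based split without Camel, the following plain-Java sketch walks a zip archive entry by entry with java.util.zip and applies the same file-name check as the choice() block. It is only an illustration under assumptions: the class name ZipSplitSketch, the "header"/"other" labels and the in-memory sample archive are invented here, and the real route writes the entries to c:/data/output instead of collecting labels.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Plain-Java illustration only; the route above relies on Camel's zip data format.
class ZipSplitSketch {

    // Visits the archive one entry at a time and records which branch
    // of the choice() each file name would take.
    static List<String> route(byte[] zipBytes) {
        List<String> decisions = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // only files are of interest
                }
                String branch = entry.getName().startsWith("Headers") ? "header" : "other";
                decisions.add(branch + ":" + entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return decisions;
    }

    // Builds a small in-memory zip so the sketch runs without files on disk.
    static byte[] sampleZip(String... names) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            for (String name : names) {
                zout.putNextEntry(new ZipEntry(name));
                zout.write(("content of " + name).getBytes(StandardCharsets.UTF_8));
                zout.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        // Made-up file names; any name starting with "Headers" takes the first branch.
        for (String decision : route(sampleZip("Headers.csv", "Lines.csv"))) {
            System.out.println(decision);
        }
    }
}
```

Reading entries one at a time, rather than extracting everything first, is the same reason the route combines the iterator with streaming().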

 

How to use Python to connect to sftp and download file

We can use pysftp to connect from Python to an SFTP server and download files.

Before that, we need to get the SFTP server’s host key using the following command.

$ ssh-keyscan example.com
# example.com SSH-2.0-OpenSSH_5.3
example.com ssh-rsa AAAAB3NzaC1yc2EAAAADAQAB...

Once you have the key, add it to the file C:/Users/username/.ssh/known_hosts. Alternatively, save the key to a file of your own and load it through the cnopts variable, as shown in the code below.

import pysftp
import io
import csv

myHostname = "HOST"
myUsername = "USERNAME"
myPassword = "PASSWORD"

# CnOpts() loads C:/Users/USERNAME/.ssh/known_hosts by default
cnopts = pysftp.CnOpts()
# Additionally load the host key saved in our own key file
cnopts.hostkeys.load('keys/testSftp.key')

with pysftp.Connection(host=myHostname, username=myUsername, password=myPassword, cnopts=cnopts) as sftp:
    print("Connection successfully established...")
    # Download the remote file into an in-memory buffer instead of onto disk
    flo = io.BytesIO()
    sftp.getfo('/PATH_TO_DATA_FILE', flo)
    fileStr = flo.getvalue()
    textStr = fileStr.decode('UTF-8')
    print(textStr)

If everything works fine, you should see the contents of your file printed on the command line.

How can Java/Groovy connect to Oracle Always Free Autonomous Database

Thanks to Oracle Cloud’s Always Free services, we can now have a free VM and a free Oracle Autonomous Database.

Isn’t that exciting? Naturally, you will want to do something with them.

One constraint of the Always Free Oracle Autonomous Database is that you actually have to use it.

If you don’t use the database for 7 days, Oracle shuts it down and you need to start it again manually. If you then don’t use it for 3 months, Oracle deletes the whole instance.

So, how do we avoid that? In other words, how can we automatically do something in the database to keep it alive?

An easy way is to insert some data into the database periodically.

We have a VM and we have the database, so why not use the VM to insert the data?

So I came up with a “ping” Groovy script that is triggered by a cron job and inserts a dummy record into the database every hour.

Here’s how it works.

First, we install Java and Groovy on the VM, as described in “How to install Java and Groovy in Oracle Always free Ubuntu”.

Second, we download the Oracle Wallet. Then, following Oracle’s documentation on connecting to the database with the JDBC thin driver, we download the required jar files from “Oracle Database 18c (18.3) JDBC Driver and UCP Downloads”. (We need ojdbc8.jar, ucp.jar, oraclepki.jar, osdt_core.jar and osdt_cert.jar.)

Put the downloaded jar files into your Groovy /lib directory.
Put the wallet zip into a folder and unzip it; we will need the unzipped path later in the Groovy script.

Create a table using the following statement:

create table ping(creation_date date)

Third, we create the Groovy script as follows:

import groovy.sql.Sql

// Note: the wallet path must point to the unzipped wallet folder
// Replace DBNAME, USERNAME and PASSWORD with your own values
url = "jdbc:oracle:thin:@DBNAME_high?TNS_ADMIN=PATH_TO_THE_Wallet/Wallet_DBNAME"
username = "USERNAME"
password = "PASSWORD"

println url

driver = "oracle.jdbc.driver.OracleDriver"

sql = Sql.newInstance(url, username, password, driver)
try {
  // Insert a dummy row so the database registers some activity
  sql.execute("insert into ping(creation_date) values (sysdate)")
} finally {
  sql.close()
}
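
Since the title mentions Java as well, here is a rough plain-JDBC equivalent of the Groovy script. It is a sketch under assumptions rather than a tested setup: the class name PingDbSketch and the command-line argument order are invented for this example, and it presumes the same Oracle jar files listed above are on the classpath and that the ping table already exists. It uses the same URL shape as the Groovy script.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical Java counterpart of the Groovy ping script.
class PingDbSketch {

    // Same URL shape as in the Groovy script: TNS alias plus the
    // directory of the unzipped wallet passed as TNS_ADMIN.
    static String buildUrl(String tnsAlias, String walletDir) {
        return "jdbc:oracle:thin:@" + tnsAlias + "?TNS_ADMIN=" + walletDir;
    }

    // Opens a connection, inserts one dummy row, and closes everything again.
    // JDBC connections start in auto-commit mode, so no explicit commit is issued.
    static void ping(String url, String user, String password) throws SQLException {
        try (Connection con = DriverManager.getConnection(url, user, password);
             Statement st = con.createStatement()) {
            st.executeUpdate("insert into ping(creation_date) values (sysdate)");
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length != 4) {
            // Without arguments the sketch only prints its usage and exits.
            System.out.println("usage: PingDbSketch <tnsAlias> <walletDir> <username> <password>");
            return;
        }
        ping(buildUrl(args[0], args[1]), args[2], args[3]);
    }
}
```

Called with DBNAME_high and the wallet folder as the first two arguments, buildUrl produces the same connection string as the url variable in the Groovy script.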

Finally, we add a cron job to the crontab so that Linux triggers the Groovy script every hour:

 0 */1 * * * /home/ubuntu/.sdkman/candidates/groovy/current/bin/groovy /home/ubuntu/groovy/PingDB.groovy 

If everything works fine, you should see the ping table being populated every hour.

Now you can enjoy a free database and VM that never get turned off.

How to install Java and Groovy in Oracle Always free Ubuntu

Refer to the following pages for installing Java and Groovy on Ubuntu:

https://computingforgeeks.com/how-to-install-apache-groovy-on-ubuntu-18-04-ubuntu-16-04/

http://groovy-lang.org/install.html

Install Java on Ubuntu
First, upgrade your Ubuntu packages to the latest versions:
sudo apt update
sudo apt -y upgrade
sudo reboot
sudo apt-get install default-jdk
The JDK will be installed under /usr/lib/jvm

After this, we need to set the JAVA_HOME environment variable.
Go to the /etc folder.

Edit the “environment” file and add a new line for JAVA_HOME.
Use sudo vi environment to open and edit the file, since this folder requires root access.

JAVA_HOME=/usr/lib/jvm/default-java

After saving the file, use source to load the new setting, then double-check it:
source environment
echo $JAVA_HOME

Install Groovy
Before installing Groovy, we need to install the zip and unzip tools:
sudo apt-get install -y zip
sudo apt-get install -y unzip

After this, we follow the official Groovy documentation to install Groovy:
http://groovy-lang.org/install.html

We need to use sudo to install SDKMAN:
sudo curl -s get.sdkman.io | sudo bash

We need to switch to root to run sdkman-init.sh:
sudo -s
source "$HOME/.sdkman/bin/sdkman-init.sh"

Now install Groovy as root:
sudo -s
sdk install groovy
groovy -version

Groovy is installed under the following directory:
/home/ubuntu/.sdkman/candidates/groovy/current/bin

Add this directory to PATH in /etc/environment so that the groovy command can be found.

Now we can deploy the Groovy script to the Ubuntu server and try running it.