Spring Batch

Content:

  1. Introduction
  2. What is it?
  3. Examples
  4. Basic - flat files and database
    1. XML flat files
    2. JSON flat files
    3. CSV flat files
    4. Fixed width flat files
  5. Flow control
    1. Multiple input files combined to single output file
    2. Single input file copied to multiple output files each output file getting all records
    3. Single input file split to multiple output files each record only going to one output file
  6. Error handling
    1. Skip in case of error
    2. Retry in case of error
  7. Final remarks

Introduction:

This is about Spring Batch. I am usually not a big fan of Spring products, but I decided to make an exception for Spring Batch.

Readers are expected to know Java, Spring DI and a little about data processing.

For a refresher on DI go here.

Some code will be repeated, but I find it better for the reader to have all the code related to one example in one place instead of having to find some of the code in another example.

What is it?

There are fundamentally two types of server applications:

interactive
    application started by and interacting with user during run
batch
    application started automatically and running without user interaction

There are two aspects of batch jobs:

management of jobs
    schedule start of jobs, monitor running jobs etc.
the jobs themselves
    the data processing that takes place in the jobs

Despite its name, Spring Batch is a framework for the second aspect: the jobs themselves.

The Spring suite has other products for scheduling jobs, and there is another well-known product, Quartz, dedicated to that.

Spring Batch is a framework that allows for declarative orchestration of many standard operations and comes with a number of standard components for common operations.

The key aspects of Spring Batch are:

Examples:

Examples will cover:

The examples are pretty trivial, but a lot of batch data processing is trivial.

Spring Batch jobs today are often configured using annotations and fluent Java code. But I have chosen to show configuration via XML. To me XML configuration is the right way to configure batch jobs, despite not being as elegant.

A job is defined like:

<batch:job id="job id">
    <batch:step id="step id">
        ...
    </batch:step>
    <batch:step id="step id">
        ...
    </batch:step>
    ...
</batch:job>

A step is defined like:

<batch:step id="step id">
    <batch:tasklet ...>
        <batch:chunk reader="reader bean" writer="writer bean" processor="processor bean" .../>
    </batch:tasklet>
    ...
</batch:step>
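
The chunk element is the heart of a step: items are read one at a time, passed through the processor, collected, and handed to the writer as a group once commit-interval items have been gathered. Below is a minimal plain-Java sketch of that loop. It is not Spring Batch code: the class ChunkLoopSketch and its run method are made up for illustration, and transactions, skipping and restart are left out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Illustration only: a hypothetical stand-in for what a chunk-oriented step does.
class ChunkLoopSketch {
    // Reads and processes items one at a time and writes them in groups of commitInterval.
    // Returns the number of chunks written.
    static <I, O> int run(Iterator<I> reader, Function<I, O> processor,
                          Consumer<List<O>> writer, int commitInterval) {
        int chunks = 0;
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == commitInterval) {
                writer.accept(new ArrayList<>(chunk)); // one write per full chunk
                chunk.clear();
                chunks++;
            }
        }
        if (!chunk.isEmpty()) {
            writer.accept(new ArrayList<>(chunk));     // flush the last, partial chunk
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        Function<String, String> processor = s -> s.toUpperCase();
        List<List<String>> written = new ArrayList<>();
        Consumer<List<String>> writer = c -> written.add(c);
        int chunks = run(List.of("Alan A", "Brian B", "Chris C").iterator(), processor, writer, 2);
        System.out.println(chunks + " chunks: " + written);
    }
}
```

With a commit interval of 1 every item is written on its own; larger values mean fewer and bigger writes.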

A reader/writer/processor bean is defined like any other Spring bean:

<bean id="bean id" class="bean class name">
    <property name="property name" value="property value literal"/>
    <property name="property name" ref="property value id of other bean"/>
    ...
</bean>

Most of the examples will use the following two support classes.

AutoMap.java:

package batch.xxxxxx;

import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class AutoMap<TFROM,TTO> {
    public static class MethodPair {
        private Method getter;
        private Method setter;
        private boolean trim;
        public MethodPair(Method getter, Method setter, boolean trim) {
            this.getter = getter;
            this.setter = setter;
            this.trim = trim;
        }
        public Method getGetter() {
            return getter;
        }
        public Method getSetter() {
            return setter;
        }
        public boolean getTrim() {
            return trim;
        }
    }
    private List<MethodPair> conv;
    public AutoMap(Class<TFROM> from, Class<TTO> to) throws IntrospectionException {
        this(from, to, false);
    }
    public AutoMap(Class<TFROM> from, Class<TTO> to, boolean trim) throws IntrospectionException {
        conv = new ArrayList<MethodPair>();
        for(PropertyDescriptor pdfrom : Introspector.getBeanInfo(from).getPropertyDescriptors()) {
            for(PropertyDescriptor pdto : Introspector.getBeanInfo(to).getPropertyDescriptors()) {
                if(pdfrom.getName().equals(pdto.getName()) && pdfrom.getPropertyType().equals(pdto.getPropertyType())) {
                    Method getter = pdfrom.getReadMethod();
                    Method setter = pdto.getWriteMethod();
                    if(getter != null && setter != null) {
                        conv.add(new MethodPair(getter, setter, trim && pdfrom.getPropertyType().equals(String.class)));
                    }
                }
            }
        }
    }
    public void convert(TFROM from, TTO to) throws IllegalAccessException, IllegalArgumentException, InvocationTargetException {
        for(MethodPair mp : conv) {
            Object value = mp.getGetter().invoke(from);
            if(mp.getTrim() && value != null) {
                // trim only non-null String values to avoid a NullPointerException
                mp.getSetter().invoke(to, ((String)value).trim());
            } else {
                mp.getSetter().invoke(to, value);
            }
        }
    }
}

CopyProcessor.java:

package batch.xxxxxx;

import org.springframework.batch.item.ItemProcessor;

public class CopyProcessor<TFROM,TTO> implements ItemProcessor<TFROM,TTO> {
    private Class<TFROM> fromClass;
    private Class<TTO> toClass;
    public Class<TFROM> getFromClass() {
        return fromClass;
    }
    public void setFromClass(Class<TFROM> fromClass) {
        this.fromClass = fromClass;
    }
    public Class<TTO> getToClass() {
        return toClass;
    }
    public void setToClass(Class<TTO> toClass) {
        this.toClass = toClass;
    }
    private AutoMap<TFROM,TTO> mapper = null;
    @Override
    public TTO process(TFROM source) throws Exception {
        if(mapper == null) {
            mapper = new AutoMap<TFROM,TTO>(fromClass, toClass);
        }
        TTO target = toClass.getDeclaredConstructor().newInstance(); // Class.newInstance() is deprecated since Java 9
        mapper.convert(source, target);
        return target;
    }
}
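
To make the idea behind the two support classes concrete, here is a compact, self-contained sketch of the same technique: find properties with the same name and type on two bean classes via java.beans.Introspector and copy the values across. The classes PropertyCopySketch, RecordIn and RecordOut are hypothetical and only exist for this illustration; unlike AutoMap the sketch does not cache the getter/setter pairs.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;

class PropertyCopySketch {
    // Hypothetical input-side bean.
    public static class RecordIn {
        private int no;
        private String name;
        public int getNo() { return no; }
        public void setNo(int no) { this.no = no; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Hypothetical output-side bean; "extra" has no counterpart in RecordIn.
    public static class RecordOut {
        private int no;
        private String name;
        private String extra = "untouched";
        public int getNo() { return no; }
        public void setNo(int no) { this.no = no; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public String getExtra() { return extra; }
        public void setExtra(String extra) { this.extra = extra; }
    }

    // Copies every property that exists with the same name and type on both beans.
    static void copy(Object from, Object to) {
        try {
            for (PropertyDescriptor pdFrom : Introspector.getBeanInfo(from.getClass()).getPropertyDescriptors()) {
                for (PropertyDescriptor pdTo : Introspector.getBeanInfo(to.getClass()).getPropertyDescriptors()) {
                    Method getter = pdFrom.getReadMethod();
                    Method setter = pdTo.getWriteMethod();
                    boolean sameProperty = pdFrom.getName().equals(pdTo.getName())
                            && pdFrom.getPropertyType() != null
                            && pdFrom.getPropertyType().equals(pdTo.getPropertyType());
                    if (sameProperty && getter != null && setter != null) {
                        setter.invoke(to, getter.invoke(from));
                    }
                }
            }
        } catch (IntrospectionException | ReflectiveOperationException e) {
            throw new IllegalStateException("property copy failed", e);
        }
    }

    public static void main(String[] args) {
        RecordIn in = new RecordIn();
        in.setNo(1);
        in.setName("Alan A");
        RecordOut out = new RecordOut();
        copy(in, out);
        System.out.println(out.getNo() + " " + out.getName() + " " + out.getExtra());
    }
}
```

Properties without a match on the other side, like extra above, are simply left alone.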

Basic - flat files and database:

Here we will look at examples of two-step jobs:

flat file -> database -> flat file

Test data:

input.xml:

<employees>
    <employee no='1'>
        <name>Alan A</name>
        <role>Manager</role>
    </employee>
    <employee no='2'>
        <name>Brian B</name>
        <role>Engineer</role>
    </employee>
    <employee no='3'>
        <name>Chris C</name>
        <role>Sales rep</role>
    </employee>
</employees>

input.json:

[ { "no" : 1, "name" : "Alan A", "role" : "Manager" }, { "no" : 2, "name" : "Brian B", "role" : "Engineer" }, { "no" : 3, "name" : "Chris C", "role" : "Sales rep" } ]

input.csv:

1,"Alan A","Manager"
2,"Brian B","Engineer"
3,"Chris C","Sales rep"

input.fix:

       1Alan A                          Manager                         
       2Brian B                         Engineer                        
       3Chris C                         Sales rep                       

XML flat files:

Two steps:

  1. XML file using JAXB to database using JPA
  2. database using JPA to XML file using JAXB

First step has:

Second step has:

Reading an XML file via JAXB uses a config file fragment like:

<bean id="jaxbMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
    <property name="classesToBeBound">
        <list>
            <value>class to be mapped to XML</value>
            ...
        </list>
    </property>
</bean>
<bean id="xmlReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
    <property name="fragmentRootElementName" value="root element name"/>
    <property name="resource" value="name of XML file"/>
    <property name="unmarshaller" ref="jaxbMarshaller"/>
</bean>
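
The important property is fragmentRootElementName: the reader streams through the file and treats each element with that name as one item, which is then handed to the unmarshaller. The sketch below shows the streaming idea with the JDK's own StAX API only. It is not how Spring Batch is implemented; the class FragmentReadSketch is hypothetical and picks the fields out by hand where the real setup uses JAXB.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class FragmentReadSketch {
    // Returns one "no,name,role" string per <employee> fragment found in the XML.
    static List<String> readEmployees(String xml) {
        List<String> items = new ArrayList<>();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            String no = null, name = null, role = null;
            while (r.hasNext()) {
                int event = r.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    switch (r.getLocalName()) {
                        case "employee":          // start of a new fragment
                            no = r.getAttributeValue(null, "no");
                            name = null;
                            role = null;
                            break;
                        case "name":
                            name = r.getElementText();
                            break;
                        case "role":
                            role = r.getElementText();
                            break;
                        default:
                            break;
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT && r.getLocalName().equals("employee")) {
                    items.add(no + "," + name + "," + role); // fragment complete: one item
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("invalid XML", e);
        }
        return items;
    }

    public static void main(String[] args) {
        String xml = "<employees>"
                + "<employee no='1'><name>Alan A</name><role>Manager</role></employee>"
                + "<employee no='2'><name>Brian B</name><role>Engineer</role></employee>"
                + "</employees>";
        System.out.println(readEmployees(xml));
    }
}
```

Because the file is streamed fragment by fragment, the whole document never has to be held in memory at once.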

Writing to the database via JPA uses a config file fragment like:

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseWriter" class="org.springframework.batch.item.database.JpaItemWriter">
    <property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>

Reading from the database via JPA uses a config file fragment like:

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
    <property name="entityManagerFactory" ref="entityManagerFactory" />
    <property name="queryString" value="JPQL query"/>
</bean>

Note: JPQL query - not SQL query!

Writing an XML file via JAXB uses a config file fragment like:

<bean id="jaxbMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
    <property name="classesToBeBound">
        <list>
            <value>class to be mapped to XML</value>
            ...
        </list>
    </property>
</bean>
<bean id="xmlWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
    <property name="resource" value="name of XML file"/>
    <property name="marshaller" ref="jaxbMarshaller"/>                                                         
    <property name="rootTagName" value="root element name"/>
    <property name="overwriteOutput" value="true"/>
</bean>

xmldbxml.xml (complete Spring Batch configuration):

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:batch="http://www.springframework.org/schema/batch" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
    <!-- support beans -->
    <bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
    </bean>
    <bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceUnitName" value="mysql"/>
    </bean>
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean"> 
        <property name="transactionManager" ref="repoTransactionManager"/>
    </bean>     
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>
    <bean id="jaxbMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
        <property name="classesToBeBound">
            <list>
                <value>batch.basic.EmployeeJAXB</value>
            </list>
        </property>
    </bean>
    <!-- step1 beans -->
    <bean id="employeeXmlReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
        <property name="fragmentRootElementName" value="employee"/>
        <property name="resource" value="C:/work/batch/input.xml"/>
        <property name="unmarshaller" ref="jaxbMarshaller"/>
    </bean>
    <bean id="copyJAXB2JPAProcessor" class="batch.basic.CopyProcessor">
        <property name="fromClass" value="batch.basic.EmployeeJAXB"/>
        <property name="toClass" value="batch.basic.EmployeeJPA"/>
    </bean>
    <bean id="employeeJPAWriter" class="org.springframework.batch.item.database.JpaItemWriter">
        <property name="entityManagerFactory" ref="entityManagerFactory"/>
    </bean>
    <!-- step2 beans -->
    <bean id="employeeJPAReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
        <property name="queryString" value="SELECT e FROM EmployeeJPA AS e"/>
    </bean>
    <bean id="copyJPA2JAXBProcessor" class="batch.basic.CopyProcessor">
        <property name="fromClass" value="batch.basic.EmployeeJPA"/>
        <property name="toClass" value="batch.basic.EmployeeJAXB"/>
    </bean>
    <bean id="employeeXmlWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
        <property name="resource" value="C:/work/batch/output.xml"/>
        <property name="marshaller" ref="jaxbMarshaller"/>                                                         
        <property name="rootTagName" value="employees"/>
        <property name="overwriteOutput" value="true"/>
    </bean>
    <!-- job of step1 and step2 -->
    <batch:job id="testJob">
        <batch:step id="step1">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeXmlReader" writer="employeeJPAWriter" processor="copyJAXB2JPAProcessor" commit-interval="1"/>
            </batch:tasklet>
            <batch:next on="COMPLETED" to="step2"/>
            <batch:end on="FAILED"/>
        </batch:step>
        <batch:step id="step2">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeJPAReader" writer="employeeXmlWriter" processor="copyJPA2JAXBProcessor" commit-interval="1"/>
            </batch:tasklet>
        </batch:step>
    </batch:job>
</beans>
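
The next and end elements inside step1 decide what happens after the step: the on attribute is compared with the exit code of the step and selects either another step or the end of the job. A small plain-Java sketch of such a transition table follows. The class StepFlowSketch, the END marker and the "step:exitCode" key format are assumptions made for this illustration only, and the sketch does exact matching only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

class StepFlowSketch {
    static final String END = "<end>"; // hypothetical marker for "stop the job here"

    // steps: step name -> code that runs the step and returns its exit code
    // transitions: "stepName:exitCode" -> name of the next step, or END
    static List<String> run(String first, Map<String, Supplier<String>> steps,
                            Map<String, String> transitions) {
        List<String> executed = new ArrayList<>();
        String current = first;
        while (current != null && !current.equals(END)) {
            String exitCode = steps.get(current).get();
            executed.add(current + "=" + exitCode);
            current = transitions.get(current + ":" + exitCode); // null: nothing follows
        }
        return executed;
    }

    public static void main(String[] args) {
        Map<String, Supplier<String>> steps = new LinkedHashMap<>();
        steps.put("step1", () -> "COMPLETED");
        steps.put("step2", () -> "COMPLETED");
        Map<String, String> transitions = new LinkedHashMap<>();
        transitions.put("step1:COMPLETED", "step2"); // <batch:next on="COMPLETED" to="step2"/>
        transitions.put("step1:FAILED", END);        // <batch:end on="FAILED"/>
        System.out.println(run("step1", steps, transitions));
    }
}
```

If step1 returned FAILED instead, step2 would never run.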

EmployeeJAXB.java (data class with JAXB annotations):

package batch.basic;

import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name="employee")
public class EmployeeJAXB {
    private int no;
    private String name;
    private String role;
    public EmployeeJAXB() {
        this(0, null, null);
    }
    public EmployeeJAXB(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    @XmlAttribute
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    @XmlElement
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    @XmlElement
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

EmployeeJPA.java (data class with JPA annotations):

package batch.basic;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name="employee")
public class EmployeeJPA {
    private int no;
    private String name;
    private String role;
    public EmployeeJPA() {
        this(0, null, null);
    }
    public EmployeeJPA(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    @Id
    @Column(name="no")
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    @Column(name="name")
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    @Column(name="role")
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

Main.java (standard main program):

package batch.basic;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;

public class Main {
    public static void main(String[] args) throws Exception {
        @SuppressWarnings("resource")
        ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/xmldbxml.xml");
        JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
        Job job = (Job)ctx.getBean("testJob");
        JobExecution exe = jobLauncher.run(job, new JobParameters());
        while(exe.isRunning()) {
            System.out.print("*");
            Thread.sleep(10);
        }
        System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
    }
}

JSON flat files:

Two steps:

  1. JSON file using Jackson to database using JPA
  2. database using JPA to JSON file using Jackson

First step has:

Second step has:

Reading a JSON file via Jackson uses a config file fragment like:

<bean id="jsonReader" class="org.springframework.batch.item.json.JsonItemReader">
    <property name="resource" value="name of JSON file"/>
    <property name="jsonObjectReader">
        <bean class="org.springframework.batch.item.json.JacksonJsonObjectReader">
            <constructor-arg value="class to be read"/>
        </bean>
    </property>
</bean>

Writing to the database via JPA uses a config file fragment like:

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseWriter" class="org.springframework.batch.item.database.JpaItemWriter">
    <property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>

Reading from the database via JPA uses a config file fragment like:

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
    <property name="entityManagerFactory" ref="entityManagerFactory" />
    <property name="queryString" value="JPQL query"/>
</bean>

Note: JPQL query - not SQL query!

Writing a JSON file via Jackson uses a config file fragment like:

<bean id="jsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
    <constructor-arg value="name of JSON file"/>
    <constructor-arg>
        <bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
    </constructor-arg>
</bean>

jsondbjson.xml (complete Spring Batch configuration):

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:batch="http://www.springframework.org/schema/batch" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
    <!-- support beans -->
    <bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
    </bean>
    <bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceUnitName" value="mysql"/>
    </bean>
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean"> 
        <property name="transactionManager" ref="repoTransactionManager"/>
    </bean>     
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>
    <!-- step1 beans -->
    <bean id="employeeJsonReader" class="org.springframework.batch.item.json.JsonItemReader">
        <property name="resource" value="C:/work/batch/input.json"/>
        <property name="jsonObjectReader">
            <bean class="org.springframework.batch.item.json.JacksonJsonObjectReader">
                <constructor-arg value="batch.basic.EmployeePlain"/>
            </bean>
        </property>
    </bean>
    <bean id="copyPlain2JPAProcessor" class="batch.basic.CopyProcessor">
        <property name="fromClass" value="batch.basic.EmployeePlain"/>
        <property name="toClass" value="batch.basic.EmployeeJPA"/>
    </bean>
    <bean id="employeeJPAWriter" class="org.springframework.batch.item.database.JpaItemWriter">
        <property name="entityManagerFactory" ref="entityManagerFactory"/>
    </bean>
    <!-- step2 beans -->
    <bean id="employeeJPAReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
        <property name="queryString" value="SELECT e FROM EmployeeJPA AS e"/>
    </bean>
    <bean id="copyJPA2PlainProcessor" class="batch.basic.CopyProcessor">
        <property name="fromClass" value="batch.basic.EmployeeJPA"/>
        <property name="toClass" value="batch.basic.EmployeePlain"/>
    </bean>
    <bean id="employeeJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
        <constructor-arg value="C:/work/batch/output.json"/>
        <constructor-arg>
            <bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
        </constructor-arg>
    </bean>
    <!-- job of step1 and step2 -->
    <batch:job id="testJob">
        <batch:step id="step1">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeJsonReader" writer="employeeJPAWriter" processor="copyPlain2JPAProcessor" commit-interval="1"/>
            </batch:tasklet>
            <batch:next on="COMPLETED" to="step2"/>
            <batch:end on="FAILED"/>
        </batch:step>
        <batch:step id="step2">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeJPAReader" writer="employeeJsonWriter" processor="copyJPA2PlainProcessor" commit-interval="1"/>
            </batch:tasklet>
        </batch:step>
    </batch:job>
</beans>

EmployeePlain.java (data class):

package batch.basic;

public class EmployeePlain {
    private int no;
    private String name;
    private String role;
    public EmployeePlain() {
        this(0, null, null);
    }
    public EmployeePlain(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

EmployeeJPA.java (data class with JPA annotations):

package batch.basic;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name="employee")
public class EmployeeJPA {
    private int no;
    private String name;
    private String role;
    public EmployeeJPA() {
        this(0, null, null);
    }
    public EmployeeJPA(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    @Id
    @Column(name="no")
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    @Column(name="name")
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    @Column(name="role")
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

Main.java (standard main program):

package batch.basic;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;

public class Main {
    public static void main(String[] args) throws Exception {
        @SuppressWarnings("resource")
        ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/jsondbjson.xml");
        JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
        Job job = (Job)ctx.getBean("testJob");
        JobExecution exe = jobLauncher.run(job, new JobParameters());
        while(exe.isRunning()) {
            System.out.print("*");
            Thread.sleep(10);
        }
        System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
    }
}

CSV flat files:

Two steps:

  1. CSV file using configuration to database using JPA
  2. database using JPA to CSV file using configuration

First step has:

Second step has:

Reading a CSV file uses a config file fragment like:

<bean id="data bean name" class="class to be read"/>
<bean id="csvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource" value="name of CSV file"/>
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <property name="lineTokenizer">
                <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                    <property name="delimiter" value=","/>
                    <property name="names" value="list of fields"/>
                </bean>
            </property>
            <property name="fieldSetMapper">
                <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                    <property name="prototypeBeanName" value="data bean name"/>
                </bean>
            </property>
        </bean>
    </property>
</bean>
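
The reader config has two parts: a line tokenizer that splits a line into named fields and a field set mapper that puts those fields into a bean. The plain-Java sketch below illustrates the first part and the naming of fields. It is not the Spring Batch implementation: the class DelimitedLineSketch is hypothetical, it assumes the double quote is the quote character as in input.csv, and it does not handle doubled quotes inside a quoted field.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DelimitedLineSketch {
    // Splits a line on the delimiter; a delimiter inside quotes is kept and the quotes are removed.
    static List<String> tokenize(String line, char delimiter, char quote) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (char c : line.toCharArray()) {
            if (c == quote) {
                inQuotes = !inQuotes;            // entering or leaving a quoted field
            } else if (c == delimiter && !inQuotes) {
                tokens.add(current.toString());  // field complete
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        tokens.add(current.toString());          // last field
        return tokens;
    }

    // Pairs the tokens with field names, in the spirit of the "names" property.
    static Map<String, String> toFields(String[] names, List<String> tokens) {
        if (names.length != tokens.size()) {
            throw new IllegalArgumentException("expected " + names.length + " fields, got " + tokens.size());
        }
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            fields.put(names[i], tokens.get(i));
        }
        return fields;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("1,\"Alan A\",\"Manager\"", ',', '"');
        System.out.println(toFields(new String[] { "no", "name", "role" }, tokens));
    }
}
```

The field names are what ties the columns in the file to the properties of the data class.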

Writing to the database via JPA uses a config file fragment like:

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseWriter" class="org.springframework.batch.item.database.JpaItemWriter">
    <property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>

Reading from the database via JPA uses a config file fragment like:

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
    <property name="entityManagerFactory" ref="entityManagerFactory" />
    <property name="queryString" value="JPQL query"/>
</bean>

Note: JPQL query - not SQL query!

Writing a CSV file uses a config file fragment like:

<bean id="csvWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
    <property name="resource" value="name of CSV file"/>
    <property name="lineAggregator">
        <bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
            <property name="delimiter" value=","/>
            <property name="quoteCharacter" value="&quot;"/>
            <property name="fieldExtractor">
                <bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                    <property name="names" value="list of fields"/>
                </bean>
            </property>
        </bean>
    </property>
</bean>

csvdbcsv.xml (complete Spring Batch configuration):

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:batch="http://www.springframework.org/schema/batch" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
    <!-- support beans -->
    <bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
    </bean>
    <bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceUnitName" value="mysql"/>
    </bean>
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean"> 
        <property name="transactionManager" ref="repoTransactionManager"/>
    </bean>     
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>
    <bean id="employeeBean" class="batch.basic.EmployeePlain"/>
    <!-- step1 beans -->
    <bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
        <property name="resource" value="C:/work/batch/input.csv"/>
        <property name="lineMapper">
            <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <property name="lineTokenizer">
                    <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                        <property name="delimiter" value=","/>
                        <property name="names" value="no,name,role"/>
                    </bean>
                </property>
                <property name="fieldSetMapper">
                    <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                        <property name="prototypeBeanName" value="employeeBean"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
    <bean id="copyPlain2JPAProcessor" class="batch.basic.CopyProcessor">
        <property name="fromClass" value="batch.basic.EmployeePlain"/>
        <property name="toClass" value="batch.basic.EmployeeJPA"/>
    </bean>
    <bean id="employeeJPAWriter" class="org.springframework.batch.item.database.JpaItemWriter">
        <property name="entityManagerFactory" ref="entityManagerFactory"/>
    </bean>
    <!-- step2 beans -->
    <bean id="employeeJPAReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
        <property name="queryString" value="SELECT e FROM EmployeeJPA AS e"/>
    </bean>
    <bean id="copyJPA2PlainProcessor" class="batch.basic.CopyProcessor">
        <property name="fromClass" value="batch.basic.EmployeeJPA"/>
        <property name="toClass" value="batch.basic.EmployeePlain"/>
    </bean>
    <bean id="employeeCsvWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
        <property name="resource" value="C:/work/batch/output.csv"/>
        <property name="lineAggregator">
            <bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
                <property name="delimiter" value=","/>
                <!--
                <property name="quoteCharacter" value="&quot;"/>
                -->
                <property name="fieldExtractor">
                    <bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                        <property name="names" value="no,name,role"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
    <!-- job of step1 and step2 -->
    <batch:job id="testJob">
        <batch:step id="step1">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeCsvReader" writer="employeeJPAWriter" processor="copyPlain2JPAProcessor" commit-interval="1"/>
            </batch:tasklet>
            <batch:next on="COMPLETED" to="step2"/>
            <batch:end on="FAILED"/>
        </batch:step>
        <batch:step id="step2">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeJPAReader" writer="employeeCsvWriter" processor="copyJPA2PlainProcessor" commit-interval="1"/>
            </batch:tasklet>
        </batch:step>
    </batch:job>
</beans>

Note: Spring Batch 4.x does not support quoteCharacter - Spring Batch 5.x does.

EmployeePlain.java (data class):

package batch.basic;

public class EmployeePlain {
    private int no;
    private String name;
    private String role;
    public EmployeePlain() {
        this(0, null, null);
    }
    public EmployeePlain(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

EmployeeJPA.java (data class with JPA annotations):

package batch.basic;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name="employee")
public class EmployeeJPA {
    private int no;
    private String name;
    private String role;
    public EmployeeJPA() {
        this(0, null, null);
    }
    public EmployeeJPA(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    @Id
    @Column(name="no")
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    @Column(name="name")
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    @Column(name="role")
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

Main.java (standard main program):

package batch.basic;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;

public class Main {
    public static void main(String[] args) throws Exception {
        @SuppressWarnings("resource")
        ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/csvdbcsv.xml");
        JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
        Job job = (Job)ctx.getBean("testJob");
        JobExecution exe = jobLauncher.run(job, new JobParameters());
        while(exe.isRunning()) {
            System.out.print("*");
            Thread.sleep(10);
        }
        System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
    }
}

To get quotes in the output with Spring Batch 4.x, or to use a custom format, configure a custom line mapper and a custom line aggregator:

<bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource" value="C:/work/batch/input.csv"/>
    <property name="lineMapper">
        <bean class="batch.basic.EmployeeCSVSupport"/>
    </property>
</bean>
<bean id="employeeCsvWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
    <property name="resource" value="C:/work/batch/output.csv"/>
    <property name="lineAggregator">
        <bean class="batch.basic.EmployeeCSVSupport"/>
    </property>
</bean>

EmployeeCSVSupport.java (custom line mapper and line aggregator):

package batch.basic;

import org.springframework.batch.item.file.LineMapper;
import org.springframework.batch.item.file.transform.LineAggregator;

public class EmployeeCSVSupport implements LineMapper<EmployeePlain>, LineAggregator<EmployeePlain> {
    @Override
    public String aggregate(EmployeePlain emp) {
        return String.format("%d,\"%s\",\"%s\"", emp.getNo(), emp.getName(), emp.getRole());
    }
    @Override
    public EmployeePlain mapLine(String line, int lineNo) throws Exception {
        String[] parts = line.split(",");
        EmployeePlain res = new EmployeePlain();
        res.setNo(Integer.parseInt(parts[0]));
        res.setName(parts[1].replace("\"", ""));
        res.setRole(parts[2].replace("\"", ""));
        return res;
    }
}

Real code may need to be a little more complex, for example to handle commas or quotes inside a value, but the point is that the conversion between a line and an object can be fully custom.
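
As an illustration of what "a little more complex" could mean, here is a minimal sketch of quote-aware splitting and quoting in plain Java. The class CsvQuoting and its methods are my own hypothetical helpers, not part of Spring Batch, and I assume the common CSV convention that a quote inside a quoted value is written as two quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Spring Batch: quote-aware handling of one CSV line.
final class CsvQuoting {
    private CsvQuoting() {
    }
    // Split on commas outside double quotes; "" inside a quoted field is one literal quote.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    cur.append('"'); // doubled quote becomes one literal quote
                    i++;
                } else if (c == '"') {
                    inQuotes = false; // closing quote
                } else {
                    cur.append(c); // commas are kept while inside quotes
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == ',') {
                fields.add(cur.toString()); // field separator
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString()); // last field
        return fields;
    }
    // Wrap a value in double quotes and double any quote inside it.
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }
}
```

With something like this, mapLine in EmployeeCSVSupport could call CsvQuoting.split(line) instead of line.split(","), and aggregate could call CsvQuoting.quote for the name and role values.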

Fixed width flat files:

Two steps:

  1. Fixed width file using builtin configuration to database using JPA
  2. Database using JPA to fixed width file using builtin configuration

First step has:

Second step has:

Reading a fixed width file uses a config file fragment like:

<bean id="data bean name" class="class to be read"/>
<bean id="employeeFixReader" class="org.springframework.batch.item.file.FlatFileItemReader">
    <property name="resource" value="name of fixed width file"/>
    <property name="lineMapper">
        <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
            <property name="lineTokenizer">
                <bean class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
                    <property name="columns" value="list of columns"/>
                    <property name="names" value="list of fields"/>
                </bean>
            </property>
            <property name="fieldSetMapper">
                <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                    <property name="prototypeBeanName" value="data bean name"/>
                </bean>
            </property>
        </bean>
    </property>
</bean>

Writing to the database via JPA uses a config file fragment like:

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseWriter" class="org.springframework.batch.item.database.JpaItemWriter">
    <property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>

Reading from the database via JPA uses a config file fragment like:

<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
    <property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
    <property name="entityManagerFactory" ref="entityManagerFactory" />
    <property name="queryString" value="JPQL query"/>
</bean>

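Note: the query is a JPQL query, not an SQL query! It refers to the entity class and its properties (like EmployeeJPA), not to the database table and its columns (like employee).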

Writing a fixed width file uses a config file fragment like:

<bean id="employeeFixWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
    <property name="resource" value="name of fixed width file"/>
    <property name="lineAggregator">
        <bean class="org.springframework.batch.item.file.transform.FormatterLineAggregator">
            <property name="format" value="printf style format"/>
            <property name="fieldExtractor">
                <bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                    <property name="names" value="list of fields"/>
                </bean>
            </property>
        </bean>
    </property>
</bean>

fixdbfix.xml (complete Spring Batch configuration):

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:batch="http://www.springframework.org/schema/batch" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
    <!-- support beans -->
    <bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
    </bean>
    <bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceUnitName" value="mysql"/>
    </bean>
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean"> 
        <property name="transactionManager" ref="repoTransactionManager"/>
    </bean>     
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>
    <bean id="employeeBean" class="batch.basic.EmployeePlain"/>
    <!-- step1 beans -->
    <bean id="employeeFixReader" class="org.springframework.batch.item.file.FlatFileItemReader">
        <property name="resource" value="C:/work/batch/input.Fix"/>
        <property name="lineMapper">
            <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <property name="lineTokenizer">
                    <bean class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
                        <property name="columns" value="1-8,9-40,41-72"/>
                        <property name="names" value="no,name,role"/>
                    </bean>
                </property>
                <property name="fieldSetMapper">
                    <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                        <property name="prototypeBeanName" value="employeeBean"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
    <bean id="copyPlain2JPAProcessor" class="batch.basic.CopyProcessor">
        <property name="fromClass" value="batch.basic.EmployeePlain"/>
        <property name="toClass" value="batch.basic.EmployeeJPA"/>
    </bean>
    <bean id="employeeJPAWriter" class="org.springframework.batch.item.database.JpaItemWriter">
        <property name="entityManagerFactory" ref="entityManagerFactory"/>
    </bean>
    <!-- step2 beans -->
    <bean id="employeeJPAReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
        <property name="queryString" value="SELECT e FROM EmployeeJPA AS e"/>
    </bean>
    <bean id="copyJPA2PlainProcessor" class="batch.basic.CopyProcessor">
        <property name="fromClass" value="batch.basic.EmployeeJPA"/>
        <property name="toClass" value="batch.basic.EmployeePlain"/>
    </bean>
    <bean id="employeeFixWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
        <property name="resource" value="C:/work/batch/output.Fix"/>
        <property name="lineAggregator">
            <bean class="org.springframework.batch.item.file.transform.FormatterLineAggregator">
                <property name="format" value="%8d%-32s%-32s"/>
                <property name="fieldExtractor">
                    <bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
                        <property name="names" value="no,name,role"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
    <!-- job of step1 and step2 -->
    <batch:job id="testJob">
        <batch:step id="step1">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeFixReader" writer="employeeJPAWriter" processor="copyPlain2JPAProcessor" commit-interval="1"/>
            </batch:tasklet>
            <batch:next on="COMPLETED" to="step2"/>
            <batch:end on="FAILED"/>
        </batch:step>
        <batch:step id="step2">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeJPAReader" writer="employeeFixWriter" processor="copyJPA2PlainProcessor" commit-interval="1"/>
            </batch:tasklet>
        </batch:step>
    </batch:job>
</beans>
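
The columns value of the reader and the format value of the writer in fixdbfix.xml describe the same 72 character line: 8 + 32 + 32. To make that relation visible, here is a small sketch in plain Java. The class FixedWidthLayout is a hypothetical helper of mine, not Spring Batch code. It only assumes that the column ranges are 1-based with both ends included and that the format is a normal Java printf style format.

```java
// Hypothetical helper, not part of Spring Batch: the 8/32/32 layout from fixdbfix.xml in plain Java.
final class FixedWidthLayout {
    // Same printf style format as the format property in fixdbfix.xml.
    static final String FORMAT = "%8d%-32s%-32s";
    // Same ranges as the columns property in fixdbfix.xml: 1-based, both ends inclusive.
    static final int[][] COLUMNS = { { 1, 8 }, { 9, 40 }, { 41, 72 } };

    private FixedWidthLayout() {
    }
    // Build one fixed width line: number right aligned, name and role left aligned.
    static String format(int no, String name, String role) {
        return String.format(FORMAT, no, name, role);
    }
    // Cut one line into its raw fields; the padding is kept so the widths stay visible.
    static String[] cut(String line) {
        String[] fields = new String[COLUMNS.length];
        for (int i = 0; i < COLUMNS.length; i++) {
            // a 1-based inclusive range min-max corresponds to substring(min - 1, max)
            fields[i] = line.substring(COLUMNS[i][0] - 1, COLUMNS[i][1]);
        }
        return fields;
    }
}
```

One thing to be aware of: in a Java format %-32s pads short values but does not cut long values, so a name longer than 32 characters would shift the role field. A format like %-32.32s both pads and cuts.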

EmployeePlain.java (data class):

package batch.basic;

public class EmployeePlain {
    private int no;
    private String name;
    private String role;
    public EmployeePlain() {
        this(0, null, null);
    }
    public EmployeePlain(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

EmployeeJPA.java (data class with JPA annotations):

package batch.basic;

import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

@Entity
@Table(name="employee")
public class EmployeeJPA {
    private int no;
    private String name;
    private String role;
    public EmployeeJPA() {
        this(0, null, null);
    }
    public EmployeeJPA(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    @Id
    @Column(name="no")
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    @Column(name="name")
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    @Column(name="role")
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

Main.java (standard main program):

package batch.basic;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;

public class Main {
    public static void main(String[] args) throws Exception {
        @SuppressWarnings("resource")
        ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/fixdbfix.xml");
        JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
        Job job = (Job)ctx.getBean("testJob");
        JobExecution exe = jobLauncher.run(job, new JobParameters());
        while(exe.isRunning()) {
            System.out.print("*");
            Thread.sleep(10);
        }
        System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
    }
}

Flow control:

In the real world it is not always reading from one file and writing to one file - sometimes it is reading from multiple files or writing to multiple files.

Spring Batch supports that.

Multiple input files combined to single output file:

Scenario:

Spring Batch fan in

The way to achieve this is to configure a reader that delegates the reading of each input file to another reader:

<bean id="someReader" class="...">
    <!-- no resource property -->
    ...
</bean>
<bean id="multiReader" class="org.springframework.batch.item.file.MultiResourceItemReader">
    <property name="resources" value="pattern for input files"/>
    <property name="delegate" ref="someReader"/>
</bean>
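
To illustrate the delegation idea, here is a minimal sketch in plain Java. It is not Spring Batch code: the class FanInReader is hypothetical and the input files are simulated as lists of lines. It only borrows the Spring Batch convention that read returns null when there is no more input.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch, not Spring Batch code: one reader that walks through several inputs.
final class FanInReader {
    private final Iterator<List<String>> inputs; // one list of lines per simulated input file
    private Iterator<String> current; // plays the role of the delegate, pointed at one input at a time

    FanInReader(List<List<String>> inputs) {
        this.inputs = inputs.iterator();
    }
    // Return the next record across all inputs, or null when every input is exhausted.
    String read() {
        while (current == null || !current.hasNext()) {
            if (!inputs.hasNext()) {
                return null; // no more inputs
            }
            current = inputs.next().iterator(); // point the delegate at the next input
        }
        return current.next();
    }
}
```

The caller just keeps calling read and never sees where one input ends and the next begins, which is why the rest of the step does not need to know that there are multiple files.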

We will do an example with multiple CSV files being combined to one XML file.

Input files:

frag_1.csv:

1,"Alan A","Manager"

frag_2.csv:

2,"Brian B","Engineer"

frag_3.csv:

2,"Brian B","Engineer"

Desired output file:

comb.xml:

<?xml version="1.0" encoding="UTF-8"?><employees><employee no="1"><name>Alan A</name><role>Manager</role></employee><employee no="2"><name>Brian B</name><role>Engineer</role></employee><employee no="2"><name>Brian B</name><role>Engineer</role></employee></employees>

fanin.xml (complete Spring Batch configuration):

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:batch="http://www.springframework.org/schema/batch" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
    <!-- support beans -->
    <bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
    </bean>
    <bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceUnitName" value="mysql"/>
    </bean>
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean"> 
        <property name="transactionManager" ref="repoTransactionManager"/>
    </bean>     
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>
    <bean id="jaxbMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
        <property name="classesToBeBound">
            <list>
                <value>batch.flow.EmployeeJAXB</value>
            </list>
        </property>
    </bean>
    <bean id="employeeBean" class="batch.flow.EmployeePlain"/>
    <!-- step1 beans -->
    <bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
        <property name="lineMapper">
            <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <property name="lineTokenizer">
                    <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                        <property name="delimiter" value=","/>
                        <property name="names" value="no,name,role"/>
                    </bean>
                </property>
                <property name="fieldSetMapper">
                    <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                        <property name="prototypeBeanName" value="employeeBean"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
    <bean id="employeeMultiReader" class="org.springframework.batch.item.file.MultiResourceItemReader">
        <property name="resources" value="C:/work/batch/frag_*.csv"/>
        <property name="delegate" ref="employeeCsvReader"/>
    </bean>
    <bean id="copyCSV2XMLProcessor" class="batch.flow.CopyProcessor">
        <property name="fromClass" value="batch.flow.EmployeePlain"/>
        <property name="toClass" value="batch.flow.EmployeeJAXB"/>
    </bean>
    <bean id="employeeXmlWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
        <property name="resource" value="C:/work/batch/comb.xml"/>
        <property name="marshaller" ref="jaxbMarshaller"/>                                                         
        <property name="rootTagName" value="employees"/>
        <property name="overwriteOutput" value="true"/>
    </bean>
    <!-- job of step1 -->
    <batch:job id="testJob">
        <batch:step id="step1">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeMultiReader" writer="employeeXmlWriter" processor="copyCSV2XMLProcessor" commit-interval="1"/>
            </batch:tasklet>
        </batch:step>
    </batch:job>
</beans>

EmployeePlain.java (data class):

package batch.flow;

public class EmployeePlain {
    private int no;
    private String name;
    private String role;
    public EmployeePlain() {
        this(0, null, null);
    }
    public EmployeePlain(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

EmployeeJAXB.java (data class with JAXB annotations):

package batch.flow;

import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name="employee")
public class EmployeeJAXB {
    private int no;
    private String name;
    private String role;
    public EmployeeJAXB() {
        this(0, null, null);
    }
    public EmployeeJAXB(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    @XmlAttribute
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    @XmlElement
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    @XmlElement
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

Main.java (standard main program):

package batch.flow;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;

public class Main {
    public static void main(String[] args) throws Exception {
        @SuppressWarnings("resource")
        ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/fanin.xml");
        JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
        Job job = (Job)ctx.getBean("testJob");
        JobExecution exe = jobLauncher.run(job, new JobParameters());
        while(exe.isRunning()) {
            System.out.print("*");
            Thread.sleep(10);
        }
        System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
    }
}

Single input file copied to multiple output files each output file getting all records:

Scenario:

Spring Batch fan out

The way to achieve this is to configure a writer that delegates to multiple writers:

<bean id="firstWriter" class="...">
    ...
</bean>
<bean id="secondWriter" class="...">
    ...
</bean>
...
<bean id="multiWriter" class="org.springframework.batch.item.support.CompositeItemWriter">
    <property name="delegates">
        <list>
            <ref bean="firstWriter"/>
            <ref bean="secondWriter"/>
            ...
        </list>
    </property>
</bean>
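
The idea behind such a composite writer can be sketched in a few lines of plain Java. This is not Spring Batch code: the class FanOutWriter is hypothetical and I use java.util.function.Consumer as a stand-in for a writer that receives a chunk of records.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch, not Spring Batch code: a writer that hands every chunk to all of its delegates.
final class FanOutWriter<T> implements Consumer<List<T>> {
    private final List<Consumer<List<T>>> delegates;

    FanOutWriter(List<Consumer<List<T>>> delegates) {
        this.delegates = delegates;
    }
    @Override
    public void accept(List<T> chunk) {
        for (Consumer<List<T>> delegate : delegates) {
            delegate.accept(chunk); // every delegate sees every record, in the configured order
        }
    }
}
```

Because all delegates receive the same objects, the records must be of a type that all the writers can handle.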

We will do an example with one CSV file being converted to both an XML file and a JSON file.

Input file:

input.csv:

1,"Alan A","Manager"
2,"Brian B","Engineer"
3,"Chris C","Sales rep"

Output files:

output.xml:

<?xml version="1.0" encoding="UTF-8"?><employees><employee no="1"><name>Alan A</name><role>Manager</role></employee><employee no="2"><name>Brian B</name><role>Engineer</role></employee><employee no="3"><name>Chris C</name><role>Sales rep</role></employee></employees>

output.json:

[
 {"no":1,"name":"Alan A","role":"Manager"},
 {"no":2,"name":"Brian B","role":"Engineer"},
 {"no":3,"name":"Chris C","role":"Sales rep"}
]

fanout.xml (complete Spring Batch configuration):

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:batch="http://www.springframework.org/schema/batch" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
    <!-- support beans -->
    <bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
    </bean>
    <bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceUnitName" value="mysql"/>
    </bean>
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean"> 
        <property name="transactionManager" ref="repoTransactionManager"/>
    </bean>     
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>
    <bean id="jaxbMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
        <property name="classesToBeBound">
            <list>
                <value>batch.flow.EmployeeJAXB</value>
            </list>
        </property>
    </bean>
    <bean id="employeeBean" class="batch.flow.EmployeePlain"/>
    <!-- step1 beans -->
    <bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
        <property name="resource" value="C:/work/batch/input.csv"/>
        <property name="lineMapper">
            <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <property name="lineTokenizer">
                    <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                        <property name="delimiter" value=","/>
                        <property name="names" value="no,name,role"/>
                    </bean>
                </property>
                <property name="fieldSetMapper">
                    <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                        <property name="prototypeBeanName" value="employeeBean"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
    <bean id="copyCSV2XMLProcessor" class="batch.flow.CopyProcessor">
        <property name="fromClass" value="batch.flow.EmployeePlain"/>
        <property name="toClass" value="batch.flow.EmployeeJAXB"/>
    </bean>
    <bean id="employeeXmlWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
        <property name="resource" value="C:/work/batch/output.xml"/>
        <property name="marshaller" ref="jaxbMarshaller"/>                                                         
        <property name="rootTagName" value="employees"/>
        <property name="overwriteOutput" value="true"/>
    </bean>
    <bean id="employeeJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
        <constructor-arg value="C:/work/batch/output.json"/>
        <constructor-arg>
            <bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
        </constructor-arg>
    </bean>
    <bean id="employeeMultiWriter" class="org.springframework.batch.item.support.CompositeItemWriter">
        <property name="delegates">
            <list>
                <ref bean="employeeXmlWriter"/>
                <ref bean="employeeJsonWriter"/>
            </list>
        </property>
    </bean>
    <!-- job of step1 -->
    <batch:job id="testJob">
        <batch:step id="step1">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeCsvReader" writer="employeeMultiWriter" processor="copyCSV2XMLProcessor" commit-interval="1"/>
            </batch:tasklet>
        </batch:step>
    </batch:job>
</beans>

EmployeePlain.java (data class):

package batch.flow;

public class EmployeePlain {
    private int no;
    private String name;
    private String role;
    public EmployeePlain() {
        this(0, null, null);
    }
    public EmployeePlain(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

EmployeeJAXB.java (data class with JAXB annotations):

package batch.flow;

import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name="employee")
public class EmployeeJAXB {
    private int no;
    private String name;
    private String role;
    public EmployeeJAXB() {
        this(0, null, null);
    }
    public EmployeeJAXB(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    @XmlAttribute
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    @XmlElement
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    @XmlElement
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}

Main.java (standard main program):

package batch.flow;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;

public class Main {
    public static void main(String[] args) throws Exception {
        @SuppressWarnings("resource")
        ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/fanout.xml");
        JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
        Job job = (Job)ctx.getBean("testJob");
        JobExecution exe = jobLauncher.run(job, new JobParameters());
        while(exe.isRunning()) {
            System.out.print("*");
            Thread.sleep(10);
        }
        System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
    }
}

Single input file split to multiple output files each record only going to one output file:

Scenario:

Spring Batch selective fan out

The way to achieve this is to configure a classifying writer with a classifier class and multiple delegate writers:

<bean id="firstWriter" class="...">
    ...
</bean>
<bean id="secondWriter" class="...">
    ...
</bean>
...
<bean id="selector" class="classifier class">
    <property name="first" ref="firstWriter"/>
    <property name="second" ref="secondWriter"/>
    ...
</bean>
<bean id="multiWriter" class="org.springframework.batch.item.support.ClassifierCompositeItemWriter">
    <property name="classifier" ref="selector"/>
</bean>

We will do an example with one CSV file being split out to two JSON files.

Input file:

input.csv:

1,"Alan A","Manager"
2,"Brian B","Engineer"
3,"Chris C","Sales rep"

Output files:

supervisor.json:

[
 {"no":1,"name":"Alan A","role":"Manager"}
]

individual.json:

[
 {"no":2,"name":"Brian B","role":"Engineer"},
 {"no":3,"name":"Chris C","role":"Sales rep"}
]

selfanout.xml (complete Spring Batch configuration):

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:batch="http://www.springframework.org/schema/batch" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
    <!-- support beans -->
    <bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
    </bean>
    <bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceUnitName" value="mysql"/>
    </bean>
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean"> 
        <property name="transactionManager" ref="repoTransactionManager"/>
    </bean>     
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>
    <bean id="employeeBean" class="batch.flow.EmployeePlain"/>
    <!-- step1 beans -->
    <bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
        <property name="resource" value="C:/work/batch/input.csv"/>
        <property name="lineMapper">
            <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <property name="lineTokenizer">
                    <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                        <property name="delimiter" value=","/>
                        <property name="names" value="no,name,role"/>
                    </bean>
                </property>
                <property name="fieldSetMapper">
                    <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                        <property name="prototypeBeanName" value="employeeBean"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
    <bean id="copyProcessor" class="org.springframework.batch.item.support.PassThroughItemProcessor"/>
    <bean id="supervisorJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
        <constructor-arg value="C:/work/batch/supervisor.json"/>
        <constructor-arg>
            <bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
        </constructor-arg>
    </bean>
    <bean id="individualJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
        <constructor-arg value="C:/work/batch/individual.json"/>
        <constructor-arg>
            <bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
        </constructor-arg>
    </bean>
    <bean id="selector" class="batch.flow.Selector">
        <property name="supervisor" ref="supervisorJsonWriter"/>
        <property name="individual" ref="individualJsonWriter"/>
    </bean>
    <bean id="employeeMultiWriter" class="org.springframework.batch.item.support.ClassifierCompositeItemWriter">
        <property name="classifier" ref="selector"/>
    </bean>
    <!-- job of step1 -->
    <batch:job id="testJob">
        <batch:step id="step1">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeCsvReader" writer="employeeMultiWriter" processor="copyProcessor" commit-interval="1">
                    <batch:streams>
                        <batch:stream ref="supervisorJsonWriter"/>
                        <batch:stream ref="individualJsonWriter"/>
                    </batch:streams>
                </batch:chunk>
            </batch:tasklet>
        </batch:step>
    </batch:job>
</beans>

EmployeePlain.java (data class):

package batch.flow;

public class EmployeePlain {
    private int no;
    private String name;
    private String role;
    public EmployeePlain() {
        this(0, null, null);
    }
    public EmployeePlain(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    public int getNo() {
        return no;
    }
    public void setNo(int no) {
        this.no = no;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }
    public String getRole() {
        return role;
    }
    public void setRole(String role) {
        this.role = role;
    }
    @Override
    public String toString() {
        return String.format("[%d,%s,%s]", no, name, role);
    }
}
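
selfanout.xml refers to a class batch.flow.Selector whose listing is not shown here. A minimal sketch of what such a classifier could look like is below. It assumes that role "Manager" means supervisor and everything else means individual, which matches the expected output files but is not confirmed by the configuration. To stay compilable without Spring on the classpath the sketch declares its own one-method Classifier interface and uses a type parameter W for the writers; in a real job the class would instead implement Spring's Classifier interface (org.springframework.classify.Classifier in the Spring Batch 4 line, as far as I know) with ItemWriter<? super EmployeePlain> as target type.

```java
// Stand-in for Spring's one-method Classifier interface (assumption: same shape).
interface Classifier<C, T> {
    T classify(C classifiable);
}

// Reduced copy of the data class, only what the classifier needs.
class EmployeePlain {
    private final int no;
    private final String name;
    private final String role;
    EmployeePlain(int no, String name, String role) {
        this.no = no;
        this.name = name;
        this.role = role;
    }
    String getRole() {
        return role;
    }
}

// Hypothetical classifier: W is the writer type (ItemWriter in the real job).
class Selector<W> implements Classifier<EmployeePlain, W> {
    private W supervisor;
    private W individual;
    // Setters match the "supervisor" and "individual" properties in selfanout.xml.
    public void setSupervisor(W supervisor) {
        this.supervisor = supervisor;
    }
    public void setIndividual(W individual) {
        this.individual = individual;
    }
    @Override
    public W classify(EmployeePlain emp) {
        // Assumption: only the role "Manager" counts as supervisor.
        return "Manager".equals(emp.getRole()) ? supervisor : individual;
    }
}

class SelectorDemo {
    public static void main(String[] args) {
        // Strings stand in for the two JSON writers.
        Selector<String> selector = new Selector<>();
        selector.setSupervisor("supervisor.json");
        selector.setIndividual("individual.json");
        System.out.println(selector.classify(new EmployeePlain(1, "Alan A", "Manager")));
        System.out.println(selector.classify(new EmployeePlain(2, "Brian B", "Engineer")));
    }
}
```

The classifier only picks a writer per record; opening and closing the output files is still done by the step, which is why both JSON writers are registered as streams in the configuration.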

Main.java (standard main program):

package batch.flow;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;

public class Main {
    public static void main(String[] args) throws Exception {
        @SuppressWarnings("resource")
        ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/selfanout.xml");
        JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
        Job job = (Job)ctx.getBean("testJob");
        JobExecution exe = jobLauncher.run(job, new JobParameters());
        while(exe.isRunning()) {
            System.out.print("*");
            Thread.sleep(10);
        }
        System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
    }
}

Error handling:

A common problem is what to do in case of an error during processing of a record.

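The 3 common options are:

exit
stop the job at the first error
skip
drop the failing record and continue with the next record
retry
try to process the failing record again

All of these can be configured in Spring Batch. Skip and retry can be configured for certain exceptions. Exit is the default.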

We will use the following data for test.

big.csv:

1,"Line #1"
2,"Line #2"
3,"Line #3"
4,"Line #4"
5,"Line #5"
6,"Line #6"
7,"Line #7"
8,"Line #8"
9,"Line #9"
10,"Line #10"

Skip in case of error:

Skip exceptions are configured as:

<batch:chunk reader="..." writer="..." processor="..." ... skip-limit="maximum number of records to skip">
    <batch:skippable-exception-classes>
        <batch:include class="exception class"/>
        <batch:include class="exception class"/>
        ...
    </batch:skippable-exception-classes>
</batch:chunk>
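
The effect of skip-limit can be illustrated without Spring Batch. The sketch below is plain Java and only mimics the idea: a failing record is dropped and counted, and the run aborts as soon as one more record fails after the limit has been reached. The names SkipSketch and run are made up for illustration, and the real framework does considerably more (transactions, chunk rollback, listeners).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

class SkipSketch {
    // Returns the records that get through; throws once a record fails
    // after skipLimit records have already been skipped.
    static List<Integer> run(List<Integer> input, IntPredicate fails, int skipLimit) {
        List<Integer> written = new ArrayList<>();
        int skipped = 0;
        for (int rec : input) {
            if (fails.test(rec)) {
                if (skipped >= skipLimit) {
                    throw new IllegalStateException("skip limit " + skipLimit + " exceeded");
                }
                skipped++; // drop the record and carry on with the next one
                continue;
            }
            written.add(rec);
        }
        return written;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        // Even numbers fail, mirroring the example that follows.
        System.out.println(run(input, n -> n % 2 == 0, 1000)); // [1, 3, 5, 7, 9]
    }
}
```

With a limit of 2 instead of 1000 the same input would abort at record 6, the third failing record.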

We will do an example where records with an even number make the processor throw an exception and trigger a skip.

So the expected output is as below.

bigdiscard.json:

[
 {"ival":1,"sval":"Line #1"},
 {"ival":3,"sval":"Line #3"},
 {"ival":5,"sval":"Line #5"},
 {"ival":7,"sval":"Line #7"},
 {"ival":9,"sval":"Line #9"}
]

bigdiscard.xml (complete Spring Batch configuration):

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:batch="http://www.springframework.org/schema/batch" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
    <!-- support beans -->
    <bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
    </bean>
    <bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceUnitName" value="mysql"/>
    </bean>
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean"> 
        <property name="transactionManager" ref="repoTransactionManager"/>
    </bean>     
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>
    <bean id="dataBean" class="batch.handling.Data"/>
    <!-- step1 beans -->
    <bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
        <property name="resource" value="C:/work/batch/big.csv"/>
        <property name="lineMapper">
            <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <property name="lineTokenizer">
                    <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                        <property name="delimiter" value=","/>
                        <property name="names" value="ival,sval"/>
                    </bean>
                </property>
                <property name="fieldSetMapper">
                    <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                        <property name="prototypeBeanName" value="dataBean"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
    <bean id="discardProcessor" class="batch.handling.DiscardProcessor"/>
    <bean id="employeeJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
        <constructor-arg value="C:/work/batch/bigdiscard.json"/>
        <constructor-arg>
            <bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
        </constructor-arg>
    </bean>
    <!-- job of step1 -->
    <batch:job id="testJob">
        <batch:step id="step1">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeCsvReader" writer="employeeJsonWriter" processor="discardProcessor" commit-interval="1" skip-limit="1000">
                    <batch:skippable-exception-classes>
                        <batch:include class="batch.handling.DiscardProcessor.DiscardException"/>
                    </batch:skippable-exception-classes>
                </batch:chunk>
            </batch:tasklet>
        </batch:step>
    </batch:job>
</beans>

Data.java (data class):

package batch.handling;

public class Data {
    private int ival;
    private String sval;
    public int getIval() {
        return ival;
    }
    public void setIval(int ival) {
        this.ival = ival;
    }
    public String getSval() {
        return sval;
    }
    public void setSval(String sval) {
        this.sval = sval;
    }
}

DiscardProcessor.java (processor simulating errors):

package batch.handling;

import org.springframework.batch.item.ItemProcessor;

public class DiscardProcessor implements ItemProcessor<Data,Data> {
    public static class DiscardException extends Exception {
        private static final long serialVersionUID = 1L;
    }
    @Override
    public Data process(Data source) throws Exception {
        if(source.getIval() % 2 == 0) {
            throw new DiscardException();
        }
        return source;
    }
}

Main.java (standard main program):

package batch.handling;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;

public class Main {
    public static void main(String[] args) throws Exception {
        @SuppressWarnings("resource")
        ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/bigdiscard.xml");
        JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
        Job job = (Job)ctx.getBean("testJob");
        JobExecution exe = jobLauncher.run(job, new JobParameters());
        while(exe.isRunning()) {
            System.out.print("*");
            Thread.sleep(10);
        }
        System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
    }
}

Retry in case of error:

Retry exceptions are configured as:

<batch:chunk reader="..." writer="..." processor="..." ... retry-limit="maximum number of retries for a single record">
    <batch:retryable-exception-classes>
        <batch:include class="exception class"/>
        <batch:include class="exception class"/>
        ...
    </batch:retryable-exception-classes>
</batch:chunk>
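
The idea behind retry-limit can be sketched in plain Java as well. The helper below (RetrySketch.withRetry is a made-up name) calls an operation until it succeeds or the attempt budget is used up. It treats the limit as the total number of attempts including the first one, which is an assumption about how the attribute is interpreted, and it retries on any exception, whereas Spring Batch only retries the exception classes listed in the configuration.

```java
import java.util.concurrent.Callable;

class RetrySketch {
    // Calls op until it returns normally or maxAttempts calls have failed.
    static <T> T withRetry(Callable<T> op, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception ex) {
                last = ex; // remember the failure and try again
            }
        }
        throw last; // budget used up: give up with the last failure
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Fails on the first two calls, succeeds on the third.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("flaky");
            }
            return "ok";
        }, 10);
        System.out.println(result + " after " + calls[0] + " calls"); // ok after 3 calls
    }
}
```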

We will do an example where the processor throws an exception at random and triggers a retry.

So the expected output is as below.

bigflaky.json:

[
 {"ival":1,"sval":"Line #1"},
 {"ival":2,"sval":"Line #2"},
 {"ival":3,"sval":"Line #3"},
 {"ival":4,"sval":"Line #4"},
 {"ival":5,"sval":"Line #5"},
 {"ival":6,"sval":"Line #6"},
 {"ival":7,"sval":"Line #7"},
 {"ival":8,"sval":"Line #8"},
 {"ival":9,"sval":"Line #9"},
 {"ival":10,"sval":"Line #10"}
]

bigflaky.xml (complete Spring Batch configuration):

<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:batch="http://www.springframework.org/schema/batch" 
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
    <!-- support beans -->
    <bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory" />
    </bean>
    <bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
    <bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="persistenceUnitName" value="mysql"/>
    </bean>
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean"> 
        <property name="transactionManager" ref="repoTransactionManager"/>
    </bean>     
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository"/>
    </bean>
    <bean id="dataBean" class="batch.handling.Data"/>
    <!-- step1 beans -->
    <bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
        <property name="resource" value="C:/work/batch/big.csv"/>
        <property name="lineMapper">
            <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
                <property name="lineTokenizer">
                    <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
                        <property name="delimiter" value=","/>
                        <property name="names" value="ival,sval"/>
                    </bean>
                </property>
                <property name="fieldSetMapper">
                    <bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
                        <property name="prototypeBeanName" value="dataBean"/>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
    <bean id="FlakyProcessor" class="batch.handling.FlakyProcessor"/>
    <bean id="employeeJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
        <constructor-arg value="C:/work/batch/bigflaky.json"/>
        <constructor-arg>
            <bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
        </constructor-arg>
    </bean>
    <!-- job of step1 -->
    <batch:job id="testJob">
        <batch:step id="step1">
            <batch:tasklet transaction-manager="dbTransactionManager">
                <batch:chunk reader="employeeCsvReader" writer="employeeJsonWriter" processor="FlakyProcessor" commit-interval="1" retry-limit="10">
                    <batch:retryable-exception-classes>
                        <batch:include class="batch.handling.FlakyProcessor.FlakyException"/>
                    </batch:retryable-exception-classes>
                </batch:chunk>
            </batch:tasklet>
        </batch:step>
    </batch:job>
</beans>

FlakyProcessor.java (processor simulating errors):

package batch.handling;

import java.util.Random;

import org.springframework.batch.item.ItemProcessor;

public class FlakyProcessor implements ItemProcessor<Data,Data> {
    public static class FlakyException extends Exception {
        private static final long serialVersionUID = 1L;
    }
    private static Random rng = new Random();
    @Override
    public Data process(Data source) throws Exception {
        if(rng.nextDouble() < 0.5) {
            throw new FlakyException();
        }
        return source;
    }
}
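
Because FlakyProcessor fails at random, the expected output is the likely result rather than a guaranteed one. A back-of-the-envelope calculation is sketched below, assuming that retry-limit="10" allows 10 attempts per record in total, that attempts are independent with failure probability 0.5, and that a record which uses up all its attempts fails the job. RetryOdds and its methods are made-up names.

```java
class RetryOdds {
    // Probability that a single record fails on every one of its attempts.
    static double recordFails(double pFail, int attempts) {
        return Math.pow(pFail, attempts);
    }

    // Probability that at least one of the records uses up all its attempts.
    static double jobFails(double pFail, int attempts, int records) {
        return 1.0 - Math.pow(1.0 - recordFails(pFail, attempts), records);
    }

    public static void main(String[] args) {
        // Numbers from the example: 0.5 failure rate, 10 attempts, 10 records in big.csv.
        System.out.printf("per record: %.5f%n", recordFails(0.5, 10));
        System.out.printf("whole job:  %.5f%n", jobFails(0.5, 10, 10));
    }
}
```

Under these assumptions a record runs out of attempts roughly once in a thousand times, so a run over the ten-record file fails a little under 1% of the time.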

Main.java (standard main program):

package batch.handling;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;

public class Main {
    public static void main(String[] args) throws Exception {
        @SuppressWarnings("resource")
        ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/bigflaky.xml");
        JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
        Job job = (Job)ctx.getBean("testJob");
        JobExecution exe = jobLauncher.run(job, new JobParameters());
        while(exe.isRunning()) {
            System.out.print("*");
            Thread.sleep(10);
        }
        System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
    }
}

Final remarks:

Spring Batch is a big framework, and the above has only shown a tiny part of what it is capable of. But hopefully it is enough to provide a basic understanding.

I actually like Spring Batch. It targets a rather narrow type of application, but it is a type of application that is widely used in lots of companies.

Article history:

Version Date Description
1.0 March 15th 2025 Initial version

Comments:

Please send comments to Arne Vajhøj