This article is about Spring Batch. I am usually not a big fan of Spring products, but I decided to make an exception for Spring Batch.
Readers are expected to know Java, Spring DI and a little about data processing.
For a refresher on DI go here.
Some code is repeated across examples. I find it better for the reader to have all the code for one example in one place than to have to look for parts of it in another example.
There are fundamentally two types of server applications:
There are two aspects of batch jobs:
Despite its name, Spring Batch is a framework for the second.
The Spring suite has other products for scheduling jobs, and there is another well-known product, Quartz, dedicated to that.
Spring Batch is a framework that allows for declarative orchestration of many standard operations and comes with a number of standard components for common operations.
The key aspects of Spring Batch are:
Examples will cover:
The examples are pretty trivial, but a lot of batch data processing is trivial.
Spring Batch jobs today are often configured using annotations and fluent code, but I have chosen to show configuration via XML. To me, XML configuration is the right way to configure batch jobs, despite not being as elegant.
A job is defined like:
<batch:job id="job id">
<batch:step id="step id">
...
</batch:step>
<batch:step id="step id">
...
</batch:step>
...
</batch:job>
A step is defined like:
<batch:step id="step id">
<batch:tasklet ...>
<batch:chunk reader="reader bean" writer="writer bean" processor="processor bean" .../>
</batch:tasklet>
...
</batch:step>
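To make the roles of reader, processor and writer a bit more concrete, here is a minimal plain-Java sketch of what a chunk-oriented step does conceptually: read one item at a time, process it, collect the results, and hand them to the writer whenever the commit interval is reached. The class `ChunkSketch` and its functional parameters are hypothetical stand-ins for illustration. This is not Spring Batch's actual implementation, and transactions, restart and error handling are left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical illustration only - not Spring Batch code.
class ChunkSketch {
    // The reader signals end of input by returning null.
    // Returns the number of chunks handed to the writer.
    static <I, O> int runStep(Supplier<I> reader, Function<I, O> processor,
                              Consumer<List<O>> writer, int commitInterval) {
        int chunks = 0;
        List<O> chunk = new ArrayList<>();
        I item;
        while ((item = reader.get()) != null) {
            chunk.add(processor.apply(item));
            if (chunk.size() == commitInterval) {
                // a full chunk: write it (a real step would also commit here)
                writer.accept(new ArrayList<>(chunk));
                chunk.clear();
                chunks++;
            }
        }
        if (!chunk.isEmpty()) {
            // last, partially filled chunk
            writer.accept(new ArrayList<>(chunk));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        Iterator<String> input = Arrays.asList("a", "b", "c").iterator();
        int chunks = ChunkSketch.<String, String>runStep(
            () -> input.hasNext() ? input.next() : null, // reader
            s -> s.toUpperCase(),                        // processor
            chunk -> System.out.println("write " + chunk), // writer
            2);                                          // commit interval
        System.out.println("chunks = " + chunks);
    }
}
```

With a commit interval of 1, as used in the complete configurations in this article, the writer is simply called once per item.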
A reader/writer/processor bean is defined like any other Spring bean:
<bean id="bean id" class="bean class name">
<property name="property name" value="property value literal"/>
<property name="property name" ref="property value id of other bean"/>
...
</bean>
Most of the examples will use the following two support classes.
AutoMap.java:
package batch.xxxxxx;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
public class AutoMap<TFROM,TTO> {
public static class MethodPair {
private Method getter;
private Method setter;
private boolean trim;
public MethodPair(Method getter, Method setter, boolean trim) {
this.getter = getter;
this.setter = setter;
this.trim = trim;
}
public Method getGetter() {
return getter;
}
public Method getSetter() {
return setter;
}
public boolean getTrim() {
return trim;
}
}
private List<MethodPair> conv;
public AutoMap(Class<TFROM> from, Class<TTO> to) throws IntrospectionException {
this(from, to, false);
}
public AutoMap(Class<TFROM> from, Class<TTO> to, boolean trim) throws IntrospectionException {
conv = new ArrayList<MethodPair>();
for(PropertyDescriptor pdfrom : Introspector.getBeanInfo(from).getPropertyDescriptors()) {
for(PropertyDescriptor pdto : Introspector.getBeanInfo(to).getPropertyDescriptors()) {
if(pdfrom.getName().equals(pdto.getName()) && pdfrom.getPropertyType().equals(pdto.getPropertyType())) {
Method getter = pdfrom.getReadMethod();
Method setter = pdto.getWriteMethod();
if(getter != null && setter != null) {
conv.add(new MethodPair(getter, setter, trim && pdfrom.getPropertyType().equals(String.class)));
}
}
}
}
}
public void convert(TFROM from, TTO to) throws IllegalAccessException, IllegalArgumentException, InvocationTargetException {
for(MethodPair mp : conv) {
if(mp.getTrim()) {
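// guard against null so that trimming does not throw a NullPointerException
Object val = mp.getGetter().invoke(from);
mp.getSetter().invoke(to, val == null ? null : ((String)val).trim());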
} else {
mp.getSetter().invoke(to, mp.getGetter().invoke(from));
}
}
}
}
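The property matching in AutoMap relies on `java.beans.Introspector`. As a small illustration of what the Introspector reports, the sketch below lists the properties of a tiny bean. The class `IntrospectDemo` and its nested `Person` bean are made up for this example. Note that every class gets a read-only `class` property from `getClass()`; AutoMap skips it because no setter is found for it.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class - only meant to show what the Introspector reports.
class IntrospectDemo {
    // Small bean used only for this illustration
    public static class Person {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Property names of clz; with readWriteOnly only those having both getter and setter
    static List<String> properties(Class<?> clz, boolean readWriteOnly) {
        List<String> res = new ArrayList<>();
        try {
            for (PropertyDescriptor pd : Introspector.getBeanInfo(clz).getPropertyDescriptors()) {
                boolean readWrite = pd.getReadMethod() != null && pd.getWriteMethod() != null;
                if (readWrite || !readWriteOnly) {
                    res.add(pd.getName());
                }
            }
        } catch (IntrospectionException ex) {
            throw new IllegalStateException(ex);
        }
        return res;
    }

    public static void main(String[] args) {
        System.out.println(properties(Person.class, false)); // all properties, including "class"
        System.out.println(properties(Person.class, true));  // only properties that can be copied
    }
}
```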
CopyProcessor.java:
package batch.xxxxxx;
import org.springframework.batch.item.ItemProcessor;
public class CopyProcessor<TFROM,TTO> implements ItemProcessor<TFROM,TTO> {
private Class<TFROM> fromClass;
private Class<TTO> toClass;
public Class<TFROM> getFromClass() {
return fromClass;
}
public void setFromClass(Class<TFROM> fromClass) {
this.fromClass = fromClass;
}
public Class<TTO> getToClass() {
return toClass;
}
public void setToClass(Class<TTO> toClass) {
this.toClass = toClass;
}
private AutoMap<TFROM,TTO> mapper = null;
@Override
public TTO process(TFROM source) throws Exception {
if(mapper == null) {
mapper = new AutoMap<TFROM,TTO>(fromClass, toClass);
}
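// Class.newInstance() is deprecated since Java 9; go via the no-arg constructor instead
TTO target = toClass.getDeclaredConstructor().newInstance();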
mapper.convert(source, target);
return target;
}
}
Here we will look at examples of two-step jobs:
flat file -> database -> flat file
Test data:
input.xml:
<employees>
<employee no='1'>
<name>Alan A</name>
<role>Manager</role>
</employee>
<employee no='2'>
<name>Brian B</name>
<role>Engineer</role>
</employee>
<employee no='3'>
<name>Chris C</name>
<role>Sales rep</role>
</employee>
</employees>
input.json:
[ { "no" : 1, "name" : "Alan A", "role" : "Manager" }, { "no" : 2, "name" : "Brian B", "role" : "Engineer" }, { "no" : 3, "name" : "Chris C", "role" : "Sales rep" } ]
input.csv:
1,"Alan A","Manager"
2,"Brian B","Engineer"
3,"Chris C","Sales rep"
input.fix:
1Alan A Manager
2Brian B Engineer
3Chris C Sales rep
Two steps:
First step has:
Second step has:
Reading an XML file via JAXB uses a config file fragment like:
<bean id="jaxbMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>class to be mapped to XML</value>
...
</list>
</property>
</bean>
<bean id="xmlReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
<property name="fragmentRootElementName" value="root element name"/>
<property name="resource" value="name of XML file"/>
<property name="unmarshaller" ref="jaxbMarshaller"/>
</bean>
Writing to a database via JPA uses a config file fragment like:
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseWriter" class="org.springframework.batch.item.database.JpaItemWriter">
<property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>
Reading from a database via JPA uses a config file fragment like:
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
<property name="entityManagerFactory" ref="entityManagerFactory" />
<property name="queryString" value="JPQL query"/>
</bean>
Note: JPQL query - not SQL query!
Writing an XML file via JAXB uses a config file fragment like:
<bean id="jaxbMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>class to be mapped to XML</value>
...
</list>
</property>
</bean>
<bean id="xmlWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" value="name of XML file"/>
<property name="marshaller" ref="jaxbMarshaller"/>
<property name="rootTagName" value="root element name"/>
<property name="overwriteOutput" value="true"/>
</bean>
xmldbxml.xml (complete Spring Batch configuration):
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
<!-- support beans -->
<bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="mysql"/>
</bean>
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="repoTransactionManager"/>
</bean>
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository"/>
</bean>
<bean id="jaxbMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>batch.basic.EmployeeJAXB</value>
</list>
</property>
</bean>
<!-- step1 beans -->
<bean id="employeeXmlReader" class="org.springframework.batch.item.xml.StaxEventItemReader">
<property name="fragmentRootElementName" value="employee"/>
<property name="resource" value="C:/work/batch/input.xml"/>
<property name="unmarshaller" ref="jaxbMarshaller"/>
</bean>
<bean id="copyJAXB2JPAProcessor" class="batch.basic.CopyProcessor">
<property name="fromClass" value="batch.basic.EmployeeJAXB"/>
<property name="toClass" value="batch.basic.EmployeeJPA"/>
</bean>
<bean id="employeeJPAWriter" class="org.springframework.batch.item.database.JpaItemWriter">
<property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>
<!-- step2 beans -->
<bean id="employeeJPAReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
<property name="entityManagerFactory" ref="entityManagerFactory" />
<property name="queryString" value="SELECT e FROM EmployeeJPA AS e"/>
</bean>
<bean id="copyJPA2JAXBProcessor" class="batch.basic.CopyProcessor">
<property name="fromClass" value="batch.basic.EmployeeJPA"/>
<property name="toClass" value="batch.basic.EmployeeJAXB"/>
</bean>
<bean id="employeeXmlWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" value="C:/work/batch/output.xml"/>
<property name="marshaller" ref="jaxbMarshaller"/>
<property name="rootTagName" value="employees"/>
<property name="overwriteOutput" value="true"/>
</bean>
<!-- job of step1 and step2 -->
<batch:job id="testJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeXmlReader" writer="employeeJPAWriter" processor="copyJAXB2JPAProcessor" commit-interval="1"/>
</batch:tasklet>
<batch:next on="COMPLETED" to="step2"/>
<batch:end on="FAILED"/>
</batch:step>
<batch:step id="step2">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeJPAReader" writer="employeeXmlWriter" processor="copyJPA2JAXBProcessor" commit-interval="1"/>
</batch:tasklet>
</batch:step>
</batch:job>
</beans>
EmployeeJAXB.java (data class with JAXB annotations):
package batch.basic;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
@XmlRootElement(name="employee")
public class EmployeeJAXB {
private int no;
private String name;
private String role;
public EmployeeJAXB() {
this(0, null, null);
}
public EmployeeJAXB(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
@XmlAttribute
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
@XmlElement
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
@XmlElement
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
EmployeeJPA.java (data class with JPA annotations):
package batch.basic;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
@Entity
@Table(name="employee")
public class EmployeeJPA {
private int no;
private String name;
private String role;
public EmployeeJPA() {
this(0, null, null);
}
public EmployeeJPA(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
@Id
@Column(name="no")
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
@Column(name="name")
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
@Column(name="role")
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
Main.java (standard main program):
package batch.basic;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;
public class Main {
public static void main(String[] args) throws Exception {
@SuppressWarnings("resource")
ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/xmldbxml.xml");
JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
Job job = (Job)ctx.getBean("testJob");
JobExecution exe = jobLauncher.run(job, new JobParameters());
while(exe.isRunning()) {
System.out.print("*");
Thread.sleep(10);
}
System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
}
}
Two steps:
First step has:
Second step has:
Reading a JSON file via Jackson uses a config file fragment like:
<bean id="jsonReader" class="org.springframework.batch.item.json.JsonItemReader">
<property name="resource" value="name of JSON file"/>
<property name="jsonObjectReader">
<bean class="org.springframework.batch.item.json.JacksonJsonObjectReader">
<constructor-arg value="class to be read"/>
</bean>
</property>
</bean>
Writing to a database via JPA uses a config file fragment like:
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseWriter" class="org.springframework.batch.item.database.JpaItemWriter">
<property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>
Reading from a database via JPA uses a config file fragment like:
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
<property name="entityManagerFactory" ref="entityManagerFactory" />
<property name="queryString" value="JPQL query"/>
</bean>
Note: JPQL query - not SQL query!
Writing a JSON file via Jackson uses a config file fragment like:
<bean id="jsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
<constructor-arg value="name of JSON file"/>
<constructor-arg>
<bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
</constructor-arg>
</bean>
jsondbjson.xml (complete Spring Batch configuration):
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
<!-- support beans -->
<bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="mysql"/>
</bean>
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="repoTransactionManager"/>
</bean>
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository"/>
</bean>
<!-- step1 beans -->
<bean id="employeeJsonReader" class="org.springframework.batch.item.json.JsonItemReader">
<property name="resource" value="C:/work/batch/input.json"/>
<property name="jsonObjectReader">
<bean class="org.springframework.batch.item.json.JacksonJsonObjectReader">
<constructor-arg value="batch.basic.EmployeePlain"/>
</bean>
</property>
</bean>
<bean id="copyPlain2JPAProcessor" class="batch.basic.CopyProcessor">
<property name="fromClass" value="batch.basic.EmployeePlain"/>
<property name="toClass" value="batch.basic.EmployeeJPA"/>
</bean>
<bean id="employeeJPAWriter" class="org.springframework.batch.item.database.JpaItemWriter">
<property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>
<!-- step2 beans -->
<bean id="employeeJPAReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
<property name="entityManagerFactory" ref="entityManagerFactory" />
<property name="queryString" value="SELECT e FROM EmployeeJPA AS e"/>
</bean>
<bean id="copyJPA2PlainProcessor" class="batch.basic.CopyProcessor">
<property name="fromClass" value="batch.basic.EmployeeJPA"/>
<property name="toClass" value="batch.basic.EmployeePlain"/>
</bean>
<bean id="employeeJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
<constructor-arg value="C:/work/batch/output.json"/>
<constructor-arg>
<bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
</constructor-arg>
</bean>
<!-- job of step1 and step2 -->
<batch:job id="testJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeJsonReader" writer="employeeJPAWriter" processor="copyPlain2JPAProcessor" commit-interval="1"/>
</batch:tasklet>
<batch:next on="COMPLETED" to="step2"/>
<batch:end on="FAILED"/>
</batch:step>
<batch:step id="step2">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeJPAReader" writer="employeeJsonWriter" processor="copyJPA2PlainProcessor" commit-interval="1"/>
</batch:tasklet>
</batch:step>
</batch:job>
</beans>
EmployeePlain.java (data class):
package batch.basic;
public class EmployeePlain {
private int no;
private String name;
private String role;
public EmployeePlain() {
this(0, null, null);
}
public EmployeePlain(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
EmployeeJPA.java (data class with JPA annotations):
package batch.basic;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
@Entity
@Table(name="employee")
public class EmployeeJPA {
private int no;
private String name;
private String role;
public EmployeeJPA() {
this(0, null, null);
}
public EmployeeJPA(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
@Id
@Column(name="no")
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
@Column(name="name")
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
@Column(name="role")
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
Main.java (standard main program):
package batch.basic;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;
public class Main {
public static void main(String[] args) throws Exception {
@SuppressWarnings("resource")
ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/jsondbjson.xml");
JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
Job job = (Job)ctx.getBean("testJob");
JobExecution exe = jobLauncher.run(job, new JobParameters());
while(exe.isRunning()) {
System.out.print("*");
Thread.sleep(10);
}
System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
}
}
Two steps:
First step has:
Second step has:
Reading a CSV file uses a config file fragment like:
<bean id="data bean name" class="class to be read"/>
<bean id="csvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="name of CSV file"/>
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value=","/>
<property name="names" value="list of fields"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="data bean name"/>
</bean>
</property>
</bean>
</property>
</bean>
Writing to a database via JPA uses a config file fragment like:
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseWriter" class="org.springframework.batch.item.database.JpaItemWriter">
<property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>
Reading from a database via JPA uses a config file fragment like:
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
<property name="entityManagerFactory" ref="entityManagerFactory" />
<property name="queryString" value="JPQL query"/>
</bean>
Note: JPQL query - not SQL query!
Writing a CSV file uses a config file fragment like:
<bean id="csvWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
<property name="resource" value="name of CSV file"/>
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
<property name="delimiter" value=","/>
<property name="quoteCharacter" value="&quot;"/>
<property name="fieldExtractor">
<bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
<property name="names" value="list of fields"/>
</bean>
</property>
</bean>
</property>
</bean>
csvdbcsv.xml (complete Spring Batch configuration):
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
<!-- support beans -->
<bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="mysql"/>
</bean>
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="repoTransactionManager"/>
</bean>
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository"/>
</bean>
<bean id="employeeBean" class="batch.basic.EmployeePlain"/>
<!-- step1 beans -->
<bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="C:/work/batch/input.csv"/>
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value=","/>
<property name="names" value="no,name,role"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="employeeBean"/>
</bean>
</property>
</bean>
</property>
</bean>
<bean id="copyPlain2JPAProcessor" class="batch.basic.CopyProcessor">
<property name="fromClass" value="batch.basic.EmployeePlain"/>
<property name="toClass" value="batch.basic.EmployeeJPA"/>
</bean>
<bean id="employeeJPAWriter" class="org.springframework.batch.item.database.JpaItemWriter">
<property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>
<!-- step2 beans -->
<bean id="employeeJPAReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
<property name="entityManagerFactory" ref="entityManagerFactory" />
<property name="queryString" value="SELECT e FROM EmployeeJPA AS e"/>
</bean>
<bean id="copyJPA2PlainProcessor" class="batch.basic.CopyProcessor">
<property name="fromClass" value="batch.basic.EmployeeJPA"/>
<property name="toClass" value="batch.basic.EmployeePlain"/>
</bean>
<bean id="employeeCsvWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
<property name="resource" value="C:/work/batch/output.csv"/>
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineAggregator">
<property name="delimiter" value=","/>
<!--
<property name="quoteCharacter" value="&quot;"/>
-->
<property name="fieldExtractor">
<bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
<property name="names" value="no,name,role"/>
</bean>
</property>
</bean>
</property>
</bean>
<!-- job of step1 and step2 -->
<batch:job id="testJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeCsvReader" writer="employeeJPAWriter" processor="copyPlain2JPAProcessor" commit-interval="1"/>
</batch:tasklet>
<batch:next on="COMPLETED" to="step2"/>
<batch:end on="FAILED"/>
</batch:step>
<batch:step id="step2">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeJPAReader" writer="employeeCsvWriter" processor="copyJPA2PlainProcessor" commit-interval="1"/>
</batch:tasklet>
</batch:step>
</batch:job>
</beans>
Note: Spring Batch 4.x does not support quoteCharacter - Spring Batch 5.x does.
EmployeePlain.java (data class):
package batch.basic;
public class EmployeePlain {
private int no;
private String name;
private String role;
public EmployeePlain() {
this(0, null, null);
}
public EmployeePlain(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
EmployeeJPA.java (data class with JPA annotations):
package batch.basic;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
@Entity
@Table(name="employee")
public class EmployeeJPA {
private int no;
private String name;
private String role;
public EmployeeJPA() {
this(0, null, null);
}
public EmployeeJPA(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
@Id
@Column(name="no")
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
@Column(name="name")
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
@Column(name="role")
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
Main.java (standard main program):
package batch.basic;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;
public class Main {
public static void main(String[] args) throws Exception {
@SuppressWarnings("resource")
ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/csvdbcsv.xml");
JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
Job job = (Job)ctx.getBean("testJob");
JobExecution exe = jobLauncher.run(job, new JobParameters());
while(exe.isRunning()) {
System.out.print("*");
Thread.sleep(10);
}
System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
}
}
To use quotes in output with Spring Batch 4.x, or to use a custom format, a custom line mapper and a custom line aggregator can be configured:
<bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="C:/work/batch/input.csv"/>
<property name="lineMapper">
<bean class="batch.basic.EmployeeCSVSupport"/>
</property>
</bean>
<bean id="employeeCsvWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
<property name="resource" value="C:/work/batch/output.csv"/>
<property name="lineAggregator">
<bean class="batch.basic.EmployeeCSVSupport"/>
</property>
</bean>
EmployeeCSVSupport.java:
package batch.basic;
import org.springframework.batch.item.file.LineMapper;
import org.springframework.batch.item.file.transform.LineAggregator;
public class EmployeeCSVSupport implements LineMapper<EmployeePlain>, LineAggregator<EmployeePlain> {
@Override
public String aggregate(EmployeePlain emp) {
return String.format("%d,\"%s\",\"%s\"", emp.getNo(), emp.getName(), emp.getRole());
}
@Override
public EmployeePlain mapLine(String line, int lineNo) throws Exception {
String[] parts = line.split(",");
EmployeePlain res = new EmployeePlain();
res.setNo(Integer.parseInt(parts[0]));
res.setName(parts[1].replace("\"", ""));
res.setRole(parts[2].replace("\"", ""));
return res;
}
}
Real code may need to be a little more complex to handle quotes within the string value etc., but the point is that the conversion between String line and object can be done 100% custom.
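As an illustration of what "a little more complex" could look like, here is a sketch of a quote-aware split and a quoting helper in plain Java. It assumes the common convention that a quote inside a quoted value is written as two quotes; that convention is my assumption, not something the example data requires. The class `CsvQuoteSketch` is hypothetical; its methods could be called from `mapLine` and `aggregate` in a class like the one above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper - a sketch, not a complete CSV implementation.
class CsvQuoteSketch {
    // Split a line on commas, ignoring commas inside quoted values
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        cur.append('"'); // doubled quote means a literal quote
                        i++;
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    cur.append(c);
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == ',') {
                fields.add(cur.toString()); // field separator
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString()); // last field
        return fields;
    }

    // Wrap a value in quotes, doubling any quote inside it
    static String quote(String value) {
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        String line = "4," + quote("Dave \"D\" D") + "," + quote("Sales, EMEA");
        System.out.println(line);
        System.out.println(split(line));
    }
}
```

Values spanning multiple lines are not handled here; for anything beyond simple data, a dedicated CSV library is probably the safer choice.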
Two steps:
First step has:
Second step has:
Reading a fixed width file uses a config file fragment like:
<bean id="data bean name" class="class to be read"/>
<bean id="employeeFixReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="name of fixed width file"/>
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
<property name="columns" value="list of columns"/>
<property name="names" value="list of fields"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="data bean name"/>
</bean>
</property>
</bean>
</property>
</bean>
Writing to a database via JPA uses a config file fragment like:
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseWriter" class="org.springframework.batch.item.database.JpaItemWriter">
<property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>
Reading from a database via JPA uses a config file fragment like:
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="name of persistence unit in persistence.xml"/>
</bean>
<bean id="databaseReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
<property name="entityManagerFactory" ref="entityManagerFactory" />
<property name="queryString" value="JPQL query"/>
</bean>
Note: JPQL query - not SQL query!
Writing a fixed width file uses a config file fragment like:
<bean id="employeeFixWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
<property name="resource" value="name of fixed width file"/>
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.FormatterLineAggregator">
<property name="format" value="printf style format"/>
<property name="fieldExtractor">
<bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
<property name="names" value="list of fields"/>
</bean>
</property>
</bean>
</property>
</bean>
fixdbfix.xml (complete Spring Batch configuration):
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
<!-- support beans -->
<bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="mysql"/>
</bean>
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="repoTransactionManager"/>
</bean>
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository"/>
</bean>
<bean id="employeeBean" class="batch.basic.EmployeePlain"/>
<!-- step1 beans -->
<bean id="employeeFixReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="C:/work/batch/input.Fix"/>
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.FixedLengthTokenizer">
<property name="columns" value="1-8,9-40,41-72"/>
<property name="names" value="no,name,role"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="employeeBean"/>
</bean>
</property>
</bean>
</property>
</bean>
<bean id="copyPlain2JPAProcessor" class="batch.basic.CopyProcessor">
<property name="fromClass" value="batch.basic.EmployeePlain"/>
<property name="toClass" value="batch.basic.EmployeeJPA"/>
</bean>
<bean id="employeeJPAWriter" class="org.springframework.batch.item.database.JpaItemWriter">
<property name="entityManagerFactory" ref="entityManagerFactory"/>
</bean>
<!-- step2 beans -->
<bean id="employeeJPAReader" class="org.springframework.batch.item.database.JpaPagingItemReader">
<property name="entityManagerFactory" ref="entityManagerFactory" />
<property name="queryString" value="SELECT e FROM EmployeeJPA AS e"/>
</bean>
<bean id="copyJPA2PlainProcessor" class="batch.basic.CopyProcessor">
<property name="fromClass" value="batch.basic.EmployeeJPA"/>
<property name="toClass" value="batch.basic.EmployeePlain"/>
</bean>
<bean id="employeeFixWriter" class="org.springframework.batch.item.file.FlatFileItemWriter">
<property name="resource" value="C:/work/batch/output.Fix"/>
<property name="lineAggregator">
<bean class="org.springframework.batch.item.file.transform.FormatterLineAggregator">
<property name="format" value="%8d%-32s%-32s"/>
<property name="fieldExtractor">
<bean class="org.springframework.batch.item.file.transform.BeanWrapperFieldExtractor">
<property name="names" value="no,name,role"/>
</bean>
</property>
</bean>
</property>
</bean>
<!-- job of step1 and step2 -->
<batch:job id="testJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeFixReader" writer="employeeJPAWriter" processor="copyPlain2JPAProcessor" commit-interval="1"/>
</batch:tasklet>
<batch:next on="COMPLETED" to="step2"/>
<batch:end on="FAILED"/>
</batch:step>
<batch:step id="step2">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeJPAReader" writer="employeeFixWriter" processor="copyJPA2PlainProcessor" commit-interval="1"/>
</batch:tasklet>
</batch:step>
</batch:job>
</beans>
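The column list 1-8,9-40,41-72 and the format %8d%-32s%-32s in fixdbfix.xml describe the same 72 character record. As a rough check outside Spring Batch, the plain Java sketch below formats a record and cuts it apart again. The class FixedWidthDemo is hypothetical, and the substring and trim calls are only my approximation of reading 1-based inclusive column ranges, not how the tokenizer is implemented.

```java
// Hypothetical helper, not part of Spring Batch
public class FixedWidthDemo {
    // Same layout as the format in fixdbfix.xml:
    // number right-aligned in 8 characters, two strings left-aligned in 32 characters each
    public static String format(int no, String name, String role) {
        return String.format("%8d%-32s%-32s", no, name, role);
    }

    // Cuts a line using the 1-based inclusive ranges 1-8, 9-40 and 41-72 and trims the padding
    public static String[] parse(String line) {
        return new String[] {
            line.substring(0, 8).trim(),
            line.substring(8, 40).trim(),
            line.substring(40, 72).trim()
        };
    }

    public static void main(String[] args) {
        String line = format(1, "Alan A", "Manager");
        System.out.println(line.length());
        String[] parts = parse(line);
        System.out.println(parts[0] + "|" + parts[1] + "|" + parts[2]);
    }
}
```

Note that %-32s pads short values but does not truncate long ones, so a name longer than 32 characters would shift the following column.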
EmployeePlain.java (data class):
package batch.basic;
public class EmployeePlain {
private int no;
private String name;
private String role;
public EmployeePlain() {
this(0, null, null);
}
public EmployeePlain(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
EmployeeJPA.java (data class with JPA annotations):
package batch.basic;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;
@Entity
@Table(name="employee")
public class EmployeeJPA {
private int no;
private String name;
private String role;
public EmployeeJPA() {
this(0, null, null);
}
public EmployeeJPA(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
@Id
@Column(name="no")
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
@Column(name="name")
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
@Column(name="role")
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
Main.java (standard main program):
package batch.basic;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;
public class Main {
public static void main(String[] args) throws Exception {
@SuppressWarnings("resource")
ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/fixdbfix.xml");
JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
Job job = (Job)ctx.getBean("testJob");
JobExecution exe = jobLauncher.run(job, new JobParameters());
while(exe.isRunning()) {
System.out.print("*");
Thread.sleep(10);
}
System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
}
}
In the real world it is not always reading from one file and writing to one file. Sometimes it is reading from multiple files or writing to multiple files.
Spring Batch supports that.
Scenario:
The way to achieve this is to configure a reader that delegates each input file to another reader:
<bean id="someReader" class="...">
<!-- no resource property -->
...
</bean>
<bean id="multiReader" class="org.springframework.batch.item.file.MultiResourceItemReader">
<property name="resources" value="pattern for input files"/>
<property name="delegate" ref="someReader"/>
</bean>
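The delegation idea can be sketched in plain Java without Spring Batch. The class MultiSourceReader below is hypothetical: it walks through a list of sources one at a time and returns null when all of them are exhausted, mirroring the convention that a reader signals end of input with null. It is only meant to show the idea, not how MultiResourceItemReader is implemented.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of a reader that delegates over several sources
public class MultiSourceReader<T> {
    private final Iterator<? extends Iterable<T>> sources;
    private Iterator<T> current;

    public MultiSourceReader(List<? extends Iterable<T>> sources) {
        this.sources = sources.iterator();
    }

    // Returns the next item, or null when every source is exhausted
    public T read() {
        while (current == null || !current.hasNext()) {
            if (!sources.hasNext()) {
                return null; // no more sources
            }
            current = sources.next().iterator(); // switch to the next source
        }
        return current.next();
    }

    public static void main(String[] args) {
        MultiSourceReader<String> reader = new MultiSourceReader<>(Arrays.asList(
            Arrays.asList("a1", "a2"),
            Arrays.<String>asList(),
            Arrays.asList("c1")));
        for (String item = reader.read(); item != null; item = reader.read()) {
            System.out.println(item);
        }
    }
}
```

The caller sees one continuous stream of items and does not need to know how many sources are behind it, which is why the rest of the step configuration stays unchanged.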
We will do an example with multiple CSV files being combined to one XML file.
Input files:
frag_1.csv:
1,"Alan A","Manager"
frag_2.csv:
2,"Brian B","Engineer"
frag_3.csv:
2,"Brian B","Engineer"
Desired output file:
comb.xml:
<?xml version="1.0" encoding="UTF-8"?><employees><employee no="1"><name>Alan A</name><role>Manager</role></employee><employee no="2"><name>Brian B</name><role>Engineer</role></employee><employee no="2"><name>Brian B</name><role>Engineer</role></employee></employees>
fanin.xml (complete Spring Batch configuration):
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
<!-- support beans -->
<bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="mysql"/>
</bean>
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="repoTransactionManager"/>
</bean>
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository"/>
</bean>
<bean id="jaxbMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>batch.flow.EmployeeJAXB</value>
</list>
</property>
</bean>
<bean id="employeeBean" class="batch.flow.EmployeePlain"/>
<!-- step1 beans -->
<bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value=","/>
<property name="names" value="no,name,role"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="employeeBean"/>
</bean>
</property>
</bean>
</property>
</bean>
<bean id="employeeMultiReader" class="org.springframework.batch.item.file.MultiResourceItemReader">
<property name="resources" value="C:/work/batch/frag_*.csv"/>
<property name="delegate" ref="employeeCsvReader"/>
</bean>
<bean id="copyCSV2XMLProcessor" class="batch.flow.CopyProcessor">
<property name="fromClass" value="batch.flow.EmployeePlain"/>
<property name="toClass" value="batch.flow.EmployeeJAXB"/>
</bean>
<bean id="employeeXmlWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" value="C:/work/batch/comb.xml"/>
<property name="marshaller" ref="jaxbMarshaller"/>
<property name="rootTagName" value="employees"/>
<property name="overwriteOutput" value="true"/>
</bean>
<!-- job of step1 -->
<batch:job id="testJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeMultiReader" writer="employeeXmlWriter" processor="copyCSV2XMLProcessor" commit-interval="1"/>
</batch:tasklet>
</batch:step>
</batch:job>
</beans>
EmployeePlain.java (data class):
package batch.flow;
public class EmployeePlain {
private int no;
private String name;
private String role;
public EmployeePlain() {
this(0, null, null);
}
public EmployeePlain(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
EmployeeJAXB.java (data class with JAXB annotations):
package batch.flow;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
@XmlRootElement(name="employee")
public class EmployeeJAXB {
private int no;
private String name;
private String role;
public EmployeeJAXB() {
this(0, null, null);
}
public EmployeeJAXB(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
@XmlAttribute
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
@XmlElement
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
@XmlElement
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
Main.java (standard main program):
package batch.flow;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;
public class Main {
public static void main(String[] args) throws Exception {
@SuppressWarnings("resource")
ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/fanin.xml");
JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
Job job = (Job)ctx.getBean("testJob");
JobExecution exe = jobLauncher.run(job, new JobParameters());
while(exe.isRunning()) {
System.out.print("*");
Thread.sleep(10);
}
System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
}
}
Scenario:
The way to achieve this is to configure a writer that delegates to multiple writers:
<bean id="firstWriter" class="...">
...
</bean>
<bean id="secondWriter" class="...">
...
</bean>
...
<bean id="multiWriter" class="org.springframework.batch.item.support.CompositeItemWriter">
<property name="delegates">
<list>
<ref bean="firstWriter"/>
<ref bean="secondWriter"/>
...
</list>
</property>
</bean>
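What such a composite writer does can be sketched in a few lines of plain Java. The class FanOutWriter is hypothetical and uses java.util.function.Consumer as a stand-in for a writer; it simply hands the same chunk of items to every delegate in list order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration of a writer that delegates to several writers
public class FanOutWriter<T> {
    private final List<Consumer<List<T>>> delegates;

    public FanOutWriter(List<Consumer<List<T>>> delegates) {
        this.delegates = delegates;
    }

    // Passes the same chunk of items to every delegate, in list order
    public void write(List<T> items) {
        for (Consumer<List<T>> delegate : delegates) {
            delegate.accept(items);
        }
    }

    public static void main(String[] args) {
        List<String> xml = new ArrayList<>();
        List<String> json = new ArrayList<>();
        // Two stand-in writers that collect formatted text in memory
        Consumer<List<String>> xmlWriter = items -> items.forEach(s -> xml.add("<name>" + s + "</name>"));
        Consumer<List<String>> jsonWriter = items -> items.forEach(s -> json.add("{\"name\":\"" + s + "\"}"));
        FanOutWriter<String> writer = new FanOutWriter<>(Arrays.asList(xmlWriter, jsonWriter));
        writer.write(Arrays.asList("Alan A", "Brian B"));
        System.out.println(xml);
        System.out.println(json);
    }
}
```

Every delegate receives every item, so all delegates must accept the same item type.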
We will do an example with one CSV file being converted to both an XML file and a JSON file.
Input file:
input.csv:
1,"Alan A","Manager"
2,"Brian B","Engineer"
3,"Chris C","Sales rep"
Output files:
output.xml:
<?xml version="1.0" encoding="UTF-8"?><employees><employee no="1"><name>Alan A</name><role>Manager</role></employee><employee no="2"><name>Brian B</name><role>Engineer</role></employee><employee no="3"><name>Chris C</name><role>Sales rep</role></employee></employees>
output.json:
[
{"no":1,"name":"Alan A","role":"Manager"},
{"no":2,"name":"Brian B","role":"Engineer"},
{"no":3,"name":"Chris C","role":"Sales rep"}
]
fanout.xml (complete Spring Batch configuration):
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
<!-- support beans -->
<bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="mysql"/>
</bean>
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="repoTransactionManager"/>
</bean>
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository"/>
</bean>
<bean id="jaxbMarshaller" class="org.springframework.oxm.jaxb.Jaxb2Marshaller">
<property name="classesToBeBound">
<list>
<value>batch.flow.EmployeeJAXB</value>
</list>
</property>
</bean>
<bean id="employeeBean" class="batch.flow.EmployeePlain"/>
<!-- step1 beans -->
<bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="C:/work/batch/input.csv"/>
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value=","/>
<property name="names" value="no,name,role"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="employeeBean"/>
</bean>
</property>
</bean>
</property>
</bean>
<bean id="copyCSV2XMLProcessor" class="batch.flow.CopyProcessor">
<property name="fromClass" value="batch.flow.EmployeePlain"/>
<property name="toClass" value="batch.flow.EmployeeJAXB"/>
</bean>
<bean id="employeeXmlWriter" class="org.springframework.batch.item.xml.StaxEventItemWriter">
<property name="resource" value="C:/work/batch/output.xml"/>
<property name="marshaller" ref="jaxbMarshaller"/>
<property name="rootTagName" value="employees"/>
<property name="overwriteOutput" value="true"/>
</bean>
<bean id="employeeJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
<constructor-arg value="C:/work/batch/output.json"/>
<constructor-arg>
<bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
</constructor-arg>
</bean>
<bean id="employeeMultiWriter" class="org.springframework.batch.item.support.CompositeItemWriter">
<property name="delegates">
<list>
<ref bean="employeeXmlWriter"/>
<ref bean="employeeJsonWriter"/>
</list>
</property>
</bean>
<!-- job of step1 -->
<batch:job id="testJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeCsvReader" writer="employeeMultiWriter" processor="copyCSV2XMLProcessor" commit-interval="1"/>
</batch:tasklet>
</batch:step>
</batch:job>
</beans>
EmployeePlain.java (data class):
package batch.flow;
public class EmployeePlain {
private int no;
private String name;
private String role;
public EmployeePlain() {
this(0, null, null);
}
public EmployeePlain(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
EmployeeJAXB.java (data class with JAXB annotations):
package batch.flow;
import javax.xml.bind.annotation.XmlAttribute;
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
@XmlRootElement(name="employee")
public class EmployeeJAXB {
private int no;
private String name;
private String role;
public EmployeeJAXB() {
this(0, null, null);
}
public EmployeeJAXB(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
@XmlAttribute
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
@XmlElement
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
@XmlElement
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
Main.java (standard main program):
package batch.flow;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;
public class Main {
public static void main(String[] args) throws Exception {
@SuppressWarnings("resource")
ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/fanout.xml");
JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
Job job = (Job)ctx.getBean("testJob");
JobExecution exe = jobLauncher.run(job, new JobParameters());
while(exe.isRunning()) {
System.out.print("*");
Thread.sleep(10);
}
System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
}
}
Scenario:
The way to achieve this is to configure a classifying writer with a classifier class and multiple writers:
<bean id="firstWriter" class="...">
...
</bean>
<bean id="secondWriter" class="...">
...
</bean>
...
<bean id="selector" class="classifier class">
<property name="first" ref="firstWriter"/>
<property name="second" ref="secondWriter"/>
...
</bean>
<bean id="multiWriter" class="org.springframework.batch.item.support.ClassifierCompositeItemWriter">
<property name="classifier" ref="selector"/>
</bean>
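The routing itself can be sketched in plain Java. The class RoleRouter and the rule that managers are supervisors are my assumptions for illustration. In Spring Batch the classifier bean would instead return the writer to use for each item; the sketch returns a target name so that it runs without any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of routing items to different targets
public class RoleRouter {
    // Assumed rule: managers go to "supervisor", everybody else to "individual"
    public static String classify(String role) {
        return "Manager".equals(role) ? "supervisor" : "individual";
    }

    // Groups items by the target chosen for each one, keeping input order within a target
    public static <T> Map<String, List<T>> route(List<T> items, Function<T, String> classifier) {
        Map<String, List<T>> byTarget = new LinkedHashMap<>();
        for (T item : items) {
            byTarget.computeIfAbsent(classifier.apply(item), k -> new ArrayList<>()).add(item);
        }
        return byTarget;
    }

    public static void main(String[] args) {
        List<String> roles = Arrays.asList("Manager", "Engineer", "Sales rep");
        System.out.println(route(roles, RoleRouter::classify));
    }
}
```

Unlike the fan-out case, each item ends up in exactly one target.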
We will do an example with one CSV file being split out to two JSON files.
Input file:
input.csv:
1,"Alan A","Manager"
2,"Brian B","Engineer"
3,"Chris C","Sales rep"
Output files:
supervisor.json:
[
{"no":1,"name":"Alan A","role":"Manager"}
]
individual.json:
[
{"no":2,"name":"Brian B","role":"Engineer"},
{"no":3,"name":"Chris C","role":"Sales rep"}
]
selfanout.xml (complete Spring Batch configuration):
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
<!-- support beans -->
<bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="mysql"/>
</bean>
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="repoTransactionManager"/>
</bean>
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository"/>
</bean>
<bean id="employeeBean" class="batch.flow.EmployeePlain"/>
<!-- step1 beans -->
<bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="C:/work/batch/input.csv"/>
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value=","/>
<property name="names" value="no,name,role"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="employeeBean"/>
</bean>
</property>
</bean>
</property>
</bean>
<bean id="copyProcessor" class="org.springframework.batch.item.support.PassThroughItemProcessor"/>
<bean id="supervisorJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
<constructor-arg value="C:/work/batch/supervisor.json"/>
<constructor-arg>
<bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
</constructor-arg>
</bean>
<bean id="individualJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
<constructor-arg value="C:/work/batch/individual.json"/>
<constructor-arg>
<bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
</constructor-arg>
</bean>
<bean id="selector" class="batch.flow.Selector">
<property name="supervisor" ref="supervisorJsonWriter"/>
<property name="individual" ref="individualJsonWriter"/>
</bean>
<bean id="employeeMultiWriter" class="org.springframework.batch.item.support.ClassifierCompositeItemWriter">
<property name="classifier" ref="selector"/>
</bean>
<!-- job of step1 -->
<batch:job id="testJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeCsvReader" writer="employeeMultiWriter" processor="copyProcessor" commit-interval="1">
<batch:streams>
<batch:stream ref="supervisorJsonWriter"/>
<batch:stream ref="individualJsonWriter"/>
</batch:streams>
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
</beans>
EmployeePlain.java (data class):
package batch.flow;
public class EmployeePlain {
private int no;
private String name;
private String role;
public EmployeePlain() {
this(0, null, null);
}
public EmployeePlain(int no, String name, String role) {
this.no = no;
this.name = name;
this.role = role;
}
public int getNo() {
return no;
}
public void setNo(int no) {
this.no = no;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getRole() {
return role;
}
public void setRole(String role) {
this.role = role;
}
@Override
public String toString() {
return String.format("[%d,%s,%s]", no, name, role);
}
}
Main.java (standard main program):
package batch.flow;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;
public class Main {
public static void main(String[] args) throws Exception {
@SuppressWarnings("resource")
ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/selfanout.xml");
JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
Job job = (Job)ctx.getBean("testJob");
JobExecution exe = jobLauncher.run(job, new JobParameters());
while(exe.isRunning()) {
System.out.print("*");
Thread.sleep(10);
}
System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
}
}
A common problem is what to do in case of an error during processing of a record.
The 3 common options are: skip the record, retry the record, or exit the job.
All of these can be configured in Spring Batch. Skip and retry can be configured for certain exceptions. Exit is default.
We will use the following test data.
big.csv:
1,"Line #1"
2,"Line #2"
3,"Line #3"
4,"Line #4"
5,"Line #5"
6,"Line #6"
7,"Line #7"
8,"Line #8"
9,"Line #9"
10,"Line #10"
Skip exceptions are configured as:
<batch:chunk reader="..." writer="..." processor="..." ... skip-limit="maximum number of records to skip">
<batch:skippable-exception-classes>
<batch:include class="exception class"/>
<batch:include class="exception class"/>
...
</batch:skippable-exception-classes>
</batch:chunk>
We will do an example where even numbers generate an exception and trigger skip.
So expected output is as below.
bigdiscard.json:
[
{"ival":1,"sval":"Line #1"},
{"ival":3,"sval":"Line #3"},
{"ival":5,"sval":"Line #5"},
{"ival":7,"sval":"Line #7"},
{"ival":9,"sval":"Line #9"}
]
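The skip behavior can be simulated in plain Java to see why only the odd numbers end up in the output. The class SkipDemo is hypothetical and strongly simplified: it processes one item at a time, skips an item when the skippable exception is thrown, and gives up once more than skipLimit items would be skipped. Real chunk processing in Spring Batch with transactions and rollback is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical simulation of skip handling, not Spring Batch code
public class SkipDemo {
    // Stands in for a skippable exception class
    public static class DiscardException extends Exception {
        private static final long serialVersionUID = 1L;
    }

    // Same rule as the example: even numbers fail
    static int process(int ival) throws DiscardException {
        if (ival % 2 == 0) {
            throw new DiscardException();
        }
        return ival;
    }

    // Skips failing items, but fails the whole run when more than skipLimit items would be skipped
    public static List<Integer> run(List<Integer> input, int skipLimit) {
        List<Integer> output = new ArrayList<>();
        int skipped = 0;
        for (int ival : input) {
            try {
                output.add(process(ival));
            } catch (DiscardException ex) {
                if (skipped >= skipLimit) {
                    throw new IllegalStateException("Skip limit of " + skipLimit + " exceeded", ex);
                }
                skipped++; // item is dropped and processing continues
            }
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), 1000));
    }
}
```

With a small limit, for example 2, the same input fails on the third even number instead of producing output for the whole file.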
bigdiscard.xml (complete Spring Batch configuration):
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
<!-- support beans -->
<bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="mysql"/>
</bean>
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="repoTransactionManager"/>
</bean>
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository"/>
</bean>
<bean id="dataBean" class="batch.handling.Data"/>
<!-- step1 beans -->
<bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="C:/work/batch/big.csv"/>
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value=","/>
<property name="names" value="ival,sval"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="dataBean"/>
</bean>
</property>
</bean>
</property>
</bean>
<bean id="discardProcessor" class="batch.handling.DiscardProcessor"/>
<bean id="employeeJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
<constructor-arg value="C:/work/batch/bigdiscard.json"/>
<constructor-arg>
<bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
</constructor-arg>
</bean>
<!-- job of step1 -->
<batch:job id="testJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeCsvReader" writer="employeeJsonWriter" processor="discardProcessor" commit-interval="1" skip-limit="1000">
<batch:skippable-exception-classes>
<batch:include class="batch.handling.DiscardProcessor.DiscardException"/>
</batch:skippable-exception-classes>
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
</beans>
Data.java (data class):
package batch.handling;
public class Data {
private int ival;
private String sval;
public int getIval() {
return ival;
}
public void setIval(int ival) {
this.ival = ival;
}
public String getSval() {
return sval;
}
public void setSval(String sval) {
this.sval = sval;
}
}
DiscardProcessor.java (processor simulating errors):
package batch.handling;
import org.springframework.batch.item.ItemProcessor;
public class DiscardProcessor implements ItemProcessor<Data,Data> {
public static class DiscardException extends Exception {
private static final long serialVersionUID = 1L;
}
@Override
public Data process(Data source) throws Exception {
if(source.getIval() % 2 == 0) {
throw new DiscardException();
}
return source;
}
}
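To make the effect of the skip configuration concrete, here is a minimal plain Java sketch of the idea. It is not how Spring Batch implements skipping, and the names SkipSketch, processWithSkips and skipLimit are made up for illustration. Items whose processing throws are left out of the output, and the run fails once more items have been skipped than the limit allows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of skip semantics; not part of Spring Batch.
class SkipSketch {
    // Same rule as DiscardProcessor: even values are rejected.
    static int process(int ival) throws Exception {
        if (ival % 2 == 0) {
            throw new Exception("discard " + ival);
        }
        return ival;
    }

    // Processes all items, leaving out those that fail, as long as
    // the number of skipped items does not exceed skipLimit.
    static List<Integer> processWithSkips(List<Integer> items, int skipLimit) {
        List<Integer> written = new ArrayList<>();
        int skipped = 0;
        for (int item : items) {
            try {
                written.add(process(item));
            } catch (Exception ex) {
                skipped++;
                if (skipped > skipLimit) {
                    throw new IllegalStateException("skip limit exceeded", ex);
                }
            }
        }
        return written;
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        // prints [1, 3, 5, 7, 9]
        System.out.println(processWithSkips(input, 1000));
    }
}
```

With a limit of 1000 and only a handful of rejected items, the limit is never reached, which mirrors the generous skip-limit used in the configuration.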
Main.java (standard main program):
package batch.handling;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;
public class Main {
public static void main(String[] args) throws Exception {
@SuppressWarnings("resource")
ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/bigdiscard.xml");
JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
Job job = (Job)ctx.getBean("testJob");
JobExecution exe = jobLauncher.run(job, new JobParameters());
while(exe.isRunning()) {
System.out.print("*");
Thread.sleep(10);
}
System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
}
}
Exceptions that should trigger a retry are configured as:
<batch:chunk reader="..." writer="..." processor="..." ... retry-limit="maximum number of retries for a single record">
<batch:retryable-exception-classes>
<batch:include class="exception class"/>
<batch:include class="exception class"/>
...
</batch:retryable-exception-classes>
</batch:chunk>
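Conceptually, retry means calling the failing operation again for the same record until it succeeds or the limit is reached. The following plain Java sketch illustrates that idea only. It is not how Spring Batch is implemented internally, and RetrySketch, callWithRetry and maxAttempts are hypothetical names. It also assumes that the limit counts the total number of attempts for a record, including the first one.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration of retry semantics; not part of Spring Batch.
class RetrySketch {
    // Calls the operation until it succeeds or maxAttempts attempts
    // (assumed to include the first one) have failed.
    static <T> T callWithRetry(Callable<T> operation, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception ex) {
                last = ex; // remember the failure and try again
            }
        }
        throw new Exception("gave up after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws Exception {
        int[] calls = { 0 };
        // Simulated flaky operation: fails twice, then succeeds.
        String result = callWithRetry(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new Exception("flaky");
            }
            return "Line #1";
        }, 10);
        // prints "Line #1 after 3 attempts"
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```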
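We will do an example where the processor randomly throws an exception, which triggers a retry of the record.
Since a failed record is retried until it succeeds (within the retry limit), the expected output contains all records, as shown below.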
bigflaky.json:
[
{"ival":1,"sval":"Line #1"},
{"ival":2,"sval":"Line #2"},
{"ival":3,"sval":"Line #3"},
{"ival":4,"sval":"Line #4"},
{"ival":5,"sval":"Line #5"},
{"ival":6,"sval":"Line #6"},
{"ival":7,"sval":"Line #7"},
{"ival":8,"sval":"Line #8"},
{"ival":9,"sval":"Line #9"},
{"ival":10,"sval":"Line #10"}
]
bigflaky.xml (complete Spring Batch configuration):
<beans xmlns="http://www.springframework.org/schema/beans"
xmlns:batch="http://www.springframework.org/schema/batch"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.springframework.org/schema/beans https://www.springframework.org/schema/beans/spring-beans.xsd http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">
<!-- support beans -->
<bean id="dbTransactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
<property name="entityManagerFactory" ref="entityManagerFactory" />
</bean>
<bean id="repoTransactionManager" class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
<bean id="entityManagerFactory" class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
<property name="persistenceUnitName" value="mysql"/>
</bean>
<bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
<property name="transactionManager" ref="repoTransactionManager"/>
</bean>
<bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
<property name="jobRepository" ref="jobRepository"/>
</bean>
<bean id="dataBean" class="batch.handling.Data"/>
<!-- step1 beans -->
<bean id="employeeCsvReader" class="org.springframework.batch.item.file.FlatFileItemReader">
<property name="resource" value="C:/work/batch/big.csv"/>
<property name="lineMapper">
<bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">
<property name="lineTokenizer">
<bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">
<property name="delimiter" value=","/>
<property name="names" value="ival,sval"/>
</bean>
</property>
<property name="fieldSetMapper">
<bean class="org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper">
<property name="prototypeBeanName" value="dataBean"/>
</bean>
</property>
</bean>
</property>
</bean>
<bean id="FlakyProcessor" class="batch.handling.FlakyProcessor"/>
<bean id="employeeJsonWriter" class="org.springframework.batch.item.json.JsonFileItemWriter">
<constructor-arg value="C:/work/batch/bigflaky.json"/>
<constructor-arg>
<bean class="org.springframework.batch.item.json.JacksonJsonObjectMarshaller"/>
</constructor-arg>
</bean>
<!-- job of step1 -->
<batch:job id="testJob">
<batch:step id="step1">
<batch:tasklet transaction-manager="dbTransactionManager">
<batch:chunk reader="employeeCsvReader" writer="employeeJsonWriter" processor="FlakyProcessor" commit-interval="1" retry-limit="10">
<batch:retryable-exception-classes>
<batch:include class="batch.handling.FlakyProcessor.FlakyException"/>
</batch:retryable-exception-classes>
</batch:chunk>
</batch:tasklet>
</batch:step>
</batch:job>
</beans>
FlakyProcessor.java (processor simulating errors):
package batch.handling;
import java.util.Random;
import org.springframework.batch.item.ItemProcessor;
public class FlakyProcessor implements ItemProcessor<Data,Data> {
public static class FlakyException extends Exception {
private static final long serialVersionUID = 1L;
}
private static Random rng = new Random();
@Override
public Data process(Data source) throws Exception {
if(rng.nextDouble() < 0.5) {
throw new FlakyException();
}
return source;
}
}
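A quick calculation suggests why the run is expected to produce all records. This sketch assumes that each call to the processor fails independently with probability 0.5, that retry-limit="10" allows at most 10 attempts per record, and that the input has the 10 records seen in the expected output. The class and method names are made up for illustration.

```java
// Hypothetical helper; not part of Spring Batch.
class RetryOdds {
    // Probability that a single record fails on every one of the attempts.
    static double recordFails(double failureProbability, int attempts) {
        return Math.pow(failureProbability, attempts);
    }

    // Probability that at least one of the records exhausts its attempts.
    static double jobFails(double failureProbability, int attempts, int records) {
        double recordOk = 1.0 - recordFails(failureProbability, attempts);
        return 1.0 - Math.pow(recordOk, records);
    }

    public static void main(String[] args) {
        System.out.printf("record: %.6f%n", recordFails(0.5, 10));
        System.out.printf("job:    %.6f%n", jobFails(0.5, 10, 10));
    }
}
```

Under these assumptions a single record exhausts its attempts with a probability of about 0.001, and at least one of the 10 records does so with a probability of about 0.01. An occasional failed run is therefore possible but unlikely.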
Main.java (standard main program):
package batch.handling;
import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.FileSystemXmlApplicationContext;
public class Main {
public static void main(String[] args) throws Exception {
@SuppressWarnings("resource")
ApplicationContext ctx = new FileSystemXmlApplicationContext("C:/work/batch/bigflaky.xml");
JobLauncher jobLauncher = (JobLauncher)ctx.getBean("jobLauncher");
Job job = (Job)ctx.getBean("testJob");
JobExecution exe = jobLauncher.run(job, new JobParameters());
while(exe.isRunning()) {
System.out.print("*");
Thread.sleep(10);
}
System.out.printf("Done, status = %s %s\n", exe.getExitStatus().getExitCode(), exe.getExitStatus().getExitDescription());
}
}
Spring Batch is a big framework, and the above has shown only a tiny part of what it is capable of. But hopefully it is enough to provide a basic understanding.
I actually like Spring Batch. It targets a rather narrow type of application, but that type of application is widely used in lots of companies.
Version | Date | Description |
---|---|---|
1.0 | March 15th 2025 | Initial version |
Please send comments to Arne Vajhøj