In this article I will show you how to run a Spring Batch job in the SpringSource Application Platform. I ran an early version of this as a little demo for JavaOne, and then again at the London Spring User Group, and thought it might be a good thing to share. The sample code is here.
The bundle configuration is in META-INF/spring/module-context.xml (this is conventional for Platform bundles) - Spring DM picks up all XML files from META-INF/spring. This one just uses Spring to configure and launch an instance of the HSQL Server.
There is an integration test that can be used to check the configuration.
The Eclipse project also contains a launch configuration for an HSQL Swing client, so you can see the database contents in a GUI. Launch it and connect to the server instance with the properties provided in META-INF/batch-hsql.properties in the same project (url=jdbc:hsqldb:hsql://localhost:9005/samples).
The bundle configuration is again in META-INF/spring/module-context.xml, following the same convention. It is a stripped-down version of the simple-job-launcher-context.xml from the Spring Batch samples. It only has to define the beans that will be exported:
<bean id="jobLauncher"
    class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
  <property name="jobRepository" ref="jobRepository" />
</bean>

<bean id="jobRepository"
    class="org.springframework.batch.core.repository.support.JobRepositoryFactoryBean">
  <property name="dataSource" ref="dataSource" />
  <property name="databaseType" value="hsql" />
</bean>

<bean id="transactionManager"
    class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
  <property name="dataSource" ref="dataSource" />
</bean>
The only other configuration there is the transaction advice for the JobRepository (not needed in Spring Batch 1.1). The dataSource reference comes from an OSGi service exposed by the data-source bundle above. To see how that reference is imported and how the local services are exposed to the OSGi Service Registry, we can look at META-INF/spring/osgi-context.xml:
<reference id="dataSource" interface="javax.sql.DataSource" />

<service ref="jobLauncher"
    interface="org.springframework.batch.core.launch.JobLauncher" />
<service ref="jobRepository"
    interface="org.springframework.batch.core.repository.JobRepository" />
<service ref="transactionManager"
    interface="org.springframework.transaction.PlatformTransactionManager" />
This is pretty straightforward use of Spring DM. The important thing is that the module context is kept separate from the OSGi-specific context. This allows us to write an integration test for the module context, without having to deploy to the Platform. Thus we have:
@ContextConfiguration
@RunWith(SpringJUnit4ClassRunner.class)
public class JobLauncherIntegrationTests {

    @Autowired
    private JobLauncher jobLauncher;

    @Test
    public void testLaunchJob() throws Exception {
        assertNotNull(jobLauncher);
    }

}
The test loads the context, adding a local data source definition to replace the OSGi one (see JobLauncherIntegrationTests-context.xml), and then asserts that a job launcher is available. You can run the test directly from Eclipse in the normal way.
...
Export-Package: com.springsource.consulting.batch.support
...
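For context, the full manifest (elided above) would carry that export alongside the bundle's identity headers. A minimal sketch, using only standard OSGi headers and the bundle name seen later in the console listing (the version is illustrative):

```
Manifest-Version: 1.0
Bundle-SymbolicName: job.launcher
Bundle-Version: 1.0.0
Export-Package: com.springsource.consulting.batch.support
```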
If you look in this package you will find a convenience class that can be used by other bundles to launch a job (SimpleJobLauncherBean). The SimpleJobLauncherBean is an ApplicationListener which means that any Spring ApplicationContext that contains one of these will try to launch the job on startup (when the context is loaded). The way it does this is to listen for a ContextRefreshedEvent and then try to launch the job:
try {
    jobLauncher.run(job, converter.getJobParameters(parameters));
} catch (JobExecutionAlreadyRunningException e) {
    logger.error("This job is already running", e);
} catch (JobInstanceAlreadyCompleteException e) {
    logger.info("This job is already complete. "
            + "Maybe you need to change the input parameters?", e);
} catch (JobRestartException e) {
    logger.error("Unspecified restart exception", e);
}
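The launch-on-refresh mechanism can be modelled without any Spring classes. In this minimal stand-in (all names here are mine, not the real API), `ContextRefreshed` plays the role of Spring's ContextRefreshedEvent, `JobRunner` plays the role of the JobLauncher, and the bean guards against launching the same job twice for duplicate refresh events:

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins for ApplicationEvent / ContextRefreshedEvent.
interface ContextEvent {}
class ContextRefreshed implements ContextEvent {}
class ContextClosed implements ContextEvent {}

// Stand-in for the JobLauncher collaborator.
interface JobRunner { void run(String jobName); }

// Models SimpleJobLauncherBean: an ApplicationListener-style callback
// that launches its job when the enclosing context is refreshed.
class OnRefreshJobLauncher {
    private final JobRunner runner;
    private final String jobName;
    private boolean launched; // launch at most once per context lifetime

    OnRefreshJobLauncher(JobRunner runner, String jobName) {
        this.runner = runner;
        this.jobName = jobName;
    }

    void onEvent(ContextEvent event) {
        if (event instanceof ContextRefreshed && !launched) {
            launched = true;
            runner.run(jobName);
        }
    }
}

public class ListenerDemo {
    public static void main(String[] args) {
        List<String> launched = new ArrayList<>();
        OnRefreshJobLauncher bean =
                new OnRefreshJobLauncher(launched::add, "helloWorld");
        bean.onEvent(new ContextRefreshed()); // context startup triggers the job
        bean.onEvent(new ContextRefreshed()); // a duplicate refresh is ignored
        System.out.println(launched);         // [helloWorld]
    }
}
```

The real bean does the same dance with Spring's ApplicationListener contract, plus the exception handling shown above.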
The plan for launching jobs is to simply create a bundle for each job, and have it define one of these SimpleJobLauncherBean instances.
Drop the bundle into the running Server instance. It starts pretty quickly, and since the job is so small in scope you will immediately see the effect in the batch meta-data. In the HSQL Swing GUI you can execute some SQL, e.g.
SELECT * FROM BATCH_STEP_EXECUTION
And see the result, something like this:
STEP_EXECUTION_ID | VERSION | STEP_NAME | ... | STATUS | ... |
---|---|---|---|---|---|
0 | 4 | helloWorldStep | ... | COMPLETED | ... |
This shows that the job was executed (and completed successfully). The configuration for the step is in META-INF/spring/module-context.xml:
<bean
    class="com.springsource.consulting.batch.support.SimpleJobLauncherBean">
  <constructor-arg ref="jobLauncher" />
  <constructor-arg ref="helloWorld" />
  <property name="parameters" value="launch.timestamp=${launch.timestamp}" />
</bean>

<bean id="helloWorld" parent="simpleJob">
  <property name="steps">
    <bean id="helloWorldStep" parent="simpleStep">
      <property name="commitInterval" value="100" />
      <property name="itemReader">
        ...
      </property>
      <property name="itemWriter">
        ...
      </property>
    </bean>
  </property>
</bean>
From the above you can see that we have a regular Spring Batch job configuration (called "helloWorld") with a single step. The step id ("helloWorldStep") was seen already in the database query above, showing that the step had been executed (once). All the step does is read data from a flat file, transforming the lines into domain objects, and writing them to stdout. You can see the result by inspecting the trace logs in the Platform home directory, e.g. if you tail -f serviceability/trace/trace.log | grep -i hello you should see:
[2008-05-30 15:57:04.140] platform-dm-11
com.springsource.consulting.batch.hello.MessageWriter.unknown
I Message: [Hello World]
[2008-05-30 15:57:04.140] platform-dm-11
com.springsource.consulting.batch.hello.MessageWriter.unknown
I Message: [Hello Small World]
If you like you can run the job again, just by editing one of the files in the bundle (e.g. the MANIFEST or one of the Spring config files) and saving it. The tools pick up the change and redeploy the bundle. The way this job is set up it starts every execution with a new set of parameters (using a timestamp) so it should always run successfully.
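Why a fresh timestamp guarantees a successful run: Spring Batch identifies a job instance by the job name plus its parameter values, so identical parameters mean "same instance, already complete" while any new value creates a brand new instance. The sketch below models that identity check with plain maps, parsing a "key=value" string like the `parameters` property above (the class, method names, and comma-separated format are my simplification; the real converter works with Properties and typed JobParameters):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class JobParametersSketch {

    // Parse a comma-separated "key=value" spec into an ordered map.
    static Map<String, String> parse(String spec) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : spec.split(",")) {
            String[] kv = pair.split("=", 2);
            params.put(kv[0].trim(), kv[1].trim());
        }
        return params;
    }

    public static void main(String[] args) {
        Map<String, String> first = parse("launch.timestamp=2008-05-30T15:57:04");
        Map<String, String> again = parse("launch.timestamp=2008-05-30T15:57:04");
        Map<String, String> fresh = parse("launch.timestamp=2008-05-30T16:01:10");

        // Identical parameters identify the same job instance, so a
        // relaunch logs JobInstanceAlreadyCompleteException instead of
        // processing the data again.
        System.out.println(first.equals(again)); // true
        // A fresh timestamp identifies a new instance, so the job runs.
        System.out.println(first.equals(fresh)); // false
    }
}
```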
To signal the end of the job, the SimpleJobLauncherBean simply grabs the enclosing OSGi Bundle instance, and stops it. This is a pretty simple model, but has the benefit that the API is well defined and universally supported by the OSGi platform. It can in principle be extended very flexibly as long as the container (SpringSource Application Platform) can trap those bundle events. These are features we might see in the Batch Personality for the Platform version 2.0. If you have any ideas about what the behaviour should be and what features are required by operators, please help us by commenting on this article.
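The run-then-stop lifecycle can be sketched without an OSGi framework on the classpath. Here `LifecycleBundle` is a hypothetical stand-in for org.osgi.framework.Bundle (the real bean calls the actual Bundle.stop()); the point is that the job's completion is signalled by the bundle's state transition, which is exactly what the console listing below shows:

```java
public class SelfStoppingDemo {

    enum State { ACTIVE, RESOLVED }

    // Stand-in for org.osgi.framework.Bundle.
    static class LifecycleBundle {
        State state = State.ACTIVE;
        void stop() { state = State.RESOLVED; } // mirrors Bundle.stop()
    }

    // Run the job, then stop the enclosing bundle to signal completion.
    static void launchAndStop(LifecycleBundle bundle, Runnable job) {
        try {
            job.run();
        } finally {
            bundle.stop(); // ACTIVE -> RESOLVED, even if the job failed
        }
    }

    public static void main(String[] args) {
        LifecycleBundle bundle = new LifecycleBundle();
        launchAndStop(bundle, () -> System.out.println("job executed"));
        System.out.println(bundle.state); // RESOLVED, as in the ss listing
    }
}
```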
We can verify the state of the job bundle by logging into the Equinox console. If you go to a command line and type telnet localhost 2401 you should see a command prompt from the platform:
osgi>
Type "ss" and hit return, and you will see a list of installed bundles:
osgi> ss
Framework is launched.
id State Bundle
...
86 RESOLVED org.springframework.batch.infrastructure_1.0.0
87 RESOLVED org.springframework.batch.core_1.0.0
88 RESOLVED com.springsource.org.apache.commons.lang_2.4.0
97 ACTIVE job.launcher_1.0.0
99 RESOLVED hello.world_1.0.0
osgi>
So the bundle with id=97 is the job launcher, and it is active. The bundle with id=99 is the hello world job (the ids might be different in your case), and it is resolved, but not active because it was stopped when the job finished executing.
You can restart the job again from the OSGi command line:
osgi> start 99
osgi> ss
Framework is launched.
id State Bundle
...
86 RESOLVED org.springframework.batch.infrastructure_1.0.0
87 RESOLVED org.springframework.batch.core_1.0.0
88 RESOLVED com.springsource.org.apache.commons.lang_2.4.0
97 ACTIVE job.launcher_1.0.0
99 RESOLVED hello.world_1.0.0
osgi>
The job bundle is back to the resolved state, but it has executed the job again, which you can verify from the HSQL GUI or from the trace logs as before.
STEP_EXECUTION_ID | VERSION | STEP_NAME | ... | STATUS | ... |
---|---|---|---|---|---|
0 | 4 | helloWorldStep | ... | COMPLETED | ... |
1 | 4 | helloWorldStep | ... | COMPLETED | ... |
2 | 4 | helloWorldStep | ... | COMPLETED | ... |
If you just tried that, you probably found that on the second and subsequent launches, nothing changes in the database. That's expected because you have restarted a successfully completed job instance, so it won't process the data again. In fact an exception was thrown by the JobLauncher, caught and logged by the SimpleJobLauncherBean (so it shows up in the trace logs).
After installing the SpringSource Eclipse tools, you need to create a server instance. Go to File->New->Other... and find Server->Server. Select SpringSource and the server type below it, and use the browse dialog to locate the Platform installation.
$ find ~/.m2/repository -name \*.jar -exec cp {} bundles/usr \;
No need to restart Eclipse or anything. The "Bundle Dependencies" classpath container should then contain the runtime dependencies you just downloaded. When all Eclipse errors in the Problems view (angry red margin markers) have disappeared, we are ready to go.
I would be delighted to hear from people who have a better way of doing this. Other people have evolved other methods already, but none seemed that convenient to me. Actually, a command-line Maven target would be quite easy to write, but I haven't seen that one yet.
You also don't need the Maven local repo for runtime dependencies at all, in principle. You can open the Platform runtime (right click in the Servers view and Open), and browse for and download dependencies directly to bundles/usr. The only weakness currently (the tools team is working on improving this) is that it doesn't provide any view of the transitive dependencies - you have to know explicitly which bundles are needed. In the case of the samples for this blog, that is easy because the MANIFESTs all have the dependencies completely specified already. It's harder when you aren't sure what they are and have to create the MANIFEST from scratch. For that I'm still using Q4E for now.
A lot of the focus with the Application Platform 1.0 release is on the web tier, and while this is clearly essential (and very tricky to deliver), there are other fish to fry. The 2.0 release will have specific batch-related features (a Batch Personality), so anything we do now will be helpful to flesh out feature requirements for that release. So if you get a chance to try this out and have some constructive comments, especially about operational aspects, they will come in handy when we start to build the Batch Personality.