Migrating a Benerator 0.5 project to version 0.6
Migrate or not?
Four major drivers triggered design decisions for Benerator 0.6:
- moving towards a final syntax for Benerator 1.0
- modifying the service provider interfaces for allowing efficient multithreaded execution
- resolving standard problems of many users
- dropping concepts which were unintuitive, difficult for users or simply were not fit for their purpose
Because of these drivers, I was unfortunately forced to make many changes that are not backwards compatible. If you did not implement custom service providers, migrating a project will be relatively easy. The more Java classes you implemented to extend Benerator, the more difficult a migration is likely to be. So, if your data generation is working fine now, you might decide to avoid the complications of migrating.
Default Script Language
One of the most important changes for user convenience was the introduction of a dedicated expression language, called Benerator Script. It replaces FreeMarker as Benerator's default script language. However, when migrating, you can reduce the effort by explicitly setting FreeMarker as the default script language. So, to keep the variable substitution syntax as {${variable}}, please use:
| <setup ... defaultScript="ftl"> |
Import
The import feature changed: Package import was dropped and the domain import was changed to a plural form, which now supports a comma-separated list of domain names. A platform import was defined as well. So you can use a more compact import syntax, like:
| <import defaults="true" domains="person,address" platforms="xml,db" /> |
create-entities
The name 'create-entities' was limited to entities and confusing for the user, since it had to be applied even when iterating existing data, not generating. So the meaning was split:
Use 'iterate' for iterating over existing data from a 'source'. So a statement <create-entities source="db" name="db_order" /> is now correctly written as:
| <iterate source="db" type="db_order" /> |
For real data generation from scratch, use 'generate':
| <generate type="db_order"> |
As you may have noticed from the examples above, something more has changed: Benerator now distinguishes between the 'type' of an entity and its 'name'. When migrating, the existing semantics are kept if you replace every create-entities 'name' attribute with a 'type' attribute: <create-entities name="db_order" /> becomes
| <generate type="db_order" /> |
BTW: you can specify a name as well; the relation is like that of Java classes (type) and instances (name). So you can now distinguish generated objects of the same type in nested generation elements by giving them different names.
update-entities
Since update-entities only made sense for databases, but Benerator is a universal data generation tool, this element was dropped. Instead, you should now iterate over a database's data explicitly and use a special consumer, which the database provides when you call its updater() method.
Example: Change <update-entities name="db_order" source="db"> to
| <iterate source="db" type="db_order" consumer="db.updater()"> |
pagesize
The capitalization has been unified, so you need to rename the pagesize attribute to pageSize.
ids and references
One more confusing thing in Benerator 0.5 was that you were forced to use 'attribute' elements if you wanted to use special features for 'id' or 'reference' elements, e.g. setting them by a script expression. You now have (almost) the full feature set for 'id' and 'reference' elements, e.g.
| <reference name="" type="int" values="1,2,3" /> |
Constructor invocation
You can instantiate generators, converters, validators and consumers by specifying a constructor invocation. Its syntax has changed slightly to a C/Java-like form: generator="com.my.SpecialClass(1234)" becomes
| generator="new com.my.SpecialClass(1234)" |
Id Providers
The whole IdProvider concept was dropped, since it was too specific and cumbersome and caused code duplication with generators of similar semantics. Thus, existing functionality has been moved to generator classes, which now need to be used explicitly. A configuration <id name="id" strategy="increment"/> needs to be replaced with
| <id generator="IncrementalIdGenerator"/> |
You can configure generators using an explicit constructor invocation, e.g. <id generator="new IncrementalIdGenerator(100, 2)"/>
Some hints for former common IdProviders:
strategy 'increment' (provider IncrementIdProvider) -> IncrementalIdGenerator
strategy 'uuid' (provider UUIDProvider) -> UUIDGenerator
Database-related IdProviders (require an <import platforms="db" />):
strategy 'seqhilo' (provider SeqHiLoIdProvider) -> DBSeqHiLoGenerator
QueryIdProvider -> QueryGenerator
LongQueryIdGenerator -> QueryLongGenerator
The following is a summary of all id-related generator classes:
- Platform independent: IncrementalIdGenerator, UUIDGenerator, HibUUIDGenerator, LocalSequenceGenerator
- Database-related: DBSequenceGenerator, DBSeqHiLoGenerator, QueryGenerator, QueryLongGenerator, QueryHiLoGenerator
Define a scope explicitly by instantiating a generator as a <bean> and referring to it by name where necessary.
'value' lists
With the introduction of Benerator Script, the syntax was unified. So you need to specify value lists as literals, e.g. a string value list <attribute name="rank" values="A,B,C"> now needs to be written as
| <attribute name="rank" values="'A','B','C'"> |
Number lists remain unchanged, e.g. <attribute name="rank" values="1,2,3">
'source' attributes
Due to the limitations of FreeMarker as default script language, the 'source' attribute served as a fallback for referring to some data types that caused problems for FreeMarker. Unfortunately, it made the resolution of some bugs very cumbersome. So, with the introduction of Benerator Script, this fallback is no longer needed and was dropped. A 'source' attribute that refers to a variable or a member of the generated entity needs to be redefined as a script: <attribute name="name" source="user.id"/> becomes
| <attribute name="name" script="user.id" /> |
BTW: When using script="..." Benerator now knows that the content can only be a script; so the curly braces {} are not necessary in this case.
proxy, proxy-param1, proxy-param2
This concept was dropped completely and migrated to the new Distribution concept.
Migration for the standard proxies:
proxy="skip" proxy-param1="10" proxy-param2="20" --> distribution="new RandomWalkSequence(10, 20)"
proxy="repeat" proxy-param1="10" proxy-param2="20" --> distribution="new RepeatSequence(10, 20)"
Custom Generators
The Generator interface has changed in three ways:
- Merged available() and generate()
- Declaring thread safety with the ThreadAware interface methods: isThreadSafe() and isParallelizable()
- Simplified life cycle management for the component developer: initialize() and wasInitialized()
available() and generate()
The available() method was dropped. Instead, the semantics of the generate() method were changed: it now returns null to indicate that a generator is depleted. This design decision was necessary to allow for more efficient multithreaded execution.
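The null-return contract described above can be sketched as follows. This is only an illustration under stated assumptions: SketchGenerator, CountdownGenerator and GeneratorLoop are hypothetical names invented here and are not the real Benerator interfaces, so please check the manual for the actual Generator signature.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the 0.6-style contract; NOT the real Benerator interface.
interface SketchGenerator<T> {
    /** Returns the next product, or null once the generator is depleted. */
    T generate();
}

// Illustrative generator: counts down from a start value, then is depleted.
class CountdownGenerator implements SketchGenerator<Integer> {
    private int next;

    CountdownGenerator(int start) {
        this.next = start;
    }

    @Override
    public Integer generate() {
        // A 0.5-style class would have answered this check in available();
        // here depletion is signalled by the return value itself.
        if (next <= 0) {
            return null;
        }
        return next--;
    }
}

// Caller side: one call per product, no separate available() check.
class GeneratorLoop {
    static <T> List<T> drain(SketchGenerator<T> generator) {
        List<T> products = new ArrayList<>();
        T product;
        while ((product = generator.generate()) != null) {
            products.add(product);
        }
        return products;
    }
}
```

When porting a 0.5 generator, the logic that used to live in available() typically moves to the top of generate(). A single call also removes the gap between checking and fetching that two separate calls would leave open when several threads share a generator.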
isThreadSafe() and isParallelizable()
A generator class can declare its threading support by these methods:
- If a generator can be executed with several concurrent threads, isThreadSafe should return 'true'.
- If a generator can be cloned and each clone can run with a single dedicated thread, isParallelizable should return 'true'
However, there is no satisfactory support for multithreaded execution implemented yet, so you can make your life easy by returning 'false' in both methods or, even easier, by making your Generator implementation extend the class SimpleGenerator.
The class LightweightGenerator was dropped in favor of the classes
- AbstractGenerator: Provides lifecycle management
- SimpleGenerator: Provides lifecycle management and is neither thread-safe nor parallelizable
- ThreadSafeGenerator: Provides lifecycle management and is thread-safe and parallelizable
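To illustrate how a component might answer the two threading questions, here is a minimal sketch. SketchThreadAware, CounterComponent and ConstantComponent are hypothetical names made up for this example; only the method names isThreadSafe() and isParallelizable() come from the text above, and the real ThreadAware interface may differ in detail.

```java
// Hypothetical stand-in with the two methods named above; NOT the real ThreadAware interface.
interface SketchThreadAware {
    boolean isThreadSafe();
    boolean isParallelizable();
}

// Keeps unsynchronized mutable state, so it declares 'false' for both:
// concurrent calls could corrupt the counter, and per-thread clones
// would hand out the same numbers more than once.
class CounterComponent implements SketchThreadAware {
    private long counter = 0;

    long next() {
        return counter++;
    }

    @Override
    public boolean isThreadSafe() {
        return false;
    }

    @Override
    public boolean isParallelizable() {
        return false;
    }
}

// Immutable after construction: concurrent calls cannot interfere,
// and a clone behaves exactly like the original.
class ConstantComponent implements SketchThreadAware {
    private final String value;

    ConstantComponent(String value) {
        this.value = value;
    }

    String next() {
        return value;
    }

    @Override
    public boolean isThreadSafe() {
        return true;
    }

    @Override
    public boolean isParallelizable() {
        return true;
    }
}
```

The first class corresponds to what the text says SimpleGenerator declares, the second to what it says ThreadSafeGenerator declares; extending one of those parent classes would spare you writing the two methods yourself.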
Custom Distributions, variation1, variation2
The Distribution (Sequence, WeightFunction) concept has completely changed. If you implemented custom sequences, you need to completely relearn what to do. If your sequence might be of general interest, you can contribute the old implementation and have me translate it to the new standard. Please refer to the manual for the new component contract of the Distribution interface and the Sequence implementations.
variation1 and variation2 were dropped in favour of explicit parameterized Distribution construction.
Custom Converters
The Converter interface was changed in two ways:
- extending the ThreadAware interface: isThreadSafe() and isParallelizable()
- dropped the canConvert() method and re-introduced the getSourceType() method
isThreadSafe() and isParallelizable()
A converter class can declare its threading support by these methods:
- If a converter can be executed with several concurrent threads, isThreadSafe should return 'true'.
- If a converter can be cloned and each clone can run with a single dedicated thread, isParallelizable should return 'true'
However, there is no satisfactory support for multithreaded execution implemented yet, so you can make your life easy by returning 'false' in both methods or, even easier, by making your Converter implementation extend the class SimpleConverter.
These classes are useful parent classes for your new Converter implementation:
- SimpleConverter: Declares to be neither thread-safe nor parallelizable
- ThreadSafeConverter: Declares to be thread-safe and parallelizable
canConvert() and getSourceType()
Experience has shown that the canConvert() method introduced in one of the last releases was one step too far and complicated some things without adding real value. So it was dropped and replaced by the former method getSourceType(), which simply returns the Java class of which instances can be converted.
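The following sketch shows how a converter might declare its source type and how a caller could still derive a canConvert()-like check from it. SketchConverter, IntegerToLabelConverter and ConverterCheck are hypothetical names invented for this example, and the convert() signature is an assumption; the real Converter interface is described in the manual.

```java
// Hypothetical stand-in; NOT the real Benerator Converter interface.
interface SketchConverter<S, T> {
    /** The Java class of which instances can be converted. */
    Class<S> getSourceType();

    /** Assumed conversion method, included only to make the sketch complete. */
    T convert(S source);
}

// Illustrative converter: turns an Integer into a short text label.
class IntegerToLabelConverter implements SketchConverter<Integer, String> {
    @Override
    public Class<Integer> getSourceType() {
        return Integer.class;
    }

    @Override
    public String convert(Integer source) {
        return "#" + source;
    }
}

class ConverterCheck {
    // What canConvert() used to answer can be derived from the declared type.
    static boolean accepts(SketchConverter<?, ?> converter, Object candidate) {
        return converter.getSourceType().isInstance(candidate);
    }
}
```

When migrating a custom converter, a canConvert() implementation that only tested the argument's class can usually be replaced by returning that class from getSourceType().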
Custom Tasks
The Task interface has changed significantly. Please consult the manual for the new component contract.
Other changes
This list may not be complete. If you encounter further problems, please post a question or information in the forum. I will then extend this page.


