SAXParseException: Element '...' cannot have character [children]
This is not a Benerator error but an error in your XML code: most Benerator descriptor elements (e.g. <generate>) may contain only sub elements and no characters other than white space. A simple typo like <generate>> can cause this exception.
Plain visible characters are easy to spot, but there is a trickier case: some editors (like the standard Eclipse XML editor) insert non-visible characters into your XML file (e.g. when you type Alt-Space). The editor does not display them, but the XML parser does not treat them as white space. The standard cure is to remove all 'white space' between the start tag <element> and the end tag </element> and then reinsert it manually.
Shell invocation error on Windows
When invoking Windows shell commands from Benerator like this:
<execute type="shell">echo 42</execute>
you may get an error message like this:
CreateProcess error=2, file not found
It means that the 'echo' command is not found because it is part of the Windows command interpreter and not a separate executable. To run the Windows command interpreter, execute cmd.exe /C:
<execute type="shell">cmd.exe /C echo 42</execute>
Memory
If you get an OutOfMemoryError, first increase the Java heap size with the -Xmx option, e.g. by adding -Xmx1024m to the BENERATOR_OPTS.
Another potential cause of OutOfMemoryErrors is applying distributions to very large data sets. Most sequences and all other types of distribution require the source data to fit into RAM. So either use an 'unlimited' sequence like 'expand', 'repeat' or 'randomWalk', or simply repeat the data set iteration by adding cyclic="true" to the configuration.
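The cyclic variant mentioned above might look like the following minimal sketch. The file name, entity type, count and consumer are hypothetical, and placing cyclic on an <iterate> element is an assumption; check the attribute against the descriptor reference of your Benerator version.

```xml
<!-- Hypothetical example: cyclic="true" restarts the iteration at the
     beginning of the source whenever its end is reached -->
<iterate type="product" source="products.csv" cyclic="true"
         count="1000000" consumer="ConsoleExporter"/>
```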
temp Directory
In some environments, the temp directory has a very restrictive disk quota. If you need more space for data generation, you can specify another directory with the -Djava.io.tmpdir setting (e.g. by adding -Djava.io.tmpdir=/User/me/mytemp to the BENERATOR_OPTS in the script files).
File Encoding
If no file encoding was specified, benerator uses the default file encoding of the system it runs on - except if the file itself contains encoding info.
If all files you use have the same encoding and it differs from your system's encoding, you can set benerator's default encoding with the -Dfile.encoding setting (e.g. by adding -Dfile.encoding=iso-8859-1 to the BENERATOR_OPTS in the script files).
When generating data in heterogeneous environments, it is good practice to set the defaultEncoding property of the benerator descriptor file's root element. If only individual files have a different encoding, you can specify an encoding property for them; all built-in file importers and file-based consumers support it.
A typical error that may arise from wrong file encoding configuration is that file import (e.g. for a CSV file) stops before the end of file is reached.
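The two descriptor-level options described above can be sketched as follows. The file name, entity type and consumer are made up for illustration, and the attribute name encoding on <iterate> is assumed from the description above rather than confirmed for every Benerator version.

```xml
<!-- defaultEncoding is meant to apply to every file that does not
     declare an encoding of its own -->
<setup defaultEncoding="UTF-8">

    <!-- Hypothetical import: only this CSV file is read as ISO-8859-1 -->
    <iterate type="person" source="persons.csv" encoding="ISO-8859-1"
             consumer="ConsoleExporter"/>

</setup>
```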
Logging
benerator logs its events using Apache commons-logging. That service forwards output to Apache log4j or to the native JDK 1.4 logging. To avoid version conflicts with your environment, benerator uses JDK 1.4 logging by default, but for troubleshooting it is useful to switch to log4j as the underlying logging implementation and fine-tune log messages to track down your problem.
To use log4j, download a recent binary release (e.g. log4j 1.2.15) from the Apache log4j 1.2 website, uncompress it and put the jar file log4j-1.2.15.jar into benerator's lib directory. Then edit the log4j.xml file in your BENERATOR_HOME/bin directory to adapt the log levels of the categories you are interested in.
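A category entry in log4j.xml might look like the following sketch. It uses the standard log4j 1.2 XML syntax; the category name is one of the Benerator log categories, and the appender and root definitions of your existing file are assumed to stay unchanged.

```xml
<!-- Fragment of log4j.xml: place it inside <log4j:configuration>,
     after the <appender> definitions and before the <root> element -->
<category name="org.databene.benerator.main">
    <!-- debug enables detailed messages for this category only -->
    <priority value="debug"/>
</category>
```

log4j 1.2 also accepts the equivalent <logger> and <level> elements in place of <category> and <priority>.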
Set a category to debug to get detailed information about its execution. The most important log categories are:
| name | description |
| org.databene.benerator.main | Events of benerator's main classes, e.g. detailed information about which entities are currently generated |
| org.databene.benerator.STATE | Generator state handling; shows which component generator caused the composite generator to terminate |
| org.databene.benerator.factory | Creating generators from descriptor information |
| org.databene.benerator | Top-level category for all generators and main classes |
| org.databene.SQL | SQL commands, e.g. DDL, queries, inserts, updates |
| org.databene.JDBC | JDBC operations, e.g. connection / transaction handling |
| org.databene.platform.db | All database related information that does not fit into the SQL or JDBC category |
| org.databene.platform.xml | XML-related activities |
| org.databene.domain | benerator domain packages |
| org.databene.model | descriptor related information |
| org.databene.commons | low-level operations like data conversion |
Locating Errors
When configuring data generation you are likely to encounter error messages.
Depending on the settings it may be difficult to find out what caused the problem. For tracking database-related errors, set batch="false" in your <database> setup and use pagesize="1" in the <generate>. These are default settings, so you do not need to specify them explicitly if you did not change the default.
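Written out explicitly, these settings might look as follows. The connection details, entity type and count are placeholders, and the attribute spelling pagesize is taken from the text above; check it against the descriptor schema of your Benerator version.

```xml
<setup>
    <!-- batch="false": statements are not collected into JDBC batches,
         so an error can be related to a single statement -->
    <database id="db" url="jdbc:hsqldb:mem:example" driver="org.hsqldb.jdbcDriver"
              user="sa" password="" batch="false"/>

    <!-- pagesize="1": each generated entity forms a page of its own -->
    <generate type="db_user" count="10" pagesize="1" consumer="db"/>
</setup>
```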
If that alone does not help, set the log category org.databene.benerator.main to debug level to find out which element caused the error. If there is a stack trace, check it for a hint about which part of the element's generation went wrong. If that does not help, remove one attribute/reference/id after the other to find the actual troublemaker. If you still cannot solve the problem, post a message in the benerator forum. You can also check out the benerator sources from the SVN source repository, open them in Eclipse and step through the code in the debugger.
Database Privilege Problems
When importing database metadata, you might encounter exceptions when Benerator tries to get metadata of catalogs or schemas it has no access privileges to.
You can usually fix this by choosing the right schema for your database, e.g.
<database id="db" ... schema="PUBLIC" />
If you are not sure which schema is applicable in your case, edit the logging configuration in log4j.xml (as described above) and set the category org.databene.platform.db to debug.
You will then get a list of schemas as Benerator scans the database metadata, e.g. for an Oracle system:
11:20:09,279 DEBUG [DBSystem] parsing metadata...
11:20:09,302 INFO [JDBCDBImporter] Importing database metadata. Be patient, this may take some time...
11:20:09,303 DEBUG [JDBCDBImporter] Product name: Oracle
11:20:09,303 INFO [JDBCDBImporter] Importing catalogs
11:20:09,313 INFO [JDBCDBImporter] Importing schemas
11:20:09,320 DEBUG [JDBCDBImporter] found schema ANONYMOUS
11:20:09,320 DEBUG [JDBCDBImporter] found schema CTXSYS
11:20:09,320 DEBUG [JDBCDBImporter] found schema DBSNMP
11:20:09,320 DEBUG [JDBCDBImporter] found schema DIP
11:20:09,320 DEBUG [JDBCDBImporter] found schema FLOWS_FILES
11:20:09,321 DEBUG [JDBCDBImporter] found schema FLOWS_020100
11:20:09,321 DEBUG [JDBCDBImporter] found schema HR
11:20:09,321 DEBUG [JDBCDBImporter] found schema MDSYS
11:20:09,321 DEBUG [JDBCDBImporter] found schema OUTLN
11:20:09,322 DEBUG [JDBCDBImporter] found schema SHOP
11:20:09,322 DEBUG [JDBCDBImporter] found schema SYS
11:20:09,322 DEBUG [JDBCDBImporter] found schema SYSTEM
11:20:09,323 DEBUG [JDBCDBImporter] found schema TSMSYS
11:20:09,323 DEBUG [JDBCDBImporter] found schema XDB
Cross-checking this list with your access information should make it easy to figure out which schema is appropriate in your case.
Typical default schema names are:
| Database | Default Schema name |
| HSQL | public |
| Postgres | public |
| SQL Server | dbo |
| DB2 | <user name> |
| Derby | <user name> |
| MySQL | <user name> |
| Oracle | <user name> |
Constraint Violations
Some constraint violations may arise when using database batch with nested create-entities. Switch batch off. If the problem does not occur any more, stick with non-batch generation. Otherwise you need further investigation. When using Oracle, a constraint violation typically looks like this:
java.sql.SQLException: ORA-00001: Unique Constraint (MYSCHEMA.SYS_C0011664) violated
It contains a constraint name you can look up on the database like this:
select * from user_constraints where constraint_name like '%SYS_C0011664%'
The query result will tell you the table name and the constraint type. The constraint types are encoded as follows:
- P: Primary key constraint
- U: Unique constraint
- R: Foreign key constraint
'value too large for column' in Oracle
Depending on the character set, Oracle may report a multiple of the real column width, e.g. 80 instead of 20. So automatic generation of varchar2 columns may fail. This typically results in exceptions like this:
java.sql.SQLException: ORA-12899: value too large for column "SCHEM"."TBL"."COL" (actual: 40, maximum: 10)
This is Oracle bug #4485954, see http://www.oracle.com/technology/software/tech/java/sqlj_jdbc/htdocs/readme_jdbc_10204.html and
http://kr.forums.oracle.com/forums/thread.jspa?threadID=554236.
The solution is to use the newest JDBC driver, at least 10.2.0.4 or 11.0. It is backwards compatible with Oracle 9 databases.
Composite Keys
Benerator expects single-valued ids. It does not automatically support composite keys and composite references.
Since composite keys typically have a business meaning, most composite keys cannot be automatically generated. So there is no need to support this.
If you encounter a composite key, manually configure how to create each key component.
Importing Excel Sheets
For beginners, the way Benerator handles imported Excel sheets is sometimes confusing. It relies completely on the cell type configured in the original sheet. So if you have a date cell in the Excel sheet and format it as number or text, benerator will interpret it as double or string.
Another common error comes from columns that contain long code numbers and have the default format: they are imported as numbers, so e.g. leading zeros are lost. In such a case, explicitly format the column as text in Excel.
Apache POI represents all numbers as variables of type 'double'. So there are numbers which are simple in decimal format but not in binary: when importing the number 1.95 from an Excel sheet, the user gets a value of 1.95000000002. For now you need to round the values yourself, e.g. with a converter.


