I am reading 5gb of XML file using the below code and processing that data into Database using spring dataJpa, this below is just sample logic we are closing the inpustream and aswell xsr object.
XMLInputFactory xf=XMLInputFactory.newInstance();
XMLStreamReader xsr=xf.createXMLStreamReader(new InputStreamReader(new FileInputStream("test.xml"))
i have configured the max 8GB(ie -xms7000m and -xmx8000m) of heap memory But its getting the below hibernate heap issue when saving the data. it inserted around 700000 data total of 2100000
[dispatcherServlet] : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Handler dispatch failed; nested exception is java.lang.OutOfMemoryError: Java heap space] with root cause
java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.IdentityHashMap.resize(IdentityHashMap.java:472) ~[na:na]
at java.base/java.util.IdentityHashMap.put(IdentityHashMap.java:441) ~[na:na]
at org.hibernate.event.internal.DefaultPersistEventListener.entityIsPersistent(DefaultPersistEventListener.java:159) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.event.internal.DefaultPersistEventListener.onPersist(DefaultPersistEventListener.java:124) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl$$Lambda$1620/0x00000008010a3040.applyEventToListener(Unknown Source) ~[na:na]
at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:113) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl.persistOnFlush(SessionImpl.java:765) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.spi.CascadingActions$8.cascade(CascadingActions.java:341) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeToOne(Cascade.java:492) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeAssociation(Cascade.java:416) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeProperty(Cascade.java:218) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeCollectionElements(Cascade.java:525) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeCollection(Cascade.java:456) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeAssociation(Cascade.java:419) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascadeProperty(Cascade.java:218) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.internal.Cascade.cascade(Cascade.java:151) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.event.internal.AbstractFlushingEventListener.cascadeOnFlush(AbstractFlushingEventListener.java:158) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.event.internal.AbstractFlushingEventListener.prepareEntityFlushes(AbstractFlushingEventListener.java:148) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.event.internal.AbstractFlushingEventListener.flushEverythingToExecutions(AbstractFlushingEventListener.java:81) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:39) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl$$Lambda$1597/0x0000000801076040.accept(Unknown Source) ~[na:na]
at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:102) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl.doFlush(SessionImpl.java:1362) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl.managedFlush(SessionImpl.java:453) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl.flushBeforeTransactionCompletion(SessionImpl.java:3212) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.internal.SessionImpl.beforeTransactionCompletion(SessionImpl.java:2380) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.beforeTransactionCompletion(JdbcCoordinatorImpl.java:447) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.beforeCompletionCallback(JdbcResourceLocalTransactionCoordinatorImpl.java:183) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.access$300(JdbcResourceLocalTransactionCoordinatorImpl.java:40) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.commit(JdbcResourceLocalTransactionCoordinatorImpl.java:281) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.hibernate.engine.transaction.internal.TransactionImpl.commit(TransactionImpl.java:101) ~[hibernate-core-5.4.22.Final.jar:5.4.22.Final]
at org.springframework.orm.jpa.JpaTransactionManager.doCommit(JpaTransactionManager.java:534) ~[spring-orm-5.2.10.RELEASE.jar:5.2.10.RELEASE]
As per the above trace logs it seems some issue with hibernate save casacade , but not able to figured out , below are the entity classes which uses to save the data in databse.
@Data
@EqualsAndHashCode
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
public class UMEnityPK implements Serializable {
private static final long serialVersionUID=1L;
private String batchId;
private Long batchVersion;
private BigInteger umId;
}
@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"batchId", "batchVersion", "umId"})
@Entity
@Table(name ="um_base")
@IdClass(UMEnityPK.class)
public class UMBase {
@Id private String batchId;
@Id private Long batchVersion;
@Id private BigInteger umId;
private String firstName;
private String lastName;
private String umType;
private String umLevel;
@OneToMany(mappedBy = "umBase", cascade = CascadeType.ALL)
private List<UMAddress> umAddresses;
@OneToMany(mappedBy = "umBase", cascade = CascadeType.ALL)
private List<UMIdentifier> umIdentifiers;
@OneToOne(mappedBy = "umBase", cascade = CascadeType.ALL)
private UMHierarchy umHierarchy;
}
@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"id"})
@Entity
@Table(name = "um_identifier")
public class UMIdentifier {
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "um_address")
@SequenceGenerator(name = "um_address", sequenceName = "SEQ_UM_ADDRESS", allocationSize = 1)
private Long id;
private String idValue;
private String idType;
private String groupType;
@ManyToOne
@JoinColumns({
@JoinColumn(name = "BATCH_ID", referencedColumnName = "batchId"),
@JoinColumn(name = "BATCH_VERSION", referencedColumnName = "batchVersion"),
@JoinColumn(name = "UM_ID", referencedColumnName = "umId")
})
private UMBase umBase;
}
@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"batchId", "batchVersion", "umId"})
@Entity
@Table(name ="um_hierarchy")
public class UMHierarchy {
@Id
private String batchId;
@Id private Long batchVersion;
@Id private BigInteger umId;
private String hierarchyTpe;
private String umStatusCode;
private String immediateParentId;
private Date hierarchyDate;
@OneToOne(cascade = CascadeType.ALL)
@JoinColumns({
@JoinColumn(name = "BATCH_ID", referencedColumnName = "batchId"),
@JoinColumn(name = "BATCH_VERSION", referencedColumnName = "batchVersion"),
@JoinColumn(name = "UM_ID", referencedColumnName = "umId")
})
private UMBase umBase;
}
@Data
@Builder(toBuilder = true)
@NoArgsConstructor(access = AccessLevel.PRIVATE)
@AllArgsConstructor(access = AccessLevel.PRIVATE)
@EqualsAndHashCode(of = {"id"})
@Entity
@Table(name = "um_address")
public class UMAddress {
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "um_address")
@SequenceGenerator(name = "um_address", sequenceName = "SEQ_UM_ADDRESS", allocationSize = 1)
private Long id;
private String addressType;
private String addressLine1;
private String getAddressLine2;
private String city;
private String state;
private String postalCode;
private String country;
@ManyToOne
@JoinColumns({
@JoinColumn(name = "BATCH_ID", referencedColumnName = "batchId"),
@JoinColumn(name = "BATCH_VERSION", referencedColumnName = "batchVersion"),
@JoinColumn(name = "UM_ID", referencedColumnName = "umId")
})
private UMBase umBase;
}
Is there any issues with hibernate entity mapping which eating of the memory
When dealing with converting datasets that big you need to do that in batches. Read 100 records from the xml, convert them to entities, save each of them with
em.persist(record), then callem.flush()andem.clear()to remove them from Hibernate, then clear them from your local collection, then manually invoke the garbage collector usingSystem.gc(). You may even want to use Hibernate's batch processing as described in this tutorial.In pseudocode this would be:
Also you may want to manually begin and commit a transaction around each insert loop to ensure the database isn't holding everything for your entire import, or it might run out of memory instead of your java application.