Flushing the Hibernate Session is taking a very long time


I am using Hibernate 5.2.8.Final. We have a requirement to read millions of rows from the database and update each one with some business logic. Because the table is huge, I want to commit after each batch of updates, so I wrote the code below:

Session session = HibernateUtil.getSessionFactory().getCurrentSession();
session.getTransaction().begin();
Query<Object> query = session.createQuery(SQL, Object.class);
ScrollableResults scrollableResults = query.setCacheMode(CacheMode.IGNORE)
        .scroll(ScrollMode.FORWARD_ONLY);
int count = 0;
while (scrollableResults.next())
{
    Object object = (Object) scrollableResults.get(0);
    process(object);
    session.update(object);
    if (++count % batchSizeDeclare == 0)
    {
        session.flush();
        session.clear();
        LOGGER.info("Updated batch records");
    }
}
session.getTransaction().commit();
LOGGER.info("committed in Hibernate");

Below is my hibernate.cfg.xml file

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE hibernate-configuration PUBLIC
        "-//Hibernate/Hibernate Configuration DTD//EN"
        "http://www.hibernate.org/dtd/hibernate-configuration-3.0.dtd">
<hibernate-configuration>
    <session-factory>
        <!-- Database connection settings -->
        <property name="connection.driver_class">com.microsoft.sqlserver.jdbc.SQLServerDriver</property>
        <property name="connection.url">jdbc:sqlserver://com;database=DEV</property>
        <property name="connection.username">user</property>
        <property name="connection.password">pass</property>

        <property name="hibernate.default_schema">schema</property>

        <!-- JDBC connection pool (use the built-in) -->
        <property name="connection.pool_size">5</property>

        <!-- SQL dialect -->
        <property name="dialect">org.hibernate.dialect.SQLServer2012Dialect</property>

        <!-- Disable the second-level cache -->
        <property name="cache.provider_class">org.hibernate.cache.internal.NoCacheProvider</property>

        <!-- Echo all executed SQL to stdout -->
        <!-- <property name="show_sql">true</property> -->
        <!-- <property name="format_sql">true</property> -->

        <!-- Update the database schema on startup -->
        <property name="hbm2ddl.auto">update</property>

        <!-- org.hibernate.HibernateException: No CurrentSessionContext configured! -->
        <property name="hibernate.current_session_context_class">thread</property>

        <property name="hibernate.jdbc.batch_size">100</property>

        <property name="hibernate.c3p0.min_size">5</property>
        <property name="hibernate.c3p0.max_size">20</property>
        <property name="hibernate.c3p0.timeout">300</property>
        <property name="hibernate.c3p0.max_statements">50</property>
        <property name="hibernate.c3p0.idle_test_period">3000</property>

        <mapping class="com.beans.Object" />

    </session-factory>
</hibernate-configuration>
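As a side note on the batching question: `hibernate.jdbc.batch_size` only helps if the generated UPDATE statements can actually be grouped into JDBC batches. Two related settings are often enabled alongside it; the fragment below is a suggested addition, not part of the original configuration:

```xml
<!-- Suggested additions (not in the original file): order UPDATE statements
     by entity type and id so consecutive statements can share a JDBC batch,
     and allow batching even for versioned (optimistically locked) entities. -->
<property name="hibernate.order_updates">true</property>
<property name="hibernate.jdbc.batch_versioned_data">true</property>
```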

Below is my Object.java

@Entity
public class Object implements Serializable
{
    private static final long serialVersionUID = 1L;

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "ID", nullable = false, unique = true, updatable = false)
    private int id;

    private String filePath;
    private String fileName;
    private String mimeType;
    private double fileSize;

    // getters and setters omitted
}

Once my code reaches session.flush(), it does not do anything, even after waiting for 30 minutes. Is this the correct way to batch commit? How do I batch the updates?


1 Answer

Vlad Mihalcea (accepted answer)

Once my code reached session.flush() it is not doing anything even after waiting for 30 min.

On the contrary, the database seems to be doing way too much. You just don't see any progress because it is struggling to cope with the huge amount of work you submitted.

Is this the correct way to batch commit?

The short answer is No.

You don't have to fetch millions of rows from the DB. You have better options:

  1. You can do the processing in the database, so that you don't pay the price of extracting data and sending it over the network, only to process it in Java.
  2. If you can't process it in the DB, then you need to use a batch processor that only fetches small chunks of data at a time. This way, you can even parallelize the batch processing, which should reduce the overall processing time.
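To make option 2 concrete, here is a minimal, self-contained sketch of a chunked batch driver using keyset pagination. It is plain Java with the database access abstracted behind a `fetchPage` function; the HQL in the comment is an assumption about how it would be wired to Hibernate, not code from the question. Each page would map to one short session-per-chunk transaction instead of one giant flush at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

public class KeysetBatchJob {

    /**
     * Drives a keyset-paginated batch job. fetchPage(lastSeenId, pageSize)
     * stands in for a Hibernate query along the lines of (assumed wiring):
     *   session.createQuery("from Object o where o.id > :lastId order by o.id", Object.class)
     *          .setParameter("lastId", lastSeenId)
     *          .setMaxResults(pageSize)
     *          .getResultList();
     * Returns the total number of rows processed.
     */
    public static int run(BiFunction<Integer, Integer, List<Integer>> fetchPage,
                          int pageSize) {
        int processed = 0;
        int lastSeenId = 0;
        while (true) {
            // Real job: open a Session and begin a transaction here.
            List<Integer> page = fetchPage.apply(lastSeenId, pageSize);
            if (page.isEmpty()) {
                break; // no more rows
            }
            for (Integer id : page) {
                processed++; // real job: process(entity) and let dirty checking flush it
            }
            // Keyset pagination: remember the last id instead of using offsets.
            lastSeenId = page.get(page.size() - 1);
            // Real job: commit and close the Session here, so each chunk is a
            // short transaction and the persistence context stays small.
        }
        return processed;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the table: ids 1..1000.
        List<Integer> table = new ArrayList<>();
        for (int i = 1; i <= 1000; i++) table.add(i);

        int total = run((lastId, size) -> {
            List<Integer> page = new ArrayList<>();
            for (Integer id : table) {
                if (id > lastId && page.size() < size) page.add(id);
            }
            return page;
        }, 100);

        System.out.println("processed " + total + " rows");
    }
}
```

For option 1, if the business logic can be expressed in SQL, the same work can often be pushed into a single bulk statement instead, e.g. an HQL `session.createQuery("update Object o set ...").executeUpdate()`, so no rows are pulled into Java at all.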