mysql - Hibernate, JDBC and Java performance on medium and big result set

Question

Posted Oct 24, 2021 in Technique by 深蓝 (71.8m points)

Issue

We are trying to optimize our data server application. It stores stocks and quotes in a MySQL database, and we are not satisfied with the fetch performance.

Context

- database
    - table stock: around 500 rows
    - table quote: 3,000,000 to 10,000,000 rows
    - one-to-many association: one stock owns n quotes
    - fetching around 1000 quotes per request
    - there is an index on (stockId, date) in the quote table
    - no cache, because in production the queries are always different
- Hibernate 3
- MySQL 5.5
- Java 6
- MySQL Connector/J (JDBC driver) 5.1.13
- c3p0 pooling

Tests and results

Protocol

Execution times on the MySQL server were obtained by running the generated SQL queries in the mysql command-line client.
The server is in a test context: no other DB reads, no DB writes.
We fetch 857 quotes for the AAPL stock.

Case 1 : Hibernate with association

This fills our Stock object with 857 Quote objects (everything is correctly mapped in hibernate.xml).

session.enableFilter("after").setParameter("after", 1322910573000L);
Stock stock = (Stock) session.createCriteria(Stock.class).
add(Restrictions.eq("stockId", stockId)).
setFetchMode("quotes", FetchMode.JOIN).uniqueResult();

SQL generated :

SELECT this_.stockId AS stockId1_1_,
       this_.symbol AS symbol1_1_,
       this_.name AS name1_1_,
       quotes2_.stockId AS stockId1_3_,
       quotes2_.quoteId AS quoteId3_,
       quotes2_.quoteId AS quoteId0_0_,
       quotes2_.value AS value0_0_,
       quotes2_.stockId AS stockId0_0_,
       quotes2_.volume AS volume0_0_,
       quotes2_.quality AS quality0_0_,
       quotes2_.date AS date0_0_,
       quotes2_.createdDate AS createdD7_0_0_,
       quotes2_.fetcher AS fetcher0_0_
FROM stock this_
LEFT OUTER JOIN quote quotes2_ ON this_.stockId=quotes2_.stockId
AND quotes2_.date > 1322910573000
WHERE this_.stockId='AAPL'
ORDER BY quotes2_.date ASC

Results :

Execution time on mysql server : ~10 ms
Execution time in Java : ~400ms

Case 2 : Hibernate without association without HQL

Hoping to improve performance, we used the code below, which fetches only the Quote objects; we then add them to a Stock manually (so we don't fetch repeated information about the stock for every row). We used createSQLQuery to minimize the effects of aliases and HQL mess.

String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
stock.addQuotes((ArrayList<Quote>) session.createSQLQuery("select * from quote q where stockId='" + stockId + "' " + filter).addEntity(Quote.class).list());

SQL generated :

SELECT *
FROM quote q
WHERE stockId='AAPL'
  AND q.date>1322910573000
ORDER BY q.date ASC

Results :

Execution time on mysql server : ~10 ms
Execution time in Java : ~370ms

Case 3 : JDBC without Hibernate

String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
Connection conn = SimpleJDBC.getConnection();
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from quote q where stockId='" + stockId + "' " + filter);
while(rs.next())
{
    stock.addQuote(new Quote(rs.getInt("volume"), rs.getLong("date"), rs.getFloat("value"), rs.getByte("fetcher")));
}
stmt.close();
conn.close();

Results :

Execution time on mysql server : ~10 ms
Execution time in Java : ~100ms

Our understandings

- The JDBC driver is common to all cases.
- There is a baseline time cost in the JDBC driver itself.
- With similar SQL queries, Hibernate spends more time than pure JDBC code converting result sets into objects.
- Hibernate's createCriteria, createSQLQuery, and createQuery are similar in time cost.
- In production, where we have many concurrent writes, the pure JDBC solution seemed slower than the Hibernate one (maybe because our JDBC solution was not pooled).
- MySQL-wise, the server seems to behave very well, and the time cost is very acceptable.

Our questions

- Is there a way to optimize the performance of the JDBC driver?
- And will Hibernate benefit from this optimization?
- Is there a way to optimize Hibernate performance when converting result sets?
- Are we facing something that cannot be tuned because of Java's fundamental object and memory management?
- Are we missing a point, are we stupid, and is all of this in vain?
- Are we French? Yes.

Your help is very welcome.

1 Reply

深蓝 · Answer 1 · 2021-10-23T18:34:51+0000

Can you do a smoke test with the simplest possible query, like:

SELECT current_timestamp()

or

SELECT 1 + 1

This will tell you the actual JDBC driver overhead. Also, it is not clear whether both tests were performed from the same machine.

Is there a way to optimize the performance of the JDBC driver?

Run the same query several thousand times in Java. The JVM needs some time to warm up (class loading, JIT compilation). Also, I assume SimpleJDBC.getConnection() uses c3p0 connection pooling; the cost of establishing a connection is pretty high, so the first few executions could be slow.
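
The warm-up advice can be turned into a small timing harness. The sketch below is only an illustration, assuming Java 8 or later: the names `WarmupBenchmark` and `averageNanos` are hypothetical and do not come from the question. It runs a task for a number of untimed warm-up iterations, then reports the average elapsed time over the measured iterations. The workload in `main` is a stand-in; in the real test the task would execute the quote query (or `SELECT 1`) on a pooled connection.

```java
import java.util.concurrent.Callable;

/** Hypothetical timing helper: untimed warm-up runs, then an averaged measurement. */
final class WarmupBenchmark {

    private WarmupBenchmark() {}

    /**
     * Runs the task warmupRuns times without timing it (class loading, JIT,
     * pool initialisation), then measuredRuns times with timing.
     * Returns the average wall-clock time per measured run in nanoseconds.
     */
    static long averageNanos(Callable<?> task, int warmupRuns, int measuredRuns) throws Exception {
        if (measuredRuns <= 0) {
            throw new IllegalArgumentException("measuredRuns must be positive");
        }
        for (int i = 0; i < warmupRuns; i++) {
            task.call();
        }
        long start = System.nanoTime();
        for (int i = 0; i < measuredRuns; i++) {
            task.call();
        }
        return (System.nanoTime() - start) / measuredRuns;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload; in the real test this would run the quote query.
        Callable<Integer> task = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 1000; i++) {
                sb.append(i);
            }
            return sb.length();
        };
        long avg = averageNanos(task, 2000, 5000);
        System.out.println("average per call: " + avg + " ns");
    }
}
```

Comparing the first timed call against the average after warm-up gives a rough idea of how much of the ~100 ms is one-off start-up cost rather than steady-state driver overhead.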

Also, prefer named queries over ad-hoc queries or criteria queries.

And will Hibernate benefit from this optimization?

Hibernate is a very complex framework. As you can see, it accounts for about 75% of the overall execution time (roughly 300 of the ~400 ms, given that raw JDBC takes ~100 ms). If you only need plain object mapping (no lazy loading, dirty checking, or advanced caching), consider MyBatis, or maybe even JdbcTemplate with its RowMapper abstraction.
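
The 75% figure follows from the timings reported in the question. As a small worked check, the sketch below uses only those approximate numbers (~10 ms on the server, ~100 ms for plain JDBC, ~400 ms for Hibernate with the association). The class and method names are illustrative, and the breakdown assumes that the JDBC portion costs about the same when Hibernate drives it, which the question does not verify.

```java
/** Worked arithmetic on the approximate timings reported in the question. */
final class TimingBreakdown {

    static final long SERVER_MS = 10;     // query time measured in the mysql client
    static final long JDBC_MS = 100;      // Case 3: plain JDBC, end to end in Java
    static final long HIBERNATE_MS = 400; // Case 1: Hibernate with association

    private TimingBreakdown() {}

    /** Time added on top of server-side execution by the driver and plain Java code. */
    static long jdbcOverheadMs() {
        return JDBC_MS - SERVER_MS;
    }

    /** Time the Hibernate run takes beyond the plain JDBC run. */
    static long hibernateOverheadMs() {
        return HIBERNATE_MS - JDBC_MS;
    }

    /** Share of the Hibernate run spent beyond the plain JDBC path, in percent. */
    static double hibernateSharePercent() {
        return 100.0 * hibernateOverheadMs() / HIBERNATE_MS;
    }

    public static void main(String[] args) {
        System.out.println("JDBC overhead:      " + jdbcOverheadMs() + " ms");
        System.out.println("Hibernate overhead: " + hibernateOverheadMs() + " ms");
        System.out.println("Hibernate share:    " + hibernateSharePercent() + " %");
    }
}
```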

Is there a way to optimize Hibernate performance when converting result sets?

Not really. Check out "Chapter 19. Improving performance" in the Hibernate documentation. There is a lot of reflection and class generation happening in there. Once again, Hibernate might not be the best solution when you want to squeeze every millisecond out of your database.
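
For contrast with reflective hydration, the explicit-mapping style behind the RowMapper abstraction mentioned above can be sketched in plain JDBC. This is a minimal illustration, not Spring's `JdbcTemplate`: the `SimpleRowMapper` interface and the `QuoteRow` and `QuoteDao` classes are hypothetical stand-ins, the column names are taken from the question's Case 3 code, and Java 8 or later is assumed.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mapper contract: turns the current row into an object, no reflection. */
interface SimpleRowMapper<T> {
    T mapRow(ResultSet rs) throws SQLException;
}

/** Hypothetical value class; fields mirror the columns read in Case 3. */
final class QuoteRow {
    final int volume;
    final long date;
    final float value;
    final byte fetcher;

    QuoteRow(int volume, long date, float value, byte fetcher) {
        this.volume = volume;
        this.date = date;
        this.value = value;
        this.fetcher = fetcher;
    }
}

final class QuoteDao {

    // Explicit column list and bind parameters instead of SELECT * and string
    // concatenation; ordering as in the generated SQL shown in the question.
    static final String SQL =
        "SELECT volume, date, value, fetcher FROM quote "
      + "WHERE stockId = ? AND date > ? ORDER BY date ASC";

    static final SimpleRowMapper<QuoteRow> QUOTE_MAPPER = rs -> new QuoteRow(
        rs.getInt("volume"), rs.getLong("date"), rs.getFloat("value"), rs.getByte("fetcher"));

    private QuoteDao() {}

    /** Runs the query and maps every row; statement and result set are always closed. */
    static <T> List<T> query(Connection conn, String stockId, long after,
                             SimpleRowMapper<T> mapper) throws SQLException {
        List<T> result = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            ps.setString(1, stockId);
            ps.setLong(2, after);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    result.add(mapper.mapRow(rs));
                }
            }
        }
        return result;
    }
}
```

A call would look like `QuoteDao.query(conn, "AAPL", 1322910573000L, QuoteDao.QUOTE_MAPPER)`, where `conn` comes from the connection pool. The caller keeps ownership of the connection, so it can be returned to the pool rather than closed for good.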

However, it is a good choice when you want to improve the overall user experience, thanks to its extensive caching support. Check out the performance documentation again: it mostly talks about caching. There is a first-level cache, a second-level cache, a query cache... This is where Hibernate might actually outperform simple JDBC: it can cache a lot, in ways you might not even imagine. On the other hand, a poor cache configuration would lead to an even slower setup.
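
Whether caching pays off depends on how often the same query recurs. The toy read-through cache below is not Hibernate's cache API; it is a hypothetical plain-Java sketch (the class `QueryCache`, built on a `LinkedHashMap` in access order as a small LRU, is an assumption for illustration) that counts hits and misses so the effect of recurring versus always-different keys can be observed. It assumes Java 8 or later and is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy read-through LRU cache with hit/miss counters (illustration only). */
final class QueryCache<K, V> {

    private final Map<K, V> entries;
    private long hits;
    private long misses;

    QueryCache(final int maxEntries) {
        // accessOrder = true keeps the least recently used entry first,
        // so removeEldestEntry evicts in LRU order once the cache is full.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value, or loads it (e.g. by running the query) on a miss. */
    V get(K key, Function<K, V> loader) {
        V cached = entries.get(key);
        if (cached != null) {
            hits++;
            return cached;
        }
        misses++;
        V loaded = loader.apply(key); // a null result is stored but counts as a miss next time
        entries.put(key, loaded);
        return loaded;
    }

    long hits() {
        return hits;
    }

    long misses() {
        return misses;
    }
}
```

With keys that change on every request, as the question describes for production, every lookup is a miss and the cache only adds bookkeeping; with recurring keys it avoids the database round trip entirely.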

Check out: Caching with Hibernate + Spring - some Questions!

Are we facing something that cannot be tuned because of Java's fundamental object and memory management?

The JVM (especially in server configuration) is quite fast. Object creation on the heap is as fast as on the stack in, e.g., C, and garbage collection has been greatly optimized. I don't think the Java version running plain JDBC would be much slower than a more native connection. That's why I suggested a few improvements to your benchmark.

Are we missing a point, are we stupid, and is all of this in vain?

I believe that JDBC is a good choice if performance is your biggest issue. Java has been used successfully in a lot of database-heavy applications.
