Archive for category performance

Code review checklist

A small checklist for conducting code reviews. Is the change:

  • Readable and following standards?
  • A minimal and working solution?
  • Better than before?
  • Production ready?
  • Check out the details here: https://github.com/mestachs/experiment/blob/master/codereview/checklist.md


    Gatling: load testing like a king

    Gatling for Rails apps @ http://www.brug.be


    Don’t Get Caught Hibernating (again)

    Last week I had to review a Hibernate-powered application due to its poor performance and various instability issues.
    In my post Don’t Get Caught Hibernating, I assumed that JDBC connectivity and caching were properly configured.
    Well, in fact, this is not always the case.

    Don’t use Hibernate built-in connection pool

    From the Hibernate documentation:

    Hibernate’s own connection pooling algorithm is, however, quite rudimentary. It is intended to help you get started and is not intended for use in a production system, or even for performance testing. You should use a third party pool for best performance and stability. Just replace the hibernate.connection.pool_size property with connection pool specific settings. This will turn off Hibernate’s internal pool. For example, you might like to use c3p0.

    Strangely, Hibernate logs this at INFO level (not WARN) when instantiating the session factory:

    log.info("Using Hibernate built-in connection pool (not for production use!)");
    

    Instead of using this ‘naive’ implementation, you should configure the session factory to use one of the better javax.sql.DataSource implementations:

    • C3P0: add the extra hibernate.c3p0.* properties and that’s it! (see the sketch below)
    • commons-dbcp: perhaps the oldest implementation; you need to adjust the dependencies depending on the Java version you are using.
    • tomcat7-jdbc: a simpler implementation than commons-dbcp, with more features 😉

    Note also that other implementations are available.
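
    As an illustration, here is a minimal sketch of switching to c3p0 programmatically. It assumes Hibernate 3.x and a hibernate.cfg.xml on the classpath; the pool values are illustrative, not recommendations:

    import org.hibernate.SessionFactory;
    import org.hibernate.cfg.Configuration;

    public class C3p0Bootstrap {
        public static SessionFactory build() {
            Configuration cfg = new Configuration().configure(); // reads hibernate.cfg.xml (assumed present)
            // the mere presence of hibernate.c3p0.* properties switches Hibernate to c3p0
            cfg.setProperty("hibernate.c3p0.min_size", "5");
            cfg.setProperty("hibernate.c3p0.max_size", "20");
            cfg.setProperty("hibernate.c3p0.timeout", "300");           // idle timeout, in seconds
            cfg.setProperty("hibernate.c3p0.max_statements", "50");     // prepared statement cache size
            cfg.setProperty("hibernate.c3p0.idle_test_period", "3000"); // validate idle connections, in seconds
            return cfg.buildSessionFactory();
        }
    }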

    Don’t forget to (a tomcat-jdbc sketch follows this list):

    • tune the datasource pool size to fit your production needs
    • choose and configure the validation mechanism for opened connections and the various strategies: testOnBorrow, testWhileIdle, …
    • specify the isolation level: otherwise it will take the driver default, which is sometimes too high (e.g. repeatable read)
    • also specify timeouts and a max connection age
    • enable the preparedStatement cache
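
    A minimal sketch with tomcat-jdbc covering these points; the database URL, credentials and sizing values are hypothetical placeholders:

    import java.sql.Connection;

    import org.apache.tomcat.jdbc.pool.DataSource;
    import org.apache.tomcat.jdbc.pool.PoolProperties;

    public class PoolSetup {
        public static DataSource create() {
            PoolProperties p = new PoolProperties();
            p.setUrl("jdbc:postgresql://localhost/mydb"); // hypothetical database
            p.setDriverClassName("org.postgresql.Driver");
            p.setUsername("app");
            p.setPassword("secret");
            p.setMaxActive(20);                          // pool size: fit to your production load
            p.setValidationQuery("select 1");            // connection validation query
            p.setTestOnBorrow(true);
            p.setTestWhileIdle(true);
            p.setMaxAge(30 * 60 * 1000);                 // max connection age, in ms
            p.setDefaultTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);
            p.setJdbcInterceptors("StatementCache(max=200)"); // prepared statement cache
            DataSource ds = new DataSource();
            ds.setPoolProperties(p);
            return ds;
        }
    }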

    As you review your database connection pooling, it is easy to also instrument it with a JAMon datasource.
    You will gain visibility into your various JDBC accesses. This can also help identify the ideal pool size (take a look at MaxActive and AvgActive).
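
    A rough sketch of such instrumentation with JAMon’s proxy factory (an assumption on my part: the jamon jar is on the classpath, and you wrap the connections your datasource hands out):

    import java.sql.Connection;

    import com.jamonapi.proxy.MonProxyFactory;

    public class JamonJdbc {
        // statements and result sets obtained from the returned connection are monitored by JAMon
        public static Connection instrument(Connection raw) {
            return MonProxyFactory.monitor(raw);
        }
    }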

    Don’t use Hibernate default cache implementation

    From the Hibernate documentation (cache providers table):

    Hashtable (not intended for production use) | org.hibernate.cache.HashtableCacheProvider | memory

    The HashtableCacheProvider isn’t a production-ready implementation:

    • no max size
    • no eviction policy (LRU, …)
    • no time-to-live or time-to-idle

    and it can be:

    • considered a memory leak,
    • a source of OutOfMemoryError,
    • a source of out-of-date data: the cache is not aware of changes made to the persistent store by another application

    As the documentation states, various implementations exist.
    From my experience, Ehcache is quite easy to set up and ready to scale (clustering).
    Don’t forget to disable its phone-home mechanism; a sketch follows.
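
    One way to wire it up, as a minimal sketch assuming Hibernate 3.x with the ehcache jar on the classpath (cache sizes and TTLs then live in ehcache.xml):

    import org.hibernate.cfg.Configuration;

    public class EhcacheSetup {
        public static Configuration configureCache(Configuration cfg) {
            // disable the ehcache "phone home" update check
            // (can also be done with updateCheck="false" in ehcache.xml)
            System.setProperty("net.sf.ehcache.skipUpdateCheck", "true");
            cfg.setProperty("hibernate.cache.use_second_level_cache", "true");
            cfg.setProperty("hibernate.cache.provider_class", "net.sf.ehcache.hibernate.EhCacheProvider");
            return cfg;
        }
    }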


    Surviving in a legacy AS/400 world with a taste of Groovy.

    IBM System i, iSeries, AS/400,…

    You may have heard of IBM System i, iSeries, AS/400, … it was rebranded multiple times, but for most of you it’s a green-screen 5250. This system is fairly widespread in European industry. For Java development you have access to the iSeries via jt400 (a JDBC driver plus an API for most concepts: jobs, program calls, …).

    Groovy + jt400 + system tables = automation for the lazy DBA.

    Last month we did a new release of our application, and it required a new set of indexes.

    The good news is that the iSeries, when preparing SQL statements, does an explain plan and logs its advised indexes in a system table. But not all advised indexes are worth creating; maybe you can reuse an existing one by rephrasing your SQL statement. So we had to list the advised and existing indexes for each table, look at the number of times each index was advised, …

    Doing this manually in the UI tool was too error-prone and too boring. As a Java/Groovy developer, I decided to automate it with a Groovy script.

    Existing tables

    So first, let’s list all existing tables (physical files) in a given schema (library) using the system view SYSTABLES. Our DBA prefers system names (short name vs long name).

    import groovy.sql.Sql

    // list the system (short) names of all tables in a library
    def getTableSystemNames = { library, sql ->
        sql.rows("""select * from QSYS2/SYSTABLES where TABLE_SCHEMA = '${library}'
                    fetch first 500 rows only with ur""".toString())
           .collect { it.SYSTEM_TABLE_NAME }
    }

    Existing indexes

    First step: get the existing indexes from SYSKEYS.
    One row is one column of one index… so let’s use Groovy goodness (groupBy, collect and join) to get one line per index in the format “column1,column2,column3”.

    // map each existing index (by system name) to its comma-separated column list
    def getExistingIndexes = { library, tableSystemName, sql ->
        def existingIndexSQL = """with INDX as (
            select INDEX_NAME, SYSTEM_INDEX_NAME, COLUMN_NAME, SYSTEM_COLUMN_NAME from QSYS2/SYSKEYS
            where
              INDEX_NAME in (select INDEX_NAME from QSYS2/SYSINDEXES
                             where SYSTEM_INDEX_SCHEMA = '$library' and SYSTEM_TABLE_NAME = '$tableSystemName')
              and INDEX_SCHEMA = '$library'
            )
            select * from INDX
            fetch first 500 rows only with ur
        """
        def rows = sql.rows(existingIndexSQL.toString())
        def existingIndexes = [:]
        // one row per index column: group by index name, then join the column names
        rows.groupBy { it.SYSTEM_INDEX_NAME }
            .each { row -> existingIndexes.put(row.key, row.value.collect { it.COLUMN_NAME }.join(',')) }
        return existingIndexes
    }

    Advised indexes

    Second step: get the advised indexes from the SYSIXADV system table.
    KEY_COLUMNS_ADVISED is already in “column1,column2,column3” format.

    // list advised indexes for a table, most frequently advised first
    def getAdvisedIndexes = { library, tableSystemName, sql ->
        def advisedIndexesSQL = """
            select * from QSYS2/SYSIXADV where
            TABLE_SCHEMA = '${library}' and
            SYSTEM_TABLE_NAME like '${tableSystemName}%'
            and TIMES_ADVISED > 1
            and INDEX_TYPE = 'RADIX'
            order by TIMES_ADVISED desc, MTI_CREATED desc
            fetch first 500 rows only with ur
            """
        def rows = sql.rows(advisedIndexesSQL.toString())
        // KEY_COLUMNS_ADVISED is already "column1,column2,column3"
        rows.collect { it.KEY_COLUMNS_ADVISED + " " + it.TIMES_ADVISED + " " + it.INDEX_TYPE }
    }

    It works!

    Last step: put everything together with an SQL connection 😉

    // for each table in the library, print existing and advised indexes side by side
    def dumpAdvisedAndExistingIndexes = { library, sql ->
        def tables = getTableSystemNames(library, sql)
        tables.each { tableSystemName ->
            def advised = getAdvisedIndexes(library, tableSystemName, sql)
            if (advised.isEmpty())
                return // nothing advised for this table: skip it
            println "###### ${library}.${tableSystemName}"
            println "****************** existing indexes ****************"
            getExistingIndexes(library, tableSystemName, sql).each { println it }
            println "****************** advised indexes ****************"
            advised.each { println it }
        }
    }

    def as400 = "myas400"
    def as400User = "myuser"
    def as400Pwd = "mypwd"

    def sql = Sql.newInstance("jdbc:as400://${as400};naming=system;libraries=*LIBL;date format=iso;prompt=false",
                              as400User, as400Pwd, "com.ibm.as400.access.AS400JDBCDriver")

    dumpAdvisedAndExistingIndexes('LIB1', sql)
    dumpAdvisedAndExistingIndexes('LIB2', sql)
    dumpAdvisedAndExistingIndexes('LIB3', sql)

    A little further

    OK, now I have a beautiful script… what can I do with it? You can, for example:

    • reuse these closures to compare two libraries, two different iSeries, …
    • put this kind of Groovy script in a Jenkins job. I did a similar script to detect reserved keywords, and each developer can test his own library/schema via a parameterized Groovy Jenkins job.
    • document your database with similar scripts or a tool like SchemaSpy.
    • reuse the same approach for other DBMSs like DB2 LUW, Oracle, MySQL, …
    • mix this system information with your naming-convention checks


    Hibernate reviewer: Graphviz and Hibernate cfg to review jBPM 5.2

    Hibernate Reviewer

    OK… you read my article “dont-get-caught-hibernating”
    and now you are curious to test your mappings/annotations. The tool is now available on GitHub!

    Features

    For the moment there are two main features, starting from your Hibernate configuration:

    • generate a report of “violations”: it’s embeddable as a JUnit test. I hope to have it contribute to Sonar violations soon 😉
    • generate a Graphviz graph of your model

    Current rules:

    • AvoidIgnoreNotFoundMappingReviewRule
    • AvoidJoinMappingReviewRule
    • AvoidPropertyRefMappingReviewRule
    • BatchSizeMappingReviewRule
    • CachingMappingReviewRule
    • LazyLoadMappingReviewRule

    Samples from hibernate reviewer

  • some basic entity mappings with violations (HibernateTest)
  • some coming from jBPM 5.2 (Jbpm5GraphTest)

    The jBPM case

    Sample graph based on jBPM 5.2

    All red arrows show the following violations:

    org.jbpm.task.Task	BatchSizeMappingReviewRule	BLOCKER	no batch size at class level
    org.jbpm.task.Task	BatchSizeMappingReviewRule	BLOCKER	no batch size at collection level :org.jbpm.task.Task.subjects
    

    By default, jBPM doesn’t specify a batch-size at class or collection level. This leads to select N+1 issues. Adjusting the batch-size to a value that fits your needs can reduce this to 2 selects for N tasks returned.

    You can fix this by programmatically adjusting the batch-size on each PersistentClass or Collection:

    // raise the batch size of every mapped class and collection to avoid select N+1
    private void fixBatchSize(Configuration cfg) {
        Iterator persist = cfg.getClassMappings();
        while (persist.hasNext()) {
            PersistentClass persistentClass = (PersistentClass) persist.next();
            persistentClass.setBatchSize(50);
        }

        Iterator colls = cfg.getCollectionMappings();
        while (colls.hasNext()) {
            org.hibernate.mapping.Collection collection = (org.hibernate.mapping.Collection) colls.next();
            collection.setBatchSize(50);
        }
    }
    


    Performance: when average is not enough?

    Application performance can’t be summarized by an average and a standard deviation. Most performance issues aren’t so clear-cut… jamonapi can help identify your bottlenecks.

    Same average, same standard deviation, not the same reality

    Most application performance solutions collect performance data and keep only the average and standard deviation (stddev). But application performance rarely follows a normal distribution, and two samples with the same average and stddev can hide very different realities: for example, a bimodal sample where half the responses take 1.3 seconds and the other half 2.5 seconds has exactly the same average (1.9 s) and stddev (0.6 s) as a tight distribution centered on 1.9 s.

    Let’s suppose you have a first release of your app and see a histogram like this one.

    Most users are happy, with an average of 1.9 seconds and a standard deviation of 0.6 seconds.

    Now let’s introduce version 2.0 of the application. Our monitoring still shows an average of 1.9 seconds and a standard deviation of 0.6 seconds.

    But you receive a lot of feedback: 50% of your end-users are complaining about bad performance… what’s going on?

    On the left, the happy users… and on the right, your unhappy end-users!
    Fortunately, you can easily instrument your application with JAMon and discover this distribution.

    Jamon is to System.currentTimeMillis() what log4j is to System.out.println()

    jamon

    JAMon collects “start/stop” events and aggregates the logarithmic distribution of these events per monitor (a start/stop sketch follows the bucket list):

    • 0-10ms, 11-20ms, 21-40ms, 41-80ms, 81-160ms, 161-320ms, 321-640ms, 641-1280ms, 1281-2560ms, 2561-5120ms, 5121-10240ms, 10241-20480ms, >20480ms
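
    As a minimal sketch of the start/stop API (the monitor label “myUseCase” is an arbitrary example):

    import com.jamonapi.Monitor;
    import com.jamonapi.MonitorFactory;

    public class JamonDemo {
        public static void main(String[] args) throws Exception {
            // JAMon aggregates hits, avg/min/max and the range distribution per label
            Monitor mon = MonitorFactory.start("myUseCase");
            try {
                Thread.sleep(150); // the work you want to measure
            } finally {
                mon.stop();
            }
            System.out.println(MonitorFactory.getMonitor("myUseCase", "ms."));
        }
    }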

    It also keeps, for each monitor, additional information like:

    • Hits
    • Avg ms.
    • Total ms.
    • Std Dev ms.
    • Min ms.
    • Max ms.
    • Active
    • Avg Active
    • Max Active
    • First access
    • Last access

    The Active, Avg Active and Max Active values show the degree of concurrency of your monitor.

    JAMon features and advantages:

    • easy installation: drop 3 jars and a bunch of JSPs, that’s it
    • production-ready with low overhead
    • a servlet filter to monitor URL response times by just modifying the web.xml
    • a datasource wrapper to gather SQL statistics (just an extra bean in your application)
    • Spring integration via AOP (JamonPerformanceMonitorInterceptor)
    • for non-web applications like batch processing or JUnit, you can write the JAMon stats to a CSV via a JMX console or at the end of the process

    Real life usage

    0. measure, don’t guess
    1. enable it in production
    2. sort by total time
    3. detect what can be improved in these use case : db indexes, hibernate batch-size,…
    4. fix and rollout in production
    5. goto 1 😉

    Alternatives to JAMon

    • Codahale Metrics looks really promising, with implementations for gauges, ratio gauges, percent counters, histograms, meters, … and integrations with Guice, Jetty, Log4j, Apache HttpClient, Ehcache, Logback, Spring, Ganglia, Graphite (see the sketch after this list)
    • JavaSimon: quantiles, hierarchical monitors, nanosecond precision, … but it requires JDK 1.6 and I’m stuck with WebSphere 😦
    • datasource-proxy: no distribution histogram, but it can summarize SQL interactions per HTTP request, and it can be combined with other libraries
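
    A minimal sketch of the Metrics Timer API, assuming metrics-core 3.x package names (the timer name is an arbitrary example):

    import com.codahale.metrics.ConsoleReporter;
    import com.codahale.metrics.MetricRegistry;
    import com.codahale.metrics.Timer;

    public class MetricsDemo {
        public static void main(String[] args) throws Exception {
            MetricRegistry registry = new MetricRegistry();
            Timer timer = registry.timer("myUseCase"); // arbitrary metric name
            Timer.Context ctx = timer.time();
            try {
                Thread.sleep(150); // the work to measure
            } finally {
                ctx.stop();
            }
            // dump count, mean and percentiles to stdout
            ConsoleReporter.forRegistry(registry).build().report();
        }
    }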
