Archive for March, 2012
Application performance can’t be summarized by an average and a standard deviation. Most performance issues aren’t that clear-cut… JAMon can help you identify your bottlenecks.
Same average, same standard deviation, different reality
Most application performance solutions collect performance data but only keep the average and standard deviation (stddev). Application performance, however, rarely follows a normal distribution: two samples with the same average and stddev don’t imply equally happy users.
Let’s suppose you have a first release of your app and see a histogram like this one.
Most users are happy, with an average of 1.9 seconds and a standard deviation of 0.6 seconds.
Now let’s introduce version 2.0 of the application. Our monitoring still shows an average of 1.9 seconds and a standard deviation of 0.6 seconds.
But you receive a lot of feedback: 50% of your end users are complaining about bad performance… what’s going on?
On the left, the happy users… and on the right, your unhappy end users!
Fortunately you can easily instrument your application with JAMon and discover this distribution.
JAMon is to System.currentTimeMillis() what log4j is to System.out.println().
JAMon collects “start/stop” events and aggregates them per monitor into a logarithmic distribution:
- 0-10ms. 11-20ms. 21-40ms. 41-80ms. 81-160ms. 161-320ms. 321-640ms.
- 641-1280ms. 1281-2560ms. 2561-5120ms.
- 5121-10240ms. 10241-20480ms. >20480ms.
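The “start/stop” pattern looks like this in code; a minimal sketch, where the monitor label and the surrounding class are made up for the example:

```java
import com.jamonapi.Monitor;
import com.jamonapi.MonitorFactory;

public class InvoiceService {
    public void generateInvoice() {
        // the label groups all measurements of this section under one monitor
        Monitor mon = MonitorFactory.start("InvoiceService.generateInvoice");
        try {
            // ... the code to measure ...
        } finally {
            mon.stop(); // always stop, even on exception, or the active count drifts
        }
    }
}
```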
It also keeps, for each monitor, additional information such as:
- Avg ms.
- Total ms.
- Std Dev ms.
- Min ms.
- Max ms.
- Avg Active
- Max Active
- First access
- Last access
Active, avg active and max active show the degree of concurrency seen by your monitor.
JAMon features and advantages:
- easy installation: drop 3 jars and a bunch of JSPs, that’s it
- production ready, with low overhead
- a servlet filter to monitor URL response times by just modifying the web.xml
- a datasource wrapper to gather SQL statistics (just an extra bean in your application)
- Spring integration via AOP with JamonPerformanceMonitorInterceptor
- for non-web applications like batch processing or JUnit runs, you can write the JAMon stats to a CSV file via a JMX console or at the end of the process
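For the Spring integration, wiring the interceptor is a couple of lines of XML. JamonPerformanceMonitorInterceptor ships with Spring; in this sketch the pointcut and the com.example.service package are placeholders, and the aop namespace must be declared in your beans header:

```xml
<bean id="jamonInterceptor"
      class="org.springframework.aop.interceptor.JamonPerformanceMonitorInterceptor"/>

<aop:config>
    <!-- every service-layer method goes through the JAMon interceptor -->
    <aop:advisor advice-ref="jamonInterceptor"
                 pointcut="execution(* com.example.service..*.*(..))"/>
</aop:config>
```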
Real life usage
0. measure, don’t guess
1. enable it in production
2. sort by total time
3. detect what can be improved in these use cases: db indexes, hibernate batch-size,…
4. fix and roll out in production
5. goto 1 😉
Alternatives to JAMon
- codahale metrics looks really promising, with implementations for gauges, ratio gauges, percent counters, histograms, meters,… and integrations with Guice, Jetty, Log4j, Apache HttpClient, Ehcache, Logback, Spring, Ganglia and Graphite
- javasimon: quantiles, hierarchical monitors, nanosecond precision,… but it requires JDK 1.6 and I’m stuck with WebSphere 😦
- datasource-proxysql: no distribution histogram, but it can summarize SQL interactions per HTTP request, and it can be combined with other libraries
Hibernate is a great ORM and influenced much of the JPA standardisation. But a lot of its magic can turn into a curse if you don’t follow some best practices. Some features of the tool are just there to say “yes we can” but are, in my eyes, not production ready. After more than 6 years of use on large application developments… I figured out what works well and what doesn’t.
Model best practices
- use and abuse components
- avoid smart getters/setters
- they may break the lazy loading or batch-size mechanisms (e.g. a parent-child relationship calling firePropertyChange) or trigger undesired updates (trimming or rounding)
- if you really need them, switch to access="field"
- use custom user types, for example to trim/uppercase/unaccent strings automatically, or to convert db date types to their Joda-Time equivalents
- implement hashCode and equals wisely (and follow their contract)
- the db id can be null
- use getters/setters (proxy issues)
- don’t use reflection
- prefer HashCodeBuilder/EqualsBuilder
- avoid depending on other entities (breaks lazy loading)
- avoid inheritance, prefer composition
- polymorphism and the different hierarchy strategies are awesome… but they will bite you one day or another (performance, or the “impossibility” to change the “type”)
- prefer a cluster-safe id generation strategy like an identity column or a sequence
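The hashCode/equals advice above can be sketched as follows; Invoice and its number business key are made-up names. Equality goes through the getters (so a lazy proxy gets initialized), compares a business key rather than the possibly-null db id, and depends on no other entity:

```java
class Invoice {
    private Long id;        // database identity: may still be null before the insert
    private String number;  // business key used for equality

    public Long getId() { return id; }
    public String getNumber() { return number; }
    public void setNumber(String number) { this.number = number; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Invoice)) return false; // instanceof also accepts Hibernate proxies
        Invoice that = (Invoice) other;
        // compare through the getter, on the business key, never on the db id
        return getNumber() != null && getNumber().equals(that.getNumber());
    }

    @Override
    public int hashCode() {
        return getNumber() == null ? 0 : getNumber().hashCode();
    }
}
```

With commons-lang, HashCodeBuilder/EqualsBuilder express the same thing with less boilerplate, as long as you keep them off the reflection-based variants.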
Fetch mode, hibernate mappings
- the default should be lazy="true" in the mappings; initialize by navigation or with the query fetch mode in your DAO
- avoid lazy="false" like the plague: once such mappings are in… it’s hard to get them out!
- avoid select N+1:
- one of the plagues of most applications I’ve seen, and not only Hibernate-powered ones (Rails, raw JDBC)
- with the batch-size attribute at class and collection level you can gain in performance and stability
- a join isn’t always the better option (cartesian products; db engines aren’t always good at them)
- too much data loaded: sometimes it’s better to use Hibernate projections or named queries
- avoid not-found="ignore"
- not-found="ignore" on a many-to-one relationship forces Hibernate to check whether the record exists… and breaks any lazy loading mechanism
- a “many-to-zero-or-one” can be re-implemented as a one-to-many mapped with a set
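In the DAO, overriding the default lazy fetching for one query avoids the N+1 selects without touching the mapping; a sketch, where Order, its lines collection and customerId are hypothetical names and an open session is assumed:

```java
// one round-trip instead of 1 + N: join-fetch the lines for this query only,
// while the mapping itself stays lazy="true"
List<Order> orders = session.createCriteria(Order.class)
        .setFetchMode("lines", FetchMode.JOIN)
        .add(Restrictions.eq("customerId", customerId))
        .list();
```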
Hibernate query API
- prefer load/get/DetachedCriteria over the other Hibernate query APIs!
- avoid direct SQL: prefer externalized named queries for advanced SQL, stored procedure calls or legacy table mappings
- avoid “dynamic” HQL: concatenation issues (should I close the parenthesis? append “and”?) and the risk of a non-prepared statement
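Instead of concatenating HQL strings, a conditional query can be assembled with DetachedCriteria; a sketch with made-up Customer, name and minRevenue:

```java
DetachedCriteria crit = DetachedCriteria.forClass(Customer.class)
        .add(Restrictions.ilike("name", name, MatchMode.START));
if (minRevenue != null) {
    // no string building, no dangling parenthesis, and still a prepared statement
    crit.add(Restrictions.ge("revenue", minRevenue));
}
List<Customer> customers = crit.getExecutableCriteria(session).list();
```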
Some features are available… but use them wisely, or don’t use them at all 😉
- avoid property-ref: it forces Hibernate to look up the database… even if you know it’s an alternate key
- avoid caching ever-changing queries/entities:
- cache your metamodel instead (postal codes,…)
- don’t use the Hibernate default cache implementation
- be careful with temporal queries (“give me the VAT rate at this date”): it’s perhaps smarter to fetch all the VAT indexation events and find the right one through java iteration
- avoid composite keys
- use them only for legacy table mappings, and be aware that performance won’t be good, for example when doing batch loading
In the end it’s a db under the hood
- display the SQL logs in development mode (it will show you the effect of one click in your UI)
- via the sessionfactory settings (hibernate.show_sql)
- or via log4j: the org.hibernate.SQL category
- use JAMon (http://jamonapi.sourceforge.net/) to measure first (in production too!)
- don’t return 100000 records to your user: use paging!
- avoid casting in SQL:
- preserve the SQL types: for example 0 vs '0' may prevent your SQL engine from using the correct index
- avoid SQL with “ends with” like clauses (like '%foo'): they can’t use an index
- avoid duplicating indexes
- don’t use the Hibernate built-in connection pool
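Paging with the criteria API is two extra calls; a sketch, where Order, pageNumber and pageSize are illustrative names:

```java
List<Order> page = session.createCriteria(Order.class)
        .addOrder(org.hibernate.criterion.Order.desc("creationDate"))
        .setFirstResult(pageNumber * pageSize) // offset of the first record
        .setMaxResults(pageSize)               // never more than one page
        .list();
```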
Benefits from metadata
Through the session factory metadata you can:
- create a tool to enforce some rules: use identity columns for ids, don’t use property-ref,…
- do impact analysis: who is using this table/stored procedure,…
- generate your model documentation with a tool like Linguine Maps
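Such a rule-enforcement or impact-analysis tool starts by iterating over the metadata; a minimal sketch, assuming a built SessionFactory:

```java
// list every mapped entity with its identifier property;
// the same loop can assert naming or id-generation rules
Map<String, ClassMetadata> metadata = sessionFactory.getAllClassMetadata();
for (Map.Entry<String, ClassMetadata> entry : metadata.entrySet()) {
    ClassMetadata meta = entry.getValue();
    System.out.println(entry.getKey() + " -> id property: "
            + meta.getIdentifierPropertyName());
}
```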
Don’t use the java built-in date classes, use Joda-Time and enforce this rule with Sonar!
Don’t use java built-in classes
How many bugs in 5 lines of code?

```java
Date date = new Date(2007, 12, 13, 16, 40);
TimeZone zone = TimeZone.getInstance("Europe/Bruxelles");
Calendar cal = new GregorianCalendar(date, zone);
DateFormat fm = new SimpleDateFormat("HH:mm Z");
String str = fm.format(cal);
```
Just 6 bugs!

```java
int year = 2007 - 1900;
int month = 12 - 1;
Date date = new Date(year, month, 13, 16, 40);
TimeZone zone = TimeZone.getTimeZone("Europe/Brussels");
Calendar cal = new GregorianCalendar(zone);
cal.setTime(date);
DateFormat fm = new SimpleDateFormat("HH:mm Z");
fm.setTimeZone(zone);
Date calDate = cal.getTime();
String str = fm.format(calDate);
```
If you want deeper explanations see this [presentation]
java.util.Date issues
- From JDK 1.0
- Uses two-digit years (offset from 1900)
- January is 0, December is 11
- Should have been immutable
- Most methods deprecated in JDK 1.1
- Uses a milliseconds-from-1970 representation
java.util.Calendar issues
- From JDK 1.1
- Uses subclasses for different calendar systems
- January is 0, December is 11
- Should have been immutable
- Uses a dual representation internally
- a value for each field
- a milliseconds-from-1970 representation
- Odd performance and bugs
java.text.SimpleDateFormat issues
- Pattern-based date formatting
- “dd MMM yyyy”
- Requires a Date object
- Not thread-safe: see the FindBugs rule “Multithreaded correctness – Call to static DateFormat”
- Not especially fast
- A Sun RFE to make it thread-safe was ignored
SQL – java.sql.Date, Time, Timestamp issues
- java.sql.Date extends java.util.Date (!)
- Time extends Date (!)
- They override superclass methods just to block them (throws an exception)
- Timestamp adds nanoseconds
- equals() is broken
- All the problems of java.util.Date and more
- A timezone problem with new Time(long)
Avoid millis manipulation, let’s use Joda-Time!
When playing with java.util.Date you end up doing date calculations in millis:

```java
Date now = new Date();
long nowMillis = now.getTime();
Timestamp nowTimestamp = new Timestamp(nowMillis);
int days = 40; // value inferred from the output below
long future = 1000 * 60 * 60 * 24 * days; // the right-hand side is computed in int!
Timestamp expiryTimestamp = new Timestamp(nowMillis + future);
```

This last code sample contains a bug… int vs long for the days computation! For a “large days count” the int multiplication overflows, so the expiryTimestamp ends up before the nowTimestamp:

nowTimestamp 2011-02-04 12:45:40.381 expiryTimestamp 2011-01-25 19:42:53.085
Now let’s write the same code with Joda-Time:

```java
DateTime nowTimestamp2 = new DateTime();
DateTime expiryTimestamp2 = nowTimestamp2.plusDays(days);
```

It’s more readable… and most importantly it returns the correct value 😉

nowTimestamp 2011-02-04T12:45:40.443+01:00 expiryTimestamp 2011-03-16T12:45:40.443+01:00
Sonar to the rescue
Sonar can detect these issues:
- Multithreaded correctness – Call to static Calendar
- Multithreaded correctness – Call to static DateFormat
To fix them with Joda-Time, use DateTimeFormat: the resulting DateTimeFormatter is thread-safe and can live in a static final field (the pattern below is just an illustration):

```java
private static final DateTimeFormatter FMT = DateTimeFormat.forPattern("dd/MM/yyyy");
…
DateTime datetime = FMT.parseDateTime(duedate);
```

Or simply use toString(), which prints the ISO-8601 representation without any shared formatter.
But the step further is to banish the JDK date classes from your code base entirely!
To do so, define Sonar architectural constraints like:
- Arch : avoid java.util.GregorianCalendar
- Arch : avoid java.util.Date
- Arch : avoid java.text.SimpleDateFormat
- Arch : avoid java.sql.Timestamp
- Arch : avoid java.sql.Date
To banish JDK dates from your model, you may implement a Hibernate user type, a JAXB adapter,…
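A JAXB adapter for Joda-Time can be as small as this sketch; the class name and the ISO-8601 format choice are mine:

```java
import javax.xml.bind.annotation.adapters.XmlAdapter;
import org.joda.time.DateTime;
import org.joda.time.format.ISODateTimeFormat;

public class DateTimeXmlAdapter extends XmlAdapter<String, DateTime> {
    @Override
    public DateTime unmarshal(String value) {
        return ISODateTimeFormat.dateTime().parseDateTime(value);
    }

    @Override
    public String marshal(DateTime value) {
        return ISODateTimeFormat.dateTime().print(value);
    }
}
```

Annotate the model field with @XmlJavaTypeAdapter(DateTimeXmlAdapter.class) and java.util.Date disappears from the bindings.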
Secure your Jenkins
- shell script output
Manage your diskspace requirements
- install Nagios checks for diskspace monitoring
- install the diskusage plugin
- discard old builds
- disable maven artifact archiving
- the maven -e option
- install the job config history plugin
- use the template project plugin
- groovy system scripts
Secure your Jenkins
Configure Jenkins to use your LDAP or Active Directory. By default Jenkins is really open… even “Manage Jenkins” is available to anonymous users.
You can disable the logging of shell commands via:

```shell
set +o interactive-comments
set +o xtrace
```

There’s also a plugin that lets you centralize your users/passwords and masks them in the console logs.
Manage your diskspace requirements
Install Nagios checks for diskspace monitoring.
Install the diskusage plugin to gain visibility over the big consumers.
Discard old builds
Enable one of the 2 options:
- Days to keep builds
- Max # of builds to keep
Disable maven artifact archiving
By default Jenkins archives the poms, jars, wars and ears produced by maven. This is rarely useful when you use an enterprise repository, yet archiving is enabled by default… if you aren’t using the archived artifacts, disable it via:
Build > Advanced > Disable automatic artifact archiving
Specify the maven -e option
Get detailed error output from maven.
Install the job config history plugin
Knowing that something has changed in a project’s configuration is always good when something goes bad.
Use the template project plugin
You can reuse builders and publishers from other projects.
Get to know system groovy scripts
With a simple script you can detect/fix the various issues highlighted in this post.
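As an illustration, a system groovy script run from the script console can retrofit the “discard old builds” setting on every job; a sketch, where the 30-days/10-builds retention values are arbitrary:

```groovy
import hudson.model.Hudson
import hudson.tasks.LogRotator

// keep at most 10 builds or 30 days of history per job
Hudson.instance.items.each { job ->
    if (job.logRotator == null) {
        job.logRotator = new LogRotator(30, 10, -1, -1)
        job.save()
        println "fixed ${job.name}"
    }
}
```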