Jenkins as monitoring platform of the poor
The goal
The goal was to monitor the availability of some html pages and wsdl endpoints. I don’t really have access to the full monitoring infrastructure, but I wanted to keep an eye on my development servers, so I was looking for a lightweight way of monitoring them. I’ve mixed jenkins and groovy and ended up with a pretty, low-cost monitoring solution.
Install the necessary plugins and tools
Manage Jenkins > Manage Plugins :
– Groovy plugin : This plugin adds the ability to directly execute Groovy code.
– Green Balls : Changes Hudson to use green balls instead of blue for successful builds
– Groovy Postbuild Plugin : This plugin executes a groovy script in the Jenkins JVM. Typically, the script checks some conditions and changes accordingly the build result, puts badges next to the build in the build history and/or displays information on the build summary page.
Manage Jenkins > Configure System : Groovy > Groovy installations or Install automatically
For the groovy plugin you can use the built-in tool installer or just point it to an unzipped groovy distribution
GROOVY_HOME : /opt/groovy/
So let’s create a free-style jenkins job with the following settings :
Discard Old Builds : Max # of builds to keep : 100
Build periodically : */10 * * * *
Execute Groovy Script : groovy command :
// servers to monitor
def servers = ['ex1.server.com', 'ex2.server.com', 'ex3.server.com']
def wsdls = []
def simpleurls = []
servers.each { host ->
    wsdls.add("http://${host}/ws/MyWebService?wsdl")
    simpleurls.add("http://${host}/ui/MyConsole.html")
}

def koCount = 0
def slowCount = 0

// fetch a url, run the given check on its content and update the counters
def checkUrl = { url, check ->
    def status = 'KO'
    def host = ''
    def start = System.currentTimeMillis()
    try {
        def myurl = new URL(url)
        host = myurl.getHost()
        def text = myurl.getText(connectTimeout: 10000, readTimeout: 10000)
        def ok = check(url, text)
        status = ok ? 'OK' : 'KO'
        if (!ok) { koCount++ }
    } catch (Throwable t) {
        // unreachable host, timeout, http error,… : all counted as KO
        koCount++
    }
    def end = System.currentTimeMillis()
    // anything slower than 100 ms is counted as slow
    if ((end - start) > 100) {
        slowCount++
    }
    println "$host\t" + status + '\t' + (end - start) + '\t' + ' ' + url
}
def checkAllUrl = { urls, check -> urls.each { url -> checkUrl(url, check) } }

// a check receives the url and its content and returns true when the content is the expected one
def wsdlCheck = { url, content -> content.contains("wsdl:definitions") }
def pingCheck = { url, content -> content.contains("status=NORMAL") } // not used below, kept as another example
def contentCheck = { url, content -> content.contains("login") }

checkAllUrl(wsdls, wsdlCheck)
checkAllUrl(simpleurls, contentCheck)

// these two lines are parsed by the Groovy Postbuild script
println "ko.count=" + koCount
println "slow.count=" + slowCount
// a non-zero exit code marks the build as failed
if (koCount > 0 || slowCount > 0) {
    System.exit(-1)
}
The script builds two lists of urls, wsdls and simpleurls, from a list of servers.
A first closure, checkUrl, gets the content of a url and updates the koCount and slowCount counters. It also receives another closure that checks that the url returned the expected content.
Then, depending on the kind of content, call checkAllUrl with the matching check closure : wsdlCheck, contentCheck,…
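To make the contract between checkUrl and the check closures concrete, here is a minimal sketch that runs without any server. The fakeContent map and the fetch closure are assumptions made for the illustration (the real job reads the url with getText), and the checkUrl here is a stripped-down version without timing or counters.

```groovy
// Hypothetical canned responses standing in for the real servers
def fakeContent = [
    'http://ex1.server.com/ws/MyWebService?wsdl': '<wsdl:definitions name="MyWebService"/>',
    'http://ex1.server.com/ui/MyConsole.html'   : '<html><body>maintenance</body></html>'
]
// Stand-in for URL.getText() : returns the canned content of a url
def fetch = { url -> fakeContent[url] }

// Same shape as in the job : a check receives (url, content) and returns true when all is fine
def wsdlCheck    = { url, content -> content.contains('wsdl:definitions') }
def contentCheck = { url, content -> content.contains('login') }

// Simplified checkUrl : only records OK or KO per url
def results = [:]
def checkUrl = { url, check -> results[url] = check(url, fetch(url)) ? 'OK' : 'KO' }

checkUrl('http://ex1.server.com/ws/MyWebService?wsdl', wsdlCheck)
checkUrl('http://ex1.server.com/ui/MyConsole.html', contentCheck)
results.each { url, status -> println "$status\t$url" }
```

Adding a new kind of check is then just a matter of writing one more two-argument closure and passing it to checkAllUrl.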
Add a Groovy Postbuild : Groovy script:
// look for "<comp>.count=<value>" in the build log and show the value as a badge next to the build
def addShortTextSlow = { comp, shortcomp ->
    matcher = manager.getMatcher(manager.build.logFile, comp + ".count=(.*)\$")
    if (matcher?.matches()) {
        manager.addShortText(shortcomp + ' ' + matcher.group(1), "grey", "white", "0px", "white")
    }
}
addShortTextSlow('slow', 'slow')
addShortTextSlow('ko', 'ko')
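The postbuild script relies on the two "xxx.count=" lines printed at the end of the job. Here is a small sketch of that extraction in plain groovy, outside Jenkins : consoleLines and extractCount are hypothetical names, and the loop only imitates what the plugin does when it searches the log file.

```groovy
import java.util.regex.Pattern

// Hypothetical console output of one monitoring build
def consoleLines = [
    'ex1.server.com\tOK\t35\t http://ex1.server.com/ws/MyWebService?wsdl',
    'ko.count=1',
    'slow.count=2'
]

// Value captured on the first line matching "<comp>.count=(.*)$", or null when no line matches
def extractCount = { comp ->
    def pattern = Pattern.compile(comp + '.count=(.*)$')
    for (line in consoleLines) {
        def matcher = pattern.matcher(line)
        if (matcher.matches()) {
            return matcher.group(1)
        }
    }
    return null
}

println "ko ${extractCount('ko')}"
println "slow ${extractCount('slow')}"
```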
That’s it !
Subscribe to the jenkins “RSS for failures” feed or your preferred jenkins notification tool and benefit from the jenkins built-in ui !
You have a history of the checks :
You can easily embed this graph or the green/red ball in jira or your wiki :
<a href="http://myjenkins.com/job/monitoring/lastBuild/consoleText">
<img src="http://myjenkins.com/job/monitoring/buildTimeGraph/png" alt="200" title="200" border="0"/>
</a>
<a href="http://myjenkins.com/job/monitoring/lastBuild/consoleText">
<img src="http://myjenkins.com/job/monitoring/lastBuild/buildStatus" border="0">
</a>
The sky is the limit !
OK, now you get the idea… let’s add some more checks :
– check some open ports :
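// host and port are the server and the management port to test
try {
    s = new Socket(host, port)
    // opening then closing the streams is enough to prove that the port accepts connections
    s.withStreams { input, output -> }
    println "management port ok $host $port"
} catch (Exception e) {
    koCount++
    println "management port KO for $host $port : " + e.getMessage()
}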
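One possible variation, if you don’t want a firewalled port to block the job for too long : a sketch of the same check as a closure with a connect timeout. isPortOpen and the 2000 ms timeout are assumptions made for the example, and the demo runs against a local ServerSocket instead of a real server.

```groovy
// Returns true when a TCP connection to host:port succeeds within timeoutMs
def isPortOpen = { host, port, timeoutMs ->
    def socket = new Socket()
    try {
        socket.connect(new InetSocketAddress(host, port), timeoutMs)
        return true
    } catch (IOException e) {
        return false
    } finally {
        socket.close()
    }
}

// Demo against a local listener, so that the sketch runs without any real server
def listener = new ServerSocket(0)        // let the OS pick a free port
def port = listener.localPort
def openWhileListening = isPortOpen('127.0.0.1', port, 2000)
listener.close()
def openAfterClose = isPortOpen('127.0.0.1', port, 2000)
println "listening : $openWhileListening, closed : $openAfterClose"
```

In the job, the closure would be called for each host and each management port, incrementing koCount when it returns false.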
– access jmx beans
import javax.management.remote.*
def serverUrl = new JMXServiceURL('service:jmx:rmi:///jndi/rmi://ex1.server.com:9999/jmxrmi')
def server = JMXConnectorFactory.connect(serverUrl).MBeanServerConnection;
def memory = new GroovyMBean(server, 'java.lang:type=Memory')
println memory.listAttributeNames()
println memory.listOperationNames()
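Listing the attribute names is a start, but for monitoring you want a value to compare against a threshold. Here is a minimal sketch with the plain JMX api : it reads the heap usage and flags it above 90%. It assumes the local JVM as a stand-in for the remote server (in the job you would use the connection obtained from JMXConnectorFactory), and the 90% threshold is an arbitrary choice.

```groovy
import java.lang.management.ManagementFactory
import javax.management.ObjectName

// Assumption : the local JVM plays the role of the remote server
def server = ManagementFactory.getPlatformMBeanServer()

// HeapMemoryUsage is a composite value with init, used, committed and max (all in bytes)
def heap = server.getAttribute(new ObjectName('java.lang:type=Memory'), 'HeapMemoryUsage')
long usedBytes = heap.get('used')
long maxBytes = heap.get('max')           // -1 when the maximum is undefined
long usedMb = usedBytes.intdiv(1024 * 1024)

// Hypothetical rule : KO when more than 90% of the heap is used
def heapKo = maxBytes > 0 && usedBytes > 0.9 * maxBytes
println "heap used=${usedMb}MB ko=${heapKo}"
```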
– some jamon statistics :
def jamonurls = []
def jamonurlsuffix = '/jamonadmin.jsp?sortCol=2&sortOrder=desc&displayTypeValue=RangeColumns&RangeName=ms.&outputTypeValue=xml&formatterValue=%23%2C%23%23%23&TextSize=0&highlight=&ArraySQL=^WS-|^Fault&'
servers.each { host -> jamonurls.add("http://${host}" + jamonurlsuffix) }

// the jamonadmin.jsp output can't be parsed as is : keep only the JAMonXML part,
// wrap the labels in CDATA and prefix the range elements so that their names don't start with a digit
def fixJamonXml = { xml ->
    if (xml.indexOf("No data was returned") != -1) {
        return '<JAMonXML></JAMonXML>'
    }
    String content = xml.substring(xml.indexOf('<JAMonXML>'))
    def rangeLabels = ["0_10ms", "10_20ms", "20_40ms", "40_80ms", "80_160ms", "160_320ms", "320_640ms", "640_1280ms", "1280_2560ms", "2560_5120ms", "5120_10240ms", "10240_20480ms"]
    content = content.replaceAll('<Label>', '<Label><![CDATA[')
    content = content.replaceAll('</Label>', ']]></Label>')
    rangeLabels.each { rangeLabel ->
        content = content.replaceAll(rangeLabel, "range_" + rangeLabel)
    }
    content = content.replaceAll("LessThan_0ms", "range_LessThan_0ms")
    content = content.replaceAll("GreaterThan", "range_GreaterThan")
    return content
}
def jamonCheck = { url, content ->
    def monitors = []
    def JAMonXML = new XmlSlurper().parseText(fixJamonXml(content))
    // numbers are formatted with a thousands separator : 1,234
    def parseLong = { t ->
        if (t.text().equals("")) return null
        Long.valueOf(t.text().replaceAll(',', ''))
    }
    def parseLongString = { t ->
        if (t.equals("")) return null
        Long.valueOf(t.replaceAll(',', ''))
    }
    def parseRange = { rangeText ->
        // a range looks like : 15/10.2 (0/0/0), only the first two values (hits and average) are kept
        // http://docs.codehaus.org/display/GROOVY/Tutorial+5+-+Capturing+regex+groups
        def rangeFormat = /(.*)\/(.*) \((.*)\/(.*)\/(.*)\)/
        def matched = (rangeText.text() =~ rangeFormat)
        if (matched.matches()) {
            return ['label': rangeText.name(), hits: parseLongString(matched[0][1]), average: matched[0][2]]
        }
        return ['label': rangeText.name(), hits: 0, average: 0.0]
    }
    println "************************" + url
    // one map per monitor (one row of the jamon report)
    JAMonXML.children().each { row ->
        monitors.add([
            'label'      : row.Label,
            'units'      : row.Units,
            'hits'       : parseLong(row.Hits),
            'avg'        : parseLong(row.Avg),
            'total'      : parseLong(row.Total),
            'stddev'     : parseLong(row.StdDev),
            'lastvalue'  : parseLong(row.LastValue),
            'min'        : parseLong(row.Min),
            'max'        : parseLong(row.Max),
            'active'     : parseLong(row.Active),
            'avgActive'  : parseLong(row.AvgActive),
            'maxActive'  : parseLong(row.MaxActive),
            'firstAccess': row.FirstAccess,
            'lastAccess' : row.LastAccess,
            'ranges'     : [
                'range_LessThan_0ms'       : parseRange(row.range_LessThan_0ms),
                'range_0_10ms'             : parseRange(row.range_0_10ms),
                'range_10_20ms'            : parseRange(row.range_10_20ms),
                'range_20_40ms'            : parseRange(row.range_20_40ms),
                'range_40_80ms'            : parseRange(row.range_40_80ms),
                'range_80_160ms'           : parseRange(row.range_80_160ms),
                'range_160_320ms'          : parseRange(row.range_160_320ms),
                'range_320_640ms'          : parseRange(row.range_320_640ms),
                'range_640_1280ms'         : parseRange(row.range_640_1280ms),
                'range_1280_2560ms'        : parseRange(row.range_1280_2560ms),
                'range_2560_5120ms'        : parseRange(row.range_2560_5120ms),
                'range_5120_10240ms'       : parseRange(row.range_5120_10240ms),
                'range_10240_20480ms'      : parseRange(row.range_10240_20480ms),
                'range_GreaterThan_20480ms': parseRange(row.range_GreaterThan_20480ms)]
        ])
    }
    /**
     * index of each range in the ranges map (index 0 is LessThan_0ms)
     *  1      0     10ms
     *  2     10     20ms
     *  3     20     40ms
     *  4     40     80ms
     *  5     80    160ms
     *  6    160    320ms
     *  7    320    640ms
     *  8    640   1280ms
     *  9   1280   2560ms
     * 10   2560   5120ms
     * 11   5120  10240ms
     * 12  10240  20480ms
     * 13     >>  20480ms
     */
    // for each percentile, return the index of the range in which it falls
    def getPercentiles = { monitor ->
        def ps = [0.5, 0.8, 0.9, 0.95, 0.98, 0.99]
        def ranges = []
        monitor.ranges.eachWithIndex { it, i -> ranges.add(it.value.hits) }
        // share of the hits that falls in each range
        def rangeShares = []
        (0..13).each { i -> rangeShares.add(monitor.hits > 0 ? ranges[i] / monitor.hits : 0) }
        // cumulative share up to and including range i (0..i, since 1..0 would be a reversed range)
        def percentages = (0..13).collect { i -> rangeShares[0..i].sum() }
        def percentiles = ps.collect { percentile -> percentages.findIndexOf { it >= percentile } }
        return percentiles
    }
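    def percentileserrors = []
    monitors.each {
        def percentiles = getPercentiles(it)
        println percentiles.join('\t') + "\t" + it.label
        // percentiles[2] is the 90th percentile : error when it falls beyond range 8 (640-1280ms)
        if (percentiles[2] > 8) {
            percentileserrors.add(it.label)
        }
    }
    // like the other checks, return true when everything is fine
    return percentileserrors.isEmpty()
}
checkAllUrl(jamonurls, jamonCheck)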
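The percentile part is the least obvious bit of this check, so here is a small worked example of the idea on its own : turn hit counts per range into cumulative shares, then look up the first range that reaches each percentile. The hit counts are made up for the illustration.

```groovy
// Hypothetical hit counts per range, in the same order as the ranges map :
// index 0 = LessThan_0ms, 1 = 0_10ms, 2 = 10_20ms, … 13 = GreaterThan_20480ms
def hits = [0, 50, 20, 10, 8, 5, 3, 2, 1, 1, 0, 0, 0, 0]
def total = hits.sum()

// Cumulative share of the hits up to and including each range
def running = 0
def cumulative = hits.collect { running += it; running / total }

// For each percentile, index of the first range whose cumulative share reaches it
def ps = [0.5, 0.8, 0.9, 0.95, 0.98, 0.99]
def percentileBuckets = ps.collect { p -> cumulative.findIndexOf { it >= p } }
println percentileBuckets.join('\t')
```

With these counts the 90th percentile lands in range 5 (80 to 160 ms), well below the limit of 8 used in the check.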
Performance : when average is not enough ?
Posted by mestachs in java, performance on March 31, 2012
Application performance can’t be summarized by an average and a standard deviation. Most performance issues aren’t that clear-cut… jamonapi can help you identify your bottlenecks.
Same average, same standard deviation, not same reality
Most application performance solutions collect performance data but only keep the average and the standard deviation (stddev). But application performance rarely follows a normal distribution : two samples with the same average and stddev don’t imply equally happy users.
Let’s suppose you have a first release of your app and see a histogram like this one :
Most users are happy, with an average of 1.9 seconds and a standard deviation of 0.6 seconds.
Let’s introduce our version 2.0 of the application. Our monitoring still shows an average of 1.9 seconds and standard deviation of 0.6 seconds.
But you receive a lot of feedback : 50% of your end-users are complaining about bad performance… what’s going on ?
On the left the happy users… and on the right your unhappy end-users !
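To convince yourself that this can really happen, here is a small sketch with two made-up samples (response times in seconds) built to share the same average and standard deviation. The values and the 2 second tolerance are assumptions chosen for the illustration, not measurements.

```groovy
// Hypothetical response times in seconds
def v1 = [0.7d, 1.9d, 1.9d, 1.9d, 1.9d, 1.9d, 1.9d, 3.1d]   // one peak around 1.9 s
def v2 = [1.3d, 1.3d, 1.3d, 1.3d, 2.5d, 2.5d, 2.5d, 2.5d]   // two peaks : 1.3 s and 2.5 s

def mean = { xs -> xs.sum() / xs.size() }
def stddev = { xs ->
    def m = mean(xs)
    Math.sqrt(xs.collect { (it - m) * (it - m) }.sum() / xs.size())
}
// Share of the requests slower than a (hypothetical) tolerance of 2 seconds
def slowShare = { xs -> xs.count { it > 2.0d } / xs.size() }

[v1: v1, v2: v2].each { name, xs ->
    println String.format('%s mean=%.2f stddev=%.2f slower than 2s=%.1f%%',
            name, mean(xs), stddev(xs), slowShare(xs) * 100)
}
```

Both samples report the same mean and stddev, but in the second one half of the requests are above the 2 second mark.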
Fortunately, you can easily instrument your application with jamon and discover this distribution.
Jamon is to System.currentTimeMillis() what log4j is to System.out.println()
jamon
Jamon collects “start/stop” events and aggregates them, for each monitor, into a logarithmic distribution of the durations :
- 0-10ms. 11-20ms. 21-40ms. 41-80ms. 81-160ms. 161-320ms. 321-640ms.
- 641-1280ms. 1281-2560ms. 2561-5120ms.
- 5121-10240ms. 10241-20480ms. >20480ms.
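Each range is twice as wide as the previous one, which keeps the number of counters small while still covering 0 to 20 seconds. As a sketch of how a duration maps to a range (this only follows the list above, it is not jamon’s own code, and bucketFor is a hypothetical name) :

```groovy
// Upper bounds (ms) of the bounded ranges : 10, 20, 40, … 20480
def upperBounds = (0..11).collect { 10 * (2 ** it) }

// Label of the range a non-negative duration in ms falls into
def bucketFor = { ms ->
    def upper = upperBounds.find { ms <= it }
    if (upper == null) return '>20480ms'
    def lower = (upper == 10) ? 0 : upper.intdiv(2) + 1
    return "${lower}-${upper}ms".toString()
}

[3, 10, 11, 150, 1900, 30000].each { println "${it} ms -> ${bucketFor(it)}" }
```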
It also keeps additional information for each monitor, such as :
- Hits
- Avg ms.
- Total ms.
- Std Dev ms.
- Min ms.
- Max ms.
- Active
- Avg Active
- Max Active
- First access
- Last access
The active, avg active and max active values show the degree of concurrency of your monitor.
Jamon features and advantages :
- easy installation : drop 3 jars and a bunch of jsp, that’s it
- production ready with low overhead
- a servlet filter to monitor url response times just by modifying the web.xml
- a datasource wrapper to gather sql statistics (just an extra bean in your application)
- spring integration via aop with JamonPerformanceMonitorInterceptor
- for non-web applications like batch processing or junit, you can write the jamon stats to a csv via a jmx console or at the end of the process.
Real life usage
0. measure, don’t guess
1. enable it in production
2. sort by total time
3. detect what can be improved in these use cases : db indexes, hibernate batch-size,…
4. fix and roll out in production
5. goto 1
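Why sort by total time rather than by average ? A small sketch with made-up monitors (labels and numbers are hypothetical, shaped like the rows of the jamon report) :

```groovy
// Hypothetical monitors : times in ms
def monitors = [
    [label: 'WS-exportReport', hits: 20,    avg: 9000, total: 180000],
    [label: 'WS-findCustomer', hits: 12000, avg: 30,   total: 360000],
    [label: 'WS-login',        hits: 3000,  avg: 20,   total: 60000]
]

// Highest total time first, without modifying the original list
def byTotal = monitors.sort(false) { -it.total }
byTotal.each { println "${it.total}\t${it.hits}\t${it.avg}\t${it.label}" }
```

Here the report export has by far the worst average, but the customer lookup is where the servers spend most of their time, so that’s the first candidate for step 3.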
Alternatives to jamon
- codahale metrics looks really promising, with implementations for Gauges, Ratio Gauges, Percent Counters, Histograms, Meters,… and integration with Guice, Jetty, Log4j, Apache HttpClient, Ehcache, Logback, Spring, Ganglia, Graphite
- javasimon : quantiles, hierarchical, nanoseconds,… but it requires jdk 1.6 and I’m stuck with websphere
- datasource-proxy : sql, no distribution, but it can summarize sql interactions per http request. It can also be linked with other libraries







