Jenkins as a poor man’s monitoring platform

The goal

The goal was to monitor the availability of some HTML pages and WSDLs. I don’t have access to the full monitoring infrastructure and wanted to check my development servers, so I was looking for a lightweight way of monitoring them. I mixed Jenkins and Groovy and ended up with a neat, low-cost monitoring solution ;)

Install the necessary plugins and tools

Manage Jenkins > Manage Plugins :
Groovy plugin : This plugin adds the ability to directly execute Groovy code.
Green Balls : Changes Hudson to use green balls instead of blue for successful builds
Groovy Postbuild Plugin : This plugin executes a groovy script in the Jenkins JVM. Typically, the script checks some conditions and changes accordingly the build result, puts badges next to the build in the build history and/or displays information on the build summary page.

Manage Jenkins > Configure System : Groovy > Groovy installations or Install automatically
For the Groovy plugin you can use the built-in tool installer or just point it to an unzipped Groovy binary distribution
GROOVY_HOME : /opt/groovy/

So let’s create a free-style Jenkins job with the following settings:

Discard Old Builds : Max # of builds to keep : 100
Build periodically : */10 * * * * (every 10 minutes)
Execute Groovy Script : groovy command :

// Hosts to monitor
def servers = ['ex1.server.com', 'ex2.server.com', 'ex3.server.com']

// Build the WSDL and HTML page URLs for every host
def wsdls = []
def simpleurls = []
servers.each { host ->
    wsdls.add("http://${host}/ws/MyWebService?wsdl")
    simpleurls.add("http://${host}/ui/MyConsole.html")
}

def koCount = 0      // URLs that failed or returned unexpected content
def slowCount = 0    // URLs that took more than 100 ms to answer

// Fetch one URL, run the given check on its content and print one result line
def checkUrl = { url, check ->
    def status = 'KO'
    def host = ''
    def start = System.currentTimeMillis()
    try {
        def myurl = new URL(url)
        host = myurl.getHost()
        def text = myurl.getText(connectTimeout: 10000, readTimeout: 10000)
        def ok = check(url, text)
        status = ok ? 'OK' : 'KO'
        if (!ok) { koCount++ }
    } catch (Throwable t) {
        // unreachable host, timeout, HTTP error...
        koCount++
    }
    def elapsed = System.currentTimeMillis() - start
    if (elapsed > 100) {
        slowCount++
    }
    println "${host}\t${status}\t${elapsed}\t ${url}"
}
def checkAllUrl = { urls, check -> urls.each { url -> checkUrl(url, check) } }

// Content checks: each one returns true when the content looks healthy
def wsdlCheck = { url, content -> content.contains("wsdl:definitions") }
def pingCheck = { url, content -> content.contains("status=NORMAL") }
def contentCheck = { url, content -> content.contains("login") }

checkAllUrl(wsdls, wsdlCheck)
checkAllUrl(simpleurls, contentCheck)

// These two lines are parsed later by the Groovy Postbuild script
println "ko.count=" + koCount
println "slow.count=" + slowCount

// A non-zero exit code marks the build as failed
if (koCount > 0 || slowCount > 0) {
    System.exit(-1)
}

The script builds two lists of URLs, wsdls and simpleurls, from the list of servers.
The checkUrl closure fetches the content of a URL, measures the response time and updates the koCount and slowCount counters. It also receives a second closure that checks whether the content is what we expect.
Then, depending on the kind of content, call checkAllUrl with the matching check closure: wsdlCheck, contentCheck, …
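
To make the check-closure idea concrete, here is a minimal sketch that runs without any network access. The sample page bodies and the `failing` helper are made up for illustration; only `contentCheck` mirrors the script above.

```groovy
// Sample page bodies standing in for real HTTP responses (hypothetical data).
def samples = [
    'http://ex1.server.com/ui/MyConsole.html': '<html><form id="login"></form></html>',
    'http://ex2.server.com/ui/MyConsole.html': '<html>Service unavailable</html>'
]

// A check closure receives the URL and its content and returns true when the content looks healthy.
def contentCheck = { url, content -> content.contains('login') }

// Hypothetical helper: apply one check to every page and collect the URLs that fail it.
def failing = { pages, check ->
    pages.findAll { url, content -> !check(url, content) }.keySet().toList()
}

def koUrls = failing(samples, contentCheck)
println "ko.count=" + koUrls.size()
koUrls.each { println "KO\t$it" }
```

Adding a new kind of check then only means writing one more closure with the same (url, content) signature.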

Add a Groovy Postbuild : Groovy script:

def addShortTextSlow = { comp, shortcomp ->
    // look for a line such as "ko.count=2" in the build log
    def matcher = manager.getMatcher(manager.build.logFile, comp + ".count=(.*)\$")
    if (matcher?.matches()) {
        manager.addShortText(shortcomp + ' ' + matcher.group(1), "grey", "white", "0px", "white")
    }
}
addShortTextSlow('slow','slow')
addShortTextSlow('ko','ko')
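
The post-build script relies on the "ko.count=" and "slow.count=" lines printed by the monitoring script. The sketch below imitates that lookup outside Jenkins on a sample log. It assumes the plugin scans the log line by line for the first full match; `findCount` and the sample log are hypothetical, and the dot is escaped here as a small tightening of the pattern.

```groovy
import java.util.regex.Pattern

// Sample console output, shaped like what the monitoring script prints (hypothetical data).
def log = '''ex1.server.com\tOK\t42\t http://ex1.server.com/ui/MyConsole.html
ko.count=2
slow.count=1'''

// Hypothetical stand-in for the plugin lookup: return the captured counter value
// for a prefix such as 'ko' or 'slow', or null when no line matches.
def findCount = { String text, String prefix ->
    def pattern = Pattern.compile(Pattern.quote(prefix) + '\\.count=(.*)$')
    for (line in text.readLines()) {
        def m = pattern.matcher(line)
        if (m.matches()) {
            return m.group(1)
        }
    }
    return null
}

println "ko " + findCount(log, 'ko')
println "slow " + findCount(log, 'slow')
```

The printed strings have the same shape as the badges the post-build script adds next to each build.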

That’s it !

Subscribe to the Jenkins “RSS for failures” feed, or use your preferred Jenkins notification tool, and benefit from the built-in Jenkins UI !

You have a history of the checks :

jenkins-monitoring-history

And trending
jenkins-monitoring-trend

You can easily embed this graph or the green/red ball in Jira or your wiki :

<a href="http://myjenkins.com/job/monitoring/lastBuild/consoleText">
    <img src="http://myjenkins.com/job/monitoring/buildTimeGraph/png" alt="200" title="200" border="0"/>
</a>

<a href="http://myjenkins.com/job/monitoring/lastBuild/consoleText">
    <img src="http://myjenkins.com/job/monitoring/lastBuild/buildStatus" border="0">
</a>

The sky is the limit !

OK, now you get the idea… let’s add some more checks

– check some open ports :

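try {
    // host and port are expected to be defined by the surrounding script
    s = new Socket(host, port)
    s.withStreams { input, output -> }   // the streams are closed when the closure returns
    println "management port ok $host $port"
} catch (Exception e) {
    koCount++
    println "management port KO for $host $port : " + e.getMessage()
}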
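
The snippet above connects without an explicit timeout, so a filtered port can block the job for a long time. A possible variant, as a sketch only: `isPortOpen` and its 2000 ms default are assumptions, and the local listener exists only to make the example self-checking.

```groovy
// Hypothetical helper: true when a TCP connection to host:port succeeds within timeoutMs.
def isPortOpen = { String host, int port, int timeoutMs = 2000 ->
    def socket = new Socket()
    try {
        socket.connect(new InetSocketAddress(host, port), timeoutMs)
        return true
    } catch (IOException e) {
        // connection refused, unreachable host or timeout
        return false
    } finally {
        socket.close()
    }
}

// Self-check against a throwaway local listener bound to a free port.
def listener = new ServerSocket(0)
def port = listener.localPort
def openWhileListening = isPortOpen('127.0.0.1', port)
listener.close()
def openAfterClose = isPortOpen('127.0.0.1', port)
println "listening: $openWhileListening, after close: $openAfterClose"
```

In the monitoring job you would call isPortOpen for each host and increment koCount when it returns false.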

– access jmx beans

import javax.management.remote.*
def serverUrl = new JMXServiceURL('service:jmx:rmi:///jndi/rmi://ex1.server.com:9999/jmxrmi')
def server = JMXConnectorFactory.connect(serverUrl).MBeanServerConnection;
def memory = new GroovyMBean(server, 'java.lang:type=Memory')
println memory.listAttributeNames() 
println memory.listOperationNames() 
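
Listing attribute names is only a starting point. As a sketch of turning a JMX value into a check, the example below reads the heap usage through the plain JMX API. It uses the local platform MBean server as a stand-in so it runs anywhere; the remote connection obtained above offers the same getAttribute call. The 90% threshold is an arbitrary assumption.

```groovy
import java.lang.management.ManagementFactory
import javax.management.ObjectName

// Local stand-in for the remote MBeanServerConnection.
def server = ManagementFactory.platformMBeanServer

// HeapMemoryUsage is exposed as CompositeData with init, used, committed and max entries.
def heap = server.getAttribute(new ObjectName('java.lang:type=Memory'), 'HeapMemoryUsage')
long usedBytes = heap.get('used')
long maxBytes = heap.get('max')   // -1 when the JVM reports no defined maximum

// Hypothetical threshold: flag the JVM when more than 90% of the maximum heap is in use.
def heapKo = maxBytes > 0 && usedBytes > 0.9 * maxBytes
println "heap used=${usedBytes} max=${maxBytes} ko=${heapKo}"
```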

– some jamon statistics :

jamonurls=[]
jamonurlsuffix='/jamonadmin.jsp?sortCol=2&sortOrder=desc&displayTypeValue=RangeColumns&RangeName=ms.&outputTypeValue=xml&formatterValue=%23%2C%23%23%23&TextSize=0&highlight=&ArraySQL=^WS-|^Fault&'

servers.each() {host ->	jamonurls.add("http://${host}"+jamonurlsuffix)}

def fixJamonXml= {
	xml ->
	if (xml.indexOf("No data was returned") != -1) {
		return '<JAMonXML></JAMonXML>';
	}
	String content = xml.substring(xml.indexOf('<JAMonXML>'));
	rangeLabels = [ "0_10ms", "10_20ms","20_40ms","40_80ms","80_160ms","160_320ms","320_640ms","640_1280ms","1280_2560ms","2560_5120ms","5120_10240ms","10240_20480ms"];
	content = content.replaceAll( '<Label>','<Label><![CDATA[');
	content = content.replaceAll( '</Label>',']]></Label>');
	rangeLabels.each() {
		rangeLabel -> content = content.replaceAll(rangeLabel, "range_" + rangeLabel);
	}
	content = content.replaceAll(  "LessThan_0ms", "range_LessThan_0ms");
	content = content.replaceAll( "GreaterThan", "range_GreaterThan");
	return content
}

def jamonCheck= {
	url,content ->
	monitors = []
	def JAMonXML = new XmlSlurper().parseText(fixJamonXml(content))
    def parseLong =  { t ->  if (t.text().equals("")) return null; Long.valueOf(t.text().replaceAll(',', ''))}
    def parseLongString =  { t ->  if (t.equals("")) return null; Long.valueOf(t.replaceAll(',', ''))}
    def parseRange = {
		rangeText ->	// 15/10.2 (0/0/0)
		// http://docs.codehaus.org/display/GROOVY/Tutorial+5+-+Capturing+regex+groups
		rangeFormat = /(.*)\/(.*) \((.*)\/(.*)\/(.*)\)/
		matched = ( rangeText.text() =~ rangeFormat )
		if (matched.matches()) {
			return [	'label':rangeText.name() , hits : parseLongString(matched[0][1]),average:matched[0][2]]
		}
		return [	'label':rangeText.name() , hits : 0,average:0.0]
	}
	println "************************"+ url
	JAMonXML.children().each() { row ->
		monitors.add( [
			'label' : row.Label,
			'units' : row.Units,
			'hits' : parseLong(row.Hits),
			'avg'  : parseLong(row.Avg),
			'total' : parseLong(row.Total),
			'stddev' : parseLong(row.StdDev),
			'lastvalue': parseLong(row.LastValue),
			'min' : parseLong(row.Min),
			'max' : parseLong(row.Max),
			'active' : parseLong(row.Active),
			'avgActive':parseLong(row.AvgActive),
			'maxActive':parseLong(row.MaxActive),
			'firstAccess':row.FirstAccess,
			'lastAccess' : row.LastAccess,
			'ranges' : [
				'range_LessThan_0ms' :parseRange(row.range_LessThan_0ms),
				'range_0_10ms' : parseRange(row.range_0_10ms),
				'range_10_20ms' : parseRange(row.range_10_20ms) ,
				'range_20_40ms' : parseRange(row.range_20_40ms),
				'range_40_80ms':parseRange(row.range_40_80ms),
				'range_80_160ms' : parseRange(row.range_80_160ms) ,
				'range_160_320ms' : parseRange(row.range_160_320ms),
				'range_320_640ms' : parseRange(row.range_320_640ms),
				'range_640_1280ms' : parseRange(row.range_640_1280ms),
				'range_1280_2560ms' : parseRange(row.range_1280_2560ms) ,
				'range_2560_5120ms' : parseRange(row.range_2560_5120ms),
				'range_5120_10240ms' : parseRange(row.range_5120_10240ms),
				'range_10240_20480ms' : parseRange(row.range_10240_20480ms),
				'range_GreaterThan_20480ms': parseRange(row.range_GreaterThan_20480ms)]
		] )
	}
	/**
	 *  1      0      10ms
		2     10      20ms
		3     20      40ms
		4     40      80ms 
		5     80     160ms
		6    160     320ms
		7    320     640ms
		8    640    1280ms
		9   1280    2560ms
		10  2560    5120ms
		11  5120   10240ms
		12 10240   20480ms
		13 >>      20480ms
	 */
	def getPercentiles = {monitor ->
	    def ps = [0.5,0.8,0.9,0.95,0.98,0.99]
		def ranges = [];
		monitor.ranges.eachWithIndex() {it, i -> ranges.add(it.value.hits) }	
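		// share of all hits that fell into each range
		def rangesShare = [];
		(0..13).each() {i -> rangesShare.add (monitor.hits>0?ranges[i]/monitor.hits:0)}
		// cumulative share up to and including range i (a 1..i slice would be a reversed range for i == 0)
		def percentages= (0..13).collect() {i -> rangesShare[0..i].sum()}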
		def percentiles = ps.collect{ percentile->percentages.findIndexOf{it>=percentile}}
	   return percentiles
    }
	percentileserrors = [];
	monitors.each {
		percentiles = getPercentiles(it)
		println percentiles.join('\t') + "\t"+it.label
		if (percentiles[2]>8) {
			percentileserrors.add(it.label)
		}		
	}	

	// the check passes only when no monitor exceeded the threshold
	return percentileserrors.isEmpty();
}
checkAllUrl (jamonurls,jamonCheck)
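
The percentile logic is easier to follow on a small example. The sketch below uses made-up hit counts for the first seven buckets, not real JAMon output, and finds the bucket in which each percentile falls. A bucket index above 8 (slower than 640_1280ms) for the 90th percentile is what the check above treats as an error.

```groovy
// Hypothetical hit counts per bucket, in the same order as the 'ranges' map:
// index 0 = LessThan_0ms, 1 = 0_10ms, 2 = 10_20ms, 3 = 20_40ms, 4 = 40_80ms, ...
def hits = [0, 50, 30, 10, 5, 3, 2]
def total = hits.sum()

// Running share of requests that completed in bucket i or a faster one.
def cumulative = []
def running = 0
hits.each { h ->
    running += h
    cumulative << running / total
}

// For each percentile, the index of the first bucket whose cumulative share reaches it.
def ps = [0.5, 0.8, 0.9, 0.95, 0.98, 0.99]
def percentileBuckets = ps.collect { p -> cumulative.findIndexOf { it >= p } }
println percentileBuckets.join('\t')
```

With these counts the 90th percentile lands in bucket 3 (20_40ms), so the monitor would pass.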

  1. #1 by chaitanya on January 20, 2014 - 10:54 am

    I have my app distributed on 2 diff servers I am using nginx for load balancing how can I monitor that using jenkins?
    Thanks !!
    Chaitanya

  2. #3 by Florent on August 20, 2014 - 12:38 pm

    Very good idea ! (but you should provide your script as a plugin. The Jenkins plugin sitemonitor is not as efficient as your script but quite more simple to install.)
