Update 6/29/2015: This site is now running Lucee Server, not Railo.  

When I installed CFML on a Raspberry Pi, top performance was really the least of my worries.  I'm using ContentBox Modular CMS which allowed me to get this site up and running (via the CommandBox CLI) literally in about 30 minutes after booting a Raspberry Pi for the first time in my life.  

This was incredibly easy and the site performs great for a single page load with its in-memory H2 database.  Hitting the home page locally loads in less than 200ms.  Of course, ContentBox is a full-fledged CMS running on Hibernate ORM with theming support, dynamic content rendering, and modular plugins.  It's not necessarily optimized for an embedded device.  However, with 30 concurrent users I'm still able to get about 15 requests a second to a ContentBox page, which is a decent amount of traffic, with the average load time climbing to about 500ms.  I'd say serving over 50,000 pages an hour with half-second load times is pretty respectable for a full CMS on such a small device.  This pegs the CPU, so at that point it's all CPU-bound.

Here are my live memory and CPU stats during that test.  Hitting the site in my browser during the test was still responsive.

Them's Fightin' Words

After my post to Reddit, which was filled mostly with comments about performance, I got to wondering about what raw CFML performance I could get on a Pi.  There were a number of naysayers on Reddit who insisted that any use of the JVM on a Pi was foolish and incapable of practical use.  They seem to forget that some of the world's biggest sites (like Twitter) run on the JVM.  There was one Reddit user who has actually written a pure Java server called Rupy that looks pretty cool, and pretty fast.  He runs all his web projects on Pis hosted in data centers.  Sweet.

The Setup

To review, I'm using the embedded Railo server built into CommandBox because it's just so freaking easy to use.  I can start it up from the web root with this simple command at the interactive shell:

start host=192.168.1.xxx --rewritesEnable --!openbrowser 

I'm running Railo, but as soon as I get the development channel of CommandBox switched over to Lucee, I'll update my Pi to run on Lucee as well.  (Update 6/29/2015: This site is now running Lucee.)  Railo is running on Undertow and bound directly to port 80 so there's no web server in the mix like Apache.  I may add one later just to keep Undertow from having to process my static files like images and CSS.  I also added a small hack to the ServerService.cfc inside CommandBox and put in a JVM arg of:

-Xmx768m

I've made a note to add an official feature so anyone can add ad-hoc JVM args of their choice to the embedded server.

I created a subfolder called /bench off the main ContentBox site and put two files in it.  An empty Application.cfm and an index.cfm containing the following:

<h1>CFML</h1>
<cfoutput>#now()#</cfoutput>

I basically just want to see the overhead of spinning up the CF engine and processing requests.

Laying Siege

There are a handful of testing tools.  I looked at JMeter, Apache Bench (ab), and finally a tool called Siege.  Siege has a very simple command line usage.  Here is me hitting a URL 500 times: 20 concurrent users (-c) making 25 requests each (-r).

$> siege -r 25 -c 20 "http://192.168.1.xxx/bench/"
** SIEGE 3.0.5
** Preparing 20 concurrent users for battle.
The server is now under siege..      done.

Transactions:                    500 hits
Availability:                 100.00 %
Elapsed time:                   0.61 secs
Data transferred:               0.02 MB
Response time:                  0.02 secs
Transaction rate:             819.67 trans/sec
Throughput:                     0.03 MB/sec
Concurrency:                   18.49
Successful transactions:         500
Failed transactions:               0
Longest transaction:            0.22
Shortest transaction:           0.00

Perfect.  The main bit of data I'm interested in is the transactions (requests) per second, so I concocted a bit of shell magic to strip it out.

siege -r 25 -c 20 "http://192.168.1.xxx/bench/" 2>&1 | grep "Transaction rate:" | awk '{print $3}'

What's happening here is I redirect standard error into standard output, because for some reason Siege prints its summary to standard error.  Grep strips out just the line I want, and awk prints the 3rd column of data (delimited by whitespace).
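To see why the 2>&1 matters, here is a minimal sketch that doesn't need Siege installed.  The fake_siege function is a made-up stand-in that just prints a canned copy of the summary to standard error; everything after it is the same grep and awk pipeline.

```shell
#!/bin/bash

# Hypothetical stand-in for Siege: writes a canned summary to stderr.
fake_siege() {
	cat >&2 <<'EOF'
Transactions:                    500 hits
Availability:                 100.00 %
Elapsed time:                   0.61 secs
Transaction rate:             819.67 trans/sec
Throughput:                     0.03 MB/sec
EOF
}

# Without 2>&1 the summary never enters the pipe, so grep sees nothing.
# (stderr is sent to /dev/null here only to keep the demo quiet.)
without_redirect=$(fake_siege 2>/dev/null | grep "Transaction rate:" | awk '{print $3}')

# With 2>&1 stderr is merged into stdout before the pipe.
with_redirect=$(fake_siege 2>&1 | grep "Transaction rate:" | awk '{print $3}')

echo "without 2>&1: '$without_redirect'"
echo "with 2>&1:    '$with_redirect'"
```

The first line comes back empty and the second comes back with 819.67, which is the third whitespace-delimited field of the "Transaction rate:" line.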

Building The Testbed

So, with that in hand, I created the following shell script that puts my server "under siege" 21 times with an ever-increasing number of concurrent users, starting with a single user and ending with 100 users, but always roughly 5000 hits per cycle (the per-user repetition count is rounded down to a whole number).

#!/bin/bash

url="http://192.168.1.xxx/bench/"
reqs=5000

NUMS="1 5 10 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90 95 100"

for NUM in $NUMS
do
	RPS=$(siege -r $((reqs/NUM)) -c "$NUM" "$url" 2>&1 | grep "Transaction rate:" | awk '{print $3}')
	echo -e "$NUM\t$RPS"
done
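
One detail of the script above: $((reqs/NUM)) is integer division, so when 5000 doesn't divide evenly by the number of users, the cycle ends up slightly short of 5000 hits.  Here is a small sketch of that arithmetic; the plan function is just a name I made up for illustration.

```shell
#!/bin/bash

reqs=5000

# Print concurrency, repetitions per user, and the resulting total hits.
plan() {
	local num=$1
	local reps=$((reqs / num))   # shell integer division truncates
	printf '%s\t%s\t%s\n' "$num" "$reps" "$((reps * num))"
}

for num in 1 15 20 35 100
do
	plan "$num"
done
```

For 15 users that works out to 333 repetitions and 4995 hits, and for 35 users 142 repetitions and 4970 hits.  The difference is small enough not to matter for a requests-per-second comparison.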

I can run the test easily and the output looks like so:

$> ./load
1	185
5	644
10	859
15	964
20	1,018
etc...

The first column is the concurrency and the second column is the requests per second.  I didn't capture the average response time, but if you're curious, it was around 20ms.  I ran top in an SSH terminal on the Pi and jotted down the CPU usage at each stage.
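
If you did want the response time in the output as well, one way (a sketch, not what I actually ran) is to let awk pick both lines out of the Siege summary in a single pass.  The summarize function name is hypothetical, and the sample text is copied from the Siege run shown earlier.

```shell
#!/bin/bash

# Read a Siege summary on stdin; print "rate<TAB>response time".
summarize() {
	awk '/Transaction rate:/ {rate = $3}
	     /Response time:/    {resp = $3}
	     END {printf "%s\t%s\n", rate, resp}'
}

# Two lines from the earlier Siege summary, used as canned input.
sample='Response time:                  0.02 secs
Transaction rate:             819.67 trans/sec'

printf '%s\n' "$sample" | summarize
```

In the load script, this would replace the grep and awk stages, so the pipeline becomes siege ... 2>&1 | summarize and each row gains a third column.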

The Results

"Inspect Templates" is set to "never" in the Railo admin.  I also ran the test a couple of times so the server could warm up and cache whatever it needed.  Here are the results:

You may recall I tested up to 100 concurrent users, but things pretty much level off after 20 so I didn't bother showing the rest of the graph.  If you really must see it, click here. The throughput started to drop off slightly around 80 concurrent users.  

  • The left Y axis is requests per second
  • The right Y axis is CPU usage
  • You can see both steadily climbed and peaked at 1,042 requests per second and 86% CPU load

I think that is a very respectable number.  No one is going to have a CFML site this simple, but if you did, it could serve over 3.5 million requests an hour on a single $35 Pi!
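
For anyone checking my math, the hourly figures in this post are just the per-second rate multiplied by 3600.  A quick sketch (per_hour is a throwaway helper name):

```shell
#!/bin/bash

# Convert requests per second to requests per hour.
per_hour() {
	echo $(( $1 * 3600 ))
}

per_hour 15     # ContentBox page: 54000
per_hour 1042   # bare CFML template: 3751200
```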

We Can Do Better

You may have noticed CPU didn't completely peak.  If so, you're very observant.  I found there was actually a bottleneck in the bowels of Railo.  I pulled some stack traces via JStack like so during a test:

$> jstack -l java_pid > jstack.out

Every single running thread was executing the same line in Railo that processes cookies at the beginning of the request.  I pinged Micha, and we both determined this was actually a bug in Railo/Lucee.  When there are no cookies defined, there is a null pointer exception that gets thrown due to an object being null.  When Java has to process 1000 exceptions per second, that can start to add up.  Unfortunately, a try/catch with an empty catch{} block was completely covering up the error so no one had ever noticed (for years, probably).  I'm not a fan of empty catch blocks and this is a good reason why.
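
Spotting that kind of pile-up by eye gets tedious with a lot of threads.  Here is a rough sketch of how you could tally the top stack frame of every RUNNABLE thread in a dump.  The sample dump and its class names (example.Engine and so on) are invented for illustration and are not the real Railo frames; top_frames and sample_dump are hypothetical helper names.

```shell
#!/bin/bash

# Count how often each method is the top frame of a RUNNABLE thread.
# Reads a jstack dump from stdin (or from files passed as arguments).
top_frames() {
	awk '/java.lang.Thread.State: RUNNABLE/ { if ((getline line) > 0) print line }' "$@" |
		sed 's/^[[:space:]]*at //' | sort | uniq -c | sort -rn
}

# Made-up dump in the general shape jstack produces.
sample_dump() {
	cat <<'EOF'
"worker-1" #21 prio=5 runnable
   java.lang.Thread.State: RUNNABLE
	at example.Engine.readCookies(Engine.java:10)
	at example.Engine.service(Engine.java:42)

"worker-2" #22 prio=5 runnable
   java.lang.Thread.State: RUNNABLE
	at example.Engine.readCookies(Engine.java:10)
	at example.Engine.service(Engine.java:42)

"worker-3" #23 prio=5 waiting on condition
   java.lang.Thread.State: WAITING (parking)
	at example.Pool.take(Pool.java:7)
EOF
}

sample_dump | top_frames
```

Against a real dump you would run top_frames jstack.out instead.  If one frame dominates the count, as readCookies does in this fake sample, that's a good place to start looking.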

Micha patched the issue in Lucee, but Railo will never get this patch.  This is another good reason to switch over to Lucee once this cookie patch comes out on the dev branch, so I can rerun my tests and see how much more I can squeeze out of this Raspberry Pi!