Some Numbers at Last

So, a few weeks ago I made an offhanded post here about my new-found love for Rails. I'd been skipping off the surface of Ruby for a while, trying to decide if it, or Python, or Groovy, or something else, ought to fill out the empty slot in my tool belt. (I'll save the "why LISP isn't on this list" post for another time.) Rails seemed like an excellent way to put Ruby through a workout, and I had the right sized project to try it out with.

The project itself is not open-source; the client is now and shall remain anonymous. But they are paying me my going rate to do this work, which makes it a commercial Rails project, and it will in the future be installed into some rather large organizations. I can't really say much about what the application's domain is, but I can lay out the general principles of the app.

The application must support multiple users, in the order of 10-100 concurrently. The load isn't particularly high, but the application will be treated like a desktop app (more specifically, a spreadsheet replacement) so it has to be responsive. The application is for managing pieces of a large machine process. There are lots of types of components to be managed, and the relationships between them can be quite complicated. There are no one-to-one relationships in the model, but plenty of one-to-many and many-to-many. In addition to managing the components, the application has to allow for the addition of entirely new categories of components as well as a variety of customizable fields. Finally, the authorization rules call for a breadth of user roles with a complex mapping of permissions to those roles.

I've finally gotten around to running the profiling numbers and doing some comparison between the two systems. I won't spoil the suspense by listing my conclusions up front -- you'll have to scroll through the rest of the post to see them. But, first, let me set the stage: the original application is based on the Java/JSTL/Spring/Hibernate/MySQL stack. I used ACEGI for the security, SpringMVC for the web controller, and was using Hibernate 2.x for the O/RM. To increase performance, I was using the default output caching from SpringMVC and data caching in Hibernate. The re-implementation is in Rails on top of MySQL. The authorization is done through a custom observer and a custom authorization class, which uses a declarative rules file for permission mapping. For performance, I'm using a combination of page caching, action caching, and cache sweepers (observers whose job it is to expire cached pages).

Now, for the comparisons:

Time to Implement I made a comment about this in the previous posts on the topic, and that comment has been quoted widely out in the wide blogosphere as a classic example of Rails hype. So, let me make it plain: any time you re-write an application, it will go almost infinitely faster because you already have a firm grasp on the problem domain. Such was the case for me in my re-write; we'd spent so much time on the domain model and the database schema that the second time through the application, everything already made perfect sense to me. Any comparison of how long it took to implement one or the other is bogus, since the only fair comparison would be to implement two roughly functionally equivalent projects in the two different stacks and measure the competitive results. Since I have not done that, making statements about how it only took 5 days to re-implement the site are almost meaningless. What I can say is that I had more fun implementing it the second time, but that's just personal preference.

Lines of Code This one is a lot more interesting. Folks will tell you all the time that there is a running debate about the meaningfulness of LOC comparisons. They're right; there is a running debate. I just think it's moot. For my money, the fewer lines of code I have to write to get a piece of functionality, *as long as those lines are clear in meaning*, the better off I am. Obfuscated Perl programs don't make the grade; I can write some really concise Perl code and not have any idea, three months later, what the heck I was doing. But if the code is legible, its intent obvious, then more concise is better, pure and simple.

Bear in mind that this comparison is somewhat unfair. The Rails version of the app has been in development for an extra month (meaning a month's worth of new features added) while the Java version has been stagnant since the switch. Since the Rails version started out as an experiment, I don't have a clear history of the codebase to be able to produce a version that is identical in features to the Java version. Therefore, I've made the comparison based on the current state of the app vs. the abandoned Java version.

The LOC counts used here do not include comments or unit tests, nor do they include simple punctuation lines (lines with just a "}" or "end").

Without further ado:

Lines of Code
Rails: 1164
Java: 3293

Number of Classes:

Rails: 55
Java: 62

Rails: 126
Java: 549

Rails: 113
Java: 1161

Those numbers are beyond striking. The configuration count alone is enough to make me think long and hard about what I've been doing lately. Let me forestall any criticism about my "agenda", by the way. Somebody recently said that I must have an "anti-Java" or an "anti-Spring" agenda. That is far from the truth, since my Spring: A Developer's Notebook hits shelves any day now. I don't want people to stop using Spring, or Java. In fact, I want lots and lots of Java developers to use Spring and lots and lots of non-Java developers to start using Java so they can start using Spring. But one of the things I like most about Spring (its concise configuration) is still, well, huge, compared to the same app in Rails.

Performance This is where everybody really wants to see the numbers. So, for the sake of total specificity, the following numbers were generated on a 1.5GhZ Mac OSX (10.3.7) PowerBook with a 4200rpm hard drive and 1GB of RAM. The Java app is running on Jakarta Tomcat v 5.0.28, while the Rails app is running in Lighttp with FastCGI. The setups are standard for each application stack.

Since both stacks have different caching mechanisms, with different degrees of difficulty to manage them, I'll start the performance comparison with a walk through the application without using the caching systems. To generate the first number, I walked through every screen in the application (using the screens available in the Java version to form a subset of the screens available in the newer Rails version) hitting pages that have not been cached yet. This starts from logging in, then hitting every available piece of functionality once. These numbers do not include the time it took me (the user) to navigate to the next request, only the time to process the requests. Both applications had roughly equivalent logging turned on (obviously, it can't be exact, but without the logging, I couldn't provide measurements). Here's what I found out.

To walk the entire application feature set, once, in Rails, without caching, took 41.801s. To walk the exact same feature set in the Java app took 58.369s. These numbers are averages over five attempts with each app, with full restart between to give the cleanest runs possible. Those summary numbers are deceiving, though. What I found was that, the less complex the feature, the faster the Java app served it relative to the Rails app. The more complex the feature, the slower the Java app served it relative to the Rails app. Some of that difference might be changes to the model during the re-implementation based on a better understanding of the domain. Regardless, that difference comes out to show the Rails app at ~30% faster than the Java version when caching isn't taken into account. If I am generous and say that half of the difference is due to optimizations in the model, that still leaves us a 15% better performance in Rails.

The caching story is interesting. The default caching mechanism for Rails is either page caching (caching the output of the rendering to a physical html file and letting the server serve it directly instead of invoking Rails the next time it is requested) or action caching (similar, except the output is cached in memory or another kind of store and Rails is invoked to process the request). On the Java side, there is JSP pre-compilation, Spring output caching and of course Hibernate data caching. I used a simple load tester to test a couple of pieces of functionality in the application. For each URL, I ran the test without a pre-existing cache of the response, then again with it, 100 times each, and then determined the number of requests per second.

Functionality 1: Rails, 100 runs, no pre-existing cache: 75.59 request/second. Rails, 100 runs, pre-existing cache: 1754.39 requests/second. Java, 100 runs, no pre-existing cache: 71.89 requests/second. Java, 100 runs, pre-existing cache: 80.06 requests/second.

Functionality 2: Rails, 100 runs, no pre-existing cache: 62.50 r/s. Rails, 100 runs, pre-existing cache: 1785.15 r/s. Java, 100 runs, no pre-existing cache: 80.06 r/s. Java, 100 runs, pre-existing cache: 88.97 r/s.

These numbers are not a great comparison, because there is tons more I can do to increase the cache performance of the Java application. No doubt whatsoever (I just hadn't gotten around to it by the time I abandoned it). What *is* interesting, though, is that the Rails page caching maxes out . The server can't serve the pages any faster than that. And with the Rails cache_sweeper observer pattern, it is dead-simple to use this hyper-fast caching whenever possible (obviously, highly dynamic pages can't be cached, but that's the case with that kind of page in any application stack). If you switch to the action caching (which allows all the filters to be executed) the Rails app still ends up in the 800-1000 requests/second range.

Conclusions So what do I think? I think that the application I'm working on is perfectly suited for Rails and Rails is perfectly suited for it. I think that I have had more fun working on the Rails app than the Java version. However, I think that the Java version is just as capable, and could be just as performant, as the Rails app. To me, the eye-opening revelation isn't "Rails is faster than Java/Spring/Hibernate". It's "Rails can be very fast". I think that there is a lot to be said for Rails, and it deserves much of the press it is getting. However, I don't think its a Java-killer. I think there are plenty of applications, and development teams, that are better suited to Java and its immense universe of available support libraries. I certainly am not going to stop developing in and learning about Java just because I've discovered Rails. On the other hand, I am going to spend more of my time trying to find projects that I can use Rails on.

I'm extremely interested in how these numbers strike folks, and whether there are other comparisons that you would find useful. So interested, in fact, that I'm risking blog-spamming by turning comments back on. So, if you have some other comparisons you would find useful, or some insight onto these numbers, or even some things you think I ought to try to make the comparisons better, let me know. I can't promise I'll tackle them all, but I think it will be interesting.

Alex Miller

Carin Meier

David Chelimsky

David Nolen

Ghadi Shayban

Jeb Beich

Justin Gehtland

Lynn Grogan

Marc Phillips

Michael Nygard

Naoko Higashide

Paul de Grandis

Rich Hickey

Russ Olsen

Stuart Halloway

Stuart Sierra

Tim Baldridge

Aaron Bedra

Chris Redinger

Craig Andera

Don Mullen

Glenn Vanderburg

Jared Pace

Jason Rudolph

Jon Distad

Larry Karnowski

Michael Parenteau

Muness Alrubaie

Rob Sanheim

Sam Umbach

Kim Foster

Jaret Binford

Joe Smith

Clinton Dreisbach

Tim Ewald

Robert Randolph

Alex Redington

Joe Lane

Christian Romney