Rsync + Samba: Don't Archive!

Posted by dave
on Wednesday, February 17

Just a quick note to self (and whoever else it might help) – when doing an rsync backup to a mounted SMB file share, this works:

rsync -rvz --progress /path/to/source /path/to/destination/

This doesn’t:

rsync -avz --progress /path/to/source /path/to/destination/

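The -a option (“archive”) doesn’t work properly with SMB, so the files are never copied. -a is shorthand for -rlptgoD, so on top of recursing it also tries to preserve symlinks, permissions, timestamps, owner, group and device files, which is presumably what an SMB mount chokes on; plain -r skips all of that. Thanks to the Linux Blog for that one!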

Hubbub: A Quick Tour

Posted by dave
on Tuesday, February 02

We recently did some e-commerce work for an online startup called Hubbub. We’re pretty proud of it here, so it seems worth it to talk a little about the challenges involved in building it.

Most e-commerce jobs are relatively simple from a business process point of view – one website has one shop inside it, and in most cases, when you buy something it instantly gets sent for fulfillment (either physically or digitally). For example, if you buy a pair of shoes from Humanic (which we worked on last year), the shoes move into the delivery system immediately, and are usually sent out the next day. Fulfillment, i.e. the delivery of your shoes, is handled by a dedicated service provider external to the website itself.

Hubbub is completely different. Instead of having one shop and a relatively simple delivery system, the site is composed of multiple shops (a baker, an Italian deli, a fish shop, etc) which are in a delivery area – so the Hubbub site needs to keep track of which products have been ordered from which shops. New shops and new delivery areas can be added at any time. If Hubbub decides to expand into Kensington, Chelsea, and Clapham tomorrow, it can add those three neighbourhoods as new delivery areas, and add the shops from each neighbourhood into the right area. Specialist shops (like Christchurch Fish, which can serve all of London) can be shared across multiple delivery areas. The site then keeps track of which shops have had which items ordered from them.
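To make the shape of that a bit more concrete, here is a minimal plain-Ruby sketch of shops shared across delivery areas, with an order that remembers which items came from which shop. This is not the actual Hubbub code: the class names, the methods, and the “Example Bakery” shop are all made up for illustration.

```ruby
require "set"

# Hypothetical names throughout; the real Hubbub schema is not shown in this post.
Shop = Struct.new(:name)

class DeliveryArea
  attr_reader :name, :shops

  def initialize(name)
    @name  = name
    @shops = Set.new
  end

  # The same shop can be added to any number of areas (many-to-many).
  def add_shop(shop)
    @shops << shop
    self
  end
end

# Tracks which items were ordered from which shop within one order.
class Order
  def initialize(area)
    @area  = area
    @lines = Hash.new { |hash, shop| hash[shop] = [] }
  end

  def add_item(shop, item)
    unless @area.shops.include?(shop)
      raise ArgumentError, "#{shop.name} does not deliver to #{@area.name}"
    end
    @lines[shop] << item
    self
  end

  # Returns a hash of shop name => list of items ordered from that shop.
  def items_by_shop
    @lines.each_with_object({}) { |(shop, items), out| out[shop.name] = items.dup }
  end
end

fish    = Shop.new("Christchurch Fish")
baker   = Shop.new("Example Bakery") # made-up shop
clapham = DeliveryArea.new("Clapham").add_shop(fish).add_shop(baker)
chelsea = DeliveryArea.new("Chelsea").add_shop(fish) # shared specialist shop

order = Order.new(clapham)
order.add_item(fish, "haddock").add_item(baker, "sourdough").add_item(fish, "prawns")
```

Adding a new neighbourhood is then just a new DeliveryArea plus a few add_shop calls, and order.items_by_shop gives each shop its own pick list.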

Items can be scheduled for delivery at a specific hour at any time in the 30 days following the date of purchase. The site has its own fulfilment system built into it, automating the scheduling of store pickups and item deliveries for each delivery area on an hourly basis. Delivery drivers have their pickup and delivery schedules emailed to them every day in the early afternoon, allowing them to plan out their routes for the day. All in all, the site code is much more than a simple webshop: it’s a fully integrated, customized e-commerce software platform which enables Hubbub to function. It’s arguable that the Hubbub business would be impossible without the code to back it up.
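As a rough illustration of the scheduling rules, here is a small sketch in plain Ruby. It assumes a slot must start exactly on the hour and fall within 30 days of the purchase time, and it groups one day’s deliveries into an hour-by-hour list. The method names and the hash keys (:slot, :address) are invented for this example, not taken from the real system.

```ruby
SECONDS_PER_DAY     = 24 * 60 * 60
BOOKING_WINDOW_DAYS = 30

# True when `slot` starts on the hour and falls within 30 days of `purchased_at`.
def valid_delivery_slot?(purchased_at, slot)
  on_the_hour = slot.min.zero? && slot.sec.zero?
  in_window   = slot > purchased_at &&
                slot <= purchased_at + BOOKING_WINDOW_DAYS * SECONDS_PER_DAY
  on_the_hour && in_window
end

# Groups deliveries (hashes with :slot and :address) into an hourly schedule,
# ordered by hour, e.g. { "11:00" => [...], "14:00" => [...] }.
def hourly_schedule(deliveries)
  deliveries
    .group_by { |delivery| delivery[:slot].hour }
    .sort
    .map { |hour, items| [format("%02d:00", hour), items.map { |d| d[:address] }] }
    .to_h
end

purchased_at = Time.utc(2010, 2, 1, 9, 30)

deliveries = [
  { slot: Time.utc(2010, 2, 3, 14, 0), address: "1 Example Road" },
  { slot: Time.utc(2010, 2, 3, 11, 0), address: "2 Example Road" },
  { slot: Time.utc(2010, 2, 3, 14, 0), address: "3 Example Road" }
]

schedule = hourly_schedule(deliveries)
```

A schedule like this is the sort of thing that could be rendered into the daily email for each driver.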

Most sites take payment when a customer orders an item. On Hubbub, this isn’t possible because the price of the item may not be known when it’s ordered. In the case of a steak, a chicken, or a bag of fruit, for instance, the site may give an indicative price (steak is £3.99/lb), but the actual price of the steak which will be delivered isn’t known until the butcher actually cuts it up and weighs it on the day of delivery. This is a huge issue from a business process point of view. In order to make this relatively complex payment flow work, we wrote our own payment gateway integration code so that we can do 3-D Secure authentication with the SecureTrading Xpay gateway.
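The indicative-versus-actual price problem can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not our gateway code: the class name, the choice of pence and ounces as units, and the estimated weights are all assumptions made for the example.

```ruby
# Hypothetical model: prices in pence and weights in ounces,
# so that no floating-point rounding creeps into the money maths.
OUNCES_PER_LB = 16

class WeighedLineItem
  attr_reader :name

  def initialize(name, pence_per_lb, estimated_ounces)
    @name             = name
    @pence_per_lb     = pence_per_lb
    @estimated_ounces = estimated_ounces
    @actual_ounces    = nil
  end

  # Shown to the customer at order time; only an estimate.
  def indicative_price
    price_for(@estimated_ounces)
  end

  # Called once the shop has weighed the item on the day of delivery.
  def record_weight(ounces)
    @actual_ounces = ounces
    self
  end

  def weighed?
    !@actual_ounces.nil?
  end

  # The amount to charge; it cannot be known until the item has been weighed.
  def final_price
    raise "#{name} has not been weighed yet" unless weighed?
    price_for(@actual_ounces)
  end

  private

  # Rounds to the nearest penny.
  def price_for(ounces)
    Rational(@pence_per_lb * ounces, OUNCES_PER_LB).round
  end
end

# Steak at 399p/lb, estimated at a pound and a half.
steak = WeighedLineItem.new("steak", 399, 24)
```

Calling steak.final_price before steak.record_weight(...) raises, which mirrors the business rule: the card can’t be charged a final amount until the butcher has done the weighing.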

The end result is a site which looks deceptively simple. It unifies multiple webshops (which correspond to multiple real-world businesses) into a unified front-end, offering each shop its own login area for product management and order tracking. The Hubbub site administrators also have their own secure login area which allows them to track the progress of the business.

The shop code is based on a heavily-modified version of Spree retrofitted with our own internal role-based access control systems and an extensive test suite based on shoulda, machinist and cucumber.

If we made one mistake during development, it was to write too much new code before we’d adequately covered the Spree codebase with tests we liked. The default Spree comes with an rspec test suite which was incomplete (when we forked the code from Spree six months ago) – we like Shoulda and Test::Unit better, so we rewrote the suite. We should have gone all the way with the code coverage before writing any new code, though – it was amazing how much our development effort sped up once we got code coverage above ~70%. For new projects, we’ve instituted a 95% code coverage minimum on our CruiseControl server (with an actual internal target of 100%).

It’s a pretty good feeling – over the past few years, we’ve gotten way more agile. Codewise at least.

Rails Duckboards

Posted by dave
on Monday, February 01

These days, Rails is no longer the hot new thing in the way that it once was – as programming frameworks go, it’s settling happily into a vigorous middle age. The good thing about this is that it means a lot of those awkward teen moments are a thing of the past, and that the experimental choices of early adulthood have given way to a stable, productive lifestyle.

However, for anyone who is new to Rails, the learning curve can be very steep. For those of us who’ve been using Rails for years (I’ve been at it since -v=0.11), it’s easy to take for granted the amount of knowledge that we’ve accumulated without thinking about it. Given the choice of several million blog posts, screencasts, and mailing list postings about Rails, I suspect it wouldn’t be an easy thing these days to start fresh.

So, it seems worth it to try and provide some duckboards for new users. There’s a swamp of old blog posts, incorrect information, and simple irrelevance out there on the internetz, and it seems worth it to try and help out newcomers. Outlining what libraries we’re using here also seems like a good idea when they are different to the framework defaults (and worth it for a newbie).

So, if I had to give some advice to people learning Rails today, I’d tell them to focus on:

  • a good idea of where application logic should go
  • a firm understanding of ActiveRecord, its relationships, and when not to use it
  • test-first experience with RSpec, Shoulda, and/or Test::Unit (or all three)
  • good understanding of what RESTful routing is and how to use it
  • DRY coding style

Skinny Controllers – Fat Models

It’s easy to succumb to the temptation to pack all application logic into the controllers, but it’s a mistake, as detailed in Jamis Buck’s classic blog post Skinny Controller, Fat Model. Rails controllers should respond to form input in the minimum amount of code possible – the more application logic gets offloaded into the models, the better.
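Here is a stripped-down sketch of the idea, using plain Ruby classes as stand-ins for a Rails model and controller so that it runs on its own. The Basket and BasketsController names, the params shape, and the response hashes are all made up for the example.

```ruby
# Plain-Ruby stand-ins for a Rails model and controller; all names are hypothetical.
class Basket
  attr_reader :items, :status

  def initialize(items)
    @items  = items # array of { name:, pence: } hashes
    @status = :open
  end

  # Business rules live on the model, where they can be unit-tested directly.
  def total_pence
    items.sum { |item| item[:pence] }
  end

  def checkout
    return false if items.empty?

    @status = :placed
    true
  end
end

class BasketsController
  # The action only turns input into a model call and picks a response.
  def create(params)
    basket = Basket.new(params.fetch(:items, []))
    if basket.checkout
      { status: 201, body: "Order placed: #{basket.total_pence}p" }
    else
      { status: 422, body: "Your basket is empty" }
    end
  end
end

controller = BasketsController.new
placed = controller.create({ items: [{ name: "bread", pence: 250 }, { name: "jam", pence: 320 }] })
empty  = controller.create({})
```

The controller has no idea how totals are calculated or what makes a basket valid, so those rules can change without the controller being touched.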

RESTful routing

Routing connects URLs to code and also guides the basic controller structure of the application. Sticking with the recommended controller actions and nesting resources properly really cleans up a given application, making it more consistent and logical, and allows the easy construction of machine-accessible APIs. A good intro to RESTful routing is available on Daryn Holmes’ blog and there’s another good intro in the Rails guides.
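For newcomers, it can help to see the seven conventional actions written out. The sketch below is plain Ruby rather than a real routes file: restful_routes is a made-up helper that just lists the action, HTTP verb, and URL pattern Rails conventionally maps for a resource, with an optional prefix to show nesting.

```ruby
# Lists the seven conventional RESTful routes for a resource.
# `restful_routes` is an illustrative helper, not part of Rails.
def restful_routes(resource, under: "")
  collection = "#{under}/#{resource}"
  member     = "#{collection}/:id"
  [
    [:index,   "GET",    collection],
    [:new,     "GET",    "#{collection}/new"],
    [:create,  "POST",   collection],
    [:show,    "GET",    member],
    [:edit,    "GET",    "#{member}/edit"],
    [:update,  "PUT",    member], # later Rails versions also accept PATCH here
    [:destroy, "DELETE", member]
  ]
end

shop_routes    = restful_routes("shops")
product_routes = restful_routes("products", under: "/shops/:shop_id")

product_routes.each do |action, verb, path|
  puts format("%-8s %-6s %s", action, verb, path)
end
```

Nesting products under shops means every product URL carries its shop’s id, which is what keeps the controllers consistent from one resource to the next.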

Test-first development

We’ve decided to settle on the Shoulda gem (in conjunction with Test::Unit) for unit and functional tests. Of all the different testing frameworks we’ve tried out, it is the simplest and most readable way of ending up with a comprehensive test suite. There’s also a great screencast on how to use Shoulda. We’ve also settled on using the Machinist gem instead of fixtures: http://github.com/notahat/machinist. For small projects, fixtures are fine. For anything with a complex domain model, fixtures start to suck pretty quickly.
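To show why blueprint-style factories scale better than fixtures, here is a hand-rolled sketch in plain Ruby. To be clear, this is not Machinist’s API (see its README for the real thing); the Factory class, its blueprint and make methods, and the Product struct are invented here purely to illustrate the idea of sensible defaults plus per-test overrides.

```ruby
# A hand-rolled illustration of the blueprint idea; this is NOT Machinist's API.
Product = Struct.new(:name, :pence, :shop_name, keyword_init: true)

class Factory
  def initialize
    @blueprints = {}
    @sequence   = 0
  end

  # Registers a block that returns default attributes for `klass`.
  # The block receives a sequence number so generated names stay unique.
  def blueprint(klass, &defaults)
    @blueprints[klass] = defaults
  end

  # Builds an object from the defaults, overriding only what the test cares about.
  def make(klass, overrides = {})
    @sequence += 1
    attributes = @blueprints.fetch(klass).call(@sequence).merge(overrides)
    klass.new(**attributes)
  end
end

factory = Factory.new
factory.blueprint(Product) do |n|
  { name: "Product #{n}", pence: 100, shop_name: "Example Shop" }
end

cheap  = factory.make(Product)
pricey = factory.make(Product, pence: 2_500)
```

Each test states only the attribute it is actually testing (here, the price), instead of depending on a row in a shared fixture file that every other test also leans on.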

In order to generate our models, controllers, and scaffolds using the Test::Unit/Shoulda/Machinist stack, and get 100% test coverage on the scaffolds out of the box, we built our own gem which does the boring work for us: the shoulda_machinist_generator. Check it out :).

ActiveRecord (especially associations)

Reading and understanding pretty much the entirety of the ActiveRecord documentation is extremely important. Busting out of ActiveRecord and using raw SQL when necessary is still smart – but the point at which that happens should be pretty far up the complexity mountain; it shouldn’t be a first resort when doing rapid development. For optimization, yes; for doing extremely complex queries, yes; for doing basic has_many, has_many :through, has_one, or polymorphic queries, no. New Rails coders often seem to have trouble breaking out of a database-table-centric mentality, especially when it comes to associations – so read the basics about the ActiveRecord class methods. Every day when you wake up, repeat the phrase: in Ruby, everything is an object, so I should stop doing things like “validates_presence_of :article_id” (think in terms of the associated article object, not its foreign key column).

DRY

The temptation to use the “old standby” CTRL-C, CTRL-V when building new functionality is always hard to resist. Test-first development is one way to fight this evil urge; another is the rule of threes: writing code once is good, copying and pasting it into another part of the application is alright but not desirable, and if you find yourself copying and pasting it into a third place then you probably need to do some refactoring. Since all the code should be tested, refactoring should be easy because of the big testing safety net. Keep it DRY: don’t repeat yourself.

These are the basics of Rails development – once these are in place and working, it’s time to start configuring continuous integration servers (we use CruiseControl.rb), rcov for test coverage metrics, and figuring out how to deploy using Capistrano.

It’s interesting to note just how much the whole Rails ecosystem has grown. For someone moving across from a different environment (PHP, .NET or Java), Rails is no longer the lightweight framework that we used to know and love when we started with it. The core workflow has stayed pretty much the same – code generation does what it always did, ActiveRecord still makes us happy, etc. But the number of problems that people are solving in the wider Ruby and Rails ecosystem means that it’s possible to do amazing things quickly, if only you know where to look. Our own Branston gem is a good example of this – it solves a set of problems that advanced Rails users do have (automatic generation of Cucumber acceptance-test code) but which new users probably don’t need to know about. Wading through the blog swamp is hard work – anybody else have any Rails duckboards to help newcomers?