For the last little while, I’ve been messing around to see whether I can whip up an easily installable video-encoding server. It’s put together using Sinatra, Thin, and EventMachine (three of my favourite Ruby libraries lately).
At the point where it all started to work, I ran into a nasty order-of-operations problem. To fix it, I had to do a little research into how Ruby sets up blocks and procs.
The basic system is fairly simple. There’s a web interface written in Sinatra, which runs inside Thin. Off to the side of this running webapp is an EventMachine process sitting around waiting to encode videos. The webapp interface lets users define encoder profiles, which are sets of tasks for ffmpeg. As an example, we might define an Encoder with three EncodingTask objects attached to it:
- output to 320×240 Flash video
- output to 720×576 Ogg Theora
- make a 320×240 JPEG thumbnail
When this Encoder is applied to a new video, the tasks are run in order on the video. Once there are no more tasks, the video state is changed to “complete”.
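A minimal sketch of that data model, assuming plain Structs in place of whatever persistence layer the real app uses. The attribute names (name, command, output_file_suffix, encoding_tasks) are the ones the encoding method in this post relies on; the profile name and the ffmpeg options are only illustrative.

```ruby
# Hypothetical stand-ins for the Encoder and EncodingTask models.
EncodingTask = Struct.new(:name, :command, :output_file_suffix)
Encoder      = Struct.new(:name, :encoding_tasks)

# Illustrative profile: the ffmpeg options here are examples, not the
# ones the real application uses.
encoder = Encoder.new("web-delivery", [
  EncodingTask.new("flash video", "-s 320x240", ".flv"),
  EncodingTask.new("ogg theora",  "-s 720x576", ".ogv"),
  EncodingTask.new("thumbnail",   "-s 320x240 -vframes 1", ".jpg")
])

# The tasks live in an ordered array, so their position defines run order.
task_names = encoder.encoding_tasks.map(&:name)
puts task_names.inspect
```

Because the tasks sit in an ordered array, running them in order only needs a way to find the task that follows the current one.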
The code that accomplishes this is pretty simple, except for the fact that it’s multithreaded:

    def ffmpeg(task, video)
      command_string = "ffmpeg -i #{video.file} #{task.command} #{video.file + task.output_file_suffix}"
      encoding_operation = proc {
        video.state = "encoding"
        video.save
        Log.info("Executing: #{task.name}")
        `nice -n 19 #{command_string}`
      }
      completion_callback = proc {|result|
        if task == encoding_tasks.last
          video.state = "complete"
          video.save
        else
          current_task_index = encoding_tasks.index(task)
          next_task_index = current_task_index + 1
          next_task = encoding_tasks[next_task_index]
          ffmpeg(next_task, video)
        end
      }
      EventMachine.defer(encoding_operation, completion_callback)
    end

The first time through, the code is called with the first EncodingTask and whatever video is being encoded. The method then calls itself again for every EncodingTask until there are no more EncodingTasks left in the queue, at which point the video is marked as complete and everything chills out. The main motivations for this code are:
- We need to execute a long-running, blocking process (the ffmpeg encoding)
- We want the Sinatra/Thin application to keep running while we’re encoding, because we want to be able to queue up other videos to encode
The EventMachine.defer method does what we need here. EventMachine is designed to be blindingly fast for network operations by running a single-threaded, evented reactor loop, but that approach only works as long as nothing blocks the loop. For blocking work, like the call to ffmpeg here, EventMachine.defer comes to the rescue: by default, EventMachine sets up a pool of 20 Ruby threads, and EventMachine.defer hands the work to one of them so the reactor stays free.
Don’t confuse EventMachine.defer with EventMachine Deferrables, which are related to the single-threaded part of EventMachine. EM.defer is a method which takes two proc objects as its arguments: the first should contain your long-running operation, and the second is a callback which is executed, with the operation’s return value as its argument, once the long-running operation has completed.
The main problem with my code above is that it didn’t work.
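Before getting to what went wrong, the operation-plus-callback shape of EM.defer can be sketched without EventMachine at all. This is not how EventMachine implements it: defer_sketch and RESULTS are made-up names, a fresh thread stands in for the thread pool, and a Queue drained by the main thread stands in for the reactor. It only shows the handoff: the operation runs off the main thread, and the callback later runs on the main thread with the operation’s result.

```ruby
# Stand-in for the reactor: finished work waits here until the main
# thread picks it up. (Hypothetical; EventMachine has its own machinery.)
RESULTS = Queue.new

# Hypothetical helper with the same argument shape as EM.defer.
def defer_sketch(operation, callback)
  Thread.new do
    result = operation.call        # the blocking work runs on a worker thread
    RESULTS << [callback, result]  # hand the result back rather than calling directly
  end
end

log = []
operation = proc { sleep 0.1; "encoded" }  # pretend this is the ffmpeg call
callback  = proc { |result| log << "callback got #{result}" }

defer_sketch(operation, callback)
log << "main thread is still free"         # runs while the operation is sleeping

# The main thread (our pretend reactor) runs the callback with the result.
finished_callback, result = RESULTS.pop
finished_callback.call(result)

puts log
```

The point to notice is that the callback runs later, and somewhere other than where the operation ran, which is why it matters which variables each proc can see.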
I naively expected that I could set the local variable current_task_index inside my completion_callback proc and use it normally. This was a really dumb thing to do, as I realized once I did a little refresher research into Proc objects and how they work in Ruby.
Procs are closures: they get their variables from the scope in which they are defined, not the scope from which they are called. Misunderstanding this led to all kinds of exciting order-of-operations bugs in my code, which were all speedily fixed once I moved the current_task_index = encoding_tasks.index(task) line so that it was the first thing defined in the method.
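Here is a small sketch of both halves of that. The first part shows a proc resolving a variable in the scope where it was defined, even when it is called from somewhere else. The second part shows the rearranged method shape with the index computed on the first line. It is not the real method: run_chain is a made-up name, plain strings stand in for EncodingTask objects, and the procs are called directly instead of going through EventMachine.defer.

```ruby
# Part 1: a proc looks up variables in the scope where it was defined.
def build_callback
  label = "definition scope"
  proc { label }
end

def call_elsewhere(callback)
  label = "calling scope"  # same name, different variable; the proc never sees it
  callback.call
end

seen = call_elsewhere(build_callback)
puts seen

# Part 2: the rearranged shape, run synchronously for illustration.
def run_chain(task, tasks, log)
  current_task_index = tasks.index(task)  # first thing defined in the method
  operation = proc { log << "encode #{task}" }
  completion_callback = proc {
    next_task = tasks[current_task_index + 1]
    if next_task
      run_chain(next_task, tasks, log)
    else
      log << "complete"
    end
  }
  operation.call
  completion_callback.call
end

log = []
tasks = ["flv", "ogv", "jpg"]
run_chain(tasks.first, tasks, log)
puts log
```

Each call to run_chain gets its own current_task_index, and the callback built in that call closes over that particular variable, so every step of the chain knows its own position in the list.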
