ParsaLabs | Blog

A publication about the web and more.

Concurrency in Ruby, Simplified by Celluloid Gem


Here is a basic example of concurrent programming in Ruby. I’ll be using the excellent Celluloid gem, which simplifies the process significantly. In this snippet, we are going to process 100 (relatively lengthy) jobs, 20 at a time. Note that MRI (also called CRuby) cannot run Ruby code on several threads in parallel because of its Global Interpreter Lock (GIL). Threads that are sleeping or waiting on I/O do release the lock, but CPU-bound work will not run in parallel. For true parallelism you will need to switch to another implementation of Ruby. I ran and tested the following code sample on JRuby, so that’s what I suggest you use as well. It’s trivial to install and switch to JRuby with rbenv.

require "benchmark"
require "celluloid/autostart"

class MyWorker
  include Celluloid

  def do_lengthy_operation(id)
    sleep 2
    puts "worked on id: #{id}"
  end
end

pool = MyWorker.pool(size: 20)

time = Benchmark.measure do
  futures = []
  (1..100).each {|i| futures << pool.future.do_lengthy_operation(i)}
  futures.map(&:value)
end

puts time

We start by defining a worker class whose operation takes at least 2 seconds to complete, and include Celluloid in it. If we were to run this operation 100 times sequentially, it would take about 200 seconds to complete all of the jobs. But thanks to Celluloid we can create a pool of 20 workers and execute 20 jobs at the same time. That reduces the processing time to around 10 seconds (100 jobs / 20 workers × 2 seconds each). To prove that, we use Benchmark to measure the time taken to run this code. Sure enough, running the script produces the following output for me in the console (the last column, in parentheses, is the elapsed real time in seconds):

1.960000   0.130000   2.090000 ( 10.236000)

Feel free to change the pool size, play with other APIs such as async, or even randomize the sleep time (for example, using rand * 10), and see for yourself how each change affects the program’s execution. Also, don’t forget to check out Celluloid’s GitHub page and tutorials for more info.
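If you want to check the arithmetic without installing Celluloid, here is a minimal sketch of the same idea using only plain threads and a queue from the standard library. It is not how Celluloid implements its pools; run_jobs and the constants are names I made up for illustration, and the job time is shortened so the script finishes quickly. Because the jobs only sleep, the speed-up shows even on MRI.

```ruby
# Stand-in for the Celluloid pool: POOL_SIZE plain threads pulling job ids
# from a shared queue. JOB_SECONDS is shortened so the sketch runs quickly.
POOL_SIZE   = 20
JOB_COUNT   = 100
JOB_SECONDS = 0.05

def run_jobs(pool_size, job_count, job_seconds)
  jobs = Queue.new
  (1..job_count).each { |id| jobs << id }
  jobs.close # once the queue is drained, pop returns nil and workers stop

  done = Queue.new
  workers = Array.new(pool_size) do
    Thread.new do
      while (id = jobs.pop)
        sleep job_seconds # the "lengthy" operation
        done << id
      end
    end
  end
  workers.each(&:join)

  Array.new(done.size) { done.pop }
end

started  = Process.clock_gettime(Process::CLOCK_MONOTONIC)
finished = run_jobs(POOL_SIZE, JOB_COUNT, JOB_SECONDS)
elapsed  = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

sequential = JOB_COUNT * JOB_SECONDS
pooled     = (JOB_COUNT.to_f / POOL_SIZE).ceil * JOB_SECONDS

puts "finished #{finished.size} jobs in #{elapsed.round(2)}s"
puts "sequential estimate: #{sequential}s, pooled estimate: #{pooled}s"
```

Changing POOL_SIZE here has the same effect as changing the size passed to MyWorker.pool: the pooled estimate is the number of "rounds" (jobs divided by workers, rounded up) times the duration of one job.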

Batch Processing Database Records


Using the all method to loop through a large collection of records from the database is very inefficient, because it will try to instantiate all of the records at once. With large data sets this consumes a lot of memory. The solution is to use one of the batch processing methods in Rails, which load the records one batch at a time.

So, instead of doing this:

User.all.each do |user|
  user.do_sth
end

…use a batch processing method, like so:

User.find_each(batch_size: 5000) do |user| # the default batch size is 1000
  user.do_sth
end

Of course you can also chain it to other query methods such as .where().

Other options you can pass to .find_each() are start and finish (the upper bound has also gone by the name end_at; Rails 5.0 and later call it finish), which set the first and last ID (primary key) of the sequence, both inclusive:

User.find_each(start: 0, finish: 10000, batch_size: 500) do |user|
  user.do_sth
end

This is particularly useful if, for instance, you need worker 1 to handle records with IDs from 0 to 10,000 and worker 2 to handle those from 10,001 onwards.
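To make the idea concrete, here is a rough model of batch iteration in plain Ruby, with an in-memory array standing in for the table. It is only a sketch of the general technique (walk the primary key in order, one batch at a time) and not Active Record’s actual implementation; each_in_batches, RECORDS and the sample ids are all made up for illustration.

```ruby
# Hypothetical stand-in for a users table: ids with gaps, as after deletions.
RECORDS = [1, 2, 5, 8, 9, 12, 15, 20, 21, 30].map { |id| { id: id } }

# Yields records in primary-key order, batch_size at a time, optionally
# bounded by start and finish (both inclusive). With a real database only
# the current batch would be loaded; here everything is already in memory.
def each_in_batches(records, batch_size:, start: nil, finish: nil)
  in_range = records
    .select { |r| (start.nil? || r[:id] >= start) && (finish.nil? || r[:id] <= finish) }
    .sort_by { |r| r[:id] }

  last_id = nil
  loop do
    # In SQL terms: WHERE id > last_id ORDER BY id LIMIT batch_size
    batch = in_range.select { |r| last_id.nil? || r[:id] > last_id }.first(batch_size)
    break if batch.empty?

    yield batch
    last_id = batch.last[:id]
  end
end

# Two "workers" splitting the table at id 10 without overlapping.
worker_1 = []
each_in_batches(RECORDS, batch_size: 3, finish: 10) { |b| worker_1 << b.map { |r| r[:id] } }

worker_2 = []
each_in_batches(RECORDS, batch_size: 3, start: 11) { |b| worker_2 << b.map { |r| r[:id] } }

p worker_1 # => [[1, 2, 5], [8, 9]]
p worker_2 # => [[12, 15, 20], [21, 30]]
```

Note that the second worker starts one past the first worker’s upper bound; since both bounds are inclusive, reusing the same number for both would process that record twice.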

Polymorphism Through Duck Typing in Ruby


A while ago I was working on a fun pet project named “Toy Robot Simulator” in Ruby. You can find the source code and documentation at github.com/pouya314/ToyRobotSimulator. The simulator accepts 5 commands as valid input, parses each command and takes the appropriate action. At first, this part was done through an ugly switch (case) statement, which decided what needed to be done based on the user input.

To refactor the code, one possible solution is polymorphism. There are several ways to implement polymorphism, including inheritance. But, as I understand it, there is another “preferred” way in Ruby (and other dynamic languages) called duck typing: what matters is which methods an object responds to, not which class it inherits from. Common behavior can then be mixed in (by including a module), omitting the need to create a base class at all. This pattern can be seen in some built-in Ruby classes like Array and Hash, which both include Enumerable. This, in my opinion, is a true adjustment of behavior at runtime, easily made possible by the dynamic nature of Ruby, so I wanted to give it a try.

Here are some code snippets to illustrate how it’s done:

  # common functionality
  module ParsingConcerns
    def preprocess
      puts "I am preprocessing here.."
    end
  end

  class XMLParser
    include ParsingConcerns

    def parse
      puts "parsing xml..."
    end
  end

  class JSONParser
    include ParsingConcerns

    def parse
      puts "parsing json..."
    end
  end

  class GenericParser
    def parse(parser)
      parser.parse
    end
  end

Now you can do:

  parser = GenericParser.new
  parser.parse(XMLParser.new)   # output=> parsing xml...
  parser.parse(JSONParser.new)  # output=> parsing json...

Note that XMLParser and JSONParser do not inherit from GenericParser; they get their common functionality from the ParsingConcerns module, which they both include. GenericParser accepts a parser object and “expects” it to respond to the parse method. That’s the essence of duck typing: if it walks like a duck, swims like a duck and quacks like a duck, you can most certainly call it a duck!
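To underline that only the parse method matters, here is a small self-contained variation. CSVParser is a hypothetical class that shares neither a module nor a superclass with the other parser, and the respond_to? check is an optional extra of mine, not something duck typing requires. The parse methods return strings instead of printing so the result is easy to inspect.

```ruby
# Hypothetical parser that shares no module or superclass with the others.
class CSVParser
  def parse
    "parsing csv..."
  end
end

class JSONParser
  def parse
    "parsing json..."
  end
end

class GenericParser
  # Accepts anything that responds to #parse; fails with a clear error otherwise.
  def parse(parser)
    unless parser.respond_to?(:parse)
      raise ArgumentError, "#{parser.class} does not respond to #parse"
    end

    parser.parse
  end
end

generic = GenericParser.new
results = [CSVParser.new, JSONParser.new].map { |parser| generic.parse(parser) }
puts results
```

Without the guard, passing an object that lacks parse would raise NoMethodError instead; the explicit check only makes the error message friendlier.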

Back to the problem at hand: the strategy is to have a GenericParser that takes the command and runs it through some validations, then dynamically creates an object of a more specific type of command and sends the parse message to it. This way, the specific parsing (and validation) for each command happens in its own class. From an extensibility point of view, developers can add a new command by just introducing a new class and encapsulating all parsing and validation code in it, without having to hunt for and update any switch statement. Also, if a subset of commands requires the same validation, it can be done in a common superclass from which all such commands inherit. Note, however, that this is slightly different from the standard implementation of duck typing: instead of sending the finalized object to the generic class, I’m letting the generic class decide (dynamically) what type of object it should send the .parse() message to.
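The strategy above can be sketched roughly as follows. This is not the code from the repository: the Commands module, the Move and Place classes and the return values are hypothetical, chosen only to show how a command word can be turned into a class at runtime with const_get.

```ruby
# Hypothetical command classes; the real simulator's names may differ.
module Commands
  class Move
    def parse(args)
      raise ArgumentError, "MOVE takes no arguments" unless args.empty?

      [:move]
    end
  end

  class Place
    def parse(args)
      x, y, facing = args.join.split(",")
      raise ArgumentError, "PLACE needs X,Y,FACING" if facing.nil?

      [:place, Integer(x), Integer(y), facing]
    end
  end
end

class GenericParser
  def parse(input)
    name, *args = input.strip.split
    # Generic validation shared by all commands.
    raise ArgumentError, "empty command" if name.nil?
    raise ArgumentError, "unknown command: #{name}" unless name.match?(/\A[A-Za-z]+\z/)

    # "MOVE" -> Commands::Move; only classes defined directly in Commands count.
    class_name = name.capitalize
    unless Commands.const_defined?(class_name, false)
      raise ArgumentError, "unknown command: #{name}"
    end

    Commands.const_get(class_name, false).new.parse(args)
  end
end

parser = GenericParser.new
p parser.parse("MOVE")              # => [:move]
p parser.parse("PLACE 1,2,NORTH")   # => [:place, 1, 2, "NORTH"]
```

Adding a new command then means adding one class inside Commands; GenericParser itself does not change. Looking classes up inside a dedicated module, rather than in the global namespace, keeps user input from reaching unrelated constants.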

Implementation can be found here: https://github.com/pouya314/ToyRobotSimulator/blob/master/lib/robogame/simulator.rb Have a look at the Generic and other Parser classes.

Setting Up and Using Capybara With Rails


Capybara is an excellent choice for writing integration tests in your Rails apps. Rails doesn’t come with Capybara built in, but don’t worry: setting up and using Capybara is relatively smooth. In this post, I intend to show you how to do exactly that. So, let’s get to it:

First, add the following lines to your Gemfile:

gem 'capybara'
gem 'capybara-webkit'

and run the bundle command to install them. The second gem enables the WebKit driver, which allows Capybara to run (and test) JavaScript on the page.

Next, add the following to your test_helper.rb file:

require 'capybara/rails'

class ActionDispatch::IntegrationTest
  include Capybara::DSL
end

Capybara helps you test web applications by simulating how a real user would interact with your app. To achieve that, you need to mimic user actions such as visiting URLs, clicking links and buttons, filling in form data, etc. That’s exactly what the Capybara DSL allows you to do (and more), with its extensive library. The Capybara DSL reference lists everything that is available.

Now, we have to set a JavaScript driver for Capybara to use. You have a few options here, but I prefer WebKit, because it is much faster than Selenium. Open your config/environments/test.rb file, and add the following line to it:

Capybara.javascript_driver = :webkit

Note that for this to work, you need to have the Qt library installed. On a Mac it’s as simple as brew install qt.

At this point we are done with the setup. To use Capybara, you first need to generate an integration test with the following rails command:

rails generate integration_test <Enter the desired name of your test here>

Capybara uses rack_test as its default driver, which is very fast but doesn’t support JavaScript. We need to be able to (temporarily) switch to the WebKit driver when our test involves running and testing JavaScript. To do that, you can either use your class setup/teardown blocks…

setup do
  Capybara.current_driver = Capybara.javascript_driver
end

teardown do
  Capybara.use_default_driver
end

…or switch inside the test case itself:

test "javascript works" do
  # switch to webkit driver
  Capybara.current_driver = Capybara.javascript_driver

  # ... write your test here.

  # switch back to default driver
  Capybara.use_default_driver
end

Keep in mind that switching the driver creates a new session, so you may not be able to switch in the middle of a test.
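One thing to watch with the inline version: if an assertion fails before the last line, the switch back to the default driver never runs. A possible way around that is to wrap the switch in a helper with an ensure clause. The sketch below uses a stand-in module (FakeCapybara) rather than Capybara itself so that it runs on its own; with_javascript_driver is a hypothetical helper name, and in a real test you would call the Capybara methods shown earlier instead.

```ruby
# Stand-in for Capybara's driver settings so the sketch runs without the gem.
module FakeCapybara
  class << self
    attr_accessor :current_driver, :javascript_driver

    def use_default_driver
      self.current_driver = :rack_test
    end
  end

  self.current_driver    = :rack_test
  self.javascript_driver = :webkit
end

# Hypothetical helper: use the JavaScript driver for the block only, and
# switch back even if the block raises (for example on a failed assertion).
def with_javascript_driver
  FakeCapybara.current_driver = FakeCapybara.javascript_driver
  yield
ensure
  FakeCapybara.use_default_driver
end

driver_inside_block = nil
begin
  with_javascript_driver do
    driver_inside_block = FakeCapybara.current_driver
    raise "simulated test failure"
  end
rescue RuntimeError
  # The failure is expected in this sketch.
end

p driver_inside_block          # => :webkit
p FakeCapybara.current_driver  # => :rack_test
```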

And that’s about it! Capybara is a very simple and powerful tool for testing. I highly recommend you start using it.

File Upload With jQuery Ajax


jQuery’s .ajax() interface is used to perform asynchronous requests to the server. In the settings parameter you can specify additional data that needs to be sent to the server with the request. But if you try to send a file this way, you will notice that it doesn’t work! Here is a trick to solve that problem.

Let’s say you have an input element (with an id of image_input) on the page. You can access the chosen file like so:

// $() returns a jQuery object; [0] gives the underlying DOM element
var image_input_element = $("#image_input")[0];
var chosen_file = image_input_element.files[0];

But as mentioned above, you cannot directly pass it as your ajax data. Instead, you will need to wrap it in a FormData object:

var fd = new FormData();
fd.append('file', image_input_element.files[0]);

$.ajax({
  url: "<%= upload_url %>",
  data: fd,
  processData: false,  // do not turn the FormData object into a query string
  contentType: false,  // let the browser set the multipart Content-Type header
  type: 'POST',
  success: function(data) {
    console.log("ajax request done!");
  }
});

Now, your file is properly embedded in request data. On the (Rails) server side, you can access this data with params[:file] and do what you want with it.

Rails Secrets


Developers often need to store config values (such as access keys for external APIs) during development. Conveniently, Rails (since version 4.1) ships with a built-in secrets.yml file, which you can use to manage such config vars for the development, test and production environments.

Here is an example secrets.yml file (from one of my own apps):

development:
  secret_key_base: 4a7c0c0a1478943ccd5042a4asdadee8cb957ec2c24cc672509af3c09385b39aeda828691a65asdsadsadas
  s3_access_key_id: <%= ENV["S3_ACCESS_KEY_ID"] %>
  s3_secret_access_key: <%= ENV["S3_SECRET_ACCESS_KEY"] %>
  s3_bucket_name: <%= ENV["S3_BUCKET_NAME"] %>

test:
  secret_key_base: 02f6b64f39cdc6b0242fef06b88asdasdsad686d2b1a59536320504d2b870e457f53230e9asdasdasdasb6c4cfe612187asda270c

# Do not keep production secrets in the repository,
# instead read values from the environment.
production:
  secret_key_base: <%= ENV["SECRET_KEY_BASE"] %>
  s3_access_key_id: <%= ENV["S3_ACCESS_KEY_ID"] %>
  s3_secret_access_key: <%= ENV["S3_SECRET_ACCESS_KEY"] %>
  s3_bucket_name: <%= ENV["S3_BUCKET_NAME"] %>

As you see, I am storing my Amazon S3 credentials here. Later on, I access these secrets in an initializer:

AWS.config(access_key_id:     Rails.application.secrets.s3_access_key_id,
           secret_access_key: Rails.application.secrets.s3_secret_access_key)

S3_BUCKET = AWS::S3.new.buckets[Rails.application.secrets.s3_bucket_name]

To set the ENV variables on Mac OS X, edit your .bash_profile like so:

export S3_ACCESS_KEY_ID="...."
export S3_SECRET_ACCESS_KEY="..."
export S3_BUCKET_NAME="..."

Replace the dots with the actual values.
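If you are curious how the ERB tags in secrets.yml pick up those environment variables, the file is, roughly speaking, rendered through ERB first and then parsed as YAML. Here is a minimal standalone sketch of that two-step process using only the standard library; it approximates the idea and is not Rails’ own loading code. The bucket name is a made-up value, set inside the script only so that it runs on its own.

```ruby
require "erb"
require "yaml"

# Hypothetical value, set here only to keep the sketch self-contained.
# Normally it would come from your shell (e.g. .bash_profile).
ENV["S3_BUCKET_NAME"] = "example-bucket"

SECRETS_TEMPLATE = <<~SECRETS
  development:
    s3_bucket_name: <%= ENV["S3_BUCKET_NAME"] %>
  production:
    s3_bucket_name: <%= ENV["S3_BUCKET_NAME"] %>
SECRETS

# Step 1: ERB replaces each <%= ... %> tag with the value of the expression.
rendered = ERB.new(SECRETS_TEMPLATE).result

# Step 2: the rendered text is ordinary YAML.
secrets = YAML.safe_load(rendered)

puts secrets["development"]["s3_bucket_name"]
```

If the environment variable is missing, the tag renders as an empty string and the key ends up with no value, which is a common cause of confusing errors later in an initializer.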

One final tip: to generate new keys for your development and test sections, run rake secret in the terminal.
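As far as I know, rake secret simply prints a random 128-character hexadecimal string, so you can get an equivalent key from plain Ruby (for example in irb) with SecureRandom:

```ruby
require "securerandom"

# 64 random bytes, rendered as 128 hexadecimal characters.
key = SecureRandom.hex(64)

puts key
puts key.length
```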

And that’s it, now you have an easy way to manage your application secrets.

Enjoy & as always don’t forget to provide us with your valuable feedback. Cheers.

How I Do Testing


In this post, I intend to briefly talk about how I approach testing our production apps. This might seem obvious to some of you, dear readers, but I would like to emphasize the importance of testing your code: you should always write tests because they help you find bugs before release, and you most certainly don’t want your end users to catch those bugs if you are in any kind of serious software business. Furthermore, a wide-ranging test suite gives you the assurance that when you develop a new feature, it doesn’t break another one developed months or years earlier. Hence, the integrity of your app is maintained. The general rule of thumb is: the more tests you write, the better.

Over the years, my learning and experience have formed an opinion about testing that I still hold to this day. Let me start off by saying that I don’t do TDD religiously. After I develop the smallest unit of a feature (usually a task, or a user story if it’s small), I write enough unit tests to verify that this standalone unit of code (logic, business model or object) does what it’s supposed to do, as described in the specs. Once I reach a certain level of confidence, I move straight on to writing many (and by that, I mean a lot of) integration tests to ensure it blends with other components just fine and doesn’t break any other functionality. In other words, I make certain that the newly introduced component not only fulfils the requirement in isolation, but also plays well with the existing logic. So far, this combination has had me covered pretty well, and has resulted in very good quality products.

Talking about Rails specifically, I use Minitest for unit tests and Capybara for integration tests, as well as fixtures, of course. You might have noticed that I didn’t mention controller tests. That’s because (in my humble opinion) the C in MVC has traditionally played an orchestrating role and should not contain any business logic. And when there is no logic, there is little need to test anything, really. Integration tests check the overall flow of the app, which should already cover your controllers.

In practice, this approach has worked for me reasonably well so far.

Let me know what you think. :) Cheers!

A Git Tip


This will be a mini post to share a quick Git tip with you all. The following command is very useful if, for some reason, you need to view the changes made to a single file over time. It is basically the change history log, but for a single file, presented visually.

gitk [insert file name here]

It’s worth mentioning that I use SourceTree (from Atlassian) which is an awesome GUI for git. It has enabled me to gain a better insight into the evolution of my projects over time through an excellent graphical view. There is rarely any need for me to use the command line anymore. Give it a try.

Rails: Getting Next/previous Record


Here is my solution to this common problem, but first the problem statement:

In one of my apps I have Questions that belong to certain Courses, like so:

class Course < ActiveRecord::Base
  has_many :questions
end

class Question < ActiveRecord::Base
  belongs_to :course
end

What I want to do is, inside a course, show the first question (which is easy in Rails) and then, when the user answers it correctly, show a link to the “NEXT” question, and so on. How would you go about doing that? You certainly cannot just increment the id of the question currently displayed and hope that id+1 is the id of the next question: ids have gaps wherever records were deleted, and the next id may belong to a question in a different course.

The solution I have come up with is the following:

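class Question < ActiveRecord::Base
  belongs_to :course

  # Next question in the same course, or nil if this is the last one.
  def next
    course.questions.where("id > ?", id).order(id: :asc).first
  end

  # Previous question in the same course, or nil if this is the first one.
  def previous
    course.questions.where("id < ?", id).order(id: :desc).first
  end
end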

Now, we can do:

<%= link_to 'Next', @current_question.next %>
<%= link_to 'Previous', @current_question.previous %>

Of course, this is just one way of doing it. If you (the reader) have a different/better solution, please share it with us in the comments section.
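To see why the queries above look for "the closest id on either side" instead of id+1, here is the same logic on a plain array, runnable without Rails. The ids are made up, and next_id and previous_id are hypothetical helpers that mirror what the where/order/first chain does in the database.

```ruby
# Hypothetical question ids for one course; the gaps come from deleted rows
# and from questions that belong to other courses.
QUESTION_IDS = [3, 4, 9, 15].freeze

# Smallest id greater than current_id, or nil at the end of the list.
def next_id(ids, current_id)
  ids.select { |id| id > current_id }.min
end

# Largest id smaller than current_id, or nil at the start of the list.
def previous_id(ids, current_id)
  ids.select { |id| id < current_id }.max
end

p next_id(QUESTION_IDS, 4)      # => 9 (not 5)
p previous_id(QUESTION_IDS, 9)  # => 4
p next_id(QUESTION_IDS, 15)     # => nil, there is no next question
```

The nil at either end is worth handling in the view, for example by rendering the link only when a next or previous question exists.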

Cheers

TextMate 1.x in Mavericks


If, after upgrading your Mac to Mavericks, you run into issues with TextMate 1.x (e.g. some keyboard shortcuts for Rails stop working), here’s the fix:

From the TextMate menu, go to Preferences > Advanced. Select the Shell Variables tab. If you see a variable named PATH, edit it (by double clicking on it); otherwise, click the + button to add a new one. Enter the following value for the PATH variable:

/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin

The issues were caused by the fact that most of the TM bundles were written for Ruby 1.8, whereas in Mavericks Apple updated the default Ruby to 2.0. The PATH above tells TM to use Ruby 1.8 for bundle items.