Don't use css or table layout, use Sass and Compass

Posted by Guillaume Maury
on Feb 04, 09

Sorry for the tongue-in-cheek title, I’m trying to bait the news.ycombinator crowd, who got into another spiraling argument. Last time it was “did PHP win?”; now it’s CSS vs tables (a recurring one).

But I have the solution to end this religious war: use Compass and Sass!

Ok, just a summary for those too busy to read:

  • I’m not a designer and having to get the layout right is the activity I hate the most when making a web application
  • I feel that a lot of shortcomings of CSS can be solved by using a framework like BluePrint
  • But using blueprint negates the benefits of using semantic markup (since you start having weird classes like span-6 prepend-2)
  • Sass has a nifty feature called mixins, and Compass uses it to provide the power of frameworks like BluePrint while still keeping semantic classes

So now let’s get on with a tour of Sass and Compass

What is sass?

Remember when you first heard of Haml, combining the dreaded semantic indentation of Python and the gratuitous punctuation of Perl (thanks to Giles Bowkett for the description) to create a very cool and useful markup language? Well, Sass is the brainchild of the same guy.

What follows is yet another quick introduction of Sass (I should have instead just linked to the doc but it’s too late…)

Example of sass code:

#foo
  :font-width 1em
  :border 1px solid
  :padding
    left  10px
    right 5px
  a
    :text-decoration none

which compiles down to the following CSS

#foo { font-width: 1em; border: 1px solid; padding-left: 10px; padding-right: 5px; }
#foo a {text-decoration: none;}

So like with Haml, Sass gets rid of redundant markup (like the braces and ;) and instead uses semantic indentation to distinguish rules.

Variables in Sass

But there is more. Sass also supports variables:

!success_color = #7fb236
.success
  :color= !success_color
.success_box
  :background-color=  !success_color

It’s an easy way to change colors in your layouts by just changing the value of the variable. Magic numbers have long been a code smell when programming, why shouldn’t it be the same with CSS?

Mixins and Sass

Sass has a nice thing called mixins. Instead of copying and pasting some CSS code you always find yourself using, you can just use a mixin that will define this code to be reused later.

!border_highlight_color= #efefef
=message_box
  :border= 2px solid !border_highlight_color
  :padding 20px
  
.success
  +message_box
  :color= !success_color

will compile to

.success {
  border: 2px solid #efefef;
  padding: 20px;
  color: #7fb236;
}

So now at last, you can keep your style sheets DRY, no need to hang them outside in the sun… (sorry for the lame attempt at humor, no need to throw tomatoes, it won’t happen again)

UPDATE: As beejamin said in the comment below, the example is rather contrived, and in this case it would make more sense to just add a message_box class to the element. A more complex and less contrived example, with parameters, is Compass’s mixin for laying out a list horizontally:

=horizontal-list(!padding = 4px)
  +reset-box-model
  +clearfix
  li
    +no-bullet
    :white-space nowrap
    :float left
    :padding
      :left= !padding
      :right= !padding
    &.first
      :padding-left 0px
    &.last
      :padding-right 0px

Sass and Script functions

Sass has a few built-in functions allowing you to do things like this:

  .menu_bar
    :background-color= hsl(66, 6, 82)
    :width= abs(20px * -4)

will compile to

.menu_bar {
  background-color: #d3d4ce;
  width: 80px; }

It’s easy to add new functions by monkey patching the Sass::Script::Functions module. For example, see this gist where I added functions to determine the height and width of an image.

Wrapping it up, a cool example of Sass

UPDATE: I decided to add this section to better show the power of using mixins and functions in Sass (this uses the height function as explained above that you can find in this gist)

One technique that is often used to speed up load time is to use CSS sprites. I usually use this mixin to do it:

=vertical_background_sprite(!img, !n, !total)
  !sprite_height= height(!img)/!total
  :background= transparent url(!img) 0 (!sprite_height * -(!n - 1)) no-repeat
  :height= !sprite_height
  
#new_pick
  +vertical_background_sprite(/images/item_info_icons.png, 1, 5)
#new_favorite
  +vertical_background_sprite(/images/item_info_icons.png, 2, 5)
#tell_friend
  +vertical_background_sprite(/images/item_info_icons.png, 3, 5)
#share_link
  +vertical_background_sprite(/images/item_info_icons.png, 4, 5)
#report
  +vertical_background_sprite(/images/item_info_icons.png, 5, 5)

Like this, I don’t need to hardcode the exact size of the sprites I have in my image.
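
To make the arithmetic concrete, here is the same calculation as a rough Ruby sketch. The helper name and the 100px image height are made up for illustration; this is not part of Sass or Compass, it only mirrors what the mixin computes.

```ruby
# Hypothetical helper mirroring the mixin's arithmetic (not part of Sass or Compass).
# image_height: total height of the sprite image in px (assumed to divide evenly)
# n:            1-based index of the wanted sprite
# total:        number of sprites stacked vertically in the image
def vertical_sprite(image_height, n, total)
  sprite_height = image_height / total
  { :height => sprite_height, :offset_y => -(sprite_height * (n - 1)) }
end

first = vertical_sprite(100, 1, 5)
third = vertical_sprite(100, 3, 5)

puts "first: height #{first[:height]}px, background offset #{first[:offset_y]}px"
puts "third: height #{third[:height]}px, background offset #{third[:offset_y]}px"
```

With a 100px image holding 5 icons, each icon is 20px high, the first one sits at offset 0 and the third one at -40px.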

For a programmer like me, Sass is a godsend: it makes my CSS code DRY and more readable, and makes it feel more like a real programming language. Of course, the big problem is finding a designer who understands Sass…

Compass

Compass is a meta framework for Sass. It uses the power of Sass’s mixins to let you use frameworks like BluePrint and YUI without the semantic nightmare of having all your classes given barbaric names like span-1 or prepend-5.

I’m going to use the BluePrint library from compass as an example because that’s what I use in my project.

In a nutshell, BluePrint is a CSS framework that gives you a grid that is by default 950px wide, with 24 columns of 30px each separated by 10px gutters (all of this is customizable through the !blueprint_grid_columns, !blueprint_grid_width and !blueprint_grid_margin variables).
Compass provides mixins to make it easy to use the BluePrint grid.

So if you have a div with an id of sidebar, you can do this

#sidebar
  +column(5)
  +prepend(1)

This will make the #sidebar div 5 columns wide, with one empty column before it on the left. It will compile to the following CSS code:

#sidebar {
  float: left;
  width: 190px;
  margin-right: 10px;
  padding-left: 40px; }
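
The numbers in that CSS follow directly from the grid settings. As a quick check, here is the arithmetic in Ruby. The method names are made up for illustration and are not part of Compass; the defaults are the 30px column and 10px gutter mentioned above.

```ruby
# Hypothetical helpers (not Compass API): reproduce the Blueprint grid arithmetic.
GRID_WIDTH  = 30 # px, width of one grid column
GRID_MARGIN = 10 # px, gutter between two columns

# Width of an element spanning n columns: n columns plus the n - 1 gutters between them.
def column_width(n, width = GRID_WIDTH, margin = GRID_MARGIN)
  n * width + (n - 1) * margin
end

# Left padding that leaves n empty columns (each with its gutter) before the element.
def prepend_padding(n, width = GRID_WIDTH, margin = GRID_MARGIN)
  n * (width + margin)
end

puts column_width(5)    # 190, the width of #sidebar
puts prepend_padding(1) # 40, its padding-left
puts column_width(24)   # 950, the full grid
```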

For those who need a fluid layout, there is blueprint/modules/liquid.sass that provides a liquid layout.

I find it very easy to use the Compass BluePrint mixins for layout and it brings me the flexibility of CSS (easy to adapt the design to say an iphone without changing the html markup, quicker to load in browsers,…) while still having a very simple and quick way to style my page exactly as I want it to be.

For more information on Compass and some good primers, visit the blog of its creator and the Compass wiki on GitHub

Concrete Example of using Compass and Sass for creating a layout that people generally use tables for

UPDATE: After rereading this article, I thought that maybe I should show an example of Compass in action

Say you want to create a four column layout

For this in Sass (using Compass) you can do this:

  @import blueprint.sass

  body
    +blueprint-typography
    .container
      +container
  #header
    +column(24)
  #left_bar
    +column(7)
  #content
    +column(11)
  #fourth_column
    +column(1)
  #right_bar
    +column(5,last)

So you just define how many columns of the blueprint grid you need for each of your columns…

and use the following haml:

!!! Strict 
%html
  %head 
    %link{:rel => 'stylesheet', :type => 'text/css', :media => "screen", :href => '/css/four_columns.css'}
  %body
    .container
      #header
        %h1 Some nifty header
      #left_bar
        Left bar
      #content
        Lorem ipsum ad infinitum
      #fourth_column
        This is a 4th col  
      #right_bar
        Right bar

And you can see the result in all its glory here

If instead you want to use liquid layout, you just need to add @import blueprint/modules/liquid.sass

NB: Now I must confess that I never had to use a liquid layout in the projects I’ve been working on with Sass and Compass and just now when I tried it didn’t work (the last column fell off). After modifying the !blueprint_liquid_grid_margin to 0.4em it worked. I’ll ask on the mailing list about that and see how it goes…

I don’t know about you, but I find that this is at least as simple as using a table layout and the html code is cleaner too. Of course this could have been done with blueprint but in this case we have kept semantic names for the ids and classes. We do lose one thing from this technique though: it’s not easy to reorder the content without touching the markup (at least if we want to use the blueprint mixins)

For those thinking that using such a grid is not flexible, remember that we’re using Sass: the size of the grid and the number of columns are defined by variables. So you can redefine the number of grid columns by setting !blueprint_grid_columns (or !blueprint_liquid_grid_columns if you’re using the liquid layout), the width of each grid column with !blueprint_grid_width (or !blueprint_liquid_grid_width), and so on…

Don’t use CSS or table layouts, use Compass and Sass

So to conclude with why you should use Compass and Sass

Compass and Sass

  • allow you to make your layout code DRY
  • behave like a programming language, so you can have fun refactoring your code
  • are easy to install and start with
  • take care of the IE 6 incompatibilities by using the code from frameworks (I also use the excellent ie7-js library from Dean Edwards)
  • allow you to easily design a grid-based layout while still keeping semantic names

Now, I don’t like spending a lot of time on a layout when I could spend that time on more productive things like the backend code or the javascript code of my website, but Compass and Sass have considerably reduced the pain of having to make a website “pretty”. And compared to dealing with tables, I find the results much cleaner and more maintainable.

I’m currently available for hire on a contract basis.

httperf and File Descriptors

Posted by Guillaume Maury
on Feb 04, 09

When running httperf, you may get fd-unavail errors like this:

httperf --timeout=5 --client=0/1 --server=domU-12-31-39-00-A9-F7 --port=80 --uri=/articles.html --rate=500 --send-buffer=4096 --recv-buffer=16384 --num-conns=20000 --num-calls=10
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE
Maximum connect burst length: 50

Total: connections 9789 requests 96039 replies 95959 test-duration 44.312 s

Connection rate: 220.9 conn/s (4.5 ms/conn, <=1022 concurrent connections)
Connection time [ms]: min 6.5 avg 4139.6 max 13432.2 median 3770.5 stddev 1697.6
Connection time [ms]: connect 495.6
Connection length [replies/conn]: 9.998

Request rate: 2167.4 req/s (0.5 ms/req)
Request size [B]: 88.0

Reply rate [replies/s]: min 2054.9 avg 2250.7 max 2559.2 stddev 155.9 (8 samples)
Reply time [ms]: response 242.4 transfer 123.3
Reply size [B]: header 242.0 content 17073.0 footer 0.0 (total 17315.0)
Reply status: 1xx=0 2xx=95959 3xx=0 4xx=0 5xx=0

CPU time [s]: user 1.42 system 17.77 (user 3.2% system 40.1% total 43.3%)
Net I/O: 36803.9 KB/s (301.5*10^6 bps)

Errors: total 10405 client-timo 194 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 10211 addrunavail 0 ftab-full 0 other 0

fd-unavail means that the per-process limit on the number of open files has been exceeded. httperf uses select() and needs a new file descriptor for each connection it opens concurrently. So let’s increase the limit on file descriptors.

  • Edit /etc/security/limits.conf and add the line * hard nofile 65535 (or instead of * you can put the username of the user for whom you want to change the limit)
  • Edit /usr/include/bits/typesizes.h and change #define __FD_SETSIZE 1024 to #define __FD_SETSIZE 65535 (in /usr/include/sys/select.h FD_SETSIZE is defined as __FD_SETSIZE)
  • Download and recompile httperf
  wget ftp://ftp.hpl.hp.com/pub/httperf/httperf-0.9.0.tar.gz
  tar xvzf httperf-0.9.0.tar.gz
  cd httperf-0.9.0
  ./configure && make
  sudo make install

I’ve tested those steps on Ubuntu.
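
Before and after changing the limits, it can help to check what a process actually gets. Here is a small Ruby check, assuming a Unix-like system; the 20_000 target is just an example number (the --num-conns value used above), not something httperf reports.

```ruby
# Read the open-file limits of the current process: [soft limit, hard limit].
soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)

# Example target only: how many concurrent connections you would like to allow.
wanted = 20_000

puts "soft limit: #{soft}, hard limit: #{hard}"
if soft < wanted
  puts "soft limit is below #{wanted}; expect fd-unavail errors at that concurrency"
end
```

Note that this only shows the limit from limits.conf. Because httperf uses select(), it is still capped by the FD_SETSIZE it was compiled with, which is why the recompile step is needed.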

NB: This is one of the reasons you should make sure that nginx is using epoll (Linux) or kqueue (BSD) instead of select (this is determined at ./configure time, but you can force it with use epoll; in the events block).

Merb-cache's methods

Posted by Guillaume Maury
on Dec 19, 08

Introduction

Important note

A lot of this code needs the version of merb-cache found at http://github.com/benschwarz/merb-cache. There have been a few patches made to it particularly for eager_cache and fetch_partial

Beware that the code and api might change; I will keep this article up-to-date with the changes.

The cache controller method

cache *actions, conditions

Caches the result of the action. Accepts two specific options:

  • :store (or :stores) use the specified store
  • :params list of params to pass to the store

conditions is then passed to the store, so any condition supported by the store (e.g. :expire_in for MemcachedStore) can be used.

  def cache(*actions)
    conditions = extract_options_from_args!(actions) || {}
    actions.each {|a| cache_action(a, conditions)}
  end

  def cache_action(action, conditions = {})
    before("_cache_#{action}_before", conditions.only(:if, :unless).merge(:with => [conditions], :only => action))
    after("_cache_#{action}_after", conditions.only(:if, :unless).merge(:with => [conditions], :only => action))
    alias_method "_cache_#{action}_before", :_cache_before
    alias_method "_cache_#{action}_after",  :_cache_after
  end

As you can see it adds a before filter and an after filter.

The before filter (_cache_before)

  def _cache_before(conditions = {})
    unless @_force_cache
      if @_skip_cache.nil? && data = Merb::Cache[_lookup_store(conditions)].read(self, _parameters_and_conditions(conditions).first)
        throw(:halt, data)
        @_cached = true
      else
        @_cached = false
      end
    end
  end
  
  def _lookup_store(conditions = {})
    conditions[:store] || conditions[:stores] || default_cache_store
  end

@_force_cache is set by the force_cache! instance method, which is called when eager caching (see below). @_skip_cache is set by the skip_cache! instance method.

You can override default_cache_store to set a default store for your controller

The after filter (_cache_after)

  def _cache_after(conditions = {})
    if @_cached == false
      if Merb::Cache[_lookup_store(conditions)].write(self, nil, *_parameters_and_conditions(conditions))
        @_cache_write = true
      end
    end
  end

Pretty trivial: it just gives the current instance of the controller to the store’s write method (for more details on what happens then, see http://gom-jabbar.org/articles/2008/12/09/merb-cache-and-its-stores) and uses _parameters_and_conditions(conditions) to determine which parameters and conditions to pass on to the store.

_parameters_and_conditions(conditions)

This method is used to determine which parameters will be passed to the cache store.

def _parameters_and_conditions(conditions)
  parameters = {}

  if self.class.respond_to? :action_argument_list
    arguments, defaults = self.class.action_argument_list[action_name]
    arguments.inject(parameters) do |parameters, arg|
      if defaults.include?(arg.first)
        parameters[arg.first] = self.params[arg.first] || arg.last
      else
        parameters[arg.first] = self.params[arg.first]
      end
      parameters
    end
  end

  case conditions[:params]
  when Symbol
    parameters[conditions[:params]] = self.params[conditions[:params]]
  when Array
    conditions[:params].each do |param|
      parameters[param] = self.params[param]
    end
  end

  return parameters, conditions.except(:params, :store, :stores)
end

It passes by default all the parameters used by merb-action-args (accessed by self.class.action_argument_list[action_name]) and then uses the :params condition to determine which parameters to pass.
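
To see the selection rules in isolation, here is a simplified, framework-free sketch. It is not the merb-cache source: the method name and the shape of action_args (a list of [name, default] pairs) are assumptions made for the example.

```ruby
# Simplified sketch of the selection logic (not the merb-cache source).
# action_args:    [[name, default], ...], an assumed stand-in for merb-action-args data
# request_params: the request parameters
# conditions:     the hash given to `cache`
def parameters_for_store(action_args, request_params, conditions)
  parameters = {}

  # 1. Every action argument is included, falling back to its default value.
  action_args.each do |name, default|
    parameters[name] = request_params[name] || default
  end

  # 2. Anything listed in :params (a symbol or an array) is added on top.
  Array(conditions[:params]).each do |name|
    parameters[name] = request_params[name]
  end

  # 3. Options that only concern merb-cache itself are not forwarded to the store.
  store_conditions = conditions.reject { |key, _| [:params, :store, :stores].include?(key) }

  [parameters, store_conditions]
end

params, conds = parameters_for_store(
  [[:id, nil], [:page, 1]],
  { :id => "42", :locale => "en" },
  { :params => :locale, :expire_in => 60, :store => :page_store }
)
p params # id from the request, page from its default, locale because of :params
p conds  # only :expire_in is left for the store
```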

eager_cache

Eager caching is one solution to the dog pile effect (see Introducing Mint Store for a quick introduction to the dog pile effect). Eager cache basically caches some actions whenever the trigger action is completed. It uses Merb.run_later so as not to make the client wait. There are two eager_cache methods: an instance method and a class method.

The class method

eager_cache(trigger_action, target = trigger_action, conditions = {}, &blk)

After trigger_action has been run, the target action will be cached with the conditions sent to the store.

  • target can be of the form [controller, action] if no controller is given, the current controller is used

  • conditions accepts the following specific parameters

    • :uri the uri of the resource you want to eager cache (it’s needed by the page store but can be provided instead by a block)
    • :method the http method to use when requesting the resource to eager cache (defaults to :get)
    • :store which store to use for eager_caching
    • :params list of params to pass to the store when writing to it.

    Of course, conditions is also passed to the store, so any other supported conditions from the store can be used.

This method accepts a block that allows you to set up the request (more on this later).

The instance method

eager_cache(action, conditions = {}, params = request.params.dup, env = request.env.dup, &blk)

  • target can be of the form [controller, action] if no controller is given, the current controller is used

  • conditions accepts the following specific parameters

    • :uri the uri of the resource you want to eager cache (it’s needed by the page store but can be provided instead by a block)
    • :method the http method to use when requesting the resource to eager cache (defaults to :get)
  • params is just passed as a parameter to the block

  • env is used to create a new request if you do not pass a block (it’s passed to build_request)

This method accepts a block that allows you to set up the request (more on this later).

The heart of eager caching: the eager_dispatch class method

This is the function that is called in the run_later block

def eager_dispatch(action, params = {}, env = {}, blk = nil)
  kontroller = if blk.nil?
    new(build_request({}, env))
  else
    result = case blk.arity
      when 0  then  blk[]
      when 1  then  blk[params]
      else          blk[*[params, env]]
    end

    case result
    when NilClass         then new(build_request({}, env))
    when Hash, Mash       then new(build_request({}, result))
    when Merb::Request    then new(result)
    when Merb::Controller then result
    else raise ArgumentError, "Block to eager_cache must return nil, the env Hash, a Request object, or a Controller object"
    end
  end

  kontroller.force_cache!

  kontroller._dispatch(action)

  kontroller
end

So as you can see the block you give is used to create the controller instance. You can either return a hash that will be used as the env for the build_request, return a request or return a controller.

Once the kontroller instance is created, force_cache! is called to bypass the cache read in _cache_before, and the action is dispatched to the controller.
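
The arity handling is the part that decides what your block receives. Here is a stand-alone sketch of just that step, with no Merb required; call_setup_block is a name I made up for the example, and the hashes are placeholder data.

```ruby
# Stand-alone sketch of the arity handling in eager_dispatch (hypothetical helper).
# Calls blk with as many arguments as it declares: none, params, or params and env.
def call_setup_block(blk, params, env)
  case blk.arity
  when 0 then blk.call
  when 1 then blk.call(params)
  else        blk.call(params, env)
  end
end

params = { :id => 7 }
env    = { "REQUEST_METHOD" => "GET" }

no_args  = call_setup_block(proc { { :uri => "/articles" } }, params, env)
one_arg  = call_setup_block(proc { |p| { :uri => "/articles/#{p[:id]}" } }, params, env)
two_args = call_setup_block(proc { |p, e| e.merge(:uri => "/articles/#{p[:id]}") }, params, env)

p no_args
p one_arg
p two_args
```

This is why the class method examples below can use either a block without arguments or a block taking params.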

Examples of using eager_cache

Class method:

Eager caching show and index after creating:

  class Articles
    cache :show, :index
    eager_cache :update, :show do |params|
      self.build_request(build_url(:article, :id => params[:id]), :id => 1)
    end
    eager_cache :update, :index, :uri => '/articles'
    eager_cache :create, :index
    eager_cache(:create, [Timeline, :index]) {{ :uri => build_url(:timelines)}}
    eager_cache(:update, :show) do |params| 
      build_request(build_url(:article, :id => params[:id]), :id => params[:id])
    end

  end

Let’s analyse them one by one:

  eager_cache :update, :index, :uri => '/articles'

When the update action is completed, a get request to :index with ‘/articles’ uri will be cached (if you use the page store, this will be stored in ‘/articles.html’)

  eager_cache :create, :index

This does the same after the create action, but relies on the defaults: if no uri is given, the current uri is used, and the http method used is get. These defaults work well with a standard resource controller.

  eager_cache(:create, [Timeline, :index]) {{ :uri => build_url(:timelines)}}

This line is equivalent to eager_cache(:create, [Timeline, :index]) { build_request(build_url(:timelines))} but is a bit more readable in my opinion. It shows how you can use the url generation for specifying the uri.

  eager_cache(:update, :show) do |params| 
    build_request(build_url(:article, :id => params[:id]), :id => params[:id])
  end

This eager caches the show action for the updated object. It’s possible to do this, but I think using the instance method (see below) for this task is a cleaner approach.

Instance method:

If you want to cache the show action for a newly created article, you’ll need to use the instance method.

  def create(article)
    @article = Article.new(article)
    if @article.save
      eager_cache :show, :uri => url(:article, @article)
      redirect resource(@article), :message => {:notice => "Article was successfully created"}
    else
      #....
    end
  end

What I usually do is create a common private method (e.g. eager_cache_article) that I call from update and create.

fetch_partial

It caches the result of a partial.

def fetch_partial(template, opts={}, conditions = {})
   template_id = template.to_s
   if template_id =~ %r{^/}
     template_path = File.dirname(template_id) / "_#{File.basename(template_id)}"
   else
     kontroller = (m = template_id.match(/.*(?=\/)/)) ? m[0] : controller_name
     template_id = "_#{File.basename(template_id)}"
   end

   unused, template_key = _template_for(template_id, opts.delete(:format) || content_type, kontroller, template_path)
   template_key.gsub!(File.expand_path(Merb.root),'')

   fetch_proc = lambda { partial(template, opts) }
   params_for_cache = opts.delete(:params_for_cache) || opts.dup

   concat(Merb::Cache[_lookup_store(conditions)].fetch(template_key, params_for_cache, conditions, &fetch_proc), fetch_proc.binding)
 end

It is called the same as a partial would except that you can add a conditions hash at the end that will be passed to the cache store.

Caveat

If you call fetch_partial with parameters that are instances of a model, it will fail. Currently the available cache stores convert the parameters to strings using to_s, and @model.to_s by default will give you something like #<Article:0x255ca9c>.

So instead, you can either override to_s in your model (I don’t recommend it, it doesn’t express your intent clearly and will probably be hell to maintain later) or you can pass the :params_for_cache option which will be used by your cache store.
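
Here is a small illustration of the problem in plain Ruby. The Article class and the cache_key helper are stand-ins I made up; real stores build their keys differently, but the to_s behaviour is the same.

```ruby
# Minimal model stand-in; a real model class would come from your ORM.
class Article
  attr_reader :id

  def initialize(id)
    @id = id
  end
end

# Hypothetical key builder that, like the stores described above,
# stringifies every parameter with to_s.
def cache_key(template, params)
  template + "?" + params.map { |name, value| "#{name}=#{value}" }.sort.join("&")
end

first_load  = Article.new(1)
second_load = Article.new(1) # the same record, loaded again on the next request

# The default to_s embeds the object's address, so the two keys differ
# and the cached partial is never found again.
puts cache_key("_article", :article => first_load) == cache_key("_article", :article => second_load)

# Stable values (what :params_for_cache is meant for) give a repeatable key.
puts cache_key("_article", :article_id => first_load.id) == cache_key("_article", :article_id => second_load.id)
```

The first comparison prints false and the second prints true.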

Another possibility is to create a specific strategy store for your project that takes care of converting your models. It’s useful if you use a technique like the one explained by Tobias Lutke in http://blog.leetsoft.com/2007/5/22/the-secret-to-memcached, you can create a strategy store that is in charge of finding the current store version (and passing it along as a parameter) from the parameters passed to it.

After this article, you might want to check Merb-cache and its stores

Introducing nginx_accept_language_module

Posted by Guillaume Maury
on Dec 17, 08

What does it do?

This module parses the Accept-Language header and gives the most suitable locale for the user from a list of supported locales from your website.

Why did I create it?

I’m using page caching with merb on a multi-lingual website and I needed a way to serve the correct language page from the cache. For multi-lingual page caching, see the cache branch of the merb_global fork at <http://github.com/giom/merb_global>; I’ll post a write-up about this later.

Syntax:

set_from_accept_language $lang en ja pl;

  • $lang is the variable in which to store the locale
  • en ja pl are the locales supported by your website

If none of the locales from accept_language is available on your website, it sets the variable to the first locale of your website’s supported locales (in this case en)

Caveat

It currently assumes that the accept-language is sorted by quality values (from my tests it’s the case for safari, firefox, opera and ie) and discards q (see <http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html>). In the situation where I’m using the module, this assumption works… but buyer beware :-)
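
To make the behaviour concrete, here is a rough Ruby sketch of the selection as described above. It is not the module’s C code. It assumes that entries are tried in header order, that q-values are simply dropped, and that only the primary subtag is compared (so "en-US" counts as "en"); that last point is my assumption for the example.

```ruby
# Sketch of the described behaviour, not the module's implementation.
# header:    value of the Accept-Language request header
# supported: locales of the website, the first one being the fallback
def locale_from_accept_language(header, supported)
  header.to_s.split(",").each do |entry|
    tag     = entry.split(";").first.to_s.strip.downcase # drop ";q=..."
    primary = tag.split("-").first                        # "en-us" becomes "en" (assumption)
    return primary if supported.include?(primary)
  end
  supported.first # nothing matched: fall back to the first supported locale
end

puts locale_from_accept_language("ja,en-US;q=0.7,en;q=0.3", %w[en ja pl]) # ja
puts locale_from_accept_language("fr-FR,de;q=0.8", %w[en ja pl])          # en
```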

Example configuration

If you have different subdomains for each language

server {
    listen 80;
    server_name your_domain.com;
    set_from_accept_language $lang en ja zh;
    rewrite ^/(.*) http://$lang.your_domain.com redirect;
}

Or you could do something like this, redirecting people coming to ’/’ to /en (or /pt)

location / {
    set_from_accept_language $lang pt en;
     if ( $request_uri ~ ^/$ ) {
       rewrite ^/$ /$lang redirect;
       break;
     }
}

Where to get it?

It’s available on github at http://github.com/giom/nginx_accept_language_module

Installation

Download the module source from github: http://github.com/giom/nginx_accept_language_module or clone it with git clone git://github.com/giom/nginx_accept_language_module.git

Unpack, and then compile nginx with:

`./configure --add-module=path/to/nginx_accept_language_module`

Introducing Mint Store a strategic store for merb

Posted by Guillaume Maury
on Dec 15, 08

If you’re coming here for the first time, be sure to check two other articles I wrote on merb-cache

What is the dog pile effect?

Memcache is great when you want to cache some rather heavy queries. For example, one query I cache on a project looks like this:

	select answers.question_id, answers.question_choice_id, COUNT(*) AS count, question_choices.text from answers INNER JOIN question_choices ON question_choice_id = question_choices.id WHERE answers.`question_id` = 1 AND answers.`media_id` = 1 GROUP BY answers.question_choice_id 

It’s a rather long query when there are a lot of rows so it makes sense to cache it. Now, I don’t want to always serve the same data so I use the expiration feature of memcache (:expire_in in merb-cache).

But what happens when some huge traffic comes and your cache expires? If the data is quick to generate, it’s not a big problem, but if it uses a rather long query and takes a few seconds, then while the thread that got the first cache miss recalculates the data, other requests will also get a cache miss and recalculate the data. This added calculation might slow down the whole system (or your db server), leading you into a death spiral…

Mint Store

To solve this problem, Glenn Franxman released MintCache, a caching engine for Django (http://www.djangosnippets.org/snippets/155/). The Disqus team then released another implementation of MintCache (http://blog.disqus.net/2008/06/11/mintcache-simple-version/). Mint Store is a port of this work to merb-cache.

It solves the problem by storing some metadata alongside the cached data: the stale date and whether the cache is currently being refreshed. On a read request, it checks whether the stale date has passed; if it has (and nobody is refreshing the entry yet), it marks the entry as being refreshed and returns nil.

It’s probably easier to understand looking at the code (I need to get better at explaining this kind of stuff)

    def read(key, parameters = {})
      packed_data = store_read(key, parameters)
      return nil unless packed_data
      
      data, refresh_time, refreshed = *packed_data
      if !refreshed && (Time.now > refresh_time) 
        write(key, data, parameters, :expire_in => 0, :refreshed => true)
        return nil
      end
      data
    end

    def get_metadata_and_normalize!(conditions = {})
      expire_in = conditions.delete(:expire_in) || options[:expire_in]
      refreshed = conditions.delete(:refreshed)
      conditions[:expire_in] = expire_in + (conditions.delete(:mint_delay) || options[:mint_delay])
      [Time.now +  expire_in, refreshed]
    end
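
If you want to play with the idea outside of Merb, here is a tiny in-memory sketch. It is not the Mint Store code: the class name is made up, there is no memcached behind it, and it leaves out the real expiry (:mint_delay). It only shows who gets nil and who gets the stale data.

```ruby
# In-memory sketch of the stale-date idea (hypothetical class, not Mint Store itself).
class TinyMintCache
  Entry = Struct.new(:data, :stale_at, :refreshing)

  # clock is injectable so the example does not have to sleep.
  def initialize(clock = -> { Time.now })
    @entries = {}
    @clock = clock
  end

  def write(key, data, expire_in)
    @entries[key] = Entry.new(data, @clock.call + expire_in, false)
    data
  end

  # Returns nil once when the entry goes stale (that caller regenerates the data),
  # and the stale data to every other caller until write is called again.
  def read(key)
    entry = @entries[key]
    return nil unless entry

    if !entry.refreshing && @clock.call > entry.stale_at
      entry.refreshing = true
      return nil
    end
    entry.data
  end
end

now = Time.at(0)
cache = TinyMintCache.new(-> { now })
cache.write(:poll, "results", 300)

p cache.read(:poll) # still fresh: the data
now += 301
p cache.read(:poll) # first reader after the stale date: nil, so it regenerates
p cache.read(:poll) # everybody else meanwhile: the stale data
```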

Compared to Disqus’ MintCache, I added two features:

  • When you use fetch (and provide a block), if the cache is stale, Mint Store will return the stale cache and update the cache (with the result of executing the provided block) after the request has been served using Merb.run_later
  • Deletion will just mark the cache as stale which will cause the next fetch to repopulate the cache. (This can be disabled, see the initialization options)

Important Note: Read and Fetch

Read returns nil the first time the cache becomes stale and then returns the stale cache for :mint_delay seconds. So, unlike fetch, where none of the clients will be penalized, read penalizes one client, who will have to wait for the cache to be refreshed before their request is served. (fetch_fragment and fetch_partial from merb-cache both use fetch.)

Initialization Options

Mint Store accepts several initialization options:

  • Behaviour options:

    • :force_delete if set to true, delete will just delete the data from the cache
    • :need_expire_in if set to true, writable? will return false if the :expire_in condition is not present. If you are going to use MintStore with the AdHocStore it makes sense to set it.
  • Default values:

    • :mint_delay : the difference between the stale date (that you provide by :expire_in) and the real :expire_in given to memcached (default: 30s)
    • :refresh_delay : the :expire_in value given to memcached while regenerating the cache (can be set to 0 if you want memcache to never expire the stale cache while waiting for it to be refreshed)
    • :expire_in : default value for the stale date if not provided (default: 300s)

Example: setting the options

        register(:memcached_store, Merb::Cache::MemcachedStore)
        register(:mint_store, Merb::Cache::MintStore[:memcached_store], :need_expire_in => true, :refresh_delay => 0)

Where to get it

You can get it fresh from github at http://github.com/giom/mint_store or install it as a gem with: gem install giom-mint_store --source http://gems.github.com

Nginx configuration for merb with page caching (file store)

Posted by Guillaume Maury
on Dec 14, 08

This is the nginx configuration I use for this blog. One thing to note is that, contrary to the Rails convention of putting the cache in public, I like to segregate it in the public/page_cache directory.

upstream seshat {
  server 127.0.0.1:4444;
  server 127.0.0.1:4445;
}


# Redirecting all www.gom-jabbar.org traffic to gom-jabbar.org
server {
 listen       80;
 server_name  www.gom-jabbar.org;
 rewrite ^/(.*) http://gom-jabbar.org permanent;

}


# the server directive is nginx's virtual host directive.
 server {
   # port to listen on. Can also be set to an IP:PORT
   listen       80;

   # sets the domain[s] that this vhost server requests for
   server_name  gom-jabbar.org;

   # vhost specific access log
   access_log      /var/log/nginx/seshat.access_log main;
   error_log       /var/log/nginx/seshat.error_log info;


   #Set the max size for file uploads to 1Mb
   client_max_body_size  1M;

   root /sites/seshat_blog/public;
   set $cache_path $document_root/page_cache;
   
   index  index.html;
   
   # this rewrites all the requests to the maintenance.html
   # page if it exists in the doc root. This is for capistrano's
   # disable web task
   if (-f $document_root/system/maintenance.html) {
     rewrite  ^(.*)$  /system/maintenance.html last;
     break;
   }
   
   location / {

     proxy_set_header  X-Real-IP  $remote_addr;
     proxy_set_header  X-Forwarded-For $proxy_add_x_forwarded_for;
     proxy_set_header  Host $http_host;
     proxy_redirect    off;

     # The next three if blocks tell nginx to look for the cached
     # page and serve it if it exists
     if (-f $cache_path$uri.html) {
       rewrite (.*) /page_cache$1.html break;
     }

     if (-f $cache_path$uri) {
       rewrite (.*) /page_cache$1 break;
     }
     
     if (-f $cache_path${uri}index.html) {
       rewrite (.*) /page_cache$1/index.html break;
     }

     if (!-f $request_filename) {
       proxy_pass http://seshat;
       break;
     }
   }
   
   # Add expires header for static content
   # See Steve Souders - High Performance Web Sites rule #3
   # Normally when I add this, I also bundle all my css, javascript
   # with yuicompressor, and add the git commit hash
   # of those files to their name
   # I haven't done it yet with this blog though
   location ~* \.(js|css|jpg|jpeg|gif|png|svg)$ {
     if (-f $request_filename) {
       expires      max;
       break;
     }
   }

   error_page  500 502 503 504 /50x.html;
   location = /50x.html {
     root   html;
   }
 }
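
To make the lookup order of the cache checks in location / easier to follow, here is a small Ruby sketch that lists the files nginx tests for a given request. It is only an illustration: cache_candidates is a hypothetical helper, not part of the nginx or merb setup, and the default path simply mirrors the root and $cache_path set above.

```ruby
# Hypothetical helper: lists the files tested by the three `if (-f ...)`
# checks, in order, for a given request URI.
def cache_candidates(uri, cache_path = "/sites/seshat_blog/public/page_cache")
  [
    "#{cache_path}#{uri}.html",       # -f $cache_path$uri.html
    "#{cache_path}#{uri}",            # -f $cache_path$uri
    "#{cache_path}#{uri}index.html"   # -f $cache_path${uri}index.html
  ]
end

puts cache_candidates("/posts/1")
puts cache_candidates("/")
```

For "/", only the third candidate (page_cache/index.html) can be a regular file, which is why the index.html check is needed.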

My merb-cache setup looks like this:

  dependency 'merb-cache', merb_gems_version do 
    Merb::Cache.setup do
      register(:blog_fragment_store, Merb::Cache::FileStore, :dir => Merb.root / 'cache' / 'fragments')
      register(:blog_page_store, Merb::Cache::PageStore[Merb::Cache::FileStore], :dir => Merb.root / 'public' / 'page_cache')
      register(:default, Merb::Cache::AdhocStore[:blog_page_store, :blog_fragment_store])
    end
  end

Merb-cache and its stores

Posted by Guillaume Maury
on Dec 09, 08

I’ve been playing around quite a lot with merb-cache, so I thought I would write a bit about it to help others. I haven’t written many tutorials before, so I might err on the side of over-explaining things. Please tell me if you liked this article, and if you see any mistakes in my English, I’d be very glad to learn about them (3 years in Japan pretty much destroyed my English).

For a basic introduction to merb-cache, you should read:

You might also want to first read another article in the series: Merb-cache’s methods

In this first part, I’ll describe the different stores available, the situations in which they kick in, and important remarks about their usage. I’ll use a top-down approach and start with the strategy stores.

There is a quick summary/cheat sheet at the end…

The Strategy Stores

The AdhocStore

Used to select the most appropriate store.

Conditions for usage (writable?)

If you look at its writable? method:

    def writable?(key, parameters = {}, conditions = {})
      @stores.capture_first {|s| s.writable?(key, parameters, conditions)}
    end

It looks for the first store in the list whose writable? returns true.

Writing

Same thing, for the write method:

    def write(key, data = nil, parameters = {}, conditions = {})
      @stores.capture_first {|s| s.write(key, data, parameters, conditions)}
    end

Important note

Notice that the order is important: when setting up the Adhoc store, list the stores from the least likely to the most likely to be writable (meaning writable? returns true).

    register(:correct, Merb::Cache::AdhocStore[:page_store, :action_store, :fragment_store])
    register(:wrong,   Merb::Cache::AdhocStore[:action_store, :page_store, :fragment_store])

If you use the :wrong adhoc store, the page store will never be triggered, because the action store comes first and accepts any controller request.
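
Here is a rough sketch of why the order matters. It does not use merb at all: FakeStore, pick and the request hash are made-up stand-ins that only mimic the "first writable store wins" behaviour of the adhoc store.

```ruby
# Minimal stand-ins for the real stores: each one only knows whether it
# accepts a write. Names and the request hash are illustrative.
FakeStore = Struct.new(:name, :rule) do
  def writable?(request)
    rule.call(request)
  end
end

# Page store: GET requests without a query string.
page   = FakeStore.new(:page_store,   lambda { |r| r[:method] == :get && !r[:query] })
# Action store: any controller request.
action = FakeStore.new(:action_store, lambda { |r| true })

# Same idea as the adhoc store: the first writable store is used.
def pick(stores, request)
  stores.find { |s| s.writable?(request) }.name
end

plain_get = { :method => :get, :query => false }

puts pick([page, action], plain_get)   # page_store
puts pick([action, page], plain_get)   # action_store
```

With the action store listed first, even a plain GET request that the page store could handle ends up in the action store.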

The PageStore

Used to cache the result of a complete page so that nginx (or whatever web server you use) can serve it directly without hitting the merb application at all (much faster).

Conditions for usage (writable?)

Same here, let’s look at its writable? method:

    def writable?(dispatch, parameters = {}, conditions = {})
      if Merb::Controller === dispatch && dispatch.request.method == :get &&
          !dispatch.request.uri.nil? && !dispatch.request.uri.empty? &&
          !conditions.has_key?(:if) && !conditions.has_key?(:unless) &&
          query_string_present?(dispatch)
        @stores.any? {|s| s.writable?(normalize(dispatch), parameters, conditions)}
      else
        false
      end
    end

    def query_string_present?(dispatch)
      dispatch.request.env["REQUEST_URI"] == dispatch.request.uri
    end

It only accepts controllers that received a GET request with no query string parameters (e.g. ?q=z), and it doesn’t cache any action that has an :if or :unless condition (e.g. cache :show, :unless => :logged_in?). Note that, despite its name, query_string_present? returns true when the request has no query string.

Writing

    def normalize(dispatch)
      key = dispatch.request.uri.split('?').first
      key << "index" if key =~ /\/$/
      key << ".#{dispatch.content_type}" unless key =~ /\.\w{2,6}/
      key
    end

    def write(dispatch, data = nil, parameters = {}, conditions = {})
      if writable?(dispatch, parameters, conditions)
        @stores.capture_first {|s| s.write(normalize(dispatch), data || dispatch.body, {}, conditions)}
      end
    end

The normalize method generates a key that depends on the URI and the content_type. The write method doesn’t pass any parameters along (memcached_store and file_store would encode them as part of the key), so the page cache key doesn’t depend on anything that isn’t in the URI. This is consistent with writable? only returning true when there are no query string parameters.
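
To see what keys this produces, here is a stand-alone version of the same logic. page_key is a hypothetical name, and it takes the URI and content type directly instead of a controller, so that it runs without merb.

```ruby
# Stand-alone copy of the normalize logic; takes the URI and content type
# directly instead of a controller (an assumption made for this example).
def page_key(uri, content_type)
  key = uri.split('?').first
  key << "index" if key =~ /\/$/
  key << ".#{content_type}" unless key =~ /\.\w{2,6}/
  key
end

puts page_key("/", :html)          # /index.html
puts page_key("/posts/1", :html)   # /posts/1.html
puts page_key("/posts/", :html)    # /posts/index.html
puts page_key("/feed.xml", :xml)   # /feed.xml
```

These keys are the paths (relative to the page cache directory) that the web server has to look for.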

Important note

PageStore doesn’t support reading as it’s intended for use with a server like nginx (I’ll give some examples of nginx configuration in a later blog post). So read always returns nil.

  def read(dispatch, parameters = {})
    nil
  end

The ActionStore

Used to cache actions that have parameters or conditions for caching. Because the request will have to hit the merb application (and go through routing), it’s much slower than page store caching.

Conditions for usage (writable?)

    def writable?(dispatch, parameters = {}, conditions = {})
      case dispatch
      when Merb::Controller
        @stores.any?{|s| s.writable?(normalize(dispatch), parameters, conditions)}
      else false
      end
    end

As long as dispatch is a controller (meaning that the store is used by the controller cache method, e.g. cache :show) and it has a substore for which store.writable?(normalize(dispatch), parameters, conditions) returns true (this part is common to all strategy stores), it will be writable.

Writing

    def write(dispatch, data = nil, parameters = {}, conditions = {})
      if writable?(dispatch, parameters, conditions)
        @stores.capture_first {|s| s.write(normalize(dispatch), data || dispatch.body, parameters, conditions)}
      end
    end

    def normalize(dispatch)
      "#{dispatch.class.name}##{dispatch.action_name}" unless dispatch.class.name.blank? || dispatch.action_name.blank?
    end

The key only depends on the controller and action, but the action store also passes along its parameters and conditions, meaning that they too will be encoded in the key down the road.

Important note

The action store doesn’t take the content_type into account when caching, so if you use it to cache an action that provides more than one content_type, you will be in for a nasty surprise (more on that in another blog post).
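
A rough illustration of that surprise, with a plain Hash standing in for the underlying store. The controller name, action name and responses are made up; only the "Controller#action" key format comes from the normalize method above, and both requests are assumed to have the same parameters.

```ruby
# Illustration only: a Hash stands in for the underlying store.
cache = {}

# Same key format as the action store's normalize method.
def action_key(controller_name, action_name)
  "#{controller_name}##{action_name}"
end

# First request asks for HTML and fills the cache.
cache[action_key("Posts", "show")] ||= "<h1>Hello</h1>"

# Second request asks for JSON, but the key is the same,
# so the cached HTML comes back instead of JSON.
json_response = (cache[action_key("Posts", "show")] ||= '{"title":"Hello"}')

puts json_response   # <h1>Hello</h1>
```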

The GzipStore

Compresses the content of the cache with gzip. Useful with an nginx module like MemcachedGzip that serves the gzip content directly from memcached (and decompresses it on the fly for clients that do not support gzip content).

writable? always returns true; when writing, it keeps the key, parameters and conditions and just compresses the data.

The Sha1Store

It uses sha1 to digest the keys.

    def write(key, data = nil, parameters = {}, conditions = {})
      if writable?(key, parameters, conditions)
        @stores.capture_first {|c| c.write(digest(key, parameters), data, {}, conditions)}
      end
    end

    def digest(key, parameters = {})
      @map[[key, parameters]] ||= Digest::SHA1.hexdigest("#{key}#{parameters.to_sha2}")
    end

One thing to note is that it memoizes the result of the sha1 digest for the whole time your application is running.
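
A small sketch of what that memoization means in practice. DigestSketch is a made-up class, and parameters.inspect replaces merb’s to_sha2 helper (an assumption, so that the example runs without merb).

```ruby
require 'digest/sha1'

# Sketch of the memoized digest. `parameters.inspect` stands in for
# merb's `to_sha2` helper (an assumption made for this example).
class DigestSketch
  attr_reader :map

  def initialize
    @map = {}
  end

  def digest(key, parameters = {})
    @map[[key, parameters]] ||= Digest::SHA1.hexdigest("#{key}#{parameters.inspect}")
  end
end

d = DigestSketch.new
d.digest("Posts#show", :id => 1)
d.digest("Posts#show", :id => 1)   # served from the map, no new entry
d.digest("Posts#show", :id => 2)   # new entry

puts d.map.size   # 2
```

The map gets one entry per distinct key and parameters pair, and nothing ever removes them while the application is running.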

Fundamental Stores

FileStore

Used to store the cache on the file system.

Conditions for usage (writable?)

    # File caching does not support expiration time.
    def writable?(key, parameters = {}, conditions = {})
      case key
      when String, Numeric, Symbol
        !conditions.has_key?(:expire_in)
      else nil
      end
    end

Works as long as the key is a string, numeric or symbol and there is no :expire_in condition (memcached is better for that). So:

  • Doesn’t work when called directly from the controller cache class method (it needs a strategy store like ActionStore or PageStore to generate the key for it and give it the data)
  • Works with fetch_partial
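
To try the rule by hand, here is the same writable? logic as a plain method. file_store_writable? is a hypothetical name, and the keys are just examples.

```ruby
# Copy of the writable? logic above as a plain method.
def file_store_writable?(key, conditions = {})
  case key
  when String, Numeric, Symbol
    !conditions.has_key?(:expire_in)
  else nil
  end
end

p file_store_writable?("posts/_sidebar")                     # true
p file_store_writable?("posts/_sidebar", :expire_in => 60)   # false
p file_store_writable?(Object.new)                           # nil (e.g. a controller)
```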

MemcacheStore

Used to store the cache in memcached.

Conditions for usage (writable?)

The memcached store considers all keys and parameters writable, so writable? is always true.

Writing

    # Returns cache key calculated from base key
    # and SHA2 hex from parameters.
    def normalize(key, parameters = {})
      parameters.empty? ? "#{key}" : "#{key}--#{parameters.to_sha2}"
    end

    # Writes data to the cache using key, parameters and conditions.
    def write(key, data = nil, parameters = {}, conditions = {})
      if writable?(key, parameters, conditions)
        begin
          @memcached.set(normalize(key, parameters), data, expire_time(conditions))
          true
        rescue
          nil
        end
      end
    end

The key is normalized by appending the SHA2 digest of the parameters. When writing, the expiration time is set to Time.now + :expire_in, or 0 when there is no :expire_in condition.
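
A minimal sketch of both steps, without memcached or merb. memcache_key is a made-up name, Digest::SHA256 over parameters.inspect stands in for merb’s to_sha2 helper, and expire_time is written from the description above rather than copied from the library.

```ruby
require 'digest'

# Key: the base key alone, or the base key plus a digest of the parameters.
# Digest::SHA256 over `inspect` stands in for merb's `to_sha2` (assumption).
def memcache_key(key, parameters = {})
  parameters.empty? ? key.to_s : "#{key}--#{Digest::SHA256.hexdigest(parameters.inspect)}"
end

# Expiration: Time.now + :expire_in, or 0 when no :expire_in is given.
def expire_time(conditions = {})
  conditions[:expire_in] ? Time.now + conditions[:expire_in] : 0
end

puts memcache_key("Posts#show")
puts memcache_key("Posts#show", :id => 1)
puts expire_time.inspect
puts expire_time(:expire_in => 60).inspect
```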

Summary

Cache Methods

| Name | Context | Usage | Usable stores | Key type |
|------|---------|-------|---------------|----------|
| cache | controller | caches the result of the action | PageStore, ActionStore | controller |
| eager_cache | controller | recaches the result of another action after the current action finishes, using run_later (great against dog piling) | PageStore, ActionStore | controller |
| fetch_partial | view | fetches or caches the result of a partial | GzipStore, Sha1Store, FileStore, MemcacheStore | string (template_location returned by _template_for) |
| fetch_fragment | controller or view | fetches or caches the result of a proc | GzipStore, Sha1Store, FileStore, MemcacheStore | string: either :cache_key or the file and line where the proc it’s called with is declared |

Strategy Stores

| Store | Key type | writable? condition | Usage | Remarks |
|-------|----------|---------------------|-------|---------|
| AdhocStore | any | looks for its first writable? store | selects the most appropriate store | order is important (should be from least likely to most likely writable) |
| PageStore | controller | GET request with no query string parameters (e.g. ?q=z) and no :if or :unless condition | caches the page for the web server | no support for reading (it’s the work of the web server) |
| ActionStore | controller | only checks its stores | caches actions that have parameters or conditions for caching | doesn’t encode content_type in its key |
| GzipStore | any | always true | compresses the cache with gzip | useful with MemcachedGzip |
| Sha1Store | string, numeric or symbol | only checks its stores | digests key + params with sha1 | |

Fundamental stores

| Store | Key type | writable? condition | Usage |
|-------|----------|---------------------|-------|
| FileStore | string, numeric or symbol | no :expire_in condition | stores the cache on the file system |
| MemcacheStore | any | always true | stores the cache in memcached with expiration (:expire_in condition) |

Merb-auth quick tip

Posted by Guillaume Maury
on Dec 08, 08

How to keep things in the session object even after logging in

If you look at the source code of the merb-auth-slice-password sessions controller, you have the following before filters:

  before :_maintain_auth_session_before, :exclude => [:destroy]  # Need to hang onto the redirection during the session.abandon!
  before :_abandon_session,              :only => [:update, :destroy]
  before :_maintain_auth_session_after,  :exclude => [:destroy]  # Need to hang onto the redirection during the session.abandon!

and the associated code:

  # @private
  def _maintain_auth_session_before
    @_maintain_auth_session = {}
    Merb::Authentication.maintain_session_keys.each do |k|
      @_maintain_auth_session[k] = session[k]
    end
  end
  
  # @private
  def _maintain_auth_session_after
    @_maintain_auth_session.each do |k,v|
      session[k] = v
    end
  end
  
  # @private
  def _abandon_session
    session.abandon!
  end

So if you want to keep some elements of the session across login, you just need to add their keys to Merb::Authentication.maintain_session_keys

Like this: (I put it in merb/merb-auth/setup.rb)

	Merb::Authentication.maintain_session_keys << :lang << :guest
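
To see the effect of the three filters, here is a plain-Ruby walk-through. Nothing here touches merb: session is an ordinary Hash, clear stands in for session.abandon!, and the session contents are made up.

```ruby
# Plain-Ruby walk-through of the three before filters.
# `session` is an ordinary Hash and `clear` stands in for session.abandon!.
maintain_session_keys = [:lang, :guest]
session = { :lang => "fr", :guest => "abc", :cart => [1, 2] }

# _maintain_auth_session_before: remember the values we want to keep
kept = {}
maintain_session_keys.each { |k| kept[k] = session[k] }

# _abandon_session: everything is dropped
session.clear

# _maintain_auth_session_after: put the kept values back
kept.each { |k, v| session[k] = v }

puts session.keys.inspect   # [:lang, :guest]
```

Only the keys listed in maintain_session_keys survive the login; :cart is gone.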

Please use bcrypt to store your passwords

Posted by Guillaume Maury
on Dec 02, 08

Update: If you want to use bcrypt with merb-auth, I have made a patch available at http://merb.lighthouseapp.com/projects/7433/tickets/1113-bcrypt-mixin-for-merb-auth-more

Currently most people on merb store passwords as sha1 hashes salted with a nonce. It’s a good practice: it protects you from the reddit embarrassment, where a database full of plaintext passwords got leaked, because apart from brute-forcing there is no way to reverse the hash to find the password. And by salting your hash with a nonce, you protect yourself against pre-generated rainbow tables. So all is good, right?

Well, mostly, but why use md5 or sha1? They are made to be fast… Cryptographic hash functions are fast by design; as the building blocks of all cryptographic systems, they need to be. And of course, since they are used repeatedly, it makes sense to optimize hardware for them… So it’s possible to use something like nsa@home (http://nsa.unaligned.org/), built from FPGAs, which can search the full 8-character keyspace (from a 64-character set) in about a day for 800 hashes concurrently.

To attack a password scheme, you would normally use an incremental cracker (a rainbow table cracker is not much use if the passwords are salted with a nonce), and those crackers rely on the speed of the hashing operation: the faster it is to hash a password, the more combinations an attacker can try and the weaker your password scheme is.

So what can you use instead? You can use BCrypt, an adaptive hashing scheme.

Why?

  • It’s made to be computationally expensive by using the Blowfish block cipher, which is slow to set up.
  • You can configure the cost of the hashing function to make it slower

Currently the cost is set to 10 by default, but you can change it and make it higher as hardware speeds go up… All the passwords using the old cost will continue to work (bcrypt-ruby stores the cost along with the password) and new passwords will have a higher cost.

A quick benchmark:

require 'benchmark'
require 'digest/md5'
require 'digest/sha1'
require 'bcrypt'

# Each report hashes the same password 500 times.
Benchmark.bm(20) do |x|
  x.report("BCrypt (cost = 12):") { 500.times { BCrypt::Password.create("mypass", :cost => 12) } }
  x.report("BCrypt (cost = 10):") { 500.times { BCrypt::Password.create("mypass", :cost => 10) } }
  x.report("BCrypt (cost = 5):") { 500.times { BCrypt::Password.create("mypass", :cost => 5) } }
  x.report("md5:")  { 500.times { salt = Digest::MD5.hexdigest("--#{Time.now.to_s}--username--"); Digest::MD5.hexdigest("mypass-#{salt}") } }
  x.report("Sha1:") { 500.times { salt = Digest::SHA1.hexdigest("--#{Time.now.to_s}--username--"); Digest::SHA1.hexdigest("mypass-#{salt}") } }
end

#                           user     system      total        real
# BCrypt (cost = 12): 250.140000   5.130000 255.270000 (276.009547)
# BCrypt (cost = 10):  62.680000   1.230000  63.910000 ( 69.320089)
# BCrypt (cost = 5):    1.090000   0.020000   1.110000 (  1.185847)
# md5:                  0.010000   0.000000   0.010000 (  0.008147)
# Sha1:                 0.000000   0.000000   0.000000 (  0.008906)

The point is that you don’t care about the time needed to hash a single password in your system, since it’s not a bottleneck (0.12 seconds on my Mac), but an attacker will…
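
To put rough numbers on it, here is a back-of-the-envelope estimate based on the "real" column of the benchmark above. It assumes an attacker with a single machine as slow as mine, and an 8-character password drawn from a 64-character set. The md5 and sha1 figures include the salt computation, so treat everything as an order of magnitude only.

```ruby
# Back-of-the-envelope estimate from the "real" column of the benchmark.
# Assumes one machine as slow as the benchmarked one and an 8-character
# password from a 64-character set.
keyspace = 64**8

sha1_rate   = 500 / 0.008906    # hashes per second
bcrypt_rate = 500 / 69.320089   # hashes per second at cost 10

seconds_per_year = 365 * 24 * 3600

sha1_years   = keyspace / sha1_rate / seconds_per_year
bcrypt_years = keyspace / bcrypt_rate / seconds_per_year

puts "SHA1:   %.0f years to try every password" % sha1_years
puts "BCrypt: %.0f years to try every password" % bcrypt_years
puts "BCrypt (cost 10) is about %.0f times slower to attack" % (sha1_rate / bcrypt_rate)
```

Dedicated hardware changes the absolute numbers a lot, but the ratio between the two schemes is what matters.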
