Transform online stores into APIs!




A Ruby gem that simplifies declaring APIs on online stores through scraping.


Once upon a time, I wanted to create online groceries with a great user experience! That’s how I started. Unfortunately, most online groceries don’t have APIs, so I resorted to scraping. Scraping comes with its own (long) list of problems:

  • Scraping code is a mess
  • The scraped HTML can change at any time
  • Scrapers are difficult to test

Refactoring after refactoring, I extracted this library, which defines scrapers for any online store in a straightforward way (see auchandirect-scrAPI for my real-world usage). A scraper definition consists of:

  • a scraper definition file
  • the selectors for the links
  • the selectors for the content you want to capture

As a result of using storexplore for mes-courses, the scraping code was split between the storexplore gem and my store-specific scraper definition:

  • The overall code became cleaner
  • I could write simple and reliable tests
  • Most importantly, I could easily keep pace with changes in the online store’s HTML


How to define an API in 10 minutes using storexplore


Add this line to your application’s Gemfile:

gem 'storexplore'

And then execute:

$ bundle

Or install it yourself as:

$ gem install storexplore

To enumerate all the items of a store in constant memory, Storexplore relies on lazy enumerators, which require Ruby 2.0 or later.
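To illustrate why lazy enumerators matter here, the following minimal sketch uses plain Ruby only (no Storexplore calls). The `fake_items` enumerator and the `produced` counter are hypothetical stand-ins for a store's item stream; the point is that only the items actually requested are ever generated.

```ruby
# Hypothetical stand-in for a store's items: an endless stream generated on
# demand. Nothing here is part of the Storexplore API.
produced = 0
fake_items = Enumerator.new do |yielder|
  loop do
    produced += 1
    yielder << { name: "item #{produced}", price: produced * 1.5 }
  end
end

# With .lazy, select and map are chained without building intermediate
# arrays; first(3) stops pulling from the stream once it has three results.
cheap_names = fake_items.lazy
                        .select { |item| item[:price] < 10 }
                        .map { |item| item[:name] }
                        .first(3)

puts cheap_names.inspect
puts "items produced: #{produced}"
```

Without `.lazy`, the `select` call would try to consume the whole endless stream before returning, which is the memory problem lazy enumeration avoids.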


Online stores are typically organized as hierarchies. For example, Ikea (US) is organized as follows:

|-> Living room
|   |-> Sofas & armchairs
|   |   |-> Fabric Sofas
|   |   |   |-> Norsborg Sofa
|   |   |   |-> Norsborg Loveseat
|   |   |   |-> ...
|   |   |   |-> Poäng Footstool cushion
|   |   |-> Leather Sofas
|   |   |-> ...
|   |   |-> Armchairs
|   |-> TV & media furniture
|   |-> ...
|   |-> Living room textiles & rugs
|-> Bedroom
|-> ...
|-> Dining

Storexplore builds hierarchical APIs that follow this pattern:

|-> Category 1
|   |-> Sub Category 1
|   |   |-> Item 1
|   |   |-> ...
|   |   |-> Item n
|   |-> Sub Category 2
|   |-> ...
|   |-> Sub Category n
|-> Category 2
|-> ...
|-> Category n

The store is like a root category. Any level of depth is allowed. Any category, at any depth level, can have both child categories and items. Items cannot have children of any kind. Both categories and items can have attributes.

All searching of children and attributes is done through Mechanize/Nokogiri selectors (CSS or XPath).

Here is a sample store API declaration, again for Ikea:

Storexplore::Api.define '' do

  # First level of categories, reached through the links matching the selector
  categories '.departmentLinkBlock a' do
    attributes do
      { :name => page.get_one("#breadCrumbNew .activeLink a").content.strip }
    end

    # Second level of categories, nested inside the first
    categories '.departmentLinks a' do
      attributes do
        { :name => page.get_one("#breadCrumbNew .activeLink a").content.strip }
      end

      # Third level of categories, which holds the items
      categories 'a.categoryName' do
        attributes do
          { :name => page.get_one("#breadCrumbNew .activeLink a").content.strip }
        end

        # Items are leaves: they only declare attributes
        items '.productDetails > a' do
          attributes do
            {
              :name => page.get_one('#name').content.strip,
              :type => page.get_one('#type').content.strip,
              :price => page.get_one('#price1').content.strip.sub('$','').to_f,
              :salesArgs => page.get_one('#salesArg').content.strip,
              :image => page.get_one('#productImg').attributes['src'].content,
              :ikea_id => page.uri.to_s.match("^.*\/([0-9]+)\/?$").captures.first
            }
          end
        end
      end
    end
  end
end

This defines a hierarchical API on the IKEA store. It will be used to browse any store whose URI contains the string passed to Storexplore::Api.define.
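The attribute blocks above turn raw page text into typed values: a price string becomes a float, and the trailing digits of the page URI become an identifier. The sketch below isolates these two conversions in plain Ruby so they can be checked without any network access. The helper names `parse_price` and `extract_trailing_id` and the `store.example` URIs are hypothetical and not part of Storexplore.

```ruby
# Hypothetical helpers mirroring the conversions used in the attribute blocks.

# Turns a displayed price such as " $199.00 " into a Float.
def parse_price(text)
  text.strip.sub('$', '').to_f
end

# Returns the trailing numeric path segment of a URI as a String,
# or nil when the URI does not end with digits.
def extract_trailing_id(uri)
  match = uri.to_s.match(%r{\A.*/([0-9]+)/?\z})
  match && match.captures.first
end

puts parse_price(' $199.00 ')
puts extract_trailing_id('http://store.example/products/12345/').inspect
puts extract_trailing_id('http://store.example/about').inspect
```

Two details are worth keeping in mind. Calling `captures` directly on the result of `match` raises `NoMethodError` when the URI has no trailing digits, which the `nil` guard here avoids. `String#to_f` also stops at the first character it cannot parse, so a price displayed as "1,299.00" would be read as 1.0 unless thousands separators are removed first.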

Now here is how this API can be accessed to pretty print all its content:

Storexplore::Api.browse('').categories.each do |category|

  puts "category: #{category.title.strip}"
  puts "attributes: #{category.attributes}"

  category.categories.each do |sub_category|

    puts "  category: #{sub_category.title.strip}"
    puts "  attributes: #{sub_category.attributes}"

    sub_category.categories.each do |sub_sub_category|

      puts "    category: #{sub_sub_category.title.strip}"
      puts "    attributes: #{sub_sub_category.attributes}"

      sub_sub_category.items.each do |item|

        puts "      item: #{item.title.strip}"
        puts "      attributes: #{item.attributes}"
      end
    end
  end
end

(This sample can be found in samples/ikea.rb)
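The nested loops above are tied to a store with exactly three category levels. Since any depth is allowed, a recursive walker may be more convenient. The following is a minimal sketch under stated assumptions: `Node` is a hypothetical in-memory stand-in that only mimics the `title`, `attributes`, `categories` and `items` methods used in the sample, so the code runs without scraping anything.

```ruby
# Hypothetical in-memory nodes standing in for what Storexplore::Api.browse
# returns; they only mimic the title/attributes/categories/items methods
# used in the sample above.
Node = Struct.new(:title, :attributes, :categories, :items)

# Prints a category subtree at any depth, indenting two spaces per level.
def print_category(category, depth = 0, out = $stdout)
  indent = '  ' * depth
  out.puts "#{indent}category: #{category.title.strip}"
  out.puts "#{indent}attributes: #{category.attributes}"
  category.categories.each { |child| print_category(child, depth + 1, out) }
  category.items.each do |item|
    out.puts "#{indent}  item: #{item.title.strip}"
    out.puts "#{indent}  attributes: #{item.attributes}"
  end
end

sofa  = Node.new('Sample sofa ', { price: 399.0 }, [], [])
sofas = Node.new('Sofas', { name: 'Sofas' }, [], [sofa])
room  = Node.new(' Living room', { name: 'Living room' }, [sofas], [])

print_category(room)
```

With a real store, the same function could presumably be called on each element of the `categories` collection returned by `browse`, provided those objects respond to the four methods listed above.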


NOTE: please keep in mind that these testing utilities were extracted from my first real use case (auchandirect-scrAPI) and might still rely on assumptions coming from there. Any help cleaning this up is welcome.

Testing Code Relying On A Scraped Third Party

This can be quite a challenge. Storexplore can help you with that:

  • It provides a customizable offline (disk) dummy store generator
  • It provides an API for this store
  • As long as your dummy store provides the same attributes as the real store, you can use it in your tests

Dummy stores can be generated to the file system using the Storexplore::Testing::DummyStore and Storexplore::Testing::DummyStoreGenerator classes.

To use them, add the following to your spec_helper.rb, for example:

require 'storexplore/testing'

Storexplore::Testing.config do |config|
  config.dummy_store_generation_dir = File.join(Rails.root, '../tmp')
end

It is then possible to generate a store with the following:

@store_generator =

You can add custom elements with explicit values:

@store_generator.
  category(cat_name = "extra long category name").
  category(sub_cat_name = "extra long sub category name").
  item(item_name = "super extra long item name").generate().
    attributes(price: 12.3)

Storexplore provides an API definition for dummy stores in ‘storexplore/testing/dummy_store_api’. It can be required independently if needed.

Testing Your Own Scraper

Storexplore also ships with an RSpec shared examples macro. It checks that a scraper behaves sensibly in basic ways, such as returning many categories and providing item names and prices:

require 'storexplore/testing'

describe "MyStoreApi" do
  include Storexplore::Testing::ApiSpecMacros
end
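
To give an idea of the kind of checks described above, here is a minimal sketch in plain Ruby. It is not the gem's shared examples: the `scraper_problems` function, its `min_categories` parameter and the hash layout of the sample data are all assumptions made for illustration.

```ruby
# Hypothetical sanity checks on scraped data, written in plain Ruby.
# They are not the gem's shared examples, only a sketch of similar checks.
def scraper_problems(categories, min_categories: 2)
  problems = []
  if categories.size < min_categories
    problems << "expected at least #{min_categories} categories"
  end
  categories.each do |category|
    category.fetch(:items, []).each do |item|
      name = item[:name].to_s.strip
      problems << "item without a name in #{category[:name]}" if name.empty?
      price = item[:price]
      unless price.is_a?(Numeric) && price > 0
        problems << "item #{name.inspect} has no positive price"
      end
    end
  end
  problems
end

sample = [
  { name: 'Fruit', items: [{ name: 'Apple', price: 0.5 }] },
  { name: 'Dairy', items: [{ name: ' ', price: nil }] }
]
puts scraper_problems(sample)
```

An empty result means the scraped data passed every check; each returned string describes one problem found.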




Summary Of Testing Files To Require

  • To get only the API definition for a previously generated dummy store, require ‘storexplore/testing/dummy_store_api’
  • To generate and scrape dummy stores, require ‘storexplore/testing/dummy_store_generator’
  • To do all of the above and use the RSpec utilities, require ‘storexplore/testing’


  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

