Why DataMapper?

DataMapper differentiates itself from other Ruby Object/Relational Mappers in a number of ways:

One API for a variety of datastores

DataMapper comes with the ability to use the same API to talk to a multitude of different datastores. There are adapters for the usual RDBMS suspects, NoSQL stores, various file formats and even some popular webservices.

There is a (probably incomplete) list of available DataMapper adapters on the GitHub wiki, and new ones are implemented regularly. A quick GitHub search should give you further hints on what is currently available.

Plays Well With Others

With DataMapper you define your mappings in your model. Your data-store can develop independently of your models using Migrations.

To support data-stores that you can't manage yourself, it's simply a matter of telling DataMapper where to look. This makes DataMapper a good choice when working with legacy databases.

class Post
  include DataMapper::Resource

  # set the storage name for the :legacy repository
  storage_names[:legacy] = 'tblPost'

  # use the datastore's 'pid' field for the id property.
  property :id, Serial, :field => :pid

  # use a property called 'uid' as the child key (the foreign key)
  belongs_to :user, :child_key => [ :uid ]
end

DataMapper only issues updates or creates for the properties it knows about. So it plays well with others. You can use it in an Integration Database without worrying that your application will be a bad actor causing trouble for all of your other processes.
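
The idea can be sketched in plain Ruby. This is an illustration only, not DataMapper's implementation: the build_update helper, its parameters and the audit_flag column are hypothetical. The statement is built from the mapped properties alone, so columns owned by other applications are never written.

```ruby
# Hypothetical helper (not part of DataMapper): build an UPDATE statement
# that touches only the columns the model maps.
def build_update(table, key_column, mapped_columns, changes, key)
  # Drop every attribute the model does not know about.
  known = changes.select { |column, _value| mapped_columns.include?(column) }
  assignments = known.keys.map { |column| "#{column} = ?" }.join(', ')
  ["UPDATE #{table} SET #{assignments} WHERE #{key_column} = ?", known.values + [key]]
end

# :audit_flag stands for a column that some other application owns.
sql, binds = build_update('tblPost', :pid, [:title, :body],
                          { :title => 'Hello', :audit_flag => 1 }, 42)
puts sql
p binds
```

Only title appears in the generated statement; the unmapped audit_flag column is left alone.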

DataMapper has full support for Composite Primary Keys (CPK) built in. Specifying the properties that form the primary key is easy.

class LineItem
  include DataMapper::Resource

  property :order_id,    Integer, :key => true
  property :item_number, Integer, :key => true
end

If we know an order_id/item_number combination, we can easily retrieve the corresponding line item from the datastore.

order_id, item_number = 1, 1
LineItem.get(order_id, item_number)
# => #<LineItem @order_id=1 @item_number=1>

Less need for writing migrations

With DataMapper, you specify the datastore layout inside your ruby models. This allows DataMapper to create the underlying datastore schema based on the models you defined. The #auto_migrate! and #auto_upgrade! methods can be used to generate a schema in the datastore that matches your model definitions.

While #auto_migrate! destructively drops and recreates tables to match your model definitions, #auto_upgrade! supports upgrading your datastore to match your model definitions, without actually destroying any already existing data.

There are still some limitations to the operations that #auto_upgrade! can perform. We're working hard on making it smarter, but there will always be scenarios where an automatic upgrade of your schema won't be possible. For example, there's no sane strategy for automatically changing a column length constraint from VARCHAR(100) to VARCHAR(50). DataMapper can't know what it should do when the data doesn't validate against the new tightened constraints.

In situations where neither #auto_migrate! nor #auto_upgrade! quite cut it, you can still fall back to the classic migrations feature provided by dm-migrations.

Here's some code that puts #auto_migrate! and #auto_upgrade! to use.

require 'rubygems'
require 'dm-core'
require 'dm-migrations'

DataMapper::Logger.new($stdout, :debug)
DataMapper.setup(:default, 'mysql://localhost/test')

class Person
  include DataMapper::Resource
  property :id,   Serial
  property :name, String, :required => true
end

DataMapper.auto_migrate!

# ~ (0.015754) SET sql_auto_is_null = 0

# ~ (0.283290) DROP TABLE IF EXISTS `people`
# ~ (0.029274) SHOW TABLES LIKE 'people'
# ~ (0.000103) SET sql_auto_is_null = 0

# ~ (0.000932) SHOW VARIABLES LIKE 'character_set_connection'
# ~ (0.000393) SHOW VARIABLES LIKE 'collation_connection'
# ~ (0.080191) CREATE TABLE `people` (`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, `name` VARCHAR(50) NOT NULL, PRIMARY KEY(`id`)) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_general_ci
# => #<DataMapper::DescendantSet:0x101379a68 @descendants=[Person]>

class Person
  property :hobby, String
end

DataMapper.auto_upgrade!

# ~ (0.000612) SHOW TABLES LIKE 'people'
# ~ (0.000079) SET sql_auto_is_null = 0

# ~ (1.794475) SHOW COLUMNS FROM `people` LIKE 'id'
# ~ (0.001412) SHOW COLUMNS FROM `people` LIKE 'name'
# ~ (0.001121) SHOW COLUMNS FROM `people` LIKE 'hobby'
# ~ (0.153989) ALTER TABLE `people` ADD COLUMN `hobby` VARCHAR(50)
# => #<DataMapper::DescendantSet:0x101379a68 @descendants=[Person]>

Data integrity is important

DataMapper makes it easy to leverage native techniques for enforcing data integrity. The dm-constraints plugin provides support for establishing true foreign key constraints in databases that support that concept.

require 'rubygems'
require 'dm-core'
require 'dm-constraints'
require 'dm-migrations'

DataMapper::Logger.new($stdout, :debug)
DataMapper.setup(:default, 'mysql://localhost/test')

class Person
  include DataMapper::Resource
  property :id, Serial
  has n, :tasks, :constraint => :destroy
end

class Task
  include DataMapper::Resource
  property :id, Serial
  belongs_to :person
end

DataMapper.auto_migrate!

# ~ (0.000131) SET sql_auto_is_null = 0

# ~ (0.017995) SHOW TABLES LIKE 'people'
# ~ (0.000278) SHOW TABLES LIKE 'tasks'
# ~ (0.001435) DROP TABLE IF EXISTS `people`
# ~ (0.000226) SHOW TABLES LIKE 'people'
# ~ (0.000093) SET sql_auto_is_null = 0

# ~ (0.000334) SHOW VARIABLES LIKE 'character_set_connection'
# ~ (0.000278) SHOW VARIABLES LIKE 'collation_connection'
# ~ (0.187402) CREATE TABLE `people` (`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, PRIMARY KEY(`id`)) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_general_ci
# ~ (0.000309) DROP TABLE IF EXISTS `tasks`
# ~ (0.000313) SHOW TABLES LIKE 'tasks'
# ~ (0.200487) CREATE TABLE `tasks` (`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, `person_id` INT(10) UNSIGNED NOT NULL, PRIMARY KEY(`id`)) ENGINE = InnoDB CHARACTER SET utf8 COLLATE utf8_general_ci
# ~ (0.146982) CREATE INDEX `index_tasks_person` ON `tasks` (`person_id`)
# ~ (0.002525) SELECT COUNT(*) FROM "information_schema"."table_constraints" WHERE "constraint_type" = 'FOREIGN KEY' AND "table_schema" = 'test' AND "table_name" = 'tasks' AND "constraint_name" = 'tasks_person_fk'
# ~ (0.230075) ALTER TABLE `tasks` ADD CONSTRAINT `tasks_person_fk` FOREIGN KEY (`person_id`) REFERENCES `people` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
# => #<DataMapper::DescendantSet:0x101379a68 @descendants=[Person, Task]>

Notice how the last statement adds a foreign key constraint to the schema definition.

Strategic Eager Loading

DataMapper issues only the bare minimum of queries to your data-store. For example, the following code issues only 2 queries. Notice how we don't supply any extra :include information.

zoos = Zoo.all
zoos.each do |zoo|
  # on first iteration, DM loads up all of the exhibits for all of the items in zoos
  # in 1 query to the data-store.

  zoo.exhibits.each do |exhibit|
    # n+1 queries in other ORMs, not in DataMapper
    puts "Zoo: #{zoo.name}, Exhibit: #{exhibit.name}"
  end
end

The idea is that you aren't going to load a set of objects and use only an association in just one of them. This should hold up pretty well against a 99% rule.

When you don't want it to work like this, just load the item you want in its own set. So DataMapper thinks ahead. We like to call it "performant by default". This feature single-handedly wipes out the "N+1 Query Problem".
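
To make the query counts concrete, here is a small plain-Ruby sketch. It does not use DataMapper; FakeStore, all_zoos and exhibits_for are hypothetical names for an in-memory stand-in that counts the queries it receives. It contrasts loading exhibits one zoo at a time with loading them for the whole set on first access.

```ruby
# Hypothetical in-memory stand-in for a data-store that counts its queries.
class FakeStore
  attr_reader :queries

  def initialize
    @queries = 0
    @zoos = [{ :id => 1, :name => 'Dallas' }, { :id => 2, :name => 'Austin' }]
    @exhibits = [
      { :zoo_id => 1, :name => 'Lions' },
      { :zoo_id => 1, :name => 'Apes' },
      { :zoo_id => 2, :name => 'Birds' }
    ]
  end

  def all_zoos
    @queries += 1
    @zoos
  end

  # One query for the exhibits of every given zoo (an IN-clause in SQL terms).
  def exhibits_for(zoo_ids)
    @queries += 1
    @exhibits.select { |exhibit| zoo_ids.include?(exhibit[:zoo_id]) }
  end
end

# Naive approach: one exhibits query per zoo, N+1 queries in total.
naive = FakeStore.new
naive.all_zoos.each { |zoo| naive.exhibits_for([zoo[:id]]) }

# Batched approach: load the exhibits for the whole set of zoos at once.
batched = FakeStore.new
zoos = batched.all_zoos
zoo_ids = zoos.map { |zoo| zoo[:id] }
by_zoo = batched.exhibits_for(zoo_ids).group_by { |exhibit| exhibit[:zoo_id] }
zoos.each do |zoo|
  (by_zoo[zoo[:id]] || []).each do |exhibit|
    puts "Zoo: #{zoo[:name]}, Exhibit: #{exhibit[:name]}"
  end
end

puts "naive: #{naive.queries} queries, batched: #{batched.queries} queries"
```

With two zoos the naive loop costs three queries and the batched one two. The gap grows by one query for every additional zoo.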

DataMapper also waits until the very last second to actually issue the query to your data-store. For example, zoos = Zoo.all won't run the query until you start iterating over zoos or call one of the 'kicker' methods like #length. If you never do anything with the results of a query, DataMapper won't incur the latency of talking to your data-store.
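
Deferred execution of this kind can be sketched in a few lines of plain Ruby. LazyCollection is a hypothetical class, not DataMapper's collection; the block passed to it stands in for the real query and runs only when a kicker method needs the results.

```ruby
# Hypothetical lazy collection: the block standing in for the query runs only
# when a "kicker" method such as #each or #length needs the results.
class LazyCollection
  include Enumerable

  def initialize(&query)
    @query = query
    @loaded = nil
  end

  def loaded?
    !@loaded.nil?
  end

  # Kicker: iterating forces the query to run.
  def each(&block)
    load_records.each(&block)
  end

  # Kicker: the length cannot be known without the results.
  def length
    load_records.length
  end

  private

  # Run the query once and keep the result for later calls.
  def load_records
    @loaded ||= @query.call
  end
end

query_count = 0
zoos = LazyCollection.new do
  query_count += 1
  ['Dallas', 'Galveston']
end

puts query_count   # still 0 here: building the collection ran no query
puts zoos.length   # kicker: runs the query
puts query_count   # the query ran exactly once
```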

Note that this currently doesn't work when you nest loops that access associations more than one level deep. The following would not issue the optimal number of queries:

zoos = Zoo.all
zoos.each do |zoo|
  # on first iteration, DM loads up all of the exhibits for all of the items in zoos
  # in 1 query to the data-store.

  zoo.exhibits.each do |exhibit|
    # n+1 queries in other ORMs, not in DataMapper
    puts "Zoo: #{zoo.name}, Exhibit: #{exhibit.name}"

    exhibit.items.each do |item|
      # currently DM won't be smart about the queries it generates for
      # accessing the items in any particular exhibit
      puts "Item: #{item.name}"
    end
  end
end

However, there's work underway to remove that limitation. In the future, it will be possible to get the same smart queries inside deeper nested iterations.

Depending on your specific needs, it might be possible to work around this limitation by using DataMapper's ability to query models by their associations, as described briefly in the section below.

You can also find more information about this feature on the Finders and the Associations pages.

Querying models by their associations

DataMapper allows you to create and search for any complex object graph simply by providing a nested hash of conditions. The following example uses a typical Customer - Order domain model to illustrate how nested conditions can be used to both create and query models by their associations.

For a complete definition of the Customer - Order domain models, have a look at the Finders page.

# A hash specifying one customer with one order
#
# In general, possible keys are all property and relationship
# names that are available on the relationship's target model.
# Possible toplevel keys depend on the property and relationship
# names available in the model that receives the hash.
#
customer = {
  :name   => 'Dan Kubb',
  :orders => [
    {
      :reference   => 'TEST1234',
      :order_lines => [
        {
          :item => {
            :sku        => 'BLUEWIDGET1',
            :unit_price => 1.00,
          },
        },
      ],
    },
  ]
}

# Create the Customer with the nested options hash
Customer.create(customer)
# => #<Customer @id=1 @name="Dan Kubb">

# The same options to create can also be used to query for the same object
p Customer.all(customer)
# => [#<Customer @id=1 @name="Dan Kubb">]

QueryPaths can be used to construct joins in a very declarative manner.

Starting from a root model, you can call any relationship by its name. The returned object again responds to all property and relationship names that are defined in the relationship's target model.

This means that you can walk the chain of available relationships, and then match against a property at the end of that chain. The object returned by the last call to a property name also responds to all the comparison operators described under All Ruby, All The Time. This makes for some powerful join construction!

Customer.all(Customer.orders.order_lines.item.sku.like => "%BLUE%")
# => [#<Customer @id=1 @name="Dan Kubb">]

You can even chain calls to all or first to continue refining your query or search within a scope. See Finders for more information.
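
The chaining behind query paths can be illustrated with a tiny plain-Ruby sketch. PathSketch is a hypothetical class, not DataMapper's QueryPath: every unknown method call appends a name to the path. The real feature additionally checks each name against the relationships and properties of the target model, which this sketch leaves out.

```ruby
# Hypothetical path builder: each unknown method call appends a name to the
# path and returns a new path, so calls can be chained.
class PathSketch
  attr_reader :names

  def initialize(names = [])
    @names = names
  end

  def method_missing(name, *args)
    return super unless args.empty?
    PathSketch.new(@names + [name])
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    true
  end
end

path = PathSketch.new.orders.order_lines.item.sku
puts path.names.join('.')
```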

Identity Map

One row in the database should equal one object reference. Pretty simple idea. Pretty profound impact. If you run the following code in ActiveRecord you'll see all false results. Do the same in DataMapper and it's true all the way down.

@parent = Tree.first(:conditions => { :name => 'bob' })

@parent.children.each do |child|
  puts @parent.object_id == child.parent.object_id
end

This makes DataMapper faster and lets it allocate fewer resources to get things done.
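
As a rough illustration of the pattern, here is an identity map in plain Ruby. IdentityMap and its fetch method are hypothetical and much simpler than what DataMapper does internally; the point is only that a second request for the same key returns the very same object.

```ruby
# Hypothetical identity map: at most one Ruby object per (model, key) pair.
class IdentityMap
  def initialize
    @objects = {}
  end

  # Return the cached object for this key; the block that builds the object
  # runs only on the first request.
  def fetch(model, key)
    @objects[[model, key]] ||= yield
  end
end

# Stand-in for a mapped row.
Tree = Struct.new(:id, :name)

map = IdentityMap.new
first  = map.fetch(Tree, 1) { Tree.new(1, 'bob') }
second = map.fetch(Tree, 1) { Tree.new(1, 'bob') }

puts first.equal?(second)   # true: same object reference, not just equal values
```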

Laziness Can Be A Virtue

Columns of potentially infinite length, like Text columns, are expensive in data-stores. They're generally stored in a different place from the rest of your data. So instead of a fast sequential read from your hard-drive, your data-store has to hop around all over the place to get what it needs.

With DataMapper, these fields are treated like in-row associations by default, meaning they are loaded if and only if you access them. If you want more control you can enable or disable this feature for any column (not just text-fields) by passing a lazy option to your column mapping with a value of true or false.

class Animal
  include DataMapper::Resource

  property :id,    Serial
  property :name,  String
  property :notes, Text    # lazy-loads by default
end

Plus, lazy-loading of Text properties happens automatically and intelligently when working with associations. The following issues only 2 queries to load all of the notes fields on each animal:

animals = Animal.all
animals.each do |pet|
  pet.notes
end
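
The two-query behaviour can be mimicked in plain Ruby. AnimalStore and LazyAnimal are hypothetical names, and the shared notes cache is just one way to get this effect, not a description of DataMapper's internals. The first access to notes on any animal loads the column for the whole collection.

```ruby
# Hypothetical in-memory store that counts queries. Each row has a cheap
# column (:name) and an expensive, lazily loaded one (:notes).
class AnimalStore
  attr_reader :queries

  ROWS = {
    1 => { :name => 'Rex', :notes => 'Likes long walks.' },
    2 => { :name => 'Tom', :notes => 'Afraid of water.' }
  }

  def initialize
    @queries = 0
  end

  # Query 1: everything except the lazy :notes column.
  def load_animals
    @queries += 1
    shared_notes = {} # cache shared by every animal in this collection
    ROWS.map { |id, row| LazyAnimal.new(self, shared_notes, ROWS.keys, id, row[:name]) }
  end

  # Query 2: the lazy column for every id in the collection at once.
  def load_notes(ids)
    @queries += 1
    ids.each_with_object({}) { |id, notes| notes[id] = ROWS[id][:notes] }
  end
end

class LazyAnimal
  attr_reader :name

  def initialize(store, shared_notes, sibling_ids, id, name)
    @store = store
    @shared_notes = shared_notes
    @sibling_ids = sibling_ids
    @id = id
    @name = name
  end

  # The first access on any animal loads the notes for all of its siblings.
  def notes
    @shared_notes.update(@store.load_notes(@sibling_ids)) if @shared_notes.empty?
    @shared_notes[@id]
  end
end

store = AnimalStore.new
store.load_animals.each { |pet| puts pet.notes }
puts store.queries
```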

Embracing Ruby

DataMapper loves Ruby and is therefore tested regularly against all major Ruby versions. Before release, every gem is explicitly tested against MRI 1.8.7, 1.9.2, JRuby and Rubinius. We're proud to say that almost all of our specs pass on all these different implementations.

Have a look at our CI server reports for detailed information about which gems pass or fail their specs on the various Ruby implementations. Note that these results always reflect the state of the latest code and not the state of the latest released gem. Our CI server runs tests for all permutations whenever someone commits to any of the tested repositories on GitHub.

All Ruby, All The Time

DataMapper goes further than most Ruby ORMs in letting you avoid writing raw query fragments yourself. It provides more helpers and a unique hash-based conditions syntax to cover more of the use-cases where issuing your own SQL would have been the only way to go.

For example, any finder option that is non-standard is considered a condition. So you can write Zoo.all(:name => 'Dallas') and DataMapper will look for zoos with the name of 'Dallas'.

It's just a little thing, but it's so much nicer than writing Zoo.find(:all, :conditions => [ 'name = ?', 'Dallas' ]) and won't incur the Ruby overhead of Zoo.find_by_name('Dallas'), nor is it more difficult to understand once the number of parameters increases.

What if you need other comparisons though? Try these:

Zoo.first(:name => 'Galveston')

# 'gt' means greater-than. 'lt' is less-than.
Person.all(:age.gt => 30)

# 'gte' means greater-than-or-equal-to. 'lte' is also available
Person.all(:age.gte => 30)

Person.all(:name.not => 'bob')

# If the value of a pair is an Array, we do an IN-clause for you.
Person.all(:name.like => 'S%', :id => [ 1, 2, 3, 4, 5 ])

# Does a NOT IN () clause for you.
Person.all(:name.not => [ 'bob', 'rick', 'steve' ])

# Ordering
Person.all(:order => [ :age.desc ])
# .asc is the default
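
For the curious, here is a plain-Ruby sketch of how a syntax like :age.gt can be made to work. Operator, SQL_OPERATORS and where_clause are hypothetical names, and only two operators are covered. It reopens Symbol so that :age.gt returns a small operator object, then turns a conditions hash into a WHERE fragment plus bind values. It illustrates the technique and is not DataMapper's code.

```ruby
# Hypothetical operator object: a property name paired with a comparison.
Operator = Struct.new(:target, :operator)

# Reopening Symbol is what makes :age.gt possible.
class Symbol
  def gt
    Operator.new(self, :gt)
  end

  def like
    Operator.new(self, :like)
  end
end

SQL_OPERATORS = { :gt => '>', :like => 'LIKE' }

# Turn a conditions hash into a WHERE fragment plus its bind values.
def where_clause(conditions)
  fragments = []
  binds = []
  conditions.each do |key, value|
    if key.is_a?(Operator)
      # An operator key such as :age.gt
      fragments << "#{key.target} #{SQL_OPERATORS.fetch(key.operator)} ?"
      binds << value
    elsif value.is_a?(Array)
      # An Array value becomes an IN-clause
      fragments << "#{key} IN (#{(['?'] * value.size).join(', ')})"
      binds.concat(value)
    else
      # A plain key means equality
      fragments << "#{key} = ?"
      binds << value
    end
  end
  [fragments.join(' AND '), binds]
end

sql, binds = where_clause({ :age.gt => 30, :name.like => 'S%', :id => [1, 2, 3] })
puts sql
p binds
```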

Open Development

DataMapper sports a very accessible code-base and a welcoming community. Outside contributions and feedback are welcome and encouraged, especially constructive criticism. Go ahead, fork DataMapper, we'd love to see what you come up with!

Make your voice heard! Submit a ticket or patch, speak up on our mailing-list, chat with us on IRC, write a spec, get it reviewed, ask for commit rights. It's as easy as that to become a contributor.