A data translator from sql database to mongoDB.
Supports MySQL, PostgreSQL, SQLite, Oracle, SQLServer, and DB2 (Basically anything ActiveRecord has built-in). However, I’ve only tested it with MySql and SQLite
Supports any version of MongoDB
Learn more about MongoDB at: www.mongodb.org
gem install mongify
You might have to install active record database gems (you'll know if you get an error message while running mongify command)
In order for Mongify to do its job, it needs to know where your databases are located.
Building a config file is as simple as this:
sql_connection do adapter "mysql" host "localhost" username "root" password "passw0rd" database "my_database" batch_size 10000 # This is defaulted to 10000 but in case you want to make that smaller (on lower RAM machines) # Uncomment the following line if you get a "String not valid UTF-8" error. # encoding "utf8" end mongodb_connection do host "localhost" database "my_database" # Uncomment the following line if you get a "String not valid UTF-8" error. # encoding "utf8" end
You can check your configuration by running
mongify check database.config
Currently the only supported option is for the mongodb_connection.
mongodb_connection :force => true do # Forcing a mongodb_connection will drop the database before processing # ... end
You can omit the mongodb connection until you’re ready to process your translation
If your database is large and complex, it might be a bit too much work to write the translation file by hand. Mongify’s translate command can help with this:
mongify translation database.config
Or pipe it right into a file by running
mongify translation database.config > translation_file.rb
Creating a translation is pretty straightforward. It looks something like this:
table "users" do column "id", :key column "first_name", :string column "last_name", :string column "created_at", :datetime column "updated_at", :datetime end table "posts" do column "id", :key column "title", :string column "owner_id", :integer, :references => :users column "body", :text column "published_at", :datetime column "created_at", :datetime column "updated_at", :datetime end table "comments", :embed_in => :posts, :on => :post_id do column "id", :key column "body", :text column "post_id", :integer, :references => :posts column "user_id", :integer, :references => :users column "created_at", :datetime column "updated_at", :datetime end table "preferences", :embed_in => :users, :as => :object do column "id", :key, :as => :string column "user_id", :integer, :references => "users" column "notify_by_email", :boolean end table "notes", :embed_in => true, :polymorphic => 'notable' do column "id", :key column "user_id", :integer, :references => "users" column "notable_id", :integer column "notable_type", :string column "body", :text column "created_at", :datetime column "updated_at", :datetime end
Save the file as "translation_file.rb"
and run the command:
mongify process database.config translation_file.rb
Usage: mongify command database_config [database_translation.rb] Commands: "check" or "ck" >> Checks connection for sql and no_sql databases [configuration_file] "process" or "pr" >> Takes a translation and process it to mongodb [configuration_file, translation_file] "sync" or "sy" >> Takes a translation and process it to mongodb, only syncs (insert/update) new or updated records based on the updated_at column [configuration_file, translation_file] "translation" or "tr" >> Outputs a translation file from a sql connection [configuration_file] Examples: mongify translation datbase.config mongify tr database.config mongify check database.config mongify process database.config database_translation.rb mongify sync database.config database_translation.rb Common options: -h, --help Show this message -v, --version Show version
When dealing with a translation, there are a few options you can set
Structure for defining a table is as follows:
table "table_name", {options} do # columns go here... end
Table Options are as follow:
table "table_name" # Does a straight copy of the table table "table_name", :embed_in => 'users' # Embeds table_name into users, assuming a user_id is present in table_name. # This will also assume you want the table embedded as an array. table "table_name", # Embeds table_name into users, linking it via a owner_id :embed_in => 'users', # This will also assume you want the table embedded as an array. :on => 'owner_id' table "table_name", # Embeds table_name into users as a one to one relationship :embed_in => 'users', # This also assumes you have a user_id present in table_name :on => 'owner_id', # You can also specify both :on and :as options when embedding :as => 'object' # NOTE: If you rename the owner_id column, make sure you # update the :on to the new column name table "table_name", :rename_to => 'my_table' # This will allow you to rename the table as it's getting process # Just remember that columns that use :reference need to # reference the new name. table "table_name", :ignore => true # This will ignore the whole table (like it doesn't exist) # This option is good for tables like: schema_migrations table "table_name", # This allows you to specify the table as being polymorphic :polymorphic => 'notable', # and provide the name of the polymorphic relationship. :embed_in => true # Setting embed_in => true allows the relationship to be # embedded directly into the parent class. # If you do not embed it, the polymorphic table will be copied in to # MongoDB and the notable_id will be updated to the new BSON::ObjectID table "table_name" do # A table can take a before_save block that will be called just before_save do |row| # before the row is saved to the no sql database. row.admin = row.delete('permission').to_i > 50 # This gives you the ability to do very powerful things like: end # Moving records around, renaming records, changing values in row based on end # some values! Checkout Mongify::Database::DataRow to learn more table "users" do # Here is how to set new ID using the old id before_save do |row| row._id = row.delete('pre_mongified_id') end end table "preferences", :embed_in => "users" do # As of version 0.2, embedded tables with a before_save will take an before_save do |pref_row, user_row, unset_user_row| # extra argument which is the parent row of the embedded table. user_row.email_me = pref_row.delete('email_me') # This gives you the ability to move things from an embedded table row # to the parent row. if pref_row['one_name'] unset_user_row['last_name'] = true # This will delete/unset the last name for the parent row end end end
More documentation can be found at {Mongify::Database::Table}
Structure for defining a column is as follows:
column "name", :type, {options}
Columns with no type given will be set to :string
as of version 0.2: Leaving a column out when defining a table will result in the column being ignored
Before we cover the options, you need to know what types of columns are supported:
:key # Columns that are primary keys need to be marked as :key type. You can provide an :as if your :key is not an integer column :integer # Will be converted to a integer :float # Will be converted to a float :decimal # Will be converted to a string (due to MongoDB Ruby Drivers not supporting BigDecimal, read details in Mongify::Database::Column under Decimal Storage) :string # Will be converted to a string :text # Will be converted to a string :datetime # Will be converted to a Time format (DateTime is not currently supported in the Mongo ruby driver) :date # Will be converted to a Time format (Date is not currently supported in the Mongo ruby driver) :timestamps # Will be converted to a Time format :time # Will be converted to a Time format (the date portion of the Time object will be 2000-01-01) :binary # Will be converted to a string :boolean # Will be converted to a true or false values
column "post_id", :integer, :references => :posts # Referenced columns need to be marked as such, this will mean that they will be updated # with the new BSON::ObjectID. # NOTE: if you rename the table 'posts', you should set the :references to the new name column "name", :string, :ignore => true # Ignoring a column will make the column NOT copy over to the new database column "surname", :string, :rename_to => 'last_name' # Rename_to allows you to rename the column <em>For decimal columns you can specify a few options:</em> column "total", # This is a default conversion setting. :decimal, :as => 'string' column "total", # You can specify to convert your decimal to integer :decimal, # specifying scale will define how many decimal places to keep :as => 'integer', # Example: :scale => 2 will convert 123.4567 to 12346 before saving :scale => 2
More documentation can be found at {Mongify::Database::Column}
Mongify has been tested on Ruby 1.8.7-p330, 1.9.2-p136, 1.9.3-p327, 2.0.0-p353, 2.1.1
If you have any issues, please feel free to report them here: issue tracker
If you want to continually sync your sql database with MongoDB using Mongify, you MUST run the sync feature from the start. Using the process command will not permit future syncing. The sync feature leaves the ‘pre_mongify_id` in your documents as well as it creates it’s own collection called ‘mongify_sync_helper` to keep dates and times of your last sync.
As of May 2013, we’ve released an add-on tool to Mongify called Mongify-Mongoid which lets you use the translation.rb file to generate Mongoid models. It generates the model fields, relations (most of the way) and timestamps. This should save you some time from having to re-type all the models by hand. Check out the gem at: rubygems.org/gems/mongify-mongoid
-
Allow deeper embedding
-
Test in different databases
-
Give an ability to mark source DB rows as imported (allowing Mongify to run as a on going converter)
-
Can’t do anything to an embedded table
You just need bundler >= 1.0.8
gem install bundler bundle install copy over spec/support/database.example to spec/support/database.yml and fill in required info rake test:setup:mysql rake test:setup:postgresql rake test
Just want to thank my wife (Alessia) who not only puts up with me working on my free time but sometimes helps out listening to my craziness or helping me name classes or functions.
I would also like to thank Mon_Ouie on the Ruby IRC channel for helping me figure out how to setup the internal configuration file reading.
Another thanks goes out to eimermusic for his feedback on the gem, and a few bugs that he helps flush out. And to Pranas Kiziela who committed some bug and typo fixes that make Mongify even better!
This gem was made by Andrew Kalek from Anlek Consulting
Reach me at:
-
Twitter: @anlek
-
Email: andrew.kalek@anlekcom
The sync functionality was made by Hossam Hammady from Qatar Computing Research Institute
Reach me at:
-
Twitter: @hammady
-
Email: hhammady@qf[dot]orgqa
Copyright © 2011 - 2013 Andrew Kalek, Anlek Consulting
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.