Sunday, September 30, 2012

MongoDB install guide and getting started guide

Getting up and running with MongoDB is fairly easy, and MongoDB is a good introduction to the NoSQL world too. In less than five minutes, you can be messing around with the console app and learning MongoDB. This short tutorial and install guide shows how to run JavaScript and commands from the MongoDB console.

Installing MongoDB

Now let's mix in some code samples to try out along with the concepts.
To install MongoDB, go to the MongoDB download page, then download and untar/unzip the archive to ~/mongodb-platform-version/. Next, create the directory that will hold the data and create a mongodb.config file (/etc/mongodb/mongodb.config) that points to that directory as follows:

Listing: Installing MongoDB

$ sudo mkdir -p /etc/mongodb/data

$ cat /etc/mongodb/mongodb.config
dbpath=/etc/mongodb/data

The /etc/mongodb/mongodb.config file has one line, dbpath=/etc/mongodb/data, that tells mongod where to put the data. Next, link the MongoDB directory to /usr/local/mongodb and then add its bin directory to the PATH environment variable as follows:

Listing: Setting up MongoDB on your path

$ sudo ln -s  ~/mongodb-platform-version/  /usr/local/mongodb
$ export PATH=$PATH:/usr/local/mongodb/bin

Run the server, passing it the configuration file that we created earlier.

Listing: Running the MongoDB server

$ mongod --config /etc/mongodb/mongodb.config

Short tutorial on using MongoDB
MongoDB comes with a nice console application called mongo that lets you execute commands and JavaScript. JavaScript is to MongoDB what PL/SQL is to Oracle's database. Let's fire up the console app and poke around.

Firing up the mongo console application

$ mongo
MongoDB shell version: 2.0.4
connecting to: test
> db.version()

One of the nice things about MongoDB is the self-describing console. It is easy to see what commands a MongoDB database supports with db.help() as follows:

Client: mongo db.help()

> db.help()
DB methods:
db.addUser(username, password[, readOnly=false])
db.auth(username, password)
db.commandHelp(name) returns the help for the command
db.copyDatabase(fromdb, todb, fromhost)
db.createCollection(name, { size : ..., capped : ..., max : ... } )
db.currentOp() displays the current operation in the db
db.eval(func, args) run code server-side
db.getCollection(cname) same as db['cname'] or db.cname
db.getLastError() - just returns the err msg string
db.getLastErrorObj() - return full status object
db.getMongo() get the server connection object
db.getMongo().setSlaveOk() allow this connection to read from the nonmaster member of a replica pair
db.getProfilingStatus() - returns if profiling is on and slow threshold 
db.getSiblingDB(name) get the db at the same server as this one
db.isMaster() check replica primary status
db.killOp(opid) kills the current operation in the db
db.listCommands() lists all the db commands
db.runCommand(cmdObj) run a database command.  if cmdObj is a string, turns it into { cmdObj : 1 }
db.setProfilingLevel(level,{slowms}) 0=off 1=slow 2=all
db.version() current version of the server
db.getMongo().setSlaveOk() allow queries on a replication slave server
db.fsyncLock() flush data to disk and lock server for backups
db.fsyncUnlock() unlocks server following a db.fsyncLock() 

Notice how some of the commands refer to concepts we discussed earlier. Now let's create a collection of employees and do some create, read, and update operations on it.

Create Employee Collection

> use tutorial;
switched to db tutorial
> db.getCollectionNames();
[ ]
> db.employees.insert({name:'Rick Hightower', gender:'m', phone:'520-555-1212', age:42});
Mon Apr 23 23:50:24 [FileAllocator] allocating new datafile /etc/mongodb/data/tutorial.ns, ...

The use command switches to a database. If that database does not exist, it will be lazily created the first time we access it (write to it). The db object refers to the current database. The current database does not have any document collections to start with (this is why db.getCollectionNames() returns an empty list). To create a document collection, just insert a new document. Collections, like databases, are lazily created when they are actually used. You can see that two collections were created when we inserted our first document into the employees collection as follows:


> db.getCollectionNames();
[ "employees", "system.indexes" ]
The first collection is our employees collection and the second collection is used to hold onto indexes we create.
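
To make the lazy-creation idea concrete, here is a minimal sketch in plain JavaScript that you can run outside the mongo console (with Node.js, for example). ToyDatabase and its methods are hypothetical names made up for this illustration, not MongoDB's API or implementation, and the sketch leaves out the system.indexes bookkeeping collection.

```javascript
// Toy in-memory model of lazy collection creation.
// ToyDatabase is a made-up class for illustration, not part of MongoDB.
class ToyDatabase {
  constructor(name) {
    this.name = name;
    this.collections = new Map(); // collection name -> array of documents
  }

  // Roughly like db.getCollectionNames(): lists only collections that exist.
  getCollectionNames() {
    return Array.from(this.collections.keys()).sort();
  }

  // The first insert into a collection is what brings it into existence.
  insert(collectionName, doc) {
    if (!this.collections.has(collectionName)) {
      this.collections.set(collectionName, []);
    }
    this.collections.get(collectionName).push(doc);
  }
}

const toyDb = new ToyDatabase('tutorial');
const namesBefore = toyDb.getCollectionNames(); // nothing has been written yet
toyDb.insert('employees', { name: 'Rick Hightower', age: 42 });
const namesAfter = toyDb.getCollectionNames(); // employees now exists
console.log(namesBefore, namesAfter);
```

Nothing is allocated for employees until the first insert, which is the same behavior the console session above shows.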

To list all employees you just call the find method on the employees collection.
> db.employees.find()
{ "_id" : ObjectId("4f964d3000b5874e7a163895"), "name" : "Rick Hightower", 
    "gender" : "m", "phone" : "520-555-1212", "age" : 42 }

The above is the query syntax for MongoDB. There is no separate SQL-like language. You just execute JavaScript code, passing documents, which are just JavaScript associative arrays, err, I mean JavaScript objects. To find a particular employee, you do this:

> db.employees.find({name:"Bob"})

Bob quit, so to find another employee, you would do this:

> db.employees.find({name:"Rick Hightower"})
{ "_id" : ObjectId("4f964d3000b5874e7a163895"), "name" : "Rick Hightower", "gender" : "m", "phone" : "520-555-1212", "age" : 42 }

The console just prints out the document right to the screen. I don't feel that old. At least I am not 100 as shown by this query:

> db.employees.find({age:{$lt:100}})
{ "_id" : ObjectId("4f964d3000b5874e7a163895"), "name" : "Rick Hightower", "gender" : "m", "phone" : "520-555-1212", "age" : 42 }

Notice that to get employees younger than 100, you pass a document with a subdocument: the key is the operator ($lt), and the value is the operand (100). Mongo supports the comparison operators you would expect, like $lt for less than, $gt for greater than, etc. If you know JavaScript, it is easy to inspect fields of a document, as follows:

> db.employees.find({age:{$lt:100}})[0].name
Rick Hightower
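
If the document-with-a-subdocument query shape feels odd, it may help to see how such a query could be evaluated. The following is a rough sketch in plain JavaScript (runnable with Node.js, not in the mongo console). The matches function and the operators table are hypothetical helpers written for this post, they cover only four comparison operators plus plain equality, and they say nothing about how the MongoDB server actually evaluates queries.

```javascript
// Toy matcher for MongoDB-style query documents (illustration only).
const operators = {
  $lt: (fieldValue, operand) => fieldValue < operand,
  $lte: (fieldValue, operand) => fieldValue <= operand,
  $gt: (fieldValue, operand) => fieldValue > operand,
  $gte: (fieldValue, operand) => fieldValue >= operand,
};

// True when the condition is a subdocument made only of $operator keys.
function isOperatorDoc(condition) {
  if (condition === null || typeof condition !== 'object' || Array.isArray(condition)) {
    return false;
  }
  const keys = Object.keys(condition);
  return keys.length > 0 && keys.every((key) => key.startsWith('$'));
}

// Every field in the query must be satisfied by the document.
function matches(doc, query) {
  return Object.entries(query).every(([field, condition]) => {
    if (isOperatorDoc(condition)) {
      return Object.entries(condition).every(([op, operand]) => {
        const test = operators[op];
        if (!test) throw new Error('operator not supported by this sketch: ' + op);
        return test(doc[field], operand);
      });
    }
    return doc[field] === condition; // a plain value means equality
  });
}

const employees = [
  { name: 'Rick Hightower', gender: 'm', phone: '520-555-1212', age: 42 },
];

const youngerThan100 = employees.filter((e) => matches(e, { age: { $lt: 100 } }));
const bobs = employees.filter((e) => matches(e, { name: 'Bob' }));
console.log(youngerThan100.length, bobs.length);
```

The same two queries from the console session appear at the bottom: the $lt query finds Rick, and the search for Bob comes back empty.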

If we were going to query, sort, or shard on employees.name, then we would need to create an index as follows:
> db.employees.ensureIndex({name:1}); //ascending index, descending would be -1
Indexing by default is a blocking operation, so if you are indexing a large collection, it could take several minutes and perhaps much longer. This is not something you want to do casually on a production system. There are options to build an index as a background task and to set up a unique index, there are complications around indexing on replica sets, and there is much more. If you are running queries that rely on certain indexes to be performant, you can check to see if an index exists with db.employees.getIndexes(). You can also see a list of indexes as follows:

> db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "tutorial.employees", "name" : "_id_" }
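
To see why an index on name helps with lookups, here is a small sketch in plain JavaScript (runnable with Node.js). It models an ascending index as a sorted list of key and position pairs searched with binary search. That is a simplification chosen for this post: MongoDB's own indexes are B-trees, the function names buildAscendingIndex and lookup are hypothetical, and every employee other than Rick is made up for the example.

```javascript
// Toy ascending index: sorted [key, position] pairs (illustration only).
function buildAscendingIndex(docs, field) {
  return docs
    .map((doc, position) => [doc[field], position])
    .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));
}

// Binary search for an entry whose key equals `key`.
// Returns the position of the document in the collection, or -1.
function lookup(index, key) {
  let lo = 0;
  let hi = index.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid][0] === key) return index[mid][1];
    if (index[mid][0] < key) lo = mid + 1;
    else hi = mid - 1;
  }
  return -1;
}

// Hypothetical collection contents; only Rick comes from the tutorial.
const staff = [
  { name: 'Rick Hightower', age: 42 },
  { name: 'Alice Example', age: 30 },
  { name: 'Bob Example', age: 55 },
];

const nameIndex = buildAscendingIndex(staff, 'name');
const rickPosition = lookup(nameIndex, 'Rick Hightower');
const missingPosition = lookup(nameIndex, 'Zed Nobody');
console.log(rickPosition, missingPosition);
```

Building the sorted structure touches every document, which hints at why indexing a large collection takes a while, but each lookup afterwards inspects only a handful of entries instead of scanning the whole collection.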

By default, all documents get an object id. If you do not give an object an _id, the system will assign one (like a criminal suspect gets a lawyer). You can use that _id to look up an object with findOne as follows:

> db.employees.findOne({_id : ObjectId("4f964d3000b5874e7a163895")})
{ "_id" : ObjectId("4f964d3000b5874e7a163895"), "name" : "Rick Hightower", 
   "gender" : "m", "phone" : "520-555-1212", "age" : 42 }

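A system-assigned ObjectId is not just a random string: its first four bytes (the first eight hex characters) hold the creation time in seconds since the Unix epoch. The sketch below, in plain JavaScript runnable with Node.js, decodes that timestamp from the _id shown above. The function name objectIdTimestamp is a hypothetical helper for this post, not part of the mongo console.

```javascript
// Decode the creation time embedded in an ObjectId's first 4 bytes.
// objectIdTimestamp is a made-up helper name for this illustration.
function objectIdTimestamp(hexId) {
  if (!/^[0-9a-fA-F]{24}$/.test(hexId)) {
    throw new Error('expected a 24-character hex string');
  }
  // First 8 hex characters = seconds since the Unix epoch.
  const seconds = parseInt(hexId.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const createdAt = objectIdTimestamp('4f964d3000b5874e7a163895');
console.log(createdAt.toISOString()); // prints 2012-04-24T06:50:24.000Z
```

That UTC time is consistent with the Mon Apr 23 23:50:24 log line printed during the insert, assuming the server's clock was seven hours behind UTC.
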