Monday, December 15, 2014

MongoDB Security

This blog explains how MongoDB security works, both between clients and the server and between mongod instances in a replica set environment. Securing data remains a key concern for solution architects, and this post walks developers through setting up a MongoDB environment with security enabled.

To enforce access control on MongoDB data, all clients must be authenticated. MongoDB authentication (username/password) is always validated against a specific database; user documents are stored in the system.users collection of the admin database. Once a client is authenticated, every command it issues is checked against the roles associated with that user before the action is allowed on the mongo instance. Authorization can be granted through pre-defined roles that apply to a whole database, or through custom roles scoped down to individual collections.
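As an illustration of database-scoped authorization, the sketch below creates a user whose read role is limited to a single database. It assumes MongoDB 2.6+ and the 2.x Java driver; "reportsUser", "secret", and "tenantDB" are hypothetical names, not part of the setup described in this post.

import java.util.Arrays;
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.MongoClient;

// Connect (access control still disabled for this bootstrap step) and define the
// user in the admin database, but scope its "read" role down to tenantDB only.
MongoClient client = new MongoClient("127.0.0.1", 27017);
DB admin = client.getDB("admin");
admin.command(new BasicDBObject("createUser", "reportsUser")
        .append("pwd", "secret")
        .append("roles", Arrays.asList(
                new BasicDBObject("role", "read").append("db", "tenantDB"))));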

In a replica set environment, the mongod instances authenticate to each other using a shared keyfile or an X.509 certificate. The same keyfile must be present on every member, and each member must be started with the --keyFile option. The steps to run mongod instances with security enabled are given below.

Create an admin user with the MongoDB built-in role root. The steps are as follows.

run mongod without access control (e.g. mongod, or explicitly mongod --noauth)
connect using mongo and switch to the admin database: use admin
db.createUser({"user":"superuser","pwd":"123456","roles":["root"]})
Now, restart mongod with the --auth option

connect to the mongod instance using the mongo shell:
mongo -u superuser -p 123456 --authenticationDatabase admin

It is important to understand that two users with the same name can exist as long as they belong to different databases. So we can create two users with the same name provided we switch databases first using 'use databasename'.

The command below will fail to authenticate, because mongo tries to authenticate the user against the default test database, while superuser belongs to the admin database and must be authenticated there.

mongo -u superuser -p 123456

Once connected, the root role gives the user access to every database.

One might expect that connecting with a plain 'mongo' command (no credentials) would be refused outright, but mongod accepts the connection and simply blocks every action: after connecting, even 'show dbs' fails with an authorization error.
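The same behaviour can be seen from the driver side. A hedged sketch (the exact exception type and message vary by server and driver version):

import com.mongodb.*;

// Connect with no credentials to a mongod running with --auth.
MongoClient anon = new MongoClient(new ServerAddress("127.0.0.1", 27017));
try {
    // The connection itself is established...
    anon.getDB("test").getCollection("test").findOne();
} catch (MongoException e) {
    // ...but any real action is rejected with an authorization error.
    System.out.println("rejected: " + e.getMessage());
}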

Below is Java code using the mongo driver to connect to a database with credentials.

import java.util.ArrayList;
import java.util.List;
import com.mongodb.*;

// Authenticate superuser against the admin database, then write to tenantDB.
List<MongoCredential> credentials = new ArrayList<>();
credentials.add(MongoCredential.createMongoCRCredential("superuser", "admin", "123456".toCharArray()));
DBCollection collection = new MongoClient(new ServerAddress("127.0.0.1", 27017), credentials)
        .getDB("tenantDB").getCollection("test");
collection.insert(new BasicDBObject("x", 1));

To run the replica set members with the --auth option and a shared keyfile, follow the steps below.

generate a keyfile and restrict its permissions:

openssl rand -base64 741 > mongodb-keyfile
chmod 600 mongodb-keyfile

mongod --replSet m101 --logpath "1.log" --dbpath /data/rs1 --port 27017 --smallfiles --oplogSize 64 --auth --keyFile mongodb-keyfile --fork

mongod --replSet m101 --logpath "2.log" --dbpath /data/rs2 --port 27018 --smallfiles --oplogSize 64 --auth --keyFile mongodb-keyfile --fork

mongod --replSet m101 --logpath "3.log" --dbpath /data/rs3 --port 27019 --smallfiles --oplogSize 64 --auth --keyFile mongodb-keyfile --fork

Run rs.initiate() on the first member (it becomes the primary), then add the remaining members using

rs.add("localhost:27018")
rs.add("localhost:27019")
rs.conf() and rs.status() show the configuration and health of the replica set.

Any member added to the replica set from now on must be started with the same keyfile (and --auth); otherwise the other members will not be able to authenticate to it and it will remain unreachable.
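To connect to this secured replica set from Java, a minimal sketch (reusing the superuser created earlier and the member addresses above; 2.x driver assumed, database and collection names illustrative):

import java.util.Arrays;
import com.mongodb.*;

MongoCredential cred =
        MongoCredential.createMongoCRCredential("superuser", "admin", "123456".toCharArray());
// Seed the client with all members; it discovers the PRIMARY automatically.
MongoClient client = new MongoClient(
        Arrays.asList(new ServerAddress("localhost", 27017),
                      new ServerAddress("localhost", 27018),
                      new ServerAddress("localhost", 27019)),
        Arrays.asList(cred));
client.getDB("test").getCollection("demo").insert(new BasicDBObject("x", 1));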
 
Reference:

http://docs.mongodb.org/manual/tutorial/deploy-replica-set-with-auth/

Thursday, December 11, 2014

MongoDB ReplicaSet and Streaming

This blog provides information on MongoDB replication and oplog streaming.

MongoDB instances (mongod) can be configured for replication. One instance acts as the primary while the others act as secondaries. By default the primary serves reads and writes for the mongo cluster; the secondaries sync data written to the primary using the oplog (explained later in this blog).

The commands below create three mongod instances running on different ports.

Steps to create replica set …

mkdir -p /data/rs1 /data/rs2 /data/rs3
Create three mongod instances running on port no. 27017, 27018 and 27019
mongod --replSet m101 --logpath "1.log" --dbpath /data/rs1 --port 27017 --smallfiles --oplogSize 64 --fork

mongod --replSet m101 --logpath "2.log" --dbpath /data/rs2 --port 27018 --smallfiles --oplogSize 64 --fork

mongod --replSet m101 --logpath "3.log" --dbpath /data/rs3 --port 27019 --smallfiles --oplogSize 64 --fork
At this point all the instances are running independently; no replica set is configured yet.

Connect to one of the mongod instances:

mongo --port 27017

Once you get the mongo shell, enter the configuration below:

config = { _id: "m101", members:[
          { _id : 0, host : "localhost:27017"},
          { _id : 1, host : "localhost:27018"},
          { _id : 2, host : "localhost:27019"} ]
};
rs.initiate(config);

Now one of the instances will act as PRIMARY and the remaining ones as SECONDARY.


  • By default, read/write operations are allowed only on the primary. To allow read operations on a secondary, run rs.slaveOk() in its shell. Write operations are never permitted on a secondary.
  • Secondaries replicate the primary: a write on the primary is synced to the secondaries in near real time.
  • Secondaries can be used to distribute read load, which makes read operations horizontally scalable; see the sketch after this list.
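A hedged sketch of read distribution with the 2.x Java driver: seeding the client with the replica set members and setting a read preference routes eligible reads to a secondary when one is available.

import java.util.Arrays;
import com.mongodb.*;

MongoClient client = new MongoClient(Arrays.asList(
        new ServerAddress("localhost", 27017),
        new ServerAddress("localhost", 27018),
        new ServerAddress("localhost", 27019)));
// Prefer secondaries for reads; fall back to the primary if none is available.
client.setReadPreference(ReadPreference.secondaryPreferred());
DBObject doc = client.getDB("test").getCollection("customer").findOne();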

Once the replica set is initiated, each mongod instance has a local database containing the oplog.rs collection. oplog.rs is a capped collection, and it logs every write operation (insert / update / delete) on any collection. For example, when we insert a document into any collection, a corresponding entry is appended to oplog.rs.

For example, the steps in the CRUD section below insert a document into the customer collection and show its corresponding entry in the oplog.rs collection.

It is possible to create a tailable cursor on the oplog.rs collection. (Note: a tailable cursor blocks the current thread on the cursor.hasNext() method, as the sketch below shows.)
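A hedged sketch of such a tailable cursor with the 2.x Java driver (connection details assumed from the setup above):

import com.mongodb.*;

MongoClient client = new MongoClient(new ServerAddress("localhost", 27017));
DBCollection oplog = client.getDB("local").getCollection("oplog.rs");
DBCursor cursor = oplog.find()
        .addOption(Bytes.QUERYOPTION_TAILABLE)    // keep the cursor open at the tail
        .addOption(Bytes.QUERYOPTION_AWAITDATA);  // wait server-side for new entries
while (cursor.hasNext()) {                        // blocks here when caught up
    DBObject entry = cursor.next();
    System.out.println(entry.get("op") + " on " + entry.get("ns"));
}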

Oplog entries are rolled over (the oldest entries are overwritten) once the specified size of the oplog is reached.

In one implementation on my project we used oplog.rs to read document events (insert / update / delete) for further processing. We also stored a checkpoint (the timestamp of the last processed oplog entry) so that when the system restarts it processes mongo documents from that point onwards; see the sketch below.
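A hedged sketch of resuming from such a checkpoint, reusing the client from the previous sketch. lastTs stands for the stored ts of the last processed entry; how it is persisted is application-specific, and the value here is illustrative.

import com.mongodb.*;
import org.bson.types.BSONTimestamp;

BSONTimestamp lastTs = new BSONTimestamp(1418846623, 1);  // illustrative checkpoint
DBObject query = new BasicDBObject("ts", new BasicDBObject("$gt", lastTs));
DBCursor resumed = client.getDB("local").getCollection("oplog.rs").find(query)
        .addOption(Bytes.QUERYOPTION_TAILABLE)
        .addOption(Bytes.QUERYOPTION_AWAITDATA)
        .addOption(Bytes.QUERYOPTION_OPLOGREPLAY);  // efficient ts-based seek on the oplog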

A program (using the Java mongo driver) connected to one of the nodes learns about all the replica set members. If the primary goes down, the application can transparently continue writing to whichever mongod instance is elected as the new PRIMARY, as sketched below.
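A hedged sketch of that behaviour: the 2.x driver rediscovers the PRIMARY after a failover, but a write issued during the election throws, so the application retries (the retry policy is the application's choice).

import java.util.Arrays;
import com.mongodb.*;

MongoClient client = new MongoClient(Arrays.asList(
        new ServerAddress("localhost", 27017),
        new ServerAddress("localhost", 27018),
        new ServerAddress("localhost", 27019)));
DBCollection customer = client.getDB("test").getCollection("customer");
boolean written = false;
while (!written) {
    try {
        customer.insert(new BasicDBObject("x", 2));
        written = true;
    } catch (MongoException e) {
        // No PRIMARY during the election; pause and retry (cap retries in real code).
        try { Thread.sleep(1000); } catch (InterruptedException ie) { break; }
    }
}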

It is possible to add / remove members in the replica set without restarting mongo instances.

Now, let's try to understand sharding. Sharding adds scalability to the mongo architecture: shards are mongod instances, and each shard can itself have replica members to provide HA. A shard key is defined on a collection, and mongos instances work as routers that send read/write queries to the correct shard. Be aware that mongos communicates with the PRIMARY of each shard.

CRUD operations in mongo and their corresponding oplog.rs entries

For example, if we run the insert below, we can find the corresponding entry in oplog.rs:


m101:PRIMARY> use test
m101:PRIMARY> db.customer.insert({"x":1})
m101:PRIMARY> use local
m101:PRIMARY> db.oplog.rs.find().sort({"$natural":-1}).limit(1).pretty()
{
   "ts" : Timestamp(1418846623, 1),
   "h" : NumberLong("-5919409785401989933"),
   "v" : 2,
   "op" : "i",
   "ns" : "test.customer",
   "o" : {
      "_id" : ObjectId("5491e19f02d16dfe03bb04f6"),
      "x" : 1
   }
}


ns is the namespace, which is dbname.collectionname; o is the object that we are inserting into the namespace; and op is the type of operation being performed: in our case "i", which stands for insert.


When we perform an update operation as shown below, op will be of type "u", and the o2 field identifies the document being updated:


m101:PRIMARY> db.customer.update({"x":1},{"$set":{"y":1}});
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
m101:PRIMARY> use local
switched to db local
m101:PRIMARY> db.oplog.rs.find().sort({"$natural":-1}).limit(1).pretty()
{
   "ts" : Timestamp(1418847060, 1),
   "h" : NumberLong("-187714577821739110"),
   "v" : 2,
   "op" : "u",
   "ns" : "test.customer",
   "o2" : {
      "_id" : ObjectId("5491e19f02d16dfe03bb04f6")
   },
   "o" : {
      "$set" : {
          "y" : 1
      }
   }
}


When a document is removed from any collection, the op value is "d":


m101:PRIMARY> db.customer.remove({"x":1})
WriteResult({ "nRemoved" : 1 })
m101:PRIMARY> use local
switched to db local
m101:PRIMARY> db.oplog.rs.find().sort({"$natural":-1}).limit(1).pretty()
{
   "ts" : Timestamp(1418847272, 1),
   "h" : NumberLong("-849370824550739777"),
   "v" : 2,
   "op" : "d",
   "ns" : "test.customer",
   "b" : true,
   "o" : {
      "_id" : ObjectId("5491e19f02d16dfe03bb04f6")
   }
}


Now consider the scenario where we update an existing value again: the shape of the oplog.rs entry does not change, it is the same as explained in the update example above (op "u" with a $set modifier in o).
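Putting the fields together, a hedged sketch of dispatching on the op type of each entry, as read by the tailable cursor shown earlier:

import com.mongodb.DBObject;

void process(DBObject entry) {
    String ns = (String) entry.get("ns");      // dbname.collectionname
    String op = (String) entry.get("op");      // "i", "u" or "d"
    DBObject o = (DBObject) entry.get("o");    // document or modifier
    if ("i".equals(op)) {
        // insert: o is the full new document
    } else if ("u".equals(op)) {
        DBObject o2 = (DBObject) entry.get("o2");  // _id of the updated document
        // update: o holds the modifier (e.g. $set)
    } else if ("d".equals(op)) {
        // delete: o holds the _id of the removed document
    }
}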

Some useful links for further study: