Friday, February 27, 2015

MongoDB Write Operations

Write Operations Overview:

Write operations modify data in MongoDB. Write operations are atomic at the document level.

If an update operation includes upsert: true, a new document is inserted when the query condition does not match any existing document.

db.people.update(
...    { name: "Andy" },
...    {
...       name: "Andy",
...       rating: 1,
...       score: 1
...    },
...    { upsert: true }
... )
WriteResult({
    "nMatched" : 0,
    "nUpserted" : 1,
    "nModified" : 0,
    "_id" : ObjectId("54f0b1051235b5c69441d9ea")
})
> db.people.update(    { name: "Andy" },    {       name: "Andy",       rating: 1,       score: 1    },    { upsert: true } )
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })


Write Concerns:

Unacknowledged: does not confirm whether the write succeeded or an error occurred.

Acknowledged: ensures the data has been applied in memory and is available for reads, but is not yet persisted to disk.

Journaled: ensures the data has been written to the on-disk journal.

What is the journal? Journal files hold binary content; journaling is the process in which a write is recorded in the journal first and then applied to the data files. On a clean shutdown these files are removed. Journal files start with j._ in the journal directory.

Replica Acknowledged: the response confirms that the data has been written to the primary as well as to secondaries. An example of specifying these write concerns from the shell is shown below.
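
A minimal sketch of passing a write concern per operation from the mongo shell (2.6-era syntax; w: "majority" assumes a replica set):

db.people.insert(
    { name: "Bob", rating: 2 },
    { writeConcern: { w: "majority", j: true, wtimeout: 5000 } }   // wait for majority ack + journal, up to 5 seconds
)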

The maximum size of a MongoDB document is 16 MB. When data grows beyond this size, GridFS can be used. GridFS uses two collections: one (fs.files) for storing file metadata and one (fs.chunks) for storing the actual chunks.
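
As a rough sketch, after a file has been stored via a driver or the mongofiles tool, the two GridFS collections can be inspected from the shell:

db.fs.files.findOne()                           // file metadata: filename, length, chunkSize, md5
db.fs.chunks.find({}, { data: 0 }).limit(2)     // chunk documents, hiding the binary payload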

Data model design involves a choice between normalized data divided across collections and a single document with embedded data. Having related data in multiple collections raises transaction-related issues, because MongoDB commits atomically only at the document level and no single write operation can involve more than one document. The sketch below shows the same data modeled both ways.
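
An illustrative sketch (collection and field names are invented for the example):

// Embedded: one document, updated atomically.
db.orders.insert({
    _id: 1,
    customer: { name: "Andy", city: "Phoenix" },
    items: [ { sku: "A1", qty: 2 }, { sku: "B7", qty: 1 } ]
})

// Normalized: related data in two collections; updating both documents is not atomic as a unit.
db.customers.insert({ _id: 100, name: "Andy", city: "Phoenix" })
db.orders.insert({ _id: 2, customerId: 100, items: [ { sku: "A1", qty: 2 } ] })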




MongoDB Query Plan and Distributed Queries

The MongoDB query optimizer processes a query and chooses the most efficient query plan for it. The results are buffered and the query plan is cached, so the next time the same query is fired, the cached plan is used.

The query optimizer also re-evaluates query plans periodically and when certain conditions occur, such as a mongod restart, adding or dropping an index, or running reIndex.

The db.collection.getPlanCache() method provides an interface to view query plan information and clear cached plans.
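
A small sketch of using the plan cache interface (method names as in the 2.6-era shell):

var cache = db.students.getPlanCache()
cache.listQueryShapes()                          // query shapes that currently have cached plans
cache.getPlansByQuery({ firstName: "paresh" })   // cached plans for one query shape
cache.clear()                                    // drop all cached plans for this collection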

Distributed Queries: A sharded cluster allows you to partition data based on a shard key; the mongos router, using the config server metadata, routes a query only to the specific shards that can satisfy it.

This is possible only if the query includes the shard key; if the query does not include the shard key, the mongos router broadcasts the query to all shards in the cluster, as sketched below.
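
A hedged sketch of targeted versus broadcast queries (assumes a sharded cluster with sharding enabled on the test database):

sh.enableSharding("test")
sh.shardCollection("test.students", { firstName: 1 })   // shard key: firstName

db.students.find({ firstName: "paresh" })   // contains the shard key: routed only to the owning shard(s)
db.students.find({ age: { $gt: 30 } })      // no shard key: broadcast to all shards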

Replica sets use read preferences to determine how to route queries. Read preferences can be set per connection or per operation.

The default read preference is primary, where all reads go to the primary; nearest reads from the member with the lowest network latency; secondary reads only from secondaries and errors if no secondary is available.

secondaryPreferred reads from a secondary when one is available and falls back to the primary if none are.
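
A sketch of setting read preferences from the shell (a connection to a replica set is assumed):

db.getMongo().setReadPref("secondaryPreferred")                  // per-connection default
db.students.find({ firstName: "paresh" }).readPref("nearest")    // per-operation override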

MongoDB Installation and CRUD Operations with Index Overview

The current MongoDB version at the time of writing this blog is 2.8. It can be downloaded with

curl -O http://downloads.mongodb.org/osx/mongodb-osx-x86_64-2.6.8.tgz
 
or with brew install mongodb on Mac machines.
 
Update the PATH variable using
 
export PATH=<mongodb-install-directory>/bin:$PATH 

and create the data directory using

mkdir -p /data/db
 
Now, to run MongoDB, run the command mongod or ./mongod from the bin directory.
 
The mongo client is console based and can be started with mongo or ./mongo.
 
At this stage, if you see the > prompt, it means you are connected to the MongoDB server.
 
CRUD operations
 
Some background before we start with CRUD operations on the console (an illustrative session follows the list):
 
1. show dbs displays all available databases.
2. use test switches to the database test. If it does not exist, the database is created
   automatically when the first document is inserted.
3. MongoDB documents are in JSON format; BSON is the binary format in which these JSON-like documents are stored.
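
A minimal illustrative session (database names and sizes will differ on your machine):

> show dbs
admin  (empty)
local  0.078GB
> use test
switched to db test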
 
Inserting document in mongodb
 
db.students.insert({firstName:"paresh",lastName:"bhavsar",age:39,interest:["cricket","basket-ball","gardening"],address:{"zip":85298,"city":"phoenix"}})
 
Here, interest is an array field and address is an embedded document (complex type) element.

To retrieve documents from the collection:
 
1. db.students.find({"firstName":"paresh"}) - search by first name
2. db.students.find({"address.city":"phoenix"}) - searching in complex structure
3. db.students.find({interest:{$all:["cricket","gardening"]}}) - search in array. all elements must match 

By default, idle cursors are closed by the server after 10 minutes of inactivity; this behavior can be changed, for example by opening the cursor with the noTimeout option, as sketched below.
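
A small sketch (2.6-era shell, where the option flag is set via addOption):

var cur = db.students.find().addOption(DBQuery.Option.noTimeout)   // cursor is not timed out while idle
while (cur.hasNext()) { printjson(cur.next()) }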

The db.serverStatus() command provides information about the server status.

_id has a unique index in every MongoDB collection; you cannot remove this index. All documents
inserted into MongoDB receive a default _id value if one is not supplied, as illustrated below.
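
A short illustration (the generated ObjectId value will differ):

db.students.insert({ firstName: "asha" })          // no _id supplied
db.students.findOne({ firstName: "asha" })._id     // ObjectId("...") generated automatically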

To optimize query performance, indexes need to be created in MongoDB.

At present, as we have not created any indexes on the collection, we get the following output for the query:


 db.students.getIndexes()
[
 {
  "v" : 1,
  "key" : {
   "_id" : 1
  },
  "name" : "_id_",
  "ns" : "test.students"
 }
]

db.students.ensureIndex({firstName:1}) will create a single-field index, while the command mentioned below
will create a compound index, as it is associated with multiple fields:
 
db.students.ensureIndex({"firstName":1,"lastName":1})
 
Now, to check how many indexes are available, we can use the command mentioned below.
 
db.students.getIndexes()
[
 {
  "v" : 1,
  "key" : {
   "_id" : 1
  },
  "name" : "_id_",
  "ns" : "test.students"
 },
 {
  "v" : 1,
  "key" : {
   "firstName" : 1
  },
  "name" : "firstName_1",
  "ns" : "test.students"
 },
 {
  "v" : 1,
  "key" : {
   "firstName" : 1,
   "lastName" : 1
  },
  "name" : "firstName_1_lastName_1",
  "ns" : "test.students"
 }
]
 
 
 
As the indexes are created, we can use the explain() command to check whether MongoDB is using them for a query.
 
db.students.find({firstName:"paresh"}).explain()
{
 "cursor" : "BtreeCursor firstName_1",
 "isMultiKey" : false,
 "n" : 1,
 "nscannedObjects" : 1,
 "nscanned" : 1,
 "nscannedObjectsAllPlans" : 2,
 "nscannedAllPlans" : 2,
 "scanAndOrder" : false,
 "indexOnly" : false,
 "nYields" : 0,
 "nChunkSkips" : 0,
 "millis" : 0,
 "indexBounds" : {
  "firstName" : [
   [
    "paresh",
    "paresh"
   ]
  ]
 },
 "server" : "Pareshs-MacBook-Pro.local:27017",
 "filterSet" : false
}

Here, the index firstName_1 is used. The query with both first name and last name is shown below.
 
db.students.find({firstName:"paresh","lastName":"bhavsar"}).explain()
{
 "cursor" : "BtreeCursor firstName_1_lastName_1",
 "isMultiKey" : false,
 "n" : 1,
 "nscannedObjects" : 1,
 "nscanned" : 1,
 "nscannedObjectsAllPlans" : 2,
 "nscannedAllPlans" : 2,
 "scanAndOrder" : false,
 "indexOnly" : false,
 "nYields" : 0,
 "nChunkSkips" : 0,
 "millis" : 0,
 "indexBounds" : {
  "firstName" : [
   [
    "paresh",
    "paresh"
   ]
  ],
  "lastName" : [
   [
    "bhavsar",
    "bhavsar"
   ]
  ]
 },
 "server" : "Pareshs-MacBook-Pro.local:27017",
 "filterSet" : false
} 
 
 
For queries that include firstName and age, the index used is shown below (the compound index firstName_1_lastName_1 is still chosen, with the lastName bound left open):
 
 db.students.find({firstName:"paresh","age":{$gt:30}}).explain()
{
 "cursor" : "BtreeCursor firstName_1_lastName_1",
 "isMultiKey" : false,
 "n" : 1,
 "nscannedObjects" : 1,
 "nscanned" : 1,
 "nscannedObjectsAllPlans" : 2,
 "nscannedAllPlans" : 2,
 "scanAndOrder" : false,
 "indexOnly" : false,
 "nYields" : 0,
 "nChunkSkips" : 0,
 "millis" : 0,
 "indexBounds" : {
  "firstName" : [
   [
    "paresh",
    "paresh"
   ]
  ],
  "lastName" : [
   [
    {
     "$minElement" : 1
    },
    {
     "$maxElement" : 1
    }
   ]
  ]
 },
 "server" : "Pareshs-MacBook-Pro.local:27017",
 "filterSet" : false
} 
 
 
 
Why is performance better? Because an index occupies less storage than the documents themselves, index information
is typically available in RAM, and it is mostly sequential on disk. The index and data sizes can be compared from the shell, as sketched below.
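
A hedged sketch of comparing index size with data size for the collection:

db.students.totalIndexSize()   // total bytes used by all indexes on students
db.students.stats().size       // bytes used by the documents themselves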
 
If we search only through the _id field, the _id index (shown below as BtreeCursor _id_) is used to find the documents.
 
db.students.find({_id:1,"address.zip":85298}).explain()
{
 "cursor" : "BtreeCursor _id_",
 "isMultiKey" : false,
 "n" : 0,
 "nscannedObjects" : 0,
 "nscanned" : 0,
 "nscannedObjectsAllPlans" : 1,
 "nscannedAllPlans" : 1,
 "scanAndOrder" : false,
 "indexOnly" : false,
 "nYields" : 0,
 "nChunkSkips" : 0,
 "millis" : 0,
 "indexBounds" : {
  "_id" : [
   [
    1,
    1
   ]
  ]
 },
 "server" : "Pareshs-MacBook-Pro.local:27017",
 "filterSet" : false
}
 
 
 
indexOnly Field: Covered Queries

If the fields returned by the query (the projection) and the fields used in the filter are all covered by an index, MongoDB does not
have to read the documents from disk, which gives extremely fast performance. Such queries are called covered queries.
 
> db.students.find({"firstName":"paresh"},{_id:0,"firstName":1}).explain()
 
{
 "cursor" : "BtreeCursor firstName_1",
 "isMultiKey" : false,
 "n" : 6,
 "nscannedObjects" : 0,
 "nscanned" : 6,
 "nscannedObjectsAllPlans" : 0,
 "nscannedAllPlans" : 12,
 "scanAndOrder" : false,
 "indexOnly" : true,
 "nYields" : 0,
 "nChunkSkips" : 0,
 "millis" : 0,
 "indexBounds" : {
  "firstName" : [
   [
    "paresh",
    "paresh"
   ]
  ]
 },
 "server" : "Pareshs-MacBook-Pro.local:27017",
 "filterSet" : false
} 

As we have a compound index that includes lastName, the query mentioned below also displays the indexOnly value as true.

db.students.find({"firstName":"paresh"},{_id:0,"firstName":1,lastName:1}).explain()
{
 "cursor" : "BtreeCursor firstName_1_lastName_1",
 "isMultiKey" : false,
 "n" : 6,
 "nscannedObjects" : 0,
 "nscanned" : 6,
 "nscannedObjectsAllPlans" : 6,
 "nscannedAllPlans" : 12,
 "scanAndOrder" : false,
 "indexOnly" : true,
 "nYields" : 0,
 "nChunkSkips" : 0,
 "millis" : 0,
 "indexBounds" : {
  "firstName" : [
   [
    "paresh",
    "paresh"
   ]
  ],
  "lastName" : [
   [
    {
     "$minElement" : 1
    },
    {
     "$maxElement" : 1
    }
   ]
  ]
 },
 "server" : "Pareshs-MacBook-Pro.local:27017",
 "filterSet" : false
}


Tuesday, February 24, 2015

Certificates Using Java KeyTool and Portecle

Keystores store keys and certificates. Using the command mentioned below, the file mysite-keystore.jks will be created. There are two passwords to provide: one for the keystore and a second one for the alias (key entry).

keytool -genkey -alias mysite.com -keyalg RSA -keystore mysite-keystore.jks -keysize 2048



This will generate a file named mysite-keystore.jks, which contains the key pair and self-signed certificate with the information provided.

Now, to list the certificate information in the keystore, the command mentioned below can be used.

keytool -list -v -keystore mysite-keystore.jks

This will list the self-signed certificate inside your keystore. To configure Tomcat with a self-signed certificate, the server.xml change looks like this, for example:

<Connector port="7443" protocol="org.apache.coyote.http11.Http11Protocol"
               maxThreads="150" SSLEnabled="true" scheme="https" secure="true"
               clientAuth="false" sslProtocol="TLS"
               keystoreFile="/home/ec2-user/certs/gruber-keystore"
                keystorePass="changeit"
  />

Certificates are signed by a CA (certification authority), in which case we need to generate a certificate signing request (CSR). The keytool command can generate a CSR as mentioned below.

keytool -certreq -alias mysite.com -keystore mysite-keystore.jks -file mydomain.csr

This will generate the mydomain.csr file, which we need to send to the CA so it can be signed by them.

The CA provides certificates, which include the root certificate and the certificate chain. We can import those certificates using

keytool -import -trustcacerts -alias root -file Thawte.crt -keystore mydomain-keystore.jks

Portecle is a very good GUI tool for generating keystores and key pairs, examining certificates, and importing signed certificates.

More information on Portecle can be found here: http://portecle.sourceforge.net/








Monday, February 23, 2015

Spring Boot (RESTful WS, https and swagger)

RESTful web services provide HTTP endpoints. HTTP endpoints support methods such as POST, PUT, DELETE and GET, which can be mapped to operations like create, update, retrieve and delete. For example, we can have an HTTP GET endpoint to retrieve user information and an HTTP POST to create a customer; PUT can be used for updates and DELETE for deleting a customer.

Spring Boot provides a very simple way to create a RESTful web service with an embedded Tomcat or Jetty server. Below is the dependency to add for creating a RESTful web service using Spring Boot.

       <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>


We also need to add the parent element to the pom.xml file, as mentioned below.

   <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.1.10.RELEASE</version>
    </parent>


Now, a REST endpoint can be created by annotating a class with @RestController, as shown below.

@RestController
@RequestMapping("/users")

public class UserController {

 .....

}

@RestController exposes HTTP endpoints for each handler method defined in the UserController class. @RequestMapping maps the /users URL to those methods.

Methods can be declared as shown below to create an HTTP POST endpoint.

    @RequestMapping(method = RequestMethod.POST)
    public CreateUserResponse createUser(@RequestBody User user) throws Exception {
        validateUser(user);
        String token = userService.createUser(user);
        return new CreateUserResponse(token);
    }


Here, the method is annotated with the POST method type, and the request mapping (e.g. http://<host-name>:port/users) ensures that it calls the createUser function. @RequestBody unmarshals the JSON body into a Java object. For example, a JSON request like {"firstName":"steven","lastName":"sharma"} will populate the Java object described below. Marshalling/unmarshalling is provided by the Jackson API.

class User {
    private String firstName;
    private String lastName;
    // getter and setter

}

To accept parameters from the query string, @RequestParam is used, so the function below will be mapped to the URL http://<host-name>:port/users?firstName=myname

    @RequestMapping(method = RequestMethod.GET)
    public List<User> searchUser(@RequestParam("firstName") String firstName) throws Exception {
        return null;
    }


To accept a parameter as part of the URL path, e.g. the URL http://<host-name>:port/users/firstName/myname, @PathVariable is used:

  @RequestMapping(value = "/firstName/{name}",method = RequestMethod.GET)
    public List<User> searchUser(@PathVariable("name") String name) throws Exception {
       return userService.getUsersByFirstName(name);
    }
 



The main class of the application is defined below.

@ComponentScan("basepackage")
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class,args);
    }
}



mvn clean package
java -jar project.jar

The above commands build the project and start the embedded Tomcat server, exposing the REST endpoints discussed above.

Changing the server port:

Configuration parameters can be passed through a YAML file in the resources directory. To change the port of the Tomcat instance embedded in the Spring Boot project, add an application.yml file to the resources directory with the configuration parameters mentioned below; the REST endpoints will then be served on port 9191.

server:
  port: 9191


Enabling HTTPS:

Generate a self-signed certificate and private key using the command mentioned below.

keytool -genkey -alias aliasName -storetype PKCS12 -keyalg RSA -keysize 2048 -keystore myapp.p12 -validity 365

This will create the myapp.p12 file. Create a Spring bean in a configuration class as shown below.

 @Bean
    public EmbeddedServletContainerCustomizer containerCustomizer() throws FileNotFoundException {
        final String absoluteKeystoreFile = ResourceUtils.getFile(keyStore).getAbsolutePath();

        final TomcatConnectorCustomizer customizer = new GruberTomcatConnectionCustomizer(
                absoluteKeystoreFile, "changeit", "PKCS12", "gruber",tomcatPort);

        return new EmbeddedServletContainerCustomizer() {

            @Override
            public void customize(ConfigurableEmbeddedServletContainer container) {
                if(container instanceof TomcatEmbeddedServletContainerFactory) {
                    TomcatEmbeddedServletContainerFactory containerFactory = (TomcatEmbeddedServletContainerFactory) container;
                    containerFactory.addConnectorCustomizers(customizer);
                }
            };
        };
    }


Create the class mentioned below:

public class GruberTomcatConnectionCustomizer implements TomcatConnectorCustomizer {

        private String absoluteKeystoreFile;
        private String keystorePassword;
        private String keystoreType;
        private String keystoreAlias;
        private int tomcatPort;

        public GruberTomcatConnectionCustomizer(String absoluteKeystoreFile,
                String keystorePassword, String keystoreType, String keystoreAlias, int tomcatPort) {
            this.absoluteKeystoreFile = absoluteKeystoreFile;
            this.keystorePassword = keystorePassword;
            this.keystoreType = keystoreType;
            this.keystoreAlias = keystoreAlias.toLowerCase();
            this.tomcatPort = tomcatPort;

        }

        @Override
        public void customize(Connector connector) {
            connector.setPort(tomcatPort);
            connector.setSecure(true);
            connector.setScheme("https");

            connector.setAttribute("SSLEnabled", true);
            connector.setAttribute("sslProtocol", "TLS");
            connector.setAttribute("protocol", "org.apache.coyote.http11.Http11Protocol");
            connector.setAttribute("clientAuth", false);
            connector.setAttribute("keystoreFile", absoluteKeystoreFile);
            connector.setAttribute("keystoreType", keystoreType);
            connector.setAttribute("keystorePass", keystorePassword);
            connector.setAttribute("keystoreAlias", keystoreAlias);
            connector.setAttribute("keyPass", keystorePassword);
        }
 }


This configuration will start Tomcat on the secured (HTTPS) port.

Here is the configuration to be placed in the application.yml file:

tomcat-port: 7443

In case you want to run standalone Tomcat with HTTPS and a self-signed certificate, see:

https://looksok.wordpress.com/2014/11/16/configure-sslhttps-on-tomcat-with-self-signed-certificate/