MongoDB JSON Description and working, Lecture notes of Advanced Data Analysis

MongoDB Uses working and example

Typology: Lecture notes

2018/2019

Uploaded on 07/29/2019

Ramyaarulraj
Unit-III

MongoDB

Introduction to MongoDB: What is MongoDB, Why MongoDB, Datatypes, MongoDB Query Language.

What is MongoDB?

MongoDB is a
- Cross-platform
- Open source
- Non-relational
- Distributed
database.

MongoDB is a document-oriented NoSQL database used for high-volume data storage. It came to prominence in the late 2000s (its first public release was in 2009). Its features include auto-sharding, replication, a rich query language, and fast in-place updates.

MongoDB is a document database: each database contains collections, which in turn contain documents. Documents in the same collection can differ from one another in their number of fields, their size, and their content.

Why MongoDB?

  1. The document structure is more in line with how developers construct classes and objects in their programming languages. Developers will often say that their classes are not rows and columns but have a clear structure with key-value pairs.
  2. As with NoSQL databases in general, the rows (called documents in MongoDB) do not need a schema defined in advance.
  3. The data model available within MongoDB makes it easier to represent hierarchical relationships, arrays, and other more complex structures.
  4. Scalability – MongoDB environments scale horizontally by adding more servers.

Features of MongoDB

  1. Document-oriented – Because MongoDB is a NoSQL database, it stores data in documents rather than in a relational format. This makes MongoDB flexible and adaptable to real business situations and requirements.
  2. Ad hoc queries – MongoDB supports searching by field, range queries, and regular expression searches. Queries can be made to return specific fields within documents.
  3. Indexing – Indexes can be created to improve the performance of searches within MongoDB. Any field in a MongoDB document can be indexed.
  4. Replication – MongoDB can provide high availability with replica sets. A replica set consists of two or more MongoDB instances. Each member may act as the primary or as a secondary at any time. The primary is the main server: it interacts with the client and, by default, handles all read and write operations. The secondaries maintain a copy of the primary's data using built-in replication. When the primary fails, the replica set automatically elects one of the secondaries, which then becomes the new primary.
  5. Load balancing – MongoDB uses sharding to scale horizontally by splitting data across multiple MongoDB instances. MongoDB can run over multiple servers, balancing the load and/or duplicating data to keep the system up and running in case of hardware failure.

Common terms used in MongoDB

1. _id – This field is required in every MongoDB document. It holds a value that is unique within the collection, so it acts like the document's primary key. If you create a new document without an _id field, MongoDB creates the field automatically. For example, for the customer table below, MongoDB adds a unique 24-character hexadecimal identifier (an ObjectId) to each document in the collection.

_id                       CustomerID  CustomerName
563479cc8a8a4246bd27d784  11          Guru
563479cc7a8a4246bd47d784  22          Trevor Smith
563479cc9a8a4246bd57d784  33          Nicole
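To make the shape of such an identifier concrete, here is a minimal sketch that builds a 24-character hexadecimal string from a 4-byte timestamp, 5 random bytes and a 3-byte counter, which is the layout ObjectId values follow. The helper names (`toHex`, `makeObjectIdLike`) are made up for this illustration; in practice the server or driver generates the `_id`, so this is not how you would normally obtain one.

```javascript
// Illustrative sketch only: builds a string shaped like an ObjectId.
// Real _id values are generated by the MongoDB server or driver.
let counter = Math.floor(Math.random() * 0xffffff); // 3-byte counter, random start

function toHex(value, bytes) {
  // Left-pad with zeros so each byte contributes exactly two hex digits.
  return value.toString(16).padStart(bytes * 2, "0");
}

function makeObjectIdLike(now = Date.now()) {
  const seconds = Math.floor(now / 1000); // 4-byte timestamp in seconds
  let random = "";
  for (let i = 0; i < 5; i++) {
    random += toHex(Math.floor(Math.random() * 256), 1); // 5 random bytes
  }
  counter = (counter + 1) % 0x1000000; // wrap around after 3 bytes
  return toHex(seconds, 4) + random + toHex(counter, 3);
}

console.log(makeObjectIdLike()); // a 24-character hexadecimal string
```

Because the first eight hex digits encode the creation time in seconds, identifiers created later tend to sort after earlier ones.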

2. Database

A database is a container for collections. Each database gets its own set of files on the file system, and a single MongoDB server can store multiple databases. A database is created the first time data is stored in it, for example when its first collection is created.

single primary and several secondaries. Each write request from the client is directed to the primary. The primary logs all write requests in its oplog (operations log), and the secondary members then use the oplog to synchronize their data. This way there is strict adherence to consistency. Clients usually read from the primary; however, a client can specify a read preference that directs read operations to a secondary.
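The oplog mechanism described above can be sketched in memory as follows. This is a toy model under simplifying assumptions, not MongoDB's implementation: the `Member` and `Primary` classes, the `sync` function and the entry format (`op`, `doc`) are all invented for the illustration.

```javascript
// In-memory sketch of oplog-based replication (not MongoDB's implementation).
class Member {
  constructor() {
    this.data = new Map(); // _id -> document
    this.applied = 0;      // number of oplog entries already applied
  }
  apply(entry) {
    if (entry.op === "insert") this.data.set(entry.doc._id, { ...entry.doc });
    if (entry.op === "delete") this.data.delete(entry._id);
    this.applied += 1;
  }
}

class Primary extends Member {
  constructor() {
    super();
    this.oplog = []; // ordered log of every write request
  }
  write(entry) {
    this.oplog.push(entry); // record the write in the oplog
    this.apply(entry);      // and apply it to the primary's own data
  }
}

// A secondary catches up by replaying the entries it has not applied yet.
function sync(secondary, primary) {
  for (const entry of primary.oplog.slice(secondary.applied)) {
    secondary.apply(entry);
  }
}

const primary = new Primary();
const secondary = new Member();
primary.write({ op: "insert", doc: { _id: 1, name: "raja" } });
primary.write({ op: "insert", doc: { _id: 2, name: "ram" } });
primary.write({ op: "delete", _id: 1 });
sync(secondary, primary);
console.log([...secondary.data.values()]); // [ { _id: 2, name: 'ram' } ]
```

Replaying the same ordered log on every secondary is what lets each copy converge to the primary's state.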

[Figure: The process of replication in MongoDB. Labels: client application, writes, reads, primary, secondaries, replication.]

8. Sharding

[Figure: The process of sharding in MongoDB. A 1 TB logical database (Collection 1) is split across Shard 1, Shard 2, Shard 3 and Shard 4, each holding 256 GB.]

Sharding is akin to horizontal scaling. It means that the large dataset is divided and distributed over multiple servers or shards. Each shard is an independent database and collectively they would constitute a logical database.

The prime advantages of sharding are as follows:

  1. Sharding reduces the amount of data that each shard needs to store and manage. For example, if the dataset is 1 TB in size and is distributed over four shards, each shard houses just 256 GB of data. As the cluster grows, the amount of data that each shard stores and manages decreases.
  2. Sharding reduces the number of operations that each shard handles. For example, when data is inserted, the application needs to access only the shard that houses that data.
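The arithmetic behind both advantages can be checked with a short sketch. It assumes an even split of the data, and the modulo routing rule in `shardFor` is a hypothetical stand-in chosen for simplicity; MongoDB's actual distribution of data across shards is more involved.

```javascript
// Sketch only: even split of a dataset across shards, plus a toy router.
function dataPerShardGB(totalGB, shardCount) {
  return totalGB / shardCount;
}

// Hypothetical routing rule: choose a shard from a numeric shard key.
function shardFor(shardKey, shardCount) {
  return (shardKey % shardCount) + 1; // shards are numbered 1..shardCount
}

console.log(dataPerShardGB(1024, 4)); // 256 (1 TB over four shards)
console.log(dataPerShardGB(1024, 8)); // 128 (a larger cluster stores less per shard)
console.log(shardFor(101, 4));        // 2   (only this shard handles the insert)
```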

JSON (JavaScript Object Notation)

JSON stands for JavaScript Object Notation. It is a human-readable, plain-text format for expressing structured data, and it is supported in many programming languages.

JSON is extremely expressive. MongoDB does not actually store JSON but BSON (Binary JSON), an open standard used to store complex data structures. BSON extends the JSON model with additional data types. JSON and BSON are very similar, but BSON is designed to be faster to process. Where a relational database stores data in tables as rows and columns, JSON uses objects and arrays: an object is a set of key-value pairs and an array is a list of values. Objects and arrays can be nested recursively.
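As a small illustration of objects, arrays and nesting, the sketch below builds one document as a JavaScript object, serializes it to JSON text and parses it back. The `hobbies` and `address` fields are invented for the example.

```javascript
// A document as a JavaScript object: key-value pairs, an array, a nested object.
const student = {
  rollno: 101,
  name: "raja",
  hobbies: ["chess", "skating"], // array: a list of values
  address: { city: "Newark" }    // object nested inside an object
};

const text = JSON.stringify(student); // object -> JSON text
const copy = JSON.parse(text);        // JSON text -> object

console.log(text);
// {"rollno":101,"name":"raja","hobbies":["chess","skating"],"address":{"city":"Newark"}}
console.log(copy.address.city); // Newark
```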

Basic Constructs of JSON

connection.

This data can be retrieved later. By its very nature, JSON is well suited to storing or representing semi-structured data.

Documents have the following restrictions on field names:

  • The field name _id is reserved for use as a primary key; its value must be unique in the collection, is immutable, and may be of any type other than an array.
  • The field names cannot start with the dollar sign ($) character.
  • The field names cannot contain the dot (.) character.
  • The field names cannot contain the null character.
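The last three restrictions can be expressed as a small check. This is only a sketch of the rules as listed above; `fieldNameProblem` is a hypothetical helper, not a MongoDB or driver function.

```javascript
// Returns a description of the first rule a field name breaks, or null if it is fine.
function fieldNameProblem(name) {
  if (name.startsWith("$")) return "starts with $";
  if (name.includes(".")) return "contains .";
  if (name.includes("\0")) return "contains the null character";
  return null;
}

console.log(fieldNameProblem("rollno")); // null
console.log(fieldNameProblem("$grade")); // starts with $
console.log(fieldNameProblem("size.h")); // contains .
```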

BSON documents may have more than one field with the same name. Most MongoDB interfaces, however, represent MongoDB with a structure (e.g. a hash table) that does not support duplicate field names. If you need to manipulate documents that have more than one field with the same name, see the driver documentation for your driver.

Some documents created by internal MongoDB processes may have duplicate fields, but no MongoDB process will ever add duplicate fields to an existing user document.

Terms used in RDBMS and MongoDB

RDBMS         MongoDB
Database      Database
Table         Collection
Record        Document
Columns       Fields / key-value pairs
Index         Index
Joins         Embedded documents
Primary key   Primary key (_id)

To create a database in MongoDB

The syntax for creating a database is

use DATABASE_NAME

To create a database named "myDB", the command is

use myDB

To confirm the current database

db;

To get a list of all databases, type

show dbs

admin (empty)

local 0.078GB

test 0.078GB

A newly created database appears in this list only after at least one document has been stored in it.

To drop a database in MongoDB

Update-> Update to data is accomplished using the update() method with UPSERT set to false.

Delete-> A document is deleted using the remove() method.

1. To create an inventory database

use inventory;

The use command switches to the inventory database. A collection named inventory can then be created explicitly with

db.createCollection("inventory")

{ "ok" : 1 }

To list the collections in the current database use the following command

show collections;

inventory

person

food

system.indexes

system.js

To drop the collection named food

db.food.drop();

It drops the collection food.

2. Insert method

To insert documents into the student collection, first switch to the database that will hold it (here also named student)

use student;

Each document in the student collection contains the following fields

rollno

name

grade

phno (phone number)

INSERT RECORD

db.student.insert({_id:1,rollno:101,name:"raja",grade:100,phno:9876543210});

One document gets inserted.

To check if the document for the student “raja” has been successfully inserted use the following command.

db.student.find();

To format the result use the following command

db.student.find().pretty();

To insert another document into the collection use the following command

db.student.insert({_id:2,rollno:102,name:"ram",grade:90,phno:9876543211});

Similarly insert other documents into the collection

db.student.insert({_id:3,rollno:103,name:"raji",grade:80,phno:9876543212});

db.student.insert({_id:4,rollno:104,name:"rasi",grade:70,phno:9876543213});

db.student.insert({_id:5,rollno:105,name:"raki",grade:60,phno:9876543214});

The collection now contains 5 documents.

To insert many documents (records) into the collection

db.inventory.insertMany([
{ item: "journal", qty: 25, size: { h: 14, w: 21, uom: "cm" }, status: "A" },
{ item: "notebook", qty: 50, size: { h: 8.5, w: 11, uom: "in" }, status: "A" },
{ item: "paper", qty: 100, size: { h: 8.5, w: 11, uom: "in" }, status: "D" },
{ item: "planner", qty: 75, size: { h: 22.85, w: 30, uom: "cm" }, status: "D" },
{ item: "postcard", qty: 45, size: { h: 10, w: 15.25, uom: "cm" }, status: "A" }
]);

It adds the field Location: "Newark" to the document.

Removing a field from an existing document with the update method and the $unset operator

Note that remove() deletes whole documents, not fields. For example, the following deletes every document whose Age is greater than 18

db.student.remove({Age:{$gt:18}});

To remove a single field from a document, use $unset

db.student.update({_id:4},{$unset:{Location:""}});

It removes the Location field from the document with _id 4. The value given to $unset is ignored.
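What adding and removing a field does to a single document can be sketched in memory as follows. `applyUpdate` is a hypothetical helper written for this illustration; it handles only top-level fields and is not the shell's update() method.

```javascript
// In-memory sketch of the effect of $set and $unset on one document.
function applyUpdate(doc, update) {
  const result = { ...doc }; // shallow copy, the input document is left untouched
  for (const [field, value] of Object.entries(update.$set ?? {})) {
    result[field] = value; // add the field or overwrite its value
  }
  for (const field of Object.keys(update.$unset ?? {})) {
    delete result[field]; // remove the field; the value given to $unset is ignored
  }
  return result;
}

const rasi = { _id: 4, rollno: 104, name: "rasi", grade: 70 };
const withLocation = applyUpdate(rasi, { $set: { Location: "Newark" } });
const withoutLocation = applyUpdate(withLocation, { $unset: { Location: "" } });

console.log(withLocation.Location);        // Newark
console.log("Location" in withoutLocation); // false
```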

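3. SAVE() Method

db.student.save({name:"rakshitha",grade:55});

save() inserts the document when it has no _id, or when no document with that _id exists; otherwise it replaces the existing document. Here the document has no _id, so it is inserted into the student collection.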

4. FIND method – finding documents based on search criteria

To find the document using the name use the following command

db.student.find({name:"raki"});

It lists the document whose name is raki.

To find the document using the _id

db.student.find({_id:1}).pretty();

It lists the document whose _id is 1.

To display the _id, name and rollno

db.student.find({_id:1},{rollno:1,name:1}).pretty();

To display the rollno and name but not the _id value, use the command

db.student.find({_id:1},{rollno:1,name:1,_id:0}).pretty();
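The effect of the second argument to find() can be sketched in memory like this. `project` is a hypothetical helper that covers only inclusion projections (fields set to 1, with _id optionally set to 0); it is not part of the shell.

```javascript
// Sketch of an inclusion projection: keep the listed fields, and keep _id unless excluded.
function project(doc, projection) {
  const out = {};
  if (projection._id !== 0 && "_id" in doc) out._id = doc._id;
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) out[field] = doc[field];
  }
  return out;
}

const raja = { _id: 1, rollno: 101, name: "raja", grade: 100, phno: 9876543210 };

console.log(project(raja, { rollno: 1, name: 1 }));
// { _id: 1, rollno: 101, name: 'raja' }
console.log(project(raja, { rollno: 1, name: 1, _id: 0 }));
// { rollno: 101, name: 'raja' }
```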

Relational operators

$eq – equal to

$ne – not equal

$gte – greater than or equal to

$lte – less than or equal to

$gt – greater than

$lt – less than

To display the student document with the grade greater than 70

db.student.find({grade:{$gt:70}});

To display the student document with the grade not equal to 70 use the command

db.student.find({grade:{$ne:70}});

To list the student document with the grade less than or equal to 35 use

db.student.find({grade:{$lte:35}});
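The meaning of the relational operators can be checked against the five student documents inserted earlier with a small in-memory sketch. The `comparators` table and the `findByGrade` helper are invented for the illustration and only mimic the operators on plain numbers.

```javascript
// Each operator maps to an ordinary comparison.
const comparators = {
  $eq: (a, b) => a === b,
  $ne: (a, b) => a !== b,
  $gt: (a, b) => a > b,
  $gte: (a, b) => a >= b,
  $lt: (a, b) => a < b,
  $lte: (a, b) => a <= b
};

// A condition such as { $gt: 70 } holds when every operator in it holds.
function matchesCondition(value, condition) {
  return Object.entries(condition).every(([op, operand]) => comparators[op](value, operand));
}

const students = [
  { _id: 1, name: "raja", grade: 100 },
  { _id: 2, name: "ram", grade: 90 },
  { _id: 3, name: "raji", grade: 80 },
  { _id: 4, name: "rasi", grade: 70 },
  { _id: 5, name: "raki", grade: 60 }
];

function findByGrade(condition) {
  return students.filter((s) => matchesCondition(s.grade, condition)).map((s) => s.name);
}

console.log(findByGrade({ $gt: 70 }));  // [ 'raja', 'ram', 'raji' ]
console.log(findByGrade({ $ne: 70 }));  // [ 'raja', 'ram', 'raji', 'raki' ]
console.log(findByGrade({ $lte: 35 })); // []
```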

IN and NOT IN operators

To find those documents in the student collection where the hobby is either "chess" or "skating", use the command

db.student.find({Hobby:{$in:["chess","skating"]}}).pretty();

To find those documents in the student collection where the hobby is neither "chess" nor "skating"

db.student.find({Hobby:{$nin:["chess","skating"]}}).pretty();

The above command lists the documents whose Hobby is neither chess nor skating. It also returns documents that have no Hobby field at all. String matching is case-sensitive, so "Chess" and "chess" are different values.
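A minimal in-memory sketch of the two operators on a single-valued field follows. The Hobby values and the helpers `matchesIn` and `matchesNin` are made up for the illustration; the earlier insert commands in these notes do not set a Hobby field.

```javascript
// $in: the field value is one of the listed values.
function matchesIn(value, list) {
  return list.includes(value);
}

// $nin: the field value is none of the listed values (a missing field also qualifies).
function matchesNin(value, list) {
  return !list.includes(value);
}

const students = [
  { name: "raja", Hobby: "chess" },
  { name: "ram", Hobby: "skating" },
  { name: "raji", Hobby: "reading" },
  { name: "rasi" } // no Hobby field
];

const wanted = ["chess", "skating"];
const inList = students.filter((s) => matchesIn(s.Hobby, wanted)).map((s) => s.name);
const notInList = students.filter((s) => matchesNin(s.Hobby, wanted)).map((s) => s.name);

console.log(inList);    // [ 'raja', 'ram' ]
console.log(notInList); // [ 'raji', 'rasi' ]
```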

To sort the student collection in ascending order of the student name

db.student.find().sort({name:1}).pretty();

To sort the student collection in descending order of the student name

db.student.find().sort({name:-1}).pretty();

To skip the first 2 documents in the collection

db.student.find().skip(2).pretty();

To display the last 2 records from the student collection

db.student.find().skip(db.student.count()-2).pretty();
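The same sorting and skipping logic can be tried on a plain array. This is a sketch that assumes the documents come back in insertion order when no sort is given; it uses ordinary array methods rather than cursor methods.

```javascript
// The five student names in the order they were inserted.
const students = [
  { name: "raja" }, { name: "ram" }, { name: "raji" }, { name: "rasi" }, { name: "raki" }
];
const byName = (a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0);
const namesOf = (docs) => docs.map((d) => d.name);

const ascending = namesOf([...students].sort(byName));                    // like sort({name:1})
const descending = namesOf([...students].sort((a, b) => byName(b, a)));   // like sort({name:-1})
const afterSkip = namesOf(students.slice(2));                             // like skip(2)
const lastTwo = namesOf(students.slice(students.length - 2));             // like skip(count - 2)

console.log(ascending);  // [ 'raja', 'raji', 'raki', 'ram', 'rasi' ]
console.log(descending); // [ 'rasi', 'ram', 'raki', 'raji', 'raja' ]
console.log(afterSkip);  // [ 'raji', 'rasi', 'raki' ]
console.log(lastTwo);    // [ 'rasi', 'raki' ]
```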

7. Arrays

To create a collection named "inventoryarray" and insert documents into it, where each document contains a "tags" array. The use command first switches to a database, here also named inventoryarray.

use inventoryarray;

db.inventoryarray.insertMany([

{ item: "journal", qty: 25, tags: ["black", "red"], dim_cm: [ 14, 21 ] },

{ item: "notebook", qty: 50, tags: ["red", "black"], dim_cm: [ 14, 21 ] },

{ item: "paper", qty: 100, tags: ["red", "black", "plain"], dim_cm: [ 14, 21 ] },

{ item: "planner", qty: 75, tags: ["black", "red"], dim_cm: [ 22.85, 30 ] },

{ item: "postcard", qty: 45, tags: ["blue"], dim_cm: [ 10, 15.25 ] }

]);

To list the documents in the "inventoryarray" collection whose "tags" array is exactly ["red", "black"], with the elements in that order

db.inventoryarray.find( { tags: ["red", "black"] } );

To find the documents whose "tags" array contains "red"

db.inventoryarray.find( { tags: "red" } );

All documents where array dim_cm contains at least one element whose value is greater than 25.

db.inventoryarray.find( { dim_cm: { $gt: 25 } } );

All documents where the dim_cm array contains at least one element that is both greater than ($gt) 22 and less than ($lt) 30:

db.inventoryarray.find( { dim_cm: { $elemMatch: { $gt: 22, $lt: 30 } } } );

Query an Array by Array Length: using the $size operator, the following selects documents where the array tags has 3 elements.

db.inventoryarray.find( { "tags": { $size: 3 } } );

Query for an Element by the Array Index Position

Using dot notation, you can specify query conditions for an element at a particular index or position of the array. The array uses zero-based indexing. When querying using dot notation, the field and nested field must be inside quotation marks.

All documents where the second element in the array dim_cm is greater than 25:

db.inventoryarray.find( { "dim_cm.1": { $gt: 25 } } );
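The differences between these array queries are easy to mix up, so here is an in-memory sketch that applies the equivalent test to the five inventoryarray documents with ordinary array methods. It only mirrors the conditions described above; it does not run against MongoDB.

```javascript
const items = [
  { item: "journal", tags: ["black", "red"], dim_cm: [14, 21] },
  { item: "notebook", tags: ["red", "black"], dim_cm: [14, 21] },
  { item: "paper", tags: ["red", "black", "plain"], dim_cm: [14, 21] },
  { item: "planner", tags: ["black", "red"], dim_cm: [22.85, 30] },
  { item: "postcard", tags: ["blue"], dim_cm: [10, 15.25] }
];
const select = (test) => items.filter(test).map((d) => d.item);

// { tags: ["red", "black"] }: same elements in the same order
const exact = select((d) => JSON.stringify(d.tags) === JSON.stringify(["red", "black"]));
// { tags: "red" }: the array contains the value
const hasRed = select((d) => d.tags.includes("red"));
// { dim_cm: { $gt: 25 } }: at least one element is greater than 25
const anyOver25 = select((d) => d.dim_cm.some((x) => x > 25));
// $elemMatch: one single element satisfies both conditions
const between = select((d) => d.dim_cm.some((x) => x > 22 && x < 30));
// $size: 3
const threeTags = select((d) => d.tags.length === 3);
// "dim_cm.1": the second element (index 1) is greater than 25
const secondOver25 = select((d) => d.dim_cm[1] > 25);

console.log(exact);        // [ 'notebook' ]
console.log(hasRed);       // [ 'journal', 'notebook', 'paper', 'planner' ]
console.log(anyOver25);    // [ 'planner' ]
console.log(between);      // [ 'planner' ]
console.log(threeTags);    // [ 'paper' ]
console.log(secondOver25); // [ 'planner' ]
```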

8. Using the AND and OR operators

It finds the documents in the collection whose status is A or D

db.inventory.find( { status: { $in: [ "A", "D" ] } } );

It finds the documents with status A and qty less than 30

db.inventory.find( { status: "A", qty: { $lt: 30 } } );

It finds the documents with status A or qty less than 30

db.inventory.find( { $or: [ { status: "A" }, { qty: { $lt: 30 } } ] } );
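To see which documents each of the three queries selects, the sketch below applies the same conditions in memory to the five inventory documents inserted earlier (only the item, qty and status fields are reproduced). It uses plain array filtering, not the query engine.

```javascript
const inventory = [
  { item: "journal", qty: 25, status: "A" },
  { item: "notebook", qty: 50, status: "A" },
  { item: "paper", qty: 100, status: "D" },
  { item: "planner", qty: 75, status: "D" },
  { item: "postcard", qty: 45, status: "A" }
];
const select = (test) => inventory.filter(test).map((d) => d.item);

// { status: { $in: ["A", "D"] } }
const statusAorD = select((d) => ["A", "D"].includes(d.status));
// { status: "A", qty: { $lt: 30 } }: conditions in one query document are ANDed
const aAndUnder30 = select((d) => d.status === "A" && d.qty < 30);
// { $or: [ { status: "A" }, { qty: { $lt: 30 } } ] }
const aOrUnder30 = select((d) => d.status === "A" || d.qty < 30);

console.log(statusAorD);  // [ 'journal', 'notebook', 'paper', 'planner', 'postcard' ]
console.log(aAndUnder30); // [ 'journal' ]
console.log(aOrUnder30);  // [ 'journal', 'notebook', 'postcard' ]
```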