Data base Management Systems, Lecture notes of Database Management Systems (DBMS)

These notes describe how data is stored in NoSQL databases and the file formats suited to each storage type.

Typology: Lecture notes

2024/2025

Uploaded on 11/29/2025

ododa-sijenyi24
Lecture 4: Types of NoSQL Databases
There are four kinds of NoSQL databases:
i). document databases, e.g. MongoDB, native XML databases
ii). key-value stores, e.g. Amazon DynamoDB, ScyllaDB
iii). column-oriented databases, e.g. Apache Cassandra
iv). graph databases.
The query language of a NoSQL database differs depending on its storage type, e.g. native XML
databases use XQuery, while the MongoDB shell uses JavaScript with JSON-like query documents.
Document Databases
This type of data model allows you to store information in a flexible, self-describing form. This
is in contrast to SQL databases, which rely on a fixed schema of tables, rows and columns, so
nested or irregular data has to be split across several tables and joined back together, which can
make queries less efficient. Since a document database does not enforce a fixed schema, there is
no need to map the data onto relational tables. There are also NoSQL databases that are
XML-specific (native XML databases), if you want to go that route.
The document-based database is a nonrelational database. Instead of storing the data in rows
and columns (tables), it stores the data as documents. A document database stores data in
JSON, BSON, or XML documents.
Documents can be stored and retrieved in a form that is much closer to the data objects used in
applications, which means less translation is required to use the data in those applications. In
a document database, particular fields can be indexed so that queries on them run faster.
A collection is a group of documents that have similar contents. The documents in a collection
are not required to share the same schema, because document databases have a flexible
schema.
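The ideas above (documents, collections, a flexible schema and indexed fields) can be illustrated with a small in-memory sketch. It is not the API of MongoDB or any other product: the collection is just a Python list of dictionaries, and the names `students`, `find` and `build_index` as well as the sample records are hypothetical.

```python
import json

# A "collection" modelled as a plain list of documents (Python dicts).
# The sample documents are hypothetical; they share a theme but not a schema.
students = [
    {"_id": 1, "name": "Asha", "courses": ["DBMS", "Networks"]},
    {"_id": 2, "name": "Juma", "email": "juma@example.com"},
    {"_id": 3, "name": "Neema", "address": {"city": "Dodoma"}, "courses": ["DBMS"]},
]

def find(collection, field, value):
    """Return documents whose field equals value or, for list fields, contains it."""
    matches = []
    for doc in collection:
        stored = doc.get(field)  # a missing field is skipped, not a schema error
        if stored == value or (isinstance(stored, list) and value in stored):
            matches.append(doc)
    return matches

def build_index(collection, field):
    """Map each value of a field to the _ids of the documents that hold it."""
    index = {}
    for doc in collection:
        stored = doc.get(field)
        values = stored if isinstance(stored, list) else [stored]
        for v in values:
            if v is not None:
                index.setdefault(v, []).append(doc["_id"])
    return index

dbms_students = find(students, "courses", "DBMS")
course_index = build_index(students, "courses")
print(json.dumps(dbms_students[0]))  # documents serialise directly to JSON
```

The second document has no `courses` field at all, yet it sits in the same collection without any schema change. Once the index is built, a lookup by course becomes a dictionary access instead of a scan over every document.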
Key features of a document database:
• Flexible schema: Documents in the database have a flexible schema. This means the
documents in the database need not all follow the same schema.
• Faster creation and maintenance: Creating documents is easy, and minimal
maintenance is required once a document has been created.

• No foreign keys: There is no dynamic relationship between two documents, so documents can be independent of one another. There is therefore no requirement for a foreign key in a document database.
• Open formats: Documents are built using open formats such as XML, JSON, and others.
Key-Value Stores
A key-value store is a nonrelational database and the simplest form of NoSQL database. Every data element in the database is stored as a key-value pair. The data can be retrieved by using the unique key allotted to each element in the database. The values can be simple data types like strings and numbers, or complex objects. A key-value store is like a relational table with only two columns: the key and the value.
Key features of the key-value store:
• Simplicity.
• Scalability.
• Speed.
Key-value databases are highly partitionable and allow horizontal scaling at scales that other types of databases cannot achieve. For example, Amazon DynamoDB allocates additional partitions to a table if an existing partition fills to capacity and more storage space is required.
NB: Horizontal vs vertical scaling: horizontal scaling is adding more database nodes (machines in a network), while vertical scaling is adding more resources, such as storage, to an existing database server (the same computer).
[Figure: an example of data stored as key-value pairs in DynamoDB.]
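The key-value model and the idea of partitioning can be sketched in a few lines. This is a toy illustration only, assuming a hash-modulo rule for choosing a partition; it is not how DynamoDB manages partitions internally, and the class name `PartitionedKVStore` and the sample keys are hypothetical.

```python
import hashlib

class PartitionedKVStore:
    """Toy key-value store that spreads keys over several partitions (nodes)."""

    def __init__(self, num_partitions):
        # Each partition is an independent dict standing in for one node.
        self.partitions = [{} for _ in range(num_partitions)]

    def _partition_for(self, key):
        # Hash the key so that the same key always maps to the same partition.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(self.partitions)

    def put(self, key, value):
        self.partitions[self._partition_for(key)][key] = value

    def get(self, key, default=None):
        # Only one partition is consulted, however many partitions exist.
        return self.partitions[self._partition_for(key)].get(key, default)

store = PartitionedKVStore(num_partitions=4)
store.put("user:1", "Anay")                               # simple value
store.put("user:2", {"name": "Bhagya", "theme": "dark"})  # complex value
print(store.get("user:2"))
```

Because every read and write touches exactly one partition, adding partitions adds capacity without slowing individual lookups, which is the essence of horizontal scaling. A plain modulo rule would move most keys whenever the partition count changes, so real systems use more careful placement schemes.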

Use cases

Session store: A session-oriented application, such as a web application, starts a session when a user logs in, and the session stays active until the user logs out or the session times out. During this period, the application stores all session-related data either in main memory or in a database. Session data may include user profile information, messages, personalized data and themes, recommendations, targeted promotions, and discounts. Each user session has a unique identifier. Session data is typically queried only by its primary key, so a fast key-value store is a good fit for session data. In general, key-value databases may provide smaller per-page overhead than relational databases.
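The session lifecycle described above (log in, look up by session identifier, log out or time out) can be sketched as a small in-memory key-value store with expiry. The class `SessionStore`, its method names and the 30-minute timeout are assumptions made for illustration, not part of any particular product. The clock is passed in so that the timeout can be demonstrated without waiting.

```python
import time
import uuid

class SessionStore:
    """Toy session store: session_id -> (expiry time, session data)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so that expiry can be simulated
        self._data = {}

    def login(self, session_data):
        session_id = uuid.uuid4().hex  # unique identifier for the session
        self._data[session_id] = (self.clock() + self.ttl, session_data)
        return session_id

    def get(self, session_id):
        entry = self._data.get(session_id)  # lookup by primary key only
        if entry is None:
            return None
        expires_at, session_data = entry
        if self.clock() >= expires_at:      # the session has timed out
            del self._data[session_id]
            return None
        return session_data

    def logout(self, session_id):
        self._data.pop(session_id, None)

# Usage with a simulated clock (hypothetical values).
now = [0.0]
sessions = SessionStore(ttl_seconds=1800, clock=lambda: now[0])
sid = sessions.login({"user": "anay", "theme": "dark"})
print(sessions.get(sid))
```

Every operation is a single lookup by key, with no joins and no secondary queries, which is why this workload suits a key-value store.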

When to use a columnar database:

  1. Queries that involve only a few columns.
  2. When compression is needed, but only column-wise.
  3. Clustering queries against a huge amount of data.

Advantages of a columnar database:

  1. Columnar databases can be used for different tasks; they receive the most attention in applications related to big data.
  2. Data in a columnar database is highly compressible, and the compression permits aggregate operations such as AVG, MIN and MAX to be applied efficiently.
  3. Efficiency and speed: Analytical queries run faster in columnar databases.
  4. Self-indexing: Another benefit of a column-based DBMS is self-indexing, which uses less disk space than a relational database management system containing the same data.

Limitations of a columnar database:

  1. For loading incremental data, traditional databases are more suitable than column-oriented databases.
  2. For online transaction processing (OLTP) applications, row-oriented databases are more appropriate than columnar databases.

Graph-Based Databases
Graph-based databases focus on the relationships between elements. They store the data in the form of nodes, and the connections between the nodes are called links or relationships.
Key features of a graph database:
• In a graph-based database, it is easy to identify the relationships between data items by following the links.
• Queries return results in real time.
• The speed depends upon the number of relationships among the database elements.
• Updating data is also easy, as adding a new node or edge to a graph database is a straightforward task that does not require significant schema changes.

When do we need a graph database?
  1. It solves many-to-many relationship problems. If we have friends of friends and the like, these are many-to-many relationships. A graph database is used when the equivalent query in a relational database would be very complex.
  2. When the relationships between data elements are more important than the elements themselves. For example, a profile holds some specific information, but the major selling point is the relationship between the different profiles, which is how you get connected within a network. In the same way, a graph database may hold many user data elements, but the relationships between them are the deciding factor.
  3. Low latency with large-scale data. When you add lots of relationships to a relational database, the data sets become huge, and queries on them become more complex and take longer than usual. A graph database, however, is designed specifically for this purpose, so relationships can be queried with ease.

Why do graph databases matter?
Because graphs are good at handling relationships, some databases store data in the form of a graph.

Example
We have a social network in which five friends are all connected. These friends are Anay, Bhagya, Chaitanya, Dilip, and Erica. A relational database that stores their personal information may look something like this:

id  first name  last name  email              phone
1   Anay        Agarwal    [email protected]  555-111-
2   Bhagya      Kumar      [email protected]  555-222-
3   Chaitanya   Nayak      [email protected]  555-333-5555
4   Dilip       Jain       [email protected]  555-444-
5   Erica       Emmanuel   [email protected]  555-555-

Now, we will also need another table to capture the friendships (relationships) between users. Our friendship table will look something like this:

user_id  friend_id

Now, let's analyse the time taken in this relational database approach. Each lookup takes approximately log(N) time, where N is the number of tuples in the friendship table (the number of relationships), because the database maintains the rows in order of id. So, in general, for M queries we have a time complexity of M*log(N). If we had instead used a graph database approach, the total time complexity would have been O(N), because once we have located a user in the database, it takes only a single step to find that user's friends. Here is how our query would be executed:
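The notes' own query is not reproduced here. As a rough sketch of the two approaches only, the code below contrasts a binary search over a friendship table kept sorted by id with a direct neighbour lookup in an adjacency structure. The friendship pairs are hypothetical, since the notes do not list the rows, and the function names are illustrative.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical friendships between user ids 1..5 (not taken from the notes).
pairs = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5), (4, 5)]

# Relational style: one row per direction, kept sorted by user_id.
friendship_rows = sorted(pairs + [(b, a) for a, b in pairs])

def friends_by_index(sorted_rows, user_id):
    """Binary-search the first row for user_id, then read that user's rows."""
    i = bisect_left(sorted_rows, (user_id,))  # O(log N) search step
    friends = []
    while i < len(sorted_rows) and sorted_rows[i][0] == user_id:
        friends.append(sorted_rows[i][1])
        i += 1
    return friends

# Graph style: every node holds direct references to its neighbours.
def build_adjacency(edge_pairs):
    adjacency = defaultdict(set)
    for a, b in edge_pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def friends_of_friends(adjacency, user_id):
    """Follow links two steps out; each step is a direct neighbour lookup."""
    direct = adjacency.get(user_id, set())
    reached = set()
    for friend in direct:
        reached |= adjacency.get(friend, set())
    return reached - direct - {user_id}

graph = build_adjacency(pairs)
print(friends_by_index(friendship_rows, 3))
print(sorted(friends_of_friends(graph, 1)))
```

In the table version, every hop of a friends-of-friends query repeats the search over the whole table. In the graph version, each hop starts from a node that already holds its own links.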