Data Aggregation Pipeline in MongoDB, Thesis of Accounting

A detailed assignment on using MongoDB to perform data aggregation tasks. The assignment involves creating a database named 'companies' by importing data from a 'companies.json' file, executing MongoDB queries to retrieve and analyze company information, and designing and implementing a MongoDB aggregation pipeline that shows the total number of offices by state for all companies with offices in the United States. It covers data loading, querying, and aggregation, which are essential skills for students studying data engineering, data analysis, and database management. The assignment requires the 'mongoimport' tool, 'find' queries with 'limit', and the aggregation pipeline. By completing it, students can demonstrate their understanding of MongoDB's data processing capabilities and their ability to apply them to real-world data analysis tasks.

Typology: Thesis

2024/2025

Available from 10/16/2024

helperatsof-1

Southern New Hampshire University
8-1 Assignment: Data Aggregation Pipeline
CS 340

  1. Using the mongoimport tool, create the database “companies” by loading the documents found in the “companies.json” file into the “research” collection. This file is located in the “/usr/local/datasets/” directory in Apporto. Verify your load by issuing the following queries:
     a. db.research.find({"name" : "AdventNet"})
     b. db.research.find({"founded_year" : 1996},{"name" : 1}).limit(10)
  2. Perform the following tasks using MongoDB queries:
     a. List only the first 20 names of companies founded after the year 2010, ordered alphabetically.
     b. List only the first 20 names of companies with offices in either California or Texas, ordered by number of employees from largest to smallest.
     Provide screenshots of your statements and the results as evidence.
  3. Design and implement a MongoDB aggregation pipeline to show the total number of offices by state for all companies that have offices in the United States. Be sure that you account for the fact that some companies have offices in several states. Explain your aggregation
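The import and verification task can be sketched as follows. This is a minimal sketch, not the expected submission: it assumes mongoimport is run from the operating-system shell with only the database, collection, and file flags. The Apporto environment may also require host, port, or credential flags, which the assignment text does not state, so they are left out. The variable names are hypothetical.

```javascript
// The import itself is a shell command, not a mongosh statement.
// Assumption: no extra connection or authentication flags are needed.
const importArgs = [
  "mongoimport",
  "--db", "companies",
  "--collection", "research",
  "--file", "/usr/local/datasets/companies.json",
];
const importCommand = importArgs.join(" ");

// The two verification queries from the assignment, kept as plain objects.
const checkByName = { filter: { name: "AdventNet" } };
const checkFounded1996 = {
  filter: { founded_year: 1996 },
  projection: { name: 1 },
  limit: 10,
};

if (typeof db !== "undefined") {
  // Inside mongosh: run the checks against the imported collection.
  const research = db.getSiblingDB("companies").research;
  printjson(research.find(checkByName.filter).toArray());
  printjson(
    research
      .find(checkFounded1996.filter, checkFounded1996.projection)
      .limit(checkFounded1996.limit)
      .toArray()
  );
} else {
  // Outside mongosh: just show the command to run in the OS shell.
  console.log(importCommand);
}
```

If the first query returns the AdventNet document and the second returns up to ten names, the load most likely succeeded.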
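The two find tasks (companies founded after 2010, and companies with offices in California or Texas) might look like the sketch below. The field names `number_of_employees` and `offices.state_code`, and the state codes "CA" and "TX", are assumptions about the dataset; only `name` and `founded_year` appear in the assignment text. Check a sample document before relying on them.

```javascript
// Assumed field names (not given in the assignment text):
//   number_of_employees, offices.state_code
const foundedAfter2010 = {
  filter: { founded_year: { $gt: 2010 } },   // strictly after 2010
  projection: { name: 1, _id: 0 },           // names only
  sort: { name: 1 },                         // alphabetical, ascending
  limit: 20,
};

const caOrTxOffices = {
  // Matches a company if any element of its offices array is in CA or TX.
  filter: { "offices.state_code": { $in: ["CA", "TX"] } },
  projection: { name: 1, _id: 0 },
  sort: { number_of_employees: -1 },         // largest first
  limit: 20,
};

if (typeof db !== "undefined") {
  // Inside mongosh: sort before limit so the first 20 are the right 20.
  const research = db.getSiblingDB("companies").research;
  printjson(
    research
      .find(foundedAfter2010.filter, foundedAfter2010.projection)
      .sort(foundedAfter2010.sort)
      .limit(foundedAfter2010.limit)
      .toArray()
  );
  printjson(
    research
      .find(caOrTxOffices.filter, caOrTxOffices.projection)
      .sort(caOrTxOffices.sort)
      .limit(caOrTxOffices.limit)
      .toArray()
  );
}
```

Without a collation, the sort on `name` uses simple binary comparison, so uppercase names sort before lowercase ones.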
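One possible shape for the offices-by-state aggregation is sketched below. It assumes each company document has an `offices` array whose elements carry `state_code` and `country_code`, and that United States offices are marked "USA". None of these names or values is confirmed by the assignment text. Unwinding the array yields one document per office, so a company with offices in several states contributes to each state's total. The plain-JavaScript function mirrors the same logic on hypothetical sample documents, purely as a cross-check.

```javascript
// Assumed fields: offices[].state_code, offices[].country_code ("USA").
const officesByStatePipeline = [
  { $match: { "offices.country_code": "USA" } },  // drop companies with no US office
  { $unwind: "$offices" },                        // one document per office
  { $match: { "offices.country_code": "USA" } },  // drop non-US offices of kept companies
  { $group: { _id: "$offices.state_code", total_offices: { $sum: 1 } } },
  { $sort: { total_offices: -1, _id: 1 } },
];

// Plain-JavaScript mirror of the unwind / match / group / sort steps.
function countOfficesByState(companies) {
  const totals = new Map();
  for (const company of companies) {
    for (const office of company.offices || []) {
      if (office.country_code !== "USA") continue;
      totals.set(office.state_code, (totals.get(office.state_code) || 0) + 1);
    }
  }
  return [...totals]
    .map(([state, total]) => ({ _id: state, total_offices: total }))
    .sort(
      (a, b) =>
        b.total_offices - a.total_offices ||
        String(a._id).localeCompare(String(b._id))
    );
}

// Hypothetical sample documents, not taken from companies.json.
const sampleCompanies = [
  { name: "A", offices: [
    { state_code: "CA", country_code: "USA" },
    { state_code: "TX", country_code: "USA" },
  ] },
  { name: "B", offices: [
    { state_code: "CA", country_code: "USA" },
    { state_code: null, country_code: "GBR" },
  ] },
];
const sampleTotals = countOfficesByState(sampleCompanies);

if (typeof db !== "undefined") {
  // Inside mongosh: run the pipeline against the imported collection.
  printjson(
    db.getSiblingDB("companies").research.aggregate(officesByStatePipeline).toArray()
  );
} else {
  console.log(JSON.stringify(sampleTotals));
}
```

The first `$match` is optional and only reduces the number of documents that reach `$unwind`. The second is required, because a company with one US office may also have offices abroad. US offices with a missing state code would be grouped under a null `_id`.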