University Information Systems Program Course D270: Data Management & Analytics, Exams of Database Management Systems (DBMS)

University Information Systems Program Course D270: Data Management & Analytics

Typology: Exams

2025/2026

Available from 02/09/2026

GerryBlair
GerryBlair 🇺🇸

1

(1)

929 documents

1 / 24

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
University Information Systems Program
Course D270: Data Management & Analytics
Section 1: Database Fundamentals & Data Modeling (Questions 1-25)
1. What is the primary purpose of a Database Management System (DBMS)?
A) To create spreadsheets
B) To store, retrieve, and manage data efficiently
C) To design user interfaces
D) To write business reports
Answer: B
Explanation: A DBMS is software that enables efficient storage, retrieval, updating, and
management of data in databases, providing data integrity, security, and concurrent access.
2. Which data model represents data as tables with rows and columns?
A) Hierarchical model
B) Network model
C) Relational model
D) Object-oriented model
Answer: C
Explanation: The relational model, introduced by E.F. Codd, organizes data into tables
(relations) with rows (tuples) and columns (attributes).
3. In the three-schema architecture, the conceptual schema describes:
A) Physical storage structures
B) User views of the database
C) The logical structure of the entire database
D) External applications
Answer: C
Explanation: Three-schema architecture: Internal (physical storage), Conceptual (logical
structure), External (user views). The conceptual schema provides a community view of the
database.
4. An entity in an ER diagram represents:
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18

Partial preview of the text

Download University Information Systems Program Course D270: Data Management & Analytics and more Exams Database Management Systems (DBMS) in PDF only on Docsity!

University Information Systems Program

Course D270: Data Management & Analytics

Section 1: Database Fundamentals & Data Modeling (Questions 1-25)

  1. What is the primary purpose of a Database Management System (DBMS)? A) To create spreadsheets B) To store, retrieve, and manage data efficiently C) To design user interfaces D) To write business reports

Answer: B Explanation: A DBMS is software that enables efficient storage, retrieval, updating, and management of data in databases, providing data integrity, security, and concurrent access.

  1. Which data model represents data as tables with rows and columns? A) Hierarchical model B) Network model C) Relational model D) Object-oriented model

Answer: C Explanation: The relational model, introduced by E.F. Codd, organizes data into tables (relations) with rows (tuples) and columns (attributes).

  1. In the three-schema architecture, the conceptual schema describes: A) Physical storage structures B) User views of the database C) The logical structure of the entire database D) External applications

Answer: C Explanation: Three-schema architecture: Internal (physical storage), Conceptual (logical structure), External (user views). The conceptual schema provides a community view of the database.

  1. An entity in an ER diagram represents:

A) A relationship between tables B) A thing or object in the real world C) A database constraint D) A SQL query

Answer: B Explanation: Entities are real-world objects (e.g., Student, Course) about which data is stored. Represented as rectangles in ER diagrams.

  1. A relationship where one entity is associated with one and only one entity of another type is: A) One-to-one (1:1) B) One-to-many (1:N) C) Many-to-one (N:1) D) Many-to-many (M:N)

Answer: A Explanation: 1:1 relationship = each entity in set A associates with at most one entity in set B, and vice versa (e.g., Employee to Company Car assignment).

  1. A weak entity depends on another entity for its: A) Attributes B) Existence C) Relationships D) Primary key

Answer: B Explanation: Weak entities cannot exist independently; they depend on a strong (owner) entity. Represented with double rectangles in ER diagrams.

  1. Which of the following is a characteristic of a primary key? A) Can contain NULL values B) Can have duplicate values C) Uniquely identifies each row D) Must be a single attribute

Answer: C Explanation: Primary key constraints: unique, NOT NULL, minimal (cannot have unnecessary attributes). Can be composite (multiple attributes).

  1. Referential integrity ensures that: A) Primary keys are unique B) Foreign keys match primary keys in referenced table C) Data types are consistent D) All transactions are atomic
  1. Which SQL statement is used to create a database table? A) CREATE TABLE B) MAKE TABLE C) DEFINE TABLE D) BUILD TABLE

Answer: A Explanation: CREATE TABLE statement defines a new table structure including column names, data types, and constraints.

  1. The DELETE statement without a WHERE clause will: A) Delete the first row only B) Delete the entire table structure C) Delete all rows in the table D) Generate an error

Answer: C Explanation: DELETE FROM table_name without WHERE deletes all rows (but table structure remains). TRUNCATE TABLE is faster for deleting all rows.

  1. Which join returns all rows from both tables, filling NULLs where no match exists? A) INNER JOIN B) LEFT JOIN C) RIGHT JOIN D) FULL OUTER JOIN

Answer: D Explanation: FULL OUTER JOIN returns all rows from both tables, matching where possible, NULLs where no match. Not all DBMS support (MySQL doesn't).

  1. A correlated subquery: A) Executes once for the entire query B) Executes once for each row processed by outer query C) Cannot contain aggregate functions D) Must return a single value

Answer: B Explanation: Correlated subquery references columns from outer query, executes repeatedly for each outer row. Often less efficient than joins.

  1. The HAVING clause is used to: A) Filter rows before grouping B) Filter groups after GROUP BY

C) Sort the result set D) Join multiple tables

Answer: B Explanation: WHERE filters rows, HAVING filters groups. HAVING can use aggregate functions (SUM, COUNT), WHERE cannot.

  1. Which constraint ensures a column cannot have NULL values? A) UNIQUE B) NOT NULL C) CHECK D) DEFAULT

Answer: B Explanation: NOT NULL constraint prevents NULL values in column. Primary key implies NOT NULL, but columns can have NOT NULL without being primary key.

  1. A transaction's ACID properties include all EXCEPT: A) Atomicity B) Consistency C) Isolation D) Durability E) Accuracy

Answer: E Explanation: ACID = Atomicity (all or nothing), Consistency (valid state), Isolation (concurrent execution), Durability (permanent). Accuracy is not an ACID property.

  1. Two-phase locking protocol ensures: A) Serializability B) Data integrity C) Both A and B D) Faster query execution

Answer: C Explanation: Two-phase locking (growing then shrinking phases) ensures serializable schedules and maintains data integrity during concurrent transactions.

  1. The data dictionary contains: A) Actual user data B) Metadata about the database C) Backup copies of data D) User passwords in plain text

Section 2: SQL Queries & Advanced Database Concepts (Questions 26-50)

  1. Which SQL function returns the number of rows in a table? A) SUM() B) COUNT() C) AVG() D) TOTAL()

Answer: B Explanation: COUNT() returns number of rows including NULLs. COUNT(column) counts non-NULL values in specific column.

  1. What does the following SQL return: SELECT DISTINCT Department FROM Employees; A) All employees in each department B) Unique department names C) Employees with unique names D) Departments with more than one employee

Answer: B Explanation: DISTINCT eliminates duplicate values, returning unique department names from Employees table.

  1. Which operator is used for pattern matching in SQL? A) = B) LIKE C) MATCH D) SIMILAR

Answer: B Explanation: LIKE with wildcards: % (any string), _ (single character). Example: WHERE name LIKE 'A%' finds names starting with A.

  1. The UNION operator: A) Combines rows from two queries, eliminating duplicates B) Combines columns from two tables C) Joins tables horizontally D) Updates multiple tables

Answer: A Explanation: UNION combines result sets vertically, removing duplicates. UNION ALL keeps duplicates. Requires same number/type of columns.

  1. Which statement creates a primary key constraint? A) ALTER TABLE Employees ADD PRIMARY KEY (EmployeeID); B) MODIFY TABLE Employees SET PRIMARY KEY (EmployeeID); C) UPDATE TABLE Employees PRIMARY KEY (EmployeeID); D) CREATE PRIMARY KEY ON Employees (EmployeeID);

Answer: A Explanation: Can create primary key with CREATE TABLE or ALTER TABLE ADD PRIMARY KEY. Some DBMS also use CONSTRAINT syntax.

  1. A self-join is: A) A join between a table and itself B) A join with no conditions C) A join that returns all possible combinations D) An illegal operation in SQL

Answer: A Explanation: Self-join uses table aliases to join a table to itself, often for hierarchical data (e.g., employees and managers both in Employees table).

  1. What is the default transaction isolation level in most RDBMS? A) Read Uncommitted B) Read Committed C) Repeatable Read D) Serializable

Answer: B Explanation: Read Committed is default in Oracle, PostgreSQL, SQL Server. Prevents dirty reads but allows non-repeatable reads and phantom reads.

  1. A deadlock occurs when: A) Two transactions wait indefinitely for each other B) A transaction updates too many rows C) Database server crashes D) Foreign key constraint is violated

Answer: A Explanation: Deadlock = transaction T1 holds lock A needs lock B, transaction T2 holds lock B needs lock A. DBMS detects and aborts one transaction.

  1. Which backup type copies only changed data since last full backup? A) Full backup B) Differential backup C) Incremental backup

Explanation: Star schema: fact table (measures, foreign keys) connected to dimension tables (descriptive attributes). Snowflake schema = normalized dimensions.

  1. OLAP operations include: A) INSERT, UPDATE, DELETE B) SELECT, JOIN, WHERE C) Roll-up, Drill-down, Slice, Dice D) GROUP BY, HAVING, ORDER BY

Answer: C Explanation: OLAP (Online Analytical Processing) operations: Roll-up (summarize), Drill-down (detail), Slice (subset on one dimension), Dice (subset on multiple dimensions).

  1. Data mining is the process of: A) Storing large amounts of data B) Extracting patterns and knowledge from data C) Cleaning dirty data D) Creating data visualizations

Answer: B Explanation: Data mining = discovering patterns, correlations, trends in large datasets using ML, statistics, database systems. Includes classification, clustering, association.

  1. A surrogate key is: A) A natural identifier from business data B) A system-generated artificial primary key C) A foreign key that references multiple tables D) A composite key with many attributes

Answer: B Explanation: Surrogate key = artificial key (auto-increment, GUID) with no business meaning. Advantages: stable, simple, performance. Contrast with natural key (SSN, email).

  1. Which SQL window function calculates a cumulative sum? A) ROW_NUMBER() B) RANK() C) SUM() OVER (ORDER BY ...) D) LEAD()

Answer: C Explanation: SUM(col) OVER (ORDER BY col2) calculates running total. Window functions perform calculations across row set without collapsing rows.

  1. Common Table Expressions (CTEs) are defined using:

A) WITH clause B) CREATE VIEW C) TEMPORARY TABLE D) SUBQUERY

Answer: A Explanation: CTE syntax: WITH cte_name AS (SELECT ...) SELECT FROM cte_name. Improves readability over nested subqueries, can be recursive.

  1. Which isolation level prevents dirty reads, non-repeatable reads, and phantom reads? A) Read Uncommitted B) Read Committed C) Repeatable Read D) Serializable

Answer: D Explanation: Serializable provides highest isolation: prevents dirty reads (uncommitted data), non-repeatable reads (same row different values), phantom reads (new rows appear).

  1. The NOSQL database type that uses key-value pairs is: A) Document store B) Column-family store C) Graph database D) Key-value store

Answer: D Explanation: Key-value stores: simple schema, fast lookups (Redis, DynamoDB). Document stores: JSON/XML (MongoDB). Column-family: tables with flexible columns (Cassandra). Graph: nodes/edges (Neo4j).

  1. In data warehousing, ETL stands for: A) Extract, Transform, Load B) Export, Transfer, Load C) Extract, Transfer, Load D) Export, Transform, Load

Answer: A Explanation: ETL = Extract (from source), Transform (clean, integrate), Load (into data warehouse). ELT (Load then Transform) is modern variant for big data.

  1. A fact table in a data warehouse typically contains: A) Descriptive attributes B) Foreign keys and measures C) Only primary keys

D) Data modeling

Answer: B Explanation: Data analytics = inspecting, cleaning, transforming data to discover useful information, support decision-making. Includes descriptive, diagnostic, predictive, prescriptive.

  1. Which type of analytics predicts future outcomes? A) Descriptive analytics B) Diagnostic analytics C) Predictive analytics D) Prescriptive analytics

Answer: C Explanation: Predictive analytics uses statistical models, ML to forecast future. Descriptive = what happened, Diagnostic = why, Prescriptive = what should be done.

  1. A regression model predicts: A) Categorical outcomes B) Continuous numerical values C) Cluster assignments D) Association rules

Answer: B Explanation: Regression predicts continuous values (price, temperature). Classification predicts categories (yes/no, spam/not). Clustering groups similar items.

  1. The K-means algorithm is used for: A) Classification B) Regression C) Clustering D) Association rule mining

Answer: C Explanation: K-means partitions data into K clusters based on distance to centroids. Unsupervised learning (no labels). Sensitive to initial centroids and K value.

  1. In a confusion matrix, true positives represent: A) Correctly predicted positive cases B) Incorrectly predicted positive cases C) Correctly predicted negative cases D) Incorrectly predicted negative cases

Answer: A

Explanation: Confusion matrix: TP = actual positive predicted positive, FP = actual negative predicted positive, TN = actual negative predicted negative, FN = actual positive predicted negative.

  1. Precision is calculated as: A) TP / (TP + FN) B) TP / (TP + FP) C) (TP + TN) / total D) TP / total

Answer: B Explanation: Precision = TP/(TP+FP) = how many selected items are relevant. Recall = TP/(TP+FN) = how many relevant items are selected. F1-score = harmonic mean.

  1. Overfitting in machine learning occurs when: A) Model performs well on training data but poorly on new data B) Model performs poorly on both training and test data C) Model is too simple D) Training data is insufficient

Answer: A Explanation: Overfitting = model learns noise/patterns specific to training set, fails to generalize. Solutions: more data, regularization, cross-validation, simpler model.

  1. Cross-validation is used to: A) Increase training speed B) Estimate model performance on unseen data C) Reduce number of features D) Visualize data

Answer: B Explanation: k-fold cross-validation splits data into k subsets, train on k-1, test on 1, repeat k times. Reduces overfitting, better performance estimate than single train/test split.

  1. Principal Component Analysis (PCA) is primarily used for: A) Classification B) Dimensionality reduction C) Clustering D) Time series forecasting

Answer: B Explanation: PCA transforms variables into uncorrelated principal components, reducing dimensions while preserving variance. Useful for visualization, noise reduction.

D) All of the above

Answer: D Explanation: Recursive CTE: anchor member (base case) UNION ALL recursive member (references CTE). Used for hierarchical data (org charts, bill of materials).

  1. Which SQL feature prevents SQL injection attacks? A) Stored procedures B) Parameterized queries C) Input validation D) All of the above

Answer: D Explanation: SQL injection defenses: parameterized queries (prepared statements), stored procedures, input validation, least privilege, escaping user input.

  1. The MERGE statement (UPSERT) performs: A) INSERT only B) UPDATE only C) INSERT, UPDATE, or DELETE based on condition D) Only DELETE

Answer: C Explanation: MERGE (UPSERT) inserts if not exists, updates if exists. Syntax: MERGE INTO target USING source ON condition WHEN MATCHED/MISMATCHED.

  1. In database security, RBAC stands for: A) Rule-Based Access Control B) Role-Based Access Control C) Resource-Based Access Control D) Risk-Based Access Control

Answer: B Explanation: Role-Based Access Control assigns permissions to roles, users assigned roles. Simpler than assigning permissions directly to each user.

  1. Data governance includes: A) Data quality management B) Data security and privacy C) Metadata management D) All of the above

Answer: D

Explanation: Data governance = overall management of data availability, usability, integrity, security. Includes policies, standards, processes, roles.

  1. GDPR compliance requires: A) Data encryption at rest and in transit B) Right to be forgotten C) Data breach notification within 72 hours D) All of the above

Answer: D Explanation: GDPR (General Data Protection Regulation) EU law: consent, right to access/delete, data portability, breach notification, privacy by design.

  1. In cloud databases, horizontal scaling means: A) Adding more CPU/RAM to single server B) Adding more servers C) Increasing storage capacity D) Upgrading network speed

Answer: B Explanation: Horizontal scaling (scale-out) = add more servers/nodes. Vertical scaling (scale-up) = add resources to existing server. Cloud favors horizontal.

  1. Data lakes store: A) Only structured data B) Raw data in native format C) Only processed data D) Only small datasets

Answer: B Explanation: Data lake stores raw, unstructured/semi-structured/structured data at scale. Schema-on-read vs. data warehouse schema-on-write. Used with big data.

  1. Apache Hadoop is designed for: A) Online transaction processing B) Distributed storage and processing C) Real-time analytics D) Relational database management

Answer: B Explanation: Hadoop = open-source framework for distributed storage (HDFS) and processing (MapReduce) of big data across clusters.

  1. Stream processing handles:

Explanation: Both explicit JOIN syntax (B) and implicit comma join (C) work. Modern SQL prefers explicit JOIN for clarity. A fails because DeptID is number not name.

  1. You need to ensure no employee salary exceeds $200,000. Which constraint? A) PRIMARY KEY B) FOREIGN KEY C) CHECK (salary <= 200000) D) UNIQUE

Answer: C Explanation: CHECK constraint validates column values: CHECK (salary BETWEEN 0 AND 200000). Some DBMS allow CHECK constraints, others use triggers.

  1. A query runs slowly. Which index would likely help: SELECT FROM Orders WHERE CustomerID = 1234 AND OrderDate > '2025-01-01'; A) Index on CustomerID B) Index on OrderDate C) Composite index on (CustomerID, OrderDate) D) Index on OrderID

Answer: C Explanation: Composite index on columns used in WHERE clause, especially equality first (CustomerID) then range (OrderDate). Covering index if includes all selected columns.

  1. You need to delete orders older than 5 years but maintain referential integrity with OrderDetails. Best approach: A) DELETE FROM Orders WHERE OrderDate < DATEADD(year, -5, GETDATE()); B) Use ON DELETE CASCADE foreign key C) First delete from OrderDetails, then Orders D) Both B and C

Answer: D Explanation: Either cascade delete (B) or manual deletion in correct order (C child then parent). A alone fails if foreign key constraint without cascade.

  1. Which scenario requires denormalization? A) Online transaction processing system B) Data warehouse for reporting C) Database in 1NF D) System requiring ACID compliance

Answer: B Explanation: Data warehouses often denormalize (star schema) for faster analytical queries. OLTP systems typically normalize for integrity, minimize anomalies.

  1. You have customer data with missing phone numbers. This is a _____ problem. A) Data inconsistency B) Data redundancy C) Data incompleteness D) Data duplication

Answer: C Explanation: Missing values = incompleteness. Solutions: ignore, impute (mean, median, prediction), use algorithms handling missing data. NULL represents unknown.

  1. For real-time analytics on social media posts, which database type is best? A) Relational database B) Document database C) Graph database D) Column-family database

Answer: B Explanation: Social media posts = JSON-like documents, semi-structured, evolving schema. MongoDB (document) good fit. Graph for relationships between users.

  1. You need to analyze year-over-year sales growth. Which SQL function? A) SUM() B) AVG() C) LAG() or LEAD() D) RANK()

Answer: C Explanation: LAG(column, 1) accesses previous row's value for comparison. Example: LAG(sales, 12) for same month previous year if monthly data.

  1. A banking system requires money transfer between accounts to be atomic. Which ensures this? A) Transactions with COMMIT/ROLLBACK B) Stored procedures C) Database triggers D) All of the above

Answer: D Explanation: All can help. Transaction ensures atomicity (both debit and credit or neither). Stored procedure encapsulates logic. Trigger could automate but careful with recursion.

  1. Which data warehouse schema is shown? FactSales(TimeID, ProductID, StoreID, SalesAmount) with dimTime, dimProduct, dimStore.