STREAM System: Query Processing & Resource Management in Data Streams, Slides of Database Management Systems (DBMS)

An overview of the STREAM system, a data stream management system (DSMS), and its contributions to the field, including semantics for continuous queries, query plans, exploiting stream constraints, operator scheduling, and approximation techniques. The document also includes examples of continuous queries and their execution.

Typology: Slides

2012/2013

Uploaded on 04/27/2013

dhanapati

Query Processing, Resource Management,
and Approximation in a Data Stream
Management System

Data Streams

  • Stream = Continuous, unbounded, rapid, time-varying streams of data elements
  • DSMS = Data Stream Management System

Contributions to Date

  • Semantics for continuous queries
  • Query plans
  • Exploiting stream constraints
  • Operator scheduling
  • Approximation techniques

The (Simplified) Big Picture

[Figure: DSMS architecture diagram. Labels: Input streams, Register Query, DSMS, Scratch Store, Archive, Stored Relations, Streamed Result, Stored Result]

Declarative Language for Continuous Queries

  • A distinction between STREAM and Aurora:
    • Aurora users directly manipulate one large execution plan
    • STREAM compiles declarative queries into individual plans; the system may merge plans
  • Syntax based on SQL, with additional constructs for sliding windows and sampling

Example Query 1

Two streams, contrived for ease of examples:

  Orders (orderID, customer, cost)
  Fulfillments (orderID, clerk)

Total cost of orders fulfilled over the last day by clerk “Sue” for customer “Joe”:

  Select Sum(O.cost)
  From Orders O, Fulfillments F [Range 1 Day]
  Where O.orderID = F.orderID And F.clerk = “Sue” And O.customer = “Joe”
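To make the window semantics of this query concrete, here is a minimal Python sketch of one way it could be evaluated incrementally. It is an illustration only, not the STREAM system's query plan. It assumes that Orders is unbounded (no window is written on it), that only Fulfillments carries the one-day time window, and that tuples arrive in nondecreasing timestamp order. The class name `SueJoeTotal` and its method names are hypothetical.

```python
from collections import deque

DAY = 24 * 60 * 60  # window length in seconds (assumed time unit)


class SueJoeTotal:
    """Running Sum(O.cost) over Sue's fulfillments of Joe's orders in the last day."""

    def __init__(self, window=DAY):
        self.window = window
        self.order_cost = {}   # orderID -> cost, kept only for customer "Joe"
        self.recent = deque()  # (timestamp, orderID) for clerk "Sue", oldest first

    def on_order(self, ts, order_id, customer, cost):
        # Selection O.customer = "Joe" is applied before storing join state.
        if customer == "Joe":
            self.order_cost[order_id] = cost

    def on_fulfillment(self, ts, order_id, clerk):
        # Selection F.clerk = "Sue" is applied before the tuple enters the window.
        if clerk == "Sue":
            self.recent.append((ts, order_id))

    def result(self, now):
        # Expire fulfillments outside the window (now - window, now].
        while self.recent and self.recent[0][0] <= now - self.window:
            self.recent.popleft()
        # Join on orderID and aggregate.
        return sum(
            self.order_cost[oid]
            for _, oid in self.recent
            if oid in self.order_cost
        )
```

Pushing both selections ahead of the join keeps the stored state small, which matters when the inputs are unbounded; a real plan might choose differently.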

Example Query 2

Using a 10% sample of the Fulfillments stream, take the 5 most recent fulfillments for each clerk and return the maximum cost:

  Select F.clerk, Max(O.cost)
  From Orders O, Fulfillments F [Partition By clerk Rows 5] 10% Sample
  Where O.orderID = F.orderID
  Group By F.clerk
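The combination of sampling and a partitioned row-based window can be sketched in Python as follows. This is a hedged illustration rather than the system's actual operators. Following the prose description, it assumes the sample is taken first and the per-clerk window is then applied to the sampled tuples; it also assumes Orders is unbounded. The class name `MaxCostPerClerk` and its parameters are hypothetical.

```python
import random
from collections import defaultdict, deque


class MaxCostPerClerk:
    """Max(O.cost) per clerk over the last `rows` sampled fulfillments."""

    def __init__(self, rows=5, sample_rate=0.10, rng=None):
        self.sample_rate = sample_rate
        self.rng = rng or random.Random()
        self.order_cost = {}  # orderID -> cost (Orders treated as unbounded)
        # One bounded window per clerk; deque(maxlen=rows) drops the oldest row.
        self.windows = defaultdict(lambda: deque(maxlen=rows))

    def on_order(self, order_id, customer, cost):
        self.order_cost[order_id] = cost

    def on_fulfillment(self, order_id, clerk):
        # Bernoulli sampling: keep each tuple with probability sample_rate.
        if self.rng.random() < self.sample_rate:
            self.windows[clerk].append(order_id)

    def result(self):
        out = {}
        for clerk, window in self.windows.items():
            costs = [self.order_cost[o] for o in window if o in self.order_cost]
            if costs:  # a clerk with no joining orders produces no group
                out[clerk] = max(costs)
        return out
```

Passing a seeded `random.Random` as `rng` makes a run reproducible, which is convenient when comparing the sampled answer with an unsampled one.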

A Nonobvious Continuous Query

  • Stream of stock quotes: Stocks(ticker,price)
  • Monitor last 10 minutes of quotes: Select ∗ From Stocks [Range 10 minutes]
  • Is result a relation, a stream, or something else?

  • If a relation, what exactly does it contain?
  • If a stream, how does query differ from: Select ∗ From Stocks [Range 1 minute] or Select ∗ From Stocks [∞]
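One way to explore these questions is to compute both readings on a small input. The Python sketch below is an assumption-laden illustration, not the semantics defined by the slides: it treats the window as producing a relation at each instant, and it takes "stream" to mean the tuples newly inserted into that relation. Quotes are modeled as (timestamp, ticker, price), and the function names are hypothetical.

```python
def window_relation(stream, now, range_=None):
    """Relation reading: quotes with timestamp in (now - range_, now].

    range_=None stands for the unbounded window [∞].
    """
    return [
        (ts, ticker, price)
        for ts, ticker, price in stream
        if ts <= now and (range_ is None or ts > now - range_)
    ]


def insert_stream(stream, range_=None):
    """Stream reading: emit (time, tuple) when a tuple first enters the window."""
    out = []
    previous = []
    for now in sorted({ts for ts, _, _ in stream}):
        current = window_relation(stream, now, range_)
        out.extend((now, t) for t in current if t not in previous)
        previous = current
    return out


quotes = [(0, "A", 1.0), (30, "B", 2.0), (90, "A", 1.5), (700, "B", 2.5)]

# As relations, the three windows differ at time 700:
ten_min = window_relation(quotes, 700, 600)
one_min = window_relation(quotes, 700, 60)
unbounded = window_relation(quotes, 700, None)

# As insert streams, they are indistinguishable:
same = insert_stream(quotes, 600) == insert_stream(quotes, 60) == insert_stream(quotes, None)
```

Under these assumptions the relation reading distinguishes the three queries while the insert-stream reading does not, which is one reason the query is less obvious than it looks.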

Our Semantics and Language for Continuous Queries

  • Abstract: interpretation for CQs based on certain “black boxes”
  • Concrete: SQL-based instantiation for our system; includes syntactic shortcuts, defaults, equivalences
  • Goals
    • CQs over multiple streams and relations
    • Exploit relational semantics to the extent possible
    • Easy queries should be easy to write; simple queries should do what you expect