Continuously Adaptive Queries over Streams: CACQ, Slides of Database Management Systems (DBMS)

Continuously Adaptive Continuous Queries over Streams (CACQ) is a system that addresses the challenges of long-running queries over data streams through adaptivity, work sharing, and state sharing. CACQ uses eddies for adaptivity, tuple lineage for work sharing, and State Modules (SteMs) for state sharing. Motivating applications include monitoring queries over sensor data, stock analysis, and router events.

Typology: Slides

2012/2013

Uploaded on 04/27/2013

dhanapati

Continuously Adaptive Continuous Queries (CACQ) over Streams

CACQ Introduction

  • Proposed continuous query (CQ) systems are based on static plans
  • But, CQs are long running
  • Initially valid assumptions less so over time
  • Static optimizers at their worst!
  • CACQ insight: apply continuous adaptivity of eddies to continuous queries
  • Avoid static optimization via dynamic operator ordering
  • Process multiple queries simultaneously
  • Explore sharing of work & storage

Continuous Queries

  • Long running, “standing queries”, similar to trigger systems
  • Installed; continuously produce streamed results until removed
  • Lots of queries, over the same data sources
    • Opportunity for work sharing!
    • Global query optimization problem: hard!
    • Idea: adaptive heuristics not quite as hard?
      • Bad decisions are not final
    • Future work: finding an optimal plan (adaptively)


Joins in CACQ

  • CACQ uses Parallel Pipelined Joins
    • To avoid blocking
  • Example: Symmetric (Windowed) Hash Join

[Figure: symmetric hash join of R and S, with hash tables on R.a and S.b. Each input builds into its own table and probes the other.]
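
The build and probe steps of a symmetric hash join can be sketched as follows. This is a minimal illustration, not the CACQ implementation: the class name, the `insert` method, and the dictionary-based tuples are assumptions made for the example, and windowing (eviction of old tuples) is left out.

```python
from collections import defaultdict


class SymmetricHashJoin:
    """Non-blocking equi-join: each arriving tuple is built into its own
    side's hash table, then probed against the other side's table."""

    def __init__(self, left_key, right_key):
        self.keys = {"left": left_key, "right": right_key}
        self.tables = {"left": defaultdict(list), "right": defaultdict(list)}

    def insert(self, side, tup):
        other = "right" if side == "left" else "left"
        value = tup[self.keys[side]]
        self.tables[side][value].append(tup)         # build
        matches = self.tables[other].get(value, [])  # probe
        if side == "left":
            return [(tup, m) for m in matches]
        return [(m, tup) for m in matches]


# Join R and S on R.a = S.b; results appear as soon as both halves have arrived.
join = SymmetricHashJoin(left_key="a", right_key="b")
out = []
out += join.insert("left", {"a": 1, "x": "r1"})
out += join.insert("right", {"b": 1, "y": "s1"})
out += join.insert("right", {"b": 2, "y": "s2"})
out += join.insert("left", {"a": 2, "x": "r2"})
```

Because neither input has to be read completely before results are produced, the join never blocks on one side of the stream.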

CACQ Main Points

  • Adaptivity via Eddies and Routing policies
  • Tuple Lineage for flexible sharing of operators between queries
  • Grouped Filter for efficiently computing selections over multiple queries
  • State Modules (SteMs) for enabling state sharing among joins

Step by Step Using Example

  • First, just one query with only selections
  • Then, add multiple queries
  • Then, add joins to the picture


Eddies: Single Query, Single Source

  • Ready bits track what to do next
    • All 1’s in single source
  • Done bits track what has been done
    • Tuple can be output when all bits set

[Figure: an eddy routes tuples from source R through the selections (R.a > 10) and (R.b < 15). Example tuples: (a = 5, b = 25) and (a = 15, b = 0). Each tuple carries Ready and Done bits for σa and σb.]

SELECT * FROM R WHERE R.a > 10 AND R.b < 15
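
A minimal sketch of how ready and done bits could drive an eddy for this single query. The bit layout (bit 0 for σa, bit 1 for σb), the function name `eddy`, and the random choice standing in for a routing policy are assumptions for illustration only.

```python
import random

# Operators of the example query, one bit each (bit 0 = sigma_a, bit 1 = sigma_b).
OPERATORS = [
    lambda t: t["a"] > 10,  # R.a > 10
    lambda t: t["b"] < 15,  # R.b < 15
]
ALL_BITS = (1 << len(OPERATORS)) - 1


def eddy(tup, rng=random):
    """Route one tuple through the operators in an arbitrary order.
    Returns True if the tuple passes every selection and can be output."""
    ready = ALL_BITS  # single source: every operator is applicable immediately
    done = 0
    while done != ALL_BITS:
        candidates = [i for i in range(len(OPERATORS))
                      if ((ready >> i) & 1) and not ((done >> i) & 1)]
        op = rng.choice(candidates)  # stand-in for a real routing policy
        if not OPERATORS[op](tup):
            return False             # a failed selection drops the tuple
        done |= 1 << op
    return True                      # all done bits set: output the tuple


first = eddy({"a": 5, "b": 25})
second = eddy({"a": 15, "b": 0})
```

The result does not depend on the order the routing policy picks; only the amount of work does, which is what an adaptive policy tries to minimize.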

Multiple Queries

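Query   Predicate on R.a   Predicate on R.b
Q1      R.a > 10           R.b < 15
Q2      R.a > 20           R.b = 25
Q3      R.a = 0            R.b <> 50

[Figure: three separate plans, one per query, each applying σa and σb to R, contrasted with a single eddy that routes tuples of R through one shared σa and one shared σb.]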

Grouped Filters

Example tuple from R: a = 5, b = 25

Q1: SELECT * FROM R WHERE R.a > 10 AND R.b < 15
Q2: SELECT * FROM R WHERE R.a > 20 AND R.b = 25
Q3: SELECT * FROM R WHERE R.a = 0 AND R.b <> 50

Successive Done and QueriesCompleted states shown for the example tuple:

Done        QueriesCompleted
σa  σb      Q1  Q2  Q3
0   0       0   0   0
0   0       1   0   0
0   1       1   0   0
0   1       1   1   1
1   1       1   1   1
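
One way a grouped filter over a single attribute might be organized is sketched below. The class and method names are hypothetical, and the sketch assumes each query contributes at most one predicate per attribute. Range predicates are kept sorted so that one binary search finds every query whose predicate accepts the value.

```python
import bisect


class GroupedFilter:
    """Evaluates the predicates of many queries on one attribute in one pass."""

    def __init__(self):
        self.greater = []      # sorted (constant, query) pairs for "attr > constant"
        self.less = []         # sorted (constant, query) pairs for "attr < constant"
        self.equal = {}        # constant -> queries with "attr = constant"
        self.not_equal = {}    # constant -> queries with "attr <> constant"
        self.all_not_equal = set()

    def add(self, query, op, constant):
        if op == ">":
            bisect.insort(self.greater, (constant, query))
        elif op == "<":
            bisect.insort(self.less, (constant, query))
        elif op == "=":
            self.equal.setdefault(constant, set()).add(query)
        elif op == "<>":
            self.not_equal.setdefault(constant, set()).add(query)
            self.all_not_equal.add(query)
        else:
            raise ValueError(f"unsupported operator: {op}")

    def matches(self, value):
        """Return the set of queries whose predicate accepts `value`."""
        result = set()
        # "attr > c" holds for every stored constant strictly below value.
        cut = bisect.bisect_left(self.greater, (value,))
        result.update(q for _, q in self.greater[:cut])
        # "attr < c" holds for every stored constant strictly above value.
        cut = bisect.bisect_left(self.less, (value,))
        result.update(q for c, q in self.less[cut:] if c != value)
        result |= self.equal.get(value, set())
        result |= self.all_not_equal - self.not_equal.get(value, set())
        return result


# The three example queries, one grouped filter per attribute.
sigma_a, sigma_b = GroupedFilter(), GroupedFilter()
for query, (op_a, c_a), (op_b, c_b) in [
    ("Q1", (">", 10), ("<", 15)),
    ("Q2", (">", 20), ("=", 25)),
    ("Q3", ("=", 0), ("<>", 50)),
]:
    sigma_a.add(query, op_a, c_a)
    sigma_b.add(query, op_b, c_b)

passed_a = sigma_a.matches(5)    # tuple with a = 5
passed_b = sigma_b.matches(25)   # tuple with b = 25
```

Queries missing from a filter's result are the ones that reject the tuple, so their QueriesCompleted bits can be set in a single step.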

Tuple & Query Data Structures

  • Per tuple bitmaps:
    • queriesCompleted
      • What queries has this tuple been output to or rejected by?
    • done
      • What operators have been applied to this tuple?
    • ready
      • What operators can be applied to this tuple?
  • Per query bitmaps:
    • completionMask
      • What operators must be applied to output a tuple to this query?

Example tuple [10, 1100, …]

Bitmap             Bit              Value
queriesCompleted   Query 1          1
                   Query 2          0
done               S.a Index        1
                   S.b Index        1
                   R.a – S.b Join   0
                   UDF (R.a)        0

Example query [0110]

Bitmap             Bit              Value
completionMask     S.a Index        0
                   S.b Index        1
                   R.a – S.b Join   1
                   UDF (R.a)        0
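
The per-tuple and per-query bitmaps could be held in structures like the following. This is a sketch under assumptions: the field names mirror the slide, but the dataclass layout, the helper `operators_in`, and the convention that the leftmost bit is the first listed operator are illustrative choices.

```python
from dataclasses import dataclass, field

# Operator bit positions, leftmost bit first, in the order listed on the slide.
OPERATORS = ["S.a Index", "S.b Index", "R.a - S.b Join", "UDF (R.a)"]


@dataclass
class StreamTuple:
    values: dict = field(default_factory=dict)
    queries_completed: int = 0  # queries that have output or rejected the tuple
    done: int = 0               # operators already applied
    ready: int = 0              # operators that may be applied next


@dataclass
class Query:
    name: str
    completion_mask: int        # operators required before output


def operators_in(bitmap):
    """List the operator names whose bits are set in an operator bitmap."""
    width = len(OPERATORS)
    return [name for i, name in enumerate(OPERATORS)
            if (bitmap >> (width - 1 - i)) & 1]


# The example tuple [10, 1100, ...] and query [0110].
t = StreamTuple(queries_completed=0b10, done=0b1100)
q = Query(name="example", completion_mask=0b0110)
applied = operators_in(t.done)
required = operators_in(q.completion_mask)
```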

Outputting Tuples

  • Store a completionMask bitmap for each query
    • One bit per operator
    • Set if operator in query
  • To determine if a tuple t can be output to query q:
    • Eddy ANDs q’s completionMask with t’s done bits and compares the result with the completionMask
    • Output only if q’s bit not set in t’s queriesCompleted bits
  • The check runs every time a tuple returns from an operator
Q1: SELECT * FROM R WHERE R.a > 10 AND R.b < 15
Q2: SELECT * FROM R WHERE R.b < 15 AND R.c <> 5 AND R.d = 10

completionMasks:

      σa  σb  σc  σd
Q1    1   1   0   0
Q2    0   1   1   1

Output tests:

Q1: 1100 & Done == 1100 && QueriesCompleted[0] == 0
Q2: 0111 & Done == 0111

Example tuple state, before and after output to Q1:

Done                QueriesCompleted
σa  σb  σc  σd      Q1  Q2
1   1   0   0       0   0
1   1   0   0       1   0
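
The output test can be written as a short function. This is a sketch, not the system's code: the function name is hypothetical, and it assumes that bit i of queriesCompleted (counted from the least significant bit) belongs to the i-th entry of the completionMask list.

```python
def deliverable_queries(done, queries_completed, completion_masks):
    """Return the indexes of queries the tuple can be output to right now.

    done              -- bitmap of operators already applied to the tuple
    queries_completed -- bitmap of queries that already output or rejected it
    completion_masks  -- one operator bitmap per query
    """
    result = []
    for i, mask in enumerate(completion_masks):
        all_ops_applied = (mask & done) == mask
        not_yet_handled = ((queries_completed >> i) & 1) == 0
        if all_ops_applied and not_yet_handled:
            result.append(i)
    return result


# completionMasks for Q1 (sigma_a, sigma_b) and Q2 (sigma_b, sigma_c, sigma_d).
masks = [0b1100, 0b0111]

before = deliverable_queries(0b1100, 0b00, masks)  # sigma_a and sigma_b applied
after = deliverable_queries(0b1100, 0b01, masks)   # Q1 already marked completed
```

Setting the query's queriesCompleted bit after output is what prevents the same tuple from being delivered to that query twice.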

Work Sharing via Tuple Lineage


Q1: SELECT * FROM s WHERE A, B, C
Q2: SELECT * FROM s WHERE A, B, D

[Figure: three ways to run Q1 and Q2 over data stream S, with the tuple state (for example s, sB, sAB, sCDB, sCDBA) labeled after each operator and the QueriesCompleted (QC) bits annotated along the way, from 0 | 0 at the source to 0 or 1 | 0 or 1 after operators that may reject the tuple.]

  • Conventional Queries: Q1 applies A, B, C and Q2 applies A, B, D in separate plans. Intersection of CD goes through AB an extra time!
  • Shared Subexpr.: A and B are applied once for both queries. AB must be applied first!
  • CACQ: A, B, C, and D are applied in any order. Lineage (Queries Completed) enables any ordering!
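
The claim that lineage permits any operator ordering can be checked with a small simulation. Everything concrete here is an assumption for illustration: the predicates standing in for A, B, C, and D, the attribute names x and y, and the `route` function. Each tuple carries the set of operators applied and the set of queries completed, and an operator is skipped once no unfinished query needs it.

```python
from itertools import permutations

# Hypothetical predicates standing in for the filters A, B, C, D.
PREDICATES = {
    "A": lambda s: s["x"] > 0,
    "B": lambda s: s["x"] < 100,
    "C": lambda s: s["y"] == "c",
    "D": lambda s: s["y"] == "d",
}
QUERIES = {"Q1": {"A", "B", "C"}, "Q2": {"A", "B", "D"}}


def route(tup, order):
    """Apply the operators in the given order, tracking lineage per tuple."""
    done = set()       # operators applied so far
    completed = set()  # queries that have output or rejected this tuple
    outputs = set()
    for op in order:
        users = {q for q, ops in QUERIES.items() if op in ops}
        if users <= completed:
            continue   # no unfinished query needs this operator: skip the work
        passed = PREDICATES[op](tup)
        done.add(op)
        if not passed:
            completed |= users  # rejected by every query that uses op
        for q, ops in QUERIES.items():
            if q not in completed and ops <= done:
                outputs.add(q)
                completed.add(q)
        if completed == set(QUERIES):
            break      # every query is finished with this tuple
    return outputs


# Every one of the 24 orderings delivers the tuple to the same queries.
results = {frozenset(route({"x": 5, "y": "c"}, p)) for p in permutations("ABCD")}
```

A and B run at most once per tuple whatever the order, which is the sharing that a fixed shared-subexpression plan only achieves by forcing A and B to run first.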

Joins in CACQ

  • Use symmetric hash join to avoid blocking
  • Use State Modules (SteMs) to share storage between joins with a common base relation

[Figure: symmetric hash join of R and S, with hash tables on R.a and S.b. Each input builds into its own table and probes the other.]

Processing Joins Via State Modules

  • Idea: Share join indices over base relations
  • State Modules (SteMs) are:
    • Unary indexes (e.g. hash tables, trees)
    • Built on the fly (as data arrives)
    • Scheduled by CACQ as first-class operators
    • Based on symmetric hash join

[Figure: Query 1 joins R and S on R.a = S.b, and Query 2 joins S and T on S.b = T.c. SteMs on R.a, S.b, and T.c are built by their own relation and probed by the others; the SteM on S.b is shared by both queries.]
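
A minimal sketch of state sharing through SteMs, assuming hash-table indexes and dictionary tuples. The `SteM` class, the `PROBES` routing table, and the `arrive` function are hypothetical, the query labels follow one reading of the figure, and the eddy's scheduling of SteMs as operators is reduced to a fixed build-then-probe step.

```python
from collections import defaultdict


class SteM:
    """A unary hash index over one attribute of one stream, built as data arrives."""

    def __init__(self, attribute):
        self.attribute = attribute
        self.index = defaultdict(list)

    def build(self, tup):
        self.index[tup[self.attribute]].append(tup)

    def probe(self, value):
        return list(self.index.get(value, []))


# One SteM per join attribute; the SteM on S.b serves both queries.
stems = {"R": SteM("a"), "S": SteM("b"), "T": SteM("c")}

# Which SteMs an arriving tuple probes, and on behalf of which query.
PROBES = {
    "R": [("Q1", "S")],               # Q1: R.a = S.b
    "S": [("Q1", "R"), ("Q2", "T")],  # Q2: S.b = T.c
    "T": [("Q2", "S")],
}


def arrive(source, tup):
    """Build the tuple into its own SteM, then probe the partner SteMs."""
    own = stems[source]
    own.build(tup)
    value = tup[own.attribute]
    results = []
    for query, partner in PROBES[source]:
        for match in stems[partner].probe(value):
            results.append((query, tup, match))
    return results


r1 = arrive("S", {"b": 1})
r2 = arrive("R", {"a": 1})
r3 = arrive("T", {"c": 1})
```

The S tuple is stored once, yet it produces matches for both queries, whereas two independent symmetric hash joins would each keep their own copy of S.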