Haar Wavelets in Signal Processing & Sensor Networks: Query Processing - Prof. Amol V. Des, Study notes of Computer Science

These notes cover Haar wavelets, their use in signal processing for studying frequency components and for data reduction, and their application in sensor networks for multi-resolution query processing. They also cover the challenges of ordering data elements, zero-padding, and communication tree design, and they introduce TinyDB and its acquisitional query processing approach.

Typology: Study notes

Pre 2010

Uploaded on 02/13/2009

CMSC 828K: In-network Processing (of
Haar Wavelets); ACQP/TinyDB
Amol Deshpande
University of Maryland, College Park
September 13, 2007

Haar Wavelets

Wavelets: Used to study “frequency components” of a signal

Also proposed in databases for data reduction or approximate query answering

Haar Wavelets

Keep only top “k” coefficients

Need to normalize the coefficients first (not discussed in detail in the paper)

Keeping the top “k” normalized coefficients minimizes the sum-squared error

Allows for multi-resolution query processing

Ordering of data elements?

In signal processing, the numbers form a sequence, so they are ordered

In some cases, you might be able to reorder

Most likely NP-Hard to find the best ordering

In sensor networks, ordering may need to be preserved

Depends on the application
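The "keep only the top k coefficients" idea can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, the input length is assumed to be a power of two, and the normalization used (weighting each coefficient by the square root of the number of values it covers) is one standard choice that makes dropping the smallest weighted coefficients minimize the sum-squared error.

```python
import math

def haar_transform(values):
    """Unnormalized Haar coefficients: [overall average, details coarse -> fine]."""
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    current = [float(v) for v in values]
    details = []
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / 2 for i in pairs]
        diffs = [(current[i] - current[i + 1]) / 2 for i in pairs]
        details = diffs + details  # coarser levels end up in front
        current = averages
    return current + details

def haar_inverse(coeffs):
    """Rebuild the sequence from coefficients laid out as in haar_transform."""
    current = [coeffs[0]]
    pos = 1
    while pos < len(coeffs):
        diffs = coeffs[pos:pos + len(current)]
        expanded = []
        for avg, diff in zip(current, diffs):
            expanded.extend((avg + diff, avg - diff))
        pos += len(current)
        current = expanded
    return current

def keep_top_k(coeffs, k):
    """Zero all but the k coefficients with the largest normalized magnitude."""
    n = len(coeffs)

    def weight(i):
        level = max(i, 1).bit_length() - 1  # index 0 and 1 both span all n values
        return abs(coeffs[i]) * math.sqrt(n >> level)

    keep = set(sorted(range(n), key=weight, reverse=True)[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]

data = [2, 2, 0, 2, 3, 5, 4, 4]
coeffs = haar_transform(data)
approx = haar_inverse(keep_top_k(coeffs, 2))
sse = sum((a - b) ** 2 for a, b in zip(data, approx))
```

With k = 2 the approximation keeps the overall average and the coarsest detail, so the reconstruction is a two-step function; raising k refines it, which is the multi-resolution property mentioned above.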

Haar Wavelets in Sensor Networks

TAG interface can be used to implement this

merging function:

Take the two PSRs (partial state records) from the children, and merge them

Send top “k” coefficients up

Two problems:

May have more than 2 children

The children may be of different sizes

Must “zero-pad”

Can have too many zeros

This introduces larger errors

Consider a Haar wavelet on the sequences: {100, 100, 0, 0, 0, 0, 0, 0} vs {100, 100}
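A rough sketch of the merging step and of the zero-padding problem follows. It assumes each PSR is a full, unnormalized Haar coefficient list of the form [average, details coarse to fine], and that both children cover equally many values; the function name `merge_psrs` is hypothetical and top-k pruning is left out to keep the example short.

```python
def merge_psrs(left, right):
    """Combine the Haar coefficients of two adjacent, equal-sized blocks."""
    if len(left) != len(right):
        raise ValueError("children must cover the same number of values")
    average = (left[0] + right[0]) / 2
    top_detail = (left[0] - right[0]) / 2
    merged = [average, top_detail]
    pos, width = 1, 1
    while pos < len(left):
        # Each detail level of the parent is the left child's level
        # followed by the right child's level.
        merged += left[pos:pos + width] + right[pos:pos + width]
        pos += width
        width *= 2
    return merged

# The sequence {100, 100} needs a single non-zero coefficient.
pair = [100.0, 0.0]

# Zero-padding it to eight values means merging with all-zero siblings twice.
padded = merge_psrs(merge_psrs(pair, [0.0, 0.0]), [0.0, 0.0, 0.0, 0.0])
nonzero_padded = sum(1 for c in padded if c != 0)
```

Here `padded` is [25, 25, 50, 0, 0, 0, 0, 0]: the padded sequence needs three non-zero coefficients to describe the same two readings, so a fixed budget of k coefficients is spent on the padding and the error grows.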

Haar Wavelets in Sensor Networks

Can we choose the communication tree so that zero-padding is not required?

What is the optimal tree?

For Haar wavelets, it is a binomial tree
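One way to see why a binomial tree avoids zero-padding is the sketch below. It assumes 2^n sensors numbered 0 to 2^n - 1 in data order with node 0 as the root; this numbering and the function names are illustrative assumptions, not taken from the paper. Clearing the lowest set bit of a node's number gives its parent, so every merge combines two adjacent blocks of equal size.

```python
def binomial_parent(node):
    """Parent of a non-root node: clear the lowest set bit of its number."""
    if node <= 0:
        raise ValueError("node 0 is the root and has no parent")
    return node & (node - 1)

def binomial_children(num_nodes):
    """Map each node to its children, listed in the order they are merged."""
    children = {node: [] for node in range(num_nodes)}
    for node in range(1, num_nodes):
        children[binomial_parent(node)].append(node)
    return children

tree = binomial_children(8)
```

For eight nodes, the root merges with node 1 (one value each), then with node 2's block (two values each), then with node 4's block (four values each), which mirrors the levels of the Haar decomposition.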

Haar Wavelets in Sensor Networks

Figure 3: An in-network computation of the Haar wavelet of Figure 1. The left side annotates the (logical) support tree with dark arrows representing physical message-passing between the sensor nodes. The right side of the figure shows just the (physical) communication tree, i.e., the leaf level of the left side. Each edge on the right is labeled with ...

Haar Wavelets in Sensor Networks

Can we enforce such a communication tree in the sensor network?

Open problem

Figure 5: A binomial tree embedded in a radius-1 grid.

Generalizing the Ideas

What about other types of aggregates?

May not be able to map the support graph directly onto the communication graph

Use overlay networks

Nodes forward messages without applying merging functions

Essentially assume a complete communication graph

Hard to optimize (need to use weighted graphs)

Very interesting problems in this line of work

Basic ideas also extend to peer-to-peer networks

TinyDB: Acquisitional Query Processing

Followup work after TAG

“Acquisitional” query processing

We can control when/where to observe data (within certain limits)

In traditional databases, the data is given

How does this change query processing?

Sensor Networks

Energy consumption

Little difference in receiving and transmitting costs

... locally. The mote then switches to a “Processing and Receiving” mode, where results are collected from neighbors over the radio. Finally, in the “Transmitting” mode, results for the query are delivered by the local mote; the noisy signal during this period reflects switching as the receiver goes off and the transmitter comes on and then cycles back to a receiver-on, transmitter-off state.

[Plot: Time v. Current Draw in Different Phases of Query Processing. x-axis: Time (seconds), 0 to 3; y-axis: Current (mA), 2 to 22. Labeled phases: Snoozing, Processing, Processing and Listening, Transmitting.]

Figure 2: Phases of Power Consumption in TinyDB

TinyDB: Query language

Basic query:

... defined sample intervals that are a parameter of the query. The period of time between each sample interval is known as an epoch. As we discuss in Section 6, epochs provide a convenient mechanism for structuring computation to minimize power consumption. Consider the query:

    SELECT nodeid, light, temp
    FROM sensors
    SAMPLE INTERVAL 1s FOR 10s

This query specifies that each sensor should report its own id, light, and temperature readings (contained in the virtual table sensors) once per second for 10 seconds. Results of this query stream to the root of the network in an online fashion, via the multi-hop topology, where they may be logged or output to the user. The output consists of an ever-growing sequence of tuples, clustered into 1s time intervals. Each tuple includes a time stamp corresponding to the time it was produced.
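To make the epoch semantics concrete, here is a small simulation of what such a query produces. It is only a sketch: `run_sampling_query` and `fake_read` are hypothetical names, the readings are made up, and nothing here reflects how TinyDB actually schedules or routes results.

```python
def run_sampling_query(node_ids, read_sensors, interval_s=1, duration_s=10):
    """Yield (timestamp, nodeid, light, temp): one tuple per node per epoch."""
    num_epochs = duration_s // interval_s
    for epoch in range(num_epochs):
        timestamp = epoch * interval_s  # start of this epoch, in seconds
        for nodeid in node_ids:
            light, temp = read_sensors(nodeid, timestamp)
            yield (timestamp, nodeid, light, temp)

def fake_read(nodeid, timestamp):
    """Stand-in for real sensor hardware; returns (light, temp)."""
    return 100 + nodeid, 20 + timestamp

rows = list(run_sampling_query([1, 2], fake_read))
```

With two nodes, a 1 s interval, and a 10 s duration, the result is 20 tuples grouped into 10 epochs, matching the "clustered into 1s time intervals" description.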

Creating storage points (persistent views):

Note that the sensors table is (conceptually) an unbounded, continuous data stream of values; as is the case in other streaming and online systems, certain blocking operations (such as sort and symmetric join) are not allowed over such streams unless a bounded subset of the stream, or window, is specified. Windows in TinyDB are defined as fixed-size materialization points over the sensor streams. Such materialization points accumulate a small buffer of data that may be used in other queries. Consider, as an example:

    CREATE STORAGE POINT recentlight SIZE 8
    AS (SELECT nodeid, light FROM sensors
        SAMPLE INTERVAL 10s)

This statement provides a shared, local (i.e. single-node) location to store a streaming view of recent data similar to materialization points in other streaming systems like Aurora or STREAM [7, 34], or materialized views in conventional databases. Joins are allowed between two storage points on the same node, or between a storage point and the sensors relation, in which case sensors is used as the outer relation.
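A storage point of fixed size behaves much like a bounded buffer that keeps only the most recent rows. The sketch below models that with a `deque`; the class name `StoragePoint` and its methods are illustrative assumptions, not TinyDB's API, and the eviction policy (drop the oldest row) is assumed rather than taken from the source.

```python
from collections import deque

class StoragePoint:
    """Fixed-size buffer holding the most recent rows of a streaming query."""

    def __init__(self, size):
        # A deque with maxlen silently discards the oldest row when full.
        self._rows = deque(maxlen=size)

    def append(self, row):
        self._rows.append(row)

    def rows(self):
        """Snapshot of the buffered rows, oldest first."""
        return list(self._rows)

recentlight = StoragePoint(size=8)
for epoch in range(10):
    recentlight.append((1, 100 + epoch))  # (nodeid, light), made-up readings
```

After ten appends only the last eight rows remain, which is the bounded window that makes blocking operators such as sort or join safe to run.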

TinyDB: Query language

Event-based queries: Much superior to polling

3.2 Event-Based Queries

As a variation on the continuous, polling-based mechanisms for data acquisition, TinyDB supports events as a mechanism for initiating data collection. Events in TinyDB are generated explicitly, either by another query or the operating system (in which case the code that generates the event must have been compiled into the sensor node). For example, the query:

    ON EVENT bird-detect(loc):
      SELECT AVG(light), AVG(temp), event.loc
      FROM sensors AS s
      WHERE dist(s.loc, event.loc) < 10m
      SAMPLE INTERVAL 2s FOR 30s

could be used to report the average light and temperature level at sensors near a bird nest where a bird has just been detected. Every time a bird-detect event occurs, the query is issued from the detecting node and the average light and temperature are collected from nearby nodes once every 2 seconds for 30 seconds.

Such events are central in ACQP, as they allow the system to be dormant until some external condition occurs, instead of continually polling or blocking on an iterator waiting for some data to arrive. Since most microprocessors include external interrupt lines that can wake a sleeping device to begin processing, events can provide significant reductions in power consumption, shown in Figure 3.

Lifetime-based queries:

Can specify a lifetime for the query

Decide the sampling rate based on that

Tricky to estimate lifetimes accurately

Also, the sampling rate is usually dependent on the application
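The arithmetic behind "decide the sampling rate from the lifetime" can be sketched with a deliberately simple energy model. Everything below is an assumption for illustration (a constant idle power draw plus a fixed energy cost per sample, with hypothetical parameter names and numbers); it is not TinyDB's estimator, and the slide's point is precisely that real lifetimes are harder to predict than this.

```python
def sample_rate_for_lifetime(battery_joules, lifetime_s, idle_watts, joules_per_sample):
    """Highest sampling rate (samples/second) that still meets the lifetime.

    Assumes the node draws idle_watts continuously and spends
    joules_per_sample for each sample it takes and reports.
    """
    if lifetime_s <= 0 or joules_per_sample <= 0:
        raise ValueError("lifetime and per-sample energy must be positive")
    budget_watts = battery_joules / lifetime_s  # average power we may spend
    spare_watts = budget_watts - idle_watts     # what is left for sampling
    if spare_watts <= 0:
        return 0.0  # the lifetime cannot be met even without sampling
    return spare_watts / joules_per_sample

# Hypothetical numbers: 8640 J battery, 1 day lifetime, 20 mW idle, 40 mJ/sample.
rate_hz = sample_rate_for_lifetime(8640.0, 86400.0, 0.02, 0.04)
```

With these numbers the power budget is 0.1 W, of which 0.08 W is available for sampling, giving roughly two samples per second.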

Query optimization

Cost-based (as with traditional relational databases)

Maintain a catalog with sensing costs, etc.

Also track properties of aggregates (whether they are monotonic, distributive, etc.)

When to take samples?

Should interleave taking samples and evaluating predicates
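Interleaving sampling with predicate evaluation means acquiring a reading only when the earlier predicates have already passed. The sketch below assumes independent predicates with known acquisition costs and selectivities, and orders them by cost / (1 - selectivity), a classic rule for that setting. The attribute names, costs, and selectivities are made-up placeholders, and the function names are hypothetical.

```python
import math

def expected_cost(predicates):
    """Expected acquisition cost of evaluating predicates in the given order.

    Each predicate is (attribute, cost, selectivity, test).
    """
    total, pass_probability = 0.0, 1.0
    for _attr, cost, selectivity, _test in predicates:
        total += pass_probability * cost  # paid only if earlier tests passed
        pass_probability *= selectivity
    return total

def order_by_rank(predicates):
    """Cheap, highly selective predicates first (independence assumed)."""
    def rank(pred):
        _attr, cost, selectivity, _test = pred
        return cost / (1 - selectivity) if selectivity < 1 else math.inf
    return sorted(predicates, key=rank)

def evaluate_lazily(predicates, sample):
    """Acquire each attribute only when needed; return (passed, cost_spent)."""
    spent = 0.0
    for attr, cost, _selectivity, test in predicates:
        value = sample(attr)  # this is where the sensor would be read
        spent += cost
        if not test(value):
            return False, spent  # later sensors are never sampled
    return True, spent

predicates = [
    ("mag", 1500.0, 0.5, lambda v: v > 10),
    ("light", 90.0, 0.1, lambda v: v > 100),
]
ordered = order_by_rank(predicates)
readings = {"light": 50, "mag": 20}
result = evaluate_lazily(ordered, readings.get)
```

With these placeholder numbers, sampling light first has an expected cost of about 240 units versus about 1545 for sampling mag first, and the lazy evaluation on `readings` stops after the light reading alone.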