Contextual Object Retrieval: Improving Visual Query Results, Slides of Applications of Computer Sciences

These slides discuss the challenges of using visual query context in object retrieval and propose a contextual object retrieval (COR) model to address them. The COR model uses contrast-based saliency detection, SIFT descriptors, and visual words to match objects in a database. The slides also introduce two algorithms, spatial propagation (CORa) and appearance propagation (CORm), to estimate search intent scores for visual words.

Typology: Slides

2012/2013

Uploaded on 04/24/2013

bandhula

Object Retrieval Using Visual Query Context

What is a Visual Query?

• TinEye

• Google Image Search

• Google Goggles

Bad Query Image vs. Good Query Image

How Can We Improve a Visual Query?

Objects in real life aren’t bound by a box

Existing Methods

  • Relevance feedback
  • “Bag of visual words”
    • Scale-invariant feature transform (SIFT)
  • Cosine retrieval model
  • Language modeling
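As a point of reference for the baselines listed above, here is a minimal sketch of cosine retrieval over bags of visual words. It assumes each image has already been reduced to a list of integer visual-word IDs (for example, quantized SIFT descriptors). The function names and the toy database are hypothetical, and real systems typically add tf-idf weighting, which is omitted here.

```python
import math
from collections import Counter

def cosine_similarity(query_words, doc_words):
    """Cosine similarity between two bags of visual-word IDs (raw counts)."""
    q = Counter(query_words)
    d = Counter(doc_words)
    # Only words present in both bags contribute to the dot product.
    dot = sum(q[w] * d[w] for w in q.keys() & d.keys())
    norm_q = math.sqrt(sum(c * c for c in q.values()))
    norm_d = math.sqrt(sum(c * c for c in d.values()))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)

def rank_by_cosine(query_words, database):
    """Return database image names sorted from most to least similar."""
    scores = {name: cosine_similarity(query_words, words)
              for name, words in database.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: visual words are small integers (hypothetical data).
database = {"a": [1, 2, 2, 3], "b": [4, 5], "c": [2, 3, 9]}
ranking = rank_by_cosine([1, 2, 2, 3], database)
```

In this baseline, only the visual words inside the bounding box would be passed as `query_words`.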

Proposed COR Model

  • Based on the Kullback-Leibler retrieval model
    • Detect interest points
    • Extract SIFT descriptors
    • Convert into visual words
    • Match words to documents in a database
  • Uses the Jelinek-Mercer smoothing method
    • Captures important patterns, while removing noise
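The scoring step can be sketched as follows, under the usual textbook formulation rather than the authors' exact implementation: documents are ranked by the cross-entropy term of the negative KL divergence, sum over w of p(w|Q) log p(w|D), and the document model is Jelinek-Mercer smoothed as p(w|D) = (1 - lam) p_ml(w|D) + lam p(w|C). The function names, the toy data, and the value of `lam` are assumptions for illustration.

```python
import math
from collections import Counter

def jm_smoothed_prob(word, doc_counts, doc_len, coll_counts, coll_len, lam=0.5):
    """Jelinek-Mercer smoothing: mix document and collection estimates."""
    p_doc = doc_counts[word] / doc_len if doc_len else 0.0
    p_coll = coll_counts[word] / coll_len
    return (1.0 - lam) * p_doc + lam * p_coll

def kl_score(query_model, doc_words, coll_counts, coll_len, lam=0.5):
    """Rank-equivalent KL retrieval score: sum_w p(w|Q) * log p(w|D)."""
    doc_counts = Counter(doc_words)
    doc_len = len(doc_words)
    score = 0.0
    for word, p_q in query_model.items():
        p_d = jm_smoothed_prob(word, doc_counts, doc_len,
                               coll_counts, coll_len, lam)
        # Words absent from the whole collection are skipped; they affect
        # every document identically, so the ranking is unchanged.
        if p_q > 0 and p_d > 0:
            score += p_q * math.log(p_d)
    return score

# Toy collection of two "images" described by visual-word IDs.
docs = {"a": [1, 2, 3], "b": [3, 4, 5]}
coll_counts = Counter(w for words in docs.values() for w in words)
coll_len = sum(coll_counts.values())
query_model = {1: 0.5, 2: 0.5}  # p(w|Q), sums to 1
score_a = kl_score(query_model, docs["a"], coll_counts, coll_len)
score_b = kl_score(query_model, docs["b"], coll_counts, coll_len)
```

Higher (less negative) scores indicate a better match, so image "a", which contains both query words, ranks above "b".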

COR Search Intent Score

  • Standard LM approach uses binary search intent score
  • Two proposed algorithms to compute SI from bounding box with context:
    1. Based on pixel distance from bounding box (spatial propagation)
    2. Based on color coherence of the pixels (appearance propagation)
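To make the role of the search intent (SI) score concrete, here is a small sketch of how a query model could be built from it. It assumes that each query feature is a (visual word ID, position) pair and that a word's query probability is its accumulated SI score, normalized over all features. That weighting scheme and all names below are illustrative assumptions, not taken from the slides. The binary score shown corresponds to the standard approach: 1 inside the bounding box, 0 outside.

```python
from collections import defaultdict

def binary_search_intent(point, box):
    """Standard binary SI: 1 inside the bounding box, 0 outside."""
    x, y = point
    x_min, y_min, x_max, y_max = box
    inside = x_min <= x <= x_max and y_min <= y <= y_max
    return 1.0 if inside else 0.0

def build_query_model(features, intent_fn):
    """Turn (word_id, (x, y)) features into a distribution p(w|Q).

    intent_fn maps a position to an SI score in [0, 1]; each feature
    contributes its SI score to its visual word.
    """
    weights = defaultdict(float)
    for word, point in features:
        weights[word] += intent_fn(point)
    total = sum(weights.values())
    if total == 0:
        return {}
    return {w: v / total for w, v in weights.items() if v > 0}

# Hypothetical query: three features inside the box, one outside.
box = (0, 0, 10, 10)
features = [(7, (5, 5)), (7, (6, 6)), (9, (5, 8)), (3, (50, 50))]
model = build_query_model(features, lambda p: binary_search_intent(p, box))
```

Swapping `binary_search_intent` for a soft score lets features outside the box contribute with a reduced weight instead of being discarded.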

Spatial Propagation (CORa)

  • Bounding box is usually rough and inaccurate
    • Lack of user effort
    • Limiting rectangular shape
  • Use smoothed approximation of bounding box
    • Dual-sigmoid function
    • Uses  as a control variable

Appearance Propagation (CORm)

  • Assign high scores to object of interest, normally in foreground
  • Assign low scores to background objects, or objects of no interest
  • Similar to image matting
    • Separate foreground and background using alpha values
    • Separate relevant objects from irrelevant in bounding box

Appearance Propagation (CORm)

Three step approach:

  1. Estimate foreground and background models guided by bounding box
    • GrabCut algorithm
  2. Use models to select foreground and background pixels
  3. Search intent score estimated based on pixel information
    • Use pseudo-foreground and -background pixels to account for spatial smoothness
    • Top 10% of foreground pixels from inside box and top 20% of background pixels from outside box
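The idea behind the three steps can be illustrated with a deliberately simplified stand-in. Instead of GrabCut's color models, the sketch below uses a single mean color for the foreground (pixels inside the box) and for the background (pixels outside), and it skips the pseudo-pixel selection and the spatial smoothness term entirely. All names and the toy image are hypothetical; this is not the CORm algorithm itself, only a sketch of how color similarity can assign per-pixel scores.

```python
def mean_color(pixels):
    """Average (r, g, b) of a non-empty list of colors."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def sq_dist(a, b):
    """Squared Euclidean distance between two colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def appearance_intent(image, box):
    """image: dict (x, y) -> (r, g, b). Returns dict (x, y) -> score in [0, 1]."""
    x_min, y_min, x_max, y_max = box

    def in_box(pos):
        return x_min <= pos[0] <= x_max and y_min <= pos[1] <= y_max

    inside = [c for pos, c in image.items() if in_box(pos)]
    outside = [c for pos, c in image.items() if not in_box(pos)]
    if not inside or not outside:
        raise ValueError("box must leave pixels both inside and outside")
    fg, bg = mean_color(inside), mean_color(outside)
    scores = {}
    for pos, color in image.items():
        d_fg, d_bg = sq_dist(color, fg), sq_dist(color, bg)
        # Score is high when the color is closer to the foreground mean.
        scores[pos] = 0.5 if d_fg + d_bg == 0 else d_bg / (d_fg + d_bg)
    return scores

# Toy 6x6 image: blue background, red object that does not match the box.
image = {(x, y): (0, 0, 255) for x in range(6) for y in range(6)}
for x in range(1, 4):
    for y in range(1, 4):
        image[(x, y)] = (255, 0, 0)
image[(1, 1)] = (0, 0, 255)  # background pixel inside the rough box
image[(4, 2)] = (255, 0, 0)  # object pixel outside the rough box
scores = appearance_intent(image, (1, 1, 3, 3))
```

In this toy image, the red pixel outside the box gets a high score and the blue pixel inside the box gets a low one, which is the behavior the rough rectangular box alone cannot provide.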

Experiments

  • Experiments performed using 3 image datasets:
    1. Oxford5K
    2. Oxford5K+ImageNet500K
    3. Web1M
  • Datasets 1 and 2 use 11 landmarks (55 total images) as queries
  • Dataset 3 adds an additional 45 images
    • Randomly selected
    • Various categories

Experiments

  • COR models compared to 2 baseline retrieval models:
    1. Cosine
    2. General language modeling (context-unaware)
  • Baseline models only use visual words from inside bounding box
  • All models evaluated in terms of average precision (AP)
  • AP values over all queries are averaged to obtain mean average precision (MAP)
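For readers unfamiliar with the metric, here is a short sketch of AP and MAP using the common non-interpolated definition: the precision at each rank where a relevant image appears, averaged over all relevant images. The evaluation protocol of a specific benchmark may differ in details (for example, in how ambiguous images are handled), so treat this as an assumption. The function names and toy rankings are hypothetical.

```python
def average_precision(ranked, relevant):
    """Non-interpolated AP for one query.

    ranked: list of retrieved item IDs, best first.
    relevant: collection of item IDs that are correct for this query.
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / len(relevant)

def mean_average_precision(runs):
    """MAP over a list of (ranked, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)

# Two toy queries.
runs = [
    (["a", "x", "b", "y"], {"a", "b"}),  # AP = (1/1 + 2/3) / 2
    (["x", "a"], {"a"}),                 # AP = 1/2
]
map_score = mean_average_precision(runs)
```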

AP for different landmarks on the Oxford5K dataset.