A Case Base Similarity Framework

Hugh R. Osborne and Derek G. Bridge

University of York

Abstract. Case based systems typically retrieve cases from the case base by applying similarity measures. The measures are usually constructed in an ad hoc manner. This paper presents a theoretical framework for the systematic construction of similarity measures. In addition to paving the way to a design methodology for similarity measures, this systematic approach facilitates the identification of opportunities for parallelisation in case base retrieval.

1 Case Memory Systems

In this paper we present a framework for the construction of similarity measures. Great flexibility is achieved by constructing complex similarity measures from more basic measures using a variety of connectives that we define. The concepts introduced in this paper are illustrated by an extensive example in the appendix.

A case memory system will be considered to consist of a case base and a retrieval mechanism. The case base will be modelled as a finite set of cases, equipped with projection functions for accessing the component elements of these cases. While the cases in the example in the appendix are all tuples, and the projection functions the standard projection functions for tuples, a case may be a more complex structure, with correspondingly more complex projection functions. The projection functions might even implement considerable inferencing [9, 14], perhaps to obtain "deep" features [3] from "surface" features.

A retrieval request is presented to the system as a pair, consisting of an element, $\vartheta$, of the set of cases and a similarity measure. The case $\vartheta$, known as the seed, will, in combination with the similarity measure, represent the "best possible" case. This is in contrast to the earliest approaches, e.g. [11], in which the seed was the ideal case, and the similarity measure measured the closeness of retrieved cases to this ideal. In the approach taken here, the similarity measure can, for example, include negation, so that distance from, rather than closeness to, the seed becomes the measure of suitability.

We take a very general view of what cases are. The problem description, its wider situation or context, its solution, the solution's outcome, etc. may all be features that may be projected from a case. In some case memory implementations, only a subset of these might be stored directly as fields of the cases; others might be part of an indexing structure (as, e.g., with the explanations of case applicability in [5]). However, even in these systems, the information has to be associated with the case in some fashion, and so we can, without loss of generality, assume that the information can be obtained by applying a projection function to a case.

* In: Proceedings of the 3rd European Workshop on Case-Based Reasoning, EWCBR'96, Advances in Case-Based Reasoning, Ian Smith and Boi Faltings (Eds.), Lecture Notes in Artificial Intelligence 1168, Springer Verlag, 1996.

By taking this broad view of what cases are and by allowing similarity measures to apply to any of the features that can be projected from the case, our framework also encompasses proposals that retrieval should be sensitive to aspects of the case other than the problem description (e.g. the adaptability of the solution, as in [12, 24]). If, on the other hand, only problem descriptions are to be compared, the similarity measure will be designed to ignore other features.
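The vocabulary introduced so far (case base, projection function, seed, similarity measure, retrieval request) can be illustrated with a minimal sketch. Everything named in it is an assumption made for illustration: the `Meal` cases, the `calories` projection, the measure `close_in_calories`, and in particular the reading of a relation `rel(x, y)` as "y is at least as good as x" and of retrieval as selecting the cases that no other case strictly beats. It is not taken from the paper's formal definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Meal:
    """A hypothetical case: a tuple of features."""
    name: str
    calories: int


def calories(case):
    """Projection function: extracts one feature from a case."""
    return case.calories


def close_in_calories(seed):
    """Similarity measure: applied to a seed it yields a relation on cases.

    rel(x, y) holds when y is at least as close to the seed as x.
    """
    def rel(x, y):
        return abs(calories(y) - calories(seed)) <= abs(calories(x) - calories(seed))
    return rel


def negate(measure):
    """Connective: logical negation of the relation a measure produces."""
    return lambda seed: (lambda x, y: not measure(seed)(x, y))


def retrieve(case_base, seed, measure):
    """Answer a retrieval request (seed, measure) against a case base.

    Returns the cases x such that every y related to x is related back.
    """
    rel = measure(seed)
    return {x for x in case_base if all(rel(y, x) for y in case_base if rel(x, y))}


CASE_BASE = {Meal("salad", 300), Meal("pasta", 500), Meal("roast", 900)}
SEED = Meal("ideal", 450)
```

With these assumptions, `retrieve(CASE_BASE, SEED, close_in_calories)` selects the pasta, while the negated measure selects the roast, the case furthest from the seed.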

Finally, we should note that, in systems in which the case memory is indexed, case base interrogation is often a two-stage process [13, 1]: a retrieval step exploits the indexes to restrict computational effort to cases that are similar to the seed on those characteristics encoded as indexes, but the final ranking and case selection requires application of a similarity measure to this retrieved set of cases.

There is a sense in which this two-step process is equivalent to the application of two similarity measures: one that is "hard-coded" as indexes, and one that is then applied to the results of the application of the first. From this point of view, our framework encompasses systems of this kind.

In passing, we note that at stake here is whether to take a representational or a computational view of similarity [19], or, in Richter's terminology [21], whether the similarity measure is compiled knowledge or knowledge that is interpreted at run-time. In the representational approach, cases reside in a data structure, such as a DAG, where, e.g., proximity in the data structure denotes similarity. Representational approaches can afford considerable efficiency in retrieval. The data structure is effectively optimised towards retrieval according to the "hard-coded" similarity measure. This can be of especial value when similarity assessment requires the application of large bodies of domain-specific knowledge: that knowledge will be applied once per case at case base update time, rather than being applied afresh on every retrieval [19]. However, this form of optimisation can lead to a loss of flexibility [15] as it may be hard or inefficient to access the case base in different ways as might be needed to give more context-sensitivity or to use the case base for multiple tasks [4].

The computational approach, on the other hand, will, in its most extreme form, compute similarity "from scratch" on each retrieval. This can be a flexible approach as nothing is hard-coded; it may be more amenable to user manipulation of the similarity measure (as allowed in many case-based reasoning shells, e.g., [10]) or even manipulation through some learning process, e.g. [22]; but there may be an efficiency price to be paid. (A "spin-off" of our own work has been the identification of opportunities for parallelisation in pure computational approaches using our similarity framework [17], and this may help to make computational approaches more widely usable. See also [16].) The two-stage process mentioned above is clearly a compromise between pure representational and pure computational approaches. We repeat that

[Fig. 1 presents six atomic similarity measures, each illustrated by the ordering it induces on the values 1 to 5 for seed 3. The diagrams are not reproduced here. The definitions of minimal, maximal, best and id compare $x$, $y$ and $\vartheta$ under the underlying ordering; their comparison symbols are not legible in this copy.]

flat: No elements are related. $x\ (\mathit{flat}\ \vartheta)\ y \equiv x = y$

is: The seed is better than all others. $x\ (\mathit{is}\ \vartheta)\ y \equiv (y = \vartheta) \lor (x = y)$

minimal: The seed is a minimum requirement.

maximal: The seed is a maximum requirement.

best: The seed is best.

id: Ignore the seed.

Fig. 1. Atomic similarity measures
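The two atomic measures whose definitions survive intact, flat and is, are small enough to try out directly. The sketch below is illustrative only: the helper `maxima` and its definition (an element is kept when everything it is related to is related back to it) are assumptions, since the paper's own definition of maxima is not part of this excerpt.

```python
def flat(seed):
    """flat: no two distinct elements are related."""
    return lambda x, y: x == y


def is_(seed):
    """is: every element is related to the seed, and to itself."""
    return lambda x, y: y == seed or x == y


def maxima(rel, values):
    """Assumed notion of maxima: x is kept if rel(x, y) always implies rel(y, x)."""
    return [x for x in values if all(rel(y, x) for y in values if rel(x, y))]


VALUES = [1, 2, 3, 4, 5]

print(maxima(flat(3), VALUES))  # every value is maximal under flat
print(maxima(is_(3), VALUES))   # only the seed is maximal under is
```

Under these assumptions flat leaves all five values as maxima, while is singles out the seed 3, which agrees with the glosses "No elements are related" and "The seed is better than all others".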

element and the seed. The range of the distance function is the ordered set $\mathbb{N}_\infty = \mathbb{N} \cup \{\infty\}$. The definition of the distance function $\mapsto$ makes use of three other functions: $\mathit{depth}$ (giving the depth of an element in a tree), $\mathit{sub}_{\mathit{Tree}}$ (the subtrees of a tree), and $\in_{\mathit{Tree}}$ (which tests if an element appears in a tree). These functions are defined in Fig. 2. Since $\vartheta\ (\mapsto t)$ clearly defines a function from elements to the ordered set $\mathbb{N}_\infty$, it can be used to define a similarity measure generator for trees, $<_{\mathit{Tree}}$, also given in Fig. 2.

Directed Acyclic Graphs. The functions in Fig. 2 have been defined in such a way that they can easily be adapted to apply to DAGs. Details can be found in [18]. The reader should note that graphs are being used in this paper to define distances between elements and seeds. This is quite distinct from assessing the similarity of two graph structures by some sort of subgraph algorithm, as is

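$$
\begin{aligned}
\mathit{depth} &:: \mathit{elem} \to \mathit{Tree}\ \mathit{elem} \to \mathbb{N}_\infty \\
\mathit{depth}\ e\ (\mathit{Leaf}\ l) &= 0, \text{ if } e = l \\
&= \infty, \text{ otherwise} \\
\mathit{depth}\ e\ (\mathit{Node}\ n) &= 1 + \min \{\mathit{depth}\ e\ t \mid t \in n\} \\[4pt]
\mathit{sub}_{\mathit{Tree}} &:: \mathit{Tree}\ \mathit{elem} \to \{\mathit{Tree}\ \mathit{elem}\} \\
\mathit{sub}_{\mathit{Tree}}\ (\mathit{Leaf}\ l) &= \{\mathit{Leaf}\ l\} \\
\mathit{sub}_{\mathit{Tree}}\ (\mathit{Node}\ n) &= \{\mathit{Node}\ n\} \cup \Big(\bigcup_{t \in n} \mathit{sub}_{\mathit{Tree}}\ t\Big) \\[4pt]
\in_{\mathit{Tree}} &:: \mathit{elem} \to \mathit{Tree}\ \mathit{elem} \to \mathit{bool} \\
e \in_{\mathit{Tree}} (\mathit{Leaf}\ l) &= (e = l) \\
e \in_{\mathit{Tree}} (\mathit{Node}\ n) &= \exists t \in n : e \in_{\mathit{Tree}} t \\[4pt]
\mapsto &:: \mathit{Tree}\ \mathit{elem} \to \mathit{elem} \to \mathit{elem} \to \mathbb{N}_\infty \\
\vartheta\ (\mapsto t)\ e &= \min \{\mathit{depth}\ e\ t' \mid t' \in (\mathit{sub}_{\mathit{Tree}}\ t) \land \vartheta \in_{\mathit{Tree}} t'\} \\[4pt]
<_{\mathit{Tree}} &:: \mathit{Tree}\ \mathit{elem} \to \mathit{elem} \to (\mathit{elem} \to \mathit{elem} \to \mathit{bool}) \\
e_1\ (<_{\mathit{Tree}} t\ \vartheta)\ e_2 &= \vartheta\ (\mapsto t)\ e_1 < \vartheta\ (\mapsto t)\ e_2
\end{aligned}
$$

Fig. 2. Similarity measure generating functions for trees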
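The functions of Fig. 2 translate almost line for line into executable form. The following is a sketch under stated assumptions: the Python names `subtrees`, `contains`, `distance` and `closer` are chosen here for the subtree function, the membership test, the distance function and the generated ordering, the minimum of an empty set is taken to be infinity, and the example tree is hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    value: str


@dataclass(frozen=True)
class Node:
    children: tuple  # tuple of Leaf / Node


def depth(e, t):
    """Depth of element e in tree t, or infinity if e does not occur."""
    if isinstance(t, Leaf):
        return 0 if e == t.value else math.inf
    return 1 + min((depth(e, c) for c in t.children), default=math.inf)


def subtrees(t):
    """All subtrees of t, including t itself."""
    if isinstance(t, Leaf):
        return [t]
    return [t] + [s for c in t.children for s in subtrees(c)]


def contains(e, t):
    """True if element e appears somewhere in tree t."""
    if isinstance(t, Leaf):
        return e == t.value
    return any(contains(e, c) for c in t.children)


def distance(t, seed, e):
    """Least depth of e over all subtrees of t that contain the seed."""
    return min((depth(e, s) for s in subtrees(t) if contains(seed, s)),
               default=math.inf)


def closer(t, seed):
    """The ordering generated from the distance, as in the last line of Fig. 2."""
    return lambda e1, e2: distance(t, seed, e1) < distance(t, seed, e2)


# Hypothetical tree: two inner nodes with two leaves each.
TREE = Node((Node((Leaf("a"), Leaf("b"))), Node((Leaf("c"), Leaf("d")))))
```

For seed `"a"` this gives distance 0 to `"a"`, 1 to its sibling `"b"` and 2 to `"c"` and `"d"`; an element that does not occur in the tree is at infinite distance.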

found in many case based reasoning systems [2, 6]. There is, however, no reason why such an algorithm could not be used to define an ordering on graphs, and then apply this ordering in the way presented in this paper.

Graphs. A similar construction can be used for graphs in general, by defining the distance function to return the length of the shortest path between the seed and an element.

User Defined Types. The same method can be applied to user defined types. A metric should be defined giving the "closeness" of an element to a seed, and this can then be used to define an ordering on that type. Indeed this can be used to implement more representational approaches [19], with the user defined type being some representation of the positioning of a case in a structured case base. The similarity measure will then reflect the indexing of cases in the case base structure.
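For general graphs, the shortest-path distance mentioned above can be computed by breadth-first search. This is a minimal sketch, assuming an unweighted graph stored as an adjacency dictionary; the function name `graph_distance` and the example graph are hypothetical.

```python
import math
from collections import deque


def graph_distance(adjacency, seed, element):
    """Length of the shortest path from seed to element, or infinity if none."""
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == element:
            return dist
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return math.inf


def closer(adjacency, seed):
    """Ordering on elements generated from the distance to the seed."""
    return lambda e1, e2: (graph_distance(adjacency, seed, e1)
                           < graph_distance(adjacency, seed, e2))


# Hypothetical undirected graph; "e" is isolated.
GRAPH = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b"], "d": ["a"], "e": []}
```

Here `graph_distance(GRAPH, "a", "c")` is 2, and the isolated element `"e"` is at infinite distance from every other seed.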

2.3 Boolean Connectives

Sections 2.1 and 2.2 presented a repertoire of similarity measures for individual features of a case. It is now necessary to consider how to combine these similarity measures to form more complex similarity measures for whole cases. The first obvious candidates for combining orderings are the usual boolean operators. These are covered in this section. The section starts with a presentation of the application of boolean operators to construct complex similarity

ensures that the intersection of the maxima of two relations will be an acceptable approximation of the maxima of the disjunction of those relations. These and the other properties given in this paper are stated without proof. Proofs are given in [18].

A sufficient condition for the inclusion in Prop. 2 to be an equality is that the two relations involved have a degree of consistency in their inverses. If $x$ is less than $y$ in the first ordering, and greater than $y$ in the second, then it must also be greater than $y$ in the first, and vice versa. I.e. if, for all $x$ and $y$ in $S$:

$$(x \preceq_1 y \land y \preceq_2 x \Rightarrow y \preceq_1 x) \land (x \preceq_2 y \land y \preceq_1 x \Rightarrow y \preceq_2 x) \qquad (1)$$

then

$$\sqcap\,({\preceq_1} \mathbin{\hat\lor} {\preceq_2})\ S = ((\sqcap\,{\preceq_1}) \mathbin{\hat\cap} (\sqcap\,{\preceq_2}))\ S.$$

Since this condition holds if ${\preceq_1} \mathbin{\hat\lor} {\preceq_2}$ is a partial order, then, in this case, the intersection of the maxima will be the maxima of the disjunction.

These results can be applied to determine the maxima of a relation constructed by application of the boolean operators. This can be done by taking the disjunctive normal form of the boolean expression and determining, in parallel, the maxima of the constituent terms of the normal form, and then taking the intersection.
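The relationship between the maxima of a disjunction and the intersection of the maxima of its terms can be checked on small examples. The sketch below rests on assumptions: `maxima` keeps an element when everything it is related to is related back to it (the paper's own definition is not part of this excerpt), and the two relations and both case sets are hypothetical.

```python
def maxima(rel, cases):
    """Assumed notion of maxima: x is kept if rel(x, y) always implies rel(y, x)."""
    return [x for x in cases if all(rel(y, x) for y in cases if rel(x, y))]


def either(r1, r2):
    """Disjunction of two relations."""
    return lambda x, y: r1(x, y) or r2(x, y)


def satisfies_condition_1(r1, r2, cases):
    """Condition (1), checked exhaustively over a finite set of cases."""
    return all(
        (not (r1(x, y) and r2(y, x)) or r1(y, x))
        and (not (r2(x, y) and r1(y, x)) or r2(y, x))
        for x in cases for y in cases
    )


def by_a(x, y):
    return x[0] <= y[0]


def by_b(x, y):
    return x[1] <= y[1]


def intersection_of_maxima(r1, r2, cases):
    second = maxima(r2, cases)
    return [x for x in maxima(r1, cases) if x in second]


AGREEING = [(1, 1), (2, 3), (3, 5)]   # both features rank the cases the same way
CONFLICTING = [(1, 2), (2, 1)]        # the features disagree
```

On `AGREEING`, condition (1) holds and both computations give `[(3, 5)]`. On `CONFLICTING`, condition (1) fails: the intersection of the maxima is empty, whereas the disjunction relates the two cases in both directions and so keeps both. The intersection is then only a subset of the maxima of the disjunction.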

2.4 Filters

Another possibility is to first "filter" the set through some predicate before applying the maximising function. Normally a filter will take a predicate, and when applied to a set will give a subset of that set. In keeping with the approach taken in the rest of this paper a filter here will take a relation over a type, and apply it to a seed to give the predicate that will be applied to a set. The symbol "$\triangleleft$" will be used for filters.

$$
\begin{aligned}
\triangleleft &:: (\alpha \to \alpha \to \mathit{bool}) \to \alpha \to \{\alpha\} \to \{\alpha\} \\
\triangleleft\ {\preceq}\ \vartheta\ S &= \{x \in S \mid x \preceq \vartheta\}
\end{aligned}
$$

Filters can be used to express concepts such as "only when" and "except when". Filters can also be used to construct more complex preferences. A feature of a case may be a set of constituents. Filters can then be used to select cases containing a minimum (or maximum) set of constituents, to eliminate cases containing (or not containing) some specific constituent, or even, in combination with the operators given in Sect. 2.1, to order cases according to the closeness of their list of constituents to some ideal.

Filters can be expressed as similarity measures. Given a relation $\preceq$ that is to be applied in a filter, it is possible to define a similarity measure $\overline{\preceq}$ such that, except for one special case, $\triangleleft\,{\preceq} = \sqcap\,\overline{\preceq}$. The exception is when $\triangleleft\,{\preceq}\ S = \emptyset$, in which case $\sqcap\,\overline{\preceq}\ S = S$.

Property 3 Let $\preceq$ be any binary relation. Define $\overline{\preceq}$ by $\overline{\preceq}\ \vartheta\ x\ y = y \preceq \vartheta$. Then $\triangleleft\,{\preceq} = \sqcap\,\overline{\preceq}$.
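Property 3 and its exceptional case can be seen on a small numeric example. This is a sketch, not the paper's construction: `maxima` uses an assumed definition (an element is kept when everything it is related to is related back to it), and `filter_by` and `as_measure` are names chosen here for the filter and for the similarity measure derived from the filter's relation.

```python
import operator


def maxima(rel, cases):
    """Assumed notion of maxima: x is kept if rel(x, y) always implies rel(y, x)."""
    return [x for x in cases if all(rel(y, x) for y in cases if rel(x, y))]


def filter_by(rel, seed, cases):
    """The filter: keep the cases x for which rel(x, seed) holds."""
    return [x for x in cases if rel(x, seed)]


def as_measure(rel):
    """The derived measure of Property 3: x is related to y when y passes the filter."""
    return lambda seed: (lambda x, y: rel(y, seed))


PRICES = [5, 8, 12, 20]

# Usual case: some prices are below the seed, so filter and maxima agree.
cheap = filter_by(operator.lt, 10, PRICES)
cheap_via_maxima = maxima(as_measure(operator.lt)(10), PRICES)

# Exceptional case: nothing is below the seed, so the filter is empty
# while the maxima are the whole set.
none_pass = filter_by(operator.lt, 3, PRICES)
all_maximal = maxima(as_measure(operator.lt)(3), PRICES)
```

With seed 10 both routes yield `[5, 8]`. With seed 3 the filter yields the empty list while the maxima are all four prices, which is the special case noted above.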

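Applying $\overline{\preceq}$ makes it possible to combine filters with similarities by applying the two following properties:

Property 4 $(\triangleleft\,{\preceq}) \mathbin{\hat\circ} (\sqcap\,\sigma) = \sqcap\,(\overline{\preceq} \mathbin{\hat\lor} \sigma)$;

Property 5 $(\sqcap\,\sigma) \mathbin{\hat\circ} (\triangleleft\,{\preceq}) = \sqcap\,(\overline{\preceq} \mathbin{\hat\lor} \sigma \mathbin{\hat\land} \overline{\preceq})$.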

2.5 Priorities and Preferences

Another possible type of connective is one which will take one similarity measure as being more significant than another. There are two possible approaches to this. The first applies to the similarity measures themselves, the second to the process of determining maxima. The first of these will be referred to as a priority (after [23]), the second as a preference (after [8]).

Priorities. The prioritisation of relation $\preceq_1$ over relation $\preceq_2$, written here as ${\preceq_1} \gg {\preceq_2}$, is a generalisation of lexicographic ordering defined for relations, and is the relation defined by:

Definition 3. $x\ ({\preceq_1} \gg {\preceq_2})\ y = (x \preceq_1 y \land \lnot(y \preceq_1 x)) \lor (x \preceq_1 y \land y \preceq_1 x \land x \preceq_2 y)$.

The two terms ($x \preceq_1 y \land \lnot(y \preceq_1 x)$ and $x \preceq_1 y \land y \preceq_1 x \land x \preceq_2 y$) in this disjunction satisfy (1), since both antecedents in this condition will be false. As a consequence, the intersection of the maxima of the two terms will be equal to the maxima of the prioritisation. A prioritisation of similarity measures is a prioritisation of relations "lifted" to similarity measures:

$$(\sigma_1 \gg \sigma_2)\ p = (\sigma_1\ p) \gg (\sigma_2\ p).$$

Property 6 When taking maxima of priorities the first term ($x \preceq_1 y \land \lnot(y \preceq_1 x)$) may be replaced by $x \preceq_1 y$, since $x \preceq_1 y \land \lnot(y \preceq_1 x) \Rightarrow y \preceq_1 x \land \lnot(x \preceq_1 y)$ is equivalent to $x \preceq_1 y \Rightarrow y \preceq_1 x$.

The prioritisation of similarity $\sigma_1$ over similarity $\sigma_2$ will therefore be defined as:

Definition 4. $\sigma_1 \gg \sigma_2 = \sigma_1 \mathbin{\hat\lor} (\sigma_1 \mathbin{\hat\land} \sigma_1^{-1} \mathbin{\hat\land} \sigma_2)$,

thus avoiding the need for negation.
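The claim that the maxima of a prioritisation equal the intersection of the maxima of its two terms can be tried out on a small case set. The sketch below makes several assumptions: `maxima` keeps an element when everything it is related to is related back to it, the relations `by_a` and `by_b` and the cases are hypothetical, and Definition 4 is read term by term, that is, as the intersection of the maxima of its two disjuncts.

```python
def maxima(rel, cases):
    """Assumed notion of maxima: x is kept if rel(x, y) always implies rel(y, x)."""
    return [x for x in cases if all(rel(y, x) for y in cases if rel(x, y))]


def priority(r1, r2):
    """Definition 3: r1 decides; r2 only breaks ties left by r1."""
    return lambda x, y: (
        (r1(x, y) and not r1(y, x))
        or (r1(x, y) and r1(y, x) and r2(x, y))
    )


def priority_maxima_termwise(r1, r2, cases):
    """Definition 4 read term by term: intersect the maxima of the two disjuncts."""
    def tie_break(x, y):
        return r1(x, y) and r1(y, x) and r2(x, y)
    second = maxima(tie_break, cases)
    return [x for x in maxima(r1, cases) if x in second]


def by_a(x, y):
    return x[0] <= y[0]


def by_b(x, y):
    return x[1] <= y[1]


CASES = [(1, 9), (2, 1), (2, 5), (2, 3)]
```

Both `maxima(priority(by_a, by_b), CASES)` and `priority_maxima_termwise(by_a, by_b, CASES)` return `[(2, 5)]`: the first component decides, and the second only separates the three cases that tie on it. The term-wise version never evaluates a negation.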

Property 7 Prioritisation distributes to the right over disjunction and, when taking maxima, if the two similarity measures being disjoined satisfy (1), also to the left.

Preferences. An alternative approach is to first select maxima for the first similarity measure, and then take the maxima over these according to the second, i.e. the second similarity measure is applied only to discriminate between the maxima of the first. The preference of similarity measure $\sigma_1$ over similarity measure $\sigma_2$, written here as $\sigma_1 \triangleright \sigma_2$, is defined by:

Definition 5. $\sqcap\,(\sigma_1 \triangleright \sigma_2) = (\sqcap\,\sigma_2) \circ (\sqcap\,\sigma_1)$.
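A preference is a composition of two maxima computations, and it can select differently from a priority when the first relation leaves cases incomparable. The sketch below is illustrative: `maxima` uses an assumed definition (an element is kept when everything it is related to is related back to it), `priority` follows Definition 3, and the relations and cases are hypothetical.

```python
def maxima(rel, cases):
    """Assumed notion of maxima: x is kept if rel(x, y) always implies rel(y, x)."""
    return [x for x in cases if all(rel(y, x) for y in cases if rel(x, y))]


def preference_maxima(r1, r2, cases):
    """Definition 5: take the maxima under r1, then the maxima of those under r2."""
    return maxima(r2, maxima(r1, cases))


def priority(r1, r2):
    """Definition 3, included only for comparison."""
    return lambda x, y: (
        (r1(x, y) and not r1(y, x))
        or (r1(x, y) and r1(y, x) and r2(x, y))
    )


def same_a(x, y):
    """Relates only cases with equal first component; other pairs are incomparable."""
    return x[0] == y[0]


def by_b(x, y):
    return x[1] <= y[1]


CASES = [(1, 9), (2, 1), (2, 5)]
```

Every case is maximal under `same_a`, so the preference then compares all three on the second component and returns `[(1, 9)]`. The priority only lets `by_b` discriminate between cases that `same_a` relates, so its maxima are `[(1, 9), (2, 5)]`.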

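$$
\begin{aligned}
\mathit{flat}\ e\ x &= \infty \\
\mathit{is}\ e\ x &= 0, \text{ if } f\,x = f\,e \\
&= \infty, \text{ otherwise} \\
\mathit{maximal}\ e\ x &= f\,e - f\,x, \text{ if } f\,x \le f\,e \\
&= \infty, \text{ otherwise} \\
\mathit{minimal}\ e\ x &= f\,e - f\,x, \text{ if } f\,x \ge f\,e \\
&= \infty, \text{ otherwise} \\
\mathit{best}\ e\ x &= |f\,x - f\,e| \\
\mathit{id}\ e\ x &= f\,x
\end{aligned}
$$

Fig. 3. Cardinal similarity measures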
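Four of the cardinal measures in Fig. 3 (flat, is, best and id) can be written down directly. In this sketch `f` is the projection function the measure is built on, lower scores are taken to mean greater similarity, and the helper `lowest_scoring` and the example values are assumptions made for illustration; maximal and minimal are left out.

```python
import math


def flat(f):
    """Every case scores infinity."""
    return lambda e, x: math.inf


def is_(f):
    """Score 0 when the feature equals the seed's feature, infinity otherwise."""
    return lambda e, x: 0 if f(x) == f(e) else math.inf


def best(f):
    """Score is the absolute difference from the seed's feature."""
    return lambda e, x: abs(f(x) - f(e))


def ident(f):
    """Score is the feature value itself; the seed is ignored."""
    return lambda e, x: f(x)


def lowest_scoring(measure, seed, cases):
    """Cases with the lowest score (assumed here to be the most similar)."""
    scores = [measure(seed, x) for x in cases]
    low = min(scores)
    return [x for x, s in zip(cases, scores) if s == low]


def value(x):
    """Hypothetical projection: the cases are plain numbers."""
    return x


VALUES = [1, 2, 4, 5]
```

For seed 3, `best` selects `[2, 4]` and `ident` selects `[1]`. Since no value equals the seed, `is_` gives every value the score infinity, as `flat` always does, so both leave all four values tied.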

Other structures. It is also possible to derive cardinal similarity measures from the structures discussed in Sect. 2.2 (trees, DAGs, graphs, user defined structures). All that is required is that a distance function (such as $\mapsto$ in Sect. 2.2) be defined for these structures which can then be used to give the "score" for each case, rather than using the distance to define an ordering, as was done in Sect. 2.2. This could even be applied to numeric valued features by defining a distance function on numbers (e.g. a logarithmic distance function) and applying this directly. This provides an alternative to defining some function $f$ and using operations such as $\mathit{best}$.

3.2 Combining Numeric Measures

Numeric similarity measures can be combined using basic arithmetic operations in a manner analogous to that presented in Sect. 2.3. Using the operators $+$, $-$ (unary and binary), and $\times$, measures can be added, subtracted, multiplied and, using the $\mathit{const}_n$ measures and multiplication, weighted. Again, a normal form can be derived (a sum of products) and the products computed in parallel.
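The arithmetic combination of numeric measures can be sketched with a few higher-order functions. All names here (`const`, `add`, `sub`, `mul`, `best`, and the `cal` and `spic` projections on tuple-shaped cases) are illustrative assumptions; a measure is taken to be a function of a seed and a case that returns a number.

```python
def const(n):
    """Constant measure: scores every case n; used for weighting."""
    return lambda e, x: n


def add(m1, m2):
    return lambda e, x: m1(e, x) + m2(e, x)


def sub(m1, m2):
    return lambda e, x: m1(e, x) - m2(e, x)


def mul(m1, m2):
    return lambda e, x: m1(e, x) * m2(e, x)


def best(f):
    """Absolute difference between the case's feature and the seed's."""
    return lambda e, x: abs(f(x) - f(e))


def cal(case):
    """Hypothetical projection: calories."""
    return case[0]


def spic(case):
    """Hypothetical projection: spiciness."""
    return case[1]


# A sum of products: calories weighted twice as heavily as spiciness.
weighted = add(mul(const(2), best(cal)), best(spic))

SEED = (500, 3)
MEAL = (400, 5)
```

Here `weighted(SEED, MEAL)` evaluates to 2 × 100 + 2 = 202. Each product in such a sum depends only on its own measures, which is what allows the products to be evaluated independently.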

3.3 Switching Types of Similarity Measure

It is fairly easy to switch from cardinal to ordinal similarity measures. To transform a cardinal measure to an ordinal measure the cardinal information can be used to generate an ordinal measure by comparing values. If $N_\infty$ is a cardinal similarity measure an ordinal measure can be defined:

$$\sigma\ \vartheta\ x\ y = (N_\infty\ \vartheta\ x) \ge (N_\infty\ \vartheta\ y).$$

The scores determine the ordering. Obviously the cardinal information (how much more highly one case scores than another) is lost in the transformation.
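The transformation from a cardinal to an ordinal measure amounts to comparing scores. In the sketch below, the direction of the comparison, the definition of `maxima`, and the example measure `distance_from_seed` are all assumptions: a higher score is taken to mean a less similar case, and `rel(x, y)` is read as "y is at least as good as x".

```python
def to_ordinal(cardinal):
    """Build an ordinal measure by comparing the scores of two cases."""
    return lambda seed: (lambda x, y: cardinal(seed, x) >= cardinal(seed, y))


def maxima(rel, cases):
    """Assumed notion of maxima: x is kept if rel(x, y) always implies rel(y, x)."""
    return [x for x in cases if all(rel(y, x) for y in cases if rel(x, y))]


def distance_from_seed(seed, x):
    """Hypothetical cardinal measure on numbers."""
    return abs(x - seed)


VALUES = [1, 2, 4, 5]
ordinal = to_ordinal(distance_from_seed)
```

With seed 3, `maxima(ordinal(3), VALUES)` is `[2, 4]`. The ordinal measure still ranks 2 above 1, but no longer records that 1 is twice as far from the seed.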

The transformation from an ordinal measure to a cardinal measure is more complex. The problem is in creating cardinal information where none was previously available, and in transforming a partial order to a total order. One possibility is to take the number of cases that can be found between two cases as a cardinal measure of the similarity of those two cases. Note that, as before, the higher the measure, the less similar the cases. However, if the cases are incomparable, there will be no objects between the two cases, and the measure will have to be adapted to deal with this. A possible transformation is, therefore

$$
N_\infty\ \vartheta\ x =
\begin{cases}
\infty, & \text{if } |\{y \mid x\ (\sigma\ \vartheta)\ y\ (\sigma\ \vartheta)\ \vartheta \lor \vartheta\ (\sigma\ \vartheta)\ y\ (\sigma\ \vartheta)\ x\}| = 0 \\
|\{y \mid x\ (\sigma\ \vartheta)\ y\ (\sigma\ \vartheta)\ \vartheta \lor \vartheta\ (\sigma\ \vartheta)\ y\ (\sigma\ \vartheta)\ x\}|, & \text{otherwise}
\end{cases}
$$

Note that if $\vartheta$ and $x$ are related, then the cardinality of the set of elements appearing between $\vartheta$ and $x$ will never be zero, since both $\vartheta$ and $x$ appear "between" $\vartheta$ and $x$.

This measure can, however, give possibly counter-intuitive results. Consider the ordering, for seed A:

[Diagram not reproduced: an ordering on the elements A to H, drawn in four rows. From top to bottom the rows are: H; F, G; B, C, D, E; A.]

The cardinal measure proposed will return a higher (worse) value for F than for H, because there are five values between A and F (including A and F themselves), and only four between A and H. An alternative would be to take the shortest path between the elements, given by:

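$$
\mathit{minpath}\ x\ \vartheta =
\begin{cases}
\infty, & \text{if } \lnot(x\ (\sigma\ \vartheta)\ \vartheta) \land \lnot(\vartheta\ (\sigma\ \vartheta)\ x) \\
0, & \text{if } x = \vartheta \\
1 + \min \{\mathit{minpath}\ x'\ \vartheta \mid x' \in \mathit{neighbours}\}, & \text{otherwise}
\end{cases}
$$

where

$$\mathit{neighbours} = \{y \mid x \preceq y \land x \neq y \land \nexists z \notin \{x, y\} : x \preceq z \preceq y\}$$

$$
{\preceq} =
\begin{cases}
\sigma\ \vartheta, & \text{if } x\ (\sigma\ \vartheta)\ \vartheta \\
(\sigma\ \vartheta)^{-1}, & \text{otherwise.}
\end{cases}
$$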
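The two ordinal-to-cardinal transformations, counting the elements between a case and the seed and taking the shortest path, can be compared on a small ordering. The edges in `COVERS` below are hypothetical: they are chosen only so that the counts quoted in the text (five values between A and F, four between A and H) come out, and are not a transcription of the diagram. The function names are likewise illustrative.

```python
import math

# Hypothetical cover edges (lower, upper), chosen to match the counts in the text.
COVERS = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
          ("B", "F"), ("C", "F"), ("D", "F"), ("E", "G"), ("G", "H")]
ELEMENTS = sorted({e for edge in COVERS for e in edge})


def leq(x, y):
    """Reflexive, transitive closure of COVERS: x is below or equal to y."""
    return x == y or any(lo == x and leq(hi, y) for lo, hi in COVERS)


def between_count(rel, seed, x, elements):
    """Number of elements lying between x and the seed, or infinity if none."""
    n = sum(1 for y in elements
            if (rel(x, y) and rel(y, seed)) or (rel(seed, y) and rel(y, x)))
    return n if n else math.inf


def minpath(rel, seed, x, elements):
    """Length of the shortest chain of immediate neighbours from x to the seed."""
    if not rel(x, seed) and not rel(seed, x):
        return math.inf
    if x == seed:
        return 0
    # Orient the relation so that the seed lies in the direction of travel.
    order = rel if rel(x, seed) else (lambda a, b: rel(b, a))
    neighbours = [
        y for y in elements
        if y != x and order(x, y)
        and not any(z not in (x, y) and order(x, z) and order(z, y) for z in elements)
    ]
    return 1 + min((minpath(rel, seed, y, elements) for y in neighbours),
                   default=math.inf)
```

On this ordering `between_count` gives 5 for F and 4 for H, so F scores worse than H, whereas `minpath` gives 2 for F and 3 for H, which ranks F as the closer of the two.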

4 Conclusions

We have presented a repertoire of tools for constructing similarity measures, both numeric and symbolic. These tools make it possible to construct similarity measures systematically and/or incrementally, in which a more refined similarity measure is derived from the result of applying a simpler measure.

The implied loss of efficiency that the use of more flexible similarity measures entails can be compensated for by the opportunities this method offers for parallel evaluation.

The budgetary restraint can be achieved by applying a filter: $\triangleleft\ (\mathit{price}\ <)$. The simplest of the guest's requirements to model is their weight watching, as this is simply the inverse of the usual ordering on integers: $\mathit{cal}\ \mathit{id}^{-1}$. Their desire for hot dishes is also fairly simple, requiring an application of the $\mathit{best}$ similarity measure, which will "break the back" of the spiciness ordering at hot. The required dishes will be selected by $\mathit{spic}\ \mathit{best}$.

The remaining two preferences are slightly more complex. For the desired list of ingredients a directed acyclic graph can be defined (the standard lattice representing the subset ordering on sets of ingredients) and the distance function for DAGs applied to order sets of ingredients according to their proximity to the ideal set of ingredients. The ordering generated, which will be called $<_{\mathit{DAG}} \mathit{ingr}$, reflects the fact that the distance function in this DAG is a measure of the number of elements common to the seed and the set under consideration. The meals best fitting the desired list of ingredients will be selected by $\mathit{ingr}\ (<_{\mathit{DAG}} \mathit{ingr})$.

The final preference requires a tree to be defined representing a taxonomy of meat types, to which a distance function can be applied to select those meat types most similar to turkey. This tree is given in Fig. 5. The required meals will be selected by $\mathit{meat}\ (<_{\mathit{Tree}} \mathit{meat})$. The four orders discussed here are presented in Fig. 6.

[Fig. 5: tree diagram, not reproduced. Its node labels, row by row as extracted, are: lamb, beef, chicken, turkey; red, white; meat; non-veg.; fish; veg.; none.]

Fig. 5. A taxonomy of meat types

A.3 Retrieval

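Assume that we first wish to filter out the more expensive meals, and then select according to our guest's preferences. Assume also that our guest's preferred list of ingredients and desire for meat similar to turkey is to be given priority over their weight watching and preference for hot food. The "best" meals will be selected by:

$$((\mathit{ingr}\ (<_{\mathit{DAG}} \mathit{ingr}) \mathbin{\hat\land} \mathit{meat}\ (<_{\mathit{Tree}} \mathit{meat})) \gg (\mathit{cal}\ \mathit{id}^{-1} \mathbin{\hat\land} \mathit{spic}\ \mathit{best})) \mathbin{\hat\circ} (\mathit{price} \triangleleft <),$$

applied to $\vartheta$. Application of the transformations presented in this paper shows this to be equivalent to the disjunctive normal form:

$$
\begin{aligned}
&(\overline{\mathit{price}\ <}) \\
\mathbin{\hat\lor}\ &(\mathit{ingr}\ (<_{\mathit{DAG}} \mathit{ingr}) \mathbin{\hat\land} \mathit{meat}\ (<_{\mathit{Tree}} \mathit{meat}) \mathbin{\hat\land} \overline{\mathit{price}\ <}) \\
\mathbin{\hat\lor}\ &(\mathit{ingr}\ (<_{\mathit{DAG}} \mathit{ingr}) \mathbin{\hat\land} \mathit{meat}\ (<_{\mathit{Tree}} \mathit{meat}) \mathbin{\hat\land} \mathit{ingr}\ (<_{\mathit{DAG}} \mathit{ingr})^{-1} \mathbin{\hat\land} \mathit{meat}\ (<_{\mathit{Tree}} \mathit{meat})^{-1} \mathbin{\hat\land} \mathit{cal}\ \mathit{id}^{-1} \mathbin{\hat\land} \mathit{spic}\ \mathit{best} \mathbin{\hat\land} \overline{\mathit{price}\ <}).
\end{aligned}
$$

[Fig. 6: four ordering diagrams, not reproduced. As extracted, the levels of each diagram read, from top to bottom:
$\mathit{ingr}\ (<_{\mathit{DAG}} \mathit{ingr})$: lm, vb, cv, bb; ps, pl, cc; tm.
$\mathit{meat}\ (<_{\mathit{Tree}} \mathit{meat})$: cv, pl; lm, bb; tm; ps, vb, cc.
$\mathit{spic}\ \mathit{best}$ (two branches): cv, ps and vb; cc and lm, bb; tm, pl.
$\mathit{cal}\ \mathit{id}^{-1}$: vb, tm; lm, cv; bb, cc; pl, ps.]

Fig. 6. Some atomic similarity measures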

The three terms in this expression will select as maximal cases {lm, vb, ps, bb, pl}, {lm, bb, cv} and {lm, vb, cv, tm, pl, ps, cc} respectively, the intersection of which is {lm}. Consequently, the recommendation will be to serve the lamb casserole.

References

  1. A. Aamodt and E. Plaza. Case based reasoning: Foundational issues, methodological variations and system approaches. AI Communications, 7(1):39–59, 1994.
  2. R. Altermann. Adaptive planning. Cognitive Science, 12:393–421, 1988.
  3. K.D. Ashley and E.L. Rissland. A case-based approach to modeling legal expertise. IEEE Expert, 3(3):70–77, 1988.
  4. R. Bareiss, J.A. King, J. Ashley, K. Kolodner, B. Porter, and P. Thagard. Panel on "similarity metrics". In Proceedings of DARPA Case Based Reasoning Workshop, pages 66–84. Morgan Kaufmann, 1989.
  5. R. Barletta and W. Mark. Explanation-based indexing of cases. In Proceedings of AAAI-88, pages 541–546, 1988.
  6. M. Brown. A Memory Model for Case Retrieval by Activation Passing. PhD thesis, Department of Computer Science, University of Manchester, 1994. Technical Report 94-2-1.
  7. D. K. G. Campbell, H. R. Osborne, A. M. Wood, and D. G. Bridge. Generic operations for CBR in linda. Technical Report (to appear), Department of Computer Science, University of York, 1996.
  8. A.D. Griffiths and D.G. Bridge. Formalising the knowledge content of case memory systems. In Ian D. Watson, editor, Progress in Case-Based Reasoning: First United Kingdom Workshop in Case-Based Reasoning, pages 32–41. Springer Verlag Lecture Notes in Computer Science; 1020; Lecture Notes in Artificial Intelligence, 1995.
  9. T.R. Hinrichs. Problem Solving in Open Worlds: A Case Study in Design. Lawrence Erlbaum, 1992.