



















A Case Base Similarity Framework
Hugh R. Osborne and Derek G. Bridge
University of York
Abstract. Case based systems typically retrieve cases from the case base by applying similarity measures. The measures are usually constructed in an ad hoc manner. This paper presents a theoretical framework for the systematic construction of similarity measures. In addition to paving the way to a design methodology for similarity measures, this systematic approach facilitates the identification of opportunities for parallelisation in case base retrieval.
1 Case Memory Systems
In this paper we present a framework for the construction of similarity measures. Great flexibility is achieved by constructing complex similarity measures from more basic measures using a variety of connectives that we define. The concepts introduced in this paper are illustrated by an extensive example in the appendix.

A case memory system will be considered to consist of a case base and a retrieval mechanism. The case base will be modelled as a finite set of cases, equipped with projection functions for accessing the component elements of these cases. While the cases in the example in the appendix are all tuples, and the projection functions the standard projection functions for tuples, a case may be a more complex structure, with correspondingly more complex projection functions. The projection functions might even implement considerable inferencing [9, 14], perhaps to obtain "deep" features [3] from "surface" features.

A retrieval request is presented to the system as a pair, consisting of an element, #, of this set and a similarity measure. The case #, known as the seed, will, in combination with the similarity measure, represent the "best possible" case. This is in contrast to the earliest approaches, e.g. [11], in which the seed was the ideal case, and the similarity measure measured the closeness of retrieved cases to this ideal. In the approach taken here, the similarity measure can, for example, include negation, so that distance from, rather than closeness to, the seed becomes the measure of suitability.

We take a very general view of what cases are. The problem description, its wider situation or context, its solution, the solution's outcome, etc. may all be features that may be projected from a case. In some case memory implementations, only a subset of these might be stored directly as fields of the cases; others might be part of an indexing structure (as, e.g., with the explanations of case applicability in [5]). However, even in these systems, the information has to be associated with the case in some fashion, and so we can, without loss of generality, assume that the information can be obtained by applying a projection function to a case.

In: Proceedings of the 3rd European Workshop on Case-Based Reasoning, EWCBR'96, Advances in Case-Based Reasoning, Ian Smith and Boi Faltings (Eds.), Lecture Notes in Artificial Intelligence 1168, Springer Verlag, 1996
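The notions introduced so far (a case base with projection functions, and a retrieval request as a pair of seed and similarity measure) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `Case` fields, the `near_price` measure, the modelling of a measure as a function from a seed to a "y is at least as suitable as x" relation, the reading of negation as reversal of that relation, and the rule that retrieval returns the cases no other case strictly beats.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Case:
    """A hypothetical tuple-like case; the field names are illustrative only."""
    problem: str
    price: int
    outcome: str


def price(case: Case) -> int:
    """Projection function: the measures below see a case only through this."""
    return case.price


# rel(x, y) is read as "y is at least as suitable as x".
Relation = Callable[[Case, Case], bool]
# Assumption of this sketch: a similarity measure maps a seed to such a relation.
Measure = Callable[[Case], Relation]


def near_price(seed: Case) -> Relation:
    """y is at least as suitable as x if its price is at least as close to the seed's."""
    def rel(x: Case, y: Case) -> bool:
        return abs(price(y) - price(seed)) <= abs(price(x) - price(seed))
    return rel


def negate(measure: Measure) -> Measure:
    """Reverse a measure, so that distance from the seed is what gets rewarded."""
    def negated(seed: Case) -> Relation:
        rel = measure(seed)
        return lambda x, y: rel(y, x)
    return negated


def retrieve(case_base: List[Case], seed: Case, measure: Measure) -> List[Case]:
    """Return the cases that no other case strictly beats under measure(seed)."""
    rel = measure(seed)
    return [c for c in case_base
            if not any(rel(c, o) and not rel(o, c) for o in case_base)]


case_base = [Case("a", 100, "ok"), Case("b", 150, "ok"), Case("c", 300, "fail")]
seed = case_base[0]
closest = retrieve(case_base, seed, near_price)            # case "a"
farthest = retrieve(case_base, seed, negate(near_price))   # case "c"
```

With the plain measure the seed itself is retrieved; with the negated measure the case farthest from the seed is retrieved, which is the sense in which the seed and the measure together, rather than the seed alone, describe the "best possible" case.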
By taking this broad view of what cases are and by allowing similarity measures to apply to any of the features that can be projected from the case, our framework also encompasses proposals that retrieval should be sensitive to aspects of the case other than the problem description (e.g. the adaptability of the solution, as in [12, 24]). If, on the other hand, only problem descriptions are to be compared, the similarity measure will be designed to ignore other features.
Finally, we should note that, in systems in which the case memory is indexed, case base interrogation is often a two-stage process [13, 1]: a retrieval step exploits the indexes to restrict computational effort to cases that are similar to the seed on those characteristics encoded as indexes, but the final ranking and case selection requires application of a similarity measure to this retrieved set of cases.
There is a sense in which this two-step process is equivalent to the application of two similarity measures: one that is "hard-coded" as indexes, and one that is then applied to the results of the application of the first. From this point of view, our framework encompasses systems of this kind.
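The two-stage process can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not an implementation from the paper: cases are hypothetical `(category, size)` pairs, the index is a dictionary keyed on category, and the second measure is closeness in size.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# A hypothetical case: (category, size). Both fields are illustrative only.
Case = Tuple[str, int]


def build_index(case_base: List[Case]) -> Dict[str, List[Case]]:
    """Index cases by category; this plays the role of the hard-coded measure."""
    index: Dict[str, List[Case]] = defaultdict(list)
    for case in case_base:
        index[case[0]].append(case)
    return index


def retrieve(index: Dict[str, List[Case]], seed: Case, k: int = 1) -> List[Case]:
    """Two-stage retrieval: filter through the index, then rank the survivors."""
    # Stage 1: the index restricts attention to cases sharing the seed's category.
    candidates = index.get(seed[0], [])
    # Stage 2: a second measure (closeness in size) ranks the retrieved set.
    ranked = sorted(candidates, key=lambda c: abs(c[1] - seed[1]))
    return ranked[:k]


case_base = [("pump", 10), ("pump", 40), ("valve", 12), ("pump", 14)]
index = build_index(case_base)
best_two = retrieve(index, ("pump", 12), k=2)
```

Note that the case `("valve", 12)` matches the seed's size exactly but is never ranked, because the first, index-based measure has already excluded it. That is the sense in which the index behaves like a similarity measure applied before the second one.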
In passing, we note that at stake here is whether to take a representational or a computational view of similarity [19], or, in Richter's terminology [21], whether the similarity measure is compiled knowledge or knowledge that is interpreted at run-time. In the representational approach, cases reside in a data structure, such as a DAG, where, e.g., proximity in the data structure denotes similarity. Representational approaches can afford considerable efficiency in retrieval. The data structure is effectively optimised towards retrieval according to the "hard-coded" similarity measure. This can be of especial value when similarity assessment requires the application of large bodies of domain-specific knowledge: that knowledge will be applied once per case at case base update time, rather than being applied afresh on every retrieval [19]. However, this form of optimisation can lead to a loss of flexibility [15] as it may be hard or inefficient to access the case base in different ways as might be needed to give more context-sensitivity or to use the case base for multiple tasks [4]. The computational approach, on the other hand, will, in its most extreme form, compute similarity "from scratch" on each retrieval. This can be a flexible approach as nothing is hard-coded; it may be more amenable to user manipulation of the similarity measure (as allowed in many case-based reasoning shells, e.g., [10]) or even manipulation through some learning process, e.g. [22]; but there may be an efficiency price to be paid. (A "spin-off" of our own work has been the identification of opportunities for parallelisation in pure computational approaches using our similarity framework [17], and this may help to make computational approaches more widely usable. See also [16].) The two-stage process mentioned above is clearly a compromise between pure representational and pure computational approaches. We repeat that
flat: No elements are related.
    x (flat #) y  iff  x = y

is: The seed is better than all others.
    x (is #) y  iff  (y = #) ∨ (x = y)

[Figure: diagrams of the orderings "flat 3" and "is 3" over the elements 1, 2, 3, 4, 5]
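The two basic measures defined above can be written down directly as relations. The Python sketch below is illustrative only: it reads `x (m #) y` as "y is at least as good as x", uses the elements 1 to 5 with seed 3 as in the figure, and adds a `maximal` helper, which is an assumption of this sketch rather than something defined in the text.

```python
from typing import Callable, Hashable, Iterable, List

# rel(x, y) is read as "y is at least as good as x".
Relation = Callable[[Hashable, Hashable], bool]


def flat(seed: Hashable) -> Relation:
    """x (flat seed) y holds only when x = y: no two distinct elements are related."""
    return lambda x, y: x == y


def is_(seed: Hashable) -> Relation:
    """x (is seed) y holds when y is the seed or x = y: the seed beats all others."""
    return lambda x, y: y == seed or x == y


def maximal(elements: Iterable[Hashable], rel: Relation) -> List[Hashable]:
    """Elements that no other element strictly beats (a rule assumed in this sketch)."""
    items = list(elements)
    return [e for e in items
            if not any(rel(e, o) and not rel(o, e) for o in items)]


elements = [1, 2, 3, 4, 5]
flat_best = maximal(elements, flat(3))  # nothing is related, so nothing is beaten
is_best = maximal(elements, is_(3))     # only the seed survives
```

Under `flat` the seed plays no role and every element is maximal; under `is` the seed is the unique best element and the remaining elements are mutually unrelated.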