Natural Language Processing Quiz, Exams for Semantic Web

Type: Exams

2021/2022

Uploaded on 04.02.2023

Younus_2022

Short Quiz I

Q: The web is an application area for NLP. Name two examples.
A: Internet of Services, community mining, information retrieval, ...

Q: The web is a resource to improve the quality of NLP. Name three examples.
A: Web as corpus, analyzing web-based knowledge repositories, ...

Q: Segmentation is an important analysis step in many NLP pipelines. Which types of segments do you know?
A: Sentence, token.
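The two segment types named in the answer can be illustrated with a minimal sketch. The regular expressions below are simplifying assumptions chosen for this example (they would mis-split abbreviations such as "e.g."), and the function names are hypothetical, not taken from the lecture.

```python
import re

def split_sentences(text: str) -> list[str]:
    """Naive sentence segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def tokenize(sentence: str) -> list[str]:
    """Naive tokenization: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)

text = "The web is large. NLP helps to search it!"
sentences = split_sentences(text)            # sentence segments
tokens = [tokenize(s) for s in sentences]    # token segments per sentence
```

Real pipelines usually rely on trained or rule-rich segmenters, because punctuation alone is an unreliable boundary signal.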

Short Quiz II

Q: Identify all stems and affixes (prefix, suffix, infix, circumfix) in the following words: index, incorrect, interesting
A: index: stem "index"; incorrect: prefix "in", stem "correct"; interesting: stem "interest", suffix "ing".

Q: In contrast to lemmatization, stemming does not necessarily return a valid word form. Why is stemming still useful?
A: It is faster and easier, and it has applications in IR.

Q: What types of syntactic ambiguity do you know? List at least two types with an example for each type.
A: Attachment ambiguity (e.g. "He walked around the house with the dog."), coordination ambiguity (e.g. "We serve excellent rice and fish."), garden path sentences (e.g. "The loud shot the silent.").
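To make the stemming answer concrete, here is a toy suffix stripper. It is only a sketch: the suffix list and the minimum stem length are assumptions made for this example and do not reproduce any real stemmer's rule set. It shows both points from the quiz: the output need not be a valid word, yet different inflected forms collapse to one index term, which is what IR needs.

```python
# Illustrative suffix list (assumption for this sketch, not a real stemmer's rules)
SUFFIXES = ("ing", "ed", "es", "s")

def toy_stem(word: str) -> str:
    """Strip the first matching suffix if at least three characters remain."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Different surface forms map to the same term
stems = [toy_stem(w) for w in ("indexing", "indexed", "indexes")]
# The result is not always a valid word form
odd = toy_stem("studies")
```

Here `stems` contains the same string three times, while `odd` is "studi", which is not an English word but still works as a lookup key.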

Short Quiz III

Q: What is the name of the internationally accepted ISO standard for web genre classification?
A: There is no such standard.

Q: Why is it more difficult to classify web genres in comparison with traditional text genres?
A: Higher complexity (hypertext links, interactive features, multimedia, Web 2.0 elements) and a less clear definition.

Q: Why are more features not always better for learning a classifier?
A: More training data might be needed; dependent features might lead to deficient models; some ML algorithms can only deal with specific data types; processing gets slower.

Q: What is the typical type of input data that a sequence tagger training step requires?
A: Labeled training text (e.g. for POS tagging or NER).

Short Quiz IV

Q: What to do first in a web search engine? Bring these tasks into the right order: indexing, crawling, ranking, (document) parsing, stemming.
A: crawling -> parsing -> stemming -> indexing -> ranking

Q: Name features and their feature types that are used for web search ranking.
A: Static: inlinks, PageRank, document length, language quality, ... Dynamic: TF-IDF, LM scores, anchor text, user clicks, ... Query features: length, known bigrams, number of stopwords.

Q: Why not just use user clicks for ranking after having a few million clicks from a previously unranked search engine?
A: Top-result bias: users click mostly on the results shown at the top, regardless of their relevance.
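The task order from the first answer can be sketched end to end on a toy collection. Everything here is an assumption for illustration: the three "crawled" documents are made up, `parse` is whitespace splitting, `stem` only strips a trailing "s", and ranking uses a plain TF-IDF sum (one of the dynamic features listed above) rather than a full learned ranker.

```python
import math
from collections import Counter, defaultdict

# Crawling: stand-in for fetched pages (hypothetical content)
docs = {
    "d1": "web search engines rank web pages",
    "d2": "crawlers fetch pages from the web",
    "d3": "stemming reduces words to stems",
}

def parse(text: str) -> list[str]:
    """Parsing: lowercase and split on whitespace."""
    return text.lower().split()

def stem(token: str) -> str:
    """Stemming: toy rule that strips a trailing 's' from longer tokens."""
    return token[:-1] if token.endswith("s") and len(token) > 3 else token

# Indexing: inverted index mapping term -> {doc_id: term frequency}
index = defaultdict(Counter)
for doc_id, text in docs.items():
    for token in parse(text):
        index[stem(token)][doc_id] += 1

def rank(query: str) -> list[str]:
    """Ranking: order documents by the summed TF-IDF of the query terms."""
    scores = Counter()
    for term in (stem(t) for t in parse(query)):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(len(docs) / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return [doc_id for doc_id, _ in scores.most_common()]
```

For example, `rank("web pages")` places `d1` before `d2`, because "web" occurs twice in `d1`.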

Short Quiz V

Q: Relate the following queries to query types (informational, navigational, transactional, exploratory):
A:

  • facebook -> navigational
  • NSA scandal -> informational
  • facebook alternatives -> exploratory
  • facebook data migration tool -> transactional

Q: What are the main components and parameters of a summarization system?
A: Components: content selection, ordering of extracted units, sentence realization. Parameter: compression rate.

Q: You pick the three most important sentences from a text as a summary. Is that an extractive or an abstractive summary?
A: Extractive.
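The components named above can be sketched as a small extractive summarizer. This is a sketch under stated assumptions: sentences are scored by average word frequency (a simple heuristic chosen for this example), the compression rate is interpreted as the fraction of sentences to keep, and sentence realization is reduced to joining the selected sentences unchanged. The function name `summarize` is hypothetical.

```python
import math
import re
from collections import Counter

def summarize(text: str, compression_rate: float) -> str:
    """Extractive summary that keeps roughly compression_rate of the sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"\w+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    keep = max(1, math.ceil(len(sentences) * compression_rate))
    # Content selection: indices of the highest-scoring sentences
    top = sorted(range(len(sentences)),
                 key=lambda i: score(sentences[i]), reverse=True)[:keep]
    # Ordering: restore the original document order
    # Sentence realization: sentences are emitted unchanged (extractive)
    return " ".join(sentences[i] for i in sorted(top))

text = "Cats sleep. Cats eat fish. Dogs bark loudly."
short = summarize(text, 0.3)
longer = summarize(text, 0.5)
```

An abstractive system would differ in the last step: it would generate new sentences instead of copying selected ones.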

Short Quiz VI

Q: Extract a lexical chain: You visit the NLP4Web lectures every week at the university. Also you submit the exercises in order to get the bonus. At the end of the semester, you write the exam and hope for a good grade.
A: e.g.: lectures <> university <> exercises <> semester <> exam <> grade

Q: What is the difference between extrinsic and intrinsic evaluation?
A: Intrinsic evaluation directly measures a processing step, e.g. the accuracy of POS tagging. Extrinsic evaluation measures improvements in a larger task or system, e.g. how POS tagging improvements influence the quality of summarization or question answering.
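The intrinsic case mentioned in the answer, accuracy of POS tagging, can be computed as below. The tag sequences are made-up illustration data and `tagging_accuracy` is a hypothetical helper name. An extrinsic evaluation would instead plug the tagger into a downstream system and measure that system's quality.

```python
def tagging_accuracy(gold: list[str], predicted: list[str]) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)

# Hypothetical gold and predicted tags for a four-token sentence
gold = ["PRON", "VERB", "DET", "NOUN"]
predicted = ["PRON", "VERB", "DET", "VERB"]
accuracy = tagging_accuracy(gold, predicted)
```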

Short Quiz VII

Q: List three different question types with an example for each of them.
A:

  1. Feature specification. - What shape did the stone have?
  2. Definition. - What are prime numbers?
  3. Quantification. - How many Bavarians are required to change a light bulb?

Q: Give a schema on how these Question Answering System components interact: Answer, Document processing, Document collection, Answer processing, Question, Question processing.
A: see Lecture 10

Short Quiz VIII

Q: The Reciprocal Rank (RR) is the inverse of the rank of the first correct answer, or 0 if no correct answer was given. The Mean Reciprocal Rank (MRR) is the mean of the RRs over all questions. P@1 is the precision on the first result. Which measure is more suited for question answering, and why?
A: In QA, it does not make much sense to produce many answers; what matters is retrieving one correct answer. Therefore, P@1 is more suited. MRR is better suited for IR.
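The definitions in the question translate directly into code. The following is a minimal sketch; the function names and the three example questions with their ranked answer lists are assumptions made up for illustration.

```python
def reciprocal_rank(ranked: list[str], correct: set[str]) -> float:
    """1 / rank of the first correct answer, or 0 if none is correct."""
    for rank, answer in enumerate(ranked, start=1):
        if answer in correct:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs: list[tuple[list[str], set[str]]]) -> float:
    """Mean of the reciprocal ranks over all questions."""
    return sum(reciprocal_rank(r, c) for r, c in runs) / len(runs)

def precision_at_1(runs: list[tuple[list[str], set[str]]]) -> float:
    """Fraction of questions whose top-ranked answer is correct."""
    return sum(1 for r, c in runs if r and r[0] in c) / len(runs)

# Hypothetical system output: (ranked answers, set of correct answers)
runs = [
    (["Paris", "Lyon"], {"Paris"}),     # first answer correct, RR = 1
    (["Bonn", "Berlin"], {"Berlin"}),   # second answer correct, RR = 1/2
    (["Oslo"], {"Stockholm"}),          # no correct answer, RR = 0
]
mrr = mean_reciprocal_rank(runs)
p1 = precision_at_1(runs)
```

On this data the MRR is 0.5 while P@1 is 1/3: MRR gives partial credit for the second question, whereas P@1 rewards only a correct top answer.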

Short Quiz IX

Q: Name three different possible information sources in Wikipedia.
A: Title, introduction, redirects, infoboxes, hyperlinks, disambiguation pages, revisions, ...

Q: Which Wikipedia features can help to create a set of keyphrases for an entity that has a Wikipedia article?
A: Link anchor texts, citation titles, category names, titles of linking articles, ...

Q: Name three types of information in Wiktionary that can be useful for NLP.
A: Language, etymology, pronunciation, part-of-speech, word senses, synonyms, derived terms, translations, ...