Paper Notes [1] From Distributional Semantics to Conceptual Spaces: A Novel Computational Method for Concept Creation

Paper title: From Distributional Semantics to Conceptual Spaces: A Novel Computational Method for Concept Creation
Authors: Stephen McGregor, Kat Agres, Matthew Purver, Geraint A. Wiggins

1. Introduction & Concepts and Creativity

  • A computational model for the discovery of concepts which attempts to bridge the gap between standard lexical distributional semantics and conceptual spaces. (based on a standard approach to distributional lexical semantics)

  • Evaluation: a WordNet-based task, plus comparisons to human judgements

  • Building a distributional semantic space ⇒ projecting subspaces informed by a query ⇒ discovering conceptual members

  • Concepts: not merely a look-up table associating words with rules, but the product of a cognitive agent’s dynamic interaction with an environment

  • Creative Conceptualisation: more than just a rearranging of things into predefined categories, involves the creation of a new way of associating things

  • Flexibility & Immediacy, tightly coupled with the environment (core of cognition in general)

  • Conceptual Spaces: concepts ⟺ regions; individual entities are points that belong to those regions (distances between points; prototypical members of a concept are more central points; distances from central points) ⇒ geometry

  • Words and Concepts in Context: ad hoc concept

2. Distributional Semantic Language Models

  • Word Counting and Matrix Factorisation:

    • Distributional semantics (lexical statistics) rests on the premise that similar words appear in similar contexts
    • Word vectors are built from co-occurrence counts (the word-counting technique)
    • Dimensionality reduction: apply singular value decomposition to the count matrix (matrix factorisation)
  • Word Embeddings from Neural Networks:

    • To overcome the curse of dimensionality: smooth redistribution of probability
    • Trend: n-gram models ⇒ word2vec, etc.
  • Finding Dynamic Context in a Lexical Space
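The word-counting idea above can be sketched in a few lines. This is only a minimal illustration, not the paper's pipeline: the helper name `cooccurrence_counts`, the symmetric window, and the toy sentence are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def cooccurrence_counts(tokens, window=2):
    """Hypothetical helper: for each word, count the context terms that
    appear within `window` positions on either side of it."""
    counts = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a token is not its own context
                counts[word][tokens[j]] += 1
    return counts

# Toy corpus, invented for illustration only.
tokens = "the cat sat on the mat".split()
counts = cooccurrence_counts(tokens, window=1)
```

Each row `counts[word]` is a sparse count vector; words used in similar contexts end up with similar rows, which is the premise the notes state.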

3. The Model

Figure 1

  • Projection: map a base space into context-specific, conceptually loaded subspaces

  • Dynamically generate query-specific spaces

Step 1

  • Co-occurrence matrix $M$ of size $p \times q$, with entries $M_{w,c}$ (PMI: pointwise mutual information)
    • $n_{w,c}$: the count of co-occurrences of word $w$ with context term $c$
    • $n_{w}$: the count of occurrences of word $w$ in any context
    • $n_{c}$: the count of occurrences of context term $c$ with any word
    • $W$: the total count of all word tokens across the corpus
  • Next, build up the subspaces
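The exact formula for $M_{w,c}$ is not reproduced in these notes. As a reference point, the textbook PMI written with the counts defined above is $\log_2 \frac{n_{w,c} \cdot W}{n_{w} \cdot n_{c}}$; the paper's variant may differ (for example, by smoothing). The sketch below implements only the textbook form, and returning 0.0 for unseen pairs is an assumption of the sketch.

```python
import math

def pmi(n_wc, n_w, n_c, W):
    """Textbook PMI from raw counts (not necessarily the paper's variant).

    n_wc: co-occurrences of word w with context term c
    n_w:  occurrences of w in any context
    n_c:  occurrences of c with any word
    W:    total count of word tokens
    """
    if n_wc == 0:
        return 0.0  # assumption: unseen pairs get weight 0
    return math.log2((n_wc * W) / (n_w * n_c))

# Invented counts, for illustration only.
score = pmi(n_wc=10, n_w=100, n_c=50, W=10_000)
```

A pair that co-occurs exactly as often as chance predicts scores 0, and pairs that co-occur more often than chance score above 0.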

Step 2

  • $N$: a set of $n$ words with $b$ non-zero features, giving an $n \times b$ space; then normalisation
    • sharing weight more evenly amongst dimensions
  • Select a subset $c'$ of the $t$ most salient features
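Step 2 can be sketched as follows. The notes do not say how salience is measured, so this sketch assumes it is the mean weight of a feature across the L2-normalised query vectors; the function names and the toy vectors are hypothetical.

```python
import math

def l2_normalise(vec):
    """Scale a sparse vector {feature: weight} to unit length."""
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {k: v / norm for k, v in vec.items()} if norm else dict(vec)

def select_subspace(query_vectors, t):
    """Return the t features with the highest mean normalised weight.
    Salience = mean weight is an assumption of this sketch."""
    normalised = [l2_normalise(v) for v in query_vectors]
    features = set().union(*(v.keys() for v in normalised))
    salience = {
        f: sum(v.get(f, 0.0) for v in normalised) / len(normalised)
        for f in features
    }
    # Sort by descending salience; break ties alphabetically.
    ranked = sorted(salience, key=lambda f: (-salience[f], f))
    return ranked[:t]

# Toy sparse rows for the query words, invented for illustration.
body = {"limb": 3.0, "skin": 4.0}
part = {"limb": 4.0, "engine": 3.0}
dims = select_subspace([body, part], t=2)
```

Here `limb` ranks first because both query words carry weight on it, which is the intuition behind projecting onto features the query terms share.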

Step 3

① Anchor method

  • Anchor point:
  • $q$ is determined experimentally
  • For any word-vector $\vec{w}$:
  • standard Euclidean hypersphere
    Figure 2
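The formula for the anchor point is not reproduced in these notes, so the sketch below takes the anchor as an argument rather than guessing it. It only illustrates the geometric idea of a Euclidean hypersphere: words are ranked by their distance to the anchor within the subspace. The function name and the 2-d toy vectors are hypothetical.

```python
import math

def rank_by_anchor(word_vectors, anchor, k):
    """Return the k words whose subspace vectors lie closest to `anchor`
    in Euclidean distance. How `anchor` is computed is left open here."""
    return sorted(
        word_vectors,
        key=lambda w: math.dist(word_vectors[w], anchor),
    )[:k]

# Toy 2-d subspace, invented for illustration.
vectors = {
    "hand": (2.0, 2.0),
    "foot": (2.0, 1.0),
    "wheel": (0.0, 3.0),
    "the": (0.1, 0.1),
}
nearest = rank_by_anchor(vectors, anchor=(2.0, 2.0), k=2)
```

Taking the top `k` words is equivalent to growing a sphere around the anchor until it contains `k` points.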

② Norm method

  • Defines concepts as regions beyond a certain distance from the origin:
    • For any word-vector $\vec{w}$:
      Figure 3
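The norm method, as described above, can be sketched by measuring each word's distance from the origin of the subspace. The function names, the radius `r`, and the toy vectors are assumptions made for the example; the paper's exact criterion is not reproduced in these notes.

```python
import math

def norm(vec):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in vec))

def rank_by_norm(word_vectors, k):
    """Return the k words farthest from the origin of the subspace."""
    return sorted(word_vectors, key=lambda w: -norm(word_vectors[w]))[:k]

def beyond_radius(word_vectors, r):
    """Return the words lying beyond distance r from the origin.
    The threshold r is a hypothetical parameter."""
    return {w for w, v in word_vectors.items() if norm(v) > r}

# Toy 2-d subspace, invented for illustration.
vectors = {
    "hand": (3.0, 3.0),
    "foot": (2.0, 3.0),
    "wheel": (0.5, 1.0),
    "the": (0.1, 0.1),
}
top = rank_by_norm(vectors, k=2)
members = beyond_radius(vectors, r=2.0)
```

Because the subspace dimensions are the features salient to the query, a large norm means a word scores highly on many of them at once.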

4. Study 1-3

Study 1: Concept Discovery in WordNet

  • On conceptual membership discovery
    • ground truth : WordNet
    • co-occurrence corpus : Wikipedia
    • Compare to : word2vec & GloVe
  • Pre-processed corpus: 1.1 billion word tokens, 7.5 million word types ⇒ reduced to 200,000 types
  • Window size : 4 & 10
  • Example: $body\ part$
    • Subspace: $\vec{body}$ and $\vec{part}$
    • word2vec & GloVe: take the mean of the query vectors, then the words with the lowest cosine distance to it
    • Compare top 50 results
      Figure 4
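The word2vec and GloVe baseline described above (mean vector, then nearest neighbours by cosine) can be sketched as below. The 2-d embeddings are made up; real vectors would come from a trained model. Excluding the query words from the results is an assumption of this sketch.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def nearest_to_mean(embeddings, query, k):
    """Average the query vectors, then return the k non-query words
    most cosine-similar to that mean."""
    dims = len(next(iter(embeddings.values())))
    mean = [
        sum(embeddings[q][d] for q in query) / len(query)
        for d in range(dims)
    ]
    candidates = [w for w in embeddings if w not in query]
    return sorted(candidates, key=lambda w: -cosine(embeddings[w], mean))[:k]

# Toy embeddings, invented for illustration.
embeddings = {
    "body": (1.0, 0.0),
    "part": (0.0, 1.0),
    "arm": (0.7, 0.7),
    "engine": (0.1, 0.9),
    "sky": (-1.0, 0.2),
}
top = nearest_to_mean(embeddings, ["body", "part"], k=2)
```

In the study, the top 50 words of such a ranking are what get compared against the WordNet ground truth.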

Study 2: Empirical validation of concepts

  • Study 1 constrained by WordNet
  • Collect human responses for each concept and measure the proportion that matches the model's output
  • 8 concepts given; participants list 3 terms for each
  • Participants: 35 (avg age = 40.1, stdev = 14.3 yrs, 24 female)
  • Compared with the top 50 output vectors
    Figure 5
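The matching measure in Study 2 can be sketched as the share of human responses that also appear in the model's output list. The function name, the case-insensitive comparison, and the sample terms are assumptions; the paper's exact matching rules are not given in these notes.

```python
def match_proportion(human_terms, model_terms):
    """Share of human responses that also appear in the model's output.
    Case-insensitive exact matching is an assumption of this sketch."""
    if not human_terms:
        return 0.0
    model_set = {t.lower() for t in model_terms}
    hits = sum(1 for t in human_terms if t.lower() in model_set)
    return hits / len(human_terms)

# Invented responses, for illustration only.
human = ["arm", "leg", "Head", "elbow"]
model_top = ["leg", "head", "torso"]
share = match_proportion(human, model_top)
```

In the study, `model_top` would hold the top 50 words returned for the concept.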

Study 3: Direct Comparison of Model and Human Terms

  • For 8 concepts given in Study 2
  • 20 model terms + 20 human terms (randomly selected) ⇒ participants mark their picks with an “X”
  • Participants: 7 (avg age = 36.3, stdev = 13.1 yrs, 1 female)
  • Human terms : 61.8%, Model terms : 53.3%
  • Hypothesis testing ($h_{0}$: there is no difference between the two sets of terms):
  • For 6 of the 8 concepts, the test failed to reject $h_{0}$
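These notes do not say which statistical test the paper used. As one possible way to test the null hypothesis of no difference between the two sets of terms, here is a sketch of a two-sided permutation test on selection rates. The function name, the 0/1 encoding of picks, and the sample data are all assumptions.

```python
import random

def permutation_test(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in selection rates
    between two lists of 0/1 outcomes (1 = term was picked)."""
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign outcomes to the two groups at random
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return (extreme + 1) / (n_perm + 1)

# Invented picks for one concept, for illustration only.
human_picks = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
model_picks = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]
p_value = permutation_test(human_picks, model_picks)
```

A large `p_value` means the data give no grounds to reject the null hypothesis, which is the outcome the notes report for 6 of the 8 concepts.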

5. Summary and Future Direction

  • High-dimensional space based on lexical co-occurrence ⇒ subspaces corresponding to these ad hoc concepts ⇒ conceptual sets

  • Room for improvement on certain concepts: positive emotion, bright color, and others not listed here

  • Adjusting various parameters ⇒ more fine-grained

  • Provide a platform for modeling the formation of conceptual metaphor

Essence of creativity: an ability to incorporate the ongoing emergence of unpredictable context into a flexible conceptual framework, which results in the construction, through the composition of representations, of interesting and useful new conceptualisations.

