Paper Notes [1]: From Distributional Semantics to Conceptual Spaces: A Novel Computational Method for Concept Creation
Paper title: From Distributional Semantics to Conceptual Spaces: A Novel Computational Method for Concept Creation
Paper URL: https://content.sciendo.com/view/journals/jagi/6/1/article-p55.xml?language=en.
Authors: Stephen McGregor, Kat Agres, Matthew Purver, Geraint A. Wiggins
1. Introduction & Concepts and Creativity
- A computational model for concept discovery that attempts to bridge the gap between standard lexical distributional semantics and conceptual spaces (it builds on a standard approach to distributional lexical semantics).
- Evaluation: a WordNet-based task and comparisons with human judgements.
- Pipeline: building a distributional semantic space $\Rightarrow$ projecting subspaces informed by a query $\Rightarrow$ discovering conceptual members.
- Concepts: not only a look-up table associating words with rules, but the result of a cognitive agent's dynamic interaction with an environment.
- Creative conceptualisation: more than just rearranging things into predefined categories; it involves creating a new way of associating things.
- Flexibility and immediacy: concepts are tightly coupled with the environment (this is at the core of cognition in general).
- Conceptual spaces: concepts $\iff$ regions, and individual entities are points that belong to those regions (distances between points, prototypical members of concepts as more central points, distances from central points) $\Rightarrow$ geometry.
- Words and concepts in context: ad hoc concepts.
2. Distributional Semantic Language Models
- Word Counting and Matrix Factorisation:
  - Distributional semantics (lexical statistics): the premise is that similar words appear in similar contexts.
  - Word vectors are built from co-occurrence counts (word-counting techniques).
  - Dimensionality reduction: singular value decomposition applied to the co-occurrence matrix (matrix factorisation).
- Word Embeddings from Neural Networks:
  - To overcome the curse of dimensionality: smooth redistribution of probability.
  - Trend: N-gram models $\Rightarrow$ word2vec, etc.
- Finding Dynamic Context in a Lexical Space
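The word-counting idea listed under "Word Counting and Matrix Factorisation" can be illustrated with a minimal sketch. The function name `cooccurrence_counts`, the symmetric window, and the toy sentence are assumptions made for illustration; the paper's own pre-processing is not described in these notes.

```python
from collections import Counter, defaultdict


def cooccurrence_counts(tokens, window=2):
    """Count how often each word occurs within `window` tokens of each context term."""
    counts = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a token is not its own context
                counts[word][tokens[j]] += 1
    return counts


# Toy corpus (invented for illustration only).
tokens = "the cat sat on the mat".split()
counts = cooccurrence_counts(tokens, window=1)
print(dict(counts["cat"]))
```

Each row `counts[word]` is a sparse count vector over context terms; words used in similar contexts end up with similar rows, which is the distributional premise stated above.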
3. The Model
- Projection: a base space is projected into context-specific, conceptually loaded subspaces.
- Query-specific spaces are generated dynamically.
Step 1
- Co-occurrence matrix $M$ of size $p \times q$, where each entry $M_{w,c}$ is a PMI (pointwise mutual information) value computed from:
  - $n_{w,c}$: the count of co-occurrences of word $w$ with context term $c$
  - $n_{w}$: the count of occurrences of word $w$ in any context
  - $n_{c}$: the count of occurrences of context term $c$ with any word
  - $W$: the total count of all word tokens across the corpus
- The subspaces are then built from this matrix.
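The exact weighting formula is not reproduced in these notes. As a rough sketch, standard PMI written with the counts defined above is $\log_2 \frac{n_{w,c} \, W}{n_{w} \, n_{c}}$; the paper's own variant may differ (for example through smoothing), so the code below should be read as illustrative only. The function name `pmi_matrix`, the nested-dictionary layout, and the choice to take $W$ as the sum of all co-occurrence counts are assumptions.

```python
import math


def pmi_matrix(counts):
    """Standard PMI from nested co-occurrence counts {word: {context: n_wc}}.

    Assumptions: W is the sum of all co-occurrence counts, and unseen
    (word, context) pairs are left out of the result (treated as zero).
    """
    n_w = {w: sum(ctx.values()) for w, ctx in counts.items()}
    n_c = {}
    for ctx in counts.values():
        for c, n in ctx.items():
            n_c[c] = n_c.get(c, 0) + n
    total = sum(n_w.values())
    return {
        w: {c: math.log2(n * total / (n_w[w] * n_c[c])) for c, n in ctx.items()}
        for w, ctx in counts.items()
    }


# Toy counts (invented for illustration only).
toy_counts = {
    "cat": {"purr": 2, "the": 2},
    "dog": {"bark": 2, "the": 2},
}
M = pmi_matrix(toy_counts)
```

In this toy case the context term shared by both words ("the") receives a lower weight than the context terms specific to one word, which is the behaviour PMI is meant to provide.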
Step 2
- $N$: a set of $n$ words with $b$ non-zero features, giving an $n \times b$ space; normalisation
  - sharing weight more evenly amongst dimensions
  - select a subset $c'$ of the $t$ most salient features
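Step 2 can be sketched as follows. The notes do not say how normalisation or salience is computed, so everything specific here is an assumption: a feature counts as non-zero only if every query word has a non-zero weight for it, each row is normalised to sum to 1, and salience is the mean normalised weight across the query words. The name `select_subspace` is hypothetical.

```python
def select_subspace(matrix, query_words, t):
    """Pick the t context features with the highest mean normalised weight.

    matrix: {word: {context: weight}}, e.g. PMI values.
    Assumptions (not from the source): shared non-zero features only,
    rows normalised to unit sum, salience = mean normalised weight.
    """
    rows = [matrix[w] for w in query_words]
    shared = set(rows[0])
    for row in rows[1:]:
        shared &= set(row)
    shared = {c for c in shared if all(row[c] != 0 for row in rows)}

    salience = {c: 0.0 for c in shared}
    for row in rows:
        norm = sum(abs(row[c]) for c in shared)
        for c in shared:
            salience[c] += abs(row[c]) / norm / len(rows)
    # Highest salience first; ties are broken alphabetically.
    ranked = sorted(shared, key=lambda c: (-salience[c], c))
    return ranked[:t]


# Toy weights (invented for illustration only).
toy_matrix = {
    "body": {"arm": 3.0, "leg": 1.0, "car": 2.0},
    "part": {"arm": 2.0, "leg": 2.0, "of": 5.0},
}
features = select_subspace(toy_matrix, ["body", "part"], t=1)
```

Here only "arm" and "leg" are shared by both query words, and "arm" has the higher mean normalised weight, so it is the single feature kept for $t = 1$.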
Step 3
① Anchor method
- Anchor point:
- $q$ is determined experimentally
- For any word-vector $\vec{w}$:
- standard Euclidean hypersphere
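Since the anchor-point formula itself is missing from these notes, the sketch below takes the anchor as an input rather than computing it, and simply ranks words by Euclidean distance to it, which corresponds to growing a hypersphere around the anchor until it contains $k$ words. The function name `nearest_to_anchor` and the toy vectors are assumptions.

```python
import math


def nearest_to_anchor(subspace, anchor, k):
    """Return the k words whose vectors lie closest to the anchor point.

    subspace: {word: [coordinates in the selected subspace]}.
    The anchor is supplied by the caller; how the paper derives it
    (including the role of q) is not covered here.
    """
    def dist(vec):
        return math.sqrt(sum((x - a) ** 2 for x, a in zip(vec, anchor)))

    # Closest first; ties are broken alphabetically.
    return sorted(subspace, key=lambda w: (dist(subspace[w]), w))[:k]


# Toy subspace (invented for illustration only).
toy_subspace = {"arm": [1.0, 1.0], "leg": [1.0, 2.0], "car": [5.0, 0.0]}
members = nearest_to_anchor(toy_subspace, anchor=[1.0, 1.5], k=2)
```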
② Norm method
- Defines concepts as regions beyond a certain distance from the origin:
- For any word-vector $\vec{w}$:
- For any word-vector $\vec{w}$:
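The norm method's formulas are likewise not reproduced in these notes. A minimal sketch of the stated idea, that a concept is the region beyond a certain distance from the origin, might look like this; the function name `beyond_norm`, the explicit `threshold` parameter, and the toy vectors are assumptions.

```python
import math


def beyond_norm(subspace, threshold):
    """Return words whose Euclidean norm exceeds `threshold`, largest norm first."""
    norms = {w: math.sqrt(sum(x * x for x in vec)) for w, vec in subspace.items()}
    members = [w for w in norms if norms[w] > threshold]
    # Largest norm first; ties are broken alphabetically.
    return sorted(members, key=lambda w: (-norms[w], w))


# Toy subspace (invented for illustration only).
toy_vectors = {"arm": [3.0, 4.0], "leg": [1.0, 1.0], "car": [6.0, 8.0]}
far_words = beyond_norm(toy_vectors, threshold=2.0)
```

Unlike the anchor method, this variant needs no reference point: only the distance of each word-vector from the origin of the subspace matters.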
4. Studies 1-3
Study 1: Concept Discovery in WordNet
- Task: conceptual membership discovery
- Ground truth: WordNet
- Co-occurrence corpus: Wikipedia
- Compared to: word2vec and GloVe
- Pre-processed corpus: 1.1 billion word tokens, 7.5 million word types $\Rightarrow$ vocabulary of 200,000 types
- Window sizes: 4 and 10
- Example: body part
- Subspace: built from $\vec{body}$ and $\vec{part}$
- Word2vec and GloVe: take the mean of the query word vectors and rank candidates by lowest cosine distance to it
- Compare the top 50 results
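The mean-vector baseline used for word2vec and GloVe can be sketched as below. The embeddings are a toy dictionary rather than real word2vec or GloVe vectors, and excluding the query words themselves from the candidates is an assumption; the name `mean_vector_neighbours` is hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_vector_neighbours(embeddings, query_words, k):
    """Rank non-query words by cosine distance to the mean of the query vectors."""
    dim = len(embeddings[query_words[0]])
    mean = [
        sum(embeddings[w][i] for w in query_words) / len(query_words)
        for i in range(dim)
    ]
    candidates = [w for w in embeddings if w not in query_words]
    return sorted(
        candidates, key=lambda w: (cosine_distance(embeddings[w], mean), w)
    )[:k]


# Toy 2-dimensional embeddings (invented for illustration only).
toy_embeddings = {
    "body": [1.0, 0.0],
    "part": [0.0, 1.0],
    "arm": [1.0, 1.0],
    "leg": [2.0, 1.0],
    "car": [1.0, -1.0],
}
top = mean_vector_neighbours(toy_embeddings, ["body", "part"], k=2)
```

In the study the equivalent of `k` is 50, and the returned list is what gets compared against the WordNet ground truth.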
Study 2: Empirical validation of concepts
- Motivation: Study 1 is constrained by WordNet.
- Measure: collect human responses for a concept and compute the proportion that match the model's outputs.
- 8 concepts were given; for each, participants listed 3 terms.
- Participants: 35 (avg age = 40.1, stdev = 14.3 yrs, 24 female)
- Responses were compared with the model's top 50 output vectors.
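The matching measure can be sketched as a simple proportion. The notes do not say how matches were counted, so case-insensitive exact matching over all responses is an assumption, and the term lists below are invented purely to show the calculation; `match_proportion` is a hypothetical name.

```python
def match_proportion(human_terms, model_terms):
    """Fraction of human responses that appear in the model's output list.

    Assumption: case-insensitive exact string matching, counted per response.
    """
    model_set = {term.lower() for term in model_terms}
    hits = sum(1 for term in human_terms if term.lower() in model_set)
    return hits / len(human_terms)


# Invented example lists, not data from the study.
human = ["arm", "leg", "Head", "wheel"]
model = ["arm", "head", "foot"]
score = match_proportion(human, model)
```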
Study 3: Direct Comparison of Model and Human Terms
- Uses the 8 concepts given in Study 2.
- 20 model terms + 20 human terms (randomly chosen) $\Rightarrow$ participants pick terms by marking them with an "X".
- Participants: 7 (avg age = 36.3, stdev = 13.1 yrs, 1 female)
- Human terms: 61.8%, model terms: 53.3%
- Hypothesis testing ($h_{0}$: there is no difference between the two sets of terms):
  - For 6 out of 8 concepts, the test failed to reject $h_{0}$
  - POSITIVE EMOTION and BRIGHT COLOR
5. Summary and Future Direction
- High-dimensional space based on lexical co-occurrence $\Rightarrow$ subspaces corresponding to ad hoc concepts $\Rightarrow$ conceptual sets.
- Results could be improved for certain concepts, e.g. positive emotion, bright color, and others not listed here.
- Adjusting the model's various parameters $\Rightarrow$ more fine-grained concept discovery.
- Provides a platform for modelling the formation of conceptual metaphor.
Essence of creativity: an ability to incorporate the ongoing emergence of unpredictable context into a flexible conceptual framework, which results in the construction, through the composition of representations, of interesting and useful new conceptualisations.