Papers and Publications
This page contains links to papers and other articles I've written
over the years. Other up-to-date information can be found on my Google Home
Page.
Public Software
NEW! The following paper describes the SemanticVectors
package in some detail, including its design, motivations, inner
workings, and one online application.
- Semantic Vectors: A
Scalable Open Source Package and Online Technology Management
Application. Dominic Widdows and Kathleen Ferraro. To appear in
Sixth International Conference on Language Resources and
Evaluation (LREC 2008).
It was really productive to be able to write a paper that focusses on
the software itself, rather than mathematics, competitive evaluation of
results, etc. I'd like to thank the European Language Resources
Association (ELRA) and all the people who work so hard to make such
a conference possible: I think that if more of our research time were
spent sharing each other's tools rather than beating each other's
results, we'd make a lot more progress much more quickly.
That said, the software is largely an implementation of many of the
pieces of mathematics that are described in the other papers on vectors
and quantum informatics below.
Quantum Informatics
Delving deeper into the language applications of vector
algebra, the following paper appeared in Oxford this Spring:
- Semantic Vector
Products: Some Initial Investigations. Dominic Widdows. To appear
in Second AAAI Symposium on Quantum Interaction, Oxford, 26th –
28th March 2008.
In recent years, it has become increasingly clear that some of the
mathematical structures responsible for the supposed "weirdness" of
quantum mechanics aren't weird at all. Quantum mechanics is often much
more in line with common sense than the predictions of classical
physics. Peter Bruza and I finally felt confident enough about this to
submit a paper on the subject, trying to encourage people to think
about the goals and opportunities of an
Open World Science approach.
-
Quantum Information Dynamics and Open World Science.
Dominic Widdows and Peter Bruza.
AAAI Spring Symposium on Quantum Interaction,
Stanford, California, March 2007.
There is also a book chapter in press that describes more of the cognitive and logical motivations for using "quantum" models to describe intelligent behaviour.
- A Quantum Logic of Down Below. P.D. Bruza, D. Widdows, John Woods.
Pre-final draft of a chapter to appear in K. Engesser, D. Gabbay and D. Lehmann (eds) Handbook of Quantum Logic, Quantum Structure and Quantum Computation. Vol 2.
There is much preparatory ground for this work in Geometry and
Meaning, and in some of the Information Retrieval papers below. An
increasing number of key researchers have helped us considerably so
far, including John Woods and Keith van Rijsbergen. We
are starting to believe that this is one of the best-founded and
most crucial areas in which science is advancing.
Biomedical Informatics
The following two papers are a pair of related works, describing
recent progress that my collaborators in Algeria and Germany and I
have been making in ontology learning and adaptation for the medical domain.
The first paper describes some detailed experiments on ontology
adaptation using purely noun co-occurrences, led by Mr Toumouh. The
second paper is more general: it describes some other linguistic
constructions used for clustering and ontology adaptation, and
outlines a broad strategy for combining text-mining solutions with
hand-coded language resources, motivated by work in word sense
disambiguation and language acquisition over the past 15 years.
- A. Toumouh, A. Lehireche, D. Widdows, M. Malki. Adapting WordNet to the Medical
Domain using Lexicosyntactic Patterns in the Ohsumed Corpus.
Appeared in 4th ACS/IEEE International Conference on Computer
Systems and Applications (AICCSA-06) March 8-11, 2006,
Dubai/Sharjah, UAE.
- Dominic Widdows, Adil Toumouh, Beate Dorow, Ahmed Lehireche.
Ongoing Developments in Automatically Adapting Lexical Resources to the Biomedical Domain.
Fifth International Conference on Language Resources and Evaluation (LREC 2006). Genoa, Italy, May 24-26, 2006.
The following paper is an older work, carried out as part of the MUCHMORE project.
- Dominic Widdows, Stanley Peters, Scott Cederberg, Chiu-Ki Chan,
Diana Steffen and Paul Buitelaar.
Unsupervised Monolingual and Bilingual Word-Sense Disambiguation
of Medical Documents using UMLS. Natural
Language Processing in Biomedicine ACL 2003 Workshop, ACL
workshop, Sapporo, Japan, July 11, 2003, pages 9-16.
(.ps)
Quaternionic algebra and geometry
These are very technical ...
-
Dominic Widdows.
A Dolbeault-type Double Complex on Quaternionic Manifolds.
Asian Journal of Mathematics, Vol 6, No 2, pp. 253-276, June 2002.
(.ps)
-
Dominic Widdows.
Quaternionic Algebra described by Sp(1) representations.
Quarterly Journal of Mathematics, Oxford, 2003.
(.ps)
-
Dominic Widdows.
Quaternionic Algebraic Geometry.
D.Phil thesis, University of Oxford, 2000.
(.ps)
Geometric Logic and Inductive Reasoning
- Dominic Widdows.
Geometric ordering of concepts, logical disjunction,
and learning by induction. Compositional
Connectionism in Cognitive Science, AAAI Fall
Symposium Series, Washington, DC, October 22-24, 2004.
(.ps)
Information Retrieval
- Dominic Widdows.
Orthogonal Negation in Vector Spaces for Modelling Word-Meanings and
Document Retrieval.
41st Annual Meeting of the Association for Computational
Linguistics, Sapporo, Japan, July 7-12, 2003, pages 136-143.
(.ps)
The following paper fills a (surprising) gap in the vector space
model for information retrieval using quantum logic:
- Dominic Widdows and Stanley Peters.
Word
Vectors and Quantum Logic: Experiments with negation and
disjunction. Eighth Mathematics of Language
Conference, Bloomington,
Indiana, June 20-22, 2003, pages 141-154.
(.ps)
Automatic Concept Learning using Vectors
-
Dominic Widdows, Beate Dorow, and Chiu-Ki Chan.
Using Parallel Corpora to enrich Multilingual Lexical Resources.
Third International Conference on Language Resources and Evaluation,
Las Palmas, May 2002, pages 240-245.
(.ps)
-
Dominic Widdows.
Unsupervised methods for developing taxonomies by combining
syntactic and statistical information. In Proceedings of
HLT/NAACL 2003, Edmonton, Canada, June 2003, pages 276-283.
(.ps)
- Scott Cederberg and Dominic Widdows.
Using
LSA and Noun Coordination Information to Improve the Precision and
Recall of Automatic Hyponymy Extraction. In Seventh Conference
on Computational Natural Language Learning (CoNLL-2003),
Edmonton, Canada, June 2003, pages 111-118.
(.ps)
- Timothy Baldwin, Colin Bannard, Takaaki Tanaka and Dominic
Widdows. An Empirical
Model of Multiword Expression Decomposability. Proceedings
of the ACL-2003 Workshop on Multiword Expressions: Analysis,
Acquisition and Treatment, Sapporo, Japan, July 2003, pages
89-96.
Automatic Concept Learning using Graph Theory
-
Dominic Widdows and Beate Dorow.
A Graph Model for Unsupervised Lexical Acquisition.
19th International Conference on Computational Linguistics,
Taipei, August 2002, pages 1093-1099.
(.ps)
-
Beate Dorow and Dominic Widdows,
Discovering Corpus-Specific Word Senses.
EACL 2003, Budapest, Hungary.
Conference Companion (research notes and demos) pages 79-82.
(.ps)
-
Beate Dorow, Dominic Widdows, Katerina Ling, Jean-Pierre Eckmann,
Danilo Sergi and Elisha Moses.
Using Curvature and Markov Clustering in Graphs for Lexical
Acquisition and Word Sense Discrimination.
MEANING-2005, 2nd Workshop organized by the
MEANING Project, February 3rd-4th 2005, Trento, Italy. (.ps)
- Dominic Widdows and Beate Dorow.
Automatic Extraction of Idioms using Graph Analysis and Asymmetric
Lexicosyntactic Patterns.
ACL 2005 Workshop on Deep Lexical Acquisition,
Ann Arbor, Michigan, June 30th, 2005.
Information visualisation
-
Dominic Widdows, Scott Cederberg and Beate Dorow.
Visualisation Techniques for Analysing Meaning.
Fifth International Conference on Text, Speech and Dialogue,
Brno, Czech Republic, September 2002, pages 107-115.
(.ps)
-
Dominic Widdows and Scott Cederberg,
Monolingual and Bilingual Concept Visualization from Corpora.
Demonstration presented at HLT/NAACL 2003, Edmonton,
Canada, June 2003.
(.ps)
Meaning in Context (including applications to Medical Informatics)
- Dominic Widdows.
A
Mathematical Model for Context and Word-Meaning.
Fourth International and Interdisciplinary Conference on Modeling and
Using Context, Stanford, California, June 23-25, 2003, pages 369-382.
(.ps)
- Dominic Widdows, Stanley Peters, Scott Cederberg, Chiu-Ki Chan,
Diana Steffen and Paul Buitelaar.
Unsupervised Monolingual and Bilingual Word-Sense Disambiguation
of Medical Documents using UMLS. Natural
Language Processing in Biomedicine ACL 2003 Workshop, ACL
workshop, Sapporo, Japan, July 11, 2003, pages 9-16.
(.ps)
Distributed Databases, Peer to Peer and Information Commons Research
The main ongoing research I worked on at MAYA was in the area of
peer-to-peer networks and distributed databases. Collectively, many of
the research goals were directed towards creating and maintaining
an "Information Commons". The main Information Commons website can be
found at http://www.maya.com/infocommons.
A more thorough (though somewhat outdated) introduction to the
Information Commons can be found on the Civium Wiki.
Foundational Papers
- Distributed Knowledge
Representation using Universal Identity and Replication. Peter
Lucas, Jeff Senn and Dominic Widdows. MAYA Design Inc. Technical
Report MAYA-05007.
- Roles in the Universal
Database: Data and Metadata in a Distributed Semantic Network.
Peter Lucas, Dominic Widdows, Joe Hughes and William Lucas. MAYA
Design Inc. Technical Report MAYA-05009.
Peer-to-Peer Technology and Research Supporting the Information Commons
-
Shepherdable Indexes and
Persistent Search Services for Mobile Users. Michael Higgins,
Dominic Widdows, Magesh Balasubramanya, Peter Lucas, David Holstius.
8th International Symposium on Distributed Objects and
Applications (DOA). Montpellier, France, Oct 30 - Nov 1, 2006
- Managing Distributed Collaboration in a Peer-to-Peer Network.
Michael Higgins, Stuart Roth, Jeff Senn, Peter Lucas, Dominic Widdows.
14th International Conference on Cooperative
Information Systems (CoopIS 2006). Montpellier, France,
Nov. 2006.
- The Civium World Model: Spatial and Semantic Issues in Pervasive Computing.
Dominic Widdows, Peter Lucas, David Holstius, Michael Higgins. Tech Report MAYA-07013.
Practical Information Systems using the Information Commons
- The
Universal Genetics Database: Information Sharing in Genetics and
Beyond. Dr D. Widdows and Prof M. Barmada. Byline article for
BioTech International, Vol. 18 No. 3, pp. 11-13, Reed Elsevier, June
2006.
The following paper, a collaboration between the Brookings Institution, 3
Rivers Connect, and MAYA Design, describes some of the ways in which
the Information Commons is being used as a public information space,
combining geographic and socioeconomic data and building interfaces
that practitioners in a range of fields can use to access and understand
information in ways that were previously impossible.
- The National
Infrastructure for Community Statistics: Liberating Public GIS and
Statistical Data. Pari Sabety (The Brookings Institution), Chris
Sweeney (3 Rivers Connect), Dominic Widdows, Joshua Knauer, Maryl
Curran Widdows and Peter Lucas (MAYA Design). Proceedings of
the 4th Annual Public Participation GIS (PPGIS) Conference,
Cleveland, Ohio. August, 2005.
- Peter Lucas, Magesh Balasubramanya, Dominic Widdows and Michael Higgins.
The
Information Commons Gazetteer: A Public Resource of Populated
Places and Worldwide Administrative Divisions.
Fifth International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy, May 24-26, 2006
The following paper is a much more complete version of the oral
presentation given at the AAACL conference in Ann Arbor in 2005. As
well as describing how peer-to-peer technology will benefit corpus
linguistics and empirical science in general, the paper describes the
Information Commons Publication Model, which uses recognized publisher
indexes to provide quality assurance.
- Magesh Balasubramanya, Michael Higgins, Peter Lucas, Jeff Senn and
Dominic Widdows. Collaborative Annotation
that Lasts Forever: Using Peer-to-Peer Technology for Disseminating
Corpora and Language Resources.
Fifth International Conference on Language Resources and Evaluation (LREC 2006). Genoa, Italy, May 24-26, 2006.