Papers and Publications

This page contains links to papers and other articles I've written over the years. Other up-to-date information can be found on my Google Home Page.


Public Software

NEW! The following paper describes the SemanticVectors package in some detail, including its design, motivations, inner workings, and one online application.
Semantic Vectors: A Scalable Open Source Package and Online Technology Management Application. Dominic Widdows and Kathleen Ferraro. To appear in Sixth International Conference on Language Resources and Evaluation (LREC 2008).
It was really productive to be able to write a paper that focusses on software itself, rather than mathematics, competitive evaluation of results, etc. I'd like to thank the European Language Resources Association (ELRA) and all the people who work so hard to make such a conference possible: I think that if more of our research time were spent sharing each other's tools rather than beating each other's results, we'd make a lot more progress much more quickly.

That said, the software is largely an implementation of many of the pieces of mathematics that are described in the other papers on vectors and quantum informatics below.

Quantum Informatics

Delving deeper into the language applications of vector algebra, the following paper appeared in Oxford this Spring:
Semantic Vector Products: Some Initial Investigations. Dominic Widdows. To appear in Second AAAI Symposium on Quantum Interaction, Oxford, 26th – 28th March 2008.
In recent years, it has become increasingly clear that some of the mathematical structures responsible for the supposed "weirdness" of quantum mechanics aren't weird at all. Quantum mechanics is often much more in line with common sense than the predictions of classical physics. Peter Bruza and I finally felt confident enough about this to submit a paper on the subject, trying to encourage people to think about the goals and opportunities of an Open World Science approach.
Quantum Information Dynamics and Open World Science. Dominic Widdows and Peter Bruza. AAAI Spring Symposium on Quantum Interaction, Stanford, California, March 2007.
There is also a book chapter in press that describes more of the cognitive and logical motivations for using "quantum" models to describe intelligent behaviour.
A Quantum Logic of Down Below. P.D. Bruza, D. Widdows, John Woods. Pre-final draft of a chapter to appear in K. Engesser, D. Gabbay and D. Lehmann (eds) Handbook of Quantum Logic, Quantum Structure and Quantum Computation. Vol 2.
There is much preparatory ground for this work in Geometry and Meaning, and in some of the Information Retrieval papers below. An increasing number of key researchers have helped us considerably so far, including John Woods and Keith van Rijsbergen. We are starting to believe that this is one of the best founded and most crucial areas in which science is advancing.

Biomedical Informatics

The following two papers are a pair of related works, describing recent progress that my collaborators in Algeria and Germany and I have been making in ontology learning and adaptation for the medical domain. The first paper describes some detailed experiments on ontology adaptation using purely noun co-occurrences, led by Mr Toumouh. The second paper is more general: it describes some other linguistic constructions used for clustering and ontology adaptation, and a broad strategy for combining text-mining solutions with hand-coded language resources, motivated by work in word sense disambiguation and language acquisition over the past 15 years.
A. Toumouh, A. Lehireche, D. Widdows, M. Malki. Adapting WordNet to the Medical Domain using Lexicosyntactic Patterns in the Ohsumed Corpus. Appeared in 4th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA-06), March 8-11, 2006, Dubai/Sharjah, UAE.

Dominic Widdows, Adil Toumouh, Beate Dorow, Ahmed Lehireche. Ongoing Developments in Automatically Adapting Lexical Resources to the Biomedical Domain. Fifth International Conference on Language Resources and Evaluation (LREC 2006). Genoa, Italy, May 24-26, 2006.

The following paper is an older work, carried out as part of the MUCHMORE project.

Dominic Widdows, Stanley Peters, Scott Cederberg, Chiu-Ki Chan, Diana Steffen and Paul Buitelaar. Unsupervised Monolingual and Bilingual Word-Sense Disambiguation of Medical Documents using UMLS. Natural Language Processing in Biomedicine ACL 2003 Workshop, ACL workshop, Sapporo, Japan, July 11, 2003, pages 9-16. (.ps)

Quaternionic algebra and geometry

These are very technical ...
Dominic Widdows. A Dolbeault-type Double Complex on Quaternionic Manifolds. Asian Journal of Mathematics, Vol 6, No 2, pp. 253-276, June 2002. (.ps)

Dominic Widdows. Quaternionic Algebra described by Sp(1) representations. Quarterly Journal of Mathematics, Oxford, 2003. (.ps)

Dominic Widdows. Quaternionic Algebraic Geometry. D.Phil thesis, University of Oxford, 2000. (.ps)

Geometric Logic and Inductive Reasoning

Dominic Widdows. Geometric ordering of concepts, logical disjunction, and learning by induction. Compositional Connectionism in Cognitive Science, AAAI Fall Symposium Series, Washington, DC, October 22-24, 2004. (.ps)

Information Retrieval

Dominic Widdows. Orthogonal Negation in Vector Spaces for Modelling Word-Meanings and Document Retrieval. 41st Annual Meeting of the Association for Computational Linguistics, Sapporo, Japan, July 7-12, 2003, pages 136-143. (.ps)

The following paper fills a (surprising) gap in the vector space model for information retrieval using quantum logic:

Dominic Widdows and Stanley Peters. Word Vectors and Quantum Logic: Experiments with negation and disjunction. Eighth Mathematics of Language Conference, Bloomington, Indiana, June 20-22, 2003, pages 141-154. (.ps)

Automatic Concept Learning using Vectors

Dominic Widdows, Beate Dorow, and Chiu-Ki Chan. Using Parallel Corpora to enrich Multilingual Lexical Resources. Third International Conference on Language Resources and Evaluation, Las Palmas, May 2002, pages 240-245. (.ps)

Dominic Widdows. Unsupervised methods for developing taxonomies by combining syntactic and statistical information. In Proceedings of HLT/NAACL 2003, Edmonton, Canada, June 2003, pages 276-283. (.ps)

Scott Cederberg and Dominic Widdows. Using LSA and Noun Coordination Information to Improve the Precision and Recall of Automatic Hyponymy Extraction. In Seventh Conference on Computational Natural Language Learning (CoNLL-2003), Edmonton, Canada, June 2003, pages 111-118. (.ps)

Timothy Baldwin, Colin Bannard, Takaaki Tanaka and Dominic Widdows. An Empirical Model of Multiword Expression Decomposability. Proceedings of the ACL-2003 Workshop on Multiword Expressions: Analysis, Acquisition and Treatment, Sapporo, Japan, July 2003, pages 89-96.

Automatic Concept Learning using Graph Theory

Dominic Widdows and Beate Dorow. A Graph Model for Unsupervised Lexical Acquisition. 19th International Conference on Computational Linguistics, Taipei, August 2002, pages 1093-1099. (.ps)

Beate Dorow and Dominic Widdows. Discovering Corpus-Specific Word Senses. EACL 2003, Budapest, Hungary. Conference Companion (research notes and demos) pages 79-82. (.ps)

Beate Dorow, Dominic Widdows, Katerina Ling, Jean-Pierre Eckmann, Danilo Sergi and Elisha Moses. Using Curvature and Markov Clustering in Graphs for Lexical Acquisition and Word Sense Discrimination. MEANING-2005, 2nd Workshop organized by the MEANING Project, February 3rd-4th 2005, Trento, Italy. (.ps)

Dominic Widdows and Beate Dorow. Automatic Extraction of Idioms using Graph Analysis and Asymmetric Lexicosyntactic Patterns. ACL 2005 Workshop on Deep Lexical Acquisition, Ann Arbor, Michigan, June 30th, 2005.

Information visualisation

Dominic Widdows, Scott Cederberg and Beate Dorow. Visualisation Techniques for Analysing Meaning. Fifth International Conference on Text, Speech and Dialogue, Brno, Czech Republic, September 2002, pages 107-115. (.ps)

Dominic Widdows and Scott Cederberg. Monolingual and Bilingual Concept Visualization from Corpora. Demonstration presented at HLT/NAACL 2003, Edmonton, Canada, June 2003. (.ps)

Meaning in Context (including applications to Medical Informatics)

Dominic Widdows. A Mathematical Model for Context and Word-Meaning. Fourth International and Interdisciplinary Conference on Modeling and Using Context, Stanford, California, June 23-25, 2003, pages 369-382. (.ps)

Dominic Widdows, Stanley Peters, Scott Cederberg, Chiu-Ki Chan, Diana Steffen and Paul Buitelaar. Unsupervised Monolingual and Bilingual Word-Sense Disambiguation of Medical Documents using UMLS. Natural Language Processing in Biomedicine ACL 2003 Workshop, ACL workshop, Sapporo, Japan, July 11, 2003, pages 9-16. (.ps)

Distributed Databases, Peer to Peer and Information Commons Research

The main ongoing research I worked on at MAYA was in the area of peer-to-peer networks and distributed databases, with many of the research goals directed towards creating and maintaining an "Information Commons". The main Information Commons website can be found at http://www.maya.com/infocommons. A more thorough (though somewhat outdated) introduction to the Information Commons can be found on the Civium Wiki.

Foundational Papers

Distributed Knowledge Representation using Universal Identity and Replication. Peter Lucas, Jeff Senn and Dominic Widdows. MAYA Design Inc. Technical Report MAYA-05007.

Roles in the Universal Database: Data and Metadata in a Distributed Semantic Network. Peter Lucas, Dominic Widdows, Joe Hughes and William Lucas. MAYA Design Inc. Technical Report MAYA-05009.

Peer-to-Peer Technology and Research Supporting the Information Commons

Shepherdable Indexes and Persistent Search Services for Mobile Users. Michael Higgins, Dominic Widdows, Magesh Balasubramanya, Peter Lucas, David Holstius. 8th International Symposium on Distributed Objects and Applications (DOA). Montpellier, France, Oct 30 - Nov 1, 2006.
Managing Distributed Collaboration in a Peer-to-Peer Network. Michael Higgins, Stuart Roth, Jeff Senn, Peter Lucas, Dominic Widdows. 14th International Conference on Cooperative Information Systems (CoopIS 2006). Montpellier, France, Nov. 2006.
The Civium World Model: Spatial and Semantic Issues in Pervasive Computing. Dominic Widdows, Peter Lucas, David Holstius, Michael Higgins. Tech Report MAYA-07013.

Practical Information Systems using the Information Commons

The Universal Genetics Database: Information Sharing in Genetics and Beyond. Dr D. Widdows and Prof M. Barmada. Byline article for BioTech International, Vol. 18 No. 3, pp. 11-13, Reed Elsevier, June 2006.

The following paper, a collaboration between the Brookings Institution, 3 Rivers Connect, and MAYA Design, describes some of the ways in which the Information Commons is being used as a public information space, combining geographic and socioeconomic data and building interfaces that practitioners in a range of fields can use to access and understand information in ways that were previously impossible.

The National Infrastructure for Community Statistics: Liberating Public GIS and Statistical Data. Pari Sabety (The Brookings Institution), Chris Sweeney (3 Rivers Connect), Dominic Widdows, Joshua Knauer, Maryl Curran Widdows and Peter Lucas (MAYA Design). Proceedings of the 4th Annual Public Participation GIS (PPGIS) Conference, Cleveland, Ohio. August, 2005.

Peter Lucas, Magesh Balasubramanya, Dominic Widdows and Michael Higgins. The Information Commons Gazetteer: A Public Resource of Populated Places and Worldwide Administrative Divisions. Fifth International Conference on Language Resources and Evaluation (LREC 2006), Genoa, Italy, May 24-26, 2006.

The following paper is a much more complete version of the oral presentation given at the AAACL conference in Ann Arbor in 2005. As well as describing how peer-to-peer technology will benefit corpus linguistics and empirical science in general, the paper describes the Information Commons Publication Model, which uses recognized publisher indexes to provide quality assurance.

Magesh Balasubramanya, Michael Higgins, Peter Lucas, Jeff Senn and Dominic Widdows. Collaborative Annotation that Lasts Forever: Using Peer-to-Peer Technology for Disseminating Corpora and Language Resources. Fifth International Conference on Language Resources and Evaluation (LREC 2006). Genoa, Italy, May 24-26, 2006.
