| Class Label | Score |
| European country, European nation | 3.500 |
| Balkan country, Balkan nation, Balkans, Balkan state | 1.250 |
| country, state, land, nation | 0.972 |
| administrative district, administrative division, territorial division | 0.458 |
| district, territory | 0.268 |
| region | 0.176 |
| location | 0.124 |
| object, physical object | 0.092 |
| entity, something | 0.072 |
Produced by the Infomap group, CSLI, Stanford University
Or, if we want to know what cheese and its top five neighbours (butter, milk, meat, bread and wine, from the left-hand cluster in Figure 2.6) have in common, we would like to be able to take these words and classify them as follows:
| Class Label | Score |
| foodstuff, food product | 2.500 |
| dairy product | 2.250 |
| food, nutrient | 0.944 |
| substance, matter | 0.472 |
| object, physical object | 0.334 |
| beverage, drink, drinkable, potable | 0.250 |
| entity, something | 0.248 |
| money | -0.250 |
| combatant, battler, belligerent, fighter, scrapper | -0.250 |
| baked good, baked goods | -0.250 |
| dark red | -0.250 |
Needless to say, there is an algorithm behind this class-labelling trick, and it relies on a lot of careful work by many Princeton students over many years. The goal of this chapter is to describe this work and the mathematical ideas behind it. The algorithm itself is running here and you're welcome to try it out for yourself. Two of the most important characters in this story, Aristotle and Darwin, are not usually thought of as mathematicians at all, but they nonetheless described their ideas using mathematical models which were, if anything, far ahead of their time.
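To give a feel for how such scores could arise, here is a minimal sketch of one possible scoring rule. It assumes that each word sits in a hierarchy of increasingly general classes, that a class earns 1/d² for every word it covers at distance d, and that it loses 0.25 for every word it does not cover. This rule is an assumption chosen because it reproduces some of the figures in the table above (for example, dairy product covers cheese, butter and milk at distance 1 and misses the other three words, giving 3 × 1 − 3 × 0.25 = 2.250); it is not presented as the exact rule used by the demo. The `HYPERNYM` map is a hand-made toy with one sense per word, not the real taxonomy, so the other scores and the ranking differ from the table.

```python
# Hypothetical toy hierarchy: each node maps to its parent (more general class).
# Hand-made for illustration only; real words have several senses.
HYPERNYM = {
    "cheese": "dairy product",
    "butter": "dairy product",
    "milk": "dairy product",
    "dairy product": "foodstuff",
    "bread": "baked goods",
    "baked goods": "foodstuff",
    "foodstuff": "food",
    "meat": "food",
    "wine": "beverage",
    "beverage": "food",
}

WORDS = ["cheese", "butter", "milk", "meat", "bread", "wine"]


def ancestors(node, parent):
    """Yield (ancestor, distance) pairs, walking up from node to the root."""
    dist = 0
    while node in parent:
        node = parent[node]
        dist += 1
        yield node, dist


def class_label_scores(words, parent, penalty=0.25):
    """Score every candidate class against the whole set of words.

    Assumed rule: +1/d**2 if the class lies d steps above a word,
    -penalty if the class is not above that word at all.
    """
    dists = {w: dict(ancestors(w, parent)) for w in words}
    candidates = {h for d in dists.values() for h in d}
    scores = {}
    for h in candidates:
        total = 0.0
        for w in words:
            if h in dists[w]:
                total += 1.0 / dists[w][h] ** 2
            else:
                total -= penalty
        scores[h] = total
    return scores


scores = class_label_scores(WORDS, HYPERNYM)
best_label = max(scores, key=scores.get)

# Print the candidate classes from best to worst.
for label, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{label:15s} {score:7.3f}")
```

The penalty term is what keeps very specific classes such as baked goods, which fit only one of the six words, from ranking highly, while the 1/d² term stops very general classes from winning simply because they cover everything.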
The idea is that concepts can be arranged into a hierarchy, a tree of meaning whose trunk and branches correspond to general concepts and whose twigs and leaves correspond to particular or specific concepts. We shall see that there are many examples of this sort of structure in common use, from the famous 'Tree of Life' to postal addresses and computer file systems. If the symmetric relationships of the previous chapter can be thought of as level or horizontal in character, the relationship between a child-node and a parent-node in a hierarchy (most clearly exemplified in the relationship between a species and its genus in the Tree of Life) can be thought of as a vertical relationship.
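The vertical relationship can be made concrete with a small sketch. The example below stores a hierarchy as a map from each child-node to its parent-node, then walks upwards from a leaf to the root and finds the most specific concept that two leaves share. The names `PARENT`, `path_to_root` and `lowest_common_ancestor` are illustrative, and the fragment of the Tree of Life is heavily abridged (several intermediate ranks are left out).

```python
# Each child-node points to exactly one parent-node; the root has no entry.
# Abridged, illustrative fragment of the Tree of Life.
PARENT = {
    "Homo sapiens": "Homo",
    "Homo": "Hominidae",
    "Pan troglodytes": "Pan",
    "Pan": "Hominidae",
    "Hominidae": "Primates",
    "Primates": "Mammalia",
    "Mammalia": "Animalia",
}


def path_to_root(node, parent):
    """Return the chain of concepts from node up to the root, most specific first."""
    path = [node]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path


def lowest_common_ancestor(a, b, parent):
    """Return the most specific concept lying on both upward paths, or None."""
    seen = set(path_to_root(a, parent))
    for node in path_to_root(b, parent):
        if node in seen:
            return node
    return None


human_path = path_to_root("Homo sapiens", PARENT)
shared = lowest_common_ancestor("Homo sapiens", "Pan troglodytes", PARENT)
print(" > ".join(reversed(human_path)))
print("Most specific shared concept:", shared)
```

Unlike the symmetric relationships of the previous chapter, the map only runs one way: a species has a genus as its parent, but the genus does not have the species as its parent. A postal address or a file path read from its most general part to its most specific part is the same kind of chain as `human_path` reversed.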
Figure: The Tree of Porphyry, one of the earliest examples of a concept hierarchy. From the ceiling fresco at Schussenried, Germany, by Franz Georg Herrmann (1757), photograph by Jeffrey Garrett (Northwestern University Library, October 2000).
Figure: Some of the uppermost nodes (major categories) in the WordNet noun taxonomy.
Figure: The neighbours of Poland that we found in Chapter 2 are now classified as European countries.
The following paper describes and evaluates the class-labelling work more technically: