I've been working pretty solidly on my pygloss project: a Python Glossonomy web application. So what's Glossonomy you ask? Well, it's something between Glossary - a list of terms and definitions, and Taxonomy - a controlled vocabulary with relationships between terms (like Narrower/Wider, Container/Part, Related etc.). Also in this space are Ontologies, Topic Maps and Concept Maps. And since I don't want to be constrained by preconceptions I've coined a new word - Glossonomy!
Here are a couple of scenarios that might be familiar...
Two groups of people (communities, departments, agencies) are working in similar spaces and have similar interests. Now they need to collaborate, but they find they're using different terms for the same thing, or the same term meaning different things. It can take months to iron out assumptions and errors caused by simple misconceptions.
A business has invested in defining a formal vocabulary to support its core processes, and this is published as an MS Word document. Because it covers the whole business it's weighty and mind-numbingly boring. People don't really use it as intended.
The tools I mention above all try to address these issues, and so does pygloss, but with some Web 2.0 influences at work. This gives us the following characteristics/goals:
- 100% web application
- Bundles related Terms and Concepts into Domains which have self-managing user Communities.
- Encourages sharing Terms and Concepts across Domains - cross-pollination, reuse and enrichment.
- Search and Visualisation tools to explore related concepts.
- The URI is important - terms get enduring URLs so they can be referenced from other places reliably.
- Support for semantic web standards - RDF, SKOS etc.
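To make the Domain/Term/URI goals a bit more concrete, here is a minimal sketch in Python. It is not the pygloss data model: the `Term` class, the `to_skos_triples` helper and the `example.org` base URL are all assumptions made up for illustration, and the term MODULE is invented. Only the SKOS namespace and its `prefLabel`, `definition`, `broader`, `narrower` and `related` properties come from the published standard.

```python
from dataclasses import dataclass, field
from urllib.parse import quote

# Assumption: a placeholder base URL, not the real pygloss address.
BASE_URL = "http://example.org/gloss"
# The published SKOS core namespace.
SKOS = "http://www.w3.org/2004/02/skos/core#"


@dataclass
class Term:
    """Hypothetical term record: one label inside one domain."""
    domain: str
    label: str
    definition: str = ""
    # Maps a SKOS relation name ("broader", "narrower", "related")
    # to the set of URIs of the terms it points at.
    relations: dict = field(default_factory=dict)

    @property
    def uri(self):
        # Built only from the domain and label, so it stays stable
        # for as long as those two identifiers do.
        return f"{BASE_URL}/{quote(self.domain, safe='')}/{quote(self.label, safe='')}"

    def relate(self, relation, other):
        self.relations.setdefault(relation, set()).add(other.uri)


def to_skos_triples(term):
    """Yield (subject, predicate, object) tuples for one term."""
    yield (term.uri, SKOS + "prefLabel", term.label)
    if term.definition:
        yield (term.uri, SKOS + "definition", term.definition)
    for relation, targets in term.relations.items():
        for target in sorted(targets):
            yield (term.uri, SKOS + relation, target)


course = Term("IM.SDR", "COURSE", "A unit of study")
module = Term("IM.SDR", "MODULE")  # invented term, for illustration only
course.relate("narrower", module)

for triple in to_skos_triples(course):
    print(triple)
```

Because the URI is derived from the domain and label alone, anything that links to a term keeps working however the term's definition or relationships change later.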
Other extended capabilities could include
- Glossary Extraction from parsed documents.
- Term and Concept Extraction, based on submitted content (docs, web pages etc.) using Natural Language techniques.
In future posts I'll cover the following:
- some screenshots and previews of the latest pygloss 0.2 features and the data model.
- relationships to wordnet and other taxonomy tools
- Some 'under the hood stuff' - how it hangs together (expect some python, ajax, zodb, xapian learnings here).
- how this project relates to my earlier cvcore work.
But I'll sign off now with an appetizer - some visualisations produced by pygloss 0.1...
Using the AT&T dot library, automated layouts delivered as SVG, PDF, PNG etc...

And from thejit - interactive visualisations, using client-side javascript...
A 'hypertree' layout on the central term 'COURSE' and others within a radius of 2 'hops'
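Picking the terms "within a radius of 2 hops" is a breadth-first search that stops expanding once it reaches the radius. Here is a minimal sketch of that idea, assuming the relationships are held as a simple adjacency dict. The `terms_within` function and the sample graph are hypothetical (only COURSE is a real term from the domain), and this is not the code pygloss or thejit actually runs.

```python
from collections import deque


def terms_within(graph, centre, radius=2):
    """Return {term: hop count} for every term within `radius` hops."""
    hops = {centre: 0}
    queue = deque([centre])
    while queue:
        term = queue.popleft()
        if hops[term] == radius:
            continue  # at the edge of the radius, so don't expand further
        for neighbour in graph.get(term, ()):
            if neighbour not in hops:
                hops[neighbour] = hops[term] + 1
                queue.append(neighbour)
    return hops


# Invented sample graph; each link is listed in both directions.
graph = {
    "COURSE": ["MODULE", "STUDENT"],
    "MODULE": ["COURSE", "LESSON"],
    "STUDENT": ["COURSE"],
    "LESSON": ["MODULE", "TOPIC"],
    "TOPIC": ["LESSON"],
}

print(terms_within(graph, "COURSE"))
```

In this sample TOPIC sits three hops from COURSE, so it is left out of the result.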

And a radial graph layout, composed of all the terms from the domain IM.SDR...