Deniman Domain
David Deniman

Probabilistic Ontological Semantics (POS)

POS is a theoretical construct for a text indexing and retrieval system, implemented as a free tree based on a proprietary N-gram schema. When combined with quantitative measures of web page structure (such as WebTango), human-mediated metadata and collection management (such as the research being done in the digital library community), and both web and user usage analytics, the POS construct evolves into a self-organizing topical ontology that readily identifies the highly relevant seminal sources on the Internet.

The N-gram schema (note big N, meaning whole words, as opposed to small n, meaning word parts) is based on the realization that words are combined in particular sequences and associated in some proximity to other significant words to express complete thoughts and ideas. Thus, the key to identifying the meaning within words and their relationships to various concepts is to identify the significant relationships revealed by their usage across a wide range of contexts. This is morphological relevance.
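
The actual N-gram schema is proprietary and is not described here, so the following is only a minimal sketch of the two ideas named above: whole-word sequences and proximity between significant words. The function names (tokenize, word_ngrams, proximity_pairs), the window size, and the stopword handling are all assumptions made for illustration, not part of POS.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into whole-word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def word_ngrams(tokens, n):
    """Return every contiguous run of n whole words as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def proximity_pairs(tokens, window, stopwords=frozenset()):
    """Count unordered pairs of distinct, non-stopword tokens that
    occur within `window` tokens of each other (hypothetical measure)."""
    counts = Counter()
    for i, left in enumerate(tokens):
        if left in stopwords:
            continue
        # Look only at the next `window` tokens after position i.
        for right in tokens[i + 1:i + 1 + window]:
            if right in stopwords or right == left:
                continue
            counts[tuple(sorted((left, right)))] += 1
    return counts


if __name__ == "__main__":
    sample = "Digital library research informs digital library design"
    tokens = tokenize(sample)
    print(Counter(word_ngrams(tokens, 2)).most_common(3))
    print(proximity_pairs(tokens, window=2).most_common(3))
```

Counting such sequences and pairs over many documents, rather than one sentence, is what would let recurring relationships stand out from incidental ones. How POS weights or stores them is not specified in this description.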

Simple keyword and link analyses alone cannot accomplish this, although they are easy to implement with the current level of technology. The POS construct requires several more doublings of current technological capability to become practical.

I conceived the POS construct and N-gram schema as the product of an independent research project while studying for my master's degree in computer science. Subsequently, working on the DLESE digital library project helped me learn many of the needs and objectives involved in evaluating and organizing informational and educational resources.

For POS to become a reality it needs to be developed as both an intelligent expert-mediated system and as a large-scale computational graph. That is, POS requires human experts, intelligent agents and a humongous but fast data structure. In the meantime, I am keeping tabs on ACM SIGIR proceedings and related Web/knowledge management/digital library efforts.

POS, like a Star Trek holodeck, is for the future.


"Words without thoughts never to heaven go." --William Shakespeare

"Wisdom is not in words; wisdom is meaning within words." --Kahlil Gibran
© 2007 David Deniman All rights reserved.