The VIE Technique

The central phase of knowledge acquisition from generic information domains is Information Extraction (IE), which aims to extract from the located documents the relevant information that appears in certain semantic or syntactic relationships.
[Figure: IE process]

In particular, IE processes the relevant information found in the documents so that it can be accessed through structured queries. Most often, information extraction systems are customized for specific application domains and require manual or semi-automatic training sessions.

IE from structured and semi-structured documents is frequently performed using wrappers, whose most natural and widespread application area is the World Wide Web.
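As a rough illustration of what a wrapper does, the following minimal Python sketch turns the rows of an HTML table into structured records. It is not taken from the VIE project: the class name `TableWrapper`, the helper `extract_rows`, and the sample page are all hypothetical, and only the standard `html.parser` module is used.

```python
from html.parser import HTMLParser


class TableWrapper(HTMLParser):
    """Hypothetical wrapper: collects the text of <td> cells, one list per <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []          # extracted records
        self._row = None        # cells of the row currently being parsed
        self._in_cell = False   # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:       # skip rows without <td> cells (e.g. header rows)
                self.rows.append(self._row)
            self._row = None


def extract_rows(html):
    """Run the wrapper over an HTML string and return the extracted records."""
    wrapper = TableWrapper()
    wrapper.feed(html)
    wrapper.close()
    return wrapper.rows


# Made-up sample page, used only to exercise the sketch.
SAMPLE_PAGE = """
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Keyboard</td><td>25 EUR</td></tr>
  <tr><td>Mouse</td><td>10 EUR</td></tr>
</table>
"""

print(extract_rows(SAMPLE_PAGE))
```

Note that the wrapper is tied to the page's markup: if the same table were laid out with different tags, the extraction rules would have to be rewritten even though the page looked identical to the reader.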

The aim of the VIE research is to define and implement a general information extraction approach based on the visual appearance of the information, understood as its rendering as perceived by the user. This makes it possible to shift the IE problem from the low level of the code (e.g., raster graphics, vector drawings, word-processor formatted text, web pages, etc.) to the higher level of visual features, providing a "what you see drives your search" paradigm that supports natural query formulation.

Our approach is formally grounded in the theory of spatial relations and uses an SQL-like query language. The theory has been implemented in a full-featured tool.
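To give an idea of how spatial relations can drive extraction, here is a minimal Python sketch under stated assumptions. It is not the VIE tool or its query language: the `Box` type, the `is_right_of` relation, the `select_right_of` helper, and the sample coordinates are all hypothetical. Each rendered text fragment is modelled as an axis-aligned bounding box, and a query selects the fragment that appears immediately to the right of a given label.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical rendered text fragment with its bounding box (y grows downward)."""
    text: str
    x: float
    y: float
    width: float
    height: float

    @property
    def right(self):
        return self.x + self.width

    @property
    def bottom(self):
        return self.y + self.height


def vertically_overlaps(a, b):
    """True if the vertical extents of a and b intersect (same visual line)."""
    return a.y < b.bottom and b.y < a.bottom


def is_right_of(a, b):
    """Spatial relation: a starts at or beyond b's right edge, on the same visual line."""
    return a.x >= b.right and vertically_overlaps(a, b)


def select_right_of(boxes, label):
    """For each box whose text equals label, return the text of the nearest box to its right."""
    results = []
    for anchor in (b for b in boxes if b.text == label):
        candidates = [b for b in boxes if is_right_of(b, anchor)]
        if candidates:
            nearest = min(candidates, key=lambda b: b.x - anchor.right)
            results.append(nearest.text)
    return results


# Made-up layout, used only to exercise the sketch.
RENDERED_PAGE = [
    Box("Invoice", 10, 10, 80, 12),
    Box("Total:", 10, 40, 40, 12),
    Box("42.00", 60, 40, 35, 12),
    Box("Items:", 10, 60, 40, 12),
    Box("3", 60, 60, 10, 12),
]

print(select_right_of(RENDERED_PAGE, "Total:"))
```

Because the query refers only to what is visible (a label and the value next to it), the same selection would apply whether the page was produced as HTML, as word-processor text, or as a vector drawing, provided the bounding boxes can be recovered from the rendering.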


For an up-to-date list of related publications, see the papers page.

Further Reading