In particular, IE processes the relevant information found in documents in order to make it available to structured queries. Most often, information extraction systems are customized for specific application domains and require manual or semi-automatic training sessions.
IE from structured and semi-structured documents is frequently performed using wrappers, whose most natural and widespread application area is the World Wide Web.
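To make the notion of a wrapper concrete, the following is a minimal sketch, not a description of any particular system. It assumes a web page whose data sits in a plain HTML table with a header row; the class `TableWrapper`, the function `extract_records`, and the sample page are all hypothetical names introduced for illustration. The wrapper maps each table row to a record, after which the content can be queried in a structured way.

```python
from html.parser import HTMLParser


class TableWrapper(HTMLParser):
    """Hypothetical wrapper: turns the rows of an HTML table into dicts."""

    def __init__(self):
        super().__init__()
        self.headers = []   # column names taken from <th> cells
        self.records = []   # one dict per data row
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None
        self._is_header_row = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._is_header_row = False
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
            if tag == "th":
                self._is_header_row = True

    def handle_data(self, data):
        # Only text inside a cell is of interest to this wrapper.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._is_header_row:
                self.headers = self._row
            elif self.headers:
                self.records.append(dict(zip(self.headers, self._row)))
            self._row = None


def extract_records(html):
    """Run the wrapper over a page and return the extracted records."""
    wrapper = TableWrapper()
    wrapper.feed(html)
    wrapper.close()
    return wrapper.records


# Invented sample page, used only to exercise the sketch.
PAGE = """
<table>
  <tr><th>title</th><th>price</th></tr>
  <tr><td>Book A</td><td>10</td></tr>
  <tr><td>Book B</td><td>12</td></tr>
</table>
"""

records = extract_records(PAGE)
# A structured query over the extracted records.
cheap = [r["title"] for r in records if int(r["price"]) < 12]
```

A wrapper of this kind is tied to the markup of the page it was written for: if the site changes its table layout, the wrapper has to be adapted.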
The aim of the VIE research is to define and implement a general information extraction approach based on the visual appearance of the information, understood as its rendering as perceived by the user. This shifts the IE problem from the low level of the underlying code (e.g., raster graphics, vector drawings, word-processor formatted text, web pages) to the higher level of visual features, providing a "what you see drives your search" paradigm that supports natural query formulation.
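As a rough illustration of querying by visual features rather than by source code, consider the sketch below. It is not the VIE model itself, whose feature set is not described here. It assumes that some rendering step has already produced a list of text boxes with a position and a font size; `VisualBox`, `find_heading`, and `text_below` are hypothetical names, and the sample boxes are invented.

```python
from dataclasses import dataclass


@dataclass
class VisualBox:
    """Assumed output of a rendering step: one piece of text as the user sees it."""
    text: str
    x: float          # left edge on the rendered page
    y: float          # top edge on the rendered page (0 = top)
    font_size: float
    bold: bool = False


def find_heading(boxes):
    """Pick the box with the largest font; ties go to the one higher on the page."""
    return max(boxes, key=lambda b: (b.font_size, -b.y))


def text_below(boxes, anchor, max_gap=50.0):
    """Boxes that start below the anchor, within max_gap units, top to bottom."""
    hits = [b for b in boxes if 0 < b.y - anchor.y <= max_gap]
    return sorted(hits, key=lambda b: b.y)


# Invented sample rendering; the same boxes could come from a PDF,
# a web page, or a scanned image, since only the appearance is used.
boxes = [
    VisualBox("Annual Report", x=40, y=20, font_size=24, bold=True),
    VisualBox("Prepared by the finance team", x=40, y=55, font_size=11),
    VisualBox("Page 1", x=500, y=780, font_size=9),
]

heading = find_heading(boxes)
subtitle = [b.text for b in text_below(boxes, heading)]
```

The query "the text just under the biggest heading" never mentions tags or file formats, which is the point of moving from the code level to the level of visual features.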
For an up-to-date list of related publications, see the papers page.