INESC TEC researcher wins award at international conference
Ana Costa e Silva, a researcher at INESC TEC’s Laboratory of Artificial Intelligence and Decision Support (LIAAD), has recently won an international competition held as part of the 2013 International Conference on Document Analysis and Recognition (ICDAR), which took place in Washington, USA, from 25 to 28 August.
The competition focused on locating tables in PDF documents, a topic that the researcher had already addressed in her master’s dissertation, supervised by Alípio Jorge and Luís Torgo, respectively manager and researcher at LIAAD. She then extended this work throughout her doctoral studies.
The researcher stood out in this competition because she used a top-down method, unlike the remaining participants, who presented bottom-up methods focusing solely on the information on the page. The LIAAD researcher developed a model that locates not only tables but also other elements, such as charts, titles, footnotes, page numbers and text set in two columns, because all of these are easily mistaken for parts of tables when a top-down approach is not used. By identifying these elements as well, Ana Costa e Silva managed to improve the classification of tables.
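As a rough illustration of why labelling every kind of page element can help, the toy sketch below gives each block on a page exactly one label, so that a chart or a passage of two-column text is claimed by its own category instead of being counted as a table. The block features, thresholds and function names are all hypothetical; this is not the researcher's model, only a simplified example of the idea.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """A hypothetical page region with a few hand-picked features."""
    text_lines: int       # number of text lines in the block
    columns: int          # number of aligned columns detected
    numeric_ratio: float  # share of tokens that are numbers (0.0 to 1.0)
    has_graphics: bool    # True if the block contains drawn shapes or images


def label_block(block: Block) -> str:
    """Assign one label per block, testing the table look-alikes first."""
    if block.has_graphics:
        return "chart"
    if block.text_lines == 1 and block.columns == 1:
        return "title_or_page_number"
    if block.columns == 2 and block.numeric_ratio < 0.1:
        return "two_column_text"
    # Only blocks not claimed by another category can be tables.
    if block.columns >= 2 and block.numeric_ratio >= 0.3:
        return "table"
    return "text"


def locate_tables(blocks: list[Block]) -> list[int]:
    """Return the indices of the blocks labelled as tables."""
    return [i for i, b in enumerate(blocks) if label_block(b) == "table"]


# Made-up example page: two-column prose, a numeric table, a chart, a page number.
page = [
    Block(text_lines=10, columns=2, numeric_ratio=0.02, has_graphics=False),
    Block(text_lines=8, columns=4, numeric_ratio=0.6, has_graphics=False),
    Block(text_lines=5, columns=3, numeric_ratio=0.5, has_graphics=True),
    Block(text_lines=1, columns=1, numeric_ratio=1.0, has_graphics=False),
]
print(locate_tables(page))
```

In this made-up page, a detector that looked only for aligned columns would flag the first three blocks as tables; giving the look-alike categories their own labels leaves only the second.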
The purpose of the researcher’s dissertation was to teach the computer not only to locate but also to interpret tables, such as those found in reports and accounts, and to extract the information they contain directly into databases, which requires the content of each table to be matched to the content of context taxonomies.
This work helps computers read tables automatically, for instance for blind people. Systems currently in use read tables line by line or column by column, as if they contained continuous text. Automatic table interpretation would benefit not only financial, tax and statistical entities, but also insurance companies, which receive large documents every day that must be processed manually. The same applies to large companies such as Boeing, which have large instruction manuals containing tables and images that a computer cannot read. Finally, locating and interpreting tables can also make it easier to convert tables published on the Internet so that they can be integrated more easily into the Semantic Web.
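The difference between reading a table as continuous text and reading it with its structure can be sketched in a few lines. The example below is only an illustration: the function names and the figures in the sample table are invented, and it assumes the simplest possible layout, with column headers in the first row and a row label in the first column.

```python
def read_line_by_line(rows: list[list[str]]) -> list[str]:
    """Read each row as if it were a line of continuous text."""
    return [" ".join(row) for row in rows]


def read_with_headers(rows: list[list[str]]) -> list[str]:
    """Pair every value with its row label and its column header."""
    header, *body = rows
    sentences = []
    for row in body:
        label, *values = row
        for column, value in zip(header[1:], values):
            sentences.append(f"{label}, {column}: {value}")
    return sentences


# Invented sample table, for illustration only.
table = [
    ["Item", "2012", "2013"],
    ["Revenue", "100", "120"],
    ["Costs", "80", "90"],
]

for line in read_line_by_line(table):
    print(line)

for sentence in read_with_headers(table):
    print(sentence)
```

The first reading produces lines such as "Revenue 100 120", which leave the listener to remember which year each figure belongs to; the second states the row and column with every value.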
Ana Costa e Silva graduated in Management from the Faculty of Economics of the University of Porto (FEP), where she also completed her master’s degree in Data Analysis and Decision Support. She completed her PhD in Artificial Intelligence at the University of Edinburgh, Scotland.
The INESC TEC researchers mentioned in this article are associated with the following partner institutions: INESC Porto and FCUP.