TY  - GEN
AV  - public
Y1  - 2017/03/17/
TI  - The Web Data Commons Structured Data Extraction
ID  - heidok22891
EP  - 1
A1  - Primpeli, Anna
A1  - Meusel, Robert
A1  - Bizer, Christian
A1  - Stuckenschmidt, Heiner
UR  - https://archiv.ub.uni-heidelberg.de/volltextserver/22891/
N2  - More and more websites annotate their content using different markup formats. These annotations involve a large number of topics such as persons, events, products, hotels, organizations and cities. The purpose of embedding structured data in HTML pages is to make the content of those pages understandable to web applications. In this way, the retrieval and integration of data deriving from different web pages is greatly facilitated. The presented poster gives an overview of the Web Data Commons - structured data project for the year 2016. The Web Data Commons project extracts structured data from the web corpus provided by Common Crawl, the largest public web corpus, and offers the extracted data for public download. In order to process these huge amounts of data, Web Data Commons builds upon its Extraction Framework and the Amazon Web Services.
ER  - 