The UMBC WebBase corpus is a dataset of high-quality English paragraphs containing over three billion words derived from the Stanford WebBase project’s February 2007 Web crawl....
Facebook engineers Xiao Li and Maxime Boucher describe the language processing techniques used to implement Facebook’s graph search in a recent post on the Facebook Engineering page...
Barbara Starr posts at Search Engine Land about progress the notion of “semantic search” has made in the past three years, Semantic and Graph-Based Search: The Future Face Of Search:...
There has recently been a spike in the number of compromised Twitter accounts, which has increased concerns about the trustworthiness of information broadcast on Twitter and other social...
Heather McIlvaine from enterprise software giant SAP blogs about open data: “How are mobile apps, Big Data, and civic hacking changing the nature of open data in government? The Center for...
If you are interested in the semantic web and linked data, data.ac.uk looks like a site worth investigating. “This is a landmark site for academia providing a single point of contact...
The results of the 2013 Semantic Textual Similarity task (STS) are out. We were happy to find that our system did very well on the core task, placing first out of the 35 participating...
A post in Microsoft’s Bing blog, Understand Your World with Bing, announced that an update to their Satori knowledge base allows Bing to do a better job of identifying queries that...
We felt a great disturbance in the Force, as if millions of voices suddenly cried out in terror and were about to be silenced. We fear something terrible is about to happen. We went...
Memoto is a $279 lifelogging camera that takes a geotagged photo every 30 seconds, holds 6K photos, and runs for several days without recharging. The company producing Memoto is a Swedish...