Meta's AI-based Wikipedia Successor Might be the Biggest Break in NLP
Meta has open-sourced an ML resource that could one day supplant Wikipedia as the world's largest publicly available knowledge-verification database. Dubbed Sphere, it is built for knowledge-intensive NLP (KI-NLP). In practical terms, that means it can be used to answer challenging questions and to find sources for claims.
KI-NLP work to date has relied on Wikipedia as its corpus, Meta's eggheads wrote in a paper describing Sphere's design, noting that the volunteer-maintained uber-wiki is accurate, well organized, and small enough to use easily in test environments. To build something larger and better than Wikipedia, though, Meta pulled together content from all over the web to form a universal, unstructured knowledge source that serves multiple KI-NLP tasks at once.
The result is Sphere, which is more or less a mountain of processed data that can be queried with various ML tools. The team adds that Sphere can match or outperform baselines grounded in Wikipedia on some tasks in the KILT benchmark.
Sphere's developers think follow-up work must focus on assessing the quality of the data it retrieves, identifying contradictions and false claims, working out how to prioritize trustworthy sources, and deciding when to decline to answer a question for lack of data.
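To make those open problems concrete, here is a minimal toy sketch of source-finding that weights matches by trust and declines to answer when the evidence is thin. It is not Sphere's code or API: the corpus, the `trust` weights, the `find_sources` function, and the `min_score` cutoff are all hypothetical, and plain bag-of-words cosine similarity stands in for the ML retrieval tools a real system would use.

```python
import math
import re
from collections import Counter

# Hypothetical three-passage corpus; the trust weights are made up for illustration.
PASSAGES = [
    {"source": "example.org/a", "trust": 0.9,
     "text": "The Eiffel Tower is in Paris, France."},
    {"source": "example.org/b", "trust": 0.4,
     "text": "The Eiffel Tower is a tower made of iron."},
    {"source": "example.org/c", "trust": 0.8,
     "text": "Mount Everest is the highest mountain on Earth."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def find_sources(claim, passages, min_score=0.3):
    """Return (score, source) pairs that support the claim, best first.

    Each score is lexical similarity multiplied by the passage's trust
    weight. Returns None when no passage clears min_score, which models
    declining to answer for lack of data.
    """
    query = Counter(tokenize(claim))
    scored = [
        (cosine(query, Counter(tokenize(p["text"]))) * p["trust"], p["source"])
        for p in passages
    ]
    hits = sorted((s for s in scored if s[0] >= min_score), reverse=True)
    return hits or None


supported = find_sources("The Eiffel Tower is in Paris", PASSAGES)
unsupported = find_sources("Quantum computers factor large integers", PASSAGES)
```

In this sketch the low-trust passage overlaps with the claim but is filtered out once its score is scaled by trust, and the second query returns None because nothing in the corpus is relevant. Choosing the trust weights and the cutoff is exactly the hard part the developers describe.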
If Sphere can be turned into a white-box AI with trustworthy and reliable data, Meta added, it might be the next big break in NLP.