Billionaire tech entrepreneur Tom Siebel struck gold with Siebel Systems, which he sold to Oracle in 2006, and is trying again with artificial intelligence firm C3.ai, valued at $3.3 billion. But as the pandemic hit, business slowed and he spent weeks immersed in how to use data to help Covid-19 researchers. He set up a so-called “data lake” of Covid-19 information, culled from Johns Hopkins, the World Health Organization, the Institute for Health Metrics and Evaluation, the Covid Tracking Project and dozens of other organizations, that researchers could access in one place for free.
All told, he says, some 2,000 active users from around the world are now working with this compendium of datasets to research the course of the disease and ways to mitigate it. Among the users, he says, are researchers at the National Institutes of Health, MIT and various pharmaceutical companies.
“What’s difficult about these data sets is making all the connections. All of these data sets are extraordinarily large with tens of thousands of fields, and hundreds of millions of records. In order to make them useful for analytics you need to connect issues like comorbidity and infection rates,” he says. “The number of things we have connected is mind-numbing.”
Siebel, 67, is in a unique position to create a compendium of data sets. He spent more than a decade and, he says, nearly $1 billion building the technology underlying C3.ai, which offers predictive analytics to customers that include 3M, Royal Dutch Shell and the U.S. Air Force. His Redwood City, California-based business has grown rapidly, passing $160 million in revenue for the fiscal year ended in April. Yet as the pandemic hit the United States this spring, Siebel–who expects both a recession and a massive shakeout among AI companies–became one of more than a dozen billionaires to borrow money from the federal Paycheck Protection Program, accessing between $5 million and $10 million, according to data from the Small Business Administration. (For more on Siebel and other billionaires who’ve borrowed from the PPP, see our online feature; for more on C3.ai, see our 2017 magazine story.)
C3.ai cleaned up the data sets using the automated tools it developed to help its corporate customers so that researchers could access data that is structured, machine-readable and free of anomalies. The effort began with 11 data sets, published in April, and expanded over time to include 32 in June. Siebel says that he intends to keep adding new data sets to the data lake, which is hosted on AWS.
“This is a natural application of AI,” Siebel says. “There are a lot of applications of AI that we both know are a little scary and onerous, and this is one that is potentially enormously socially beneficial.”
The data effort is one of two Covid-19 projects that Siebel launched this spring. The other, called the C3.ai Digital Transformation Institute, is giving away more than $300 million in grants and in-kind resources to data-driven, Covid-19 research projects in partnership with Microsoft. The University of California, Berkeley, and the University of Illinois at Urbana-Champaign are managing that consortium, which has funded 26 projects to date.
“We’re doing our best to help advance the underlying science that will make this problem go away,” Siebel says. “Until we make this problem go away, I don’t think we’re going to get this economy back on its feet.”