Lack of data scientists is the new Von Neumann bottleneck

Strata Conference's Founding Chair, Edd Dumbill, talks about bridging the data and information gap

Data is a huge presence within much of business and technology, and the next installment of the O'Reilly Strata Conference will provide attendees a look into the revolutionary ways data is driving, well, everything.

The Winter 2012 edition of the O'Reilly Strata Conference will offer sessions for everyone, from the businessperson trying to figure out just what this whole Big Data thing is all about to the hard-core data scientist wonks who are bringing all this new technology to the fore.

Big Data has gotten a lot of attention in the past couple of years, as Hadoop, Cassandra, MapReduce, and other open source technologies have enabled businesses and governments to use data in ways unheard of when using relational database technology. The Strata Conference is the first and most prominent gathering for any party interested in learning about just what makes big data tick.

And that, according to Founding Chair Edd Dumbill, is part of the whole point of Strata: educating users and data scientists about the benefits and applications of Big Data.

"There are three main themes examined at Strata," Dumbill said in a recent interview. "The increasing of data and the growth of ubiquitous computing are two, which form the start of an arc to the third aspect."

The arc, Dumbill continued, leads to a much higher level of interconnectivity, the so-called "Internet of Things," which describes the billions of objects tagged and otherwise connected to the internet, each providing massive amounts of data to be collected and processed.

But processed by whom? Stored how? And utilised in what manner? Those are the key questions that gatherings like Strata hope to address, particularly the third part of the arc: how data is used. This is what Dumbill refers to as "data and the final mile."

The "final mile" is likely a familiar term to network engineers: it refers to the all-important connectivity between the end-user and the rest of the internet.

"So it is with data science and analytics within a business," Dumbill said. For data, the "final mile" refers to the capability to properly process data and convey what's really important: information.

Turning data into information (which can then be used to acquire knowledge) is exactly where the data scientist lives, and it's a skill that is still lacking within this burgeoning field.

Data scientists are described by Strata organisers as being talented in engineering, data management, mathematics, and writing. "The art of storytelling and visualisation are also important," Dumbill explained.

I suggested to Dumbill that an example might be the work of Hans Rosling, who very effectively uses stunning graphics to convey a wealth of information. Dumbill agreed that this was pretty much the same sort of work, though Rosling was not working with truly massive data sets. Data scientists for big data will be able to create models beyond even the work of Rosling.

"The headline here is that there are still very few data scientists to go around," Dumbill said. "The lack of data scientists is our new Von Neumann bottleneck."

Dumbill was quick to emphasise that data scientists do not necessarily have to be all-in-one super geniuses who can do it all. Teams whose members have complementary data science talents are also very effective.

The first Strata conference of the year will be held in Santa Clara, California, from February 28 until March 1. A second conference will be held in New York later this year. The conference will feature a Jumpstart track that will be "the missing MBA of big data" for businesspeople, as well as a Deep Data track into which data scientists can really sink their collective teeth.

"Strata is the home for the data science community," Dumbill explained. "And we're happy to have an oasis of deep geekery as well."

Tracks on Hadoop, which is currently regarded as the Linux of the Big Data world, as well as a showcase for data startups, will also be part of the three-day conference.


