How to get a job in Big Data
The Big Data revolution is creating a new breed of business IT jobs - and threatening to destabilise dyed-in-the-wool IT careers
By Dan Tynan | InfoWorld | Published: 15:37, 19 March 2012
Big Data is reshaping business IT. Thanks to cheap storage, massive processing power, and tools like Hadoop, organisations are now able to mine terabytes of information and derive useful business intelligence from it. But the data revolution is also creating a new breed of hybrid business IT jobs, ones that blend business knowledge and powerful IT tools to the benefit of tech-savvy line-of-business professionals - and the possible detriment of IT pros oblivious to the Big Data trend.
The data deluge is affecting more than just America's cubicle farms. Industries as diverse as toolmaking, auto repair, and health care are being transformed by technology, says Dr. Tracey Wilen-Daugenti, managing director of the Apollo Research Institute and author of "Society 3.0: How Technology Is Reshaping Education, Work, and Society."
"Precision toolmakers need people with computer backgrounds to run their assembly lines," she says. "As cars get smarter, we need tech people who understand how to build and repair them. Hospitals need patient advocates who understand health care, the law, and database technology so they can help people manoeuvre through the system. Every industry will require smart technology people with subject-matter expertise who can create new devices and think through all ways they might be used."
Here are five hybrid data-driven jobs born of the Big Data revolution - and one in danger of being sidelined by the deluge, as yesterday's "superusers" transform into tomorrow's business-IT professionals.
Jonathan Goldman's job is a textbook example of the changes Big Data has brought. The director of analytics and applications for Aster Data, a division of data warehousing giant Teradata Systems, Goldman holds a doctorate in physics. But after he joined LinkedIn in 2006, he became a data scientist.
At LinkedIn, Goldman was asked to take massive amounts of data collected by the business social network and turn it into products. The result: features such as People You Might Know, an algorithm that looks for non-obvious links between people and offers recommendations for making new connections.
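The article does not describe how People You Might Know works internally. As a rough illustration of the general idea of surfacing non-obvious links, the sketch below ranks people who are not yet connected to someone by how many connections they share. The function name, the sample network, and the ranking rule are all assumptions made for this example, not LinkedIn's method.

```python
from collections import Counter


def suggest_connections(graph, person, limit=3):
    """Rank people not yet connected to `person` by number of shared connections."""
    direct = graph.get(person, set())
    shared = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the person themselves and anyone already connected
            if candidate != person and candidate not in direct:
                shared[candidate] += 1
    # Most shared connections first; ties broken alphabetically for stable output
    ranked = sorted(shared.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:limit]]


# Hypothetical network: each person maps to the set of their direct connections
network = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "dev", "eli"},
    "cara": {"ana", "dev"},
    "dev": {"ben", "cara"},
    "eli": {"ben"},
}
suggestions = suggest_connections(network, "ana")
```

A production system would presumably weigh many more signals than shared connections, but the counting step gives a feel for how click and connection data can be turned into a product feature.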
At Aster, Goldman brings his data-crunching skills to a wider set of problems, such as isolating factors that can reduce customer churn for telecoms or optimising the flow of information on a website.
"Websites used to capture only transaction data - what was bought, who bought it, when it was shipped," he says. "Traditionally, click data was thrown away. But not anymore. The amount of data generated just by people clicking on a website is in the terabyte range. Working with lots of data can really change your business and enable you to do more and more."
Data scientists aren't just limited to social networks and data warehousing firms, says David Inbar, director of sales and marketing for Pervasive Software, a high-performance platform provider for data-intensive analytics.
"Industries most likely to hire a data scientist are those adopting Big Data technologies like Hadoop, and they're everywhere - consumer packaged goods, retail, financial services, and any company dealing with internet-scale data," he says. "Most organisations probably haven't realised just how big the Big Data wave is going to be and the extent to which it will bring conventional IT architecture to its knees."
Though having some programming ability is a plus, you don't need a computer science degree to become a data scientist, Goldman says. Likewise, familiarity with statistical software packages like R or SAS is helpful, but intense curiosity and the ability to effectively present your results are just as important.
"One of the persons I hired at LinkedIn had a background in poetry," he says. "But he was very curious, asked lots of questions, and ended up doing phenomenally well."
Data visualisation: The admissions officer who turned into a data wonk
When Mary Chase started her career as a college admissions officer, the job was largely about working with people to build relationships. Now it's about working with relational databases to find people.
As associate vice president for enrolment at Creighton University, Chase oversees undergraduate admissions and financial aid for the 7,000-student Jesuit school in Omaha. Since she began her career in the mid-1990s, Chase says the number of incoming applications has increased 10-fold. So has the amount of information colleges request from each applicant - from students' test scores and GPAs to their ethnic and socio-economic backgrounds, where their parents attended college, what they do for a living, and so on.
"We've had to adapt," she says. "Just using Excel is insufficient when you're looking at hundreds of thousands of student records with hundreds of data points for each. When I got into admissions work, my job was to build relationships and work with families to get them to enroll. Today it's about identifying which students we should be building relationships with and whether they fit the institution we are working for. I needed to find tools that would let me visualise data in a meaningful way."
About three years ago, Creighton adopted Tableau, a data visualisation tool that hooks into the university's CRM suite and allows users to manipulate information on the fly. Using Tableau on her iPad, for example, Chase can sit in a meeting and instantly model what effect raising tuition by 3-4% would have on admissions or how that would impact the university's financial aid programmes.
"To do this job, you have to be analytic," she adds. "You need to understand the baseline use of technology, including how databases are set up, how cross-relations work, and the difference between relational databases and flat databases. You have to understand how web analytics work and how you can track behaviours to predict and influence the desired outcome."
Chase, who has a master's degree in higher education and is currently working on her doctorate in educational leadership, has become a data wonk by necessity.
"It is no longer acceptable in my profession to make decisions that are not data-driven," she says. "If you rely on your gut instinct, you will make errors. If the data is available and you choose not to use it, you are making a mistake."
There aren't many professional marketers whose résumés include the ability to program in Python, but Cody Boyte is one of them. He employs his coding skills in the service of analysing and presenting data.
As marketing manager for AxialMarket, a business network that allows midmarket M&A (mergers and acquisitions) professionals to connect and make deals, Boyte spends roughly 25% of his time hacking code. That includes writing scripts that pull data from different service providers to analyse the effectiveness of AxialMarket's front end and building "engineered marketing elements" that present data in innovative ways to its customers.
One such element pulls data on the number and location of each midmarket M&A deal for the past 30 days and displays the deals on a geographical map. Boyte's job is to build the front end where the data is delivered, but his knowledge of programming allows him to communicate much more easily with full-time coders working on the back end.
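The data step behind a map like this can be sketched in a few lines of Python: keep only the deals closed within the past 30 days and count them by location. The record layout (`location`, `closed`) and the sample deals are hypothetical; the article does not say how AxialMarket stores or retrieves its data.

```python
from collections import Counter
from datetime import date, timedelta


def recent_deals_by_location(deals, today, window_days=30):
    """Count deals closed within the last `window_days` days, grouped by location."""
    cutoff = today - timedelta(days=window_days)
    return Counter(
        deal["location"] for deal in deals if cutoff <= deal["closed"] <= today
    )


# Hypothetical sample records standing in for data pulled from a back end
sample_deals = [
    {"location": "Chicago", "closed": date(2012, 3, 10)},
    {"location": "Chicago", "closed": date(2012, 2, 25)},
    {"location": "Boston", "closed": date(2012, 3, 1)},
    {"location": "Boston", "closed": date(2012, 1, 5)},  # outside the window
]
counts = recent_deals_by_location(sample_deals, today=date(2012, 3, 19))
```

The resulting counts per location are the kind of summary a front end could then plot on a map.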
"Even if you can't build it yourself, once you understand how API calls work and what happens to the layers of data they're interacting with, you can get together with the developers involved and enable them to execute it in a matter of hours," he says. "Otherwise they could spend weeks going back and forth with me trying to figure out what it is I need them to do."
Knowing whether a development project will take three hours or three weeks to deliver the same results is invaluable, he adds, as is the ability to speak geek.
"I may never know enough to be the tech lead in a startup, but I know that having an understanding of programming gives you a massive advantage in recruiting," he says. "The last thing your CTO wants to do is spend three hours explaining something to a tech novice that should really only take three minutes."
Matt Giandonato didn't start his career as a numbers geek. Trained as an artist, the digital print manager for Tukaiz, a marketing services production company, finds himself spending less time refining designs in Illustrator these days and more time crunching data in Excel.
Giandonato's dance with data began in 2004 when Tukaiz sent out personalised calendars to its customers with each recipient's name blended seamlessly into every photo. The clients liked it so much they asked Tukaiz to create personalised products to send to all of their customers. Now so-called variable printing accounts for a third of Tukaiz's business, which means Giandonato spends much of each day poring over spreadsheets filled with client data and manipulating it to create calendars, postcards, notebooks, brochures, and more.
"Over the last 10 years, we've gone from virtually no digital work to being almost completely digital," he says. "I deal with data files every day. Opening a file in Excel is one thing, but learning how to combine files, sort them in certain ways, and break out data to work with different workflows was a challenge at first. But when demand for these products kept growing, we realised that this is the wave of the future."
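Giandonato does this work in Excel, but the same three steps (combining files, sorting them, and breaking the data out by workflow) can be sketched in Python for comparison. The column names and sample rows below are invented for illustration, and the in-memory files stand in for real client spreadsheets.

```python
import csv
import io
from collections import defaultdict


def combine_sort_split(sources, sort_key, split_key):
    """Merge CSV sources, sort the rows, then group them by one column."""
    rows = []
    for source in sources:
        rows.extend(csv.DictReader(source))  # combine: append every file's rows
    rows.sort(key=lambda row: row[sort_key])  # sort: order by the chosen column
    groups = defaultdict(list)
    for row in rows:
        groups[row[split_key]].append(row)  # break out: one batch per value
    return dict(groups)


# Hypothetical client files; real ones would be opened from disk
file_a = io.StringIO("name,product\nZoe,calendar\nAdam,postcard\n")
file_b = io.StringIO("name,product\nMia,calendar\n")
batches = combine_sort_split([file_a, file_b], sort_key="name", split_key="product")
```

Each entry in `batches` could then feed a separate print workflow, such as one run for calendars and another for postcards.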
Along the way Giandonato also got involved with developing Tukaiz's PixyMe app for Apple's iTunes Store. With PixyMe, users can type a short message and have it appear inside a photo written in snow or displayed as balloons, for example, then choose to have the image delivered electronically or printed as a postcard and mailed to any US destination. Giandonato's job was to ensure that whatever people entered into PixyMe could actually be printed by Tukaiz.
"You never know what people are going to type," he says. "Certain characters have a particular function within an application that can make the postcard come out blank. It's amazing what one little character can do to a print job. You have to figure out what they did wrong and how to fix it, which means you need to know a lot about data."
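One common defence against this kind of problem is to clean user input before it reaches the print pipeline. The sketch below is an assumption about how such a check might look, not a description of PixyMe: the set of reserved characters and the length limit are hypothetical, since the article does not say which characters cause blank postcards.

```python
import unicodedata

# Hypothetical set of characters with special meaning to a downstream application
RESERVED = set("<>{}\\")


def clean_message(text, max_length=40):
    """Return a printable version of `text` with risky characters removed."""
    kept = []
    for ch in text:
        if ch.isspace():
            kept.append(" ")  # treat tabs and newlines as ordinary spaces
        elif unicodedata.category(ch).startswith("C"):
            continue  # drop control, format, and other non-printing characters
        elif ch in RESERVED:
            continue  # drop characters assumed to break the print job
        else:
            kept.append(ch)
    cleaned = " ".join("".join(kept).split())  # collapse runs of spaces
    return cleaned[:max_length]


# A tab, angle brackets, and an invisible zero-width space are all removed
result = clean_message("Happy\tBirthday <Sam>\u200b!")
```

Logging what was removed, rather than silently dropping it, would make it easier to work out afterwards what the user typed and why a card failed.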
Data discovery: The geek who joined the lawyer's nest
Not all Big Data jobs are being snapped up by line-of-business pros. Entrepreneurial techies are capitalising on the new business-IT blend as well, especially in industries where the data revolution has had a deep impact, such as the legal profession.
Take electronic discovery, which is now a multi-billion-dollar industry that accounts for the lion's share of costs associated with most litigation.
"A run-of-the-mill case used to involve 50GB to 100GB of data," says Craig Carpenter, VP of marketing for Recommind, a maker of predictive coding software that automates e-discovery by finding key documents while filtering out irrelevant ones. "These days a typical case can easily run 200GB to 300GB, and we're increasingly seeing cases involving several terabytes of data."
When discovery was largely paper-based, the job fell to paralegals and clerks. But as more companies began storing documents and communications digitally, discovery moved into the digital realm and technology-savvy people took over, notes Jeff Fehrman, vice president of forensics and consulting at professional services organisation Integreon.
"Email administrators would be asked to collect mailboxes from certain individuals relevant to an investigation, or network administrators would be asked to retrieve document stores," he says. "All of these IT people were still responsible for doing their day-to-day jobs, but they were also asked to help with litigation. They ended up being pulled in different directions."
The result: a new field that combines both law and technological expertise. For example, Fehrman says his background is as a network/systems administrator, not an attorney. But he makes a point to read as much as possible about legal matters and emerging case law. Along with his understanding of the rules of civil procedure, this allows him to communicate more effectively with attorneys.
"The people who deal with this data used to be either tech people or lawyers and paralegals," says Carpenter. "Either they knew the law, or they knew speeds and feeds. That's changed dramatically. We're finding that people on both sides of this need to be able to speak both languages. The hottest area in hiring today is people who understand both areas really well."
Will the new generation of superusers created by the data revolution replace IT workers in the enterprise? No, says Talener's Dsupin. But they will change the roles IT plays.
"There's always the next thing you want the system to do," he says. "Every system has version 1, 2, 3. Users will always need help to get from version 5 to versions 6, 7, and 8. This won't take jobs away from tech folks; it will allow them to avoid menial tasks like support. In general, IT professionals are becoming more dynamic than ever before."
It also means that IT pros need to develop subject-matter expertise outside the bits and bytes they are often more comfortable with. Expectations are rising on both sides, says Michael Nicholas, head of strategy for Isobar, a digital marketing and advertising firm.
"In the past, creative people in traditional ad agencies were dreamers, which meant they didn't make things," says Nicholas. "Now we expect our tech people to be creatives, and we expect our creative people to understand technology well enough that they can make their dreams a reality."