How to become a data scientist: A cheat sheet

If you are interested in pursuing a career in data science, this primer is a good reference for information about essential skills, salaries, training resources and more.

Data scientists are in high demand, taking the coveted No. 2 spot on Glassdoor's Best Jobs in America list for 2021 with 5,971 job openings, and the demand continues to grow. In 2012, the Harvard Business Review billed data scientist as "the sexiest job of the 21st century."

Among data scientists, many different jobs can exist. "There are data scientists that focus very much on advanced analytics. Some data scientists only do natural language processing," said Dana Seidel, data scientist. The work encompasses many diverse skills, she said, including "project management skills, data skills, analysis skills, critical thinking skills."

The field is in such high demand because businesses need data analytics to stay competitive. "In the end, the main reason demand is still high is because if your competitors are relying on data-driven decision making and you aren't, they will surpass you and steal your market share. Therefore companies have to adapt and employ data science tools and techniques or they will simply be forced out of business," said Christopher Zita in an article on Towards Data Science.

To help those interested in the field better understand how to break into a career in data science, we've created a guide with the most important details and resources.

Executive summary

  • Why is there an increased demand for data scientists? Nearly every company now has the ability to collect data, and the amount of data is growing larger and larger. This has led to a higher demand for employees with specific skills who can effectively organize and analyze this data to glean business insights.

  • What are some of the data scientist job roles? Core data scientist, researcher and big data specialist are some of the top job titles in the data science field.

  • What skills are required to be a data scientist? The common skill set for a data scientist includes machine learning, Python, Hadoop, Spark and SQL, according to Glassdoor.

  • Which industries have the hottest markets for data scientists? The cities with the fastest-growing tech salaries between 2019 and 2020, according to the Dice 2021 Tech Salary Report, include Charlotte, North Carolina (+13.8%); Orlando, Florida (+13.4%); New York, New York (+11.6%); Austin, Texas (+9.7%); and Philadelphia, Pennsylvania (+8.3%). Other top-ranking cities in this category were Detroit, Phoenix, Houston, Minneapolis and Baltimore. In addition to the traditional "tech hubs," this list includes a number of emerging cities. Some of the top-paying industries are aerospace product and parts manufacturing, $119,590; telecommunications, $102,180; federal executive branch (OEWS designation), $101,560; oil and gas extraction, $101,130; and software publishers, $96,510.

  • What is the average salary of a data scientist? The national average base salary for data scientists was $117,288 as of September 2021, according to Glassdoor. LinkedIn placed the national average base salary at $119,378 for September 2021. Salaries vary greatly depending on location; the positions with the highest salaries are in San Francisco, San Jose, Seattle and New York City.

  • What are typical interview questions for a career in data science? "In an interview, expect to answer technical questions about your ability to perform quantitative tests as well as create clear visualizations of large, complex data sets. Come ready to discuss past projects you've worked on and how you communicate data findings clearly and concisely in order to help solve business-related problems," Glassdoor suggested.

  • Where can I find resources for a career in data science? The Data Science Association, The Institute for Operations Research and the Management Sciences and the International Institute for Analytics are national and international organizations where you can seek out information about the field as well as certification and training options. A number of online courses in programming languages such as Python, R and SQL are available from many providers.

Why is there an increased demand for data scientists?

As every company becomes a tech company to some degree, the need for skilled professionals who can analyze data and glean business insights increases.

"As the size of data at companies grows larger and larger, there is higher demand for employees with specific skills who can effectively organize and analyze this data," said Pablo Ruiz Junco, Glassdoor economic research fellow. "At the same time, the amount of people with these skills is still relatively low compared to the demand, which results in higher pay."

Technology advances and the massive volumes of online data available are affecting every sector, and have enormous impacts on the economy, said Karen Panetta, IEEE fellow and dean of graduate engineering at Tufts University. This so-called "data avalanche" is not just about the sheer volume of data, but also the speed at which it changes and grows, and the diverse types of data available.

"Knowing how to use a spreadsheet and a traditional database will not suffice in the emerging Big Data revolution," Panetta said. "Analyses need to be done in real-time, where decisions can be critical. Being able to simply know how to use the software tools is only part of this challenge. Understanding the data across disciplines, being able to communicate its meaning, and using statistics will be the differentiating factors from a traditional 'number cruncher.'"

What are some of the data scientist job roles?

Generally speaking, data scientists mine data and analyze it for specific company interests, and then work with marketing departments to capitalize on that knowledge. These workers must be familiar with data-gathering software, programming, and warehousing techniques.

Data scientist jobs fall into 10 categories, according to Towards Data Science.

Data scientist: A data scientist knows a bit of everything and can offer insights on the best solutions for a specific project. They are in charge of researching and developing new algorithms and approaches. In large companies, they oversee projects from start to finish.

Data analyst: Data analysts are responsible for visualizing, transforming and manipulating the data. They are often in charge of preparing the data for communication by making reports that show trends and insights.

Data engineer: Data engineers are responsible for designing, building and maintaining data pipelines. They make sure that the data is ready to be processed and analyzed. They need to keep the ecosystem and the pipeline optimized and efficient.

Data architect: A data architect is similar to a data engineer. They both need to ensure that the data is well-formatted and accessible. Data architects also design, create and maintain new database systems that match the requirements of a specific business model.

Data storyteller: This is the newest job role in this list. Data storytelling is not just about visualizing the data and making reports and stats; rather, it is about finding the narrative that best describes the data and using that narrative to convey it. The data storyteller helps people understand the data.

Machine learning scientist: A machine learning scientist researches new data manipulation approaches and designs new algorithms to be used.

Machine learning engineer: Machine learning engineers need to be very familiar with the various machine learning algorithms, like clustering, categorization and classification, and stay up-to-date with the latest research advances in the field. They need to have strong statistics and programming skills in addition to some knowledge of the fundamentals of software engineering.

Business intelligence developer: Business intelligence developers design and develop strategies that allow business users to find the information they need to make decisions quickly and efficiently. BI developers need to have at least a basic understanding of the fundamentals of business models.

Database administrator: A database administrator will be in charge of monitoring the database, making sure it functions properly, keeping track of the data flow, and creating backups and recoveries.

Technology specialized roles: As the data science field grows, more specific technologies will emerge. As the field develops, new specialized job roles will be created. These job roles apply to data scientists and analysts as well.

What skills are required to be a data scientist?

Here are the 12 marketable skills a data scientist might need, according to an Indeed report:

  1. Cloud computing
  2. Statistics and probability
  3. Advanced mathematics
  4. Machine learning
  5. Data visualization skills
  6. Query languages
  7. Database management
  8. Visualizations
  9. Python coding
  10. Microsoft Excel
  11. R programming
  12. Data wrangling

"If you're looking to enter the field of data science and build a solid foundation of experience that will stand out in the eyes of future employers, there are three core skills you need: Python, R and SQL," said Pablo Ruiz Junco, Glassdoor economic research fellow. "With these skills, you'll be eligible to apply to over 70% of all online job postings for data scientist roles. Plus, expanding your skills beyond these foundational languages can lead you to a higher salary and allow you to cast a wider net when applying."

Which data science job roles pay the highest salaries?

While analysts predicted that demand for data scientists would boom by 2020, that demand slowed down in 2020 because of the COVID-19 pandemic. Fortunately, that slowdown isn't expected to last.

According to a report from Indeed, the 15 highest-paying data jobs by national average salary in 2021 are:

  1. Machine learning engineer: $149,847
  2. Enterprise architect: $144,013
  3. Data architect: $133,840
  4. Big data engineer: $132,571
  5. Data modeler: $93,476
  6. Data scientist: $122,519
  7. Infrastructure engineer: $113,546
  8. Business intelligence developer: $100,494
  9. Statistician: $99,055
  10. Database administrator: $97,730
  11. Business intelligence analyst: $96,737
  12. Database developer: $89,250
  13. Data warehouse manager: $84,221
  14. Data analyst: $75,225
  15. Database manager: $65,558
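
As a small illustration of the Python skills this guide mentions, the sketch below ranks the figures listed above from highest to lowest. The numbers are copied from the list as printed; the variable names are illustrative and do not come from the Indeed report.

```python
# Salaries copied from the list above (national average, 2021, per Indeed).
salaries = {
    "Machine learning engineer": 149_847,
    "Enterprise architect": 144_013,
    "Data architect": 133_840,
    "Big data engineer": 132_571,
    "Data modeler": 93_476,
    "Data scientist": 122_519,
    "Infrastructure engineer": 113_546,
    "Business intelligence developer": 100_494,
    "Statistician": 99_055,
    "Database administrator": 97_730,
    "Business intelligence analyst": 96_737,
    "Database developer": 89_250,
    "Data warehouse manager": 84_221,
    "Data analyst": 75_225,
    "Database manager": 65_558,
}

# Sort the (role, salary) pairs from highest to lowest salary.
ranked = sorted(salaries.items(), key=lambda item: item[1], reverse=True)

for position, (role, salary) in enumerate(ranked, start=1):
    print(f"{position:>2}. {role}: ${salary:,}")
```

Sorting this way shows that the list as printed is not strictly in descending order: the data modeler figure lands lower in the ranking than its position in the list suggests.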

What is the average salary of a data scientist?

Average salary figures differ slightly for U.S. data scientists depending on which job site you look at. LinkedIn says the average base salary is $119,378, and Glassdoor says the average base salary for the position is $117,288.

Data scientists in San Francisco are the highest paid, with a median base salary of $160,525, followed by San Jose, California ($107,226), Seattle ($143,300), and New York City ($151,527), according to Indeed.

The Bureau of Labor Statistics said the median salary for a data scientist with a master's degree in 2020 was $126,830 per year.

As seen above with the salary differences between core data scientists, researchers, and big data specialists, the skills that individual data scientists bring to the table can have a large impact on pay. Job seekers should consider what role they are most interested in and make a cost-benefit analysis of which skills are worth spending time learning.

What are typical interview questions for a career in data science?

"To evaluate if a candidate can be successful as a data scientist, I'm looking for a few things: baseline knowledge of the fundamentals, a capacity to think creatively and scientifically about real-world problems, exceptional communication about highly technical topics, and constant curiosity," said Kevin Safford, senior director of engineering at Umbel.

A junior data scientist can expect questions like the following in a job interview, according to Forrester analyst Kjell Carlsson:

  • Walk me through the project that you are most proud of where you used data/data science/machine learning/advanced analytics. What was your role on the project, and what did you do in each step?
  • Tell me about a project where you used (insert language or skill here, e.g., Python, R).
  • Tell me about a time you had to work with someone who is not data-savvy on a data science project.
  • Pretend I am not a data scientist; explain (insert data science topic, e.g., cross validation, unsupervised learning, etc.) to me.
  • Tell me about a time you had to work with very messy data.
  • Tell me about your experience working in teams.
  • Tell me about a time when you had to become an expert on a new technique quickly.
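
One of the sample topics in the list above, cross validation, can be made concrete with a short sketch. The helper below, `k_fold_indices`, is a hypothetical name written for illustration and is not taken from Carlsson or from any library. It assumes the simplest variant: sample indices are split into k consecutive folds without shuffling, and each fold serves once as the test set while the rest form the training set.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation.

    Illustrative helper: folds are consecutive and unshuffled.
    """
    indices = list(range(n_samples))
    # Spread samples as evenly as possible; the first (n_samples % k)
    # folds each receive one extra sample.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Example: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

In practice, the data is usually shuffled before splitting, and a model is trained and scored once per fold so the scores can be averaged.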

The interviewee might be given a mini-case study based on a data science project the team has undertaken, with questions such as: What data would you need? What are the hypotheses you would like to test? What technique(s) would you use to evaluate them?

An interview may also include an exercise in which the interviewee is given a data set and a general question, and asked to present their findings, Carlsson said.

For more senior positions, these questions may come up, according to Daniel Miller, vice president of recruiting at Empowered Staffing:

  • Have you built a data warehouse from scratch? If so, tell me about the process you created in order to successfully implement the data warehouse. (If they have not been part of it from scratch, you can ask if they have been part of a department that dealt with a company merger or acquisition of data and how they handled it.)
  • What types of custom dashboards have you built, and what information/analytics were being presented through your dashboard?
  • Tell me about the most complex data project you have worked on, and what you were able to do in order to achieve success.
  • How are you with explaining and presenting data to executive and senior leadership?

Where can I find resources for a career in data science?

The Data Science Association, The Institute for Operations Research and the Management Sciences and the International Institute for Analytics are national and international organizations where you can seek information about the field as well as certification and training options.

Some educational institutions have created data science degree programs, including University of California Berkeley, Northwestern University, Carnegie Mellon University and Kennesaw State University. Some of these schools offer online courses.

You can find a number of online programming courses, such as those in Python, R and SQL, from many providers. Programs and seminars are also available through the IEEE Computer Society.

A number of certifications in data science are also available. These include the vendor-neutral Certified Analytics Professional (CAP), the Dell EMC Proven Professional certification program, the Microsoft Certified Solutions Expert (MCSE) and the SAS Data Science Certification.

Editor's note: This article has been updated to reflect the latest information.