mohammed al madhani

Post on 27-Jan-2017


Setting up a Big Data Team: Best Practices

(*) perspective from Mohammed AL Madhani.

Sources of this presentation


Building Data Science Teams, by Paco Nathan. Publisher: O'Reilly Media

Big Data trend in Google trends

Big Data vs IoT: 36 different types of sensors = $30 on Amazon.com!!

Big Data vs IoT vs data science trends

The Business Impacts of Data Science

What is a Data Scientist?

Business Intelligence vs Data Science

Some commonly used methods/models

Discrete event simulation

Queuing model

Monte Carlo simulation

Agent-based modeling

System dynamics

Game theory

Probabilities

Economic analysis: IRR, NPV, FV

Linear regression

Stepwise regression method

Logistic regression

Confidence intervals

Hypothesis testing

Statistical inference

Design of experiments

Analysis of variance

Principal component analysis (PCA)

Data mining

Forecasting

Artificial neural networks

Fuzzy logic

Expert systems

Decision trees

Markov chain

Revenue management (yield management)

Optimization

Linear programming

Integer programming
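
As an illustration of the "Economic analysis: IRR, NPV, FV" item in the list above, here is a minimal Python sketch of the three standard formulas. The function names, the bisection search used for IRR, and the search bounds are assumptions made for this example; they do not come from the presentation.

```python
from typing import Sequence


def future_value(present_value: float, rate: float, periods: int) -> float:
    """Value of `present_value` after compounding at `rate` for `periods` periods."""
    return present_value * (1 + rate) ** periods


def net_present_value(rate: float, cash_flows: Sequence[float]) -> float:
    """Discount every cash flow back to period 0 and sum the results.

    cash_flows[0] occurs today (typically the negative initial investment).
    """
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def internal_rate_of_return(
    cash_flows: Sequence[float],
    low: float = 0.0,
    high: float = 1.0,
    tol: float = 1e-9,
) -> float:
    """Rate at which NPV is zero, found by bisection.

    Assumption: NPV changes sign exactly once between `low` and `high`.
    """
    npv_low = net_present_value(low, cash_flows)
    npv_high = net_present_value(high, cash_flows)
    if npv_low * npv_high > 0:
        raise ValueError("NPV does not change sign on [low, high]")
    while high - low > tol:
        mid = (low + high) / 2
        npv_mid = net_present_value(mid, cash_flows)
        if npv_low * npv_mid <= 0:
            high = mid  # the root lies in [low, mid]
        else:
            low, npv_low = mid, npv_mid  # the root lies in [mid, high]
    return (low + high) / 2
```

For a hypothetical project that costs 1000 today and returns 400 in each of the next three years, `internal_rate_of_return([-1000, 400, 400, 400])` gives the discount rate at which the project breaks even. Comparing that rate with the cost of capital is one way to state the "clear ROI" a proof of concept needs.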

Data Scientist

A data artist who turns data into action

Skills: Math, Computer science, Statistics, Domain expertise

Chart values: 65%, 30%, 45%, 75%

Experience: Building data representations and using algorithms for optimization and validation; communicating with the team to ensure data availability, readiness, and completeness; working with the data researcher to decompose the problem.

30% Machine Learning

The Data Science Venn Diagram

How can we reduce traffic jams?

How can we reduce patient waiting times?

UNLOCKING DATA: The data scientist's mission

INPUTS: Question, Data

OUTPUTS: Solution, Answer, ROI

Generating the momentum

Description: Proofs of concept can generate the critical momentum needed to jump-start any data science capability.

Problem solving

Critical thinking

Buy-in

Necessary data

Clear ROI

Dedication & focus

Fail often and learn quickly

Limited complexity and duration

Dealing with the problem (INFORMS CAP approach)

Steps 1 and 2: Decomposition & Datafication

DESIGN THINKING METHODOLOGY (Alternative Approach)

BOOZ ALLEN’S DESIGN THINKING TOOL BOX FOR ANALYTICS

Big Data Researcher

Domain expert with data science knowledge

Skills: Math, Computer science, Domain expertise, Communication, Statistics

Chart values: 65%, 10%, 70%, 30%, 30%

Mission: Generate low-fidelity prototypes to demonstrate applicability and test ideas quickly and cheaply before making significant investments.

The Four Key Activities of a Data Science

Respondents who said there weren’t enough data scientists to go around

Do Data Scientists Have What They Need?

Data preparation

“If you have perfect information or zero information then your task is easy – it is in between those two extremes that the trouble begins.”

Maslow’s hierarchy of needs could be applied to data

Optimized

Measured

Defined

Managed

Performed

Enhance the data management maturity (data preparation)

Data Management Team (data preparation part): Ready to go!

Develop and execute all data flow jobs: business rules, matching, scraping, cleaning, munging, joining, and wrangling.

Responsible for data flows, data solutions, data models, and architectures.

Responsible for managing the data elements and metadata, and for running the maturity model components.

Big Data Steward

Big Data Architect

Big Data Engineer
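
To make the data flow jobs mentioned above (cleaning, matching, joining) more concrete, the following is a minimal Python sketch using only the standard library. The record layout, field names, and helper functions are hypothetical and chosen for illustration; a real data management team would more likely run such steps in a dedicated data integration tool.

```python
def clean_name(raw: str) -> str:
    """Trim whitespace, collapse repeated spaces, and lowercase the value."""
    return " ".join(raw.split()).lower()


def deduplicate(records, key):
    """Clean the `key` field and keep the first record seen for each value."""
    seen = {}
    for record in records:
        cleaned = dict(record, **{key: clean_name(record[key])})
        seen.setdefault(cleaned[key], cleaned)
    return list(seen.values())


def inner_join(left, right, key):
    """Join two lists of dicts on `key`, keeping only rows present in both."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


# Hypothetical sample data: the same patient entered twice with different spacing.
patients = [
    {"name": "  Alice  SMITH ", "city": "Dubai"},
    {"name": "alice smith", "city": "Abu Dhabi"},
    {"name": "Bob Jones", "city": "Sharjah"},
]
visits = [{"name": "alice smith", "visits": 3}]

cleaned_patients = deduplicate(patients, "name")
joined = inner_join(cleaned_patients, visits, "name")
```

Matching on a normalized key before joining is the design choice here: without the cleaning step, the two spellings of the same name would be treated as different people and the join would silently drop or duplicate rows.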

Some Certifications

Aiming to certify your team will increase its maturity.

1. Certified Analytics Professional (CAP)

Created in 2013 by the Institute for Operations Research and the Management Sciences (INFORMS) and targeted at data scientists.

2. EMC: Data Science Associate (EMCDSA)

Tests the ability to apply common techniques and tools required for big data analytics.

3. SAS Certified Predictive Modeler

Designed for SAS Enterprise Miner users who perform predictive analytics.


Tools to support the big data team

• Spreadsheet systems
• Statistical systems
• Optimization systems
• Simulation systems
• Business intelligence systems
• Data management systems
  ▪ Structured data
  ▪ Unstructured data
• Data integration systems
• Operating systems such as Hadoop

BOOZ ALLEN TALENT MANAGEMENT MODEL

CRISP-DM (cross-industry standard process for data mining)

Six Sigma’s DMAIC

News: Metis Bootcamp Tuition Increase

• Effective June 20, 2016, the tuition for the Metis Data Science Bootcamp in New York and San Francisco will increase to $15,500. Accepted students who have signed and returned their enrollment agreements on or before June 20, 2016 will receive the current tuition of $14,000.

• This is the first tuition increase for Metis, and is the result of continued investments to ensure that our students are best prepared for careers in data science.
