Transcript
Page 1: Mohammed AL Madhani

Setting up a Big Data Team: Best Practices

(*) perspective from Mohammed AL Madhani.

Page 2: Mohammed AL Madhani

Sources of this presentation

Page 3: Mohammed AL Madhani

Sources used in this presentation

Building Data Science Teams, by Paco Nathan. Publisher: O'Reilly Media

Page 4: Mohammed AL Madhani

Big Data trend in Google Trends

Page 5: Mohammed AL Madhani

Big Data vs IoT

36 different types of sensors = $30 on Amazon.com!!

Page 6: Mohammed AL Madhani
Page 7: Mohammed AL Madhani

Big Data vs IoT vs data science trends

Page 8: Mohammed AL Madhani

The Business Impacts of Data Science

Page 9: Mohammed AL Madhani

What is a Data Scientist?

Page 10: Mohammed AL Madhani
Page 11: Mohammed AL Madhani
Page 12: Mohammed AL Madhani

Business Intelligence vs Data Science

Page 13: Mohammed AL Madhani

Some commonly used methods/models

Discrete event simulation

Queuing model

Monte Carlo simulation

Agent-based modeling

System dynamics

Game theory

Probabilities

Economic analysis: IRR, NPV, FV

Linear regression

Stepwise regression

Logistic regression

Confidence intervals

Hypothesis testing

Statistical inference

Design of experiments

Analysis of variance

Principal component analysis (PCA)

Data mining

Forecasting

Artificial neural networks

Fuzzy logic

Expert systems

Decision trees

Markov chain

Revenue management (yield management)

Optimization

Linear programming

Integer programming
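
As an illustration of the economic analysis entries in the list above (FV, NPV, IRR), here is a minimal Python sketch using the standard textbook formulas. The function names, the bisection search bounds, and the sample cash flows are assumptions made for this example and are not taken from the presentation.

```python
def future_value(present_value, rate, periods):
    """FV of a single amount compounded once per period."""
    return present_value * (1 + rate) ** periods


def net_present_value(rate, cash_flows):
    """NPV of cash flows, where cash_flows[0] occurs at time 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))


def internal_rate_of_return(cash_flows, low=-0.99, high=10.0, tol=1e-9):
    """IRR found by bisection: the rate at which NPV equals zero.

    Assumes NPV changes sign exactly once between `low` and `high`
    (hypothetical default bounds chosen for this sketch).
    """
    f_low = net_present_value(low, cash_flows)
    f_high = net_present_value(high, cash_flows)
    if f_low * f_high > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while high - low > tol:
        mid = (low + high) / 2
        f_mid = net_present_value(mid, cash_flows)
        if f_low * f_mid <= 0:
            high = mid  # root lies in the lower half
        else:
            low, f_low = mid, f_mid  # root lies in the upper half
    return (low + high) / 2


# Hypothetical project: invest 1000 now, receive 400 in each of 3 years.
sample_flows = [-1000, 400, 400, 400]
print("FV of 1000 at 5% for 2 periods:", future_value(1000, 0.05, 2))
print("NPV at 8%:", net_present_value(0.08, sample_flows))
print("IRR:", internal_rate_of_return(sample_flows))
```

A project is usually considered attractive when its NPV at the required rate of return is positive, or equivalently when its IRR exceeds that rate.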

Page 14: Mohammed AL Madhani

Data Scientist

An artist at turning data into action

Skills:

65%

30%

45%

75%

Math

Computer science

Statistics

Domain expertise

Experience: Building data representations and using algorithms for optimization and validation; communicating with the team to ensure data availability, readiness, and completeness; working with the data researcher to decompose the problem.

30% Machine Learning

Page 15: Mohammed AL Madhani

The Data Science Venn Diagram

Page 16: Mohammed AL Madhani

How can we reduce traffic jams?

How can we reduce waiting time for patients?
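
One way to approach the waiting-time question is with a queuing model, one of the methods listed earlier in the deck. The sketch below uses the standard M/M/1 formulas and assumes a single server, Poisson arrivals, and exponentially distributed service times; the function name and the clinic numbers are hypothetical and not from the presentation.

```python
def mm1_wait_times(arrival_rate, service_rate):
    """Steady-state measures for an M/M/1 queue.

    arrival_rate: average arrivals per hour (lambda)
    service_rate: average patients served per hour (mu)
    Returns (utilization, mean wait in queue, mean time in system), in hours.
    """
    if arrival_rate >= service_rate:
        raise ValueError("Queue is unstable: arrivals must be slower than service")
    utilization = arrival_rate / service_rate
    wait_in_queue = utilization / (service_rate - arrival_rate)
    time_in_system = 1 / (service_rate - arrival_rate)
    return utilization, wait_in_queue, time_in_system


# Hypothetical clinic: 4 patients arrive per hour on average.
# Compare a service capacity of 5 per hour with 6 per hour.
for capacity in (5, 6):
    rho, wq, w = mm1_wait_times(4, capacity)
    print(f"capacity={capacity}/h utilization={rho:.2f} "
          f"queue wait={wq * 60:.0f} min, total time={w * 60:.0f} min")
```

Under these assumptions, a small increase in service capacity shortens the queue wait sharply when utilization is high, which is the kind of result a proof of concept can put in front of decision makers.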

Page 17: Mohammed AL Madhani

UNLOCKING DATA

The data scientist's mission

[Diagram labels: INPUTS, OUTPUTS; Question, Data, Answer, Solution, ROI]

Page 18: Mohammed AL Madhani

Generating the momentum

Description: Proofs of concept can generate the critical momentum needed to jump-start any data science capability.

Problem solving

Critical thinking

Buy-in

Necessary data

Clear ROI

Dedication & focus

Fail often and learn quickly

Limited complexity and duration

Page 19: Mohammed AL Madhani

Dealing with the problem (INFORMS CAP approach)

Steps 1 and 2: Decomposition & Datafication

Page 20: Mohammed AL Madhani

DESIGN THINKING METHODOLOGY (Alternative Approach)

Page 21: Mohammed AL Madhani

BOOZ ALLEN’S DESIGN THINKING TOOL BOX FOR ANALYTICS

Page 22: Mohammed AL Madhani

Big Data Researcher

Domain expert with data science knowledge

Skills:

65%

10%

Math

Computer science

Domain expertise

70%

Communication

Statistics

30%

30%

Mission: Generates low-fidelity prototypes to demonstrate applicability and test ideas quickly and cheaply before making significant investments

Page 23: Mohammed AL Madhani

The Four Key Activities of a Data Science Endeavor

Page 24: Mohammed AL Madhani

Respondents who said there weren’t enough data scientists to go around

Page 25: Mohammed AL Madhani
Page 26: Mohammed AL Madhani
Page 27: Mohammed AL Madhani

Do Data Scientists Have What They Need?

Page 28: Mohammed AL Madhani

Data preparation

Page 29: Mohammed AL Madhani

Page 30: Mohammed AL Madhani

“If you have perfect information or zero information then your task is easy – it is in between those two extremes that the trouble begins.”

Page 31: Mohammed AL Madhani

Maslow’s hierarchy of needs could be applied to data

Optimized

Measured

Defined

Managed

Performed

Page 32: Mohammed AL Madhani

Enhance data management maturity (data preparation)

Page 33: Mohammed AL Madhani

Data Management Team (data preparation part): Ready to Go!

Big Data Engineer: Develops and executes all data flow jobs, business rules, matching, scraping, cleaning, munging, joining, and wrangling.

Big Data Architect: Responsible for data flows, data solutions, data models, and architectures.

Big Data Steward: Responsible for managing the data elements and metadata, and for running the maturity model components.

Page 34: Mohammed AL Madhani

Some certificates

Aiming to certify your team will increase its maturity.

1. Certified Analytics Professional (CAP)

Created in 2013 by the Institute for Operations Research and the Management Sciences (INFORMS) and targeted at data scientists.

2. EMC: Data Science Associate (EMCDSA)

Tests the ability to apply common techniques and tools required for big data analytics.

3. SAS Certified Predictive Modeler

Designed for SAS Enterprise Miner users who perform predictive analytics.

Page 35: Mohammed AL Madhani

Tools to support the big data team

• Spreadsheet systems
• Statistical systems
• Optimization systems
• Simulation systems
• Business intelligence systems
• Data management systems
  ▪ Structured data
  ▪ Unstructured data
• Data integration systems
• Data processing platforms such as Hadoop

Page 36: Mohammed AL Madhani

BOOZ ALLEN TALENT MANAGEMENT MODEL

Page 37: Mohammed AL Madhani

BOOZ ALLEN TALENT MANAGEMENT MODEL

Page 38: Mohammed AL Madhani
Page 39: Mohammed AL Madhani

CRISP-DM (cross-industry standard process for data mining)

Page 40: Mohammed AL Madhani

Six Sigma’s DMAIC

Page 41: Mohammed AL Madhani

News: Metis Bootcamp Tuition Increase

• Effective June 20, 2016, the tuition for the Metis Data Science Bootcamp in New York and San Francisco will increase to $15,500. Accepted students who have signed and returned their enrollment agreements on or before June 20, 2016 will receive the current tuition of $14,000.

• This is the first tuition increase for Metis, and is the result of continued investments to ensure that our students are best prepared for careers in data science.

