Srikanth K - Hadoop Developer (3.6 years, Hyderabad)

Srikanth K | Hadoop Developer
Email: [email protected] | Mobile: +91-7075436413

EXECUTIVE SUMMARY

3+ years of IT experience in Big Data and Hadoop.

Hands-on experience with HDFS, MapReduce, Apache Pig, Hive, Sqoop and HBase.

A dedicated team player, committed to providing high-quality support, with excellent problem-solving skills.

Good communication and interpersonal skills; self-motivated.

TECHNICAL SKILLS

Hadoop : HDFS, MapReduce, Apache Pig, Hive, Sqoop and HBase

Operating Systems : Windows 2003/2008, Unix, CentOS Linux

Relational Databases : MySQL, Oracle

Java : Core Java

EDUCATION

M.C.A. from JNTU (Jawaharlal Nehru Technological University).

B.Sc. (M.P.Ca.) from Osmania University.

PROFESSIONAL EXPERIENCE

Working for ADP India Pvt. Ltd. through Alwasi Software Pvt. Ltd., Hyderabad, from March 2012 till date.

Trained in Big Data / Hadoop and deployed to the client.

PROJECT DETAILS

Project     : LOWES Re-hosting of Web Intelligence
Environment : Hadoop, Apache Pig, Hive, Sqoop, Java, Linux, MySQL
Duration    : Jan 2014 till date

Description:

The purpose of the project is to store terabytes of log information generated by the e-commerce website and extract meaningful information out of it. The solution is based on the open-source Big Data software Hadoop. The data is stored in the Hadoop file system and processed using MapReduce jobs, which in turn include getting the raw HTML data from the websites, processing the HTML to obtain product and pricing information, extracting various reports out of the product pricing information, and exporting the information for further processing.

This project is mainly the re-platforming of the existing system, which runs on Web-Harvest (a third-party JAR) and a MySQL database, onto Hadoop, which is able to process large data sets (i.e., terabytes and petabytes of data) in order to meet the client's requirements in the face of increasing competition from its retailers.

Responsibilities:

Participated in client calls to gather and analyse the requirements.

Moved all crawl-data flat files generated by various retailers to HDFS for further processing.

Wrote Apache Pig scripts to process the HDFS data (a sketch of this pipeline appears after this list).

Created Hive tables to store the processed results in a tabular format.

Developed Sqoop scripts to enable interaction between Pig and the MySQL database.

For the dashboard solution, developed the Controller, Service and DAO layers of the Spring Framework.

Developed scripts for creating reports from Hive data.

Was completely involved in the requirement analysis phase.
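A minimal sketch of how such a crawl-data pipeline can be wired together with the tools named above (Pig, Hive and Sqoop over HDFS). All paths, script names, table names and connection details below are illustrative assumptions, not the project's actual artefacts.

```bash
# Illustrative sketch only: paths, file names and table names are assumed,
# not taken from the actual project.

# Move the retailer crawl flat files into HDFS for further processing.
hadoop fs -mkdir -p /data/crawl/raw
hadoop fs -put /staging/crawl/*.tsv /data/crawl/raw/

# Apache Pig script that parses the raw crawl records into product/pricing rows.
cat > process_crawl.pig <<'PIG'
raw    = LOAD '/data/crawl/raw' USING PigStorage('\t')
         AS (retailer:chararray, product:chararray, price:double);
priced = FILTER raw BY price IS NOT NULL;
STORE priced INTO '/data/crawl/parsed' USING PigStorage('\t');
PIG
pig -f process_crawl.pig

# Hive table over the parsed output so reports can be written in HiveQL.
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS product_pricing (
  retailer STRING, product STRING, price DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/crawl/parsed';"

# Sqoop export of the parsed data into MySQL for the dashboard/reporting layer.
sqoop export \
  --connect jdbc:mysql://dbhost/pricing \
  --username pricing_user -P \
  --table product_pricing \
  --export-dir /data/crawl/parsed \
  --input-fields-terminated-by '\t'
```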

Project : Private Bank Repository DW

Environment : Hadoop, HDFS, MapReduce, Hive, HBase and MySQL
Duration    : June 2012 to Dec 2013

Project Synopsis:

A full-fledged dimensional data mart to cater to the CPB analytical reporting requirement, as the current GWM system is mainly focused on data enrichment, adjustment, defaulting and other data-oriented processes. Involved in the full development life cycle in a distributed environment for the Candidate module. The Private Bank Repository system processes approximately 500,000 records every month.

Responsibilities:

Participated in client calls to gather and analyse the requirements.

Involved in setting up the Hadoop cluster in pseudo-distributed mode using Linux commands.

Worked with core Hadoop concepts: HDFS and MapReduce (JobTracker, TaskTracker).

Implemented MapReduce phases using Core Java; created and placed JAR files into HDFS and used the web UIs for the NameNode, JobTracker and TaskTracker.

Extracted, transformed and loaded data from Hive into an RDBMS.

Transformed data within the Hadoop cluster.

Used Pentaho MapReduce to parse weblog data, converting the raw weblog data into parsed, delimited records.

Built jobs to load data into Hive.

Created the table in HBase and built a transformation to load data into HBase (a sketch of these steps follows this list).

Wrote input/output formats for CSV.

Imported and exported data using Sqoop for job entries.

Designed and developed jobs using Pentaho.

Unit-tested the Pentaho MapReduce transformations.
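A brief sketch of the HBase/Hive steps listed above: creating the HBase table, exposing it to Hive, and pushing Hive output to an RDBMS with Sqoop. The table, column-family and database names are assumptions for illustration; the Pentaho transformations themselves are omitted because they are built in the Pentaho designer rather than written as code.

```bash
# Illustrative sketch only: table, column-family and connection details are assumed.

# Create the target HBase table with a single column family for record attributes.
echo "create 'cpb_records', 'cf'" | hbase shell

# Hive table mapped onto the HBase table (requires the hive-hbase-handler jar),
# so loads and queries can be expressed in HiveQL.
hive -e "
CREATE TABLE cpb_records_hive (rowkey STRING, acct STRING, amount DOUBLE)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,cf:acct,cf:amount')
TBLPROPERTIES ('hbase.table.name' = 'cpb_records');"

# Export a Hive-produced monthly summary back to MySQL for downstream reporting
# (\001 is Hive's default field delimiter for managed tables).
sqoop export \
  --connect jdbc:mysql://dbhost/cpb_mart \
  --username mart_user -P \
  --table monthly_summary \
  --export-dir /user/hive/warehouse/monthly_summary \
  --input-fields-terminated-by '\001'
```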

Project  : Project under Training (March 2012 to June 2012)
Big Data initiative at one of the largest financial institutions in North America:

One of the largest financial institutions in North America had implemented a small-business banking e-statements project using existing software tools and applications. The overall process to generate e-statements and send alerts to customers was taking 18 to 30 hours per cycle day, hence missing all SLAs and leading to customer dissatisfaction.

The purpose of the project was to cut down the processing time to generate e-statements and alerts by at least 50%, and also to cut down the cost by 50%.

Solution:

All sources of structured and unstructured data were ingested into the Hadoop platform:

o Unstructured data for small-business e-statements.

o Structured data from financial transactions, cycle transactions, supplemental transactions, WCC customer data and GAI online banking data.

4 GB chunks of data are created at the source (card processing system) and sent directly to the card processing system for PDF generation.

E-statement account numbers are generated using Pig scripts.

All the data is combined using HiveQL to create the necessary data for the customer notification engine.

Tools used: Sqoop to import data from databases into Hadoop.

Environment : Hadoop, MapReduce, Hive
Role        : Hadoop Developer

Roles & Responsibilities:

Created an hduser account for performing HDFS operations.

Created a MapReduce user for performing MapReduce operations only.

Wrote Apache Pig scripts to process the HDFS data.

Set up passwordless SSH for Hadoop.

Verified the Hadoop installation (TeraSort benchmark test).

Set up Hive with MySQL as a remote metastore.

Developed Sqoop scripts to enable interaction between Hive and the MySQL database.

Moved all log files generated by various network devices into an HDFS location.

Created an external Hive table on top of the parsed data (a sketch of these steps follows this list).
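A condensed sketch of the setup and ingest steps above (hduser creation, passwordless SSH, TeraSort verification, Sqoop import and an external Hive table). User names, hosts, paths and the examples-jar name are placeholders and vary by environment and Hadoop version.

```bash
# Illustrative sketch only: user names, hosts, paths and jar names are placeholders.

# Dedicated hduser plus passwordless SSH so Hadoop daemons and jobs run cleanly.
sudo adduser hduser
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

# TeraSort benchmark to verify the installation end to end
# (the examples jar name differs between Hadoop distributions).
hadoop jar hadoop-examples.jar teragen 1000000 /bench/teragen
hadoop jar hadoop-examples.jar terasort /bench/teragen /bench/terasort

# Land raw device log files in HDFS and pull reference data from MySQL with Sqoop.
hadoop fs -mkdir -p /data/netops/logs
hadoop fs -put /var/log/devices/*.log /data/netops/logs/
sqoop import \
  --connect jdbc:mysql://dbhost/netops \
  --username netops_user -P \
  --table device_inventory \
  --target-dir /data/netops/device_inventory

# External Hive table on top of the parsed log data
# (a parsing step, e.g. a Pig script, would populate /data/netops/logs_parsed).
hive -e "
CREATE EXTERNAL TABLE IF NOT EXISTS device_logs (
  device_id STRING, event_time STRING, message STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/netops/logs_parsed';"
```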

(Srikanth)