• Home
  • Service
    • Service Overview
    • Our Services
  • Solutions
    • Finance Analytics
    • Marketing Analytics
    • HR Analytics
    • Sales Analytics
    • Supply chain Analytics
  • Products
    • OMEGA – BI Health Check
    • ROGERS – Retail Analytics
    • FOSHAN – Commodity Platform
  • Lab
    • Research
    • R-Training
      • R-Training- Beginner
      • R-Training Intermediate
      • R-Training Expert
    • Tableau Training
      • Tableau Training Beginner
      • Tableau Training Intermediate
      • Tableau Training Expert
  • Alliances
  • Events
  • Gallery
  • Careers
  • Contact Us
  • (+91) 950-310-4824
  • info@vcreatek.com

Research

CRISP-DM

The Cross-Industry Standard Process for Data Mining

Conceived in 1996, it remains one of the leading project methodologies for analytics work.

CRISP-DM consists of six major phases:

  1. Business Understanding (essentially the requirements phase);
  2. Data Understanding (the business requirements are mapped to data attributes);
  3. Data Preparation (data wrangling or munging);
  4. Modeling (the actual predictive models are built using a variety of algorithms and methods, e.g., GLMs (Generalized Linear Models), SVMs (Support Vector Machines), and so on);
  5. Evaluation (the models are back-tested on holdout samples); and
  6. Deployment (the models are put into production).

The last phase can lead into what is known as Operational Analytics, which is an enormous subject area; in modern terms, the aim is to deliver a “turn-key” solution. Model Management is also important: a model reflects the data available when it was created, so constant checking is needed to ensure it still works properly. The phases are run in a staggered manner and are often re-run over the course of a project. CRISP-DM is extensible and can easily be adapted to the specific needs of a particular user.
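To make the Modeling, Evaluation and Model Management ideas a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the data set is synthetic, the model is a straight line fitted by ordinary least squares, and the function names (prepare, fit_line, mean_abs_error, needs_review) and the 1.5 tolerance are invented for this example rather than being part of CRISP-DM.

```python
import random
import statistics


def prepare(rows):
    """Data Preparation: drop records with a missing value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def fit_line(train):
    """Modeling: ordinary least squares fit of y = a + b * x."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def mean_abs_error(model, rows):
    """Evaluation: mean absolute error of the model on the given rows."""
    intercept, slope = model
    return statistics.fmean([abs(y - (intercept + slope * x)) for x, y in rows])


def needs_review(model, fresh_rows, baseline_error, tolerance=1.5):
    """Model Management: flag the model when its error on fresh data
    exceeds the holdout baseline by more than the (assumed) tolerance."""
    return mean_abs_error(model, fresh_rows) > tolerance * baseline_error


# Synthetic data (an assumption for illustration): y = 2x + 1 plus noise.
rng = random.Random(0)
raw = [(float(x), 2.0 * x + 1.0 + rng.gauss(0, 1)) for x in range(100)]
raw.append((None, 3.0))  # one deliberately broken record

data = prepare(raw)
rng.shuffle(data)
train, holdout = data[:80], data[80:]  # 80/20 holdout split

model = fit_line(train)
baseline = mean_abs_error(model, holdout)  # back-test on the holdout sample

# Later on, the underlying relationship changes (slope 3 instead of 2).
drifted = [(float(x), 3.0 * x + 1.0 + rng.gauss(0, 1)) for x in range(100, 120)]

print("holdout error:", round(baseline, 3))
print("review needed on drifted data:", needs_review(model, drifted, baseline))
```

In a real project the check in needs_review would run on a schedule against newly labelled production data, and the threshold would be agreed with the business rather than hard-coded.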

Data Scientist or Not?

What a Data Scientist is Not

Let’s begin by talking about what a Data Scientist isn’t. As is well known, there are a lot of people claiming to be Data Scientists, many of whom clearly are not. To start with, Data Scientists are not people who have completed a few Coursera Machine Learning courses and know only Hadoop. Statisticians are not Data Scientists, even if they have a Master’s or a PhD. Lastly, Software Engineers are not Data Scientists, even if they are fantastic programmers.

So What is a Data Scientist? 

A Data Scientist is a true Scientist, that is, someone who uses the Scientific Method. According to the Oxford English Dictionary, the scientific method is “a method or procedure, consisting in systematic observation, measurement, and experiment, and the formulation, testing, and modification of hypotheses.” The key point here is hypothesis testing, which should be understood in the normal statistical sense.
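As a small illustration of hypothesis testing in the statistical sense, the sketch below runs a two-sided permutation test for a difference in means. The null hypothesis is that both groups come from the same distribution, so the group labels are interchangeable. The function name permutation_test, its parameters and the sample numbers are all hypothetical, chosen only for this example; a permutation test is one of many valid tests, not the only way to do this.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for a difference in means by
    repeatedly shuffling the pooled observations between the groups."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical measurements for two groups (made up for illustration).
control = [10.0, 11.0, 12.0, 10.0, 11.0, 12.0, 10.0, 11.0]
treated = [20.0, 21.0, 22.0, 20.0, 21.0, 22.0, 20.0, 21.0]

p_value = permutation_test(control, treated)
print("estimated p-value:", p_value)
```

A small p-value means the observed difference would be unusual if the null hypothesis were true. The hypothesis is then revised and tested again, which is the cycle the definition above describes.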

A Data Scientist is someone with a very deep understanding of the relation Data -> Information -> Knowledge, as well as an intuitive grasp of the Man/Machine Boundary. Also, an advanced degree in Data Science is not necessarily enough: you need to find a master and serve as an apprentice. This is because Data Science is intensely practical, and lots of knowledge, understanding, and (especially) experience are required. It is also a good idea to think in a cross-disciplinary manner, as some of the best and most powerful methods have been borrowed from the most unlikely places.

**************************************************

Note: Original Blog Entry by Mark A. Norrie.

Copyright 2020 by vCreaTek LLC. All Rights Reserved.