http://sps.columbia.edu/certificates/actuarial-science-certificate/tuition-and-financing
http://sps.columbia.edu/actuarial-science
https://www.collegechoice.net/rankings/best-actuarial-science-degrees/
https://www.collegevaluesonline.com/rankings/best-value-actuarial-science-programs/
http://www.businessresearchguide.com/degrees/best/careers-becoming/actuary/
https://thebestschools.org/rankings/best-online-bachelor-in-mathematics-degree-programs/
https://www.appliedmathonline.uw.edu/program-details/courses-curriculum/
https://www.appliedmathonline.uw.edu/
https://math.washington.edu/campus-resources
https://math.washington.edu/courses-related-actuarial-examinations
http://www.concordia.ca/artsci/math-stats/programs.html
https://uwaterloo.ca/statistics-and-actuarial-science/
https://www.statistics.utoronto.ca/
http://depts.washington.edu/compfin/cfrm-ms/
https://amath.washington.edu/master-science-applied-mathematics
https://marketplace.kpiinstitute.org/online-course-certified-kpi-prof.html
https://marketplace.kpiinstitute.org/online-certified-performance-improvement-prof.html
https://marketplace.kpiinstitute.org/online-course-certified-data-analysis-prof.html
https://marketplace.kpiinstitute.org/online-certified-benchmarking-prof.html
https://aryng.com/analytical-test-certificate/?cl=JHLOufal721553787200
https://hbr.org/2010/07/the-execution-trap
“A mediocre strategy well executed is better than a great strategy poorly executed.”
The balanced scorecard can help senior managers systematically link current actions with tomorrow’s goals, focusing on that place where, in the words of the authors, “the rubber meets the sky.”
Core to data science is the ability to connect a problem, or market, to data. Even today, many practicing data scientists are required to be "full stack": problem definition and research design; statistics and applied mathematics training; software engineering experience; and product articulation and business impact. Having one person, or even a small team, responsible for all of these operational components across a whole organization is unsustainable. As such, these roles need to be broken down into their constituent parts within organizations.
Best advice for someone who wants to go into data science?
The most successful data scientists I know are those driven by a curiosity for solving problems through data. To that end, I recommend starting by thinking about what kinds of problems they find most interesting. From there, go out and try to find some data in the problem area and begin experimenting. Run some basic summary statistics on the data, plot some basic graphs, or even build a simple model. This is also a great way to learn the relevant tools and methods, and the result can become a nice asset in a person's portfolio. It also typically leads people to ask a lot of new questions, which creates a virtuous feedback loop of learning and community discovery.
On average, a 1 percent price increase translates into an 8.7 percent increase in operating profits (assuming no loss of volume, of course).
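The arithmetic behind a figure like this can be sketched as follows. The operating margin of 11.5 percent is an assumption chosen for illustration (it is not stated in the source); with volume and costs held fixed, a 1 percent price rise then flows straight through to profit.

```python
def profit_uplift(operating_margin: float, price_increase: float) -> float:
    """Relative change in operating profit when price rises and volume and costs stay fixed."""
    revenue = 100.0                               # arbitrary baseline revenue
    costs = revenue * (1 - operating_margin)      # costs do not change with price
    profit_before = revenue - costs
    profit_after = revenue * (1 + price_increase) - costs
    return (profit_after - profit_before) / profit_before


# Assumed margin of 11.5% (hypothetical), 1% price increase
uplift = profit_uplift(operating_margin=0.115, price_increase=0.01)
print(f"Operating profit change: {uplift:.1%}")
```

The thinner the margin, the larger the leverage: the uplift is simply the price increase divided by the operating margin.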
Customer analytics dominate big data use in sales and marketing departments, supporting the four key strategies of increasing customer acquisition, reducing customer churn, increasing revenue per customer and improving existing products.
Big Data Use Cases:
48% Customer Analytics
21% Operational Analytics
12% Fraud & Compliance
10% New Product & Service Innovation
10% Enterprise Data Warehouse Optimization
18 Best Analytics Tools Every Business Manager Should Know
Business experiments: Business experiments, experimental design and AB testing are all techniques for testing the validity of something – be that a strategic hypothesis, new product packaging or a marketing approach. It is basically about trying something in one part of the organization and then comparing it with another where the changes were not made (used as a control group). It’s useful if you have two or more options to decide between.
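One common way to compare a test group against a control group, as described above, is a two-proportion z-test on conversion rates. The sketch below is only illustrative: the function name and the visitor and conversion counts are made up, and it assumes samples large enough for the normal approximation to hold.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates B - A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)      # rate under "no difference"
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: A is the control group, B received the change
z, p_value = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be chance alone; it says nothing about whether the difference is large enough to matter commercially.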
Correlation analysis: This is a statistical technique that allows you to determine whether there is a relationship between two separate variables and how strong that relationship may be. It is most useful when you ‘know’ or suspect that there is a relationship between two variables and you would like to test your assumption.
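As a minimal sketch of the idea, the Pearson correlation coefficient measures the strength of a linear relationship on a scale from -1 to 1. The variable names and numbers below are invented for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: weekly ad spend vs. weekly sales
ad_spend = [10, 20, 30, 40, 50]
sales = [25, 41, 62, 79, 103]
r = pearson_r(ad_spend, sales)
print(f"r = {r:.3f}")
```

A value near 1 or -1 indicates a strong linear relationship; a value near 0 indicates little linear relationship, though a non-linear one may still exist.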
Regression analysis: Regression analysis is a statistical tool for investigating the relationship between variables; for example, is there a causal relationship between price and product demand? Use it if you believe that one variable is affecting another and you want to establish whether your hypothesis is true.
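For the price and demand example, a simple least-squares line is the most basic form of regression. The sketch below uses made-up prices and unit sales. Note that a fitted slope on its own shows association; establishing a causal effect usually also needs an experiment or careful controls.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: price charged vs. units sold
prices = [8, 9, 10, 11, 12]
units_sold = [120, 112, 101, 88, 79]
slope, intercept = fit_line(prices, units_sold)
print(f"units_sold = {intercept:.1f} + ({slope:.1f}) * price")
```

Here the slope is the estimated change in units sold for each one-unit increase in price.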
Scenario analysis: Scenario analysis, also known as horizon analysis or total return analysis, is an analytic process that allows you to analyze a variety of possible future events or scenarios by considering alternative possible outcomes. Use it when you are unsure which decision to take or which course of action to pursue.
Forecasting/time series analysis: Time series data is data that is collected at uniformly spaced intervals. Time series analysis explores this data to extract meaningful statistics or data characteristics. Use it when you want to assess changes over time or predict future events based on what has happened in the past.
Data mining: This is an analytic process designed to explore data, usually very large business-related data sets – also known as ‘big data’ – looking for commercially relevant insights, patterns or relationships between variables that can improve performance. It is therefore useful when you have large data sets that you need to extract insights from.
Text analytics: Also known as text mining, text analytics is a process of extracting value from large quantities of unstructured text data. You can use it in a number of ways, including information retrieval, pattern recognition, tagging and annotation, information extraction, sentiment assessment and predictive analytics.
Sentiment analysis: Sentiment analysis, also known as opinion mining, seeks to extract subjective opinion or sentiment from text, video or audio data. The basic aim is to determine the attitude of an individual or group regarding a particular topic or overall context. Use it when you want to understand stakeholder opinion.
Image analytics: Image analytics is the process of extracting information, meaning and insights from images such as photographs, medical images or graphics. As a process it relies heavily on pattern recognition, digital geometry and signal processing. Image analytics can be used in a number of ways, such as facial recognition for security purposes.
Video analytics: Video analytics is the process of extracting information, meaning and insights from video footage. It includes everything that image analytics can do plus it can also measure and track behavior. You could use it if you wanted to know more about who is visiting your store or premises and what they are doing when they get there.
Voice analytics: Voice analytics, also known as speech analytics, is the process of extracting information from audio recordings of conversations. This form of analytics can analyze the topics or actual words and phrases being used, as well as the emotional content of the conversation. You could use voice analytics in a call center to help identify recurring customer complaints or technical issues.
Monte Carlo Simulation: The Monte Carlo Simulation is a mathematical problem-solving and risk-assessment technique that approximates the probability of certain outcomes, and the risk of certain outcomes, using computerized simulations of random variables. It is useful if you want to better understand the implications and ramifications of a particular course of action or decision.
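A minimal sketch of the technique: draw the uncertain inputs at random many times and count how often the outcome of interest occurs. The demand distribution, margin range and fixed costs below are all hypothetical numbers chosen for illustration.

```python
import random


def simulate_loss_probability(trials: int = 100_000, seed: int = 42) -> float:
    """Estimate the probability that a project loses money."""
    rng = random.Random(seed)                 # seeded so the run is repeatable
    fixed_costs = 4500.0                      # assumed fixed costs
    losses = 0
    for _ in range(trials):
        units = rng.gauss(1000, 200)          # uncertain demand (mean 1000, sd 200)
        margin = rng.uniform(4.0, 6.0)        # uncertain profit per unit
        profit = units * margin - fixed_costs
        if profit < 0:
            losses += 1
    return losses / trials


loss_probability = simulate_loss_probability()
print(f"Estimated probability of a loss: {loss_probability:.1%}")
```

More trials give a more stable estimate; the quality of the answer still depends entirely on how realistic the assumed input distributions are.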
Linear programming: Also known as linear optimization, this is a method of identifying the best outcome based on a set of constraints using a linear mathematical model. It allows you to solve problems involving minimizing and maximizing conditions, such as how to maximize profit while minimizing costs. It’s useful if you have a number of constraints such as time, raw materials, etc. and you wanted to know the best combination or where to direct your resources for maximum profit.
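To make the idea concrete, here is a small sketch for a two-variable problem. It relies on the fact that, when the feasible region is bounded, the best value of a linear objective is found at a corner of that region, so it simply checks every corner. This brute-force approach is for illustration only; real problems would use a dedicated solver. The products, profits and resource limits are invented.

```python
from itertools import combinations


def solve_2d_lp(objective, constraints):
    """Maximize objective[0]*x + objective[1]*y subject to
    a*x + b*y <= c for each (a, b, c), with x >= 0 and y >= 0.
    Assumes the feasible region is bounded and non-empty."""
    # Express x >= 0 and y >= 0 in the same "<=" form
    all_constraints = list(constraints) + [(-1, 0, 0), (0, -1, 0)]
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(all_constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue                          # parallel lines never intersect
        # Intersection of the two boundary lines (Cramer's rule)
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + 1e-9 for a, b, c in all_constraints):
            value = objective[0] * x + objective[1] * y
            if best is None or value > best[0]:
                best = (value, x, y)
    return best


# Hypothetical: profit 40 per unit of A and 30 per unit of B;
# labour: 2A + B <= 100 hours; material: A + B <= 80 units
best_profit, units_a, units_b = solve_2d_lp((40, 30), [(2, 1, 100), (1, 1, 80)])
print(best_profit, units_a, units_b)
```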
Cohort analysis: This is a subset of behavioral analytics, which allows you to study the behavior of a group over time. It is especially useful if you want to know more about the behavior of a group of stakeholders, such as customers or employees.
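A typical cohort analysis groups customers by the month they signed up and then tracks what share of each group is still active in later months. The sketch below assumes a hypothetical event format of (customer, signup month, active month), with months as simple integers.

```python
from collections import defaultdict


def retention_table(events):
    """Map each signup cohort to its retention rate by months since signup."""
    members = defaultdict(set)                # cohort -> customers in it
    active = defaultdict(set)                 # (cohort, age) -> active customers
    for customer, signup_month, active_month in events:
        members[signup_month].add(customer)
        active[(signup_month, active_month - signup_month)].add(customer)

    table = {}
    for cohort, customers in members.items():
        max_age = max(age for (c, age) in active if c == cohort)
        table[cohort] = [
            len(active.get((cohort, age), set())) / len(customers)
            for age in range(max_age + 1)
        ]
    return table


# Hypothetical activity log: (customer, signup month, month of activity)
events = [
    ("a", 0, 0), ("a", 0, 1), ("a", 0, 2),
    ("b", 0, 0), ("b", 0, 1),
    ("c", 0, 0),
    ("d", 0, 0), ("d", 0, 2),
    ("e", 1, 1), ("e", 1, 2),
    ("f", 1, 1),
]
table = retention_table(events)
for cohort, rates in sorted(table.items()):
    print(cohort, [f"{rate:.0%}" for rate in rates])
```

Reading across a row shows how quickly one cohort drops off; reading down a column compares cohorts at the same age.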
Factor analysis: This is the collective name given to a group of statistical techniques that are used primarily for data reduction and structure detection. It can reduce the number of variables within data to help make it more useful. Use it if you need to analyze and understand more about the interrelationships among a large number of variables.
Neural network analysis: A neural network is a computer program modeled on the human brain, which can process a huge amount of information and identify patterns in a similar way that we do. Neural network analysis is therefore the process of analyzing the mathematical modeling that makes up a neural network. This technique is particularly useful if you have a large amount of data.
Meta analytics/literature analysis: Meta analysis is the term that describes the synthesis of previous studies in an area in the hope of identifying patterns, trends or interesting relationships among the pre-existing literature and study results. Essentially, it is the study of previous studies. It is useful whenever you want to obtain relevant insights without conducting any studies yourself.
Resume building for data science: communication and prioritization are important, so make sure the interviewer can see that you are aware of this.
6 key analytics skills used by successful analysts/data scientists are:
- DTD framework: Understanding and hands-on experience of the basic “Data to Decisions” framework
- SQL skills: Ability to pull data from multiple sources and collate: experience in writing SQL queries and exposure to tools like Teradata, Oracle etc. Some understanding of Big Data tools using Hadoop is also helpful.
- Basic “applied” stat techniques a.k.a. Business Analytics: Hands-on experience with basic statistical techniques: Profiling, Correlation analysis, Trend analysis, Sizing/Estimation, Segmentation (RFM, product migration etc.). If you are in a consumer business, this list would include hands-on comfort in A/B Testing (also called Design of Experiments)
- Working effectively with business side: Ability to work effectively with stakeholders by building alignment, effective communication and influencing
- Advanced “applied” stat techniques (hands-on) a.k.a. Predictive Analytics and machine learning: Hands-on comfort with advanced techniques: Time Series, Predictive Analytics – Regression and Decision Tree, Segmentation (K-means clustering), machine learning - neural networks and Text Analytics (optional)
- Stat Tools: Experience with one or more statistical tools like Python, R, SAS, SPSS, Knime or others.
4 key data science skills needed by business professionals are:
- DTD framework: Understanding and hands-on experience of the basic “Data to Decisions” framework.
- Basic “applied” stat techniques: Hands-on experience with business analytics: Profiling, Correlation analysis, Trend analysis, Sizing/Estimation, Basic Segmentation and basics of A/B Testing (also called Design of Experiments)
- Working effectively with analysts: Ability to work effectively with Data Scientists/Analysts
- Advanced “applied” stat techniques (intro): High-level understanding of predictive analytics: Time Series, Predictive Analytics – Regression and Decision Tree, Segmentation, Neural Networks.
Key Attributes of an Actionable Insight:
1. Alignment
When an insight is closely tied to your key business goals and strategic initiatives, it’s more likely to drive action. If you don’t know how to react to a particular metric when it significantly increases or decreases, you might be looking at an unnecessary vanity metric. Insights based on key performance indicators (KPIs) and other key metrics inherently engender a sense of urgency that other data won’t. It’s easier to interpret and convert strategically-aligned insights into tactical responses because they often relate directly to the levers in your business that you control, influence or are focused on.
2. Context
It is hard to move forward on an insight if you are lacking ample background to appreciate why it’s important or unique. We often need to have a comparison or benchmark to give data proper context. For example, if your company generated 1,400 leads this week your reaction to this result could change entirely with a dash of context. If you know your marketing team typically generates 1,250 leads each week you might do nothing. However, if your company just sponsored and exhibited at your industry’s major convention last week, you may be wondering why it didn’t translate into significantly more leads. Without accompanying context, an insight can end up raising more questions than action. Having ample supporting details ensures the insight results in action and not unwarranted skepticism and objections.
3. Relevance
A single insight can be both a strong signal for one person and just more noise for another. There’s a level of subjectivity when it comes to the relevance of insights. In order to be relevant, an insight needs to be delivered to the right person at the right time in the right setting. If insights aren’t routed to the right decision makers, they will not receive the attention they deserve. If insights aren’t timely, they might be too stale for stakeholders to act on. If insights are trapped in an analytics tool that managers never access or delivered to devices they use infrequently, the insights may never reach the intended audience.
4. Specificity
The more specific and complete the insight is, the more likely it can be acted on. Sometimes insights based on KPIs and other high-level metrics can highlight interesting anomalies but lack sufficient detail to drive immediate action. For example, knowing your revenue is up 35% this month may be a cause to celebrate, but party planning might be premature without deeper insights. You might discover a massive fraudulent order messed up your online revenue numbers or a big customer win was based on promised product functionality that won’t exist for the foreseeable future. If an insight doesn’t adequately help to explain why something occurred, it’s not yet actionable. Deeper probing may be required before it’s ready for primetime.
5. Novelty
With so much competing data and information to digest, novel insights will have an advantage over more familiar insights. The first time your company spots a particular pattern will be more interesting and compelling than the tenth time, especially if you feel you already have a good handle on what’s driving the behavior. Curiosity can drive people to test or verify an unusual or unexpected finding in the hope that it sheds new light on a key subject area. This criterion speaks more to human nature than to how valuable an insight actually is. We become numb to certain insights if we feel as though they reinforce rather than challenge or evolve our current knowledge and beliefs.
6. Clarity
If people don’t clearly understand an insight, why it’s important and how it can help them—the insight will be overlooked and forgotten. Communicating insights effectively is important to their adoption and fruition. The right data visualizations and messaging can help explain insights so they are more easily understood and correctly interpreted. However, poor communication can cause the signal to be lost in the noise. A clearly communicated insight creates a strong signal that is hard to miss or ignore, and it prepares a pathway for action to occur.
With these six criteria you can weigh how “actionable” the insights are that you receive from your analytics and business intelligence tools. Regardless of the source of the insights—humans or machines—the more they line up with these attributes, the more actionable they will be for your business. Strategically-aligned tops random; relevant beats extraneous; novel trumps familiar—you get the idea.
Harvesting more insights from your data can yield tremendous returns for your company. However, as author Richard Bach said, “Any powerful idea is absolutely fascinating and absolutely useless until we choose to use it.” While the increased actionability of an insight doesn’t guarantee its adoption or application, it should motivate more individuals within your company to think more deeply about the data and encourage them to act on a more consistent basis. Make sure your hard-earned insights are as actionable as possible so they’re primed to drive value for your organization.
https://www.smartdatacollective.com/
https://www.smartdatacollective.com/why-data-analytics-insurance-industry-is-major-game-changer/
https://www.smartdatacollective.com/5-brands-using-big-data-to-brilliantly-disrupt-tech/
https://junkcharts.typepad.com/numbersruleyourworld/
http://www.dataminingblog.com/
https://junkcharts.typepad.com/junk_charts/
http://abbottanalytics.blogspot.com/
https://www.kdnuggets.com/websites/blogs.html
https://mathbabe.org/2012/12/20/nate-silver-confuses-cause-and-effect-ends-up-defending-corruption/
https://statistically-funny.blogspot.com/
Good Sources of Data : https://www.forbes.com/sites/bernardmarr/2016/02/12/big-data-35-brilliant-and-free-data-sources-for-2016/#6c28fb45b54d
https://www.wallstreetmojo.com/top-best-actuaries-books/
https://www.quora.com/What-are-some-good-books-for-someone-interested-in-becoming-an-actuary
https://www.quora.com/What-are-some-books-for-actuarial-science
https://www.quora.com/What-book-should-I-refer-to-to-become-an-actuary
https://www.quora.com/What-are-some-good-Statistics-books-for-an-Actuarial-undergraduate
https://etchedactuarial.com/tips-and-advice-thank-page/
https://www.actuarialninja.com/actuarial-exams/choosing-between-soa-and-cas/
https://etchedactuarial.com/how-to-study-exam-p/
https://www.quora.com/Is-actuarial-science-difficult
http://www.pstat.ucsb.edu/instruction/actuary/study.html
https://www.soa.org/library/newsletters/the-future-actuary/2007/summer/how-to-prepare.aspx
https://www.beanactuary.org/exams/
https://www.amazon.in/Top-Secrets-Passing-Actuary-Exams-ebook/dp/B016YOCMUQ
https://www.theinfiniteactuary.com/
https://www.actexmadriver.com/
https://www.actexmadriver.com/asmstudymanuals.aspx
http://www.actuarialoutpost.com/actuarial_discussion_forum/showthread.php?t=193978&page=2
https://www.actuary.com/actuarial-discussion-forum/archive/index.php/t-21466.html
https://etchedactuarial.com/best-study-manual-for-exam-p/
https://www.soa.org/education/exam-req/edu-exam-p-detail.aspx
https://www.actuarialbookstore.com/samples/ASM_1P-ASM-18SSMP_SAMPLE_10-23-17.pdf
https://www.actuarialbookstore.com/samples/1P-ACT-18SSM_SAMPLE.pdf
https://www.quora.com/What-courses-should-I-take-for-exam-P-of-the-actuarial-exams
http://rixisu.6d4cf9adb07.bocesawo.pw/oEt2eabbJKYqDfJYMPs/
https://math.stackexchange.com/questions/1189145/recommended-textbook-to-prepare-for-exam-p
From KD Nuggets Website list of certifications : https://www.kdnuggets.com/education/analytics-data-mining-certificates.html
Australia Analytics Credential : https://www.iapa.org.au/resources/article/first-analytics-credential-in-australia-launched
Microsoft Certificate of Data Science: https://www.edx.org/microsoft-professional-program-data-science
University of Pennsylvania: https://www.coursera.org/specializations/business-analytics
Rice University: https://www.coursera.org/specializations/business-statistics-analysis
Essec School : https://www.coursera.org/specializations/strategic-analytics
Master of Business Administration :
- https://www.coursera.org/specializations/strategic-leadership
- https://www.coursera.org/degrees/imba
Emory University Foundations of Marketing Analytics Specialization: https://www.coursera.org/specializations/marketing-analytics
EDX Data Science Courses : https://www.edx.org/course/subject/data-science
University of Colorado : https://www.coursera.org/specializations/data-analytics-business
PwC : https://www.coursera.org/specializations/pwc-analytics?
Specialization : https://www.coursera.org/specializations/data-analytics-business
https://www.coursera.org/learn/text-mining-analytics
https://www.coursera.org/learn/predictive-analytics
https://www.coursera.org/learn/hypothesis-testing-confidence-intervals
Advanced-level courses:
https://www.coursera.org/specializations/aml
https://www.coursera.org/learn/linear-models-2
https://www.coursera.org/specializations/advanced-data-science-ibm
Management Courses :
https://www.coursera.org/specializations/marketing-mix
https://www.coursera.org/specializations/strategic-analytics
https://www.coursera.org/specializations/understanding-modern-finance
https://www.coursera.org/learn/business-strategies
https://www.coursera.org/learn/business-model
https://www.coursera.org/learn/case-studies-business-analytics-accenture
https://www.coursera.org/learn/strategic-management
https://www.coursera.org/learn/brand-management
https://www.coursera.org/learn/strategic-business-analytics
Data Science Certification as per CIO Magazine :
Some popular data science certifications include the following:
- Certified Analytics Professional (CAP) – The CAP Program
- Certified Specialist in Predictive Analytics (CSPA) – The CAS Institute
- Cloudera Certified Professional: CCP Data Engineer – Cloudera
- Data Science Certificate – Harvard Extension School
- DASCA Data Science Credentials – Data Science Council of America
- IAPA Analytics Credentials – IAPA
- SAS Academy for Data Science – SAS Institute
- SAS Certified Big Data Professional/Data Scientist – SAS Institute
- Simplilearn Data Science Certification Training – Simplilearn
- Teradata Aster Analytics Certification – Teradata
Data Science Degree Programs :
- Master of Science in Statistics: Data Science at Stanford University
- Master of Information and Data Science: Berkeley School of Information
- Master of Computational Data Science: Carnegie Mellon University
- Master of Science in Data Science: Harvard University John A. Paulson School of Engineering and Applied Sciences
- Master of Science in Data Science: University of Washington
- Master of Science in Data Science: Johns Hopkins University Whiting School of Engineering
- MSc in Analytics: University of Chicago Graham School
Top 15 data science certifications
- Applied AI with DeepLearning, IBM Watson IoT Data Science Certificate
- Certified Analytics Professional (CAP)
- Cloudera Certified Associate: Data Analyst
- Cloudera Certified Professional: CCP Data Engineer
- Data Science Council of America (DASCA)
- Dell Technologies Data Scientist Associate (DCA-DS)
- Dell Technologies Data Scientist Advanced Analytics Specialist (DCS-DS)
- HDP Data Science
- IBM Certified Data Architect
- Microsoft MCSE: Data Management and Analytics
- Microsoft Certified Azure Data Scientist Associate
- Microsoft Professional Program in Data Science
- SAS Certified Advanced Analytics Professional
- SAS Certified Big Data Professional
- SAS Certified Data Scientist
Applied AI with DeepLearning, IBM Watson IoT Data Science Certificate
To earn IBM’s Watson IoT Data Science Certification, you’ll need some experience coding, preferably in Python, but they will consider any programming language as a place to start. Math skills, especially with linear algebra, are recommended but the course promises to cover the topics within the first week. It’s aimed at those with more advanced data science skills and classes are offered through Coursera.
Cost: $49 per month for a subscription to Coursera
Location: Online
Duration: Self-paced
Expiration: Does not expire
Certified Analytics Professional (CAP)
CAP offers a vendor-neutral certification and promises to help you “transform complex data into valuable insights and actions,” which is exactly what businesses are looking for in a data scientist: someone who not only understands the data but can draw logical conclusions and then express to key stakeholders why those data points are significant. If you’re new to data analytics, you can start with the entry-level Associate Certified Analytics Professional (aCAP) exam and then move on to your CAP certification.
Cost: $495 for INFORMS members, $695 for non-members; team pricing for organizations is available on request
Location: In person at designated test centers
Duration: Self-paced
Expiration: Valid for three years
Cloudera Certified Associate: Data Analyst
The CCA exam demonstrates your foundational knowledge as a developer, data analyst and administrator of Cloudera’s enterprise software. Passing a CCA exam and earning your certification will show employers that you have a handle on the basic skills required to be a data scientist. It’s also a great way to prove your skills if you’re just starting out and lack a strong portfolio or past work experience.
Cost: $295 per exam specialty and per attempt
Location: Online
Duration: Self-paced
Expiration: Valid for two years
Cloudera Certified Professional: CCP Data Engineer
Once you earn your CCA, you can move on to the CCP exam, which Cloudera touts as one of the most rigorous and “demanding performance-based certifications.” According to the website, those looking to earn their CCP need to bring “in-depth experience developing data engineering solutions” to the table, as well as a “high-level of mastery” of common data science skills. The exam consists of eight to 12 customer problems that you will have to solve hands-on using a Cloudera Enterprise cluster. The exam lasts 120 minutes and you’ll need to earn a 70 percent or higher to pass.
Cost: $600 per attempt — each attempt includes three exams
Location: Online
Duration: Self-paced
Expiration: Valid for three years
Data Science Council of America (DASCA)
The Data Science Council of America offers a data scientist certification that was designed to address “credentialing requirements of senior, accomplished professionals who specialize in managing and leading Big Data strategies and programs for organizations,” according to DASCA. The certification track includes paths for earning your Senior Data Scientist (SDS) and the more advanced Principal Data Scientist (PDS) credentials. Both exams last 100 minutes and consist of 85 and 100 multiple-choice questions for the SDS and PDS exams, respectively. You’ll need at least six years of big data analytics or engineering experience to start on the SDS track and 10 or more years of experience to qualify for the PDS exam.
Cost: $520 per exam
Location: Online
Duration: Self-paced
Expiration: 5 years
Dell Technologies Data Scientist Associate (DCA-DS)
The DCA-DS certification is an entry-level data science designation that is designed for those new to the industry or who want to make a career switch to work as a data scientist. While the exam is designed for those without a strong background in machine learning, statistics, math or analytics, it’s still a requirement for the more advanced certification. So even if you’re already an experienced data scientist, you’ll still need to pass this exam before you can move on to the Advanced Analytics Specialist designation.
Cost: $230 per Proven Professional certification exam; you’ll also need to purchase any books or other course material
Location: Online via Pearson VUE
Duration: Self-paced
Expiration: Does not expire
Dell Technologies Data Scientist Advanced Analytics Specialist (DCS-DS)
The DCS-DS certification builds on the entry-level associate certification and covers general knowledge of big data analytics across different industries and technologies. It doesn’t specifically focus on one product or industry, so it’s a good option if you aren’t sure where you want to go with your data career or if you just want a more generalized certification for your resume. The exam covers advanced analytical methods, social network analysis, natural language processing, data visualization methods and popular data tools like Hadoop, Pig, Hive and HBase.
Cost: $230 per Proven Professional certification exam; you’ll also need to purchase any books or other course material
Location: Online via Pearson VUE
Duration: Self-paced
Expiration: Does not expire
HDP Data Science
The HDP Data Science certification course from Hortonworks covers data science topics like machine learning and natural language processing. It also covers popular concepts and algorithms used in classification, regression, clustering, dimensionality reduction and neural networks. The course will also get you up to speed on the latest tools and frameworks, including Python, NumPy, pandas, SciPy, scikit-learn, NLTK, TensorFlow, Jupyter, Spark MLlib, Stanford CoreNLP, TensorFlowOnSpark/Horovod/MLeap and Apache Zeppelin. Half of the course consists of lecture and discussion, and the other half consists of hands-on labs, which you’ll complete before taking the exam.
Cost: $250 per attempt
Location: Online
Duration: 4 days
Expiration: 2 years
IBM Certified Data Architect
IBM’s Certified Data Architect certification isn’t for everyone — it’s geared toward seasoned professionals and experts in the field. IBM recommends that you have knowledge of the data layer and associated risk and challenges, cluster management, network requirement, important interfaces, data modeling, latency, scalability, high availability, data replication and synchronization, disaster recovery, data lineage and governance, LDAP security and general big data best practices. You will also need prior experience with software such as BigInsights, BigSQL, Hadoop and Cloudant (NoSQL), among others. You can see the long list of prerequisites on IBM’s website, but it’s safe to say you’ll need a solid background in data science to qualify for this exam.
The certification exam consists of 55 questions and five sections focusing on requirements (16%), use cases (46%), applying technologies (16%), recoverability (11%) — you will have 90 minutes to complete the exam. IBM offers web-based and in-classroom training courses on InfoSphere BigInsights, BigInsights Analytics for Programmers and Big SQL for developers.
Cost: $200
Location: Online
Duration: 90 minutes
Expiration: N/A
Microsoft MCSE: Data Management and Analytics
MCSE certifications cover a wide variety of IT specialties and skills, including data science. For data science certifications, Microsoft offers two courses, one that focuses on business applications, and another that focuses on data management and analytics. However, each course requires prior certification under the MCSE Certification program, so you’ll want to make sure you check the requirements first.
Cost: $165 per exam, per attempt
Location: Online
Duration: Self-paced
Expiration: Valid for three years
Microsoft Certified Azure Data Scientist Associate
The Azure Data Scientist Associate certification from Microsoft focuses on your ability to utilize machine learning to “train, evaluate and deploy models that solve business problems,” according to Microsoft. Candidates for the exam are tested on machine learning, AI solutions, natural language processing, computer vision and predictive analytics. The exam focuses on defining and preparing the development environment, data modeling, feature engineering and developing models.
Cost: $165
Location: Online
Duration: Self-paced
Expiration: Credentials do not expire
Microsoft Professional Program in Data Science
The Microsoft Professional Program in Data Science focuses on eight specific data science skills, including T-SQL, Microsoft Excel, PowerBI, Python, R, Azure Machine Learning, HDInsight and Spark. Microsoft claims there are over 1.5 million open jobs looking for these skills. Courses run for three months every quarter and you don’t have to take them in order; it’s self-paced with a recommended commitment of two to four hours per week.
Cost: Must purchase credits through EdX, some materials are free
Location: Online
Duration: 6 weeks
Expiration: Does not expire
SAS Certified Advanced Analytics Professional
This program covers machine learning, predictive modeling techniques, working with big data sets, finding patterns, optimizing data techniques and time series forecasting. The certification program consists of nine courses and three exams that you’ll have to pass to earn the designation. You’ll need at least six months of programming experience in SAS or another language and it’s also recommended that you have at least six months of experience using mathematics or statistics in a business setting.
Cost: $299 per month subscription
Location: Online
Duration: Self-paced
Expiration: Credentials do not expire
SAS Certified Big Data Professional
The SAS Big Data certification includes two modules with a total of nine courses. You’ll need to pass two exams to earn the designation. The course covers SAS programming skills, working with data, improving data quality, communication skills, fundamentals of statistics and analytics, data visualization and popular data tools such as Hadoop, Hive, Pig and SAS. To qualify for the exam, you’ll need at least six months of programming experience in SAS or another language.
Cost: $299 per month subscription
Location: Online
Duration: Self-paced
Expiration: Credentials do not expire
SAS Certified Data Scientist
The SAS Certified Data Scientist certification is a combination of the other two data certifications offered through SAS. It covers programming skills, managing and improving data, transforming, accessing and manipulating data and how to work with popular data visualization tools. Once you earn both the Big Data Professional and Advanced Analytics Professional certifications, you can qualify to earn your SAS Certified Data Scientist designation. You’ll need to complete all 18 courses and pass the five exams between the two separate certifications.
Cost: $299 per month subscription
Location: Online
Duration: Self-paced
Expiration: Credentials do not expire
Data Analysis for Life Sciences : https://www.edx.org/xseries/data-analysis-life-sciences
Podcast by Data Scientist : https://soundcloud.com/nssd-podcast
Data Visualization : http://socviz.co/
R CookBook : http://www.cookbook-r.com/
GGPLOT2 Mastery : https://github.com/hadley/ggplot2-book
R packages : http://r-pkgs.had.co.nz/
Advanced R : http://adv-r.had.co.nz/
R for Data Science : https://r4ds.had.co.nz/
Text Mining : https://www.tidytextmining.com/
Fundamentals of Data Visualization : https://serialmentor.com/dataviz/
STAT 545 Notes University of British Columbia : https://github.com/STAT545-UBC
The Caret Package : http://topepo.github.io/caret/
Over the past few years, we’ve seen a new community of data science leaders emerge.
Regardless of their industry, we have heard three themes emerge over and over: 1) Companies are recognizing that data science is a competitive differentiator. 2) People are worried their companies are falling behind — that other companies are doing a better job with data science. 3) Data scientists and data science leaders are struggling to explain to executives why data science is different from other types of work, and the implications of these differences on how to equip and organize data science teams.
Introduction
Since we started Domino five years ago, we have talked to hundreds of companies that are investing in data science, and heard all about their successes and their challenges.
At various points during that time, we focused on different aspects of the challenges that face data scientists and data science teams.
- When we first launched Domino in 2014, we focused on automating much of the “dev ops” work that data scientists must do, in order to accelerate their work.
- In 2015, we broadened our aperture to address data scientists’ need to track and organize their research.
- In 2016, we added capabilities to deploy models, creating a unified platform to support the data science lifecycle from development to deployment.
- And in 2017, we emphasized how collaboration, reproducibility, and reusability are the foundation that allows data science teams to scale effectively.
At every point along the way, we felt like there was something larger we wanted to say, but we didn’t quite know how. Like the parable of the blind men describing different parts of an elephant, we knew we were describing pieces but not the whole.
So about a year ago we took a step back. We had long discussions with our customers to distill and synthesize what makes data science different and what differentiates companies who apply it most effectively.
What do data scientists make?
Our major insight came when we asked ourselves: “what do data scientists make?”
Beyond the hype about AI and machine learning, at the heart of data science is something called a model. By “model,” I mean an algorithm that makes a prediction or recommendation or prescribes some action based on a probabilistic assessment.
Models can make decisions and take action autonomously and with speed and sophistication that humans can’t usually match. That makes models a new type of digital life.
Data scientists make models.
And if you look at the most successful companies in the world, you’ll find models at the heart of their business driving that success.
An example that everyone is familiar with is the Netflix recommendation model. It has driven subscriber engagement, retention, and operational efficiency at Netflix. In 2016, Netflix indicated that their recommendation model is worth more than $1B per year.
Coca-Cola uses a model to optimize orange juice production. Stitch Fix uses models to recommend clothing to its customers. Insurance companies are beginning to use models to make automated damage estimates from accident photos, reducing dependence on claims adjusters.
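To make the definition of a model given earlier in this section concrete, here is a minimal sketch of an algorithm that produces a probabilistic assessment and then prescribes an action from it. Everything in it is an assumption made for illustration: the churn scenario, the function names, the feature names and the hand-picked weights are hypothetical and do not describe any company's actual model.

```python
import math


def churn_probability(days_since_last_login: float, support_tickets: int) -> float:
    """Return a probability in (0, 1) that a subscriber will churn.

    Hypothetical logistic model: the intercept and weights below are
    invented for illustration, not learned from real data.
    """
    z = -3.0 + 0.08 * days_since_last_login + 0.5 * support_tickets
    return 1.0 / (1.0 + math.exp(-z))


def recommend_action(probability: float, threshold: float = 0.5) -> str:
    """Turn the probabilistic assessment into a prescribed action."""
    if probability >= threshold:
        return "send_retention_offer"
    return "no_action"


if __name__ == "__main__":
    for days, tickets in [(0, 0), (60, 2)]:
        p = churn_probability(days, tickets)
        print(days, tickets, round(p, 3), recommend_action(p))
```

In a real project the weights would be estimated from historical data rather than written by hand, and the threshold would be chosen by weighing the cost of an unnecessary offer against the cost of a lost subscriber.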
The Model Myth
Though obvious in one sense, the realization that data scientists make models is powerful because it explains most of the challenges that companies have making effective use of data science.
Fundamentally, the reasons companies struggle with data science all stem from misunderstandings about how models are different from other types of assets they’ve built in the past.
Many companies try to develop and deploy models the way they develop and deploy software. And many companies try to equip data scientists with technology as if they were equipping business analysts to run queries and build business intelligence dashboards.
It’s easy to see why companies fall into this trap: models involve code and data, so it’s easy to mistake them for software or data assets.
We call this the Model Myth: it’s the misconception that because models involve code and data, companies can treat them like they have traditionally treated software or data assets.
Models are fundamentally different, in three ways:
- The materials used to develop them are different. They involve code, but they use different techniques and different tools than software engineering. They use more computationally intensive algorithms, so they benefit from scalable compute and specialized hardware like GPUs. They use far more data than software projects. And they leverage packages from a vibrant open source ecosystem that’s innovating every day. So data scientists need extremely agile technology infrastructure, to accelerate research.
- The process to build them is different. Data science is research — it’s experimental and iterative and exploratory. You might try dozens or hundreds of ideas before getting something that works. So data scientists need tools that allow for quick exploration and iteration to make them productive and facilitate breakthroughs.
- Models’ behavior is different. Models are probabilistic. They have no “correct” answer — they can just have better or worse answers once they’re live in the real world. And while nobody needs to “retrain” software, models can change as the world changes around them. So organizations need different ways to review, quality control, and monitor them.
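The third difference, that a live model can degrade as the world changes around it, is the one that lends itself most directly to a code sketch. The example below is one simple way to monitor a deployed model, assuming that true outcomes eventually become available for comparison. The class name, the rolling-window approach and the tolerance value are illustrative assumptions, not a method described in this article.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions and flag degradation.

    Hypothetical helper: compares rolling live accuracy with the accuracy
    measured when the model was validated before deployment.
    """

    def __init__(self, baseline_accuracy: float, window_size: int = 100,
                 tolerance: float = 0.05) -> None:
        self.baseline_accuracy = baseline_accuracy
        self.tolerance = tolerance
        # A deque with maxlen drops the oldest outcome automatically.
        self.outcomes = deque(maxlen=window_size)

    def record(self, predicted, actual) -> None:
        """Store whether a single live prediction matched the real outcome."""
        self.outcomes.append(predicted == actual)

    def live_accuracy(self):
        """Return accuracy over the current window, or None if it is empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_review(self) -> bool:
        """True when live accuracy falls below baseline minus tolerance."""
        accuracy = self.live_accuracy()
        if accuracy is None:
            return False
        return accuracy < self.baseline_accuracy - self.tolerance


if __name__ == "__main__":
    monitor = AccuracyMonitor(baseline_accuracy=0.9, window_size=4)
    for predicted, actual in [(1, 1), (0, 0), (1, 0), (1, 0)]:
        monitor.record(predicted, actual)
    print(monitor.live_accuracy(), monitor.needs_review())
```

Conventional software has no equivalent of this check: once its tests pass, its behavior does not drift, whereas a model's quality depends on data that keeps changing after release.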
Model Management
The companies who make the most effective use of data science — ones who consistently drive competitive advantage through data science — are the ones who recognize that models are different and treat them differently.
We’ve studied the various ways these companies treat models differently and organized that into a framework we call Model Management.
Historically, “model management” has referred narrowly to practices for monitoring models once they are running in production. We mean it as something much broader.
Model Management encompasses a set of processes and technologies that allow companies to consistently and safely drive competitive advantage from data science at scale.
Model Management has five parts to it:
- Model Development allows data scientists to rapidly develop models, experiment, and drive breakthrough research.
- Model Production is how data scientists’ work gets operationalized: how it goes from a cool project to a live product integrated into business processes, affecting real decisions.
- Model Technology encompasses the compute infrastructure and software tooling that gives data scientists the agility they need to develop and deploy innovative models.
- Model Governance is how a company can keep a finger on the pulse of the activity and impact of data science work across its organization, to know what’s going on with projects, production models, and the underlying infrastructure supporting them.
- Model Context is at the heart of these capabilities. It is all the knowledge, insights, and all the artifacts that are generated while building or using models. This is often a company’s most valuable IP, and the ability to find, reuse, and build upon it is critical to driving rapid innovation.
Each of these facets of managing models requires unique processes and products. When integrated together, they unlock the full potential of data science for organizations.
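As a rough illustration of the Model Context idea above, the sketch below records the knowledge and artifacts around each model in a small catalog so that they can be found and reused later. It is only a sketch under stated assumptions: the `ModelRecord` and `ContextCatalog` names, their fields and the sample values are hypothetical and do not describe any particular product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ModelRecord:
    """Hypothetical record of the context generated while building a model."""
    name: str
    owner: str
    code_version: str          # e.g. a commit identifier
    training_data: str         # a description or location of the data used
    metrics: dict[str, float] = field(default_factory=dict)
    tags: set[str] = field(default_factory=set)
    notes: str = ""


class ContextCatalog:
    """In-memory catalog that makes past work searchable by tag."""

    def __init__(self) -> None:
        self._records: list[ModelRecord] = []

    def add(self, record: ModelRecord) -> None:
        self._records.append(record)

    def find_by_tag(self, tag: str) -> list[ModelRecord]:
        return [r for r in self._records if tag in r.tags]


if __name__ == "__main__":
    catalog = ContextCatalog()
    catalog.add(ModelRecord(
        name="churn-v1",
        owner="analytics-team",
        code_version="abc123",
        training_data="subscriber activity, sample period",
        metrics={"auc": 0.81},
        tags={"churn", "retention"},
        notes="Login recency was the strongest signal.",
    ))
    print([r.name for r in catalog.find_by_tag("churn")])
```

A shared record of this kind lets a new project start from what was already learned instead of repeating the same experiments.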
Computing revolutions separate winners from losers
Data science is a new era of computing. The first era was hardware, where engineers made chips and boards. The second era was software, where engineers made applications. In the third era, data scientists make models.
And like past revolutions in computing, two things are true about the data science era:
- Companies’ ability to adopt and effectively apply the new approach will determine their competitiveness over the coming years. Just as “software ate the world” and “every company needed to be a software company”, every company will need to become a data science company if they want to stay competitive.
- The methodologies and tooling and processes that worked for the previous era will not work for this new era. The rise of software engineering led to new methodologies, new job titles, and new tools — what worked for developing, delivering and managing hardware didn’t work for software. The same is true for data science: what worked for software will not work for models.
Model Management is the set of processes and technologies a company needs to put models at the heart of its business. It’s required because models are different from software, so companies need new ways to develop, deliver and manage them. And by adopting Model Management, organizations can unlock the full potential of data science, becoming model-driven businesses.