marketing mix notes

Marketing Mix

Product : It includes all product items marketed by the marketer, along with their features, quality, brand, packaging, labelling, product life cycle, and all decisions related to the product

Product assortment : the full set of products offered to customers by the entire industry

 

A product line is a group of similar-featured items marketed by a marketer

The total number of lines is referred to as the breadth (width) of the product mix

 

Product depth (item depth) refers to the number of versions offered of each product in the line

 

Distribution channel – is very important to Netflix

 

 

Price : brings in revenue; the act of determining the value of a product

Includes pricing objectives, price-setting strategies, general pricing policies, discounts, allowances, rebates, etc. The price mix also includes cash and credit policy, price discrimination, and cost and contribution

 

Place : location, distance, transport

 

Direct marketing : no intermediary is involved

 

 

Promotion: defined as the combination of all activities concerned with informing and persuading actual and potential customers about the merits of a product, with the intention of achieving sales goals

 

Sales promotion involves offering short-term incentives to promote buying and increase sales

 

The most popular forms of sales promotion are free gifts, discounts, exchange offers, free home delivery, after-sales service, guarantees, warranties, various purchase schemes, etc.

 

Public relations : maintaining favourable relations between the organization and the public

 

Modifications and extensions to the 4 P's

Product, price, place and promotion (marketer's approach)

 

Consumer-oriented approach (4 C's)

Commodity - Product

Cost - Price

Channel - Place

Communication - Promotion

 

Services are fundamentally different from products, so the mix is extended with additional P's:

Process : procedures / mechanisms for delivering services and monitoring

People : human factor as they interact with the consumer using the services

Physical Evidence : the tangible environment and cues through which the service is delivered

 

 

Extension of 4 C's

Consumer solution

Cost

Convenience

Communication

 

Elements of the marketing mix are mutually dependent

Marketing mix elements are meant for attaining the target markets

The essence of the marketing mix is ensuring profitability through customer satisfaction

 

Elements help the marketer in attaining marketing objectives

 

Customer is the central focus of marketing mix

 

Purpose and objectives of marketing mix

Marketing mix aims at customer satisfaction

Success of each and every product

Aims at assisting the marketers in creating effective marketing strategy

Profit maximization, image building, creation of goodwill, maintaining better customer relations


Marketing mix is the link between business and customers

Marketing mix helps to increase sales and profit

 

For Netflix: a reduction in price could be attributed to diminishing returns from advertising

 

Market Mix Modelling

Marketing Mix Modelling (MMM) is a method that helps quantify the impact of several marketing inputs on sales or market share. The purpose of MMM is to understand how much each marketing input contributes to sales, and how much to spend on each marketing input.

MMM relies on statistical analysis such as multivariate regressions on sales and marketing time series data to estimate the impact of various marketing tactics (marketing mix) on sales and then forecast the impact of future sets of tactics. It is often used to optimize the advertising mix and promotional tactics with respect to sales and profits.

Marketing Mix Modeling (MMM) is one of the most popular analyses within marketing analytics. It helps organisations estimate the effect of spend on different advertising channels (TV, radio, print, online ads, etc.) as well as other factors (price, competition, weather, inflation, unemployment) on sales. In simple words, it helps companies optimize the marketing investments they make across different mediums (both online and offline).
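The statistical core described above can be sketched in a few lines of Python. Everything below is illustrative: the channel spends are simulated and the "true" effects are invented so the regression has something to recover.

```python
import numpy as np

rng = np.random.default_rng(0)
n_weeks = 104
spend = rng.uniform(0, 100, size=(n_weeks, 3))   # weekly TV, radio, online spend (simulated)
true_effect = np.array([2.0, 1.2, 3.5])          # invented incremental sales per unit spend
base_sales = 500.0                               # sales that would occur with zero marketing
sales = base_sales + spend @ true_effect + rng.normal(0, 25, n_weeks)

# Ordinary least squares: sales ~ intercept + spend per channel
X = np.column_stack([np.ones(n_weeks), spend])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
# coef[0] estimates baseline sales; coef[1:] estimate each channel's contribution
```

A real model would add control variables (price, seasonality, competition) and media transformations such as adstock.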

Uses of Marketing Mix Modeling
It answers the following questions which management generally wants to know.
  1. Which marketing medium (TV, radio, print, online ads) yields the maximum return on investment (ROI)?
  2. How much should be spent on marketing activities to increase sales by a given percentage (e.g., 15%)?
  3. Predicting future sales from the investment spent on marketing activities
  4. Identifying key drivers of sales (including marketing mediums, price, competition, weather and macro-economic factors)
  5. How to optimize marketing spend?
  6. Is an online marketing medium better than an offline one?
Types of Marketing Mediums
Let's break it into two parts - offline and online.

Offline Marketing:
  • Print media: newspapers, magazines
  • TV
  • Radio
  • Out-of-home (OOH) advertising like billboards, ads in public places
  • Direct mail like catalogs, letters
  • Telemarketing
  • Below-the-line promotions like free product samples or vouchers
  • Sponsorship

Online Marketing:
  • Search engine marketing, including content marketing, backlink building, etc.
  • Pay per click, pay per impression
  • Email marketing
  • Social media marketing (Facebook, YouTube, Instagram, LinkedIn ads)
  • Affiliate marketing



Marketing spend as a percent of companies' revenues, by industry

Marketing Mix Modeling

MMM has had a place in marketers’ analytics toolkit for decades. This is due to the unique insights marketing mix models can provide. By leveraging regression analysis, MMM provides a “top down” view into the marketing landscape and the high-level insights that indicate where media is driving the most impact.

For example: by gathering long-term, aggregate data over several months, marketers can identify the mediums consumers engage with the most. MMM provides a report of where and when media is engaged over a long stretch of time.

Background: Marketing Mix Modeling (MMM)

The beginning of the offline measurement

Marketing Mix Modelling is a decades-old process, developed in the earliest days of modern marketing, that applies regression analysis to historical sales data to analyse the effects of changing marketing activities. Many marketers still use MMM for top-level media planning and budgeting; it delivers a broad view into variables both inside and outside of the marketer's control.

Some of the factors are:

  1. Price
  2. Promotions
  3. Competitor Activity
  4. Media Activity
  5. Economic Conditions

The analytical and statistical methods used to quantify the effect of media and marketing efforts on a product's performance are called Marketing Mix Modeling


"It helps to maximize investment and grow ROI"

ROI = (Incremental returns from investment) / Cost of Investment

Marketing ROI = (Incremental Dollar Sales from Marketing Investment) / Spend on Marketing Investment
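As a sketch, the second formula maps directly to a small helper (the function name and the example figures are my own):

```python
def marketing_roi(incremental_sales: float, marketing_spend: float) -> float:
    """Marketing ROI = incremental dollar sales from marketing / marketing spend."""
    if marketing_spend <= 0:
        raise ValueError("marketing_spend must be positive")
    return incremental_sales / marketing_spend

# e.g. $150,000 of incremental sales from a $100,000 campaign gives an ROI of 1.5
```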

Why is MMM Needed? Guiding Decisions for Improved Effectiveness

  1. How do I change the mix to increase sales with my existing budget?
  2. Where am I over-spending or under-spending?
  3. Which marketing channels are effective but lack the efficiency for positive ROI?
  4. To what degree do non-marketing factors influence sales?

How does MMM work?

  1. Correlate marketing to sales
  2. Factor in lag time
  3. Test interaction effects
  4. Attribute sales by input
  5. Model to most predictive
  6. Maximize significance - to empower decisions

Example Marketing Mix Model Output

Detailed output includes:

  1. Weekly sales lift
  2. More marketing channels
  3. Contribution by tactic
  4. Contribution by campaign
  5. Non-Marketing impact


Market Contribution vs. Base


ROI Assessment:

We measure ROI not because all ads will convert to sales, but to identify which ones are cost-effective and deliver the most bang for the buck

MMM Strengths:

  • Complete set of marketing tactics
  • Impact of non-marketing factors
  • High Statistical Reliability
  • Guides change in the marketing mix
  • Guides change in spend
  • Optimizes budget allocation

MMM Limitations:

  • More Tactical than Strategic
  • Short-Term impact only
  • Dependent on variance over time
  • Average Effectiveness
  • No diagnostics to improve
  • Hard to refresh frequently

Critical Success Factors of MMM:

  • Use a Strategic approach (not tactical)
  • Disclose gaps and limitations
  • Add Diagnostic measures
  • Integrate into robust measurement plan
  • Make marketing more measurable
  • Create ROI simulation tools

Media Mix Modeling as Econometric Modeling:

Strengths:

  1. It reduces biases
  2. It accurately isolates the impact of media on sales from the impact of all other factors that influence sales.

Weaknesses:

  1. If two types of media are highly correlated in the historical record, the ability to isolate and separate each media type's impact on sales is reduced.

Working with Market Mix Modeling requires a good understanding of econometric modelling

The objective before starting this approach is to ask how we can maximize the value and minimize the harm of marketing mix models, such as store-based models or shopper-based multi-touch attribution models.

Marketing end users are at the root of most marketing mix model problems.

Tip: Most attribution projects begin long after the strategy has already been set. So it's important to understand what the client did, why they did it, and what they expected to happen. Only then can you answer their questions in a way they'll be happy with. Remember they hired you because the results weren't what they expected... or because they never thought about how to measure them in the first place.

As we all know weekly variation is the lifeblood of marketing mix models.


One such problem is continuity bias

Very interesting article on using Market Mix Modelling during COVID-19.

Market Mix Modeling (MMM) in times of Covid-19 | by Ridhima Kumar | Aryma Labs | Medium

In the model, I read that there will be a sudden demand for essential items during the pandemic, but this deviance cannot be attributed to existing advertisement factors.

In the regression model we can see that there will be:

  • Heteroscedasticity: The sales trend could show significant changes from the beginning to the end of the series, so the model could exhibit heteroscedasticity. Common causes are the presence of outliers in the data or a large range between the largest and smallest observed values.
  • Autocorrelation: The model could also show signs of autocorrelation due to a missing independent variable (the missing variable here being a Covid-19 variable).
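Autocorrelation of this kind can be checked with the Durbin-Watson statistic. A minimal sketch on simulated residuals, where the AR(1) series stands in for a model missing the Covid-19 variable:

```python
import numpy as np

def durbin_watson(residuals: np.ndarray) -> float:
    """Durbin-Watson statistic: ~2 means no first-order autocorrelation;
    values toward 0 suggest positive autocorrelation, such as the pattern
    an omitted variable leaves behind in the residuals."""
    diff = np.diff(residuals)
    return float(np.sum(diff ** 2) / np.sum(residuals ** 2))

rng = np.random.default_rng(1)
white_noise = rng.normal(size=500)   # residuals of a well-specified model
ar1 = np.zeros(500)                  # residuals with an omitted driver
for t in range(1, 500):
    ar1[t] = 0.8 * ar1[t - 1] + rng.normal()
```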

Another very interesting article on Marketing Analytics using Markov chain

Marketing Analytics through Markov Chain | LinkedIn

In the article, I read how we can use a transition matrix to understand changes in states. It explains this very neatly.
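The transition-matrix idea can be illustrated with made-up customer states and probabilities (all numbers below are invented):

```python
import numpy as np

# Hypothetical states: 0 = aware, 1 = considering, 2 = purchased (absorbing).
# Each row gives the probabilities of moving to each state next period.
P = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.0, 0.0, 1.0],
])

start = np.array([1.0, 0.0, 0.0])               # everyone starts merely aware
after_5 = start @ np.linalg.matrix_power(P, 5)  # state distribution 5 periods later
# after_5[2] is the share of customers who have purchased within 5 periods
```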

Article on Conjoint Analysis : Conjoint Analysis: What type of chocolates do the Indian customers prefer? | LinkedIn

Marketing Mix Modeling (MMM) is the use of statistical analysis to estimate the past impact and predict the future impact of various marketing tactics on sales. Your Marketing Mix Modeling project needs to have goals, just like your marketing campaigns.

The main goal of any Marketing Mix Modeling project is to measure past marketing performance so you can use it to improve future Marketing Return on Investment (MROI).

The insights you gain from your project can help you reallocate your marketing budget across your tactics, products, segments, time and markets for a better future return. All of the marketing tactics you use should be included in your project, assuming there is high-quality data with sufficient time, product, demographic, and/or market variability. Each project has four distinct phases, starting with data collection and ending with optimization of future strategies. Let’s take a look at each phase in depth:

Phase 1 : Data Collection and Integrity : It can be tempting to request as much data as possible, but it's important to note that every request has a very real cost to the client. In this case the task could be simplified down to just marketing spend by day, by channel, as well as sales revenue.

Phase 2 : Modeling: Before modelling we need to:

  • Identify Baseline and Incremental Sales

  • Identify Drivers of Sales

  • Identify Drivers of Growth

  • Sales Drivers by Week

  • Optimal Media Spend
  • Understanding Brand Context: Understanding the client's marketing strategy & its implementation is key to succeeding in the delivery of the MMM project.
    • The STP strategy (Segmentation, Targeting and Positioning) impacts the choice of the target audience and influences the interpretation of the model results.
    • The company context and 4 P's determine the key datasets that need to be collected and influence the key factors, e.g., the impact of seasonality and the distribution of channels.
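Once a linear model is fitted, the baseline/incremental split falls out of the coefficients. A minimal sketch with illustrative numbers (none of these are real estimates):

```python
import numpy as np

intercept = 500.0                           # illustrative fitted baseline
coefs = np.array([2.0, 1.2, 3.5])           # illustrative fitted effects: TV, radio, online
week_spend = np.array([40.0, 10.0, 25.0])   # one week's spend per channel

contributions = coefs * week_spend          # incremental sales attributed to each channel
predicted = intercept + contributions.sum() # total modeled sales for the week
baseline_share = intercept / predicted      # fraction of sales not driven by media
```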

Phase 3 : Model-Based Business Measures

Phase 4 : Optimization & Strategies

Pitfalls in Market Mix Modeling: 

1. Why MMM vendors being “personally objective” is not the same as their being “statistically unbiased”.
2. How to clear the distortions that come from viewing “today’s personalized continuity marketing” through “yesterday’s mass-market near-term focused lens”.
3. Why “statistically controlling” for a variable (seasonality, trend, etc.) does NOT mean removing its influence on marketing performance.


Some points about Marketing Mix Modeling:

Your Marketing Return on Investment (MROI) will be a key metric to look at during your Marketing Mix Modeling project, whether that be Marginal Marketing Return on Investment for future planning or Average Marketing Return on Investment for past interpretation. The best projects also gauge the quality of their marketing mix model, using Mean Absolute Percent Error (MAPE) and R^2.
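Both quality metrics are easy to compute directly; a minimal sketch:

```python
import numpy as np

def mape(actual: np.ndarray, predicted: np.ndarray) -> float:
    """Mean Absolute Percent Error, in percent (assumes no zero actuals)."""
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100)

def r_squared(actual: np.ndarray, predicted: np.ndarray) -> float:
    """Share of the variance in actual sales explained by the model."""
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1 - ss_res / ss_tot)
```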

1. Ad creative is very important to your sales top line and your MROI, especially if you can tailor it to a segmented audience. This paper presents five best Spanish language creative practices to drive MROI, which should also impact top-of-the-funnel marketing measures. 

 2. The long-term impact of marketing on sales is hard to nail down, but we have found that ads that don’t generate sales lift in the near-term usually don’t in the long-term either. You can also expect long-term Marketing Return on Investment to be about 1.5 to 2.5 times the near-term Marketing Return on Investment. 

3. Modeled sales may not be equivalent to total sales. Understand how marketing to targeted segments will be modeled.

4. Brand size matters. As most brand managers know firsthand, the economics of advertising favor large brands over small brands. The same brand TV expenditure and TV lift produce larger incremental margin dollars, and thus a larger Marketing Return on Investment, for the large brand than for the small brand.

5. One medium's Marketing Return on Investment does not dominate consistently. Since flighting, media weight, targeted audience, timing, copy and geographic execution vary by media for a brand, each medium's Marketing Return on Investment can also vary significantly.

Some more background into Marketing Mix Models:

Product : A product can be either a tangible product or an intangible service that meets a specific customer need or demand
Price : Price is the actual amount the customer is expected to pay for the product
Promotion : Promotion includes marketing communication strategies like advertising, offers, public relations etc.
Place : Place refers to where a company sells their product and how it delivers the product to the market.

Marketing Objectives:
For the different marketing types (TV, radio, print, outdoor, internet, search engine, mobile apps), we would like to:

1. Measure ROI by media type
2. Simulate alternative media plans

Research Objectives:

1. Measure ROI by media type
2. Simulate alternative media plans
3. Build a User-Friendly simulation tool
4. Build User-Friendly optimization tool

First Step: Building the Modeling Data Set
  1. Cross-Sectional Unit
      • Regions
      • Markets
      • Trade Spaces
      • Channels
      • Your brands
      • Competitor brands
  2. Unit of Time
      • Months
      • Weeks
  3. Length of History
      • At least 5 years of monthly data
      • At least 2 years of weekly data


    Define the Variables

    Sales

      • Dependent variable
      • Units (not currency)

    Media Variables: 

      • TV, Radio, Internet, Social, etc.
      • Measure as units of activity (e.g., GRPs, impressions)

    Control Variables

      • Macroeconomic factors
      • Seasonality
      • Price
      • Trade Promotions
      • Retail Promotions
      • Competitor Activity

    Pick Functional Form of Demand Equation

    Quantity Demanded = f(Price, Economic Conditions, Size of Market, Customer Preferences, Strength of Competition, Marketing Activity)

    Most Common Functional Forms

      • Linear
      • Log-Linear - strong assumptions
      • Double Log - even stronger assumptions (used by a large percentage of models)
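One reason the double-log form is popular: the slope of log-quantity on log-price is the price elasticity, constant along the whole curve. A simulated sketch (the elasticity value and data are invented):

```python
import numpy as np

rng = np.random.default_rng(2)
price = rng.uniform(1.0, 10.0, 200)
elasticity = -1.5                            # invented "true" price elasticity
log_q = 5.0 + elasticity * np.log(price) + rng.normal(0, 0.1, 200)

# Regress log(quantity) on log(price); the slope recovers the elasticity
X = np.column_stack([np.ones(200), np.log(price)])
coef, *_ = np.linalg.lstsq(X, log_q, rcond=None)
```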

    Modelling Issues

      • Omitted Variables (try to include as many variables as possible that are considered to have a big impact on demand)
      • Endogeneity Bias (instrumental-variable approach; if a variable influences both a predictor and the dependent variable, this creates bias that we need to account for)
      • Serial Correlation (almost all time-series data have serial correlation, which creates bias)
      • Counterintuitive Results (when the time series is short and we don't have enough data to look back, we can move to more granular cross-sectional variables)
      • Short Time Series

    Market-Mix Modeling Econometrics

      • Mixed Modeling: fixed effects, random effects
      • Parks Estimator
      • Bayesian Methods: Random effects
      • Adstock variables: can be split up into multiple variables for different types of advertisements like promotion, equity, etc.
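A common construction for an adstock variable is geometric decay, where part of each period's advertising effect carries over into the next. A minimal sketch (the decay rate is illustrative and would normally be estimated):

```python
import numpy as np

def adstock(spend: np.ndarray, decay: float) -> np.ndarray:
    """Geometric adstock: adstock[t] = spend[t] + decay * adstock[t-1]."""
    out = np.empty(len(spend))
    carry = 0.0
    for t, x in enumerate(spend):
        carry = x + decay * carry
        out[t] = carry
    return out

# A single burst of spend keeps working in later weeks:
# adstock([100, 0, 0, 0], decay=0.5) -> [100, 50, 25, 12.5]
```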

    Multiple Factors that Affect Outcome (Incremental Sales) :

    1. Campaign
    2. Pricing
    3. Other Campaigns
    4. Competitor Effects
    5. Seasonality
    6. Regulatory Factors

    Market Mix modelling is designed to pick up short-term effects; it is not able to model long-term effects such as the effect of the brand. Advertising helps build a brand, but this is difficult to model.

    Attribution Modeling is different from Media/Market Mix Modeling and offers additional insight. In this type of modelling, we measure the contribution of earlier touchpoints of the customer's digital journey to the final sale. Attribution modeling is a bottom-up approach, but it is becoming difficult to do because third-party cookies are being phased out.

    Multi-touch attribution modelling is more advanced than top-down Market Mix Modeling because there is an instant feedback loop to understand what is working. In Market Mix Modeling, by contrast, we would just determine what percentage change in an input drives sales and then adjust again in next year's model, without getting any real on-the-ground feedback on whether we reached the target we set out to achieve.


    Nielsen Marketing Mix Modeling is the largest Market Mix Modeling provider in the world.

    The Pros and Cons of Marketing Mix Modeling

    When it comes to initial marketing strategy or understanding external factors that can influence the success of a campaign, marketing mix modeling shines. Given that MMM leverages long-term data collection to provide its insights, marketers can measure the impact of holidays, seasonality, weather, brand authority, etc. on overall marketing success.

    As consumers engage with brands across a variety of print, digital, and broadcast channels, marketers need to understand how each touchpoint drives consumers toward conversion. Simply put, marketers need measurements at the person-level that can measure an individual consumer’s engagement across the entire customer journey in order to tailor marketing efforts accordingly.

    Unfortunately, marketing mix modeling can’t provide this level of insight. While MMM has a variety of pros and cons, the biggest pitfall of MMM is its inability to keep up with the trends, changes, and online and offline media optimization opportunities for marketing efforts in-campaign.

    Research on IT Certifications

    Top IT management certifications

    The most valuable certifications for 2021

    • Google Certified Professional Data Engineer
    • Google Certified Professional Cloud Architect
    • AWS Certified Solutions Architect Associate
    • Certified in Risk and Information Systems Control (CRISC)
    • Project Management Professional (PMP)

    Top agile certifications

    • PMI-ACP

    Top 15 data science certifications

    • Certified Analytics Professional (CAP)
    • Cloudera Certified Associate (CCA) Data Analyst
    • Cloudera Certified Professional (CCP) Data Engineer
    • Data Science Council of America (DASCA) Senior Data Scientist (SDS)
    • Data Science Council of America (DASCA) Principal Data Scientist (PDS)
    • Dell EMC Data Science Track (EMCDS)
    • Google Professional Data Engineer Certification
    • IBM Data Science Professional Certificate
    • Microsoft Certified: Azure AI Fundamentals
    • Microsoft Certified: Azure Data Scientist Associate
    • Open Certified Data Scientist (Open CDS)
    • SAS Certified AI & Machine Learning Professional
    • SAS Certified Big Data Professional
    • SAS Certified Data Scientist
    • Tensorflow Developer Certificate
    • Mining Massive Data Sets Graduate Certificate by Stanford

    Top 10 business analyst certifications

    • Certified Analytics Professional (CAP)
    • IIBA Entry Certificate in Business Analysis (ECBA)
    • IIBA Certification of Competency in Business Analysis (CCBA)
    • IIBA Certified Business Analysis Professional (CBAP)
    • IIBA Agile Analysis Certification (AAC)
    • IIBA Certification in Business Data Analytics (CBDA)
    • IQBBA Certified Foundation Level Business Analyst (CFLBA)
    • IREB Certified Professional for Requirements Engineering (CPRE)
    • PMI Professional in Business Analysis (PBA)
    • SimpliLearn Business Analyst Masters Program

    The top 11 data analytics and big data certifications

    • Associate Certified Analytics Professional (aCAP)
    • Certification of Professional Achievement in Data Sciences
    • Certified Analytics Professional
    • Cloudera Data Platform Generalist
    • EMC Proven Professional Data Scientist Associate (EMCDSA)
    • IBM Data Science Professional Certificate
    • Microsoft Certified Azure Data Scientist Associate
    • Microsoft Certified Data Analyst Associate
    • Open Certified Data Scientist
    • SAS Certified Advanced Analytics Professional Using SAS 9
    • SAS Certified Data Scientist

    Chartered Data ScientistTM

    This distinction is provided by the Association of Data Scientists (ADaSci). The designation is awarded to candidates who pass the CDS exam and hold a minimum of two years of work experience as a data scientist. Candidates without the experience can also take the exam and keep their results, but their charter is put on hold until they attain the two years of experience. There is no training or course required to earn this award. The exam costs 250 US dollars. The charter has lifetime validity and hence does not expire.

    Chartered Financial Data Scientist

    The Chartered Financial Data Scientist program is organized by the Society of Investment Professionals in Germany. They first provide a training course conducted by the Swiss Training Centre for Investment Professionals. After completing this training, the candidates are allowed to earn this designation. It costs around 8,690 Euro. 

    Certified Analytics Professional

    This professional certification is offered by INFORMS. It is supported by the Canadian Operational Research Society and 3 more professional societies. There are various levels of certification. Each level has different eligibility requirements, from graduate to postgraduate etc. To earn this certification, the cost starts from 495 US Dollar. To take this exam, the candidate needs to be available in-person in the designated test centres. It is valid for three years only.

    Cloudera Certified Associate Data Analyst

    This certification program is organized by Cloudera. It is more specific to SQL and databases and more suitable for data analysts. It costs around 295 US dollars and there is no specific eligibility requirement for this certification. This certification is valid for only two years.

    EMC Proven Professional Data Scientist Associate

    This certification program is organized by Dell EMC. To earn this distinction, it is mandatory to attend a training program, either in-class or online. It costs around 230 US Dollar. To take this exam, the candidate needs to be available in-person in the designated test centres.

    Open Certified Data Scientist

    It is organized by the Open Group. The members of the Open Group include HCL, Huawei, IBM, Oracle etc. There are 3 levels of this certification, each requiring a different level of experience. The cost for this certification starts from 295 US dollars. To take this exam, the candidate needs to be present in person at the specified place.

    Senior Data Scientist

    This certification program is provided by the Data Science Council of America (DASCA). It requires 6+ years of experience of Big Data Analytics / Big Data Engineering. It costs around 650 US Dollar. This certification has 5 years of validity. 

    Principal Data Scientist

    This certification program is provided by the Data Science Council of America (DASCA). It requires 10+ years of experience of Big Data Analytics / Big Data Engineering. There are various tracks of this exam. It costs between 850-950 US Dollar depending on the track.

    SAS Certified Data Scientist

    It is organized by SAS. To get this certification, you need to pass two other exams first: SAS Big Data Professional and SAS Advanced Analytics Professional. Along with this, you need to take 18 courses as well. It costs around 4,400 US dollars.

    Financial Data Professional 

    Financial Data Professional program is organized by Financial Data Professional Institute (FDPI). It is more suitable for financial professionals who apply AI and data science in finance. It opens the exam window with a fixed registration period. The cost of the FDP exam is 1350 US Dollar. To take this exam, the candidate needs to be available in-person in the designated test centres.

    So, here we have listed the top certification exams in data science across the world. To choose from the list, a candidate should analyze their future requirements, the suitability of the certification, the content covered in the exam (so that it meets the job requirements), exam cost, and exam date and time flexibility. The candidate should take one certification which meets all their expectations instead of taking multiple certification exams.


    Also, there are many more certifications provided by insurance bodies such as the IFoA and CAS; these are in development but require strong insurance domain knowledge

    If you are a member of Pega Academy - then Pega has their own Data Science Program








    Machine Learning - Basic Starting Notes

    Machine Learning Problem Framing - 

    Define an ML problem and propose a solution

    1. Articulate a problem
    2. See if any labeled data exists
    3. Design your data for the model
    4. Determine where the data comes from
    5. Determine easily obtained inputs
    6. Determine quantifiable inputs


    We have three major types of models:

    1. Supervised Learning
    2. Un-Supervised Learning
    3. Reinforcement Learning : There is no requirement for labeled data; the model acts as an agent which learns. It works on the foundation of a reward function, and a key challenge lies in defining a good reward function. RL models are also less stable and predictable than supervised approaches. Additionally, you need to provide a way for the agent to interact with its environment (for example, a game) to produce data, which means either building a physical agent that can interact with the real world or a virtual agent and a virtual world, either of which is a big challenge.
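The reward-function idea can be illustrated with a toy epsilon-greedy bandit choosing among ads. This is a deliberate simplification (full RL adds states and transitions), and the click-through rates below are invented:

```python
import numpy as np

rng = np.random.default_rng(3)
true_ctr = np.array([0.02, 0.05, 0.03])  # hidden click-through rate per ad (invented)
estimates = np.zeros(3)                  # the agent's running CTR estimates
pulls = np.zeros(3)                      # how often each ad was shown

for _ in range(20_000):
    if rng.random() < 0.1:               # explore: show a random ad
        arm = int(rng.integers(3))
    else:                                # exploit: show the best-looking ad
        arm = int(np.argmax(estimates))
    reward = float(rng.random() < true_ctr[arm])   # reward = 1 on a click
    pulls[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / pulls[arm]
# Over time the estimates approach true_ctr and the agent favours the best ad
```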


    Type of ML Problem | Description | Example
    Classification | Pick one of N labels | Cat, dog, horse, or bear
    Regression | Predict numerical values | Click-through rate
    Clustering | Group similar examples | Most relevant documents (unsupervised)
    Association rule learning | Infer likely association patterns in data | If you buy hamburger buns, you're likely to buy hamburgers (unsupervised)
    Structured output | Create complex output | Natural language parse trees, image recognition bounding boxes
    Ranking | Identify position on a scale or status | Search result ranking
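The classification row can be made concrete with a tiny supervised example (the points and labels are made up):

```python
import numpy as np

# Training data: 2-D feature points, each with one of two labels
train_x = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 0.8]])
train_y = np.array([0, 0, 1, 1])

def predict(point: np.ndarray) -> int:
    """1-nearest-neighbour classification: return the label of the
    closest training example ("pick one of N labels")."""
    dists = np.linalg.norm(train_x - point, axis=1)
    return int(train_y[np.argmin(dists)])
```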


    In traditional software engineering, you can reason from requirements to a workable design, but with machine learning, it will be necessary to experiment to find a workable model.

    Models will make mistakes that are difficult to debug, due to anything from skewed training data to unexpected interpretations of data during training. Furthermore, when machine-learned models are incorporated into products, the interactions can be complicated, making it difficult to predict and test all possible situations. These challenges require product teams to spend a lot of time figuring out what their machine learning systems are doing and how to improve them.


    Know the Problem Before Focusing on the Data

    If you understand the problem clearly, you should be able to list some potential solutions to test in order to generate the best model. Understand that you will likely have to try out a few solutions before you land on a good working model.

    Exploratory data analysis can help you understand your data, but you can't yet claim that patterns you find generalize until you check those patterns against previously unseen data. Failure to check could lead you in the wrong direction or reinforce stereotypes or bias.


    AI - 900 Azure AI fundamentals prep notes

    The Layers of AI

    • What is Artificial Intelligence (AI) ?
    • Machines that perform jobs that mimic human behavior.

    • What is Machine Learning (ML) ?
    • Machines that get better at a task without explicit programming. It is a subset of artificial intelligence that uses technologies (such as deep learning) that enable machines to use experience to improve at tasks. 

    • What is Deep Learning (DL) ?
    • Machines that have an artificial neural network inspired by the human brain to solve complex problems. It is a subset of machine learning that's based on artificial neural network.

    • What is a Data Scientist ?
    • A person with Multi-Disciplinary skills in math, statistics, predictive modeling and machine learning to make future predictions.

     

    Principle of AI

    Challenges and Risks with AI
    • Bias can affect results
    • Errors can cause harm
    • Data could be exposed
    • Solutions may not work for everyone
    • Users must trust a complex system
    • Who's liable for AI driven decision ?

    1.  Reliability and Safety : Ensure that AI systems operate as they were originally designed, respond safely to unanticipated conditions, and resist harmful manipulation. If an AI system is making mistakes, it is important to release a report quantifying the risks and harms to end users so they are informed of the shortcomings of the AI solution.

    • AI-based software application development must be subjected to rigorous testing and deployment management processes to ensure that they work as expected before release.
    • Good example: rigorous testing and deployment management while developing an AI system for a self-driving car.

    2.  Fairness : AI systems should treat all people fairly. Implement processes to ensure that decisions made by AI systems can be overridden by humans.

    • Harm of Allocation : AI Systems that are used to Allocate or Withhold:
      • Opportunities
      • Resources
      • Information
    • Harm of Quality-of-Service : AI systems can reinforce existing stereotypes.
      • An AI system does not work as well for one group of people as it does for another. An example is a voice recognition system that works well for men but not well for women.
    • Reduce bias in the model, since the data we train on comes from an unfair world.
      • Fairlearn is an open-source Python package that allows machine learning systems developers to assess their systems' fairness and mitigate the observed fairness issues.
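The quality-of-service harm above can be made measurable by computing a metric separately per group and comparing. This is the idea behind Fairlearn's disparity metrics; the sketch below is a minimal pure-Python illustration with made-up data, not the Fairlearn API:

```python
def group_accuracy(y_true, y_pred, groups):
    """Accuracy computed separately for each group value."""
    scores = {}
    for g in set(groups):
        idx = [i for i, grp in enumerate(groups) if grp == g]
        correct = sum(y_true[i] == y_pred[i] for i in idx)
        scores[g] = correct / len(idx)
    return scores

# Toy voice-recognition outcomes (hypothetical data)
y_true = [1, 1, 1, 1, 1, 1, 1, 1]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0]
groups = ["men"] * 4 + ["women"] * 4
acc = group_accuracy(y_true, y_pred, groups)
print(acc)  # the system works well for one group but not the other
```

A large gap between the per-group scores signals a quality-of-service harm that should be mitigated before deployment.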

    3.  Privacy and Security : Provide customers with information about, and controls over, the collection, use, and storage of their data.

    • Example: On device machine learning
    • AI security aspects: data origin and lineage; data use (internal vs external)
    • The Anomaly Detector API is a good example for the above use case.

    4.  Inclusiveness: AI systems should empower everyone and engage people, including minority groups, regardless of:

    • Physical Ability
    • Gender
    • Sexual orientation
    • Ethnicity
    • Other factors
    • Microsoft Statement- "We firmly believe everyone should benefit from intelligent technology, meaning it must incorporate and address a broad range of human needs and experiences. For the 1 billion people with disabilities around the world, AI technologies can be a game-changer."

    5. Transparency : AI systems should be understandable. Interpretability / intelligibility is when end users can understand the behavior of the AI. Adopting an open-source framework for AI can provide transparency (at least from a technical perspective) on the internal workings of an AI system.

    • AI systems should be understandable. Users should be made fully aware of the purpose of the system, how it works, and what limitations may be expected.
    • Example : Detailed documentation of code for debugging

    6. Accountability : People should be responsible for AI systems. This is the structure put in place to consistently enact AI principles and take them into account. AI systems should work within a clearly defined:

    • Framework of governance
    • Set of organizational principles
    • Set of ethical and legal standards
    • Designers and developers of AI-based solutions should work within a framework of governance and organizational principles that ensure the solution meets clearly defined ethical and legal standards, including regulations protecting people's civil liberties.
    • Ensure that AI systems are not the final authority on any decision that impacts people's lives, and that humans maintain meaningful control over otherwise highly autonomous AI systems.

    Dataset : A dataset is a logical grouping of units of data that are closely related and/or share the same data structure.

    Data labeling : the process of identifying raw data and adding one or more meaningful and informative labels to provide context so that a machine learning model can learn from it.

    Ground Truth : a properly labeled dataset that you use as the objective standard to train and assess a given model is often called 'ground truth'. The accuracy of your trained model will depend on the accuracy of the ground truth.


    Machine learning in Microsoft Azure

    Microsoft Azure provides the Azure Machine Learning service - a cloud-based platform for creating, managing, and publishing machine learning models. Azure Machine Learning provides the following features and capabilities:

    • Automated machine learning : enables non-experts to quickly create an effective machine learning model from data.
    • Azure Machine Learning designer : a graphical interface enabling no-code development of machine learning solutions.
    • Data and compute management : cloud-based data storage and compute resources that professional data scientists can use to run data experiment code at scale.
    • Pipelines : data scientists, software engineers, and IT operations professionals can define pipelines to orchestrate model training, deployment, and management tasks.

    Other Features of Azure Machine Learning Services :

    A service that simplifies running AI/ML-related workloads, allowing you to build flexible automated ML pipelines. Use Python or R, and run DL workloads such as TensorFlow.

    1. Jupyter Notebooks
    • Build and document your machine learning models as you create them, then share and collaborate.

    2. Azure Machine Learning SDK for Python

    • An SDK designed specifically to interact with Azure Machine Learning Services.

    3. MLOps

    • End-to-end automation of ML model pipelines, e.g. CI/CD, training, inference.

    4. Azure Machine Learning Designer

    • A drag-and-drop interface to visually build, test, and deploy machine learning models.

    5. Data Labeling Service

    • Assemble a team of humans to label your training data.

    6. Responsible Machine Learning

    • Assess model fairness through disparity metrics and mitigate unfairness.

    Performance/evaluation metrics are used to evaluate different machine learning algorithms.

    For different types of problems, different metrics matter.

    • Classification Metrics (accuracy, precision, recall, F1-Score, ROC, AUC)
    • Regression Metrics (MSE, RMSE, MAE)
    • Ranking Metrics (MRR, DCG, NDCG)
    • Statistical Models (Correlation)
    • Computer Vision Models (PSNR, SSIM, IoU)
    • NLP Metrics (Perplexity, BLEU, METEOR, ROUGE)
    • Deep Learning Related Metrics (Inception Score, Frechet Inception Distance)

    There are two categories of evaluation metrics:

    • Internal Metrics : metrics used to evaluate the internals of the ML Model
      • The Famous Four - Accuracy, Precision, Recall, F1-Score 
    • External Metrics : metrics used to evaluate the final prediction of the ML Model
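The "famous four" classification metrics can all be derived from the four confusion-matrix counts. A minimal sketch, with hypothetical counts:

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)   # fraction of all predictions that are correct
    precision = tp / (tp + fp)                   # of predicted positives, how many are real
    recall = tp / (tp + fn)                      # of real positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return accuracy, precision, recall, f1

# Hypothetical confusion matrix: 40 TP, 10 FP, 20 FN, 30 TN
acc, prec, rec, f1 = classification_metrics(40, 10, 20, 30)
print(round(acc, 2), round(prec, 2), round(rec, 3), round(f1, 3))  # 0.7 0.8 0.667 0.727
```

Note how precision and recall disagree here: which one matters depends on the relative cost of false positives vs. false negatives.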

    Finding the most important variables with a Random Forest model in R

    One of the benefits of using a Random Forest model is:

    1. In regression, when the variables may be highly correlated with each other, the Random Forest approach really helps in understanding feature importance. The trick is that Random Forest selects explanatory variables at each split in the learning process, i.e. it considers a random subset of the features instead of the full set. This is called feature bagging. The process reduces the correlation between trees: without it, strong predictors would be selected by many of the trees, making them correlated.
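The feature-bagging idea can be sketched in a few lines: at each split, only a random subset of the features is considered as candidates. This is an illustration with assumed names, not the randomForest internals (whose `mtry` default is roughly p/3 candidates for regression and sqrt(p) for classification):

```python
import math
import random

def candidate_features(feature_names, rng):
    """At each split, sample a random subset of features (feature bagging).
    sqrt(p) candidates is the common classification default."""
    m = max(1, int(math.sqrt(len(feature_names))))
    return rng.sample(feature_names, m)

features = [f"x{i}" for i in range(9)]
rng = random.Random(0)
# Two different splits see different candidate sets, de-correlating the trees
print(candidate_features(features, rng))
print(candidate_features(features, rng))
```

Because even a dominant predictor is absent from many candidate sets, different trees are forced to exploit different features, which is what makes their averaged predictions (and importance scores) more stable.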

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

    How to find the most important variables in R

    Find the variables that contribute most significantly to a response variable.

    Selecting the predictor variables that explain the major part of the variance of the response variable can be key to identifying and building high-performing models.

    1. Random Forest Method

    Random forest can be very effective for finding a set of predictors that best explains the variance in the response variable.

    library(caret)
    library(randomForest)
    library(varImp)

    regressor <- randomForest(Target ~ . , data = data, importance=TRUE) # fit the random forest with default parameters

    varImp(regressor) # get variable importance, based on mean decrease in accuracy

    varImp(regressor, conditional=TRUE) # conditional=TRUE adjusts for correlations between predictors

    varimpAUC(regressor) # more robust towards class imbalance


    2. xgboost Method

    library(caret)
    library(xgboost)

    regressor <- train(Target ~ . , data = data, method = "xgbTree", trControl = trainControl("cv", number = 10), scale = TRUE)

    varImp(regressor)


    3. Relative Importance Method

    Using calc.relimp {relaimpo}, the relative importance of variables fed into an lm model can be determined as a relative percentage.

    library(relaimpo)

    regressor <- lm(Target ~ . , data = data) # fit lm() model

    relImportance <- calc.relimp(regressor, type = "lmg", rela = TRUE) # calculate relative importance scaled to 100

    sort(relImportance$lmg, decreasing=TRUE) # relative importance


    4. MARS (earth package) Method

    The earth package implements variable importance based on generalized cross-validation (GCV), the number of subset models in which the variable occurs (nsubsets), and the residual sum of squares (RSS).

    library(earth)

    regressor <- earth(Target ~ . , data = data) # build model

    ev <- evimp(regressor) # estimate variable importance

    plot(ev)

    5. Step-wise Regression Method

    If you have a large number of predictors, split the data into chunks of 10 predictors, with each chunk holding the response variable.

    base.mod <- lm(Target ~ 1 , data = data) # base intercept-only model

    all.mod <- lm(Target ~ . , data = data) # full model with all predictors

    stepMod <- step(base.mod, scope = list(lower = base.mod, upper = all.mod), direction = "both", trace = 1, steps = 1000) # perform step-wise algorithm

    shortlistedVars <- names(unlist(stepMod[[1]])) # get the shortlisted variables

    shortlistedVars <- shortlistedVars[!shortlistedVars %in% "(Intercept)"] # remove intercept
    

    The output might include levels within categorical variables, since 'stepwise' is a linear-regression-based technique.

    If you have a large number of predictor variables, the above code may need to be placed in a loop that runs stepwise on sequential chunks of predictors. The shortlisted variables can be accumulated for further analysis at the end of each iteration. This can be a very effective method if you want to:

    • be highly selective about discarding valuable predictor variables;
    • build multiple models on the response variable.


    6. Boruta Method

    The ‘Boruta’ method can be used to decide if a variable is important or not.

    library(Boruta)

    # Decide if a variable is important or not using Boruta
    boruta_output <- Boruta(Target ~ . , data = data, doTrace=2) # perform Boruta search

    boruta_signif <- names(boruta_output$finalDecision[boruta_output$finalDecision %in% c("Confirmed", "Tentative")]) # collect Confirmed and Tentative variables

    # for faster calculation (classification only)
    library(rFerns)

    boruta.train <- Boruta(factor(Target) ~ . , data = data, doTrace = 2, getImp = getImpFerns, holdHistory = FALSE)
    boruta.train

    boruta_signif <- names(boruta.train$finalDecision[boruta.train$finalDecision %in% c("Confirmed", "Tentative")]) # collect Confirmed and Tentative variables
    boruta_signif

    ## these helpers expect the Boruta object itself, not the vector of names
    getSelectedAttributes(boruta.train, withTentative = FALSE)

    boruta.df <- attStats(boruta.train)

    print(boruta.df)
    

    7. Information value and Weight of evidence Method

    library(devtools)
    library(woe)
    library(riv)

    iv_df <- iv.mult(data, y="Target", summary=TRUE, verbose=TRUE)

    iv <- iv.mult(data, y="Target", summary=FALSE, verbose=TRUE)

    iv_df

    iv.plot.summary(iv_df) # plot information value summary

    # Calculate weight-of-evidence variables
    data_iv <- iv.replace.woe(data, iv, verbose=TRUE) # add woe variables to the original data frame
    

    The newly created woe variables can alternatively be used in place of the original factor variables.
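Weight of evidence and information value can be computed directly from per-category event counts. A minimal sketch (one common sign convention; hypothetical counts, not the woe package API):

```python
import math

def woe_iv(bins):
    """Weight of evidence per category and total information value.
    bins: list of (n_events, n_non_events) per category of a predictor."""
    total_e = sum(e for e, n in bins)
    total_n = sum(n for e, n in bins)
    woes, iv = [], 0.0
    for e, n in bins:
        pe, pn = e / total_e, n / total_n        # share of events / non-events in this bin
        woe = math.log(pn / pe)                  # WoE for the bin
        iv += (pn - pe) * woe                    # each bin's contribution to IV
        woes.append(woe)
    return woes, iv

# Hypothetical predictor with two categories: (events, non-events)
woes, iv = woe_iv([(10, 40), (30, 20)])
print([round(w, 3) for w in woes], round(iv, 3))  # [0.981, -0.811] 0.747
```

Predictors with higher total IV carry more separating power, which is how the iv.mult summary ranks variables.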


    8. Learning Vector Quantization (LVQ) Method

    library(caret)

    control <- trainControl(method="repeatedcv", number=10, repeats=3)

    # train the model
    regressor <- train(Target ~ . , data = data, method="lvq", preProcess="scale", trControl=control)

    # estimate variable importance
    importance <- varImp(regressor, scale=FALSE)
    

    9. Recursive Feature Elimination (RFE) Method

    library(caret)

    # define the control using a random forest selection function
    control <- rfeControl(functions=rfFuncs, method="cv", number=10)

    # run the RFE algorithm; n = number of columns, with Target in the last column
    # note: 1:(n-1) is needed, since 1:n-1 in R means (1:n)-1
    results <- rfe(data[, 1:(n-1)], data[, n], sizes=c(1:8), rfeControl=control)

    # summarize the results
    print(results)

    # list the chosen features
    predictors(results)

    # plot the results
    plot(results, type=c("g", "o"))
    

    10. DALEX Method

    library(randomForest)
    library(DALEX)

    regressor <- randomForest(Target ~ . , data = data, importance=TRUE) # fit the random forest with default parameters

    # Variable importance with DALEX
    explained_rf <- explain(regressor, data = data, y = data$Target)

    # Get the variable importances
    varimps <- variable_dropout(explained_rf, type='raw')

    print(varimps)

    plot(varimps)
    

    11. VITA

    library(vita)

    regressor <- randomForest(Target ~ . , data = data, importance=TRUE) # fit the random forest with default parameters

    pimp.varImp.reg <- PIMP(data, data$Target, regressor, S=10, parallel=TRUE)
    pimp.varImp.reg

    pimp.varImp.reg$VarImp
    sort(pimp.varImp.reg$VarImp, decreasing = TRUE)
    


    12. Genetic Algorithm

    library(caret)

    # Define control function
    ga_ctrl <- gafsControl(functions = rfGA, # another option is `caretGA`
                method = "cv",
                repeats = 3)

    # Genetic Algorithm feature selection; n = number of columns, Target last
    ga_obj <- gafs(x = data[, 1:(n-1)],
            y = data[, n],
            iters = 3, # normally much higher (100+)
            gafsControl = ga_ctrl)

    ga_obj

    # Optimal variables
    ga_obj$optVariables
    


    13. Simulated Annealing

    library(caret)

    # Define control function
    sa_ctrl <- safsControl(functions = rfSA,
                method = "repeatedcv",
                repeats = 3,
                improve = 5) # n iterations without improvement before a reset

    # Simulated Annealing feature selection
    set.seed(100)

    sa_obj <- safs(x = data[, 1:(n-1)],
            y = data[, n],
            safsControl = sa_ctrl)

    sa_obj

    # Optimal variables
    print(sa_obj$optVariables)
    
    
    

    14. Correlation Method

    library(caret)

    # calculate the correlation matrix of the predictors (n = number of columns, Target last)
    correlationMatrix <- cor(data[, 1:(n-1)])

    # summarize the correlation matrix
    print(correlationMatrix)

    # find attributes that are highly correlated (0.75 is a common cutoff; 0.5 is used here)
    highlyCorrelated <- findCorrelation(correlationMatrix, cutoff=0.5)

    # print indexes of highly correlated attributes
    print(highlyCorrelated)