Transmogrification: The Magic of Feature Engineering Leah McGuire and Mayukh Bhaowal

1 Transmogrification: The Magic of Feature Engineering Leah McGuire and Mayukh Bhaowal

6 ML algorithms take center stage in AI Modeling Raw Data Feature Engineering Bottleneck

7 Mythical Numeric Matrix [Table: columns X, X2, X3, X4, X5 and Y; sample values A, B, B, A, A]

8 Use the data types

9 Automatic Feature Engineering
Numeric: Imputation; Track null value; Log transformation for large range; Scaling (z-normalize); Smart Binning
Categorical: Imputation; Track null value; One Hot Encoding; Dynamic Top K pivot; Smart Binning; LabelCount Encoding; Category Embedding
Text: Tokenization; Hash Encoding; Tf-Idf; Word2Vec; Sentiment Analysis; Language Detection
Temporal: Time difference; Circular Statistics; Time extraction (day, week, month, year); Closeness to major events
Spatial: Augment with external data, e.g. average income; Spatial fraudulent behavior, e.g. impossible travel speed; Geo-encoding
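A few of the Numeric and Categorical techniques named on this slide can be sketched in plain Scala. This is only an illustration under stated assumptions, not the library's implementation: the object name `TypeDrivenFeatures`, the functions `numeric` and `topKPivot`, and the `OTHER`/`NULL` column labels are hypothetical, and mean imputation is assumed as the imputation strategy.

```scala
// Hypothetical sketch: type-driven feature engineering without any ML library.
object TypeDrivenFeatures {

  // Numeric: mean imputation, null tracking, z-normalization.
  // Returns one (scaledValue, nullIndicator) pair per input row.
  def numeric(values: Seq[Option[Double]]): Seq[(Double, Double)] = {
    val present = values.flatten
    val mean = if (present.isEmpty) 0.0 else present.sum / present.size
    val variance =
      if (present.isEmpty) 0.0
      else present.map(v => (v - mean) * (v - mean)).sum / present.size
    val std = if (variance > 0.0) math.sqrt(variance) else 1.0
    values.map {
      case Some(v) => ((v - mean) / std, 0.0)
      case None    => (0.0, 1.0) // the imputed mean scales to 0; null flag is set
    }
  }

  // Categorical: one-hot encode the k most frequent values and add
  // an "other" column and a "null" column (assumed labels).
  def topKPivot(values: Seq[Option[String]], k: Int): (Seq[String], Seq[Seq[Double]]) = {
    val topK: Seq[String] = values.flatten
      .groupBy(identity)
      .toSeq
      .map { case (value, occurrences) => (value, occurrences.size) }
      .sortBy { case (value, count) => (-count, value) } // most frequent first
      .take(k)
      .map { case (value, _) => value }
    val columns = topK ++ Seq("OTHER", "NULL")
    val rows = values.map { v =>
      val hot = v match {
        case Some(s) if topK.contains(s) => topK.indexOf(s)
        case Some(_)                     => topK.size     // "other" column
        case None                        => topK.size + 1 // "null" column
      }
      columns.indices.map(i => if (i == hot) 1.0 else 0.0)
    }
    (columns, rows)
  }
}
```

For example, `TypeDrivenFeatures.topKPivot(Seq(Some("a"), Some("b"), None), 1)` keeps a dedicated column for `"a"` and routes `"b"` and the missing value to the two extra columns, so nulls stay visible to the model instead of being silently imputed.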

10 Transmogrification val featurevector = Seq(age, phone, , subject, zipcode).transmogrify()

11 Impact on Feature Engineering Is Spammy Top Domain Phone Country Code Age Phone Is Valid Age [-5] Vector Subject Age [5-35] Age [>35] Zipcode Top TF-IDF Terms Average Income

14 The Black Swan of Perfectly Interpretable Models Leah McGuire, Mayukh Bhaowal

16 Roadmap for this talk What does it mean to explain your model? Why explain your model? How to explain your model? Interpretability vs accuracy tradeoff Complications of feature engineering Global (full model) solutions Local (record level) solutions

18 The Question Why did the machine learning model make the decision that it did?

19 Translation #1 How do I fix this model? Data Scientist

20 Translation #2 Do we have our bases covered, in case of a regulatory audit? Legal Counsel

21 Translation #3 Does Einstein know what I know? How do I use this prediction? Non Technical End User

22 [Diagram: Input -> Pk(c|f) ... Pn(c|f) -> Σ -> Output P(c|f)]

23 Model Insights Report

25 Debuggability Top contributing features for surviving the Titanic: 1. Gender 2. pclass 3. Body

26 Trust How can you trust a man that wears both a belt and suspenders? Man can't even trust his own pants.

27 Right Human Machine Wrong

28 Bias

29 Legal

31 Black defendant has higher risk scores

32 Actionable

34 It's complicated

35 [Flowchart] Questions: Can you use a simple model? Are the raw features fed into the model interpretable? Does the consumer care about how features affect the model or just feature insights? Does the consumer care about individual predictions? Outcomes: Feature Weights / Importance (Global or Local); Feature Impact, Model Agnostic (Global or Local); Secondary Model (Global or Local).

37 The best model or the model you can explain?

39 Where did you get the feature matrix? [Table: columns X, X2, X3, X4, X5 and Y; sample values A, B, B, A, A]

40 Feature Engineering Is Spammy Top Domain Phone Country Code Age Phone Is Valid Age [-5] Vector Subject Age [5-35] Age [>35] Zipcode Top TF-IDF Terms Average Income

41 Metadata!!! The name of the feature the column was made from The name of the RAW feature(s) the column was made from Everything you did to get the column Any grouping information across columns Description of the value in the column
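The metadata items listed on this slide map naturally onto a small record type attached to every column of the feature matrix. The sketch below is an assumption about how such a record could look, not the authors' actual schema: `ColumnMetadata`, its field names, and the helper `rollUpToRaw` are hypothetical. The helper shows one reason to keep the metadata: per-column contributions can be summed back to the raw features a consumer recognizes.

```scala
// Hypothetical sketch of per-column metadata for an engineered feature matrix.
object ColumnMetadataSketch {

  final case class ColumnMetadata(
      parentFeature: String,            // feature the column was made from
      rawFeatures: Seq[String],         // RAW feature(s) the column was made from
      stagesApplied: Seq[String],       // everything done to get the column
      grouping: Option[String],         // grouping information across columns
      valueDescription: Option[String]  // description of the value in the column
  )

  // Sum column-level contributions up to raw-feature level.
  // Assumption: a column built from several raw features credits each of them in full.
  def rollUpToRaw(contributions: Seq[(ColumnMetadata, Double)]): Map[String, Double] =
    contributions
      .flatMap { case (meta, c) => meta.rawFeatures.map(raw => (raw, c)) }
      .groupBy { case (raw, _) => raw }
      .map { case (raw, pairs) => (raw, pairs.map(_._2).sum) }
}
```

With this in place, two pivoted columns derived from the same raw feature report a single combined number rather than two separate entries.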

43 Interpretability: Global vs Local

44 [Flowchart] Questions: Can you use a simple model? Are the raw features fed into the model interpretable? Does the consumer care about how features affect the model or just feature insights? Does the consumer care about individual predictions? Outcomes: Feature Weights / Importance (Global); Feature Impact, Model Agnostic (Global); Secondary Model (Global).

45 Feature Weight / Importance (Global)

46 Predict House Price

47 Predict Titanic Passenger Survival

48 [Diagram: Input -> Pk(c|f) ... Pn(c|f) -> Σ -> Output P(c|f)]

49 Feature Impact (Global - the hard way) [Table: columns X, X2, X3, X4, X5 and Y; sample values A, B, B, A, A]

50 Feature Impact (Global - the hard way)

51 Issues with Feature Importance / Weight / Impact (Global) http://resources.esri.

52 Secondary Model Prediction Input Explanation

53 Secondary Model (Global)

54 Secondary Model (Global)

55 What we do: All the metadata about how you got the feature Correlation Mutual information Feature weight / importance Feature distribution
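Two of the statistics named on this slide, correlation and mutual information, can be computed per column against the label. The following is a minimal sketch, assuming Pearson correlation and mutual information in bits over discrete values; the object `ColumnStats` and its function names are hypothetical and say nothing about how the authors compute these values.

```scala
// Hypothetical sketch: per-column statistics against a label.
object ColumnStats {

  // Pearson correlation; NaN when either input is constant.
  def pearson(x: Seq[Double], y: Seq[Double]): Double = {
    require(x.size == y.size && x.nonEmpty, "inputs must be non-empty and equally long")
    val mx = x.sum / x.size
    val my = y.sum / y.size
    val cov = x.zip(y).map { case (a, b) => (a - mx) * (b - my) }.sum
    val sx = math.sqrt(x.map(a => (a - mx) * (a - mx)).sum)
    val sy = math.sqrt(y.map(b => (b - my) * (b - my)).sum)
    if (sx == 0.0 || sy == 0.0) Double.NaN else cov / (sx * sy)
  }

  // Mutual information (in bits) between two discrete sequences,
  // estimated from empirical frequencies.
  def mutualInformation[A, B](x: Seq[A], y: Seq[B]): Double = {
    require(x.size == y.size && x.nonEmpty, "inputs must be non-empty and equally long")
    val n = x.size.toDouble
    val px = x.groupBy(identity).map { case (k, v) => (k, v.size / n) }
    val py = y.groupBy(identity).map { case (k, v) => (k, v.size / n) }
    val joint = x.zip(y).groupBy(identity).map { case (k, v) => (k, v.size / n) }
    joint.toSeq.map { case ((a, b), p) =>
      p * (math.log(p / (px(a) * py(b))) / math.log(2.0))
    }.sum
  }
}
```

Correlation only captures linear relationships, while mutual information also picks up non-linear dependence, which is one plausible reason to report both next to the model's own feature weights.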

56 What we do:
{
  "featurename" : "sex",
  "derivedfeatures" : [
    { "stagesapplied" : [ "pivottext_opsetvectorizer" ], "derivedfeaturevalue" : "Male", "corr" : , "mutualinformation" : , "contribution" : ,. },
    { "stagesapplied" : [ "pivottext_opsetvectorizer" ], "derivedfeaturevalue" : "Female", "corr" : , "mutualinformation" : , "contribution" : ,. }
  ]
}

58 [Flowchart] Questions: Can you use a simple model? Are the raw features fed into the model interpretable? Does the consumer care about how features affect the model or just feature insights? Does the consumer care about individual predictions? Outcomes: Feature Weights / Importance (Local); Feature Impact, Model Agnostic (Local); Secondary Model (Local).

59 Feature Weight (Local)

60 Predict House Price

61 Feature Weight (Local)

62 Feature Impact (LOCO)
{"age":7., "embarked":"c", "name":"attalah, Miss. Malake", "pclass":"3", "parch":"", "sex":"female", "sibsp":"", "survived":., "ticket":"2627"}
Score =.62
Why? sex = "female" (+.3), pclass = 3 (-.5),...
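The record-level explanation on this slide can be approximated by rescoring the record once per feature with that feature replaced by a baseline value, then reporting the change in score. This is a rough sketch of the leave-one-covariate-out idea under assumptions: the object `Loco`, the `Record` alias, the `baseline` argument, and the use of 0.0 for features missing from the baseline are hypothetical choices, and `score` stands in for any fitted model.

```scala
// Hypothetical sketch of leave-one-covariate-out (LOCO) record insights.
object Loco {
  type Record = Map[String, Double]

  // For each feature, replace its value with a baseline, rescore,
  // and report (full score - score without the feature).
  // Results are ordered by absolute impact, largest first.
  def explain(score: Record => Double, record: Record, baseline: Record): Seq[(String, Double)] = {
    val full = score(record)
    record.keys.toSeq
      .map { name =>
        val without = record.updated(name, baseline.getOrElse(name, 0.0))
        (name, full - score(without))
      }
      .sortWith((a, b) => math.abs(a._2) > math.abs(b._2))
  }
}
```

A positive value means the feature pushed the score up for this record and a negative value means it pulled the score down, which matches the signed `(+…)` / `(-…)` style of the "Why?" line on the slide.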

63 Secondary Model (LIME)

64 Secondary Model (Correlation) Norm (feature) * Corr
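One reading of "Norm (feature) * Corr" is: normalize each feature value for the record, then multiply it by the correlation between that feature column and the model's predictions. The sketch below follows that reading and should be treated as an assumption; the object `CorrelationInsights`, the choice of z-normalization, and the decision to return 0.0 for constant columns are all hypothetical.

```scala
// Hypothetical sketch: record insights as normalized feature value times
// the feature's correlation with the model's predictions.
object CorrelationInsights {

  private def mean(xs: Seq[Double]): Double = xs.sum / xs.size

  // Population standard deviation.
  private def std(xs: Seq[Double]): Double = {
    val m = mean(xs)
    math.sqrt(xs.map(x => (x - m) * (x - m)).sum / xs.size)
  }

  // Pearson correlation; 0.0 when either input is constant (assumed convention).
  private def corr(x: Seq[Double], y: Seq[Double]): Double = {
    val (mx, my) = (mean(x), mean(y))
    val cov = x.zip(y).map { case (a, b) => (a - mx) * (b - my) }.sum / x.size
    val denom = std(x) * std(y)
    if (denom == 0.0) 0.0 else cov / denom
  }

  // columns: feature name -> values for all rows; predictions: model scores
  // for the same rows; row: index of the record to explain.
  def explain(columns: Map[String, Seq[Double]], predictions: Seq[Double], row: Int): Map[String, Double] =
    columns.map { case (name, values) =>
      val s = std(values)
      val normalized = if (s == 0.0) 0.0 else (values(row) - mean(values)) / s
      (name, normalized * corr(values, predictions))
    }
}
```

Unlike LOCO, this needs no rescoring per feature, so it is cheaper per record, at the cost of only reflecting linear association with the prediction.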

65 What we do: Use case determines LOCO or correlation Use case determines what level of features we show
