How SHAP Can Keep You From Black Box AI

Machine learning interpretability and explainable AI are hot topics in the data world nowadays. A model that merely works is not automatically trustworthy, because machine learning models might decay over time. Science expects scepticism, and you can only trust transparent, explainable and provable models. Herein, SHAP helps us transform black box AI into transparent AI.

[Figure: John Nash’s Room (A Beautiful Mind)]

Webinar


๐Ÿ™‹โ€โ™‚๏ธ You may consider to enroll my top-rated machine learning course on Udemy

Decision Trees for Machine Learning

The following webinar might attract your attention: it covers machine learning interpretability and explainable AI concepts from a hands-on programming perspective.

Why is interpretability important?

You have to be skeptical even if your model shows high accuracy, because accuracy can be illusory. You might think you built a wolf classifier when you have actually built a snow classifier. This example is shown in the study “Why Should I Trust You?”: Explaining the Predictions of Any Classifier.

[Figure: Failure case from the wolf vs. husky experiment]

Naturally Explainable Algorithms

Consider the equation of linear regression: its coefficients give us an idea about the importance of each feature. Likewise, decision tree algorithms are explainable and offer feature importance out of the box. You should read the following posts to find out more about naturally interpretable algorithms; a minimal sketch of the idea follows them.

Feature importance for linear regression

Feature importance for decision trees
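
For instance, the following sketch reads those importances with scikit-learn; the diabetes data set and the model settings are placeholders of my own, not taken from the posts above.

from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# any tabular data set would do; diabetes is only a placeholder
X, y = load_diabetes(return_X_y=True)

# linear regression: coefficients act as signed feature importances
lr = LinearRegression().fit(X, y)
print(lr.coef_)

# decision tree: feature_importances_ comes from impurity decrease
dt = DecisionTreeRegressor(max_depth=3).fit(X, y)
print(dt.feature_importances_)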

This post, in contrast, is about algorithms which ARE NOT naturally explainable, such as deep learning.

Regulations

Some industries such as banking and finance are strongly regulated, and explaining decisions before a production deployment is a must. On the other hand, the most powerful algorithms are black box models, whereas highly explainable algorithms tend to show lower accuracy.





[Figure: Accuracy vs. interpretability trade-off]

Herein, some surrogate approaches fit a logistic regression on the inputs and the predictions of a black box model and deliberately let it overfit. In this way, the black box model is transformed into a transparent, explainable and provable one.
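
A minimal sketch of that surrogate idea is shown below. It assumes an already fitted black box classifier named black_box and its inputs X (both placeholder names), and uses scikit-learn's LogisticRegression as the transparent stand-in.

import numpy as np
from sklearn.linear_model import LogisticRegression

# labels produced by the black box model, not the ground truth
black_box_labels = np.argmax(black_box.predict(X), axis=1)

# fit a transparent surrogate on the black box predictions
# (weak regularization so it can follow the black box closely)
surrogate = LogisticRegression(C=1e6, max_iter=10000)
surrogate.fit(X, black_box_labels)

# the surrogate's coefficients explain the black box globally
print(surrogate.coef_)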

[Figure: When a model becomes explainable (PwC AI framework)]

SHAP is a framework for explaining the output of any machine learning model. It supports common deep learning frameworks (TensorFlow, Keras, PyTorch) as well as gradient boosting frameworks (LightGBM, XGBoost, CatBoost). Moreover, it can explain both tabular / structured data and unstructured data such as images.
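
For instance, a fitted gradient boosting model can be explained in a few lines. The snippet below is a hedged sketch: booster and X are placeholder names for a trained XGBoost / LightGBM model and its feature frame.

import shap

# TreeExplainer is SHAP's fast explainer for tree ensembles
explainer = shap.TreeExplainer(booster)
shap_values = explainer.shap_values(X)

# global view: mean absolute SHAP value per feature
shap.summary_plot(shap_values, X, plot_type="bar")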

Testing a CNN model

We will test SHAP on the FER 2013 data set. Previously, I modeled this problem and got 57% accuracy classifying 7 different emotions.
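
The code blocks below assume the following imports. This is my best guess at the minimal set: standalone Keras on a TensorFlow 1.x backend, because the explainer code later calls K.get_session(), which is not available in TF 2.x eager mode.

import numpy as np
import pandas as pd
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, AveragePooling2D
from keras.layers import Flatten, Dense, Dropout, Activation
from keras import backend as K
import shap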

class_names = {
    0: 'angry',
    1: 'disgust',
    2: 'fear',
    3: 'happy',
    4: 'sad',
    5: 'surprise',
    6: 'neutral'
}

num_classes = len(class_names)

Constructing the model

The same model will be constructed with the same pre-trained weights.

#construct CNN structure
model = Sequential()

#1st convolution layer
model.add(Conv2D(64, (5, 5), activation='relu', input_shape=(48,48,1)))
model.add(MaxPooling2D(pool_size=(5,5), strides=(2, 2)))

#2nd convolution layer
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(AveragePooling2D(pool_size=(3,3), strides=(2, 2)))

#3rd convolution layer
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(AveragePooling2D(pool_size=(3,3), strides=(2, 2)))

model.add(Flatten())

#fully connected neural networks
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.2))

model.add(Dense(num_classes, activation='softmax'))

# https://github.com/serengil/tensorflow-101/blob/master/model/facial_expression_model_weights.h5
model.load_weights('facial_expression_model_weights.h5')

Data preparation

Each row of the data set stores the emotion label, the image pixels and the train / test usage flag as comma-separated values. The image pixels themselves are separated by blanks.

with open("fer2013.csv") as f:
    content = f.readlines()

lines = np.array(content)
num_of_instances = lines.size

x_train, y_train = [], []

for i in range(1,num_of_instances):
    try:
        emotion, img, usage = lines[i].split(",")
        val = img.split(" ")
        pixels = np.array(val, 'float32')
        emotion = keras.utils.to_categorical(emotion, num_classes)

        if 'Training' in usage:
            y_train.append(emotion)
            x_train.append(pixels)
    except Exception as err:
        print(str(err))

Inputs should be normalized to [0, 1] and reshaped into 48x48x1 tensors.

x_train = np.array(x_train, 'float32')
y_train = np.array(y_train, 'float32')

x_train /= 255 #normalize inputs between [0, 1]
x_train = x_train.reshape(x_train.shape[0], 48, 48, 1)
x_train = x_train.astype('float32')

Explainer

Gradient explainer is a convenient way to explain image models. It expects input images with 4 dimensions. That's why we will add a dummy dimension: each (48, 48, 1) shaped image will be stored as (1, 48, 48, 1).

x_train = np.expand_dims(x_train, axis = 1)

The following generic explain function explains a specific image in the data set with respect to a specific layer of the model.

def map2layer(x, layer):
    feed_dict = dict(zip([model.layers[0].input], x.copy()))
    return K.get_session().run(model.layers[layer].input, feed_dict)

def explain(x_train, sample, layer):
    to_explain = x_train[[sample]]

    e = shap.GradientExplainer(
        (model.layers[layer].input, model.layers[-1].output),
        map2layer(x_train, layer),
        local_smoothing=0 # std dev of smoothing noise
    )

    shap_values,indexes = e.shap_values(map2layer(to_explain, layer), ranked_outputs=1)
    index_names = np.vectorize(lambda x: class_names[x])(indexes)
    shap.image_plot(shap_values, to_explain[0], index_names)

Let’s focus on the 39th image for layers 0 to 4.





sample = 39
for layer in range(0,5):
    print("layer ",layer,": ",model.layers[layer])
    explain(x_train, sample, layer)

SHAP explains this image fascinatingly well. Early layers focus on facial features whereas the following layers highlight broader areas of the face. Pixels pushing the prediction higher are shown in red, whereas pixels pushing it lower are shown in blue.

[Figure: Testing a custom image in the FER 2013 data set with SHAP]

Tabular Data

Regulated use cases mostly involve tabular data sets, and SHAP can explain models built on tabular data as well. Herein, I will test it on the iris data set. We mostly apply gradient boosting to tabular data, but explaining neural nets is more difficult. That's why I'll build a multi-layer perceptron to classify iris flowers.

The task is to classify the species of an iris flower based on its sepal and petal sizes.

class_names = ['setosa', 'versicolor', 'virginica']
feature_names = ['sepal length', 'sepal width', 'petal length', 'petal width']
num_classes = len(class_names) #labels are loaded below, so derive the class count from class_names

The data set can be found here as features and labels.

attributes = pd.read_csv("iris-attr.data", delimiter=",", names = feature_names)
label = np.genfromtxt("iris-labels.data", dtype="int64")

Labels are stored in numerical format. We should transform them into one-hot encoded vectors because this is a classification task.

label = keras.utils.to_categorical(label, num_classes)

The model will be very simple: it consists of a single hidden layer, and the input layer and the hidden layer have the same size.

model = Sequential()
model.add(Dense(4 #num of hidden units
    , input_shape=(attributes.shape[1],))) #num of features in input layer
model.add(Activation('sigmoid')) #activation function from input layer to 1st hidden layer
model.add(Dense(num_classes)) #num of classes in output layer
model.add(Activation('sigmoid')) #activation function from 1st hidden layer to output layer

The loss function should be cross entropy because this is a classification task. Besides, Adam will converge faster than vanilla gradient descent.

model.compile(loss='categorical_crossentropy', optimizer='adam')
model.fit(attributes, label, epochs=1000, verbose=0)

Predictions can be made for the training set. I got 97.33% accuracy in my tests.

predictions = model.predict(attributes)
classified = 0
index = 0
for i in predictions:
    pred = np.argmax(i)
    actual = np.argmax(label[index])
    if pred == actual:
        classified = classified + 1
    index = index + 1
print("Accuracy: ",100*classified/len(predictions),"%")

Explainer

We will use the kernel explainer for tabular data sets.

explainer = shap.KernelExplainer(model.predict, attributes, link="logit")
shap_values = explainer.shap_values(attributes, nsamples=100)
shap.initjs()

Let’s focus on the 115th instance directly.





sample = 114
print("Features:")
print(attributes.iloc[sample])
print("Actual: ",class_names[np.argmax(label[114])])
print("Prediction: ",class_names[np.argmax(predictions[sample])])

This instance is classified as virginica. Let’s find the reason.

prediction_class = np.argmax(predictions[sample])
print("Prediction: ",class_names[prediction_class])
shap.force_plot(explainer.expected_value[prediction_class], shap_values[prediction_class][sample,:], attributes.iloc[sample])

The following demonstration shows feature importances for this instance.

[Figure: Finding the reason for the prediction]

We can also produce these explanations for many instances at once.

shap.force_plot(explainer.expected_value[prediction_class], shap_values[prediction_class], attributes, link="logit")

This shows which features push the predictions higher and which push them lower.

[Figure: Explanations for many instances]

Multi-dimensional graphs might be confusing. That's why we can filter individual features to get a clearer impression.

[Figure: Filtering the petal length feature]
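
The filtering above is done interactively in the force plot widget. A programmatic alternative (my own addition, not part of the original walkthrough) is SHAP's dependence plot for a single feature.

# relation between petal length values and their SHAP values for the predicted class
shap.dependence_plot("petal length", shap_values[prediction_class], attributes)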

SHAP finds feature importance values as well.

#shap.summary_plot(shap_values, attributes, plot_type="bar")
for i in range(0, len(class_names)):
    current_class = class_names[i]
    print("Feature importances for ",current_class)
    shap.summary_plot(shap_values[i], attributes, plot_type="bar")
    print("----------------------")

This is very similar to logistic regression results, isn't it?

[Figure: Feature importances]

You might want to export the feature importance values as a data frame for specific classes.

class_idx = 0 #setosa
shap_sum = np.abs(shap_values[class_idx]).mean(axis=0)
importance_df = pd.DataFrame([feature_names, shap_sum.tolist()]).T
importance_df.columns = ['feature', 'importance']
importance_df = importance_df.sort_values('importance', ascending=False)
importance_df.head()

In this case, we can export these importance values to any other external system.
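
For example, the following hedged one-liners (file names are placeholders) dump the data frame with plain pandas.

importance_df.to_csv("shap_importance_setosa.csv", index=False)
importance_df.to_json("shap_importance_setosa.json", orient="records")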





[Figure: Feature importance for the setosa class]

So, SHAP can explain any model (deep learning or boosted trees) and any kind of data (tabular or unstructured). Giving up accuracy for the sake of interpretability is not acceptable. That's why opening up black boxes and explaining seemingly unexplainable models is so important.

I pushed the source code of this post to GitHub. Special thanks to Giray, who convinced me that deep learning is not a black box model.

