There are 7 positive examples and 4 negative ones overall, so the
overall entropy is -(7/11)*log2(7/11) - (4/11)*log2(4/11) which is about
0.94566.
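As a check on the arithmetic, here is a minimal Python sketch of the
entropy calculation from class counts. The function name entropy and
its (pos, neg) signature are assumptions made for illustration, not
part of the original exercise.

```python
import math

def entropy(pos, neg):
    """Binary entropy in bits for a set with pos positive and neg negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count > 0:  # an empty class contributes nothing to the sum
            p = count / total
            result -= p * math.log2(p)
    return result

overall = entropy(7, 4)
print(round(overall, 5))  # 0.94566
```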
There are 3 values for the texture attribute: smooth,
wavy and rough.
For smooth, there are 4 positive examples and 1 negative, so the entropy
is -(4/5)*log2(4/5) - (1/5)*log2(1/5) which is about 0.7219.
For wavy there is one positive and one negative example, so the
entropy is -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1.
For rough there are two positive and two negative examples, so the
entropy is -(1/2)*log2(1/2) - (1/2)*log2(1/2) = 1.
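Expected information for a split on texture is thus:
0.7219*(5/11) + 1*(2/11) + 1*(4/11) which is about 0.87360.
Thus the expected information gain for a split on texture is about
0.94566 - 0.87360 = 0.07206.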
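The expected information for texture is the average of the three
subset entropies, each weighted by the fraction of the 11 examples
that falls into that subset. A minimal Python sketch under that
reading follows; the names entropy, texture, expected_info and gain
are assumptions made for illustration.

```python
import math

def entropy(pos, neg):
    """Binary entropy in bits; empty classes are skipped."""
    total = pos + neg
    return sum(-c / total * math.log2(c / total) for c in (pos, neg) if c > 0)

# (positive, negative) counts per texture value, as stated in the text
texture = {"smooth": (4, 1), "wavy": (1, 1), "rough": (2, 2)}
total = sum(p + n for p, n in texture.values())  # 11 examples in all

# Weight each subset's entropy by the share of examples it holds
expected_info = sum((p + n) / total * entropy(p, n) for p, n in texture.values())
gain = entropy(7, 4) - expected_info

print(round(expected_info, 4))  # 0.8736
print(round(gain, 4))           # 0.0721
```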
Attribute temperature:
There are 4 values for the temperature attribute: cold,
cool, warm and hot.
For cold, there are 1 positive and 3 negative examples, so the
entropy is -(1/4)*log2(1/4) - (3/4)*log2(3/4) which is about 0.811278.
For cool, there are 3 positive and no negative examples, so the
entropy is -(3/3)*log2(3/3) - (0/3)*log2(0/3) = 0, taking the term
0*log2(0) to be 0 by convention.
For warm, there is one positive and no negative example, so again
the entropy is 0.
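The zero entropies for cool and warm rest on the usual convention that
a term 0*log2(0) counts as 0, since log2(0) on its own is undefined.
The Python sketch below illustrates why the convention is reasonable;
the helper name term is an assumption made for illustration.

```python
import math

# p*log2(p) shrinks towards 0 as p approaches 0
for p in (0.1, 0.001, 0.00001):
    print(p, p * math.log2(p))

# log2(0) itself is undefined: Python raises ValueError
try:
    math.log2(0)
except ValueError:
    print("log2(0) is undefined")

def term(p):
    """One entropy term -p*log2(p), with 0*log2(0) taken as 0."""
    return 0.0 if p == 0 else -p * math.log2(p)

# Entropy for cool: 3 positive and 0 negative examples
cool = term(3 / 3) + term(0 / 3)
```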
For hot, there are 2 positive examples and 1 negative, so the
entropy is -(2/3)*log2(2/3) - (1/3)*log2(1/3) which is about 0.918296.
Thus the expected information for a split on temperature is:
0.811278*(4/11) + 0*(3/11) + 0*(1/11) + 0.918296*(3/11) which is
about 0.54545.
Thus the expected information gain for a split on temperature is about
0.94566 - 0.54545 = 0.40021.
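The whole calculation can be sketched in a few lines of Python, using
only the counts given above. The names entropy, information_gain and
splits are assumptions made for illustration; the sketch compares just
the two attributes worked through here.

```python
import math

def entropy(counts):
    """Entropy in bits of a class distribution given as raw counts."""
    total = sum(counts)
    return sum(-c / total * math.log2(c / total) for c in counts if c > 0)

def information_gain(split):
    """split maps each attribute value to its (positive, negative) counts."""
    subsets = list(split.values())
    total = sum(p + n for p, n in subsets)
    # Entropy before the split, from the summed class counts
    parent = entropy((sum(p for p, _ in subsets), sum(n for _, n in subsets)))
    # Expected information after the split: size-weighted subset entropies
    expected = sum((p + n) / total * entropy((p, n)) for p, n in subsets)
    return parent - expected

splits = {
    "texture": {"smooth": (4, 1), "wavy": (1, 1), "rough": (2, 2)},
    "temperature": {"cold": (1, 3), "cool": (3, 0), "warm": (1, 0), "hot": (2, 1)},
}
gains = {name: information_gain(split) for name, split in splits.items()}
best = max(gains, key=gains.get)

for name, value in gains.items():
    print(name, round(value, 4))
print("larger gain:", best)
```

Of these two attributes, temperature gives the larger gain, so it
would be preferred over texture as the split.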