Published February 22, 2024 (author: 屈邃)
Day     Outlook    Temperature  Humidity  Wind    Play_Tennis
Day1    Sunny      Hot          High      Weak    No
Day2    Sunny      Hot          High      Strong  No
Day3    Overcast   Hot          High      Weak    Yes
Day4    Rain       Mild         High      Weak    Yes
Day5    Rain       Cool         Normal    Weak    Yes
Day6    Rain       Cool         Normal    Strong  No
Day7    Overcast   Cool         Normal    Strong  Yes
Day8    Sunny      Mild         High      Weak    No
Day9    Sunny      Cool         Normal    Weak    Yes
Day10   Rain       Mild         Normal    Weak    Yes
Day11   Sunny      Mild         Normal    Strong  Yes
Day12   Overcast   Mild         High      Strong  Yes
Day13   Overcast   Hot          Normal    Weak    Yes
Day14   Rain       Mild         High      Strong  No
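Given the same training data used for decision tree induction, we want to use naive Bayes classification to predict the class label of an unknown sample. The data samples are described by the attributes Outlook, Temperature, Humidity, and Wind. The class label attribute Play_Tennis has two distinct values, Yes and No. Let C1 correspond to the class Play_Tennis="Yes" and C2 to the class Play_Tennis="No". The sample we want to classify is
X = (Outlook="Sunny", Temperature="Cool", Humidity="High", Wind="Strong")
We need to maximize P(X|Ci)P(Ci) for i=1,2. The prior probability P(Ci) of each class can be computed from the training samples:
P(Play_Tennis="Yes")=9/14=0.643
P(Play_Tennis="No")=5/14=0.357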
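The two priors are simply class frequencies in the Play_Tennis column. A minimal sketch of that count follows; the variable names `labels` and `priors` are illustrative and do not come from the original text.

```python
from collections import Counter
from fractions import Fraction

# Play_Tennis column for Day1..Day14, copied from the training table.
labels = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
          "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]

counts = Counter(labels)

# Prior P(Ci) = (number of training days in class Ci) / (total number of days).
priors = {c: Fraction(n, len(labels)) for c, n in counts.items()}

for c in ("Yes", "No"):
    print(f"P(Play_Tennis={c!r}) = {priors[c]} = {float(priors[c]):.3f}")
```

Using `Fraction` keeps the values exact, so the rounded decimals are produced only when printing.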
To compute P(X|Ci) for i=1,2, we estimate the following conditional probabilities from the training data:
P(Outlook="Sunny"|Play_Tennis="Yes")=2/9=0.222
P(Outlook="Sunny"|Play_Tennis="No")=3/5=0.600
P(Temperature="Cool"|Play_Tennis="Yes")=3/9=0.333
P(Temperature="Cool"|Play_Tennis="No")=1/5=0.200
P(Humidity="High"|Play_Tennis="Yes")=3/9=0.333
P(Humidity="High"|Play_Tennis="No")=4/5=0.800
P(Wind="Strong"|Play_Tennis="Yes")=3/9=0.333
P(Wind="Strong"|Play_Tennis="No")=3/5=0.600
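Each conditional probability above is a relative frequency within one class: restrict the table to the days with the given label, then count how many of them have the attribute value. The sketch below reproduces those counts; the helper `conditional` and the names `ATTRS`, `ROWS`, and `x` are hypothetical and introduced only for illustration.

```python
from fractions import Fraction

# Training table for Day1..Day14: (Outlook, Temperature, Humidity, Wind, Play_Tennis).
ATTRS = ("Outlook", "Temperature", "Humidity", "Wind")
ROWS = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def conditional(attr, value, label):
    """Estimate P(attr=value | Play_Tennis=label) by relative frequency."""
    k = ATTRS.index(attr)
    in_class = [row for row in ROWS if row[-1] == label]
    matches = sum(1 for row in in_class if row[k] == value)
    return Fraction(matches, len(in_class))


# The sample X to be classified.
x = {"Outlook": "Sunny", "Temperature": "Cool", "Humidity": "High", "Wind": "Strong"}

for attr, value in x.items():
    for label in ("Yes", "No"):
        p = conditional(attr, value, label)
        print(f"P({attr}={value!r} | Play_Tennis={label!r}) = {float(p):.3f}")
```

Note that `Fraction` reduces 3/9 to 1/3, so the exact values match the text even though they print in lowest terms. This sketch applies no smoothing; a value that never occurs within a class would get probability 0.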
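Using these probabilities, and the naive assumption that the attributes are conditionally independent given the class, we obtain:
P(X|Play_Tennis="Yes")=0.222×0.333×0.333×0.333=0.00823
P(X|Play_Tennis="No")=0.600×0.200×0.800×0.600=0.0576
P(X|Play_Tennis="Yes")P(Play_Tennis="Yes")=0.00823×0.643=0.0053
P(X|Play_Tennis="No")P(Play_Tennis="No")=0.0576×0.357=0.0206
Since 0.0206 > 0.0053, the naive Bayes classifier predicts Play_Tennis="No" for sample X.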
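The final step can be checked with exact fractions, which avoids any rounding in the intermediate products. This is a minimal sketch that takes the priors and conditional probabilities as given in the text; the names `priors`, `likelihoods`, `scores`, and `prediction` are illustrative.

```python
from fractions import Fraction
from math import prod

# Priors P(Ci) from the training data.
priors = {"Yes": Fraction(9, 14), "No": Fraction(5, 14)}

# Conditional probabilities for X, in the order
# Outlook="Sunny", Temperature="Cool", Humidity="High", Wind="Strong".
likelihoods = {
    "Yes": [Fraction(2, 9), Fraction(3, 9), Fraction(3, 9), Fraction(3, 9)],
    "No": [Fraction(3, 5), Fraction(1, 5), Fraction(4, 5), Fraction(3, 5)],
}

# Naive Bayes score for each class: P(X|Ci) * P(Ci), where P(X|Ci) is the
# product of the per-attribute conditional probabilities.
scores = {c: priors[c] * prod(likelihoods[c]) for c in priors}

# Predict the class with the largest score.
prediction = max(scores, key=scores.get)

for c, s in scores.items():
    print(f"P(X|{c}) * P({c}) = {float(s):.4f}")
print("Predicted Play_Tennis:", prediction)
```

The scores are not normalized, so they are not posterior probabilities; only their ordering matters for the prediction.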