# Question 5

# Vocabulary: ["Python", "是", "编程", "语言", "Java"]  (是 = "is", 编程 = "programming", 语言 = "language")
# Doc1 vector: [1, 1, 1, 1, 0]
# Doc2 vector: [0, 1, 1, 1, 1]
# Doc3 vector: [3, 0, 0, 0, 0]
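
The vectors above are per-word counts over the vocabulary, in vocabulary order. A minimal sketch of that counting step follows. The original documents are not given here, so the token lists in `docs` are hypothetical, chosen only because they reproduce the three vectors; `bow_vector` is likewise an illustrative helper name, not part of any library.

```python
from collections import Counter

# Vocabulary from the answer above; its order fixes the vector dimensions.
VOCAB = ["Python", "是", "编程", "语言", "Java"]


def bow_vector(tokens, vocab=VOCAB):
    """Count how often each vocabulary word occurs in a tokenized document."""
    counts = Counter(tokens)
    # Counter returns 0 for words that never occur, so no KeyError here.
    return [counts[word] for word in vocab]


# ASSUMPTION: hypothetical, already-segmented documents. The source does not
# list the original texts; these are picked to match the vectors above.
docs = {
    "Doc1": ["Python", "是", "编程", "语言"],
    "Doc2": ["Java", "是", "编程", "语言"],
    "Doc3": ["Python", "Python", "Python"],
}

vectors = {name: bow_vector(tokens) for name, tokens in docs.items()}

for name, vec in vectors.items():
    print(name, vec)
```

Using a fixed vocabulary list rather than a set keeps the dimensions in a stable order, which is what makes vectors from different documents comparable.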
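# Question 6

# Drawback 1: Ignores word order and the semantic relations between words, so it cannot distinguish texts with opposite meanings (e.g. "猫吃鱼", "cat eats fish", vs. "鱼吃猫", "fish eats cat").
# Drawback 2: High-dimensional sparsity and the semantic gap increase computational cost and leave synonyms and polysemous words unhandled (e.g. "编程", "programming", vs. "编码", "coding").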
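
Both drawbacks can be seen on tiny count vectors. The sketch below is illustrative only: the per-character segmentation of the cat/fish example, the two small vocabularies, and the helper names `bow_vector` and `cosine` are all assumptions made for the demonstration.

```python
import math
from collections import Counter


def bow_vector(tokens, vocab):
    """Count vector of a tokenized document over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Drawback 1: word order is lost.
# ASSUMPTION: each sentence is segmented into three single-character tokens.
vocab_order = ["猫", "吃", "鱼"]
cat_eats_fish = bow_vector(["猫", "吃", "鱼"], vocab_order)
fish_eats_cat = bow_vector(["鱼", "吃", "猫"], vocab_order)
same_vector = cat_eats_fish == fish_eats_cat  # identical counts, opposite meaning

# Drawback 2: synonyms occupy unrelated dimensions.
vocab_syn = ["编程", "编码"]
programming = bow_vector(["编程"], vocab_syn)
coding = bow_vector(["编码"], vocab_syn)
synonym_similarity = cosine(programming, coding)  # no shared dimension

print(same_vector, synonym_similarity)
```

The two sentences collapse to the same vector, while the two near-synonyms share no nonzero dimension, so a similarity measure built on these counts treats them as unrelated.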