About Word Association #1350
Well, I think that happens.
All the words in the network come from the [Word Association] screen, and the words listed on that screen are those that appeared more frequently in 2024 than in the data as a whole. They are not words that appeared only in 2024, but words that appeared more frequently in 2024 than in the other parts. It is therefore possible for them to appear in 2016 or 2020 as well.
Yes, KH Coder first omits all words with low conditional probabilities.
Thanks so much! Got that!
In the case of "Matthew", the Jaccard coefficient is calculated like this:
a) Number of H5 units in part 01-07 that contain Matthew: 90
b) Number of H5 units in part 01-07 that do not contain Matthew: 338 - 90 = 248
c) Number of H5 units not in part 01-07 that contain Matthew: 285 - 90 = 195
Jaccard coefficient: a / (a + b + c) = 90 / (90 + 248 + 195) = 90 / 533 ≈ 0.168856
BTW, "Number of H5 units" means "number of cells in the Excel file": each H5 unit corresponds to one cell.
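If it helps, the arithmetic above can be reproduced with a few lines of Python. This is only a sketch of the formula as written in this post, not KH Coder's actual code; the function name `jaccard` and its parameter names are made up for illustration, and the three counts (90, 338, 285) are the ones quoted above.

```python
def jaccard(units_in_part_with_word, units_in_part, units_with_word):
    """Jaccard coefficient between 'unit is in the part' and 'unit contains the word'.

    Hypothetical helper for illustration; all arguments are counts of units
    (here, H5 units, i.e. Excel cells).
    """
    a = units_in_part_with_word                      # in the part, contains the word
    b = units_in_part - units_in_part_with_word      # in the part, lacks the word
    c = units_with_word - units_in_part_with_word    # outside the part, contains the word
    return a / (a + b + c)


# Counts quoted above for "Matthew" and part 01-07.
coefficient = jaccard(
    units_in_part_with_word=90,
    units_in_part=338,
    units_with_word=285,
)
print(round(coefficient, 6))  # 0.168856
```

Note that units that are neither in the part nor contain the word never enter the formula, so the total number of units in the file does not affect the result.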
It would be more accurate to say that the calculation is based on the probability of a word occurring within a given unit, rather than on word frequency. Because of this calculation, if you select a smaller unit such as "senten…