- Digits
- Punctuation
- Bullets
Based on my research, digits, punctuation marks, and bullets do not have a significant influence on text data. This is because the terms in a text are eventually ranked, and this kind of dirty data carries no meaning on its own.
For that reason, digits, punctuation marks, and bullets need to be removed before the other text mining steps are carried out.
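The removal step described above could be sketched roughly as follows. This is only a minimal illustration, not the exact procedure used in my research: the function name `strip_noise` and the set of bullet characters in `BULLETS` are assumptions made for the example, and each removed character is replaced by a space so that neighbouring words do not get glued together.

```python
import re
import string

# Assumed set of bullet characters; extend it to match your own data.
BULLETS = "•◦▪‣·"

def strip_noise(text: str) -> str:
    """Remove digits, punctuation marks, and bullets from a text."""
    # Replace every run of digits with a space.
    text = re.sub(r"\d+", " ", text)
    # Replace ASCII punctuation and the assumed bullet characters with spaces.
    table = str.maketrans({ch: " " for ch in string.punctuation + BULLETS})
    text = text.translate(table)
    # Collapse the leftover whitespace into single spaces.
    return " ".join(text.split())

print(strip_noise("• Item 1: hello, world!"))
```

Note that `string.punctuation` only covers ASCII punctuation, so other marks (for example curly quotes) would need to be added by hand.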
This is the scheme of preprocessing: