Use app×
Join Bloom Tuition
One on One Online Tuition
JEE MAIN 2025 Foundation Course
NEET 2025 Foundation Course
CLASS 12 FOUNDATION COURSE
CLASS 10 FOUNDATION COURSE
CLASS 9 FOUNDATION COURSE
CLASS 8 FOUNDATION COURSE
0 votes
1.4k views
in Artificial Intelligence (AI) by (47.7k points)
closed by

Which words in a corpus have the highest values and which ones have the least?

1 Answer

+2 votes
by (44.7k points)
selected by
 
Best answer

Stop words like - and, this, is, the, etc. have highest values in a corpus. But these words do not talk about the corpus at all. Hence, these are termed as stop words and are mostly removed at the pre-processing stage only.

Rare or valuable words occur the least but add the most importance to the corpus. Hence, when we look at the text, we take frequent and rare words into consideration.

Welcome to Sarthaks eConnect: A unique platform where students can interact with teachers/experts/students to get solutions to their queries. Students (upto class 10+2) preparing for All Government Exams, CBSE Board Exam, ICSE Board Exam, State Board Exam, JEE (Mains+Advance) and NEET can ask questions from any subject and get quick answers by subject teachers/ experts/mentors/students.

Categories

...