
Does the vocabulary of a corpus remain the same before and after text normalization? Why?

1 Answer

Best answer

No, the vocabulary of a corpus does not remain the same before and after text normalization. 

The reasons are:

● Normalization passes the text through several steps that reduce it to a minimal vocabulary, because the machine needs the essence of a statement rather than a grammatically correct form.

● During normalization, stop words, special characters and numbers are removed, so those tokens no longer appear in the vocabulary.

● In stemming, affixes are stripped so that different forms of a word collapse to a common stem (for example, "playing" and "played" both become "play"). The stem is not always a valid dictionary word.

So, after normalization, the vocabulary is smaller than it was before.
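
The effect described above can be seen on a small example. The following is only a minimal sketch in Python: the two-sentence corpus, the tiny `STOP_WORDS` set and the `toy_stem` helper are all assumptions made up for illustration, and `toy_stem` is a crude suffix stripper, not a real stemming algorithm such as Porter's.

```python
import re

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "in", "on", "of", "to"}

# Suffixes tried by the toy stemmer, in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word):
    """Strip at most one suffix, keeping at least 3 letters of the word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def raw_vocabulary(corpus):
    """Vocabulary before normalization: unique whitespace-separated tokens."""
    return {token for doc in corpus for token in doc.split()}


def normalized_vocabulary(corpus):
    """Vocabulary after lowercasing, cleaning, stop-word removal and stemming."""
    vocab = set()
    for doc in corpus:
        text = doc.lower()
        # Replace everything except lowercase letters and whitespace,
        # which drops special characters and numbers.
        text = re.sub(r"[^a-z\s]", " ", text)
        for token in text.split():
            if token not in STOP_WORDS:
                vocab.add(toy_stem(token))
    return vocab


# Made-up corpus for the demonstration.
corpus = [
    "The cats are playing in the garden!",
    "A cat played in 2 gardens.",
]

before = raw_vocabulary(corpus)
after = normalized_vocabulary(corpus)

print(len(before), sorted(before))
print(len(after), sorted(after))
```

On this corpus the raw vocabulary has 12 distinct tokens (for instance, "The" and "the" count separately, and "garden!" keeps its punctuation), while the normalized vocabulary has only 3: "cat", "garden" and "play".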
