DOCUMENT RETRIEVAL USING LATENT SEMANTIC INDEXING FOR HINDI LANGUAGE

Jasvir, Vaibhav Pratap Singh, Amit Kumar Yadav

Abstract: This paper addresses document retrieval for the Hindi language using Latent Semantic Indexing (LSI). In the past, information mining through web text search has mostly targeted English and some other regional languages, while comparatively little SEO support is available for Hindi. To improve Hindi text search and the relevance of its results, we use a suffix removal method, which supports the information retrieval process. The information to be extracted is expressed as a query created by the user. Documents that satisfy the user's query are considered "relevant"; all others are "non-relevant." Singular value decomposition (SVD) is one of the most effective dimensionality reduction schemes. It is used in many fields, for example as the basis of principal component analysis (PCA) in image processing and face recognition. The performance of LSI has been tested on many small datasets, but not on a larger one. In our research, we focus on the performance of LSI on a large dataset while varying parameters such as stop word lists and term weighting schemes.

Keywords: Information Retrieval (IR), Text Retrieval Conference (TREC), Term Weighting Scheme, Tf-idf, Stop Words, Latent Semantic Analysis (LSA), Stemming, LSI, Classical Boolean, Extended Boolean, Vector Space, Probabilistic, NLP
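
As an illustration of the LSI pipeline summarized in the abstract (tf-idf term weighting, SVD-based dimensionality reduction, and query matching), the following is a minimal sketch and not the authors' implementation. The toy corpus, the function names (tfidf_matrix, lsi_fit, fold_in, rank), the smoothed idf formula, and the choice k = 2 are all assumptions made for this example. Documents are assumed to be already tokenized, stemmed, and stripped of stop words.

```python
import numpy as np

def tfidf_matrix(docs):
    """Build a term-by-document tf-idf matrix from tokenized documents."""
    vocab = sorted({t for d in docs for t in d})
    index = {t: i for i, t in enumerate(vocab)}
    tf = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for t in d:
            tf[index[t], j] += 1
    df = np.count_nonzero(tf, axis=1)
    # Assumed weighting: log(N / df) + 1, one of several common idf variants.
    idf = np.log(len(docs) / df) + 1.0
    return tf * idf[:, None], index, idf

def lsi_fit(A, k):
    """Truncated SVD: keep the k largest singular triplets of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Rows of the third return value are documents in the k-dimensional space.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in(query_tokens, index, idf, Uk, sk):
    """Project a query into the latent space: q^T U_k S_k^{-1}."""
    q = np.zeros(len(index))
    for t in query_tokens:
        if t in index:  # terms outside the vocabulary are ignored
            q[index[t]] += 1
    return ((q * idf) @ Uk) / sk

def rank(query_vec, doc_vecs):
    """Order documents by cosine similarity to the query, best first."""
    num = doc_vecs @ query_vec
    den = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    sims = num / np.where(den == 0, 1.0, den)
    return np.argsort(-sims), sims

# Hypothetical toy corpus: two cricket documents, two politics documents.
docs = [
    ["क्रिकेट", "मैच", "खिलाड़ी"],
    ["क्रिकेट", "खिलाड़ी", "टीम"],
    ["चुनाव", "सरकार", "मंत्री"],
    ["सरकार", "मंत्री", "संसद"],
]
A, index, idf = tfidf_matrix(docs)
Uk, sk, doc_vecs = lsi_fit(A, k=2)
query_vec = fold_in(["टीम"], index, idf, Uk, sk)
order, sims = rank(query_vec, doc_vecs)
```

In this toy setting the query term "टीम" occurs only in the second document, yet both cricket documents should score well above the politics documents, because the reduced space groups terms that co-occur. On a real collection the number of retained dimensions k, the stop word list, and the weighting scheme would all need to be tuned.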