Optimal Search Results Over Cloud with a Novel Ranking Approach

Abstract: In this paper we propose an efficient search implementation over data stored in cloud services. Search over cloud services has recently gained importance in research, because users need results that are relevant to their queries while the outsourced data is stored in encrypted form. Our approach searches for keywords in the encrypted documents and ranks the results by a file relevance score, over service-oriented applications.

I. INTRODUCTION

Suppose user Alice wishes to read her email on a number of devices: laptop, desktop, pager, etc. Alice's mail gateway is supposed to route email to the appropriate device based on the keywords in the email. For example, when Bob sends email with the keyword "urgent", the mail is routed to Alice's pager. When Bob sends email with the keyword "lunch", the mail is routed to Alice's desktop for reading later. One expects each email to contain a small number of keywords. For example, all words on the subject line as well as the sender's email address could be used as keywords [3].

Verifying the integrity and authenticity of information is a prime necessity in computer systems and networks. In particular, two parties communicating over an insecure channel require a method by which information sent by one party can be validated as authentic (or unmodified) by the other. Most commonly such a mechanism is based on a secret key shared between the parties and takes the form of a Message Authentication Code (MAC). (Other terms used include "Integrity Check Value" and "cryptographic checksum".) In this case, when party A transmits a message to party B, it appends to the message a value called the authentication tag, computed by the MAC algorithm as a function of the transmitted information and the shared secret key.

At reception, B recomputes the authentication tag on the received message using the same mechanism (and key) and checks that the value he obtains equals the tag attached to the received message [2]. Only if the values match is the information received considered as not altered on the way from A to B. The goal is to prevent forgery, namely, the computation, by the adversary, of a message (not sent by the legitimate parties) and its corresponding valid authentication tag. A precise definition of MACs and their security.
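The tag computation and verification described above can be sketched with Python's standard `hmac` module. This is only an illustration: the paper does not name a particular MAC construction, so HMAC-SHA256, the key value, and the function names `compute_tag` and `verify` are assumptions made for this sketch.

```python
import hashlib
import hmac


def compute_tag(key: bytes, message: bytes) -> bytes:
    """Party A: compute the authentication tag from the message and the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Party B: recompute the tag on the received message and compare it."""
    expected = compute_tag(key, message)
    # compare_digest avoids leaking how many leading bytes matched
    return hmac.compare_digest(expected, tag)


shared_key = b"example shared secret"  # hypothetical key, for illustration only
message = b"meet for lunch"
tag = compute_tag(shared_key, message)

print(verify(shared_key, message, tag))             # unmodified message: True
print(verify(shared_key, b"meet for dinner", tag))  # altered message: False
```

An adversary who does not know `shared_key` cannot produce a valid tag for a message of their choosing, which is the forgery the MAC is meant to prevent.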

II. RELATED WORK

Although various search schemes over encrypted data have been developed, they may suffer from high computational complexity or poor performance. Researchers have proposed various symmetric and asymmetric approaches over many years. Searchable encryption allows a data owner to outsource his data in encrypted form while maintaining the capability to search selectively over the encrypted data. Generally, searchable encryption can be achieved in its full functionality using oblivious RAMs [11]. Although this hides everything during the search from a malicious server (including the access pattern), utilizing oblivious RAM usually brings the cost of a logarithmic number of interactions between the user and the server for each search request. Thus, in order to achieve more efficient solutions, almost all existing work in the searchable encryption literature resorts to a weakened security guarantee, i.e., revealing the access pattern and the search pattern but nothing else. Here the access pattern refers to the outcome of the search, i.e., which files have been retrieved. The search pattern includes the equality pattern between any two search requests (whether two searches were performed for the same keyword), and any information derived from it. We refer readers to [12] for a thorough discussion of searchable symmetric encryption (SSE) definitions.

Having a correct intuition about the security guarantee of the existing SSE literature is very important for defining our ranked searchable symmetric encryption problem. As we will show later, following exactly the same security guarantee as existing SSE schemes would make ranked keyword search very inefficient, which motivates us to further weaken the security guarantee of existing SSE appropriately (leak the relative relevance order but not the relevance score) and realize an as-strong-as-possible ranked searchable symmetric encryption.

International Journal of Engineering Trends and Technology (IJETT), Volume 5, Number 4, Nov 2013. ISSN: 2231-5381. http://www.ijettjournal.org
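The search-pattern leakage discussed in the related work can be illustrated with a small sketch. It assumes, purely for illustration, that each query is turned into a deterministic keyed token (here HMAC-SHA256); the function names `trapdoor` and `equality_pattern` and the key are hypothetical and are not taken from the cited schemes.

```python
import hashlib
import hmac


def trapdoor(key: bytes, keyword: str) -> str:
    """Deterministic search token: the same keyword always yields the same token."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).hexdigest()


def equality_pattern(tokens):
    """What the server can compute without the key: which requests are equal."""
    return [[a == b for b in tokens] for a in tokens]


user_key = b"hypothetical user key"
requests = ["urgent", "lunch", "urgent"]
tokens = [trapdoor(user_key, word) for word in requests]

pattern = equality_pattern(tokens)
print(pattern[0][2])  # requests 1 and 3 used the same keyword
print(pattern[0][1])  # requests 1 and 2 used different keywords
```

The server never sees the keywords themselves, yet it learns that the first and third searches were for the same keyword. This is the kind of information the weakened security guarantee accepts as leakage.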

III. PROPOSED SYSTEM

Searching over outsourced data remains an open research issue in cloud computing and service-oriented applications, because users need results relevant to their queries while the outsourced data is usually encrypted before storage to preserve privacy. Traditional approaches use Boolean search, which is not optimal and is not suitable for large datasets. Our approach searches for keywords in the encrypted documents and ranks the results by a file relevance score. It maintains a search table that relates each search keyword to the documents containing it, together with the score of the keyword with respect to each document, derived from the term frequency and the inverse document frequency. Results are displayed to the user in ranked order.

Many clustering approaches have been proposed to produce optimally ranked results that match user interest, but they have drawbacks such as convergence to local optima and random selection of the initial centroids, so the search results they return are not guaranteed to be optimal.

In this approach the data owner outsources the data to the server. The data owner has a collection of n data files C = (F1, F2, ..., Fn) that he wants to outsource to the server in encrypted form while still keeping the capability to search through them for effective data utilization. To do so, before outsourcing, the data owner first builds a secure searchable index I from a set of m distinct keywords W = (w1, w2, ..., wm) extracted from the file collection C, and stores both the index I and the encrypted file collection C on the server. The index table contains the unique keywords from the dataset along with the IDs of the files that contain them. Before the keywords are placed in the index table, they are encrypted with a symmetric key using the AES algorithm. After a search, the matching files are ordered by rank.
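The index construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Python's standard library has no AES, so a keyed HMAC-SHA256 token stands in for the AES-encrypted keyword. The tokenizer, the key, the sample files, and the names `keyword_token` and `build_index` are hypothetical.

```python
import hashlib
import hmac
import re
from collections import Counter


def keyword_token(key: bytes, keyword: str) -> str:
    # Stand-in for the AES-encrypted keyword (assumption: any deterministic
    # keyed transformation is enough to illustrate the index layout).
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, files: dict):
    """Map each protected keyword to {file_id: term frequency}.

    Also returns the number of terms in each file, which a relevance
    score based on file length needs later.
    """
    index = {}
    terms_in_file = {}
    for file_id, text in files.items():
        terms = re.findall(r"[a-z0-9]+", text.lower())
        terms_in_file[file_id] = len(terms)
        for term, freq in Counter(terms).items():
            index.setdefault(keyword_token(key, term), {})[file_id] = freq
    return index, terms_in_file


owner_key = b"hypothetical owner key"
files = {
    "F1": "cloud search over encrypted cloud data",
    "F2": "ranked search results",
}
index, terms_in_file = build_index(owner_key, files)

print(index[keyword_token(owner_key, "cloud")])   # {'F1': 2}
print(index[keyword_token(owner_key, "search")])  # {'F1': 1, 'F2': 1}
```

Only the protected tokens and the frequencies are stored in the index, so the server can match a query token against it without seeing the plaintext keywords.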

A) Algorithm for index table generation

1. Read the document F.
2. Segment the document into terms and encrypt each term with the key.
3. Calculate the term frequency (TF), the inverse document frequency (IDF), and the publishing time (PT).
4. Generate the index table (Itable) and upload it and the files to the server.

B) Rijndael algorithm

Rijndael is the block cipher on which the AES standard is based. Our system uses it for secure data transmission, and it is widely regarded as efficient and more secure than many traditional ciphers. The key is generated from the multi-key exchange group key protocol. The brief structure of the algorithm is shown below. It works mainly with substitution and affine transformation techniques.

KeyExpansion: round keys are derived from the cipher key using the key schedule.
AddRoundKey: each byte of the state is combined with the round key using bitwise XOR.
Rounds:
SubBytes: a non-linear substitution step where each byte is replaced with another according to a lookup table.
ShiftRows: a transposition step where each row of the state is shifted cyclically a certain number of steps.
MixColumns: a mixing operation which operates on the columns of the state, combining the four bytes in each column.
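Three of the steps listed above (AddRoundKey, ShiftRows, and MixColumns) are simple enough to sketch directly. The code below is an illustration of the individual transformations on a 4x4 state, not a usable cipher: KeyExpansion and the SubBytes lookup table are omitted, and the row-major state layout and function names are choices made for this sketch.

```python
def add_round_key(state, round_key):
    """XOR every byte of the state with the matching byte of the round key."""
    return [[s ^ k for s, k in zip(s_row, k_row)]
            for s_row, k_row in zip(state, round_key)]


def shift_rows(state):
    """Rotate row r of the state left by r positions (row 0 is unchanged)."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def xtime(a):
    """Multiply a byte by 2 in GF(2^8) with the AES polynomial 0x11b."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a & 0xFF


def mix_single_column(col):
    """Combine the four bytes of one column (multiplication by 2 and 3)."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3),
        (xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


def mix_columns(state):
    """Apply mix_single_column to each column of a row-major state."""
    cols = [mix_single_column([state[r][c] for r in range(4)]) for c in range(4)]
    return [[cols[c][r] for c in range(4)] for r in range(4)]


state = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
print(shift_rows(state))
# A commonly used MixColumns test column: db 13 53 45 -> 8e 4d a1 bc
print([hex(b) for b in mix_single_column([0xDB, 0x13, 0x53, 0x45])])
```

Because AddRoundKey is a plain XOR, applying it twice with the same round key restores the original state, which is why the same operation serves both encryption and decryption.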

Our proposed architecture works with web services (service-oriented applications), which provide language interoperability and security. The server receives the query from the user, authenticates the user with the user key, encrypts the query keyword using the AES algorithm, and compares it with the encrypted keywords in the index table. It then finds the number of occurrences of the keyword, which determines the term frequency and inverse document frequency used to compute the file relevance score.

Figure: Architecture (the data owner outsources files with index table generation; index tables; user query; rank-oriented results).

In this paper we propose a novel file relevance score computed from the number of terms in the file, the number of occurrences of the term in the file (term frequency), and the number of files:

relevance_Scores[j] = Convert.ToDecimal((1.0 / termsinfile[j]) * (1 + Math.Log(termfreqs[j])) * Math.Log(1 + ((double)filecount / numberoffiles)));

The divisions are written as floating-point divisions so that the quotients are not truncated to zero if the operands are integers. The ranking function calculates the term frequency and inverse document frequency to score the query keyword with respect to each file, and returns the files to the user in order of score.

Step 1: The user registers at the server by requesting a key.
Step 2: The user receives the key for authenticated and secure search.
Step 3: The user searches for relevant data with a plaintext keyword.
Step 4: The service processes the query and checks the authentication of the user.
Step 5: The service retrieves the relevant information from the index table for the keyword.
Step 6: The service calculates the file relevance scores using the formula above.
Step 7: The service returns the files to the user, ordered by file relevance score.

III. EXPERIMENTAL ANALYSIS

For the implementation we used C#.NET and ASP.NET. The experimental analysis begins with index table generation at the data owner's end.

The index table consists of the keyword, the encrypted keyword, and the frequency of the keyword, and it is uploaded to the service provider. The user's search results are shown for the specified search keyword, with the relevant files and their file relevance scores.
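The file relevance score defined in the proposed system, and the ranking it produces, can be sketched as follows. The sketch follows the paper's expression term by term. It assumes that `filecount` is the number of files containing the keyword and `numberoffiles` is the total number of files; the paper does not define either name, and a common TF-IDF convention uses the inverse ratio. The sample postings and file lengths are invented for illustration.

```python
import math


def relevance_score(terms_in_file, term_freq, file_count, number_of_files):
    """Score of one file for one keyword, following the paper's expression.

    term_freq must be at least 1 (the file contains the keyword),
    otherwise the logarithm is undefined.
    """
    return ((1.0 / terms_in_file)
            * (1.0 + math.log(term_freq))
            * math.log(1.0 + file_count / number_of_files))


def rank_files(postings, terms_in_file, number_of_files):
    """Return (file_id, score) pairs sorted from most to least relevant."""
    file_count = len(postings)  # assumption: files that contain the keyword
    scores = {
        file_id: relevance_score(terms_in_file[file_id], tf, file_count, number_of_files)
        for file_id, tf in postings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical index entry for one keyword: file id -> term frequency
postings = {"F1": 2, "F2": 1, "F3": 4}
terms_in_file = {"F1": 6, "F2": 3, "F3": 40}

ranking = rank_files(postings, terms_in_file, number_of_files=10)
for file_id, score in ranking:
    print(file_id, round(score, 4))
```

In this example the short file F2 ranks first even though the keyword occurs only once in it, because the score divides by the file length. Under the assumed meaning of `filecount` and `numberoffiles`, the last factor is the same for every file matching a given keyword, so it scales the scores without changing their order.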

IV. CONCLUSION AND FUTURE WORK

Our approach provides an efficient and secure search mechanism over service-oriented applications. It encrypts the search keyword at the server side, calculates the file relevance scores of the files that contain the keyword, and retrieves the relevant files in ranked order.

The system can be enhanced by extending the search mechanism with semantic comparison and similarity-based approaches.

BIOGRAPHIES

MOVVA KALPANA received her B.Tech in Information Technology from Sri Sarathi Institute of Engineering & Technology, Nuzvid (JNTU Hyderabad) in 2006. She is currently an M.Tech candidate in the Department of Software Engineering at SISTAM college, JNTU Kakinada. Her research interests include network cryptography, information security, cloud computing, and distributed systems.

Jayanthi Rao Madina is Head of Department (HOD) at Sarada Institute of Science, Technology And Management, Srikakulam, Andhra Pradesh. He received his M.Tech (CSE) from Aditya Institute of Technology And Management, Tekkali, Andhra Pradesh. His research areas include image processing, computer networks, data mining, and distributed systems.