Acknowledgement
I have the great honor of acknowledging Mr. ……………,
Director, Appin Technology Labs, Bamunimaidan, who gave
me his consent to carry out this project. I feel immense pleasure
and privilege in expressing my deep sense of gratitude towards my
guide Mr. Nutan, whose valuable guidance and critical analysis of
my results led to the successful completion of my project.
My special thanks go to all my friends for their encouraging
support in this report work. I express my gratitude to my
affectionate and loving friends for their encouragement and
enthusiastic support throughout this study. I thank my respected
parents, whose patience and support were instrumental in
accomplishing this task.
MOON GOGOI
Table of Contents
Title
Introduction
About Google
Google Search
Search Algorithm
Hilltop Algorithm Overview
How Google Search Engine Works?
What Google Searches?
Quoted Phrases
The + Operator
The - Operator
The ~ Operator
The OR Operator
The .. Operator
The * Operator
Search Operators
Querying Vulnerable Sites…
Using “Index of” Syntax…
Looking for Vulnerable…
Other Similar Search…
What Google Hack Searches?
Some Google Hack Tricks
Accessing Blocked Site
Live Network Camera Hack
Conclusion
References
Introduction
Google hacking is the term used when a hacker tries to find
vulnerable targets or sensitive data by using the Google search engine. In
Google hacking, hackers use search engine commands or complex search
queries to locate sensitive data and vulnerable devices on the Internet.
Although Google hacking techniques are against Google's terms of service
and Google blocks well-known Google hacking queries, nothing can stop
hackers from crawling websites and launching Google queries. Google
hacking can be used to locate vulnerable web servers and websites which
are listed in the Google search engine database. In other words, hackers can
locate many thousands of vulnerable websites, web servers and online
devices all around the world and select their targets randomly. This kind of
attack, which relies on Google hacking techniques, is most commonly launched
by junior hackers. The Google hacking procedure is
based on certain keywords, which are most effective when they are combined
with the internal commands of the Google search engine. These commands
help hackers narrow down their search to locate sensitive
data or vulnerable devices. Nevertheless, the success of Google hacking
techniques depends on the existence of vulnerable sites, servers and
devices. However, we should not ignore the power of the search engines in
providing information about the targets to the hackers in the
reconnaissance phase.
Malicious hackers can use Google hacking techniques to identify
sites and web servers with known vulnerabilities. In addition, they
can look for error pages that expose technical information, retrieve
files and directories with sensitive contents such as databases, passwords,
log files and login pages, or find online devices such as IP cameras and
network storage.
About Google
Google began in January 1996 as a research project by Larry Page and
Sergey Brin when they were both PhD students at Stanford University in
California. While conventional search engines ranked results by counting
how many times the search terms appeared on the page, the two theorized
about a better system that analyzed the relationships between websites.
They called this new technology PageRank, where a website's relevance was
determined by the number of pages, and the importance of those pages,
that linked back to the original site. A small search engine called Rankdex
was already exploring a similar strategy. Page and Brin originally
nicknamed their new search engine "BackRub", because the system
checked backlinks to estimate the importance of a site. Eventually, they
changed the name to Google, originating from a misspelling of the word
"googol", the number one followed by one hundred zeros, which was meant
to signify the amount of information the search engine was to handle.
Originally, Google ran under the Stanford University website, with the
domain google.stanford.edu. The domain google.com was registered on
September 15, 1997, and the company was incorporated on September 4,
1998, in a friend's garage in Menlo Park, California.
Google Inc. is a multinational public cloud computing, Internet
search, and advertising technologies corporation. Google hosts and
develops a number of Internet-based services and products, and generates
profit primarily from advertising through its AdWords program. The
company was founded by Larry Page and Sergey Brin, often dubbed the
"Google Guys", while the two were attending Stanford University as Ph.D.
candidates. It was first incorporated as a privately held company on
September 4, 1998, with its initial public offering to follow on August 19,
2004. The company's stated mission from the outset was "to organize the
world's information and make it universally accessible and useful", and the
company's unofficial slogan – coined by Google engineer Paul Buchheit – is
"Don't be evil". In 2006, the company moved to its current headquarters in
Mountain View, California.
Google Search Engine
The Google web search engine is the company's most popular service.
According to market research published by comScore in November 2009,
Google is the dominant search engine in the United States market, with a
market share of 65.6%. Google indexes trillions of web pages, so that users
can search for the information they desire, through the use of keywords and
operators. In 2003, The New York Times complained about Google's
indexing, claiming that Google's caching of content on its site infringed
its copyright for the content. In related cases on caching, Field v. Google
and Parker v. Google, United States district courts ruled in favor of
Google. Google Watch has also criticized Google's PageRank algorithms,
saying that they discriminate against new websites and favor established
sites, and has made allegations about connections between Google and the
NSA and the CIA. Despite criticism, the basic search engine has spread to
specific services as well, including an image search engine, the Google News
search site, Google Maps, and more. In early 2006, the company launched
Google Video, which allowed users to upload, search, and watch videos
from the Internet. In 2009, however, uploads to Google Video were
discontinued so that Google could focus more on the search aspect of the
service. The company even developed Google Desktop, a desktop search
application used to search for files local to one's computer.
Search Algorithm
Google is believed to use the Hilltop search algorithm.
Hilltop Search Algorithm - The Hilltop Search algorithm tries
to order search engine results according to relatedness of interlinking
pages, as opposed to the Google PageRank algorithm, which relies on
authority of Web pages and sites linking to a page.
For example, suppose that page A has a PageRank of 7 because it is a
page at the Web site of a large firm. It is all about widgets, so it gets
a high score on on-page optimization for the keyword "widgets". It has few
backlinks from external pages because competitors will not link to it.
Page B, on the other hand, is also about widgets but has a PageRank of only 2.
However, page B is at a university Web site and has 10 high-ranking "expert"
pages linking to it with the keyword "widgets" in the anchor text. It also links
back to at least some of these pages. Under the Hilltop algorithm, page B
could therefore rank above page A for "widgets" despite its lower PageRank.
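The contrast between the two pages can be sketched in a few lines of Python. This is only a toy illustration, not the real Hilltop or PageRank computation: the function name expert_score, the link data and the anchor texts are all assumptions made up for this example.

```python
# Toy illustration only: the data and the scoring rule are assumptions,
# not the real Hilltop or PageRank implementation.

def expert_score(inbound_links, keyword):
    """Count inbound links from 'expert' pages whose anchor text contains the keyword."""
    keyword = keyword.lower()
    return sum(
        1
        for is_expert, anchor_text in inbound_links
        if is_expert and keyword in anchor_text.lower()
    )

# Each inbound link is a pair: (comes_from_expert_page, anchor_text).
page_a_links = [(False, "our company"), (False, "home page")]
page_b_links = [(True, "guide to widgets")] * 10

pages = {
    "A": {"pagerank": 7, "links": page_a_links},
    "B": {"pagerank": 2, "links": page_b_links},
}

# Rank once by PageRank alone and once by the toy expert score.
by_pagerank = sorted(pages, key=lambda p: pages[p]["pagerank"], reverse=True)
by_experts = sorted(
    pages, key=lambda p: expert_score(pages[p]["links"], "widgets"), reverse=True
)

print(by_pagerank)  # ['A', 'B']
print(by_experts)   # ['B', 'A']
```

With these made-up numbers, page A wins on PageRank alone, while page B wins as soon as links from expert pages are counted.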
One defect of the Hilltop algorithm is that it might be biased against
commercial Web sites, which tend not to link to each other in order to
avoid helping competitors. Consequently, following the Florida Update of Google,
which was thought by some to have used the Hilltop algorithm, many
commercial sites found themselves excluded. Another great defect of this
algorithm is that the "expert" pages postulated by the Hilltop algorithm are
collections of links. This opens the way for exploitation and spamdexing by
shady "directories" and "link farms" - collections of links that are there only
for the purpose of increasing search engine rank. At the time the Hilltop
algorithm was created, the Web was a different place. Traffic was not
determined primarily by search engine listings, and search engines were
not very good at indexing the Web. Therefore it was customary and
constructive to create pages of "resources" that linked to related Web sites
that had good information on a topic. Such honest directories are receding
in importance. However, the algorithm could be modified appropriately. If
a blogger seeking to define a term links to a page in a glossary, that
might be sufficient to grant that page some credit in the positioning
algorithm.
How Google Search Engine Works?
How Google arranges the results on its Search Engine Result Page (SERP)
after you type terms into the Google search box has always been a topic
worth digging into. This section shares some
basic concepts behind this intelligent search engine’s
operational strategies.
The first thing you need to know is that when you search for
something on Google, you are searching the Google database and not the
actual web. The Google spiders and crawlers are programs that crawl
the web and index the web pages that they find. The diagram below is
a concise representation of the basis on which Google orders the search
results displayed on the SERP.
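The idea that a search consults a previously built database, and not the live web, can be sketched with a tiny in-memory index. This is a minimal sketch under stated assumptions: the page texts, the example.com URLs and the function names build_index and search are invented for illustration and say nothing about how Google's real index is organized.

```python
# Toy sketch: a crawler stores page text in an index ahead of time,
# and a search only consults that index. The pages are made-up examples.
from collections import defaultdict

crawled_pages = {
    "http://example.com/a": "google search engine basics",
    "http://example.com/b": "how a search engine ranks pages",
    "http://example.com/c": "cooking with salsa",
}

def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return URLs that contain every word of the query, using only the index."""
    words = query.lower().split()
    if not words:
        return []
    results = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(results)

index = build_index(crawled_pages)
print(search(index, "search engine"))
# ['http://example.com/a', 'http://example.com/b']
```

A page that changes after it was crawled keeps appearing under its old words until the crawler visits it again, which is why cached or outdated content can still show up in results.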
What Google Searches?
Google Search provides at least 22 special features beyond the
original word-search capability. These include synonyms, weather
forecasts, time zones, stock quotes, maps, earthquake data, movie
showtimes, airports, home listings, and sports scores. There are special
features for numbers, including ranges
(70..73), prices, temperatures, money/unit conversions ("10.5 cm in
inches"), calculations ( 3*4+sqrt(6)-pi/2 ), package tracking, patents,
area codes, and language translation of displayed pages.
The order of search results ("ghits", for Google hits) on Google's
search-results pages is based, in part, on a priority rank called a
"PageRank". Google Search provides many options for customized
search, using Boolean operators such as
exclusion ("-xx"), inclusion ("+xx"), alternatives ("xx OR yy"), and
wildcard ("x * x").
The exact percentage of the total of web pages that Google
indexes is not known, as it is very hard to actually calculate. Google not
only indexes and caches web pages but also takes "snapshots" of other
file types, which include PDF, Word documents, Excel spreadsheets,
Flash SWF, plain text files, and so on. Except in the case of text and SWF
files, the cached version is a conversion to (X)HTML, allowing those
without the corresponding viewer application to read the file.
Quoted Phrases
A query with terms in quotes finds pages containing the exact
quoted phrase. For example, [ “Larry Page” ] finds pages containing the
phrase “Larry Page” exactly. So this query would find pages mentioning
Google’s co-founder Larry Page, but not pages containing “Larry has a
home page,” “Larry E. Page,” or “Congressional page Larry Smith.” The
query [ Larry Page ] (without quotes) would find pages containing any of
“Larry Page,” “Larry has a home page,” or “Congressional page Larry
Smith.”
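The difference between the quoted and the unquoted query can be modelled roughly as follows. This is a toy comparison of word sequences, not Google's actual matching; the helper names words, matches_phrase and matches_all_terms are assumptions introduced for this sketch.

```python
# Toy model of the difference between a quoted and an unquoted query.
# Real Google matching is far more elaborate; this only compares word sequences.
import re

def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matches_phrase(page_text, phrase):
    """True if the words of the phrase appear consecutively and in order."""
    page, target = words(page_text), words(phrase)
    return any(
        page[i:i + len(target)] == target
        for i in range(len(page) - len(target) + 1)
    )

def matches_all_terms(page_text, query):
    """True if every query word appears somewhere on the page."""
    page = set(words(page_text))
    return all(term in page for term in words(query))

pages = [
    "Google co-founder Larry Page",
    "Larry has a home page",
    "Congressional page Larry Smith",
]
print([matches_phrase(p, "Larry Page") for p in pages])     # [True, False, False]
print([matches_all_terms(p, "Larry Page") for p in pages])  # [True, True, True]
```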
The + Operator
Force Google to include a term by preceding the term with a “+” sign.
To force Google to search for a particular term, put a + sign operator in
front of the word in the query. Note that you should not put a space
between the + and the word. So, to search for the satirical newspaper The
Onion, use [ +The Onion ], not [ + The Onion ].
The + operator is typically used in front of stop words that Google
would otherwise ignore or when you want Google to return only those
pages that match your search terms exactly. However, the + operator can be
used on any term.
Want to learn about Star Wars Episode One? “I” is a stop word and is
not included in a search unless you precede it with a + sign.
USE [ Star Wars +I ]
NOT [ Star Wars I ]
The - Operator
Precede each term you do not want to appear in any result with a “-” sign.
To find pages without a particular term, put a - sign operator in front
of the word in the query. The - sign indicates that you want to subtract or
exclude pages that contain a specific term. Do not put a space between the
- and the word, i.e.
USE [ dolphins -football ]
NOT [ dolphins - football ]
So, to search for a twins support group in Minnesota, but not return
pages relating to the Minnesota Twins baseball team:
USE [ twins support group Minnesota -baseball ]
NOT [ twins support group Minnesota ]
No pages containing the word “baseball” will be returned by the first
query.
Find pages on “salsa” but not the dance or dance classes.
USE [ salsa -dance -class ]
NOT [ salsa ]
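The effect of the exclusion operator can be imitated with a small filter. This is a sketch under stated assumptions: the page texts are made up, the function name search_with_exclusions is hypothetical, and the operator is typed as an ordinary hyphen directly in front of the word.

```python
# Toy filter imitating the exclusion operator; the page texts are made up.

def search_with_exclusions(pages, query):
    """Keep pages containing every plain term and none of the '-' prefixed terms."""
    terms = query.lower().split()
    required = [t for t in terms if not t.startswith("-")]
    excluded = [t[1:] for t in terms if t.startswith("-") and len(t) > 1]
    results = []
    for page in pages:
        page_words = set(page.lower().split())
        has_required = all(t in page_words for t in required)
        has_excluded = any(t in page_words for t in excluded)
        if has_required and not has_excluded:
            results.append(page)
    return results

pages = [
    "salsa recipe with fresh tomatoes",
    "salsa dance class for beginners",
    "salsa dance festival",
]
print(search_with_exclusions(pages, "salsa -dance -class"))
# ['salsa recipe with fresh tomatoes']
```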
The ~ Operator
Find synonyms by preceding the term with a ~, which is known as the
tilde or synonym operator.
The tilde (~) operator takes the word immediately following it and
searches both for that specific word and for the word’s synonyms. It also
searches for the term with alternative endings. The tilde operator works
best when applied to general terms and terms with many synonyms. As
with the + and – operators, put the ~ (tilde) next to the word, with no
spaces between the ~ and its associated word, i.e.,
USE [ ~lightweight laptop ]
NOT [ ~ lightweight laptop ]
If you don’t like the synonyms that Google suggests when you use
the ~ operator, specify your own synonyms with the OR operator, which I
describe next.
The OR Operator
Specify synonyms or alternative forms with an uppercase OR or |
(vertical bar).
The OR operator, for which you may also use | (vertical bar), applies
to the search terms immediately adjacent to it. Both examples below
find pages that include either “Tahiti” or “Hawaii” or both
terms, but not pages that contain neither “Tahiti” nor “Hawaii.”
[ Tahiti OR Hawaii ]
[ Tahiti | Hawaii ]
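The two spellings of the operator behave the same way, which a short sketch can make concrete. The function names split_alternatives and matches_any and the sample page texts are assumptions for illustration; only single-word alternatives are handled.

```python
# Toy model of the OR operator: an uppercase OR and a vertical bar are
# treated identically. Page texts are made-up examples.
import re

def split_alternatives(query):
    """Split a query on an uppercase OR or on a vertical bar."""
    return [part.lower() for part in re.split(r"\s+(?:OR|\|)\s+", query.strip())]

def matches_any(page_text, query):
    """True if the page contains at least one of the alternatives."""
    page_words = set(re.findall(r"[a-z0-9]+", page_text.lower()))
    return any(alt in page_words for alt in split_alternatives(query))

pages = [
    "Cheap flights to Tahiti",
    "Hawaii and Tahiti cruises",
    "Hiking in Fiji",
]
print([matches_any(p, "Tahiti OR Hawaii") for p in pages])  # [True, True, False]
print([matches_any(p, "Tahiti | Hawaii") for p in pages])   # [True, True, False]
```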
The .. Operator
Specify that results contain numbers in a range by specifying two
numbers, separated by two periods, with no spaces.
For example, specify that you are searching in the price range $250 to
$1000 using the number range specification $250..$1000.
[ recumbent bicycle $250..$1000 ]
Find the year the Russian Revolution took place.
[ Russian Revolution 1800..2000 ]
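A number range can be pictured as a simple numeric filter over the numbers found on a page. The sketch below is only an approximation of the idea; the function name numbers_in_range and the sample sentences are assumptions.

```python
# Toy model of the number range operator: keep the numbers on a page that
# fall between the two bounds of a "low..high" specification.
import re

def numbers_in_range(text, range_spec):
    """Return the numbers in the text that fall inside a 'low..high' range."""
    low, high = (float(part.lstrip("$")) for part in range_spec.split(".."))
    found = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", text)]
    return [n for n in found if low <= n <= high]

print(numbers_in_range("The Russian Revolution took place in 1917.", "1800..2000"))
# [1917.0]
print(numbers_in_range("Recumbent bicycle for $1500", "$250..$1000"))
# []
```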
The * Operator
Use *, an asterisk character, known as a wildcard, to match one or
more words in a phrase (enclosed in quotes).
Each * represents one or more words. Google treats the * as a
placeholder for a word or more than one word. For example, [ “Google * my
life” ] tells Google to find pages containing a phrase that starts with
“Google” followed by one or more words, followed by “my life.” Phrases that
fit the bill include: “Google changed my life,” “Google runs my life,” and
“Google is my life.”
[ “Google * my life” ]
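One way to picture the wildcard is as a regular expression in which each * stands for one or more words. This is a hedged sketch, not Google's matching logic; wildcard_to_regex is a hypothetical helper written for this example.

```python
# Toy translation of a quoted wildcard phrase into a regular expression.
import re

def wildcard_to_regex(phrase):
    """Turn a phrase with * placeholders into a compiled regular expression."""
    pieces = []
    for word in phrase.split():
        if word == "*":
            pieces.append(r"\w+(?:\s+\w+)*")  # one or more whole words
        else:
            pieces.append(re.escape(word))    # a literal word
    return re.compile(r"\b" + r"\s+".join(pieces) + r"\b", re.IGNORECASE)

pattern = wildcard_to_regex("Google * my life")
for text in ["Google changed my life", "Google is my life", "Google my life"]:
    print(text, "->", bool(pattern.search(text)))
# Google changed my life -> True
# Google is my life -> True
# Google my life -> False
```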
Querying for vulnerable sites or servers
using Google’s advanced syntaxes
The Google query syntaxes discussed above can really help
people to refine their searches and get exactly what they are looking for.
Because Google is such an intelligent search engine, malicious users do not
hesitate to exploit its ability to dig out confidential and secret
information from the Internet that is supposed to have restricted access.
The techniques by which malicious users dig out such information, using
Google as a tool, are discussed below in detail.
Using “Index of” syntax to find sites
enabled with Index browsing
A webserver with Index browsing enabled means anyone can browse
the webserver directories like ordinary local directories. This section shows
how one can use the “index of” syntax to get a list of links to webservers
that have directory browsing enabled. This becomes an easy source of
information gathering for a hacker. Imagine if they get hold of password files
or other sensitive files which are not normally visible to the Internet.
Given below are a few examples with which one can get access to a great
deal of sensitive information quite easily.
Index of /admin
Index of /passwd
Index of /password
Index of /mail
"Index of /" +passwd
"Index of /" +password.txt
"Index of /" +.htaccess
"Index of /secret"
"Index of /confidential"
"Index of /root"
Looking for vulnerable sites or servers using
“inurl:” or “allinurl:”
a. Using “allinurl:winnt/system32/” (without quotes) will list
all the links to servers which give access to restricted directories like
“system32” through the web. If you are lucky enough, you might get access
to cmd.exe in the “system32” directory. Once you have access to
“cmd.exe” and are able to execute it, you can go on to
escalate your privileges over the server and compromise it.
b. Using “allinurl:wwwboard/passwd.txt” (without quotes) in the
Google search will list all the links to servers which are vulnerable
to the “WWWBoard Password vulnerability”. To know more about this
vulnerability you can have a look at the following link:
http://www.securiteam.com/exploits/2BUQ4S0SAW.html
c. Using “inurl:.bash_history” (without quotes) will list all the links
to servers which give access to the “.bash_history” file through the web. This is
a command history file. This file includes the list of commands executed by
the administrator, and sometimes includes sensitive information such as
passwords typed in by the administrator. If this file is compromised and
contains an encrypted Unix (or *nix) password, then the password can be easily
cracked using “John The Ripper”.
d. Using “inurl:config.txt” (without quotes) will list all the links to
servers which give access to the “config.txt” file through the web. This file
contains sensitive information, including the hash value of the
administrative password and database authentication credentials. For
example, Ingenium Learning Management System is a Web-based
application for Windows-based systems developed by Click2learn, Inc.
Ingenium Learning Management System versions 5.1 and 6.1 store
sensitive information insecurely in the config.txt file. For more information
refer to the following link:
http://www.securiteam.com/securitynews/6M00H2K5PG.html
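From the defender's side, the same URL fragments can be used to audit one's own site before a search engine indexes it. The sketch below assumes the administrator already has a list of the site's public URLs (for example from a sitemap); the example.com URLs and the function name flag_risky_urls are made up for illustration.

```python
# Defensive sketch: flag URLs of your own site that contain path fragments
# mentioned above. The URL list is a made-up example.

RISKY_FRAGMENTS = [
    "winnt/system32/",
    "wwwboard/passwd.txt",
    ".bash_history",
    "config.txt",
]

def flag_risky_urls(urls, fragments=RISKY_FRAGMENTS):
    """Return (url, fragment) pairs for URLs that contain a risky fragment."""
    flagged = []
    for url in urls:
        lowered = url.lower()
        for fragment in fragments:
            if fragment in lowered:
                flagged.append((url, fragment))
    return flagged

my_urls = [
    "http://example.com/index.html",
    "http://example.com/forum/wwwboard/passwd.txt",
    "http://example.com/~admin/.bash_history",
]
for url, fragment in flag_risky_urls(my_urls):
    print(fragment, "found in", url)
# wwwboard/passwd.txt found in http://example.com/forum/wwwboard/passwd.txt
# .bash_history found in http://example.com/~admin/.bash_history
```

Any URL flagged this way should be removed from the web root or placed behind access control.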
What Google Hack Searches?
VULNERABILITIES
Almost 70% of Websites have vulnerabilities
1. Known Vulnerabilities
i. Informally communicated
ii. Chain emails
2. Information Disclosure Vulnerabilities
i. Passwords
ii. Administrative files
iii. Sensitive customer information
iv. Military information (Submarines, docking stations of Navy Ships)
v. System email id lists
vi. Medical records
vii. Bank account numbers
Crawlers - Just index/cache whatever they find.
PRIMARY REASONS
• People's negligence – called GoogleDorks
• Increase in the number of remote administrative tools
• Security holes in the networks
• Poor site configuration
e.g. Securing admin panel - .htaccess procedure (password protection on
HTML documents)
Some Google Hack Tricks
Accessing Blocked Sites:
1. Go to www.google.com
2. More > Translate
3. Enter the address of the blocked site you want to access in the given space
4. Translate from any language to English (let us take www.youtube.com)
5. Here is the blocked site (as proof, you can check the language)
Live Network Camera Hack:
1. Go to www.google.com
2. Type the query as shown below
3. Select any result from the list
4. Live view with full control
Conclusion
Google Hacking uses Google Search to find security holes in the
configuration and code that websites use. Searches are used to reveal
sensitive information, such as usernames/passwords, internal documents,
etc.
The techniques are commonly used during penetration testing/SEO.
They are used mostly to patch vulnerabilities and secure the webpage or
information.
References
http://www.googleguide.com/advanced_operators.html
http://computer.howstuffworks.com/search-engine1.htm
http://johnny.ihackstuff.com