Google needs no introduction: it is quite possibly the most powerful search engine in use today, and many of us even use it just to check our connectivity. Yet the power behind that single search bar has become a source of concern for many, and those who are not yet concerned should be, as we will see!

Besides being one of the most powerful information databases, Google can be used to find much more than we should be able to find: sensitive files, web vulnerabilities, details that identify operating systems, and even passwords, databases and whole mailbox contents. Google understands a set of operators that let us target precisely what we are seeking; I’ll try to detail the most important of them.

Google queries are not case sensitive, so lower case, upper case or any combination of both makes no difference: Security, SECURITY and SeCuriTY return exactly the same results. The one exception to this rule is logical operators.

Logical operators and symbols

Google understands three logical operators: AND, NOT and OR. They must be written in upper case: Google recognizes “OR” as the operator and treats “Or”, “oR” or “or” as ordinary keywords.

The AND operator includes more than one keyword in a single search query and can be replaced by a single space “ ”, although the results may differ slightly between the two forms, as you can see by searching for “reverse AND engineering AND tutorials” and then “reverse engineering tutorials”.

The NOT operator is extremely useful for eliminating keywords from the results of a query. It is equivalent to the minus sign “-” placed directly in front of a keyword. To see the effect, try searching for “email service” and then “email service -marketing” (note that there is no space between “-” and “marketing”).

The OR operator returns results containing one keyword or the other (or both) and is equivalent to “|”; e.g. “reverse OR engineering” means exactly the same to Google as “reverse|engineering” (try it, then try “reverse engineering” to see the difference).
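The three operators above can be put together in a small helper. This is only a sketch: the function name build_query and its parameters are my own, and it does not try to model how Google prioritises queries that mix several operators.

```python
def build_query(all_of=(), any_of=(), none_of=()):
    """Combine keywords using the logical operators described above."""
    # A plain space between keywords acts as AND.
    parts = list(all_of)
    if any_of:
        # OR must stay upper case, otherwise it is read as a keyword.
        parts.append(" OR ".join(any_of))
    # The minus sign is attached directly to the excluded keyword.
    parts.extend("-" + word for word in none_of)
    return " ".join(parts)


print(build_query(all_of=["email", "service"], none_of=["marketing"]))
# email service -marketing
print(build_query(any_of=["reverse", "engineering"]))
# reverse OR engineering
```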

In addition to these operators, Google gives special meaning to some symbols, such as ~, +, * and “”.

Using the tilde “~”

This little character includes in the results the desired keyword, its synonyms and words similar to it. For example, if you search for “it security ~tools”, the results will be broader than those of “it security tools”, since Google will also consider terms such as “software” and show them among the returned results.

Using the plus sign “+”

Google tends to ignore punctuation and drops short words such as “we”, “the”, “to” and “of”. Putting the plus sign before a word tells Google to include it in the search query, so the results of “security is never complete” will differ from those of “security +is never complete”.

Use of quotation marks “” (or exact phrase search)

If you are sure you have spelled a word correctly but Google keeps suggesting spelling corrections, or if you want to search for a phrase, a quote or an error message, putting your query between quotation marks gives you more relevant results. For example, try searching for Debugging DLLs with and without quotes.

Using the asterisk “*”, also called the wildcard or joker

The wildcard helps a lot when you want to search for something but one or more words are missing (it is generally used with exact phrase search). For example, if you want to find the title of the movie “Get the Gringo” but remember only “Get the”, you can try “Get the * movie”; try also “the art of * hacking book”.

Now that we know a little more about how the Google search bar interprets what we type in, let’s see some more interesting operators and keywords, especially when talking about security!

Define:word

This query returns the definition of the given word from the most reliable sources (websites). For example: define:Security

Filetype:file_extension

Filetype finds files with a specific extension, which means you restrict your search to a specific file type. Note that there is no space between filetype: and the word that follows. For example, we can search for database backups using “backup filetype:sql”.

Ext:file_extension

This operator has more or less the same role as filetype above, except that using “ext” to look for uncommon extensions (such as dmp, ks, key …) returns deeper and more accurate results.

Intitle:keyword(s)

This operator searches for a single word or a whole phrase in the title of web pages, and it is commonly used to find directory listings. For example: intitle:“index of” “Last modified”

You can also use allintitle:keyword1 keyword2 keyword3 … to find pages whose titles contain all of the given keywords.

Inurl:keyword

Like intitle and allintitle, inurl and allinurl can be used to find one or more keywords in the URLs of web pages. This operator is widely used and can reveal a lot of sensitive information, as with the query inurl:cgi-bin/etc/

Intext:keyword / Allintext:keyword1 keyword2 keyword3 …

Intext and allintext search for keywords in the body of web pages or documents and can be very helpful for finding interesting things, for example: allintext:“Control Panel” “login”

Site:domain

The site keyword restricts the results to a particular domain or website. You can filter by top-level domain (site:com, site:fr, site:gov …) or limit your query to a specific website: “reverse engineering site:infosecinstitute.com”.

Cache:www.site.com

Once a website is indexed by Google, there is a good chance that a copy is kept in the Google cache, so we can retrieve old information even after the website has been updated or, in some cases, even when the website is no longer available.

Info:www.site.com

This query returns links to pages containing information about the website or web page in question. For example: info:infosecinstitute.com
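Most of the operators covered so far (filetype, ext, intitle, inurl, intext, site, cache, info) share the same operator:value shape with no space around the colon. Here is a minimal sketch that builds such terms and turns them into a search URL; the helper names operator_term and search_url are my own, and quoting multi-word values is an assumption about how you want the phrase grouped.

```python
from urllib.parse import urlencode


def operator_term(operator, value):
    """Return operator:value with no space around the colon."""
    value = value.strip()
    if " " in value:
        # Assumption: quote multi-word values so they stay together.
        value = f'"{value}"'
    return f"{operator.strip().lower()}:{value}"


def search_url(*terms):
    """Build a Google search URL; terms are joined by spaces (AND)."""
    return "https://www.google.com/search?" + urlencode({"q": " ".join(terms)})


terms = ["backup", operator_term("filetype", "sql")]
print(" ".join(terms))
# backup filetype:sql
print(search_url(*terms))
# https://www.google.com/search?q=backup+filetype%3Asql
```

Stripping and lower-casing the operator name guards against the stray space before the colon, which is an easy mistake to make when typing these queries by hand.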

Google is not only good at finding stuff, it can even do math!

So far there is nothing harmful, but by combining different operators and keywords, and by knowing exactly what we want to find, the results usually exceed our expectations, especially when we are looking for vulnerabilities or “private” data. This is conventionally called Google Hacking.

According to the Wikipedia definition, Google hacking involves using advanced operators in the Google search engine to locate specific strings of text within search results. Some of the more popular examples are finding specific versions of vulnerable web applications. It is normal for default installations of applications to include their running version in every page they serve, e.g. “Powered by XOOPS 2.2.3 Final”, so a search query for that string would locate all web pages that contain it.
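If you want to know whether your own pages give away this kind of version banner, a rough check could look like the following. It is only a sketch: the regular expression is my own guess at what a “Powered by name version” banner looks like, and real applications may format theirs differently.

```python
import re

# "Powered by <name> <version>", e.g. "Powered by XOOPS 2.2.3 Final".
# The pattern is an assumption, not a catalogue of real banners.
BANNER = re.compile(
    r"Powered by\s+([A-Za-z][\w.-]*)\s+(\d+(?:\.\d+)+)",
    re.IGNORECASE,
)


def find_version_banners(html):
    """Return (application, version) pairs found in a page."""
    return BANNER.findall(html)


print(find_version_banners("<p>Powered by XOOPS 2.2.3 Final</p>"))
# [('XOOPS', '2.2.3')]
```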

Finding usernames

We will use Google to find files containing usernames, which is useful for building dictionaries, for example: allintext:username filetype:log. One of the files returned contained more than 2209 rows.

Using the same query, I also found a log recording SQL injection attacks:
2012-08-15 03:48:50 213.xxx.xx.229 cid http://www.h*****.at/index.php?option=com_yelp&controller=showdetail&task=showdetail&cid=-1+UNION+ALL+SELECT+1,2,3,concat(0x26,0x26,0x26,0x25,0x25,0x25,username,0x3a,password,0x25,0x25,0x25,0x26,0x26,0x26),5,6,7,8,9,10,11,12,13,14,15,16,17+FROM+jos_users--
2012-08-21 04:48:01 61.xxx.xxx.72 id http://www.h*****.at/index.php?option=com_recipes&Itemid=S@BUN&func=detail&id=-1/**/union/**/select/**/0,1,concat(username,0x3a,password),username,0x3a,5,6,7,8,9,10,11,12,0x3a,0x3a,0x3a,username,username,0x3a,0x3a,0x3a,21,0x3a/**/from/**/mos_users/*
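Both entries in that log use the same trick, a UNION SELECT glued onto a numeric parameter, once with “+” as separator and once with /**/ comments. A minimal sketch for spotting this pattern in your own logs might look like this; the function name and the regular expression are my own, and it only covers this one family of injections.

```python
import re
from urllib.parse import unquote_plus

# Separator between SQL keywords: whitespace or an inline /* */ comment.
SEP = r"(?:\s|/\*.*?\*/)+"
UNION_SELECT = re.compile(
    r"union" + SEP + r"(?:all" + SEP + r")?select",
    re.IGNORECASE,
)


def looks_like_sql_injection(log_line):
    """Flag log lines that contain a UNION [ALL] SELECT sequence."""
    # Decode '+' and %xx escapes first, so "UNION+ALL+SELECT" is caught.
    return bool(UNION_SELECT.search(unquote_plus(log_line)))


# Shortened versions of the two parameter styles seen in the log above.
print(looks_like_sql_injection("cid=-1+UNION+ALL+SELECT+1,2,3"))
# True
print(looks_like_sql_injection("id=-1/**/union/**/select/**/0,1"))
# True
print(looks_like_sql_injection("option=com_content&id=12"))
# False
```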

Collecting email addresses

Query: allintext:email OR mail +*gmail.com filetype:txt. This query really surprised me: the first result was a text file (not to mention the very interesting host it was found on) containing 35,572 email addresses and passwords.

We can find almost anything we want using Google if we are able to sharpen our queries enough. I enjoyed building queries that combine different keywords with different operators; some of the results are below:

Full information about a website’s customers, with their names, addresses, postal codes, cities, phone and mobile numbers and email addresses.

You can see that things are getting more serious. As you probably guessed, no one escapes Google’s indexing spiders and crawlers!

Here is an Excel file containing names, country codes, marks and bachelor courses of more than 8014 students:

Here are full database dumps from tens if not hundreds of websites, containing in some cases usernames and passwords in clear text:

I’m going to stop at this point; no more demonstration is needed. Google is certainly everyone’s friend, including malicious people with malicious intent. Before putting a file, a directory or any other information that is not supposed to be public on your server, remember to check who has access to your sensitive files and folders.

Placing an empty index.html file in a directory is a simple way to prevent basic directory listing. Think also about applying the correct permissions (CHMOD) to your sensitive directories, and limit or remove access to your uploaded backups.
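Both checks can be automated on a POSIX server. The sketch below walks a web root and reports directories without an index.html and backup-like files that are readable by everyone. The function name, the list of extensions and the idea of treating “world-readable” as the problem are my own assumptions; adapt them to your setup.

```python
import os
import stat

# Assumed list of extensions worth worrying about; adjust as needed.
SENSITIVE_EXTENSIONS = {".sql", ".bak", ".log"}


def audit_web_root(root):
    """Return (issue, path) pairs for a directory tree."""
    findings = []
    for directory, _subdirs, files in os.walk(root):
        # Without an index file, many servers list the directory content.
        if "index.html" not in files:
            findings.append(("no index.html", directory))
        for name in files:
            path = os.path.join(directory, name)
            extension = os.path.splitext(name)[1].lower()
            mode = os.stat(path).st_mode
            # S_IROTH is the "readable by others" permission bit.
            if extension in SENSITIVE_EXTENSIONS and mode & stat.S_IROTH:
                findings.append(("world-readable", path))
    return findings
```

Calling audit_web_root on your document root (for example a hypothetical /var/www/html) gives a list you can review before Google’s crawler does it for you; a chmod 600 on the flagged backups clears the second kind of finding.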

The robots.txt file can also help protect the privacy of your data: by filling it in correctly, you can prevent Google or any other search engine from indexing your website, files or directories.

The following tips may help:

Preventing Google from indexing your site:

User-agent: Googlebot
Disallow: /

Preventing every search engine from indexing your site:

User-agent: *
Disallow: /

You can also prohibit Google from indexing a specific file type:

User-agent: Googlebot
Disallow: /*.sql$

To prohibit a directory and all its content from being indexed by Google:

User-agent: Googlebot
Disallow: /directoryName/

To prohibit a specific page from being indexed by Google:

User-agent: Googlebot
Disallow: /confidential.html
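Before publishing such rules, it is worth checking that they do what you expect. Here is a small sketch using Python’s standard urllib.robotparser module; the helper name is_allowed and the example.com URLs are placeholders of mine. As far as I know this module matches paths by simple prefix, so it is suitable for the directory and page rules but not for wildcard patterns such as /*.sql$.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /directoryName/
Disallow: /confidential.html
"""


def is_allowed(robots_txt, user_agent, url):
    """Tell whether user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


base = "https://www.example.com"  # placeholder host
print(is_allowed(ROBOTS_TXT, "Googlebot", base + "/directoryName/backup.sql"))
# False
print(is_allowed(ROBOTS_TXT, "Googlebot", base + "/index.html"))
# True
print(is_allowed(ROBOTS_TXT, "Bingbot", base + "/confidential.html"))
# True
```

The last line shows why the choice of User-agent matters: rules written for Googlebot alone leave every other crawler free to index the same files.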

These tips can be used along with HTML meta tags, which you place between <head> and </head>:

<meta name="robots" content="noindex, nofollow">

You can also prevent Google from caching your website by using this:

<meta name="Googlebot" content="noarchive">
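To confirm that a page really carries these tags, you can parse it with Python’s standard html.parser module. This is a minimal sketch; the class name RobotsMetaParser and the helper robots_directives are my own, and it only looks at meta tags named robots or googlebot.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of robots-related meta tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {key: (value or "") for key, value in attrs}
        name = attributes.get("name", "").lower()
        if name in ("robots", "googlebot"):
            content = attributes.get("content", "")
            # "noindex, nofollow" becomes {"noindex", "nofollow"}.
            self.directives[name] = {
                item.strip().lower() for item in content.split(",") if item.strip()
            }


def robots_directives(html):
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = (
    '<head><meta name="robots" content="noindex, nofollow">'
    '<meta name="Googlebot" content="noarchive"></head>'
)
print(robots_directives(page) == {
    "robots": {"noindex", "nofollow"},
    "googlebot": {"noarchive"},
})
# True
```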

This non-exhaustive list of solutions may help you protect yourself against search engines, and especially against Google, but be very careful when changing how Googlebot (or any other search engine crawler) sees your website, or your pages may disappear completely from the search results!

Soufiane Tahiri is an InfoSec Institute contributor and computer security researcher, specializing in reverse code engineering and software security. He is also the founder of www.itsecurity.ma and has practiced reversing for more than 8 years. Dynamic and very involved, Soufiane is ready to catch any serious opportunity to be part of a workgroup.
Contact Soufiane in whatever way works for you:
Email: soufianetahiri@gmail.com
Twitter: https://twitter.com/i7s3curi7y
LinkedIn: http://ma.linkedin.com/in/soufianetahiri
Website: http://www.itsecurity.ma

You can find most of the Google Hacks in the book “De Google Code” Author Henk van Ess. Google helps you to find out more…type: weather zurich, 123 pound to dollar, bikes 100 … 300 euro, and so on.
IMHO it has nothing to do with hacking; it is just a very handy search engine.

And you can find the same thing on Google for free ;) … Yeah, it’s just conventionally called “Google Hacking”
