Mystery Solved – Crawled Properties in SharePoint (Part 7)

Up to this point, we have gone through the details of the following Crawled Property Categories:

Overview

Basic Category

Office Category

People Category

SharePoint Category

Mail Category

This post will cover the details of the Web Category of Crawled Properties.

Crawled Properties View – Web

The Web Crawled Property Category contains metadata about web pages, such as the code page, anchor text, page expiration date, and description. Most of the crawled properties in this category are self-explanatory.
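To make the naming concrete, here is a minimal Python sketch of how tag and attribute pairs on a page line up with property names such as A.HREF, IMG.ALT, and DESCRIPTION. It is only an illustration and not how the SharePoint HTML filter is implemented. The class name `WebPropertySketch`, the `TRACKED` subset, and the sample page are all hypothetical, and the upper-casing of meta names is an assumption based on how the names appear in this category.

```python
from html.parser import HTMLParser

# Hypothetical subset of tag/attribute pairs, labelled the way the
# Web category labels them. The real filter's list is not shown here.
TRACKED = {
    ("a", "href"): "A.HREF",
    ("img", "alt"): "IMG.ALT",
    ("img", "src"): "IMG.SRC",
    ("iframe", "src"): "IFRAME.SRC",
    ("base", "href"): "BASE.HREF",
}


class WebPropertySketch(HTMLParser):
    """Collects (property name, value) pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        attrs = dict(attrs)
        meta_name = attrs.get("name")
        meta_content = attrs.get("content")
        if tag == "meta" and meta_name and meta_content is not None:
            # Assumption: <meta name="x" content="y"> surfaces as property X.
            self.found.append((meta_name.upper(), meta_content))
            return
        for attr, value in attrs.items():
            label = TRACKED.get((tag, attr))
            if label and value is not None:
                self.found.append((label, value))


page = """<html><head>
<meta name="description" content="Team site home page">
<meta name="robots" content="noindex">
</head><body>
<a href="/sites/team/default.aspx">Home</a>
<img src="logo.png" alt="Company logo">
</body></html>"""

parser = WebPropertySketch()
parser.feed(page)
for name, value in parser.found:
    print(f"{name}: {value}")
```

Running the sketch against the sample page lists one line per detected property, which is roughly the kind of name/value pairing the Web category exposes for each crawled page.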

The following table provides details about each of the crawled properties in the Web category, based on a test environment. The table is sorted by Property Set ID (GUID), then by Name, and includes the following columns:

Name – The name of the crawled property

Represented in Crawled Properties View – The name of the crawled property in the Crawled Properties View

Property Name – A description of what the property is

PROPSET – The GUID or Property Set ID

Property Set Description – A description of the property set of which this property is a member

Variant Type – The variant type of the property

Variant Type Description – A description of the variant type

Mapped To – A listing of which managed property(ies) this crawled property is mapped to

Indexed – Indicates whether this property is included in the index from an “out of the box” installation.
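The "Represented in Crawled Properties View" column uses two shapes of name in this table: properties identified by a number carry a category prefix (for example Web:2(Text)), while named properties do not (for example IMG.ALT(Text)). The following Python sketch splits such a name into its parts. The names `ViewName` and `parse_view_name` are hypothetical, and the pattern is inferred only from the entries in this table, so treat it as an illustration rather than a documented format.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ViewName:
    category: Optional[str]  # e.g. "Web" when the property is identified by a number
    name: str                # a numeric ID such as "2", or a name such as "IMG.ALT"
    type_label: str          # e.g. "Text" or "Integer"


# Optional "Category:" prefix, then the name, then the type in parentheses.
_VIEW_NAME = re.compile(
    r"^(?:(?P<category>[^:()]+):)?(?P<name>[^()]+)\((?P<type>[^()]+)\)$"
)


def parse_view_name(text: str) -> ViewName:
    """Split a Crawled Properties View name into category, name, and type."""
    match = _VIEW_NAME.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised view name: {text!r}")
    return ViewName(match["category"], match["name"], match["type"])


for text in ["Web:2(Text)", "Web:4(Integer)", "IMG.ALT(Text)"]:
    print(parse_view_name(text))
```

Splitting the name this way makes it easier to see that Web:2(Text) and Web:2(Integer) are different properties: they share the number 2 but belong to different property sets and have different types.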

NOTE: Your environment may have more or fewer crawled properties, depending on the content being crawled.

| Name | Represented in Crawled Properties View | Property Name | PROPSET | Property Set Description | Variant Type | Variant Type Description | Mapped To | Indexed |
|---|---|---|---|---|---|---|---|---|
| 2 | Web:2(Text) | href | 70eb7a10-55d9-11cf-b75b-00aa0051fe20 | HTML Information | 31 | Text | Path(Text) | Yes |
| 3 | Web:3(Text) | HTML Heading 1 | 70eb7a10-55d9-11cf-b75b-00aa0051fe20 | HTML Information | 31 | Text | | |
| 4 | Web:4(Text) | HTML Heading 2 | 70eb7a10-55d9-11cf-b75b-00aa0051fe20 | HTML Information | 31 | Text | | |
| 5 | Web:5(Text) | HTML Heading 3 | 70eb7a10-55d9-11cf-b75b-00aa0051fe20 | HTML Information | 31 | Text | | |
| 6 | Web:6(Text) | HTML Heading 4 | 70eb7a10-55d9-11cf-b75b-00aa0051fe20 | HTML Information | 31 | Text | | |
| 7 | Web:7(Text) | HTML Heading 5 | 70eb7a10-55d9-11cf-b75b-00aa0051fe20 | HTML Information | 31 | Text | | |
| 8 | Web:8(Text) | HTML Heading 6 | 70eb7a10-55d9-11cf-b75b-00aa0051fe20 | HTML Information | 31 | Text | | |
| IMG.ALT | IMG.ALT(Text) | Alternate Text for Image | 70eb7a10-55d9-11cf-b75b-00aa0051fe20 | HTML Information | 31 | Text | | |
| INPUT.ALT | INPUT.ALT(Text) | Alternate Text for Image | 70eb7a10-55d9-11cf-b75b-00aa0051fe20 | HTML Information | 31 | Text | | |
| SELECT | SELECT(Text) | Drop Down List | 70eb7a10-55d9-11cf-b75b-00aa0051fe20 | HTML Information | 32799 | Binary Data | | Yes |
| 2 | Web:2(Integer) | Detected Language | c82bf596-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 19 | Integer | | |
| 3 | Web:3(Text) | HTML Comment | c82bf596-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| 4 | Web:4(Integer) | Code Page | c82bf596-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 19 | Integer | NLCodePage(Integer) | |
| A.ENDANCHOR | A.ENDANCHOR(Text) | End Anchor Text | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| A.HREF | A.HREF(Text) | HTML Link | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| APPLET.CODE | APPLET.CODE(Text) | Applet Code | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| APPLET.CODEBASE | APPLET.CODEBASE(Text) | Directory path of applet class | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| AREA.HREF | AREA.HREF(Text) | HTML Area Reference | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| BASE.HREF | BASE.HREF(Text) | HTML Link to only web pages | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | Yes |
| BGSOUND.SRC | BGSOUND.SRC(Text) | Background Sound | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| BODY.BACKGROUND | BODY.BACKGROUND(Text) | Background color/image of page | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| EMBED.SRC | EMBED.SRC(Text) | Embedded source in page | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| FRAME.SRC | FRAME.SRC(Text) | Source of frame in page | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| GENERATOR | GENERATOR(Text) | | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| IFRAME.SRC | IFRAME.SRC(Text) | URL of page in iFrame | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| IMG.DYNSRC | IMG.DYNSRC(Text) | Dynamic Source of media Image | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| IMG.SRC | IMG.SRC(Text) | Source of image | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| INPUT.SRC | INPUT.SRC(Text) | URL of image to display | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| LINK.HREF | LINK.HREF(Text) | URL of linked web resource | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| LINK.OFFICECHILD | LINK.OFFICECHILD(Text) | URL of child links | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| LINK.OFFICECHILDLIST | LINK.OFFICECHILDLIST(Text) | URLs of child lists | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| LINK.STYLESHEET | LINK.STYLESHEET(Text) | URL of linked style sheet | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| META.URL | META.URL(Text) | URL metadata | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| OBJECT.CODE | OBJECT.CODE(Text) | URL of file containing compiled Java Class | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| OBJECT.CODEBASE | OBJECT.CODEBASE(Text) | URL of the component | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| SCRIPT | SCRIPT(Text) | Script for the page | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| STYLE | STYLE(Text) | Settings of inline styles for an element | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| TABLE.BACKGROUND | TABLE.BACKGROUND(Text) | Color/image of table background | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| TD.BACKGROUND | TD.BACKGROUND(Text) | Background color/image of table cell | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| TH.BACKGROUND | TH.BACKGROUND(Text) | Background of table header | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| TR.BACKGROUND | TR.BACKGROUND(Text) | Background of table row | c82bf597-b831-11d0-b733-00aa00a1ebd2 | NetLib Info | 31 | Text | | |
| COLLABORATIONSERVER | COLLABORATIONSERVER(Text) | Meta name of SharePoint Team Web Site | d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 | HTML Meta Info | 31 | Text | | |
| CONTENT-TYPE | CONTENT-TYPE(Text) | Content Type | d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 | HTML Meta Info | 31 | Text | | |
| DESCRIPTION | DESCRIPTION(Text) | Description | d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 | HTML Meta Info | 31 | Text | Description(Text) | Yes |
| EXPIRES | EXPIRES(Text) | Expire | d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 | HTML Meta Info | 31 | Text | | |
| GENERATOR | GENERATOR(Text) | | d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 | HTML Meta Info | 31 | Text | | |
| HELPAWSKEYWORD | HELPAWSKEYWORD(Text) | | d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 | HTML Meta Info | 31 | Text | | Yes |
| PROGID | PROGID(Text) | Program ID | d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 | HTML Meta Info | 31 | Text | | |
| ROBOTS | ROBOTS(Text) | Robots | d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 | HTML Meta Info | 31 | Text | | |
| TOPIC | TOPIC(Text) | Topic | d1b5d3f0-c0b3-11cf-9a92-00a0c908dbf1 | HTML Meta Info | 31 | Text | | Yes |

Five categories remain that we have not yet covered:

Business Data

Internal

Notes

Tiff

XML

When I gather more information on the above Crawled Property Categories, I'll update this blog. I sure hope you've enjoyed this series of technical drill-downs into the crawled properties in Microsoft Office SharePoint Server 2007.

Anne, do you have any info about how properties from PDF files are handled? I’m specifically interested in the Title value. I see a value when I open a file in Adobe Pro, and it shows up as the Title value in the search results web part, but it does not show up in the Title column of the doc library when I upload the file. I am confused about what is going on. Do you have any insight that you can share?