ArcGIS Server

Edit big data file share manifests in Manager

Big data file shares are registered as a data store through ArcGIS Server Manager on your ArcGIS GeoAnalytics Server. A big data file share requires a manifest to outline the schema of the data, as well as the fields and formats that represent geometry and time in a dataset. The manifest is automatically generated when you register a big data file share. You may need to modify the manifest if your data changes, or if manifest generation was unable to determine all the information needed, for example, if the automatically generated manifest did not select the correct field for the geometry or time.

You can view and edit the datasets and manifest information through ArcGIS Server Manager on your ArcGIS GeoAnalytics Server.

Edit a big data file share

Once you have registered a big data file share, you can view and edit attributes and settings for that item's registered datasets by opening the big data file share manifest editor.

For example, you may want to verify the number of datasets within a registered file share. If, in doing so, you do not see the expected number of datasets in the registered file share, you should check whether the registered location contains valid datasets.

You may also want to review dataset schemas for a registered big data file share. You can modify a selected dataset's schema by updating its geometry, time definition, and field names in its associated manifest resource.

On the advanced tab of the big data file share manifest editor, you can upload a hints file to provide information about a dataset, such as the presence or absence of a header row, encoding, field delimiter, or record terminator. Regenerating the manifest after uploading a hints file will use the information provided to generate the manifest.

Optionally, you can download the manifest, edit it, and upload the edited manifest file.

Edit big data file share datasets

In the big data file share manifest editor, you can view a selected big data file share and the datasets that have been successfully registered within it. When you select a dataset from the editor drop-down menu, the corresponding parameters are populated. For details about each option on this dialog box, see Big data file share editing parameters. To edit dataset parameters, do the following:

Click the Edit pencil to see details and options for corresponding datasets.

Click the Datasets tab to show the registered datasets and their corresponding parameters.

Select a dataset from the drop-down menu to view the information represented in its manifest. Make updates to your dataset properties as needed.

When you have finished editing dataset properties, click Save.

Edit a big data file share manifest or hints file

On the Advanced tab of the big data file share editor, you can edit the associated manifest or hints file by choosing its respective tab. If you upload a manifest, it will overwrite any changes you have made to your big data file share manifest in the editor, and replace the current manifest. To learn more about the big data file share manifest, see Understanding a big data file share manifest. To learn more about using a hints file, see Understanding the hints file. To edit a big data file share manifest or hints file, do the following:

If you upload a hints file, be sure to regenerate the manifest. When you regenerate a manifest, only datasets with hints or new datasets will be updated, and changes made to any other datasets not in the hints file will remain the same.

Regenerate the manifest for a big data file share

After a big data file share is created and a manifest has been generated, a regenerate manifest button appears for each entry on the Registered Data Stores dialog box.

You can regenerate a manifest if you have added new data or if you have uploaded a hints file using the edit resource. The hints file provides specifications that are used when regenerating the manifest.

Note:

Regenerating a manifest updates the entries for new datasets and for existing datasets that have hints. Any edits you have made to the manifest for those datasets will be overwritten with the rules defined in the hints file.

Big data file share editing parameters

The big data file share editor comprises the following five sections:

Dataset selector

Fields

Geometry

Time

Dataset format

It is recommended to use a hints file before editing your data if manifest generation did not correctly determine field names, encoding, field delimiters, or quote characters.

Dataset selector

A manifest is composed of one or more datasets. The number of datasets is dependent on the number of folders in your big data file share location. When you open the manifest manager, you can view the datasets that have been successfully registered in your big data file share. When you select a dataset from the drop-down menu, the dataset parameters will be populated with the dataset information.

If you expected to find more datasets in your manifest or are missing any, do the following:

Check that your input data is in an allowable format, such as a collection of delimited files, shapefiles, parquet, or ORC.

Ensure that the schema of your input dataset of interest is consistent for a collection of files (all files in a single dataset must have the same fields).
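
The two checks above can be scripted before you register a share or regenerate its manifest. The following is a minimal sketch, assuming a share of delimited files with header rows, organized as one subfolder per dataset. The function name and its defaults are hypothetical and are not part of ArcGIS.

```python
import csv
from pathlib import Path


def find_inconsistent_datasets(share_root, extension=".csv", delimiter=","):
    """Return {dataset folder: {header tuple: [file names]}} for folders
    whose delimited files do not all share the same header row."""
    problems = {}
    # Each immediate subfolder of the share is treated as one dataset.
    for folder in sorted(p for p in Path(share_root).iterdir() if p.is_dir()):
        headers = {}
        for path in sorted(folder.glob(f"*{extension}")):
            with path.open(newline="", encoding="utf-8") as handle:
                # Only the first record (assumed to be the header row) is read.
                header = tuple(next(csv.reader(handle, delimiter=delimiter), ()))
            headers.setdefault(header, []).append(path.name)
        # More than one distinct header means the files disagree on fields.
        if len(headers) > 1:
            problems[folder.name] = headers
    return problems
```

An empty result means every dataset folder has a single, consistent header. A folder that appears in the result lists each distinct header with the files that use it, which narrows down the files to fix.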

Fields

The fields section lists all of the fields in a dataset. When you select a dataset, you will be able to see the following for each field:

The name of the field.

The field type.

The field name and type can be modified for delimited files. If you are modifying more than one field name, it is recommended to use a hints file.

If the input dataset is a delimited file, there will be multiple parameters that can be modified in the manifest in Manager.

Geometry

The geometry section lists the type of geometry, and how it is represented. The following table outlines the available options, with notes for changes you can make depending on the input dataset type:

Geometry parameters

| Parameter | Description | Delimited files | Shapefiles | ORC files | Parquet files |
| --- | --- | --- | --- | --- | --- |
| Geometry | The geometry type. Options are Point, Polyline, Polygon, or None. If there is no geometry, the input is a table. | Editable | Cannot be modified | Editable | Editable |
| Spatial reference (WKID/WKT) | The spatial reference of the dataset. This option is only shown if the dataset is not a table. | Editable. By default, it is set to 4326 (WGS 1984). | Cannot be modified | Editable | Editable |
| Geometry formatting type | How the geometry is formatted for each feature. Options are XYZ (fields that represent X, Y, and optionally Z values; XYZ is only applicable to points), WKT (well-known text), GeoJson, EsriJson, and shape. This option is only available if the dataset is not a table and not a shapefile. | Editable | Not available | Editable | Editable |
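
To make the formatting types concrete, the sketch below writes one point as plain X and Y values and in three of the listed text encodings (WKT, GeoJSON, and Esri JSON). The function name and the sample coordinates are illustrative only. The output follows the general WKT, GeoJSON, and Esri JSON conventions and is not taken from the manifest editor.

```python
import json


def point_representations(x, y, wkid=4326):
    """Return the same point in several common geometry encodings."""
    return {
        # XYZ: separate fields hold the coordinate values (points only).
        "XYZ": {"x": x, "y": y},
        # WKT: a single text field such as "POINT (x y)".
        "WKT": f"POINT ({x} {y})",
        # GeoJSON geometry object serialized into a single text field.
        "GeoJson": json.dumps({"type": "Point", "coordinates": [x, y]}),
        # Esri JSON geometry object, which can carry its spatial reference.
        "EsriJson": json.dumps(
            {"x": x, "y": y, "spatialReference": {"wkid": wkid}}
        ),
    }


for name, value in point_representations(-117.19, 34.05).items():
    print(name, value)
```

Looking at a sample value from your own geometry field next to these shapes is a quick way to decide which formatting type to select for a dataset.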

Time

The time section outlines how time is represented. The following table outlines the available options, with notes for changes you can make depending on the input dataset type. Time options are the same for all data types, except where noted.

Time parameters

| Parameter | Description | Example |
| --- | --- | --- |
| Time type | The type of the input time. Options are Instant (a single moment in time), Interval (a span of time with a start and end time), and None. | Instant |
| Time zone | The time zone of the input time. This option is only available if Time Type is not None. | UTC |
| Name and formatting table for time | This table selects the time field or fields, and outlines how time is defined. Time can be defined by one or more fields, and a single field can use one or more formats. By default, the first field with the name "time" will be used as the time field, with an estimate of the time format. If there is a shapefile, the first field of type "date" will be used. If time is of type Interval, there must be a start and end time specified. The time formatting table is only available if Time Type is not None. | Example with a single field used to represent time with two different formats: Name: TimeField, Format: yy/MM/dd hh:mm:ss; Name: TimeField, Format: yyyy-MMM-dd hh:mm:ss. Example with two fields used to represent time: Name: DateField, Format: yy/MM/dd; Name: TimeField, Format: hh:mm:ss. |

Time formats

The following table outlines how to represent time when you edit a big data file share through ArcGIS Server Manager or directly in a manifest. The examples show how to represent the time January 2nd, 2016 at 9:45:02.05 PM.

Time formats in big data file shares

| Symbol | Meaning | Example |
| --- | --- | --- |
| yy | The year, represented by two digits. | 16 |
| yyyy | The year, represented by four digits. | 2016 |
| MM | The month, represented numerically. | 01 or 1 |
| MMM | The month, represented using three letters. | Jan |
| MMMM | The month, represented using the complete spelling. | January |
| dd | The day. | 02 or 2 |
| HH | The hour when using a 24-hour day; values range from 0-23. | 21 |
| hh | The hour when using a 12-hour day; values range from 1-12. | 9 |
| mm | The minute; values range from 0-59. | 45 |
| ss | The second; values range from 0-59. | 02 |
| SSS | The millisecond; values range from 0-999. | 50 |
| a | The AM/PM marker. | PM |
| epoch_millis | The time in milliseconds from epoch. | 1509581781000 |
| epoch_seconds | The time in seconds from epoch. | 1509747601 |
| Z | The time zone offset expressed in hours. | -0100 or -01:00 |
| ZZZ | The time zone offset expressed using IDs. | America/Los_Angeles |

The following table shows examples for different formats of the same date, January 2nd, 2016 at 9:45:02.05 PM:

Time format examples

| Input date | Date format |
| --- | --- |
| 01/02/2016 9:45:02PM | MM/dd/yyyy hh:mm:ssa |
| Jan02-16 21:45:02 | MMMdd-yy HH:mm:ss |
| January 02 2016 9:45:02.050PM | MMMM dd yyyy hh:mm:ss.SSSa |
| 01/02/2016T21:45:02-0000 | MM/dd/yyyy'T'HH:mm:ssZ |
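
Before saving a time format, it can help to test it against a few sample values from your data. The sketch below is one way to do that locally in Python. It is an assumption-laden approximation, not how GeoAnalytics Server parses time: the helper names are hypothetical, the mapping from the symbols above to Python strptime directives is approximate (for example, SSS is mapped to fractional seconds), month names depend on the Python locale, and time zone IDs (ZZZ) are not handled.

```python
import re
from datetime import datetime, timezone

# Approximate mapping from the documented symbols to strptime directives.
_SYMBOLS = {
    "yyyy": "%Y", "yy": "%y",
    "MMMM": "%B", "MMM": "%b", "MM": "%m",
    "dd": "%d", "HH": "%H", "hh": "%I",
    "mm": "%M", "ss": "%S", "SSS": "%f",
    "a": "%p", "Z": "%z",
}
# Quoted literals first, then symbols (longest first), then any single character.
_TOKEN = re.compile(
    r"'[^']*'|yyyy|yy|MMMM|MMM|MM|dd|HH|hh|mm|ss|SSS|ZZZ|Z|a|.", re.DOTALL
)


def to_strptime(fmt):
    """Translate a format such as MM/dd/yyyy hh:mm:ssa into strptime syntax."""
    parts = []
    for token in _TOKEN.findall(fmt):
        if token == "ZZZ":
            raise ValueError("time zone IDs (ZZZ) are not handled by this sketch")
        if token in _SYMBOLS:
            parts.append(_SYMBOLS[token])
        else:
            # Strip the quotes from literals such as 'T'; keep other text as is.
            quoted = len(token) >= 2 and token[0] == "'" and token[-1] == "'"
            literal = token[1:-1] if quoted else token
            parts.append(literal.replace("%", "%%"))
    return "".join(parts)


def parse_time(value, fmt):
    """Parse a sample value with a documented format; raises ValueError on mismatch."""
    if fmt == "epoch_millis":
        return datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)
    if fmt == "epoch_seconds":
        return datetime.fromtimestamp(int(value), tz=timezone.utc)
    return datetime.strptime(value, to_strptime(fmt))


print(parse_time("01/02/2016 9:45:02PM", "MM/dd/yyyy hh:mm:ssa"))
print(parse_time("Jan02-16 21:45:02", "MMMdd-yy HH:mm:ss"))
```

A ValueError from parse_time suggests that the format does not match the sample value, which is worth resolving before the format is saved to the manifest.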

Dataset format

The dataset format section outlines the format the data is in. Data may be in one of the following formats:

Shapefile (.shp)

Delimited file (for example .csv)

Parquet file

ORC file

The available parameters differ depending on the dataset. For shapefiles, ORC and parquet files, the only parameter is the file type, which cannot be modified. If the input dataset is a delimited file, there will be multiple parameters that can be modified in the manifest in Manager. These are outlined in the following table:

Dataset formats

| Parameter | Description |
| --- | --- |
| File extension | Lists the file type extension on the input dataset. Common formats are .csv and .txt. This information can be included in the hints file. |
| Field delimiter | Determines the delimiter for each field. Common formats are , and ;. This information can be included in the hints file. |
| Record terminator | Determines the terminator for each row of data. Common formats are \n and \t. This information can be included in the hints file. |
| Quote character | Determines the character used for quotes. This information can be included in the hints file. |
| Has header row | A Boolean that determines if the input table includes a header row. If a header row is included, the headers will be used for the field names. Field name information is used in predicting geometry and time fields. Headers can be set using the hints file. |
| Encoding | The type of encoding used on the file. By default, this will be UTF-8. This can be set in the hints file. |
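
To see how these parameters interact, the sketch below applies them to the raw bytes of a delimited file using Python's csv module. It is an illustration under stated assumptions, not the parser used by GeoAnalytics Server: the function name is hypothetical, the generated col_1, col_2 names for files without a header row are placeholders, and splitting on the record terminator means quoted values that contain the terminator are not supported.

```python
import csv


def read_delimited(raw_bytes, field_delimiter=",", record_terminator="\n",
                   quote_char='"', has_header_row=True, encoding="UTF-8"):
    """Return (field_names, rows) for a delimited file's raw bytes."""
    # Encoding: decode the bytes before any splitting takes place.
    text = raw_bytes.decode(encoding)
    # Record terminator: split into records and drop empty ones.
    records = [record for record in text.split(record_terminator) if record]
    # Field delimiter and quote character: split each record into values.
    rows = list(csv.reader(records, delimiter=field_delimiter,
                           quotechar=quote_char))
    if not rows:
        return [], []
    if has_header_row:
        # Has header row: the first record supplies the field names.
        names, rows = rows[0], rows[1:]
    else:
        # Placeholder names; the names used in a real manifest may differ.
        names = [f"col_{i}" for i in range(1, len(rows[0]) + 1)]
    return names, rows


sample = b'id;name\n1;"Smith; J"\n2;Lee\n'
print(read_delimited(sample, field_delimiter=";"))
```

Reading a small sample of a file this way with different settings shows why an incorrect delimiter or header setting produces the wrong number of fields or the wrong field names.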