geostats

Description

Use the geostats command to generate statistics to display geographic data and summarize the data on maps.

The command generates statistics which are clustered into geographical bins to be rendered on a world map.
The events are clustered based on latitude and longitude fields in the events. Statistics are then evaluated on the generated clusters. The statistics can be grouped or split by fields using a by clause.
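As a rough illustration of the clustering described above, the following Python sketch assigns events to fixed-size latitude and longitude bins and counts the events in each bin, split by a field. It is not the geostats implementation. The bin origin at (-90, -180), the helper names bin_index and count_by_bin, and the sample events are assumptions made for this example only.

```python
import math
from collections import Counter, defaultdict

def bin_index(lat, lon, binspanlat=22.5, binspanlong=45.0):
    """Return a (row, col) bin for a coordinate.

    Assumption: bins are counted from the south-west corner (-90, -180).
    """
    rows = math.ceil(180.0 / binspanlat)
    cols = math.ceil(360.0 / binspanlong)
    # Clamp so that lat=90 and lon=180 fall into the last bin.
    row = min(int((lat + 90.0) // binspanlat), rows - 1)
    col = min(int((lon + 180.0) // binspanlong), cols - 1)
    return row, col

def count_by_bin(events, by, latfield="lat", longfield="lon"):
    """Count events per bin, split by the value of the `by` field."""
    counts = defaultdict(Counter)
    for event in events:
        key = bin_index(event[latfield], event[longfield])
        counts[key][event[by]] += 1
    return counts

# Hypothetical sample events.
events = [
    {"lat": 40.7, "lon": -74.0, "product": "A"},
    {"lat": 40.7, "lon": -74.0, "product": "A"},
    {"lat": 42.4, "lon": -71.1, "product": "B"},
    {"lat": 51.5, "lon": -0.1, "product": "A"},
]
counts = count_by_bin(events, by="product")
```

With the default bin spans, the first three events fall into the same bin and the fourth falls into a different one, so the statistics are evaluated per bin and per split-by value.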

To make map rendering and zooming efficient, the geostats command generates clustered statistics at several zoom levels in a single search, and the visualization selects among them. The number of zoom levels is controlled by the binspanlat, binspanlong, and maxzoomlevel options. The initial granularity is set by the binspanlat and binspanlong options. At each successive zoom level, the number of bins doubles in both dimensions, for a total of 4 times as many bins at each zoom in.
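The bin arithmetic above can be worked through with a short Python sketch. It only reproduces the numbers stated on this page (initial grid from the bin spans, doubling in each dimension per level). The function name zoom_grid is hypothetical, and the sketch assumes the grid covers 180 degrees of latitude and 360 degrees of longitude.

```python
import math

def zoom_grid(binspanlat=22.5, binspanlong=45.0, maxzoomlevel=9):
    """Return a list of (zoom level, rows, cols, total bins) for levels 0..maxzoomlevel."""
    rows = math.ceil(180.0 / binspanlat)   # latitude bins at zoom level 0
    cols = math.ceil(360.0 / binspanlong)  # longitude bins at zoom level 0
    levels = []
    for level in range(maxzoomlevel + 1):
        levels.append((level, rows, cols, rows * cols))
        # Each zoom in doubles the bins in both dimensions (4x in total).
        rows *= 2
        cols *= 2
    return levels

for level, rows, cols, total in zoom_grid():
    print(f"zoom {level}: {rows} x {cols} = {total} bins")
```

With the default options, level 0 is an 8 x 8 grid of 64 bins and level 9 is a 4096 x 4096 grid.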

Required arguments

stats-agg-term

Syntax: <stats-func> ( <evaled-field> | <wc-field> ) [AS <wc-field>]

Description: A statistical aggregation function. See Stats function options. Use the AS clause to place the result into a new field with a name that you specify. The function can be applied to an eval expression, or to a field or set of fields. You can use wildcard characters in the field name.

Optional arguments

binspanlat

Syntax: binspanlat=<float>

Description: The size of the bins in latitude degrees at the lowest zoom level.

Default: 22.5. If the default values for binspanlat and binspanlong are used, a grid size of 8x8 is generated.

binspanlong

Syntax: binspanlong=<float>

Description: The size of the bins in longitude degrees at the lowest zoom level.

Default: 45.0. If the default values for binspanlat and binspanlong are used, a grid size of 8x8 is generated.

by-clause

Syntax: BY <field>

Description: The name of the field to group by.

globallimit

Syntax: globallimit=<int>

Description: Controls the number of named categories to add to each pie-chart. There is one additional category called "OTHER" under which all other split-by values are grouped. Setting globallimit=0 removes all limits and all categories are rendered. Currently the grouping into "OTHER" only works intuitively for count and additive statistics.
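The effect of globallimit on additive statistics can be sketched in Python as follows. This is an illustration, not the geostats implementation: the function name apply_globallimit is hypothetical, and ranking the categories by their total is an assumption, because this page does not state how the named categories are chosen.

```python
from collections import Counter

def apply_globallimit(counts, globallimit=10):
    """Keep the `globallimit` largest categories and fold the rest into "OTHER".

    Assumption: categories are ranked by their total value.
    """
    if globallimit == 0:
        # No limit: every category is kept as is.
        return dict(counts)
    ranked = Counter(counts).most_common()
    kept = dict(ranked[:globallimit])
    other = sum(value for _, value in ranked[globallimit:])
    if other:
        kept["OTHER"] = other
    return kept

# Hypothetical per-category counts.
counts = {"A": 50, "B": 30, "C": 15, "D": 5}
limited = apply_globallimit(counts, globallimit=2)
```

Summing the remaining values into "OTHER" is meaningful for counts and sums because they are additive. The same folding applied to a statistic such as an average would not give the average of the underlying events, which is why the grouping is only intuitive for additive statistics.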

Default: 10

locallimit

Syntax: locallimit=<int>

Description: Specifies the limit for series filtering. When you set locallimit=N, the top N values are filtered based on the sum of each series. If locallimit=0, no filtering occurs.

latfield

Syntax: latfield=<field>

Description: Specify a field from the pre-search that represents the latitude coordinates to use in your analysis.

Default: lat

longfield

Syntax: longfield=<field>

Description: Specify a field from the pre-search that represents the longitude coordinates to use in your analysis.

Default: lon

maxzoomlevel

Syntax: maxzoomlevel=<int>

Description: The maximum level to be created in the quad tree.

Default: 9. Specifies that 10 zoom levels are created, 0-9.

outputlatfield

Syntax: outputlatfield=<string>

Description: Specify a name for the latitude field in your geostats output data.

Default: latitude

outputlongfield

Syntax: outputlongfield=<string>

Description: Specify a name for the longitude field in your geostats output data.

Default: longitude

translatetoxy

Syntax: translatetoxy=<bool>

Description: If true, geostats produces one result for each geographic bin. This mode is appropriate for rendering on a map. If false, geostats produces one result per category (or per tuple of values, for a dataset split on multiple fields) for each geographic bin. Essentially, this causes the data to be broken down by category. This mode cannot be rendered on a map.
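The difference between the two modes is mainly in how many results are produced per bin. The Python sketch below illustrates the two shapes under stated assumptions: the function names and the "category" and "count" keys are hypothetical, and the exact field layout that geostats emits is not described on this page. The "latitude" and "longitude" keys follow the defaults of outputlatfield and outputlongfield.

```python
def to_map_rows(bin_counts):
    """One row per bin, with one column per category (translatetoxy=true shape)."""
    rows = []
    for (lat, lon), per_category in sorted(bin_counts.items()):
        row = {"latitude": lat, "longitude": lon}
        row.update(per_category)
        rows.append(row)
    return rows

def to_category_rows(bin_counts):
    """One row per (bin, category) pair (translatetoxy=false shape)."""
    rows = []
    for (lat, lon), per_category in sorted(bin_counts.items()):
        for category, count in sorted(per_category.items()):
            rows.append(
                {"latitude": lat, "longitude": lon, "category": category, "count": count}
            )
    return rows

# Hypothetical binned counts keyed by the bin's (latitude, longitude).
bin_counts = {
    (40.0, -75.0): {"A": 2, "B": 1},
    (50.0, 0.0): {"A": 1},
}
map_rows = to_map_rows(bin_counts)            # 2 rows, one per bin
category_rows = to_category_rows(bin_counts)  # 3 rows, one per bin and category
```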

Default: true

Stats function options

stats-func

Syntax: The syntax depends on the function that you use. Refer to the table below.

Description: Statistical functions that you can use with the geostats command. Each time you invoke the geostats command, you can use one or more functions. However, you can only use one BY clause. See Usage.

Usage

Functions and memory usage

Some functions are inherently more expensive, from a memory standpoint, than other functions. For example, the distinct_count function requires far more memory than the count function. The values and list functions also can consume a lot of memory.

If you are using the distinct_count function without a split-by field or with a low-cardinality split-by field, consider replacing the distinct_count function with the estdc function (estimated distinct count). The estdc function might result in significantly lower memory usage and run times.

Basic examples

1. Use the default settings and calculate the count

Cluster events by default latitude and longitude fields "lat" and "lon" respectively. Calculate the count of the events.

... | geostats count

2. Specify the latfield and longfield and calculate the average of a field

Compute the average rating for each gender after clustering/grouping the events by "eventlat" and "eventlong" values.

... | geostats latfield=eventlat longfield=eventlong avg(rating) by gender

Extended examples

3. Count each product sold by a vendor and display the information on a map

Note: This example uses the Search Tutorial data files tutorialdata.zip, and the two lookup files (prices.csv and vendors.csv). To use this example with your Splunk deployment, you must complete the steps in the Use field lookups section of the tutorial for both the prices.csv and the vendors.csv files. You can skip the step in the tutorial that makes the lookups automatic.

This search uses the stats command to narrow down the number of events that the lookup and geostats commands have to process.

Use the following search to compute the count of each product sold by a vendor and display the information on a map.

In this case, the sourcetype=vendor_sales and each of the events looks like this:

[26/Sep/2015:18:24:02] VendorID=5036 Code=B AcctID=6024298300471575

The prices_lookup is used to match the Code field in each event to a product_name in the table. The vendors_lookup is used to output all the fields in vendors.csv: Vendor, VendorCity, VendorID, VendorLatitude, VendorLongitude, VendorStateProvince, VendorCountry that match the VendorID in each event.

Note: In this search, the .csv files are uploaded and the lookups are defined but are not automatic.

This search produces a table displayed on the Statistics tab:

On the Visualizations tab, you should see the information on a world map. In the screen shot below, the map is zoomed in and the mouse is over the pie chart for a region in the northeastern USA:
