Tagged as

Stats

Working with the dtSearch® ASP.NET Core WebDemo Sample Application

Some features of the ASP.NET Core WebDemo application dtSearch provides in the dtSearch Engine product include a scrolling word list, faceted search navigation, and multicolor hit highlighting.

Editorial Note

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Introduction

In this article, I’m going to show you some of the features of the ASP.NET Core WebDemo application dtSearch provides in the dtSearch Engine product. Features include a scrolling word list, faceted search navigation, and multicolor hit highlighting. The features outlined in this article are available in the dtSearch Engine version 7.91 build 8546 or later. Working with the demo also requires Visual Studio 2017 updated to 15.5 or later.

One of the things developers have appreciated over the many releases of the dtSearch developer products are the code samples dtSearch provides to show how to use various features. Unlike some product companies that give short snippets of code to show how to use a feature, dtSearch shows how to use many features in a real world way, often allowing developers to use the code with only minor changes in their own applications. The ASP.NET Core WebDemo sample application is a great example of this.

Example folder for NetStd

The .NET Core implementation is also a good starting point for developing cross-platform applications. You can read more on this from Microsoft here. dtSearch highlights this by including sample code for not only the ASP.NET Core WebDemo, but also other platforms using the .NET Standard library as shown here in the example folder for NetStd.

Setting Up the Sample Data

The *.gz files contain a series of *.nc files, each representing one filing. The *.nc files in turn contain metadata and documents in a unique SEC format. dtSearch has a utility, EdgarExtract.exe, that will unpack a single *.gz file to a folder tree, like this:

Key Demo Features

Because of the depth of the WebDemo, instead of going line by line, I will point out various objects and the features they encapsulate. The first of these objects is the Index.cshtml object and its accompanying C# file. This file is well worth reading on your own.

Index.cshtml and Index.cshtml.cs

The SearchModel class handles setting up and executing a SearchJob.

The Index.chtml file is made up mainly of the SearchModel class.

It allows both POST and GET requests. POST requests will come from the search form. GET requests will come from links created in search results pages to initiate additional searches. Two types of links are created:

searches to go to other pages of search results, generated by the SearchPager class, and

searches to handle faceted searches, generated by the SearchFacets class

Displaying various options in AppSettings.json/AppSettings.cs.

Indexes are identified by IndexId rather than by path to avoid exposing index paths over the web. A table in AppSettings.json/AppSettings.cs provides the mapping between index ids and index paths.

Scrolling Word List

The scrolling word list is implemented on the server side in WordsController.cs, which uses the dtSearch WordListBuilder control to enumerate words. The client-side JavaScript is in wwwroot\js\for_WordList.js and uses AJAX to get words to display from the server.

Use the IndexCache object and the WordListBuilder together for optimal performance. (Otherwise, the WordListBuilder would need to open the index for every incoming request.) The IndexCache is set up as a singleton in Startup.cs and passed to WordsController.cs and SearchModel.cs using dependency injection.

Document Display

The demo includes options to display retrieved documents in an IFRAME (ViewDoc.cshtml) or in a separate window (ViewDocEmbedded.cshtml).

The demo does not maintain session state so the links to open each document must contain all of the information needed to highlight hits in that document. The GetSearchResultsItem method in SearchModel handles this, using the SearchResults.UrlEncode* methods to generate a UrlEncoded expression to use in the link HREF attribute. A set of corresponding UrlDecode* methods are used to re-generate the SearchResults item when a link is clicked.

There are "WithIndexId" versions of these methods that allow a supplied IndexId to be used instead of the index path. This is done to avoid exposing index paths to web users.

Multicolor hit highlighting (using a different color for each search term) can be enabled through an option in appSettings.cs. Multicolor highlighting requires more information to be generated at search time because the dtSearch Engine needs to know which hit matches each search term. Three flags at search time request this information: dtsSearchWantHitsByWord, dtsSearchWantHitsArray, and dtsSearchWantHitsByWordOrdinals.

For more information on multicolor hit highlighting, check out this article.

Because this information is generated by a search, the link for each search results item needs to contain enough information to run a search that just retrieves that item. The GetSearchResultsItem method in SearchModel handles this, using the SearchResults. UrlEncodeItemAsSearchWithIndexId method in the API.

SearchPager.cs

The SearchPager class handles paging of search results. It does this by URL-encoding the search (using SearchUrlEncoder) that generates each page of results. A Bootstrap pagination control in SearchResults.cshtml is used to display the links to generate each page of results.

SearchFacets.cs and FacetsColumn.cshtml

SearchFacets handles implementing the faceted search interface, and FacetsColumn.cshtml displays the faceted search information. For further information about implementing a faceted search interface with the dtSearch Engine, please see:

For each index, the facets applicable to the index are listed in the index table in AppSettings.json.

The facets in the demo come from the SEC Edgar database. The SEC Edgar database is a public collection of securities filings. The three facets in the demo are:

CompanySic: Numeric "SIC" code used in Edgar to identify the type of business. EdgarExtract adds descriptive text so the field has the descriptive text with the numeric code in brackets. For example, SIC Code 1311 is represented as "Crude Petroleum & Natural Gas [1311]".

CompanyState: A two letter code that the SEC uses to identify the state or country where the company is located. For states, this is just the usual abbreviation (MD=Maryland), but for countries the codes are pretty random. Again, EdgarExtract adds descriptive text with the two-letter code following in brackets.

FilingType: The SEC form that the filing corresponds to.

The facet tables in AppSettings.json, CompanySic and CompanyState are marked with the BracketKeys property, which tells the demo that the key value in a facet is in brackets at the end of the facet. For example, if someone clicks on the facet "Crude Petroleum & Natural Gas [1311]", this will generate a search that includes the condition "SicCode contains 1311".

Summary

dtSearch by far supplies the most complete code I have ever worked with for their various demonstrations. The dtSearch ASP.NET Core WebDemo is no exception. Whether you are looking for examples of using the faceted searches, the WordListBuilder, or many other features using the ASP.NET Core code, this demo is a good starting point.

Share

About the Author

Scott Barker, a former member of the Microsoft team, has been developing with dtSearch products for use in both web and desktop based applications for many years now. This includes a wide range of applications from managing documents on unknown network setups for government agencies, including indexing IRS forms using dtSearch Spider product against the IRS Website, to using the dtSearch engine for indexing email messages for use with a Deep Learning Engine. You can see many of these projects documented in cases studies on the dtSearch Website on the Custom dtSearch Development — Independent Provider Listings page.