TC Oversight Controls

General Install Requirements

Anti-Virus Scanning

If not configured properly, regular anti-virus scanning can harm both TC Oversight Controls and your data. Scanning certain files and directories can damage data or cause it to be permanently lost. After TC Oversight Controls is installed in your environment, configure anti-virus scanning to skip these files and directories. These exclusions are required to ensure that data needed in the current flow is neither modified nor deleted.

The following sections outline best practices for clients who have already installed TC Oversight Controls and have active anti-virus policies in place.

TCOC Responsibilities

The following TC Oversight Controls (TCOC) repositories should be excluded from anti-virus scans. Repository locations are defined in TCOC’s properties file and are configured when TCOC is installed.

Repository

Location

Content Repository

• Holds content of current files being processed as well as any archived data.

./content_repository

FlowFile Repository

• Most crucial repository.

• Corruption of this repository results in data loss.

./flowfile_repository

Provenance Repository

• The space required depends on the size of your dataflow, the volume of data, and the number of events you want to retain.

./provenance_repository

Database Repository

• Relatively small repository.

• Flow configuration history and the user database exist here.

./database_repository

Log File Directory

Logback.xml
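As a rough illustration, the relative repository locations listed above can be resolved against the installation directory to produce the absolute paths that an anti-virus exclusion list typically needs. The sketch below assumes the default locations shown above and a hypothetical install directory, `/opt/tcoc`; if the properties file points the repositories elsewhere, those values should be used instead.

```python
from pathlib import PurePosixPath

# Default repository locations from the table above (relative to the install directory).
DEFAULT_REPOSITORIES = {
    "content": "./content_repository",
    "flowfile": "./flowfile_repository",
    "provenance": "./provenance_repository",
    "database": "./database_repository",
}


def exclusion_paths(install_dir):
    """Return absolute exclusion paths for the default repository locations."""
    base = PurePosixPath(install_dir)
    # Joining collapses the leading "./", so each result is a clean absolute path.
    return [str(base / relative) for relative in DEFAULT_REPOSITORIES.values()]


if __name__ == "__main__":
    # "/opt/tcoc" is a hypothetical install directory used only for illustration.
    for path in exclusion_paths("/opt/tcoc"):
        print(path)
```

The resulting list can be pasted into whatever exclusion mechanism your anti-virus product provides.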

Config Files

Technically Creative recommends avoiding active anti-virus systems that monitor access to the underlying disk systems used for metadata storage. These processes store data structures only; nothing is stored that is executable by the underlying operating system. As these processes can be quite active, potentially performing continuous writes against large files, the best performance requires direct, unimpeded access to the underlying filesystem. Any anti-virus system that traps filesystem calls will have a negative impact on system performance.

The following config files should be excluded from anti-virus scans.

Apache Hadoop HDFS Config File

Setting where defined in core-site.xml and hdfs-site.xml

Namenode

dfs.name.dir

Secondary Namenode

fs.checkpoint.dir

Datanode

dfs.datanode.dir

Tasktracker

mapred.local.dir

Jobtracker

mapred.local.dir

LOGS

$HADOOP_LOG_DIR

/tmp/Hadoop

hadoop.tmp.dir
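The settings above name properties rather than paths, so the actual directories have to be read from the site files. The following is a minimal sketch, assuming the standard Hadoop layout of `<property>` elements with `<name>` and `<value>` children; the function name and sample paths are hypothetical. `$HADOOP_LOG_DIR` is an environment variable and is not covered here.

```python
import xml.etree.ElementTree as ET

# Property names taken from the table above.
DIRECTORY_PROPERTIES = (
    "dfs.name.dir",
    "fs.checkpoint.dir",
    "dfs.datanode.dir",
    "mapred.local.dir",
    "hadoop.tmp.dir",
)


def directories_from_hadoop_xml(xml_text):
    """Map each listed property found in a Hadoop site file to its directories."""
    root = ET.fromstring(xml_text)
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in DIRECTORY_PROPERTIES and value:
            # Several of these settings accept a comma-separated list of directories.
            found[name] = [item.strip() for item in value.split(",") if item.strip()]
    return found


if __name__ == "__main__":
    # Hypothetical file contents; the paths are placeholders, not recommended values.
    sample = """
    <configuration>
      <property><name>dfs.name.dir</name><value>/data/1/nn,/data/2/nn</value></property>
      <property><name>dfs.replication</name><value>3</value></property>
    </configuration>
    """
    print(directories_from_hadoop_xml(sample))
```

Properties that are not set in the site files fall back to Hadoop's defaults, so a missing key in the result means the default location still needs to be excluded.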

Apache HBase Config File

Setting where defined in hbase-default.xml

HBase tmp directory

hbase.tmp.dir

HBase root directory

hbase.rootdir

HBase local directory

hbase.local.dir

YARN Resource Manager Config File

Setting where defined in hadoop-env.sh, yarn-env.sh, and mapred-env.sh

NameNode

HADOOP_NAMENODE_OPTS

DataNode

HADOOP_DATANODE_OPTS

Secondary NameNode

HADOOP_SECONDARYNAMENODE_OPTS

ResourceManager

YARN_RESOURCEMANAGER_OPTS

NodeManager

YARN_NODEMANAGER_OPTS

WebAppProxy

YARN_PROXYSERVER_OPTS

MapReduce Job History Server

HADOOP_JOB_HISTORYSERVER_OPTS

Apache Kafka Config File

Setting where defined in server.properties

Kafka log directory

log.dir

Apache ZooKeeper Config File

Setting where defined in zoo.cfg

ZooKeeper data directory

dataDir=/var/lib/zookeeper
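The ZooKeeper data directory can differ from the value shown above, so it is worth reading it from the live configuration. The sketch below assumes `zoo.cfg` uses its usual `key=value` lines with `#` comments; the function name is hypothetical.

```python
def zookeeper_data_dir(cfg_text):
    """Return the dataDir value from zoo.cfg-style text, or None if it is absent."""
    for raw_line in cfg_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator and key.strip() == "dataDir":
            return value.strip()
    return None


if __name__ == "__main__":
    # Sample contents for illustration; read the real zoo.cfg in practice.
    sample = "tickTime=2000\ndataDir=/var/lib/zookeeper\nclientPort=2181\n"
    print(zookeeper_data_dir(sample))
```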

Linux File Systems

For Linux file systems, Technically Creative recommends raising the ulimit to 10,240 for all users of TCOC, HDFS, and HBase.
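One way to check the current setting is shown below. This is a sketch under an assumption: the text does not say which limit is meant, and the example treats it as the maximum number of open file descriptors (`ulimit -n`, `RLIMIT_NOFILE`). Run it as each user that operates TCOC, HDFS, or HBase, since limits are applied per user.

```python
import resource  # Unix-only standard library module

RECOMMENDED_LIMIT = 10240  # value recommended above


def meets_recommendation(soft_limit, recommended=RECOMMENDED_LIMIT):
    """Return True if the soft limit is at least the recommended value."""
    # RLIM_INFINITY means "no limit", which always satisfies the recommendation.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= recommended


if __name__ == "__main__":
    # Assumption: the recommended ulimit refers to open files (RLIMIT_NOFILE).
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    status = "ok" if meets_recommendation(soft) else "below recommendation"
    print(f"open files: soft={soft} hard={hard} ({status})")
```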