Prerequisites

Logs Are Streams, Not Files

Logs are traditionally rotated hourly or daily, based on time or size. This approach quickly produces many large log files that must be batch-imported for further analysis, and it is outdated: logs are better treated as continuously generated streams, not files.

"Server daemons (such as PostgreSQL or Nginx) and applications (such as a Rails or Django app) sometimes offer a configuration parameter for a path to the program’s logfile. This can lead us to think of logs as files. But a better conceptual model is to treat logs as time-ordered streams..." (Adam Wiggins, Heroku co-founder, "Logs Are Streams, Not Files")

td-agent, a data collection daemon, imports data continuously into Treasure Data. Although bulk import is also supported, we recommend importing your data continuously via td-agent.

What is Treasure Agent?

td-agent is a data collection daemon. It collects logs from various data sources and uploads them to Treasure Data.
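As an illustration, a minimal td-agent.conf might tail an Nginx access log and upload it through the tdlog output plugin. The file path, tag, and YOUR_API_KEY placeholder below are assumptions for the example, not values from this document:

```
<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/td-agent/nginx-access.pos
  tag td.my_db.nginx_access
  format nginx
</source>

<match td.*.*>
  @type tdlog
  apikey YOUR_API_KEY
  auto_create_table
</match>
```

With a configuration like this, each parsed log line is buffered locally and uploaded to the database and table named in the tag (here my_db.nginx_access).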

How to install Treasure Agent?

To install Treasure Agent (td-agent), execute one of the commands below based on your environment. The agent is installed automatically via each platform's native package format (rpm, deb, or dmg).

Amazon Linux

MacOS X 10.11+

$ open 'http://packages.treasuredata.com.s3.amazonaws.com/3/macosx/td-agent-3.1.1-0.dmg'

MacOS X 10.11.1 (El Capitan) introduced security changes, and we are still validating the changes we made to td-agent for this version of the OS. For now, after installing td-agent, you must edit the /Library/LaunchDaemons/td-agent.plist file to change /usr/sbin/td-agent to /opt/td-agent/usr/sbin/td-agent.
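One way to sketch that edit is with sed. The sudo command in the comment is the assumed real-world form (BSD sed on macOS); the runnable part below only demonstrates the substitution on a sample plist line:

```shell
# The actual edit on macOS would be (BSD sed, run with care):
#   sudo sed -i '' 's|/usr/sbin/td-agent|/opt/td-agent/usr/sbin/td-agent|' \
#     /Library/LaunchDaemons/td-agent.plist
# Demonstration of the same substitution on a sample plist line:
line='<string>/usr/sbin/td-agent</string>'
fixed=$(printf '%s\n' "$line" | sed 's|/usr/sbin/td-agent|/opt/td-agent/usr/sbin/td-agent|')
printf '%s\n' "$fixed"
```

After the edit, reload the daemon with launchctl so the corrected path takes effect.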

Treasure Agent sometimes gets a “Failed to upload to TreasureData: Import failed” error. Is it a serious problem?

Treasure Agent has a robust retry mechanism for these cases, so it will continue trying to import the data to Treasure Data. You should see the message “retry succeeded” in your logs after several retries.

If the same error persists and exceeds the retry limit, the problem is likely not network-related. Contact support@treasure-data.com.

I cannot import my data to Treasure Data. Can you help?

Yes! Here are a couple of common causes to check.

1. Make sure your network is up and allows access to external hosts.
2. Does your data contain numbers greater than 2^64? If so, upgrade your td command/td-agent to the latest version. If you are using fluentd instead, upgrade fluent-plugin-td to version 0.10.15 or above.

The td-agent processes run as the td-agent user and td-agent group, and all forked subprocesses run as the same. This applies to any system call initiated by td-agent as well. The agent configuration resides at /etc/td-agent/td-agent.conf. All configuration files must be readable by the td-agent user.
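As a sketch, you can verify readability before starting the daemon. The command in the comment assumes the default path and the td-agent user; the runnable part demonstrates the same check on a temporary stand-in file:

```shell
# In production you would check the real config as the daemon user, e.g.:
#   sudo -u td-agent test -r /etc/td-agent/td-agent.conf && echo "readable"
# Self-contained demonstration with a temporary file standing in for the config:
conf=$(mktemp)
echo '# td-agent.conf stand-in' > "$conf"
chmod 644 "$conf"
if [ -r "$conf" ]; then result="readable"; else result="NOT readable"; fi
echo "$result"
rm -f "$conf"
```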

The following ports are opened, depending on which input plugins you use:

in_tail: nothing

in_forward: tcp/24224, udp/24224

in_unix: /var/run/td-agent/td-agent.sock

For secure uploading to Treasure Data, you need to open outbound tcp/80 (HTTP) and tcp/443 (HTTPS) to *.treasuredata.com.
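For example, it is the in_forward input that opens tcp/udp 24224; an illustrative source block (the bind address is an assumption for the example) looks like:

```
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>
```

With in_tail only, no listening port is opened, which matches the table above.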

Debugging

If you are having issues, add the following line to /etc/default/td-agent to enable verbose logging:

DAEMON_ARGS=-vv

After that, restart the daemon. You can now find more verbose logs in /var/log/td-agent.log.

Why does the tdlog plugin warn me about the endpoint change?

You might see the following message in the td-agent log.

tdlog plugin will change the API endpoint from api.treasure-data.com to api.treasuredata.com
If you want to keep api.treasure-data.com, set 'endpoint api.treasure-data.com' in tdlog configuration

This message is for users who restrict network access to api.treasure-data.com. If you don’t apply access restrictions based on our API endpoint, you can safely ignore the warning.
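Following the warning’s own instruction, keeping the old endpoint is a one-line addition to the tdlog match block. The apikey placeholder and match pattern here are illustrative:

```
<match td.*.*>
  @type tdlog
  apikey YOUR_API_KEY
  endpoint api.treasure-data.com
</match>
```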

What’s Next?

Modify your existing applications to post data to Treasure Data. The following articles explain the process (with sample code) for various languages, frameworks, and middleware.