Tag Info

NASDAQ makes this information available via FTP and they update it every night. Log into ftp.nasdaqtrader.com anonymously. Look in the directory SymbolDirectory. You'll notice two files: nasdaqlisted.txt and otherlisted.txt. These two files will give you the entire list of tradeable symbols, where they are listed, their name/description, and an indicator as ...
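Once the two files are downloaded, reading them can be sketched as below. This is a minimal sketch that assumes the files are pipe-delimited, start with a header row, and end with a "File Creation Time" footer row; check a real file before relying on it. The sample text and the `parse_symbol_file` helper are hypothetical, and the download step is left out.

```python
import csv
import io

# Hypothetical sample in the assumed layout; real files carry more columns.
SAMPLE = "\n".join([
    "Symbol|Security Name",
    "AAPL|Apple Inc. - Common Stock",
    "MSFT|Microsoft Corporation - Common Stock",
    "File Creation Time: 0101202400:00|",
])

def parse_symbol_file(text):
    """Return one dict per symbol row, skipping the assumed footer row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="|")
    key = reader.fieldnames[0]  # first column holds the symbol in this sample
    return [row for row in reader
            if not row[key].startswith("File Creation Time")]

for row in parse_symbol_file(SAMPLE):
    print(row["Symbol"], "-", row["Security Name"])
```

The same function could be pointed at the contents of either file, as long as the layout assumption holds.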

Representing time series (esp. tick data) using elaborate data structures may not be the best idea.
You may want to try two arrays of the same length to store your time series. The first array stores values (e.g. price) and the second stores time. Note that the second array is monotonically increasing (or at least non-decreasing), i.e. it's sorted. ...
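Because the time array is sorted, an "as of" lookup is just a binary search. Here is a minimal sketch of the two-array layout using Python's standard `bisect` module; the sample data and the `price_as_of` helper are made up for illustration.

```python
from bisect import bisect_right

# Two parallel arrays of equal length (hypothetical sample data).
times = [1.0, 2.5, 2.5, 4.0]          # non-decreasing timestamps, in seconds
prices = [100.0, 100.5, 100.4, 101.0]  # value observed at each timestamp

def price_as_of(t):
    """Return the last price recorded at or before time t, or None if t precedes all data."""
    i = bisect_right(times, t)  # number of timestamps <= t
    return prices[i - 1] if i else None

print(price_as_of(3.0))
```

The lookup costs O(log n) and needs no structure beyond the two arrays themselves.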

Don't use them. I have used them for years because I couldn't find another source that would provide all stocks in all US exchanges -- till now. But first, about eoddata:
their data is very often missing elements, e.g., on any given day the SP500 index data may not be in their data set, even for a normal trading day
their ftp files are often out of date, ...

Bloomberg Open Symbology has this list. Look in the Common Stock precanned file. This will have a bit more data than you probably need as it has a separate entry and unique id for each place an equity is traded. However it is probably the highest quality list available for free anywhere.
As for filtering, ETFs are broken out in a separate file (Equity_ETP) ...

Here are some pointers.
First of all: What you list as a Reuters RIC, RSF.ANY.AAPL.OQ, is not really a RIC, only the AAPL.OQ is. The initial part is some stuff which is essentially site specific and tells me that you are working on a site that has a legacy RTIC infrastructure (some Reuters/TIBCO technology which is quite old these days and for all ...

Equity returns have persistent negative skewness and excess kurtosis[1] over longer periods. So yes, you're right: a majority of daily returns are small and positive, and a minority are negative and larger. This can be quite extreme, for example Black Monday.
I don't have the data right now but you can get returns on major indices freely.
...
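If you do get hold of a return series, both statistics are easy to check yourself. A minimal sketch using plain moment estimators (population moments, no small-sample correction); the return series is invented for illustration and is not real index data.

```python
def skew_and_excess_kurtosis(returns):
    """Sample skewness and excess kurtosis from central moments."""
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # a normal distribution gives 0
    return skew, excess_kurtosis

# Hypothetical series: many small gains and one large loss.
returns = [0.01] * 9 + [-0.09]
skew, kurt = skew_and_excess_kurtosis(returns)
print(skew, kurt)
```

For this toy series the skewness comes out negative and the excess kurtosis positive, which is the pattern described above.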

You can pull a list of all stocks easily. See this question. You can get nasdaqlisted.txt and otherlisted.txt from here. nasdaqlisted.txt is clearly Tape C. otherlisted.txt contains an Exchange column which can be used to determine Tape A or B. If it is N it's listed at NYSE and therefore Tape A, otherwise it's Tape B.
Also, NYSE publishes a symbol list ...
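The tape rule described above can be written down directly. A minimal sketch; the `tape_for` helper is hypothetical, and the "N" code for NYSE is taken from the answer text. Any other exchange code passed in the second call is just a placeholder.

```python
def tape_for(source_file, exchange=None):
    """Map a symbol to its tape from the file it came from and its Exchange column."""
    if source_file == "nasdaqlisted.txt":
        return "C"  # everything in nasdaqlisted.txt is Tape C
    if source_file == "otherlisted.txt":
        # "N" means NYSE-listed (Tape A); any other exchange code is Tape B.
        return "A" if exchange == "N" else "B"
    raise ValueError(f"unknown source file: {source_file}")

print(tape_for("nasdaqlisted.txt"))
print(tape_for("otherlisted.txt", "N"))
print(tape_for("otherlisted.txt", "P"))
```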

There are so many different data providers, and they all end up using slightly different definitions. For Google, it looks like they use Deutsche Börse (Google) as a data source. I can't tell what Yahoo Deutschland is using.
I think your real question, though, is why the different data providers have different answers to the same questions. The answers ...

OpenTick used to have this... alas they are no more.
But here's a link to some decent alternatives.
http://blog.fosstrading.com/2009/11/opentick-alternatives.html
Some have free data options, but I don't believe that any include tick level data for free.
If you are in school and have access to WRDS you can get TAQ (the NYSE Trade and Quote database), which is ...

I really wouldn't implement time series on my own unless I had a good reason to. AQR uses pandas, and almost everyone in R uses zoo or xts.
I never like multiple parallel arrays: if one breaks, everything is broken, and it gets uglier as you add data. If you are doing something in C++, why not have an array of structs for each object where you have ...

Don't use them. Their data is very spotty. They offer "minute by minute" data on commodities markets and forex, which should technically cover 24 hours, yet it usually starts at 9 or 11 o'clock and the "minutes" end around 3 or 4. I emailed their support alias asking why this was the case. I also copied and pasted the opening and closing times of the exchanges and ...

No - clearly you've not seen the licensing agreements the exchanges force you to sign (one way or the other). Generally such firms and individuals have greater utility from the money they'll make working with the data than risking going to jail.
Market data is a 5bn / yr business. You're pushing the proverbial up-hill.
Anyway, you can get financial index ...

The Commitments of Traders report is the most useful for this; it lets you deduce the positions of large and small players in stock index futures. It is only released weekly by the CFTC.
Otherwise you can use volume:price divergence and a moving average of volume to further deduce what's happening. Finally you can use Level 2 data to get a feel for the speed of orders and ...

It's usually more efficient to have timeseries objects located sequentially in contiguous memory.
A hashtable doesn't provide this. As good as it is, from a complexity standpoint, it's not faster than a fixed array when accessing items in a [i+1] or [i-lag] kind of operation that is typical in timeseries code.
(For the most part you can estimate the ...
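To make the [i-lag] access pattern concrete, here is a minimal sketch that uses Python's standard `array` module as a stand-in for a fixed, contiguous array of doubles. The sample prices and the `lagged_diff` helper are made up for illustration.

```python
from array import array

# One contiguous block of C doubles (hypothetical sample prices).
prices = array("d", [100.0, 100.5, 100.25, 101.0])

def lagged_diff(series, lag=1):
    """Return series[i] - series[i - lag] for every i where both elements exist."""
    if lag < 1:
        raise ValueError("lag must be at least 1")
    return array("d", (series[i] - series[i - lag]
                       for i in range(lag, len(series))))

print(list(lagged_diff(prices)))
print(list(lagged_diff(prices, lag=2)))
```

Every access is plain integer indexing into one block of memory, with no hashing or key lookup involved.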

The stockSymbols function in the R package TTR pulls the data from nasdaq.com that @bellamyj mentioned. It also attempts to convert the symbols to a format acceptable to Yahoo Finance.
That said, I'm not certain how to filter this list for only common stocks. There are 1275 securities with "n/a" Sector or Industry, leaving ~5000. Perhaps the remaining ...

You can download all stocks on the three exchanges listed in your question from the NASDAQ website: http://www.nasdaq.com/screening/company-list.aspx.
It looks like removing those entries with an industry of "N/A" will eliminate ETFs and other funds from the list.

Yes, there is in fact a whole literature on this subject coming from the field of non-linear dynamics-- it is known as the method of surrogates. The idea is essentially to come up with a "scrambled" version of your original data set that preserves many of the basic statistical properties, though perhaps not the serial dependence structure which might be ...
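The simplest member of that family is a random-shuffle surrogate: permute the observations so the set of values (and hence mean, variance and the whole marginal distribution) is unchanged while any serial dependence is destroyed. A minimal sketch; the `shuffle_surrogate` helper and the sample returns are hypothetical, and the surrogate literature also covers more refined constructions that are not shown here.

```python
import random

def shuffle_surrogate(series, seed=None):
    """Return a randomly permuted copy of series; the input is left untouched."""
    rng = random.Random(seed)  # local generator so results are reproducible per seed
    surrogate = list(series)
    rng.shuffle(surrogate)
    return surrogate

# Hypothetical return series.
returns = [0.01, -0.02, 0.015, 0.005, -0.01, 0.02]
surrogate = shuffle_surrogate(returns, seed=42)
print(surrogate)
```

Running a statistic on many such surrogates gives a reference distribution for what it looks like when there is no serial structure.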

If you are serious about performance and flexibility, you have to take a look at the data.table package in R. Here is the crantastic review. It is lightning fast! I think this is the best package addressing performance and memory issues.

I assume you're using returns to compute beta, not the prices. And yes, remove the "jumps", though this should happen automatically since you're looking only at intraday returns. One final piece of advice: you'll get more meaningful results if you smooth the returns via a moving average.
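For reference, the returns-based calculation can be sketched as follows: convert prices to simple returns, then take beta as the covariance of asset and market returns divided by the variance of market returns. This is only a sketch of the basic estimator; the helper names and sample prices are hypothetical, and the moving-average smoothing step is not shown.

```python
def simple_returns(prices):
    """Convert a price series into simple period-over-period returns."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]

def beta(asset_returns, market_returns):
    """Beta = cov(asset, market) / var(market), using sample (n - 1) estimators."""
    if len(asset_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    n = len(asset_returns)
    mean_a = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((a - mean_a) * (m - mean_m)
              for a, m in zip(asset_returns, market_returns)) / (n - 1)
    var = sum((m - mean_m) ** 2 for m in market_returns) / (n - 1)
    return cov / var

# Hypothetical intraday prices for a market index and an asset.
market_prices = [100.0, 101.0, 99.0, 100.5, 101.0]
asset_prices = [50.0, 51.0, 49.5, 50.5, 51.5]
print(beta(simple_returns(asset_prices), simple_returns(market_prices)))
```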

I think there is a straightforward answer to this:
The associated costs of changing trading hours need to be justified by the benefits.
Exchanges, regulators, and large market participants such as banks, hedge funds, buy side long-only funds very closely communicate and weigh pros and cons when considering changes to trading hours. Obviously motivations ...

That information is available in the SEC's EDGAR database, though there can be many flavors of "shares outstanding". It is reported quarterly in a company's 10-Q/K, sometimes on a weighted-average basis. If you don't want to get it manually, a service like Bloomberg will let you access the historical levels quite easily, or you can parse the EDGAR XML feeds.
...

Are you after the famous paper from Christie and Schultz?
Christie, W., Schultz, P., 1994. Why do NASDAQ market makers avoid odd-eighth quotes? Journal of Finance 49, 1813–1840.
From the abstract:
On May 26 and 27, 1994 several national newspapers reported the findings of Christie and Schultz (1994) who cannot reject the hypothesis that market makers ...

The difference is usually explained by:
the way the end-providers (Google, Yahoo) aggregate the data they get from their vendors
getting prices from the same exchange, but from different trading platforms
different adjustments and corrections during the post-trading period
missing corporate actions or dividends
and many more.
If you have a discrepancy in a price ...

Making money is not the only reasonable objective to trading. Another common reason is to manage/reallocate risk. For example, this is exactly the objective of liability-driven-investors, such as pension funds. They're specifically trying to match durations of their liabilities. It doesn't matter if pension fund managers believe there are no inefficiencies ...

You can download EDHEC's indexes for free. You just need to create an account on their site (which is worth doing to get their research).
Some of their data is also bundled in the PerformanceAnalytics package in R.

While noble, unfortunately, this type of effort is not very practical. Mostly because market data is a major source of revenue for the market centers and is never simply given away, at least not in intraday form. A few things to consider:
Becoming a market data distributor is both costly and entails entering into agreements with each market center. If we ...

The "industry standard" for calculating implied volatility is OptionMetrics. Chapter 3 of their reference document contains details of how they calculate all the inputs to the standard Black-Scholes model. They also have a white paper just on dividend yield forecasting, which can potentially be a major issue.
However, much of the data they use is far from ...

Your question will be very difficult to answer, at least for equities. The best you can probably do in terms of accurate information are research reports from organizations like Tabb. You can look at positioning of players from 13F reports, meaning you can see which players have large positions in a certain equity. You may not be able to discern why, ...