Technical reproducibility in RNA-seq is considered to be excellent (provided that the same kit, lab, etc. is used). So yes, technical replicates can be combined. I think the best stage to do this is at the fastq or BAM level, although I can't think of any problems with just adding the read counts.

You can concatenate the fastq files or add the counts, provided you check first for batch effects. As Wouter pointed out, technical reproducibility is generally pretty high for NGS datasets - until it isn't.
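For count tables, "adding the counts" is just a per-gene sum across replicates. A minimal sketch with made-up gene names and numbers (any real pipeline would work from the full count matrix):

```python
# Toy per-gene counts from two technical replicates of the same sample
# (gene names and numbers are hypothetical)
rep1 = {"GeneA": 120, "GeneB": 15, "GeneC": 0}
rep2 = {"GeneA": 110, "GeneB": 18, "GeneC": 2}

# Merge the replicates by summing counts gene by gene
combined = {g: rep1.get(g, 0) + rep2.get(g, 0) for g in set(rep1) | set(rep2)}
print(combined["GeneA"])  # 230
```

The same logic extends to a genes-by-samples matrix, where you would sum the columns belonging to the same biological sample.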

Technical replicates are different RNA-seq runs performed on the same sample, and they have to be averaged. You are taking a snapshot of the transcriptional profile of your sample, and you're sequencing it twice to avoid biases and to reduce the chance of being misled by sequencing errors.

The same can be said for biological replicates, but in that case you are sequencing more than one sample, to prevent batch effects and sample-preparation errors from biasing your downstream analysis.

There are two equally respectable opinions, the ones we just brought up. I gave you a detailed explanation for my opinion, and I am truly convinced of what I say. But I am eager to hear the reasons why you would add them; perhaps I can change my mind if provided with enough evidence (ain't that what science is about?)

Are those the same cells from which you isolated RNA twice, or the same RNA from which you created two libraries, or the same library sequenced in two runs, or even the same library sequenced on multiple lanes of the same sequencer?

Wouldn't adding up and averaging give the same result, provided that you normalize for total read count before doing the rest of your analysis? You just get bigger numbers - as if you had sequenced deeper.
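A toy numeric sketch of this point (hypothetical counts where both replicates share the same underlying proportions): normalizing the summed counts gives the same relative values as normalizing either replicate on its own.

```python
import numpy as np

# Hypothetical raw counts for 4 genes in two technical replicates
rep1 = np.array([100, 50, 10, 840])    # library size 1,000
rep2 = np.array([200, 100, 20, 1680])  # library size 2,000, same proportions

def cpm(counts):
    """Counts per million: scale counts by total library size."""
    return counts / counts.sum() * 1e6

summed = cpm(rep1 + rep2)
# With identical underlying proportions, normalizing the sum
# recovers exactly the same relative expression values
print(np.allclose(summed, cpm(rep1)))  # True
```

In practice the two replicates will not be exactly proportional because of sampling noise, which is where the Poisson argument below comes in.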

But the number of sequenced reads has to be connected to the number of transcripts in your sample, which ultimately defines the concept of expression. To me, adding technical replicates means logically counting the transcripts of the same sample twice. I totally get your point; I'm more on a philosophical / theoretical dimension now.

Agreed, and I think what's important is not the sequencing depth itself or the numbers we get, but rather the proportionality between them, since in the end what we compare are relative expression values, which are basically proportions. I think it would not be wrong to sum up read counts that come from more or less similar library sizes (you catch lowly expressed genes, which are most of the genes, in the same proportions), but you would violate proportionality if you sum up very different library sizes.

In your example, gene counts and totals are exactly proportional to each other both times. In a real experiment that's not always the case. Imagine you did a DE analysis with 10 million reads and got 100 DE genes. If the proportionality were always the same, then in theory doing the same analysis with 20 million reads would give exactly the same 100 genes, and it wouldn't make sense to generate more read depth :) This is of course specific to genome size; for human, for example, you can have a look at this paper https://academic.oup.com/bioinformatics/article-lookup/doi/10.1093/bioinformatics/btt688. After a certain depth we don't see many changes, but at lower depth you can skew proportions because of lowly expressed genes, and I think this is where the problem comes from. Please correct me if I am wrong; I would be glad to discuss.

You are very right - what my example didn't include is variability due to sampling (a Poisson distribution).
Note that sequencing deeper will give you a better estimate of the true proportion of a transcript in the total mixture (since the variance of a Poisson distribution equals its mean).

Technical replicates in my case would be the same library sequenced on the same machine but on different days.
For our experiment we need to reach a depth of 20 million reads for a given sample (biological replicate). For one of the samples we reached only 10 million. Now we need to decide whether to make a new run sequencing 10 million reads (this would be a technical replicate) and add it to the existing data, or to make a run of 20 million reads and use only those data.

For our experiment we need to reach a depth of 20 million reads for a given sample

I think, if the funding allows it, doing the 20-million-read run is better. When you send out the paper in a year, the reviewers could easily ask you 1) why you summed the read counts, 2) why you sequenced 10 million twice instead of 20 million once. I think it's better to be safe than sorry :)

Sequencing the same sample twice is not the same as sequencing it once but twice as deep. The rare transcripts are gonna pop up only in the second one, in my experience.

That doesn't make sense to me, because you are sampling from a distribution of molecules, and the likelihood of "catching" low-abundance transcripts is directly proportional to the amount of sampling you perform, whether 20M molecules once or 10M molecules twice.
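This is easy to check by simulation. Below, detection of a very rare transcript (an assumed proportion of 1e-7) is modeled as binomial sampling, comparing one 20M-read run against the sum of two 10M-read runs:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 1e-7        # assumed proportion of a very rare transcript
trials = 50_000

# Detection = at least one read for the transcript
once_20m = rng.binomial(20_000_000, p, size=trials) > 0
twice_10m = (rng.binomial(10_000_000, p, size=trials)
             + rng.binomial(10_000_000, p, size=trials)) > 0

# Both strategies sample 20M molecules in total, so the detection
# probabilities agree (both close to 1 - exp(-2) ~ 0.86)
print(abs(once_20m.mean() - twice_10m.mean()) < 0.01)  # True
```

Under this idealized model the two strategies are statistically identical; any real difference between runs would come from batch effects, not from the sampling itself.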

I think the problem lies in the NGS technology itself. Imagine I sequence 10 million reads and for a given gene I get 1 read of depth. Now if you sequence 5 million reads, will you get 0 reads for that gene, and if you sequence 20 million, will you get exactly 2?
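Under a simple Poisson model (an idealization; real libraries have extra variability), the expected count scales linearly with depth, but any single run is a random draw around that expectation. For a gene whose expected count is 1 at 10 million reads:

```python
from math import exp, factorial

def pois_pmf(k, mean):
    """Poisson probability of observing exactly k reads."""
    return mean**k * exp(-mean) / factorial(k)

# Expected count scales with depth: 0.5 at 5M, 1.0 at 10M, 2.0 at 20M reads.
# But the chance of seeing zero reads is substantial at every depth:
for depth_m, mean in [(5, 0.5), (10, 1.0), (20, 2.0)]:
    print(f"{depth_m}M reads: P(0 reads) = {pois_pmf(0, mean):.2f}")
# 5M reads:  P(0 reads) = 0.61
# 10M reads: P(0 reads) = 0.37
# 20M reads: P(0 reads) = 0.14
```

So at 5 million reads you would most often get 0 reads for that gene, and at 20 million you would get 2 only on average - 0, 1, 3, or more reads are all plausible outcomes of a single run.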