Source: https://blog.csdn.net/cubesky/article/details/80939026 (cubesky, 2018/07/06 12:08:40)
Albedo should be sRGB, not linear. Note that for the most part it's not about how textures are saved but about how they're eventually read. The image formats themselves don't usually have a concept of whether the data they hold should be considered sRGB or linear. However, it's usually pretty easy to know which is going to be wanted / needed. If the image holds a color or some value you're going to see directly (like albedo or specular color, or even occlusion) then it should be sRGB; if it holds data that is going to be interpreted then it should most likely be linear. Metallic is an odd one because it's technically "data", but for ease of use Unity expects sRGB textures. Besides, what "halfway between dielectric and metal" looks like is kind of arbitrary; in the real world things are 100% one or the other.

Albedo, metallic, specular and occlusion are sRGB. Normals and smoothness are linear.

(The alpha component of a texture is always linear regardless of the settings on the texture; sRGB only applies to the RGB channels.)

More:

For metallic textures, the assumption is that you're authoring them in a graphics program that uses 8 bpc sRGB, but you can author them in linear if you want; it's kind of irrelevant as it's just data. However, specular color does need to be sRGB, as it's a color and uses the same slot as metallic, and Unity's default behavior is to treat all textures as sRGB, so they just assume that's what you're going to do. Because of this, the metallic slider in the standard shader (used when you don't have a texture) is in sRGB / gamma space, so "0.5" is the equivalent of 127/255 interpreted as sRGB (~0.2 in linear), not 127/255 in linear.
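To make the slider example concrete, here is a minimal C# sketch of the standard sRGB-to-linear conversion (the class and function names are mine; Unity performs this conversion in hardware when sampling an sRGB texture):

    using UnityEngine;

    static class SrgbUtil
    {
        // Standard sRGB EOTF: converts an sRGB-encoded value in [0, 1] to linear.
        public static float SrgbToLinear(float s)
        {
            return s <= 0.04045f
                ? s / 12.92f
                : Mathf.Pow((s + 0.055f) / 1.055f, 2.4f);
        }
    }

SrgbToLinear(0.5f) returns approximately 0.214, matching the ~0.2 figure above.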

DXT Encoding

DXT is a block-based texture compression format. The image is split up into 4×4 blocks, and each block is encoded using a fixed number of bits. In the case of the DXT1 format (used for compression of RGB images), each block is encoded using 64 bits. Information about each block is stored using two 16-bit color endpoint values (color0 and color1) and 16 2-bit selector values (one selector value per pixel), which determine how the color of each pixel is computed (it can be either one of the two endpoint colors or a blend between them). The DXT1 format defines two different ways to blend the endpoint colors, depending on which endpoint color has the higher value. However, the Crunch algorithm uses a subset of DXT1 encoding (endpoint colors are always ordered such that color0 >= color1). Therefore, when using Crunch compression, endpoint colors are always blended in the following way:

selector value    pixel color
0                 color0
1                 color1
2                 (2 * color0 + color1) / 3
3                 (color0 + 2 * color1) / 3

DXT encoding can therefore be visually represented in the following way:

[Figure: 64-bit DXT1 block layout: color0 (RGB565, 16 bits/block), color1 (RGB565, 16 bits/block), selectors (2 bits/pixel); decoded DXT is 4 bits/pixel]

Each pixel can be decoded by merging together color0 and color1 values according to the selector value.
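As a sketch (the helper names and types are mine, not from the Crunch source), a decoder for the Crunch subset could expand the palette of one DXT1 block like this in C#:

    using UnityEngine;

    static class Dxt1Decode
    {
        // Decodes the 4 palette colors of one DXT1 block, assuming the
        // Crunch ordering color0 >= color1 (only the 4-color mode is used).
        public static Color32[] DecodePalette(ushort color0, ushort color1)
        {
            Color32 c0 = FromRgb565(color0);
            Color32 c1 = FromRgb565(color1);
            return new[]
            {
                c0,                 // selector 0
                c1,                 // selector 1
                Mix(c0, c1, 2, 1),  // selector 2: (2 * color0 + color1) / 3
                Mix(c0, c1, 1, 2),  // selector 3: (color0 + 2 * color1) / 3
            };
        }

        // Expands a packed RGB565 color to 8 bits per channel.
        static Color32 FromRgb565(ushort c) => new Color32(
            (byte)(((c >> 11) & 31) * 255 / 31),
            (byte)(((c >> 5) & 63) * 255 / 63),
            (byte)((c & 31) * 255 / 31),
            255);

        // Weighted average of two colors, with weights summing to 3.
        static Color32 Mix(Color32 a, Color32 b, int wa, int wb) => new Color32(
            (byte)((a.r * wa + b.r * wb) / 3),
            (byte)((a.g * wa + b.g * wb) / 3),
            (byte)((a.b * wa + b.b * wb) / 3),
            255);
    }

Each of the 16 pixels then simply indexes this palette with its 2-bit selector.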

For simplicity, information about color0 and color1 can be displayed on the same image (with the upper part of every 4×4 block filled with color0 and the lower part filled with color1). Then all the information necessary for decoding the final texture can be represented in the form of the following 2 images (4×4 blocks are displayed slightly separated from each other):

[Figures: color endpoints (32 bits/block); color selectors (32 bits/block)]

Tiling

For an average texture it is quite common that neighboring blocks have similar endpoints. This property can be used to improve the compression ratio. To achieve this, Crunch introduces the concept of "chunks". All the texture blocks are split into chunks of 2×2 blocks (the size of each chunk is 8×8 pixels), and each chunk is associated with one of the following 8 chunk types:

Blocks with identical endpoints form a "tile" within a chunk, and are displayed united on the picture above. Once the information about the chunk types has been encoded, it is sufficient to encode only one endpoint per tile. For example, in the case of the leftmost chunk type, all the blocks within a chunk have the same endpoints, so for such a chunk it is sufficient to encode only one endpoint pair. In the case of the rightmost chunk type, all the endpoint pairs are different, so it is necessary to encode all 4 of them. The following example shows texture endpoints grouped into chunks, where each chunk is split into tiles:

Of course, the described chunk types don't cover all possible combinations of matching endpoints, but they allow the information about matching endpoints to be encoded very efficiently. Specifically, encoding the chunk type requires 3 bits per 4 blocks (0.75 bits per block, uncompressed).

The Crunch algorithm can force neighboring blocks within a chunk to have identical endpoints in cases where the extra accuracy of the encoded colors isn't worth spending extra bits on additional endpoints. This is achieved in the following way. First, each chunk is encoded in 8 different ways, corresponding to the 8 chunk types described above (instead of running DXT1 optimization per block, the algorithm runs DXT1 optimization per tile). The quality of each encoding is then evaluated as the PSNR multiplied by a coefficient associated with the chunk type used, and the optimal encoding is selected. The trick here is that chunk types with a higher number of matching endpoints also have higher quality coefficients. In other words, if using the same endpoint for two neighboring blocks within a chunk doesn't reduce the PSNR much, then the algorithm will most likely select the chunk type where those neighboring blocks belong to the same tile. The described process is referred to as "tiling".
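The selection loop can be sketched as follows (Chunk, EncodedChunk, EncodeWithTiles, Psnr and the coefficient values are hypothetical placeholders; Crunch's actual implementation and coefficients differ):

    // Encode the chunk once per chunk type, score each encoding as
    // PSNR * type coefficient, and keep the best-scoring type.
    static int SelectChunkType(Chunk chunk, float[] typeCoefficients)
    {
        int bestType = 0;
        float bestScore = float.MinValue;
        for (int type = 0; type < 8; type++)
        {
            EncodedChunk candidate = EncodeWithTiles(chunk, type); // DXT1-optimize each tile
            float score = Psnr(chunk, candidate) * typeCoefficients[type];
            if (score > bestScore) { bestScore = score; bestType = type; }
        }
        return bestType;
    }

Because types with fewer distinct endpoints carry larger coefficients, the loop is biased toward larger tiles whenever the PSNR loss is small.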

Quantization

The basic idea of Crunch compression is to perform quantization of the determined endpoints and selector blocks in order to encode them more efficiently. This is achieved using vector quantization. The idea is similar to color quantization, where a color image is represented using a color palette and palette indices defined for each pixel.

In order to perform vector quantization, each endpoint pair should be represented as a vector. For example, it is possible to represent a tile endpoint pair with a vector (color0.r, color0.g, color0.b, color1.r, color1.g, color1.b), where color0 and color1 are obtained from DXT1 optimization. However, such a representation doesn't reflect the continuity properties of the source texture very well (for example, in the case of a solid block, a small change of the block color might result in a significant change of the optimal color0 and color1, which are used to encode this color). Instead, the Crunch algorithm uses a different representation. The source pixels of each tile, represented by their (r, g, b) vectors, are split into 2 clusters using vector quantization, providing two centroids for each tile: low_color and high_color. Then the endpoints of each tile are represented with a (low_color.r, low_color.g, low_color.b, high_color.r, high_color.g, high_color.b) vector. Such a representation of the tile endpoints doesn't depend on the DXT1 optimization result, but at the same time performs quite well.

Note that after quantization all the blocks within a tile will be associated with the same endpoint codebook element, so they will be assigned the same endpoint index. This means that the initially determined chunk types will still be valid after endpoint quantization.

Selectors of each 4×4 block can be represented with a vector of 16 components, corresponding to the selector values of each block pixel. To improve the result of the quantization, selector values are reordered in the following way, which better reflects the continuity of the selected color values:

linear selector value    pixel color
0                        color0
1                        (2 * color0 + color1) / 3
2                        (color0 + 2 * color1) / 3
3                        color1
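Comparing this table with the earlier one gives a small remapping between raw DXT1 selectors and the reordered "linear" selectors; a sketch (the lookup tables are inferred from the two tables above):

    // Raw DXT1 selector -> linear selector: raw 0 (color0) -> 0,
    // raw 1 (color1) -> 3, raw 2 -> 1, raw 3 -> 2; and the inverse.
    static readonly byte[] RawToLinear = { 0, 3, 1, 2 };
    static readonly byte[] LinearToRaw = { 0, 2, 3, 1 };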

The vector quantization algorithm splits all the input vectors into separate groups (clusters) so that the vectors in each group are more or less similar. Each group is represented by its centroid, which is computed as the average of all the vectors in the group according to the selected metric. The computed centroid vectors are then used to generate the codebook (centroid vector components are clipped and rounded to integers in order to represent valid endpoints or selectors). The original texture elements are then replaced with elements of the computed codebooks (the endpoints of each source 4×4 block are replaced with the closest endpoint pair from the generated endpoint codebook, and the selectors of each source 4×4 block are replaced with the selector values of the closest selector codebook element).

The result of vector quantization performed for both endpoints and selectors can be represented in the following way:

[Figures: endpoint codebook; selector codebook]

After quantization, it is sufficient to store the following information in order to decode the image:

chunk types

endpoint codebook

selector codebook

endpoint indices (one index per tile)

selector indices (one index per block)

The quality parameter provided to the Crunch compressor directly controls the size of the generated endpoint and selector codebooks. The higher the quality value, the larger the endpoint and selector codebooks, the wider the range of possible indices, and consequently the larger the compressed texture.

Encoding of the DXT Alpha channel

DXT encoding for the alpha channel is very similar to the DXT encoding of the color information. Information about the alpha channel of each block is stored using 64 bits: two 8-bit alpha endpoint values (alpha0 and alpha1), and 16 3-bit selector values (one selector value per pixel), which determine how the alpha of each pixel is computed (it can be either one of the two alpha values or a blend between them). As mentioned before, the Crunch algorithm uses a subset of DXT encoding, so the possible alpha values are always blended in the following way:

selector value    pixel alpha
0                 alpha0
1                 alpha1
2                 (6 * alpha0 + 1 * alpha1) / 7
3                 (5 * alpha0 + 2 * alpha1) / 7
4                 (4 * alpha0 + 3 * alpha1) / 7
5                 (3 * alpha0 + 4 * alpha1) / 7
6                 (2 * alpha0 + 5 * alpha1) / 7
7                 (1 * alpha0 + 6 * alpha1) / 7
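The whole table can be generated with one small loop; a sketch (the function name is mine):

    // Builds the 8-entry alpha palette of one DXT5 alpha block, assuming
    // the Crunch subset (alpha0 >= alpha1, so only the 8-value mode is used).
    static byte[] DecodeAlphaPalette(byte alpha0, byte alpha1)
    {
        var palette = new byte[8];
        palette[0] = alpha0;
        palette[1] = alpha1;
        for (int i = 2; i < 8; i++) // matches the weighted blends above
            palette[i] = (byte)(((8 - i) * alpha0 + (i - 1) * alpha1) / 7);
        return palette;
    }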

Vector quantization for the alpha channel is performed exactly the same way as for the color components, except that the vectors representing the alpha endpoints of each tile consist of 2 components (low_alpha, high_alpha) and are obtained by clustering the alpha values of all the tile pixels.

Note that the chunk type determined during the tiling step is common to both color and alpha endpoints. So in the case of textures using an alpha channel, the chunk type is determined based on the combined PSNR computed for the color and alpha components.

Compression step

The main idea used in the Crunch algorithm for improving the compression ratio is based on the fact that changing the order of the elements in the codebook doesn't affect the decompression result (provided the indices are reassigned accordingly). In other words, the elements of the generated codebooks can be reordered in such a way that the codebook elements and indices acquire specific properties which allow them to be compressed more efficiently. Specifically, if neighboring encoded elements are similar, then each element can be used to predict the following element, which significantly improves the compression ratio.

Following this scheme, the Crunch algorithm uses zero-order prediction when encoding codebook elements and indices. Instead of encoding endpoint and selector indices directly, the algorithm encodes the deltas between the indices of neighboring encoded blocks. The codebook elements are encoded using per-component prediction. Specifically, each endpoint codebook element (represented by two RGB565 colors) is encoded as 6 per-component deltas from the previous codebook element. Each selector codebook element (represented by 16 2-bit selector values) is encoded as 16 per-component deltas from the previous codebook element.
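A minimal sketch of this zero-order prediction step (the function name is mine; Crunch entropy-codes the deltas afterwards):

    // Replaces each value with its delta from the previous one. When
    // neighboring values are similar, the deltas cluster around zero,
    // which the subsequent Huffman coding exploits.
    static int[] DeltaEncode(int[] values)
    {
        var deltas = new int[values.Length];
        int prev = 0;
        for (int i = 0; i < values.Length; i++)
        {
            deltas[i] = values[i] - prev;
            prev = values[i];
        }
        return deltas;
    }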

On the one hand, the endpoint indices of neighboring blocks should be similar, as the encoder compresses the deltas between the indices of neighboring blocks. On the other hand, neighboring codebook elements should also be similar, as the encoder compresses the deltas between the components of those neighboring codebook elements. The combined optimization is based on Zeng's technique, using a weighted function which takes into account both the similarity of the indices of neighboring blocks and the similarity of neighboring elements in the codebook. This reordering optimization is performed for both endpoint and selector codebooks.

Finally, the reordered codebooks and indices, along with the chunk type information, are encoded with Huffman coding (using zero order prediction for indices and codebook components). Each type of encoded data uses its own Huffman table, or multiple tables.
For performance reasons adaptive Huffman coding isn’t used.

Improving Crunch compression library

We performed a comprehensive analysis of the algorithms and techniques used in the original version of Crunch and introduced several modifications which allowed us to significantly improve the compression performance. The updated Crunch library, introduced
in Unity 2017.3, can compress DXT textures up
to 2.5 times faster, while providing about 10% better compression ratio. At the same time, decompressed textures, generated by both libraries, are identical bit by bit. The latest version of the library, which will reach Beta builds soon, will be able to perform
Crunch compression of DXT textures about 5 times faster than the original version. The latest version of the Crunch library can be found in the following
GitHub repository.

The main modifications to the original Crunch library are described below. The improvement in compressed size and compression time introduced by each modification is expressed as the saved portion of the compressed size and compression time spent by the original library, evaluated on the Kodak image test set. When compressing real-world textures, the improvement in compressed size will normally be higher.

As described above, in the original version of the Crunch algorithm all the blocks are grouped into chunks of 2×2 blocks, and each chunk is associated with one of 8 chunk types. The type of the chunk determines which blocks inside the chunk have the same endpoint indices. This scheme performs quite well, because it is often more efficient to compress information about endpoint equality than to compress duplicate endpoint indices. However, this scheme can be improved. The modified Crunch algorithm no longer uses the concept of chunks. Instead, for each block it can encode a reference to a previously processed neighbor block from which the endpoint can be copied. Considering that the texture is decompressed left-to-right, top-to-bottom, the endpoints of each decoded block can either be decoded from the input stream, copied from the nearest block to the left (reference to the left), or copied from the nearest block above (reference to the top):

The following example shows quantized texture endpoints with the references:

Note that the modified Crunch encoding is a superset of the original encoding, so all images previously encoded with the original Crunch algorithm can be losslessly transcoded into the new format, but not vice versa. Even though the new endpoint equality encoding is more expensive (about 1.58 bits per block, uncompressed), it provides more flexibility for endpoint matching inside the previously used "chunks", and, more importantly, it allows endpoints to be copied from one "chunk" to another (which isn't possible with the original chunk encoding). The blocks are no longer grouped together and are encoded in the same order as they appear in the image, which significantly simplifies the algorithm and eliminates extra levels of indirection.

The original version of Crunch encodes the deltas between neighboring indices in order to take advantage of their similarity. The efficiency of this approach depends heavily on the continuity of the encoded data. While neighboring color and alpha endpoints are usually similar, this is often not the case for selectors. Of course, in some situations encoding deltas for selector indices makes sense, for example when an image contains a lot of regular patterns aligned to the 4×4 block boundaries. In practice, however, such situations are relatively rare, so it usually turns out to be more efficient to encode raw selector indices without prediction. Note that when selector indices are encoded without prediction, the reordering of the selector indices no longer affects the size of the encoded selector index stream (at least when using Huffman coding). This makes the Zeng optimization of selector indices unnecessary, and it's sufficient to simply optimize the size of the packed selector codebook.

Remove duplicate endpoints and selectors from the codebooks (improvement in compressed size: 1.7%)

By default, the size of the endpoint and selector codebooks is calculated based on the total number of blocks in the image and the quality parameter, while the actual complexity of the image isn't evaluated or taken into account. The target codebook size is selected in such a way that even complex images can be approximated well enough. At the same time, the lower the complexity of the image, the higher the density of the quantized vectors. Considering that vector quantization is performed using floating-point computations, while the quantized endpoints have integer components, a high density of quantized vectors results in a large number of duplicate endpoints. As a result, some identical endpoints end up being represented by multiple different indices, which hurts the compression ratio. Note that this isn't the case for selectors, as their corresponding vector components are rounded after quantization; instead it leads to some duplicate selectors in the codebook being unused. In the modified version of the algorithm, all duplicate codebook entries are merged together, unused entries are removed from the codebooks, and endpoint and selector indices are updated accordingly.

Use XOR-deltas for encoding of the selector codebook (improvement in compressed size: 0.9%)

In the original version of Crunch, the selector codebook is encoded with Huffman coding applied to the raw deltas between corresponding pixel selectors of neighboring codebook elements. However, using Huffman coding for raw deltas has a downside. Specifically, for each individual pixel selector, only about half of all possible raw deltas are valid. Indeed, once the value of the current selector is determined, the selector delta depends only on the next selector value, so only n out of the 2n − 1 total raw delta values are possible at any specific point (where n is the number of possible selector values). This means that at each step the impossible raw delta values are encoded with a non-zero probability, as the probability table is calculated only once for the whole codebook. The situation can be improved by using modulo-deltas instead of raw deltas (modulo 4 for color selectors and modulo 8 for alpha selectors). This eliminates the mentioned implicit restriction on the values of the decoded selector deltas, and therefore improves the compression ratio. Interestingly, the compression ratio can be improved even further if XOR-deltas are used instead of modulo-deltas (an XOR-delta is computed by simply XOR-ing two selector values). At first it might seem counterintuitive that an XOR-delta can perform better than a modulo-delta, as it doesn't reflect the continuity properties of the data as well. The trick here is that the encoded selectors are first sorted according to the delta operation used and the corresponding metric.
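For 2-bit color selectors (n = 4), the three delta operations discussed above can be sketched as follows (the decoder inverts the same operation):

    static int RawDelta(int cur, int prev)    => cur - prev;       // range [-3, 3]: 7 symbols
    static int ModuloDelta(int cur, int prev) => (cur - prev) & 3; // range [0, 3]: 4 symbols
    static int XorDelta(int cur, int prev)    => cur ^ prev;       // range [0, 3]: 4 symbols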

After the endpoint codebook has been computed, the endpoints are reordered to improve the compression ratio. As has been described above, optimization is based on Zeng’s technique, using a weighted function which takes into account both similarity of the indices
in neighbor blocks and similarity of the neighbor elements in the codebook.

The ordered list of endpoints is built starting from a single endpoint, then adding one of the remaining endpoints to the beginning or the end of the list on each iteration, using a greedy strategy controlled by the optimization function. The similarity of the endpoint indices is evaluated as the combined neighborhood frequency of the candidate endpoint and all the endpoints in the ordered list. The similarity of neighboring endpoints in the codebook is evaluated as the Euclidean distance from the candidate endpoint to the extremity of the ordered list. The original optimization function for an endpoint candidate p can be represented as:

F(p) = (endpoint_similarity(p) + 1) * (neighborhood_frequency(p) + 1)

The problem with this approach is the following. While endpoint_similarity(p) has a limited range of values, neighborhood_frequency(p) grows rapidly with the increasing size of the ordered list of endpoints. With each iteration this introduces additional imbalance into the weighted optimization function. To minimize this effect, it is proposed to normalize neighborhood_frequency(p) on each iteration. For computational simplicity, the normalizer is computed as the optimal neighborhood_frequency value from the previous iteration, multiplied by a constant. The modified optimization function can then be represented as:

F(p) = (endpoint_similarity(p) + 1) * (neighborhood_frequency(p) / normalizer + 1)

Other improvements

Additional improvement in compression speed has been achieved by optimizing the original algorithms, reducing the total amount of computations by caching the intermediate computation results, and spreading the computations between threads more efficiently.

Crunch encoding vs. general purpose compression

The described modifications of the Crunch algorithm don’t change the result of the quantization step, which means that decompressed textures, generated by both libraries, will be identical bit by bit. In other words, the improvement in compression ratio has
been achieved by using a different lossless encoding of the quantized images. It might therefore be interesting to compare Crunch encoding with alternative ways of compressing the quantized textures. For example, quantized textures can be stored in a raw DXT
format, compressed with LZMA. The following table displays the difference in compression ratio when using different approaches:

Test set                                              DXT         Quantized DXT + LZMA    Quantized DXT + original Crunch    Quantized DXT + improved Crunch
Kodak image set                                       6147.4 KB   2227.0 KB               2016.8 KB                          1869.9 KB
Adam Character Pack: Adam, Guard, Lu (93 textures)    652.7 MB    155.8 MB                142.8 MB                           128.7 MB
Adam Exterior Environment (227 textures)              717.8 MB    162.6 MB                156.3 MB                           138.1 MB

According to the test results, it is more efficient to use Crunch encoding of the computed codebooks and indices than to compress the quantized texture with LZMA. Not to mention that Crunch decompression is also significantly faster than LZMA decompression.

Modifying Crunch algorithm to support ETC texture format

Even though the Crunch algorithm was originally designed for compression of DXT textures, it is in fact much more general. With some minor adjustments it can be used to compress other texture formats. This section describes in detail how the original Crunch algorithm was modified to compress ETC and ETC2 textures.

ETC encoding

ETC is a block-based texture compression format. The image is split up into 4×4 blocks, and each block is encoded using a fixed number of bits. In the case of the ETC1 format (used for compression of RGB images), each block is encoded using 64 bits.

The first 32 bits contain information about the colors used within the 4×4 block. Each 4×4 block is split either vertically or horizontally into two 2×4 or 4×2 subblocks (the orientation of each block is controlled by the “flip” bit). Each subblock is assigned
its own base color and its own modifier table index.

The two base colors of a 4×4 block can be encoded either individually as RGB444, or differentially (the first base color is encoded as RGB555, and the second base color is encoded as RGB333 signed offset from the first base color). The type of the base color
encoding for each block is controlled by the “diff” bit.

The modifier table index of each subblock is referencing one of the 8 possible rows in the following modifier table:

modifier table index    modifier0    modifier1    modifier2    modifier3
0                       -8           -2           2            8
1                       -17          -5           5            17
2                       -29          -9           9            29
3                       -42          -13          13           42
4                       -60          -18          18           60
5                       -80          -24          24           80
6                       -106         -33          33           106
7                       -183         -47          47           183

The intensity modifier set (modifier0, modifier1, modifier2, modifier3) defined by the modifier table index, along with the base color, determine 4 possible color values for each subblock:

Note that the higher the modifier table index, the more spread out the subblock colors are along the intensity axis.

The other 32 bits of the encoded ETC1 block hold 16 2-bit selector values (each pixel in the block can take one of the 4 possible color values described above).

ETC1 encoding can therefore be visually represented in the following way:

[Figure: ETC1 block layout: base colors + block orientation (26 bits/block), modifier table indices (3 bits/subblock), selectors (2 bits/pixel); decoded ETC1 is 4 bits/pixel]

Each pixel color of an ETC1 block can be decoded by adding together the base color and the modifier color, defined by the modifier table index and selector value (the result color should be clamped).
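A sketch of that per-pixel decode in C# (this indexes directly into the (modifier0..modifier3) row as laid out in the table above; the actual ETC1 bit-level selector-to-modifier mapping differs, so treat this as illustrative):

    using UnityEngine;

    static class Etc1Decode
    {
        // One ETC1 pixel: subblock base color plus the modifier picked by the
        // subblock's table row and the pixel's 2-bit selector, clamped to [0, 255].
        public static Color32 DecodePixel(Color32 baseColor, int[] modifierRow, int selector)
        {
            int m = modifierRow[selector];
            return new Color32(
                (byte)Mathf.Clamp(baseColor.r + m, 0, 255),
                (byte)Mathf.Clamp(baseColor.g + m, 0, 255),
                (byte)Mathf.Clamp(baseColor.b + m, 0, 255),
                255);
        }
    }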

For simplicity, information about the base colors, block orientations and modifier table indices can be displayed on the same image. The upper or the left part of each 2×4 or 4×2 subblock (depending on the block orientation) is filled with the base color, and the rest is filled with the modifier table index color. Then all the information necessary for decoding the final texture can be represented in the form of the following 2 images (subblocks on the left image and blocks on the right image are displayed slightly separated from each other):

Using Crunch algorithm for compression of ETC1 textures

Even though DXT1 and ETC1 encodings seem quite different, they also have a lot in common. Each pixel of an ETC1 texture can take one of four possible color values, which means that ETC1 selector encoding is equivalent to DXT1 selector encoding, and therefore ETC1 selectors can be quantized exactly the same way as DXT1 selectors. The main difference between the encodings is that in ETC1, each half of a 4×4 block has its own set of possible color values. But even though ETC1 subblock colors are encoded using a base color and a modifier table index, the four computed subblock colors normally lie on the same line and are more or less evenly distributed along it, which closely resembles DXT1 block colors. These similarities make it possible to use Crunch compression for ETC1 textures, with some modifications.

As has been described above, Crunch compression involves the following main steps:

tiling

endpoint quantization

selector quantization

compression of the determined codebooks and indices

When applying the Crunch algorithm to a new texture format, it is necessary to first define the codebook element. In the context of Crunch, this means that the whole image consists of smaller non-overlapping blocks, while the contents of each individual block are determined by an endpoint and a selector from the corresponding codebooks. For example, in the case of the DXT format, each endpoint and selector codebook element corresponds to a 4×4 pixel block. In general, the size of the blocks which form the encoded image depends on the texture format and quality considerations.

It’s proposed to define codebook elements according to the following limitations:

Codebook elements should be compatible with the existing Crunch algorithm, while the image blocks defined by those codebook elements should be compatible with the texture encoding format.

It should be possible to cover a wide range of image quality and bitrates by changing the size of the endpoint and selector codebooks. If there is no limitation on the codebook size, it should be possible to achieve lossless or near-lossless compression quality (not counting the quality loss implied by the texture format itself).

Endpoint codebook

In the case of ETC1, the texture format itself determines the minimal size of the image block defined by an endpoint: it can be either a 2×4 or a 4×2 rectangle, aligned to the borders of the 4×4 grid. It isn't possible to use finer granularity, because each of those rectangles can have only one base color, according to the ETC1 format. For the same reason, any image block defined by an endpoint codebook element should represent a combination of ETC1 subblocks.

At the same time, each ETC1 subblock has its own base color and modifier table index, which approximately determine the high and low colors of the subblock (even though there are some limitations on the positions of those high and low colors, implied by the ETC1 encoding). If an endpoint codebook element were defined in such a way that it contained information about more than one ETC1 base color, such a codebook would become incompatible with the existing tile quantization algorithm, for the following reason. The Crunch tiling algorithm first quantizes all the tile pixel colors down to just 2 colors. Then it quantizes the color pairs generated by the different tiles. This approach works quite well for 4×4 DXT blocks, as those 2 colors approximately represent the principal component of the tile pixel colors. In the case of ETC1, however, mixing together pixels which correspond to different base colors doesn't make much sense, because each group of those pixels has its own low and high color values, independent of the other groups. If those pixels are mixed together, the information about the original principal components of each subblock is lost.

The described limitations suggest that an ETC1 endpoint codebook element should represent the area of a single ETC1 subblock (either 2×4 or 4×2). This means that an ETC1 endpoint codebook element should contain information about the subblock base color (RGB444 or RGB555) and the modifier table index (3 bits). It is therefore proposed to encode an ETC1 "endpoint" as 3555 (3 bits for the modifier table index and 5 bits for each component of the base color).

Selector codebook

In the case of the DXT format, both endpoint codebook elements and selector codebook elements correspond to the same decoded block size (4×4). So it would be reasonable to try the same scheme for ETC1 encoding (i.e. to use 2×4 or 4×2 blocks for selector codebooks, matching the blocks defined by endpoint codebook elements). Nevertheless, additional research produced a very interesting observation: endpoint blocks and selector blocks don't have to be the same size in order to be compatible with the existing Crunch algorithm. Indeed, the selector codebook and selector indices are defined after the endpoint optimization is complete. At this point each image pixel is already associated with a specific endpoint. The selector computation step uses those per-pixel endpoint associations as its only input, so the size and shape of the blocks defined by selector codebook elements don't depend in any way on the size or shape of the blocks defined by endpoint codebook elements.

In other words, the endpoint space of the texture can be split into one set of blocks, defined by the endpoint codebook and endpoint indices, while the selector space of the texture can be split into a completely different set of blocks, defined by the selector codebook and selector indices. Endpoint blocks can differ in size from selector blocks, and endpoint blocks can overlap arbitrarily with selector blocks; such a setup is still fully compatible with the existing Crunch algorithm. This property of the Crunch algorithm opens another dimension for optimizing the compression ratio. Specifically, the quality of the compressed selectors can now be adjusted in two ways: by changing the size of the selector codebook and by changing the size of the selector block. Note that both DXT and ETC formats store selectors as plain bits in the output format, so there is no limitation on the size or shape of the selector block (though, for performance reasons, non-power-of-two selector blocks might require some specific optimizations in the decoder).

Several performance tests have been conducted using different selector block sizes, and the results suggest that 4×4 selector blocks perform quite well.

Tiling

As described above, each element of an ETC1 endpoint codebook should correspond to an ETC1 subblock (i.e. to a 2×4 or 4×2 pixel block, depending on the block orientation). In the case of DXT encoding, the size of the encoded block is 4×4 pixels, and tiling is performed in an 8×8 pixel area (covering 4 blocks). In the case of ETC1, however, tiling can be performed either in a 4×4 pixel area (covering 2 subblocks) or in an 8×8 pixel area (covering 8 subblocks); other possibilities are either not symmetrical or too complex. For performance reasons and simplicity, a 4×4 pixel tiling area is used. There are therefore 3 possible block types: the block isn't split (the whole block is encoded using a single endpoint), the block is split horizontally, or the block is split vertically:

The following example shows computed tiles for the texture endpoints:

Endpoint references

At first, it might look like ETC1 block flipping could complicate things for Crunch, as the subblock structure doesn't form a grid. This, however, is easily resolved by flipping all the "horizontal" ETC1 blocks across the main diagonal of the block after the tiling step, so that all the ETC1 subblocks become 2×4 and form a regular grid:

[Figures: flipped color endpoints; flipped color selectors]

Note that decoded selectors should be flipped back according to the block orientation during decompression (this can be efficiently implemented by precomputing a codebook of flipped selectors).

Endpoint references for the ETC1 format are encoded in a similar way to the DXT1 format. There are, however, two modifications specific to the ETC1 encoding:

In addition to the standard endpoint references (to the top and to the left blocks), it is also possible to use an endpoint reference to the top-left diagonal neighbour block.

Endpoint references for the primary and secondary subblocks have different meanings.

The primary ETC1 subblock has the reference value of 0 if the endpoint is decoded from the input stream, the value of 1 if the endpoint is copied from the secondary subblock
of the left neighbour ETC1 block, the value of 2 if the endpoint is copied from the primary subblock of the top neighbour ETC1 block, and the value of 3 if the endpoint
is copied from the secondary subblock of the top-left neighbour ETC1 block:

The reference value of secondary ETC1 subblock contains information about the block tiling and flipping. It has the reference value of 0 if the endpoint is copied from the primary subblock (note that in this case
flipping doesn’t need to be encoded, as endpoints are equal), the value of 1 if the endpoint is decoded from the input stream and the corresponding ETC1 block is split horizontally, and the value of 2 if
the endpoint is decoded from the input stream and the corresponding ETC1 block is split vertically:

The following example shows ETC1 texture endpoints with tiles and references (assuming flipping has already been performed by the decoder):

Quantization

Considering that each endpoint codebook element corresponds to a single ETC1 base color, the original endpoint quantization algorithm works almost the same way for ETC1 encoding as for DXT1 encoding. An endpoint of an ETC1 tile can be represented with a (low_color.r, low_color.g, low_color.b, high_color.r, high_color.g, high_color.b) vector, where low_color and high_color are generated by the tile palettizer, exactly as for DXT1 encoding.

Note that low_color and high_color, computed for a tile, implicitly contain information about the base color and the modifier table index computed for this tile. Indeed, the base color normally lies somewhere in the middle between low_color and high_color, while the modifier table index corresponds to the distance between low_color and high_color. Vectors which represent tiles with close values of low_color and high_color will most likely fall into the same cluster after vector quantization. But this also means that for tiles from the same cluster, the average values of low_color and high_color, and the distances between low_color and high_color, should also be fairly close. In other words, the original endpoint quantization algorithm will generate tile clusters with close values of the base color and the modifier table index.

Selectors of each 4×4 block can be represented with a vector of 16 components, corresponding to the selector values of each block pixel. This means that ETC1 selector quantization step is identical to the DXT1 selector quantization step.

The result of the vector quantization performed for both ETC1 endpoints and selectors can be represented in the following way:

[Figures: endpoint codebook; selector codebook]

Note that according to the ETC1 format, the base colors within an ETC1 block can be encoded either individually as RGB444 and RGB444, or differentially as RGB555 and RGB333. For simplicity, this aspect is currently not taken into account (all the quantized endpoints are encoded as 3555 in the codebook). If the base colors in the resulting ETC1 block cannot be encoded differentially, the decoder converts both base colors from RGB555 to RGB444 during decompression.

Compression of ETC2 textures

The Crunch algorithm doesn’t yet support ETC2 specific modes (T, H or P), but it’s capable of efficiently encoding the ETC2 Alpha channel. This means that the current ETC2 + Alpha compression format is equivalent to ETC1 + Alpha. Note that ETC2 encoding is
a superset of ETC1, so any texture, which consists of ETC1 color blocks and ETC2 Alpha blocks, can be correctly decoded by an ETC2_RGBA8 decoder.

ETC2 encoding for the alpha channel is very similar to the ETC1 encoding of the color information. Information about the alpha channel of each block is stored using 64 bits: 8-bit base alpha, 4-bit modifier table index, 4-bit multiplier and 16 3-bit selector
values (one selector value per pixel).

The modifier table index and selector value determine a modifier value for a pixel, which is selected from the ETC2
alpha modifier table. For performance reasons, ETC2 Crunch compressor is currently using only the following subset of the modifier table:

modifier table index    modifier0    modifier1    modifier2    modifier3    modifier4    modifier5    modifier6    modifier7
11                      -2           -5           -7           -10          1            4            6            9
13                      -1           -2           -3           -10          0            1            2            9

The final alpha value for each pixel is calculated as base_alpha + modifier * multiplier, which is then clamped.
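A one-line sketch of that formula in C# (the function name is mine):

    using UnityEngine;

    // ETC2 alpha decode for one pixel, per the formula above.
    static byte DecodeAlpha(byte baseAlpha, int modifier, int multiplier) =>
        (byte)Mathf.Clamp(baseAlpha + modifier * multiplier, 0, 255);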

Note that unlike ETC1 color, ETC2 alpha is encoded using a single base alpha value per 4×4 pixel block. This means that each element of the alpha endpoint codebook should correspond to a 4×4 pixel block, covering both the primary and secondary ETC1 subblocks. For this reason, the alpha channel can be ignored when performing color endpoint tiling.

The compression scheme for ETC2 alpha blocks is equivalent to the compression scheme for DXT5 alpha blocks. As shown before, the vector representation of alpha endpoints doesn't depend on the encoding used. This means that all the initial processing steps, including alpha endpoint quantization, are almost identical for DXT5 and ETC2 alpha channels. The only part which actually differs for the ETC2 alpha encoding is the final alpha endpoint optimization step.

In order to perform ETC2 alpha endpoint optimization, the existing DXT5 alpha endpoint optimization algorithm is first run to obtain an initial approximate solution. The approximate solution is then refined based on the ETC2 alpha modifier table values. Note that the ETC2 format supports 16 different alpha modifier indices, but for performance reasons only 2 are currently used: modifier index 13, which allows precise approximation over short alpha intervals, and modifier index 11, which has more or less regularly distributed values and is used for large alpha intervals.

At first it might seem that the different sizes of the color and alpha blocks could complicate things for Crunch, since according to the original algorithm both color and alpha endpoints should share the same endpoint references. This, however, is easily resolved in the following way: each alpha block uses the endpoint reference of the corresponding primary color subblock (which allows the alpha endpoint to be copied from the left, top, top-left or from the input stream), while the endpoint reference of the secondary color subblock is simply ignored when decoding the alpha channel.

Closing summary

This research demonstrates that the Crunch compression algorithm is not limited to the DXT format and, with some modifications, can be used on different GPU texture formats. We see potential to expand this work to cover further texture formats in the future.

Source: https://blog.csdn.net/cubesky/article/details/78230390 (cubesky, 2017/10/13 21:04:33)
1. Concatenation is the process of appending one string to the end of another. When you concatenate string literals or string constants using the + operator, the compiler creates a single string; no run-time concatenation occurs. String variables, however, can only be concatenated at run time, and in this case you should understand the performance implications of the various approaches.

2. If you are not concatenating large numbers of strings (for example, in a loop), the performance cost of this code is probably not significant. The same is true for the String.Concat and String.Format methods. However, when performance is important, you should always use the StringBuilder class to concatenate strings.

3. This is the most interesting test because we have several options here. We can concatenate strings with +, String.Concat, String.Join and StringBuilder.Append.

And the winner for string concatenation is ... not StringBuilder but String.Join? After taking a deep look with Reflector, I found that String.Join has the most efficient algorithm: it allocates the final buffer size in a first pass and then memcopies each string into the just-allocated buffer. This is simply unbeatable. StringBuilder does become better than the + operator above 7 strings, but this is not really code one would see very often.
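For reference, here are the four approaches side by side in C# (a minimal sketch; the strings are arbitrary):

    using System.Text;

    string a = "Hello" + " " + "World";                        // literals: folded at compile time
    string b = string.Concat("Hello", " ", "World");           // run-time concat for variables
    string c = string.Join(" ", new[] { "Hello", "World" });   // one-pass buffer allocation

    var sb = new StringBuilder();                              // amortized growth; best in loops
    sb.Append("Hello").Append(' ').Append("World");
    string d = sb.ToString();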

So it seems that when OnRenderImage is being used, or when HDR rendering is on, or when anything that requires the camera to render to an intermediate buffer is enabled, the camera automatically renders to an intermediate buffer instead of directly to the backbuffer (screen). This is nice.

However, when Unity cannot predict that the camera is going to need an intermediate buffer and your command buffer does a Blit(), things go wrong: since Unity cannot know in advance what you're doing in the command buffer, it grabs the image from the back buffer again, and the Blit() (which does not know where the input is coming from) flips it.

The issue can be fixed by enabling the camera's forceIntoRenderTexture. This stops Unity from rendering directly to the back buffer and always renders the camera to an intermediate buffer (regardless of HDR state, whether or not image effects are present, etc.)
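In code this is a single flag (Camera.forceIntoRenderTexture is a standard Unity property; the component name is mine):

    using UnityEngine;

    public class ForceIntermediateBuffer : MonoBehaviour
    {
        void OnEnable()
        {
            // Force the camera to render into an intermediate buffer so a
            // command-buffer Blit() no longer samples (and flips) the back buffer.
            GetComponent<Camera>().forceIntoRenderTexture = true;
        }
    }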

Source: https://blog.csdn.net/cubesky/article/details/74276109 (cubesky, 2017/07/03 21:07:30)

Unity 5 is a very popular video game engine on mobile devices and other platforms. Being a state-of-the-art game engine, it supports everything you
might need when it comes to character animation including compression.

The relevant FBX Importer and Animation Clip documentation is very sparse. It's worth mentioning that Unity 5 is closed source software and, as such, there is some amount of uncertainty and speculation. However, I was able to get in touch with an old colleague working at Unity to clarify what happens under the hood.

Before we dig into what each compression setting does we must first briefly cover the data representations that Unity 5 uses internally.

Track Data Encoding

The engine uses one of three encodings to represent an animation track, regardless of the track data type (quaternion, vector, float, etc.); these are described below.

Rotation tracks are always encoded as four curves representing a full quaternion (one curve per component). An obvious win here would be to instead encode rotations as quaternion logarithms, or to drop the quaternion W component or the largest component. This would immediately reduce the memory footprint of rotation tracks by 25% at the expense of a few instructions to reconstruct the original quaternion.

Legacy Curve

Legacy curves are a strange beast. The source data is sampled uniformly at a fixed interval such as 30 FPS and kept unsorted at full precision. During decompression, a Hermite curve is constructed on the fly from the discrete samples and interpolated. It is unclear to me how this format emerged, but it has since been superseded by the other two and is not frequently used.

It must be quite slow to decompress and should probably be avoided.

Streaming Curve

Streaming curves are proper curves that use Hermite coefficients. A track is split into intervals, and each interval is encoded as a distinct spline. This allows discontinuities between intervals (for example, a camera cut or teleporting the root in a cinematic). Each interval has a small header of 8 bytes, and each control point is stored at full floating-point precision plus an index. This is likely overkill: full floating-point precision is typically far more than needed for encoding rotations, and simple quantization storing them on 16 bits per component or less could provide significant memory savings.

The resulting control points are sorted by time, then by track, to make them as cache-efficient as possible, which is a very good thing. At decompression, a cursor or cache is used to avoid repeatedly searching for control points when playback is continuous and predictable. For these two reasons, streaming curves are very fast to decompress in the average use case.
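For context, here is a minimal C# sketch of cubic Hermite interpolation between two control points, the kind of evaluation a streaming curve performs per track (the exact coefficient layout Unity stores is not documented, so this is illustrative):

    // Cubic Hermite interpolation between values p0 and p1 with tangents
    // m0 and m1, at normalized time t in [0, 1].
    static float EvaluateHermite(float p0, float m0, float p1, float m1, float t)
    {
        float t2 = t * t;
        float t3 = t2 * t;
        return (2f * t3 - 3f * t2 + 1f) * p0
             + (t3 - 2f * t2 + t) * m0
             + (-2f * t3 + 3f * t2) * p1
             + (t3 - t2) * m1;
    }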

Dense Curve

What Unity 5 calls a dense curve I would call a raw format. The original source data is sampled at a fixed interval such as 30 FPS, and nothing more is done to it as far as I am aware. The data is sorted by time and track to make it cache-efficient. No linear key reduction is performed or attempted. The sampled values are not quantized and are simply stored at full precision.

Dense curves will typically have a smaller memory footprint than streaming curves only for very short tracks or for tracks where the data is very noisy, such as motion capture. For this reason, they are unlikely to be used in practice.

Overall their implementation is simple but perhaps a bit naive. Using simple quantization would give significant memory gains here without degrading decompression performance, and might even speed it up. On the upside, decompression speed is very likely to be faster than with streaming curves.

Compression Settings

No Compression

The most detailed quote from the documentation about what this setting does is:

Disables animation compression. This means that Unity doesn’t reduce keyframe count on import, which leads to the highest precision animations, but slower performance and bigger file and runtime memory size. It is
generally not advisable to use this option - if you need higher precision animation, you should enable keyframe reduction and lower allowed Animation Compression Error values instead.

From what I could gather, the originally imported clip is sampled uniformly (e.g. 30 FPS) and each track is converted into a streaming curve. This ensures everything remains smooth and accurate, but the overhead can be very significant since all samples are retained. To make things worse, nothing is done for constant tracks with this setting.

Keyframe Reduction

The most detailed quote from the documentation is:

Removes redundant keyframes.

When this setting is used, constant tracks are collapsed to a single value, and redundant control points in animated tracks are removed within the specified error threshold. This setting uses streaming curves to represent track data.

Only three thresholds are exposed for this compression setting, one per track type: rotation, translation, and scale. This is very likely to lead to the problems discussed in my post on measuring accuracy. And indeed, a quick search yields this gem. Even though it dates from Unity 3 (2010), I doubt the animation compression has changed much. Unfortunately, the problems it raises are both very telling and common with local-space error metric functions. Here are some relevant excerpts:

Now, you may be asking yourself, why would this guy turn off the key reducer in the first place? The answer is simple. The key reducer sucks. Here’s why.

Every animation I have completed for this project uses planted keys to anchor the feet (and sometimes hands) to the floor. This allows me to grab any part of the body and animate it, knowing that the feet will not move. When I export the FBX, the keys stay intact. I can bring the animation back into Max or into Maya using the keyframe reducer for either software, and the feet remain anchored. When I bring the same FBX into Unity the feet slide around. Often quite noticeably. The only way to stop the feet from sliding is to turn off the key reducer.

This is a very common problem with local-space error functions. Tweaking them is hard! The end result is that, very often, weaker compression or no compression at all is used when issues are found on a clip-by-clip basis. I have seen this exact behavior from animators working on Unreal 3 back in the day and, very recently, in a proprietary AAA game engine. Even though from the user's perspective the issue is the animation compression algorithm, in reality the issue is almost entirely due to the error function.

What I would really like to see is some options within Unity’s animation importer. A couple ideas:

1) Max’s FBX keyframe reduction has several precision threshold settings that dictate how accurate the keyframe reduction should be. In Unity, it’s all or nothing. I would love the ability to adjust the threshold
in which a keyframe gets added. I could turn up the sensitivity on some animations to reduce sliding and possibly turn it down on areas that need fewer keys than are given by the current value.

2) I’m not sure if this is possible, but it would be great to set keyframe reductions on some bones and not others. That way I can keep the arm chain in the proper location without having to bloat the keyframes of
every bone in the whole skeleton.

Exposing a single error threshold per track type is very common and provides a source of frustration for animators. They often know which bones need higher accuracy but are unable to properly tweak per bone thresholds. Sadly, when this feature is present the
settings often end up being copy & pasted with overly conservative values which yield a higher than needed memory footprint. Nobody has time to tweak a large number of thresholds repeatedly.

Unity 3 actually corrects the problem by giving us direct control over the keyframe reduction vs. allowable error. If you find your animation is sliding too much, dial down the Position Error and/or Rotation Error settings in the animation import settings.

Unfortunately, I didn’t find any satisfying setup :/

I got some characters that move super fast in their anim, and I need the player to see the move correctly for gameplay / balance reasons.

So it can work for some anims, but not for others (making them feel like they are teleporting).

And under a certain reduction threshold, the memory size benefit is too small to solve the loading time problem :/

In fact, the only reduction setting I found that didn't cause teleportations was:

Position : 0.1
Rotation : 0.1
Scale : 0 (as there is never any animated scale)

But this is still causing huge file sizes :(

A single error threshold per track type also means that the error threshold has to be as low as your most sensitive bone requires. This will, in turn, retain higher accuracy than might otherwise be needed, again yielding a higher memory footprint that is often unacceptable.

Optimal

The most detailed quote from the documentation is:

Let unity decide how to compress. Either by keyframe reduction or by using dense format. Unity will pick the most optimal of the two.

If a track is very short or very noisy (which can happen with motion capture clips or baked simulations), the key reduction algorithm might not give appreciable gains, and a dense curve might end up with a smaller memory footprint than a streaming curve. When this happens for a particular track, the curve with the smallest memory footprint is used. As such, within a single clip we can have a mix of dense and streaming curves.

Conclusion

The Unity 5 documentation is sparse and at times unclear. It leads to rampant speculation as to what might be going on under the hood and a lot of
confusing results.

Its error function is poor, exposing a single value per track type. This leads to classic issues such as turning compression off to retain accuracy and using an overly conservative threshold to retain accuracy at the expense of the memory footprint. It perpetuates
the stigma that animation compression can be painful to work with and can easily butcher an animator’s work without manual tweaking. Fixing the error function could be a reasonably simple task.

The optimal compression setting seems to be a very reasonable default value but it is not clear why the other two are exposed at all. Users are very likely to use one of the other settings instead of tweaking the error function thresholds which is probably
a bad idea.

All curve types encode the data at full precision with 32-bit floating-point numbers. This is likely overkill in a very large number of scenarios, and implementing some form of simple quantization could provide huge memory gains with little effort. Due to the reduced memory footprint, decompression timings might even improve.

Furthermore, rotation tracks could be encoded in a better format than a full quaternion, further reducing the memory footprint for minimal work.

From what I could find, nobody seems to complain about animation decompression performance at runtime. This is most likely a direct result of the cache-friendly data format and the use of a cursor for streaming curves.

Source: https://blog.csdn.net/cubesky/article/details/63757260 (cubesky, 2017/03/19 22:46:14)
A Bi-directional Reflectance Distribution Function (BRDF) is a mathematical function that describes how light is reflected when it hits a surface. This largely corresponds to a lighting model in Unity-speak (although note that BRDFs are concerned only with reflected light, whereas lighting models can also account for emitted light and other lighting effects).

The “bi-directional” bit refers to the fact that the function depends on two directions:

the direction at which light hits the surface of the object (the direction of incidence, ωi)

the direction at which the reflected light is seen by the viewer (the direction of reflection, ωr).

These are typically both defined relative to the normal vector of the surface, n, as shown in the following diagram (rather than simple angles, each direction
is actually modelled in the BRDF using spherical coordinates (θ, φ), making the BRDF a four-dimensional function):

Given that our perception of a material is determined to a large extent by its reflectance properties, it’s understandable that several different BRDFs have been developed, with different effectiveness and efficiency at modelling different types of surfaces:

Lambert: Models perfectly diffuse smooth surfaces, in which
apparent surface brightness is affected only by angle of incident light. The observer’s angle of view has no effect.

Phong and Blinn-Phong:
Models specular reflections on smooth shiny surfaces by considering both the direction of incoming light and that of the viewer.

Oren-Nayar: Models diffuse reflection from rough
opaque surfaces (considers the surface to be made from many Lambertian micro-facets).

Torrance-Sparrow:
Models specular reflection from rough opaque surfaces (considers surface to be made from many mirrored micro-facets).
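For a sense of how cheap the first two are, here is a sketch of their core terms using Unity's vector types (the function names and the shininess parameter are illustrative; all vectors are assumed normalized):

Code (csharp):

using UnityEngine;

public static class SimpleBrdfs
{
    // Lambert: brightness depends only on the angle of the incoming light;
    // the view direction does not appear at all.
    public static float Lambert(Vector3 normal, Vector3 lightDir)
    {
        return Mathf.Max(0f, Vector3.Dot(normal, lightDir));
    }

    // Blinn-Phong specular term: depends on both the light and view
    // directions, via the half vector between them.
    public static float BlinnPhong(Vector3 normal, Vector3 lightDir, Vector3 viewDir, float shininess)
    {
        Vector3 halfDir = (lightDir + viewDir).normalized;
        return Mathf.Pow(Mathf.Max(0f, Vector3.Dot(normal, halfDir)), shininess);
    }
}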

BRDF Maps

In the case of game development, it’s often not necessary to strive for physically-accurate BRDF models of how a surface reacts to light. Instead, it’s sufficient to aim for something that “looks” right. And that’s where BRDF maps come in (sometimes also called
“Fake BRDF”).

A BRDF map is a two-dimensional texture. It’s used in a similar way to a one-dimensional “ramp” texture, which is commonly used to look up replacement values for individual lighting coefficients. However, the BRDF map represents a different parameter on each
of its two axes – the incoming light direction and the viewing direction, as shown below:

A shader can use a tex2D lookup based on these two parameters to retrieve the pixel colour value for any point on a surface as a very cheap way of modelling light reflection. Here’s an example Cg BRDF surface shader:

Shader "Custom/BRDFMap" {
    // Note: the property and function names here (e.g. _BRDF, LightingBRDF)
    // are illustrative; only the lighting body and surf function below are
    // from the original listing.
    Properties {
        _MainTex ("Texture", 2D) = "white" {}
        _BRDF ("BRDF Map", 2D) = "white" {}
    }
    SubShader {
        Tags { "RenderType" = "Opaque" }
        CGPROGRAM
        #pragma surface surf BRDF

        sampler2D _BRDF;

        half4 LightingBRDF (SurfaceOutput s, half3 lightDir, half3 viewDir, half atten) {
            // Index the BRDF map by light angle (x) and view angle (y),
            // remapped from [-1, 1] to [0, 1].
            half NdotL = dot(s.Normal, lightDir) * 0.5 + 0.5;
            half NdotV = dot(s.Normal, viewDir) * 0.5 + 0.5;
            half3 brdf = tex2D(_BRDF, float2(NdotL, NdotV)).rgb;

            half4 c;
            // For illustrative purposes, let's set the pixel colour based entirely on the BRDF texture.
            // In practice, you'd normally also have Albedo and light colour terms here too.
            c.rgb = brdf * (atten * 2);
            c.a = s.Alpha;
            return c;
        }

        struct Input {
            float2 uv_MainTex;
        };

        sampler2D _MainTex;

        void surf (Input IN, inout SurfaceOutput o) {
            o.Albedo = tex2D (_MainTex, IN.uv_MainTex).rgb;
        }
        ENDCG
    }
    Fallback "Diffuse"
}

And here’s the image it produces – notice how the shading varies from red to yellow based on view direction, and from light to dark based on direction to the light source.

As a slightly less trivial example, here’s another BRDF texture map that again uses light direction relative to the surface on the x axis, but this time, instead of using view direction, uses curvature of the surface on the y axis (the gradient in the y axis is
quite subtle, but you should be able to see a reddish hue towards the top centre of the image, and a blueish tint at the top right):

This map can be used to generate convincing diffuse reflection of skin that varies across the surface of the model (such that, say, the falloff at the nose appears different from the forehead), as shown here:

When obj is declared as type object but references a destroyed GameObject, the null check will fail.

Reason

C# operators are overloaded -
not overridden. This means that their implementation
is determined at compile time - not runtime, and so is decided by the type of the reference - not the object being referenced.

In your case you are doing a null check on a variable of type object, which will invoke the == operator as defined for System.Object, the functionality of which is
to check if the actual reference is equal to null, which it will not be for a reference to a destroyed GameObject.

To achieve the functionality you want, you need to also perform a null check with a reference of type GameObject, thus invoking the == operator as defined for GameObject,
the functionality of which is to also equal null if the GameObject has been destroyed.
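A minimal sketch of the difference (class and variable names are illustrative):

Code (csharp):

using UnityEngine;

public class NullCheckDemo : MonoBehaviour
{
    void Start()
    {
        GameObject go = new GameObject("victim");
        DestroyImmediate(go);

        object asObject = go;          // static type: System.Object
        GameObject asGameObject = go;  // static type: GameObject

        // System.Object's == compares the raw reference: prints False.
        Debug.Log(asObject == null);

        // GameObject uses UnityEngine.Object's overloaded ==, which also
        // reports true for a destroyed object: prints True.
        Debug.Log(asGameObject == null);
    }
}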

factor

Specifies a scale factor that is used to create a variable depth offset for each polygon. The initial value is 0.

units

Is multiplied by an implementation-specific value to create a constant depth offset. The initial value is 0.

When GL_POLYGON_OFFSET_FILL, GL_POLYGON_OFFSET_LINE,
or GL_POLYGON_OFFSET_POINT is enabled, each fragment's depth value
will be offset after it is interpolated from the depth values of the appropriate vertices. The
value of the offset is factor × DZ + r × units, where DZ is a measurement of the change in depth relative to the screen area of the polygon, and r is the smallest value that is guaranteed to produce a resolvable offset for a given implementation. The offset is added before the depth test is performed and before the value is written into the depth buffer.

]]>
https://blog.csdn.net/cubesky/article/details/51462352
https://blog.csdn.net/cubesky/article/details/51462352cubesky2016/05/20 15:55:52
For imported meshes to be usable with Combine(), the "Read/Write Enabled" flag must be set in the mesh import settings.
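For context, a typical combine looks something like the following sketch (component layout and names are illustrative); reading the source meshes here is what triggers the requirement:

Code (csharp):

using UnityEngine;

[RequireComponent(typeof(MeshFilter))]
public class MeshCombiner : MonoBehaviour
{
    void Start()
    {
        MeshFilter[] filters = GetComponentsInChildren<MeshFilter>();
        var combine = new CombineInstance[filters.Length];

        for (int i = 0; i < filters.Length; i++)
        {
            // Fails here if "Read/Write Enabled" is off on the imported mesh.
            combine[i].mesh = filters[i].sharedMesh;
            combine[i].transform = filters[i].transform.localToWorldMatrix;
        }

        var combined = new Mesh();
        combined.CombineMeshes(combine);
        GetComponent<MeshFilter>().sharedMesh = combined;
    }
}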

]]>
https://blog.csdn.net/cubesky/article/details/51015802
https://blog.csdn.net/cubesky/article/details/51015802cubesky2016/03/30 15:57:53Unity Serialization

So you are writing a really cool editor extension in Unity and things seem to be going really well. You get your data structures all sorted
out and are really happy with how the tool you have written works.

Then you enter and exit play mode.

Suddenly all the data you had entered is gone and your tool is reset to its default, just-initialized state. It’s very frustrating! “Why does
this happen?” you ask yourself. The reason has to do with how the managed (Mono) layer of Unity works. Once you understand it, things get much easier.

What happens when an assembly is reloaded?

When you enter / exit play mode or change a script, Unity has to reload the Mono assemblies, that is, the DLLs associated with Unity.

On the user side this is a 3 step process:

Pull all the serializable data out of managed land, creating an internal representation of the data on the C++ side of Unity.

Destroy all memory / information associated with the managed side of Unity, and reload the assemblies.

Reserialize the data that was saved in C++ back into managed land.

What this means is that for your data structures / information to survive an assembly reload, you need to ensure that they can be serialized into
and out of C++ memory properly. Doing this also means that (with some minor modifications) you can save the data structure to an asset file and reload it at a later date.

How do I work with Unity's serialization?

The easiest way to learn about Unity serialization is by working through an example. We are going to start with a simple editor window; it
contains a reference to a class which we want to make survive an assembly reload.
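A minimal sketch of that starting point (the names MyWindow, SerializeMe, NestedStruct and m_SerializedThing are the ones used below; the field layout is illustrative):

Code (csharp):

using UnityEditor;
using UnityEngine;

// MyWindow.cs
public class MyWindow : EditorWindow
{
    private SerializeMe m_SerializedThing;

    [MenuItem("Window/Serialization Example")]
    static void Open() { GetWindow<MyWindow>(); }

    void OnGUI()
    {
        if (m_SerializedThing == null)
            m_SerializedThing = new SerializeMe();
        m_SerializedThing.OnGUI();
    }
}

// SerializeMe.cs
public class SerializeMe
{
    public struct NestedStruct
    {
        public int m_StructValue;
    }

    private int m_Value;
    private NestedStruct m_Struct;

    public void OnGUI()
    {
        m_Value = EditorGUILayout.IntField("Value", m_Value);
        m_Struct.m_StructValue = EditorGUILayout.IntField("Struct Value", m_Struct.m_StructValue);
    }
}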

When you run this and force an assembly reload, you will notice that any value you have changed in the window will not survive. This is because
when the assembly is reloaded the reference to ‘m_SerializedThing’ is gone. It is not marked up to be serialized.

There are a few things that need to be done to make this serialization work properly.

In MyWindow.cs:

The field ‘m_SerializedThing’ needs to have the attribute [SerializeField] added to it. What this tells Unity is that it should attempt to serialize this field on assembly reload or similar events.

In SerializeMe.cs:

The class ‘SerializeMe’ needs to have the [Serializable] attribute added to it. This tells Unity that the class is serializable.

The struct ‘NestedStruct’ needs to have the [Serializable] attribute added to it.

Each (non public) field that you want to be serialized needs to have the [SerializeField] attribute added to it.

After adding these flags, open the window and modify the fields. You will notice that after an assembly reload the fields retain their
values; that is, apart from the field that came from the struct. This brings up the first important point: structs are not well supported for serialization. Changing ‘NestedStruct’ from a struct to a class fixes this issue.
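Applied to the sketch above, the changes look like this:

Code (csharp):

using System;
using UnityEditor;
using UnityEngine;

// MyWindow.cs
public class MyWindow : EditorWindow
{
    [SerializeField] private SerializeMe m_SerializedThing;   // now survives reloads
    // ... OnGUI as before ...
}

// SerializeMe.cs
[Serializable]
public class SerializeMe
{
    [Serializable]
    public class NestedStruct          // changed from struct to class
    {
        public int m_StructValue;      // public fields need no attribute
    }

    [SerializeField] private int m_Value;
    [SerializeField] private NestedStruct m_Struct = new NestedStruct();
    // ... OnGUI as before ...
}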

Classes you want to be serializable need to be marked with [Serializable]

Public fields are serialized (so long as they reference a [Serializable] class)

Private fields are serialized under some circumstances (editor).

Mark private fields as [SerializeField] if you wish them to be serialized.

[NonSerialized] exists for fields that you do not want to serialize

Scriptable Objects

So far we have looked at using normal classes when it comes to serialization. Unfortunately, using plain classes has some issues when it comes
to serialization in Unity. Let's take a look at an example.
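A sketch of the example in question (m_Class1 / m_Class2 are the fields discussed below; the rest is illustrative):

Code (csharp):

using System;
using UnityEditor;
using UnityEngine;

[Serializable]
public class NestedClass
{
    public int m_Value;
}

public class SharedReferenceWindow : EditorWindow
{
    [SerializeField] private NestedClass m_Class1;
    [SerializeField] private NestedClass m_Class2;

    void OnGUI()
    {
        if (m_Class1 == null)
            m_Class1 = m_Class2 = new NestedClass();   // both fields share one instance

        // Editing either field appears to edit both... until a reload decouples them.
        m_Class1.m_Value = EditorGUILayout.IntField("Class 1", m_Class1.m_Value);
        m_Class2.m_Value = EditorGUILayout.IntField("Class 2", m_Class2.m_Value);
    }
}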

This is a contrived example to show a very specific corner case of the Unity serialization system that can catch you if you are not careful.
You will notice that we have two fields of type NestedClass. The first time the window is drawn it will show both the fields, and as m_Class1 and m_Class2 point to the same reference, modifying one will modify the other.

Now try reloading the assembly by entering and exiting play mode... The references have been decoupled. This is due to how serialization
works when you mark a class simply as [Serializable].

When you are serializing standard classes Unity walks through the fields of the class and serializes each one individually, even if the reference
is shared between multiple fields. This means that you could have the same object serialized multiple times, and on deserialization the system will not know they are really the same object. If you are designing a complex system this is a frustrating limitation
because it means that complex interactions between classes can not be captured properly.

Enter ScriptableObjects! ScriptableObjects are a type of class that correctly serializes as references, so that they only get serialized once.
This allows complex class interactions to be stored in a way that you would expect. Internally in Unity ScriptableObjects and MonoBehaviours are the same; in userland code you can have a ScriptableObject that is not attached to a GameObject; this is different
to how MonoBehaviour works. They are great for general data structure serialization.
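Reworking the previous sketch around ScriptableObject looks like this:

Code (csharp):

using UnityEditor;
using UnityEngine;

public class NestedClass : ScriptableObject   // now a ScriptableObject
{
    public int m_Value;
}

public class SharedReferenceWindow : EditorWindow
{
    [SerializeField] private NestedClass m_Class1;
    [SerializeField] private NestedClass m_Class2;

    void OnGUI()
    {
        if (m_Class1 == null)
        {
            // Created via CreateInstance<>, never via the constructor.
            m_Class1 = m_Class2 = CreateInstance<NestedClass>();
            m_Class1.hideFlags = HideFlags.HideAndDontSave;
        }

        m_Class1.m_Value = EditorGUILayout.IntField("Class 1", m_Class1.m_Value);
        m_Class2.m_Value = EditorGUILayout.IntField("Class 2", m_Class2.m_Value);
    }
}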

We create an instance using the CreateInstance<> function instead of calling the constructor.

We also set the hide flags... this will be explained later.

These simple changes mean that the instance of the NestedClass will only be serialized once, with each of the references to the class pointing
to the same one.

ScriptableObject Initialization

So now we know that for complex data structures where external referencing is needed, it is a good idea to use ScriptableObjects. But what is
the correct way to work with ScriptableObjects from user code? The first thing to examine is HOW ScriptableObjects are initialized, especially by the Unity serialization system.

The constructor is called on the ScriptableObject

Data is serialized into the object from the C++ side of Unity (if such data exists)

OnEnable() is called on the ScriptableObject

Working with this knowledge there are some things that we can say:

Doing initialization in the constructor isn’t a very good idea, as data will potentially be overwritten by the serialization system.

Serialization happens AFTER construction, so we should do our configuration stuff after serialization.

OnEnable() seems like the best candidate for initialization.

Let’s make some changes to the ‘SerializeMe’ class so that it is a ScriptableObject. This will allow us to see the correct initialization pattern
for ScriptableObjects.

Code (csharp):

// also updated the Window to call CreateInstance instead of the constructor
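// A minimal sketch of the reworked class (NestedClass as sketched earlier;
// the exact fields are illustrative):

using UnityEngine;

public class SerializeMe : ScriptableObject
{
    [SerializeField] private NestedClass m_Nested;

    void OnEnable()
    {
        hideFlags = HideFlags.HideAndDontSave;

        // OnEnable runs AFTER deserialization: a null field means this is a
        // brand-new instance; a non-null field was loaded back from C++.
        if (m_Nested == null)
            m_Nested = CreateInstance<NestedClass>();
    }
}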

On the surface it seems that we have not really changed this class much: it now inherits from ScriptableObject and, instead of using a constructor,
has an OnEnable(). The important part to take note of is slightly more subtle... OnEnable() is called AFTER serialization; because of this we can check whether the [SerializeField] fields are null or not. If they are null, it indicates that this is the first initialization,
and we need to construct the instances. If they are not null, then they have been loaded into memory and do NOT need to be constructed. It is common in OnEnable() to also call a custom initialization function to configure any private / non-serialized fields
on the object, much like you would do in a constructor.

HideFlags

In the examples using ScriptableObjects you will notice that we are setting the ‘hideFlags’ on the object to HideFlags.HideAndDontSave. This
is a special setup that is required when writing custom data structures that have no root in the scene, and it works around how the garbage collector works in Unity.

When the garbage collector runs, it (for the most part) uses the scene as ‘the root’ and traverses the hierarchy to see what can get GC’d.
Setting the HideAndDontSave flag on a ScriptableObject tells Unity to consider that object a root object. Because of this it will not simply disappear due to a GC pass or assembly reload. The object can still be destroyed by calling Destroy().

Some ScriptableObject Rules

ScriptableObjects will only be serialized once, allowing you to use references properly

Use OnEnable to initialize ScriptableObjects

Don’t ever call the constructor of a ScriptableObject; use CreateInstance instead.

For nested data structures that are only referenced once, don’t use ScriptableObject, as it has more overhead.

If your scriptable object is not rooted in the scene, set the hideFlags to HideAndDontSave.

Concrete Array Serialization

Let’s have a look at a simple example that serializes a range of concrete classes.
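A sketch of such a window (‘Add Simple’ is the button named below; the window and class layout are illustrative):

Code (csharp):

using System;
using System.Collections.Generic;
using UnityEditor;
using UnityEngine;

[Serializable]
public class BaseClass
{
    [SerializeField] private int m_Value;

    public void OnGUI()
    {
        m_Value = EditorGUILayout.IntField("Value", m_Value);
    }
}

public class ListWindow : EditorWindow
{
    [SerializeField] private List<BaseClass> m_Instances = new List<BaseClass>();

    void OnGUI()
    {
        if (GUILayout.Button("Add Simple"))
            m_Instances.Add(new BaseClass());

        foreach (var instance in m_Instances)
            instance.OnGUI();
    }
}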

This basic example has a list of BaseClass instances; clicking the ‘Add Simple’ button creates an instance and adds it to the list. Because the
class is configured properly for serialization (as discussed before for SerializeMe), it ‘just works’. Unity sees that the List is marked for serialization and serializes each of the list elements.

General Array Serialization

Let’s modify the example to serialize a list that contains members of both a base class and a child class:
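A sketch of the addition (the extra field is illustrative; BaseClass is as sketched above):

Code (csharp):

using System;
using UnityEngine;

[Serializable]
public class ChildClass : BaseClass
{
    // Extra state: after an assembly reload this is stripped, because the
    // list holding the instance is declared as List<BaseClass>.
    [SerializeField] private float m_ChildValue;
}

// In the window, an 'Add Child' button would do:
//     m_Instances.Add(new ChildClass());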

The example has been extended so that there is now a ChildClass, but we are serializing using the BaseClass. If you create a few instances of
ChildClass and BaseClass they will render properly. Issues arise when they are put through an assembly reload. After the reload completes, every instance will be a BaseClass, with all the ChildClass information stripped. The instances are being sheared
by the serialization system.

The way to work around this limitation of the serialization system is to once again use ScriptableObjects:
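A sketch of the reworked classes (names as before; creation now goes through CreateInstance):

Code (csharp):

using UnityEngine;

public class BaseClass : ScriptableObject
{
    [SerializeField] private int m_Value;
}

public class ChildClass : BaseClass
{
    [SerializeField] private float m_ChildValue;   // now survives the reload
}

// ...and in the window:
//     m_Instances.Add(CreateInstance<ChildClass>());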

After running this, changing some values, and reloading assemblies, you will notice that ScriptableObjects are safe to use in arrays even if
you are serializing derived types. The reason is that when you serialize a standard [Serializable] class it is serialized ‘in place’, but a ScriptableObject is serialized externally, with a reference inserted into the collection. The shearing occurs because
the type cannot be properly serialized, as the serialization system thinks it is of the base type.

Serializing Abstract Classes

So now we have seen that it’s possible to serialize a general list (so long as the members are of type ScriptableObject). Let’s see how abstract
classes behave:

This code, much like the previous example, works. But it IS dangerous. Let’s see why.

The function CreateInstance<>() expects a type that inherits from ScriptableObject, and the class ‘MyBaseClass’ does in fact inherit from ScriptableObject.
This means that it’s possible to add an instance of the abstract class MyBaseClass to the m_Instances array. If you do this and then try to access an abstract method, bad things will happen because there is no implementation of that function. In this specific
case that would be the OnGUI method.

Using abstract classes as the serialized type for lists and fields DOES work, so long as they inherit from ScriptableObject, but it is not
a recommended practice. Personally I think it’s better to use concrete classes with empty virtual methods. This ensures that things will not go bad for you.

It seems like there is some sort of bug with shaders / materials regarding render queues. I have a shader that is set to be in Geometry+10, and another shader that is set to be in the Geometry queue. In the frame debugger the shader
that is supposed to be in the Geometry+10 queue is drawn first! And the shader that is supposed to be in the Geometry queue is drawn in the alpha pass!

Solution:

It's been reported for a while by several of us since 5.0. When you change the shader a material uses,
it sets the material's queue
to that of the shader at the moment you change it. This is very frustrating for shader development. The easiest fix is, as you said, to change to another shader and back, but another way is to right click on the inspector, select Debug, and then change the
custom queue to -1. This is a special setting that means "use the shader's queue" and is what it was set to prior to 5.0.
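The same reset can also be done from script (a small sketch; Material.renderQueue is the documented property):

Code (csharp):

using UnityEngine;

public static class MaterialQueueFix
{
    // -1 is the sentinel meaning "use the queue declared by the shader".
    public static void UseShaderQueue(Material material)
    {
        material.renderQueue = -1;
    }
}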