Texture compression:
Texture compression is a process used to reduce the file size of texture images, which are essential for detailing 3D models. This technique allows for more efficient storage and faster loading times while minimizing the impact on graphical quality. Compression can be lossy, where some data is lost to achieve smaller sizes, or lossless, preserving the original image data. Efficiently compressed textures are crucial for optimizing performance, particularly in real-time applications like video games or interactive media, where memory and bandwidth are limited.
Start from lossless source textures, such as PNG files, before passing assets through the compression pipelines built into engines like Unreal, Unity, and PlayCanvas. This avoids the pitfalls of double compression.
Feeding an already lossy-compressed source (a JPEG, for example) into the engine's internal pipeline usually yields no meaningful size reduction and can noticeably degrade quality. GPU block formats are fixed-rate, so the size of the compressed texture depends on its resolution and target format, not on how small the source file was. Meanwhile, the artifacts introduced by the first lossy pass are baked into the pixels, and the second pass adds its own on top of them. Each additional lossy pass therefore compromises the texture's fidelity further without any gain in size.
In-engine compression pipelines:
In-engine compression formats vary across game engines but commonly include DXT (also known as S3TC, and exposed in Direct3D as the BC formats) for desktop platforms, PVRTC for iOS devices with PowerVR GPUs, ETC (Ericsson Texture Compression) for Android devices, and ASTC (Adaptive Scalable Texture Compression), which offers a wide range of compression options to suit different graphical fidelity and performance needs.
Color models like RGB and RGBA cater to different use cases based on the need for transparency in textures. Choosing between them depends on the specific requirements of the texture's application in the game or 3D scene, balancing file size with visual fidelity.
In a linear color space, stored values are directly proportional to the amount of light emitted. This is ideal for rendering calculations, because lighting math such as adding or scaling light contributions only behaves correctly on linear values. A non-linear color space, often referred to as gamma space, encodes colors to account for the non-linear way human vision perceives brightness, making it better suited for storing and displaying images on screens. Each space serves a different phase of digital content creation: linear for creating and processing images, and non-linear for viewing them on display devices. RGB and sRGB are not the same thing as linear and non-linear color spaces, but they are closely related.
RGB (Red, Green, Blue) is a color model used to represent colors in digital displays and images. It describes colors based on the intensity of red, green, and blue components.
sRGB (standard Red Green Blue) is a specific RGB color space that was created to standardize how colors are displayed across different devices, such as monitors, printers, and cameras. It defines a specific gamma curve (a non-linear transformation) to account for the way human vision perceives brightness.
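As a rough illustration of the gamma curve mentioned above, the following Python sketch implements the standard sRGB piecewise transfer function in both directions. The function names are illustrative and not taken from any engine API; engines normally perform this conversion in hardware when a texture is flagged as sRGB.

```python
def srgb_to_linear(c: float) -> float:
    """Decode an sRGB-encoded value in [0, 1] to a linear-light value."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c: float) -> float:
    """Encode a linear-light value in [0, 1] with the sRGB curve."""
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * (c ** (1.0 / 2.4)) - 0.055
```

A mid-gray sRGB value of 0.5 decodes to roughly 0.214 in linear space, which shows how far apart the two representations are for the same stored number.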
Choosing between RGB vs RGBA, RGB vs sRGB and linear vs non-linear color spaces is an integral part of the texture compression pipeline, impacting the final appearance and performance of digital content.
The JPEG format and many non-professional tools don't let users choose color spaces and color models explicitly, unlike professional-grade software and game engines.
As an example, for PC platforms targeting high-quality visuals, general recommended compression settings in-engine would include:
Diffuse Maps: Use formats like DXT5 or BC7, which offer a balance between quality and file size, supporting detailed color textures with or without alpha channels for transparency.
Grayscale Maps (Specularity, Metallic, Roughness): DXT5 or BC4/BC5 are suitable. These maps don't require full color data, and BC4 and BC5 store one and two channels respectively using the same higher-precision encoding that DXT5 uses for its alpha channel.
Normals: BC5 or BC7 formats are recommended, as they preserve the detail and accuracy needed for the lighting effects normals influence, with BC5 specifically designed for compressing normal maps.
These settings aim to maintain visual fidelity while managing memory use efficiently.
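To make the memory side of these choices concrete, here is a small Python sketch that estimates the size of a single texture level in the BC formats named above. The bytes-per-block figures are the standard ones for these formats (each block covers 4x4 texels); the function and table names are hypothetical, and mipmaps are not included.

```python
# Bytes per 4x4 texel block. DXT1 corresponds to BC1 and DXT5 to BC3.
BYTES_PER_BLOCK = {"BC1": 8, "BC3": 16, "BC4": 8, "BC5": 16, "BC7": 16}


def compressed_size_bytes(width: int, height: int, fmt: str) -> int:
    """Size of one texture level stored in a 4x4 block-compressed format."""
    blocks_x = (width + 3) // 4   # round up to whole blocks
    blocks_y = (height + 3) // 4
    return blocks_x * blocks_y * BYTES_PER_BLOCK[fmt]


def uncompressed_size_bytes(width: int, height: int, channels: int = 4) -> int:
    """Size of the same level stored at 8 bits per channel."""
    return width * height * channels
```

For a 2048x2048 texture this gives 16 MiB uncompressed as RGBA, 4 MiB as BC7 or BC5, and 2 MiB as BC4, regardless of how large the source file was on disk.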
Normal compression:
The compression settings for normal maps are critically important because they directly impact the visual quality of lighting and shading in 3D environments. Normal maps require careful handling to preserve the nuances of surface textures and angles accurately, ensuring that lighting interacts with surfaces as intended.
Incorrect compression settings for normal maps – such as using non-linear color space – can significantly affect how assets appear in a 3D environment. Since normal maps influence the way lighting and shading work on a surface, any degradation or loss of detail due to improper compression can disrupt these interactions, leading to materials that look unrealistic or visually incorrect. This impacts not just the appearance of textures but also the effectiveness of specularity, roughness, and other material properties critical to achieving the desired visual outcome.
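The effect of the wrong color space can be sketched numerically. The Python example below assumes the common convention in which channel values in the 0-1 range are remapped to vector components in the -1 to 1 range; the helper names are illustrative only. It decodes a perfectly flat normal twice: once as stored, and once after an unwanted sRGB-to-linear conversion of the kind applied when the texture is flagged as sRGB.

```python
def srgb_to_linear(c: float) -> float:
    """Standard sRGB decoding curve for a value in [0, 1]."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def decode_normal(r: float, g: float, b: float):
    """Remap channel values in [0, 1] to vector components in [-1, 1]."""
    return (r * 2.0 - 1.0, g * 2.0 - 1.0, b * 2.0 - 1.0)


flat_texel = (0.5, 0.5, 1.0)  # a normal pointing straight out of the surface

correct = decode_normal(*flat_texel)
wrong = decode_normal(*(srgb_to_linear(c) for c in flat_texel))
```

Here `correct` is (0, 0, 1), while `wrong` has X and Y components of about -0.57, so a surface that should be flat is shaded as if it were strongly tilted.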
Texture resolutions and mipmapping:
Mipmaps are pre-calculated, optimized sequences of images accompanying a main texture, designed to increase rendering efficiency and reduce aliasing. Each subsequent mipmap is a downscaled version of the previous one, allowing graphics hardware to select the most appropriate texture size during rendering, based on the distance and angle of the object in view.
Mip-offset can usually be used in quality settings to manually adjust which level of mipmap is used, allowing for finer control over texture quality and performance.
Starting the chain from a lower-resolution mipmap level at the compression stage results in a smaller file, because the highest-resolution levels are dropped. This reduces memory usage and can improve performance, especially on hardware with limited resources. This trade-off between quality and performance is a key consideration in graphics optimization.
For mipmapping to be effective, textures should start at power-of-two resolutions (e.g., 2048x2048, 1024x1024, 512x512). Each mipmap level is then exactly half the size of the previous one in both dimensions, down to a 1x1 pixel texture. Because every level divides evenly, each texel can be computed from a whole 2x2 block of the level above, which keeps artifacts to a minimum compared with non-standard resolutions.
Non-square textures such as 2048x1024 are also a viable option for efficient mipmapping, provided each dimension is still a power of two. The shorter side reaches 1 first and stays there while the longer side keeps halving. Modern graphics hardware and APIs often support non-power-of-two textures as well, although power-of-two dimensions remain preferred for compatibility and efficiency. Non-square textures can be particularly useful for cases like environment maps or UI elements where aspect ratio is key.
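The halving rule and the effect of a mip offset can be sketched in a few lines of Python. The function names and the `mip_offset` parameter are illustrative, not an engine API; the sketch only counts texels and ignores the compression format.

```python
def mip_chain(width: int, height: int):
    """Return the (width, height) of every mip level down to 1x1."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        # Each side halves independently and never drops below 1.
        width = max(1, width // 2)
        height = max(1, height // 2)
        levels.append((width, height))
    return levels


def chain_texels(width: int, height: int, mip_offset: int = 0) -> int:
    """Total texel count of the chain, skipping the first `mip_offset` levels."""
    return sum(w * h for w, h in mip_chain(width, height)[mip_offset:])
```

For a 2048x1024 texture the chain has 12 levels and ends with 2x1 and 1x1. For a 1024x1024 texture, skipping just the top level (`mip_offset=1`) cuts the texel count to roughly a quarter, which is why a mip offset is such an effective memory lever.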
Channel packing:
The traditional approach to texture mapping in 3D graphics assigns a separate image file to each texture map of a material, such as the diffuse, specular, roughness, metallic and normal maps. While straightforward, this method can lead to more texture binds and samples than channel packing, slowing down rendering and consuming substantial memory. It also requires more resources for asset management and can lead to longer load times and reduced performance, particularly on hardware with limited capabilities.
In contrast, the channel packing approach streamlines the use of textures in 3D graphics by merging multiple texture maps into a single file. This reduces texture binds and samples, improving rendering efficiency and performance, and it makes fuller use of texture space while minimizing memory footprint. The approach not only simplifies asset management but also makes better use of hardware resources, which makes it well suited to achieving high visual quality in performance-constrained environments.
The caveat:
Given that each RGBA texture file provides four color channels, with each channel typically spanning a 0-255 range, the essence of texture packing is to maximize the use of this space. The goal is to embed as much information as possible into a single texture file, optimizing the storage and processing of graphical data. This approach strategically reduces memory expenditure and lowers the count of texture sample instructions, streamlining the underlying rendering process.
Texture packing does however shift some performance requirements from texture sampling and memory constraints towards shader computations. By packing multiple textures or texture information into fewer texture files and utilizing shader programs to unpack and interpret this data, there is a decrease in memory usage and potentially fewer texture binds and draw calls. However, this efficiency comes at the cost of additional complexity in shader computations, as shaders must now handle the unpacking of packed texture channels and the application of those textures to the 3D models, including driving color gradients or other effects based on the packed data.
Basic packing technique:
Considering that maps such as roughness, specularity, metallic, opacity or translucency are fundamentally single-channel, incorporating them into a single texture file becomes a logical step. This can offer performance benefits over using separate, single-channel maps, even considering their inherently smaller size.
This optimization reduces the number of texture samples the GPU must fetch, streamlining the rendering pipeline. By combining these maps, we’re minimizing the overhead associated with switching between textures, which can be particularly advantageous in rendering environments where performance and resource management are critical, such as mobile or web devices.
Because of how common texture compression formats treat the alpha channel, it is generally the best place for maps that benefit from extra precision, such as roughness or opacity. In DXT5, for example, alpha is compressed separately from color with 8-bit endpoints, whereas the RGB channels share lower-precision endpoints. The RGB channels, which most compression formats treat equally, are suited to maps where precision is less critical, such as specularity and metallic maps. This allocation makes the best use of texture data within the constraints of common compression techniques.
While the specific channel allocation for different grayscale maps might not be inherently critical outside of considerations for alpha channel precision, what remains paramount is the adoption of a uniform standard across the project. This ensures consistency in texture application and streamlines the development workflow, facilitating easier management and modification of materials and textures.
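A minimal sketch of the basic technique, in plain Python with each map represented as a flat list of 8-bit values. The channel layout used here (R = metallic, G = specular, B = translucency, A = roughness) is only one possible project convention, chosen to follow the alpha-channel guidance above; the function names are hypothetical.

```python
def pack_channels(metallic, specular, translucency, roughness):
    """Interleave four equally sized 8-bit grayscale maps into RGBA pixels."""
    sizes = {len(metallic), len(specular), len(translucency), len(roughness)}
    if len(sizes) != 1:
        raise ValueError("all maps must have the same number of pixels")
    # Roughness goes to alpha, where common formats keep the most precision.
    return list(zip(metallic, specular, translucency, roughness))


def unpack_channel(pixels, index):
    """Read one packed map back out (0 = R, 1 = G, 2 = B, 3 = A)."""
    return [pixel[index] for pixel in pixels]
```

In a real pipeline the same interleaving would be done by an image tool or the engine's importer, and the unpacking would happen in the shader by reading the relevant component of a single texture sample.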
Advanced techniques:
Normal packing:
A normal map is an advanced graphical texture representing the finer surface details of a 3D object. It does so by encoding the directions of normals—vectors perpendicular to the surface—across the object, simulating textures like bumps or scratches without adding geometric complexity. Each color channel in a normal map—Red (R), Green (G), and Blue (B)—corresponds to the X, Y, and Z components of these normals, respectively. Typically, the Z (B) component, which represents depth, can be algorithmically derived from the R and G channels, allowing for further optimization by freeing up a channel for other data, and reducing memory footprint. This method can effectively balance visual detail with computational efficiency.
Deriving Z makes sense when the normals' variation is relatively subtle – such as with human or other organic skin shaders. In such cases, the X and Y components (R and G channels) provide sufficient information to approximate the Z component with minimal visual discrepancy.
Given the X and Y components from the R and G channels, the Z component can be calculated by requiring the vector's length to be 1. Solving x²+y²+z²=1 for z gives:
z = √(1 − x² − y²)
This is a simplified case. In practice, the stored channel values are first remapped from the 0 to 1 range to the -1 to 1 range, the sign of z may need to be flipped depending on the surface normal's orientation convention, and the result is normalized so the vector keeps unit length. This step is crucial for the integrity of lighting calculations, because it ensures that the normals accurately represent surface angles relative to light sources regardless of the initial magnitude of the vector.
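The reconstruction can be sketched in Python as follows. It assumes 8-bit R and G channels remapped to the -1 to 1 range and a positive Z (the usual tangent-space convention); the function name is illustrative, and in an engine the same arithmetic would live in the shader.

```python
import math


def reconstruct_normal(r: int, g: int):
    """Rebuild a unit tangent-space normal from 8-bit R and G channel values."""
    x = r / 255.0 * 2.0 - 1.0
    y = g / 255.0 * 2.0 - 1.0
    # Clamp at zero so quantization or compression error in X and Y
    # cannot produce a negative value under the square root.
    z = math.sqrt(max(0.0, 1.0 - x * x - y * y))
    return (x, y, z)
```

The clamp matters in practice: compressed X and Y values can slightly exceed unit length together, and an unclamped square root would then produce an invalid result.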
The implementation in a shader graph may differ depending on the target engine. Unreal Engine, for instance, provides this functionality out of the box, while other popular engines like Unity or PlayCanvas, although they also offer advanced shader graph functionality, might require a custom implementation.
Single and Two channel diffuse gradients:
A diffuse map is a texture map used to define the base color or surface color of an object, indicating how light interacts with the surface material without considering reflections or other lighting phenomena. The term "diffuse" refers to the scattering of each of the light components (RGB) against the surface.
In instances when the surface color variation is relatively low – like human skin shaders, certain metallic shaders such as steel or iron, or organic shaders like some types of wood, we might get away with packing diffuse channel data into two, or even just a single color channel.
With single-channel packing, the data is extracted from the texture file and used to drive color interpolation logic in-engine, similar to the color ramp node in Blender's material editor (or an equivalent).
With the dual-channel approach, the flow is similar, but the two output gradients are added together before being fed into the diffuse input of the material. Depending on the input texture and the requirements of the shader, this approach will result in some loss of color resolution, with a visual impact that may range from heavy to negligible.
While packing diffuse data into fewer than three channels might offer some optimization gains by itself, its biggest benefit is in freeing the third and possibly second channel for other data (such as RG-packed normals, specularity or roughness), resulting in lower overall shader sampler counts.
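The color ramp logic can be sketched in Python as below. This is an illustrative approximation, assuming piecewise-linear gradients defined by hand-picked stops and colors as RGB tuples in the 0 to 1 range; the function names and the clamping of the summed result are assumptions, not a description of any particular engine's node.

```python
def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def color_ramp(t: float, stops):
    """Sample a gradient; `stops` is a list of (position, (r, g, b)) sorted by position."""
    t = min(max(t, 0.0), 1.0)
    if t <= stops[0][0]:
        return stops[0][1]
    for (p0, c0), (p1, c1) in zip(stops, stops[1:]):
        if t <= p1:
            u = (t - p0) / (p1 - p0)
            return tuple(lerp(a, b, u) for a, b in zip(c0, c1))
    return stops[-1][1]


def dual_channel_color(r: float, g: float, ramp_r, ramp_g):
    """Dual-channel variant: add the two gradient outputs, clamped to 1.0."""
    first = color_ramp(r, ramp_r)
    second = color_ramp(g, ramp_g)
    return tuple(min(1.0, a + b) for a, b in zip(first, second))
```

For example, with a single ramp from black at 0.0 to an orange (1.0, 0.5, 0.0) at 1.0, a packed channel value of 0.5 maps to (0.5, 0.25, 0.0). The quality of the result depends entirely on how well the chosen stops match the original color variation.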
While a relatively novel approach, packing diffuse channel data into fewer channels and utilizing other channels for additional data like normals, specularity, or roughness is a technique that's becoming increasingly utilized, especially in performance-sensitive applications like mobile games or VR where memory and computational resources are at a premium.
Conclusion:
In optimizing 3D graphics for real-time applications, understanding and applying texture compression, mipmapping, and texture packing are essential. The recommended optimization order is as follows:
Correct texture compression is crucial for preserving quality while reducing file sizes and avoiding rendering errors.
Mipmapping enhances rendering efficiency and visual fidelity at various distances.
Texture packing consolidates data into fewer textures, optimizing memory usage.
Prioritizing these strategies—compression for foundational efficiency, followed by mipmapping for adaptive quality, and packing for maximal data use—can significantly improve graphical performance and resource management in development projects.