I'll follow the table of contents above. First we'll cover pipeline state objects (PSOs) and how Unreal Engine uses them, then look at the PSO cache and how to use it.
Let’s look at the PSO cache first.
Let’s move on because we know everything.
It’s a graphics pipeline … you know all this, so let’s move on.
Compute shaders are now universally used in the latest mobile games.
Pipeline? Graphics hardware support
Each stage is assigned a hardware unit optimized for it.
Because the work is divided into stages, the stages can overlap: maximum efficiency.
What if you want the pipeline to behave slightly differently on a given run?
Ex) Suppose you have the pipeline ABCDE.
1: AB–E only / 2: A–DE only / 3: AB'C–E
“What is hardware support in the graphics pipeline?” Each stage is assigned to a hardware unit optimized for that stage. On the right is how a CPU handles instructions. With serial processing, two instructions take 8 cycles, because the second cannot start until the first has finished. With a pipeline, the instructions overlap, so the same two instructions finish in 5 cycles.
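The cycle counts above can be reproduced with a small sketch. It assumes an idealized pipeline of 4 stages with one cycle per stage (the stage count is not stated in the talk, but it is what 8 cycles for 2 instructions implies); the function names are made up for illustration.

```cpp
#include <cstdint>

// Serial execution: each instruction passes through every stage
// before the next instruction is allowed to start.
constexpr std::uint64_t SerialCycles(std::uint64_t Instructions, std::uint64_t Stages) {
    return Instructions * Stages;
}

// Ideal pipelined execution: the first instruction needs one cycle per stage,
// and every following instruction completes one cycle after the previous one.
constexpr std::uint64_t PipelinedCycles(std::uint64_t Instructions, std::uint64_t Stages) {
    return Instructions == 0 ? 0 : Stages + (Instructions - 1);
}

// Assumed configuration: 2 instructions, 4 stages.
static_assert(SerialCycles(2, 4) == 8, "serial: 2 x 4 cycles");
static_assert(PipelinedCycles(2, 4) == 5, "pipelined: 4 + 1 cycles");
```

The gap widens with more instructions: under the same assumptions, 100 instructions take 400 cycles serially and 103 cycles pipelined.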
The pipeline consists of several stages.
Each stage behaves differently depending on the state information set in advance.
What if the state does not change? The previous state is reused as is, which helps performance.
This information is called State. State must be set every time the pipeline is used: when we run the pipeline, we have to specify in advance which stages run and how each should behave. Changing state is expensive in itself, so it is best not to change it. One way to optimize, therefore, is to group work that uses similar state. Unreal Engine itself is structured with this in mind.
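As a minimal sketch of grouping work by similar state: if each draw carries a key identifying its state, sorting by that key reduces how often the state has to change. `DrawCall`, `StateKey`, and both functions are hypothetical names for illustration, not Unreal Engine types.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical draw record: StateKey identifies the state the draw needs.
struct DrawCall {
    std::uint32_t StateKey;
    std::uint32_t MeshId;
};

// Counts how many times the state must be (re)set when the draws are
// submitted in the given order. The first draw always sets the state once.
std::size_t CountStateChanges(const std::vector<DrawCall>& Draws) {
    std::size_t Changes = 0;
    for (std::size_t i = 0; i < Draws.size(); ++i) {
        if (i == 0 || Draws[i].StateKey != Draws[i - 1].StateKey) {
            ++Changes;
        }
    }
    return Changes;
}

// Returns the draws grouped by state. stable_sort keeps the original
// relative order of draws that share the same state.
std::vector<DrawCall> SortByState(std::vector<DrawCall> Draws) {
    std::stable_sort(Draws.begin(), Draws.end(),
                     [](const DrawCall& A, const DrawCall& B) {
                         return A.StateKey < B.StateKey;
                     });
    return Draws;
}
```

For draws that alternate between two states (1, 2, 1, 2), submission order costs four state changes, while the sorted order costs two. A real renderer also has to respect ordering constraints such as translucency, which this sketch ignores.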
In the past, all of these states were handled individually. In DX9, for example, the states for alpha blending were set one value at a time.
In the past, every state was set one by one.
Ex) D3D9 Render State: Alpha Blending State, Texture Stage State
Improvement: related settings can be applied together at once.
The goal is to reduce the overhead of state changes by setting related states together.
Can be created and set at render time.
Dependencies exist between hardware units.
Ex) When setting up hardware blending, the Rasterizer State also affects the Blend State.
Later, a modest improvement was made by bundling related states and setting them together. In DX11, for example, the Blend State holds the alpha-to-coverage setting, the blend settings for each render target when multiple render targets (MRT) are used, and whether those render targets share one blend setting or are blended individually. All of this is set in one go.
Let's set state at the pipeline level: Pipeline State
The hardware configuration that determines how input data is interpreted and drawn.
Shaders, render states (Blend, Depth Stencil, Rasterizer, …), and others.
Pipeline State Objects: pipeline state is managed through a PSO.
Pipeline state is the idea of configuring the whole pipeline at once; it refers to the configuration of the entire hardware. It is controlled through an object called a PSO, which holds the pipeline state information.
Pipeline State Object == PSO: an object containing pipeline state information.
Supported Graphics API: D3D12 / Vulkan / Metal
Used for pipeline state management.
The state is evaluated and validated in advance.
Allows pipeline states to be replaced more quickly at render time.
Pipeline State Objects: most pipeline state is set through the PSO.
Set in the PSO:
All shader bytecode, Blend State, Rasterizer State, DepthStencil State, multisampling information, and more.
The purpose is to manage pipeline state. Whether the pipeline will work correctly with a given state is determined in advance, so at draw time the state can simply be trusted and used. The entire state can also be switched much faster.
State that does not fit well at the pipeline level, such as viewport settings and the scissor test, is handled at the command list level.
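The idea of bundling state into one object that is validated once, at creation, can be sketched as follows. This is not the D3D12, Vulkan, or Metal API: `PipelineStateDesc`, `PipelineStateObject`, and the two validation rules are assumptions chosen only to illustrate the pattern. It requires C++17 for `std::optional`.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical description bundling shaders and render states.
struct PipelineStateDesc {
    std::string VertexShader;   // stand-in for shader bytecode
    std::string PixelShader;
    bool BlendEnable;
    bool DepthTestEnable;
    std::uint32_t SampleCount;  // multisampling
};

// Immutable once created: all validation happens in Create(), so code that
// binds the object at draw time can trust it without re-checking.
class PipelineStateObject {
public:
    static std::optional<PipelineStateObject> Create(const PipelineStateDesc& Desc) {
        // Assumed rule 1: a vertex shader is mandatory.
        if (Desc.VertexShader.empty()) {
            return std::nullopt;
        }
        // Assumed rule 2: the sample count must be a non-zero power of two.
        if (Desc.SampleCount == 0 || (Desc.SampleCount & (Desc.SampleCount - 1)) != 0) {
            return std::nullopt;
        }
        return PipelineStateObject(Desc);
    }

    const PipelineStateDesc& GetDesc() const { return Desc_; }

private:
    explicit PipelineStateObject(const PipelineStateDesc& Desc) : Desc_(Desc) {}
    PipelineStateDesc Desc_;
};
```

Viewport and scissor settings are deliberately absent from the description, mirroring the split where they are set on the command list instead.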
Low level cache.
Because PSOs were designed from the start to be reused, caching is already provided at the graphics API level: D3D12 / Vulkan / Metal
Cache support for runtime-generated PSOs.
D3D12 / Vulkan
PSOs can be written out to a file on disk and created at load time.
Also available on devices that support ProgramBinary (OpenGL ES 3.0 and above).
However, OpenGL is not actually an API that supports PSOs. Instead, on hardware that supports a feature called ProgramBinary, the compiled program is saved and read back later, which works much like a PSO.
RHI: a thin layer over the platform-specific graphics APIs, so that all rendering operations can be handled by platform-independent code. PSOs created at the low level are stored as render resources. The cache is then searched through a Map container that holds these stored PSOs.
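A map-based lookup of this kind can be sketched as below. It is not the engine's implementation: `PipelineInitializer`, its three fields, the hash, and `PipelineStateCache` are hypothetical stand-ins, with a standard `std::unordered_map` playing the role of the Map container.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>

// Hypothetical, much reduced stand-in for a pipeline state initializer.
struct PipelineInitializer {
    std::uint32_t ShaderHash;
    std::uint32_t BlendMode;
    std::uint32_t DepthMode;

    bool operator==(const PipelineInitializer& Other) const {
        return ShaderHash == Other.ShaderHash &&
               BlendMode == Other.BlendMode &&
               DepthMode == Other.DepthMode;
    }
};

// Simple hash that mixes all fields taking part in equality.
struct PipelineInitializerHash {
    std::size_t operator()(const PipelineInitializer& Init) const {
        std::size_t Hash = Init.ShaderHash;
        Hash = Hash * 31u + Init.BlendMode;
        Hash = Hash * 31u + Init.DepthMode;
        return Hash;
    }
};

// Stand-in for the platform-level PSO kept as a render resource.
struct PipelineState {
    PipelineInitializer Init;
};

class PipelineStateCache {
public:
    // Returns the cached PSO for Init, creating it only on a cache miss.
    std::shared_ptr<PipelineState> GetOrCreate(const PipelineInitializer& Init) {
        auto It = Map_.find(Init);
        if (It != Map_.end()) {
            return It->second;
        }
        ++CreateCount_;  // the expensive creation would happen here
        auto State = std::make_shared<PipelineState>(PipelineState{Init});
        Map_.emplace(Init, State);
        return State;
    }

    std::size_t CreateCount() const { return CreateCount_; }

private:
    std::unordered_map<PipelineInitializer, std::shared_ptr<PipelineState>,
                       PipelineInitializerHash> Map_;
    std::size_t CreateCount_ = 0;
};
```

Requesting the same initializer twice returns the same object and creates it only once, which is the behavior the cache exists to provide.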
Low level Cache – D3D12. The runtime cache and the cache loaded from file are used together.
Runtime cache = looked up using the “GraphicsPipelineStateInitializer” from the RHI.
Loaded cache = looked up using the low level description.
Low level description: the platform-dependent pipeline state descriptor.
By managing PSO objects with an LRU policy, memory can be reclaimed flexibly, even at the cost of an occasional hitch. This exists mainly because of the Android platform: it was introduced during development of the Android version of Fortnite Mobile to solve memory problems on that platform.
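The LRU policy itself can be sketched with a list plus a hash map. This is a generic least-recently-used container, not the engine's code; `LruPsoCache` is a hypothetical name, and the 64-bit keys stand in for whatever identifies a PSO.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>
#include <unordered_map>

class LruPsoCache {
public:
    explicit LruPsoCache(std::size_t Capacity) : Capacity_(Capacity) {}

    // Marks Key as most recently used. Returns false if it is not cached.
    bool Touch(std::uint64_t Key) {
        auto It = Index_.find(Key);
        if (It == Index_.end()) {
            return false;
        }
        // Move the entry to the front without invalidating its iterator.
        Order_.splice(Order_.begin(), Order_, It->second);
        return true;
    }

    // Inserts Key, evicting the least recently used entry when full.
    void Insert(std::uint64_t Key) {
        if (Touch(Key) || Capacity_ == 0) {
            return;
        }
        if (Order_.size() >= Capacity_) {
            // The evicted PSO's memory is released here; using it again
            // later means recreating it, which is where a hitch can occur.
            Index_.erase(Order_.back());
            Order_.pop_back();
        }
        Order_.push_front(Key);
        Index_[Key] = Order_.begin();
    }

    bool Contains(std::uint64_t Key) const { return Index_.count(Key) != 0; }
    std::size_t Size() const { return Order_.size(); }

private:
    std::size_t Capacity_;
    std::list<std::uint64_t> Order_;  // front = most recently used
    std::unordered_map<std::uint64_t, std::list<std::uint64_t>::iterator> Index_;
};
```

With a capacity of 2, inserting 1 and 2, touching 1, and then inserting 3 evicts 2, because 2 is the entry that has gone unused the longest.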
Low Level Cache – OpenGL. Does not support PSOs.
State is not changed in bulk through a pipeline state…
Shader state and render state are each updated separately.
Low-level cache support for bound shader states (BoundShaderState, BSS).
BSS Cache: allows the shader states alone to be changed in bulk, rather than the whole pipeline state.
Low Level Cache – OpenGL. Program Binary Cache
OpenGL compiles the individual shaders and links them into Program Objects.
Program objects can be written to a file, so that they do not have to be recompiled and can be loaded and reused later.
Separate Shader Object support.
LRU algorithm support.
Very useful for OpenGL ES platforms that are short on shader memory.
On Mali GPUs, the maximum shader memory heap size allowed by the driver is small.
Crash: the Shader Code Library is not ready when DLC content is accessed after the pak is mounted.
Cause: at engine initialization, FShaderCodeLibrary::InitForRuntime(…) opens the libraries for plugins, but a DLC plugin that only appears after its content is downloaded and its pak is mounted is not covered by this operation.
The simplest solution is to open the plugin's Shader Code Library directly after mounting the pak.
DLC + Shader Code Library: reopen the PSO cache?
Basic engine operation.
At engine PreInit, the Shader Code Library is opened by project name (Global, Game) or plugin name (DLC excluded).
At engine PreInit, the Shader Pipeline Cache is also opened by project name.
The Program Binary Cache is created/loaded using the same GUID as the Shader Pipeline Cache.
In general, the flow is as follows.
[Engine initialization] Run engine from APK -> Open ShaderCodeLibrary in APK -> Open PSO cache -> PSO cache precompile.
[Level for patching] Pak mount -> Open ShaderCodeLibrary in DLC to avoid the crash -> Reopen PSO cache?