Commit graph

4635 commits

Author SHA1 Message Date
ReinUsesLisp
80f687a811 vk_rasterizer: Skip transform feedbacks when extension is unavailable
Avoids calling transform feedback procedures when
VK_EXT_transform_feedback is not available.
2020-05-29 03:05:29 -03:00
ReinUsesLisp
4686947d38 texture_cache: Handle overlaps with multiple subresources
Implement more surface reconstruct cases. Allow overlaps with more than
one layer and mipmap and copies all of them to the new texture.

- Fixes textures moving around objects on Xenoblade games
2020-05-29 02:57:30 -03:00
bunnei
f984cf489f Merge pull request #3991 from ReinUsesLisp/depth-sampling
texture_cache: Implement depth stencil texture swizzles
2020-05-28 23:33:38 -04:00
ReinUsesLisp
d1e0f2095c maxwell_3d: Reduce severity of logs that can be spammed
These logs were killing performance on some games when they were
spammed. Reduce them to Debug severity.
2020-05-28 18:23:25 -03:00
ReinUsesLisp
454954bcf0 format_lookup_table: Implement G24S8 format as S8Z24 2020-05-28 17:16:07 -03:00
bunnei
595b97a0d7 Merge pull request #3993 from ReinUsesLisp/fix-zla
gl_shader_manager: Unbind GLSL program when binding a host pipeline
2020-05-28 12:15:22 -04:00
ReinUsesLisp
fb620ba4be buffer_cache: Avoid copying twice on certain cases
Avoid copying to a staging buffer on non-granular memory addresses.
Add a callable argument to StreamBufferUpload to be able to copy to the
staging buffer directly from ReadBlockUnsafe.
2020-05-27 23:05:50 -03:00
ReinUsesLisp
eccf9098ae texture_cache: Use unordered_map::find instead of operator[] on hot code 2020-05-27 17:59:04 -03:00
bunnei
cd2ce9ed2d Merge pull request #3961 from Morph1984/bgra8_srgb
maxwell_to_vk: Add format B8G8R8A8_SRGB and add Attachable capability for B8G8R8A8_UNORM
2020-05-27 16:44:22 -04:00
ReinUsesLisp
fa6a64eb72 texture_cache: Use small vector for surface vectors
This avoids most heap allocations when collecting surfaces into a
vector.
2020-05-27 17:31:14 -03:00
ReinUsesLisp
de665d6485 maxwell_3d: Initialize line widths
Initialize line widths to avoid setting a line width of zero.
2020-05-27 16:53:43 -03:00
ReinUsesLisp
2dba4bc34f maxwell_3d: Initialize polygon modes
NVN expects this to be initialized as Fill, otherwise games that never
bind a rasterizer state will log an invalid polygon mode.
2020-05-27 16:52:52 -03:00
ReinUsesLisp
6e0420fe20 shader/other: Implement MEMBAR.CTS
This silences an assertion we were hitting and uses workgroup memory
barriers when the game requests it.
2020-05-27 00:19:45 -03:00
ReinUsesLisp
387b7926c0 texture_cache: Fix layered null surfaces
Null texture cubes were not considered arrays, causing issues on Vulkan
and OpenGL when creating views.
2020-05-26 17:50:08 -03:00
ReinUsesLisp
11f626f034 gl_texture_cache: Implement small texture view cache for swizzles
This fixes cases where the texture swizzle was applied twice on the same
draw to a texture bound to two different slots.
2020-05-26 17:50:08 -03:00
ReinUsesLisp
ed74f3008b texture_cache: Implement depth stencil texture swizzles
Stop ignoring image swizzles on depth and stencil images.

This doesn't fix a known issue on Xenoblade Chronicles 2 where an OpenGL
texture changes swizzles twice before being used. A proper fix would be
having a small texture view cache for this like we do on Vulkan.
2020-05-26 17:44:50 -03:00
ReinUsesLisp
d748723a77 gl_rasterizer: Port front face flip check from Vulkan
While Vulkan was assuming we had no negative viewports, OpenGL code
was assuming we had them. Port the old code from Vulkan to OpenGL,
checking if the first viewport is negative before flipping faces.

This is not a complete implementation since we only check for the first
viewport to be negative. That said, unless a game is using Vulkan,
OpenGL and NVN games should be fine here, and we can always compare with
our Vulkan backend to see if there's a difference.
2020-05-26 16:33:50 -03:00
ReinUsesLisp
c4b6e36a00 fixed_pipeline_state: Remove unnecessary check for front faces flip
The check to flip faces when viewports are negative were a left over
from the old OpenGL code. This is not required on Vulkan where we have
negative viewports.
2020-05-26 16:32:27 -03:00
bunnei
cb82125d87 Merge pull request #3981 from ReinUsesLisp/bar
shader/other: Implement BAR.SYNC 0x0
2020-05-26 14:40:13 -04:00
bunnei
54a3697cac Merge pull request #3980 from ReinUsesLisp/red-op
shader/memory: Implement non-addition operations in RED
2020-05-26 12:50:41 -04:00
ReinUsesLisp
1188c79557 gl_shader_manager: Unbind GLSL program when binding a host pipeline
Fixes regression in Link's Awakening caused by a075bbcf36
2020-05-26 04:20:39 -03:00
bunnei
2736532246 Merge pull request #3978 from ReinUsesLisp/write-rz
shader_decompiler: Visit source nodes even when they assign to RZ
2020-05-25 21:31:33 -04:00
bunnei
e3855321f7 Merge pull request #3905 from FernandoS27/vulkan-fix
Correct a series of crashes and intructions on Async GPU and Vulkan Pipeline
2020-05-24 15:23:38 -04:00
bunnei
83ef89fc45 Merge pull request #3964 from ReinUsesLisp/arb-integration
renderer_opengl: Add assembly program code paths
2020-05-24 00:34:12 -04:00
bunnei
9995a05e94 Merge pull request #3979 from ReinUsesLisp/thread-group
shader/other: Implement thread comparisons (NV_shader_thread_group)
2020-05-24 00:33:06 -04:00
ReinUsesLisp
5db0df833a shader/other: Implement BAR.SYNC 0x0
Trivially implement this particular case of BAR. Unless games use OpenCL
or CUDA barriers, we shouldn't hit any other case here.
2020-05-21 23:20:43 -03:00
ReinUsesLisp
c5d03c7ff1 shader/memory: Implement non-addition operations in RED
Trivially implement these instructions. They are used in Astral Chain.
2020-05-21 23:19:46 -03:00
ReinUsesLisp
d4ba9c4fe5 shader/other: Implement thread comparisons (NV_shader_thread_group)
Hardware S2R special registers match gl_Thread*MaskNV. We can trivially
implement these using Nvidia's extension on OpenGL or naively stubbing
them with the ARB instructions to match. This might cause issues if the
host device warp size doesn't match Nvidia's. That said, this is
unlikely on proper shaders.

Refer to the attached url for more documentation about these flags.
https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
2020-05-21 23:18:37 -03:00
ReinUsesLisp
20f8c8dfff shader_decompiler: Visit source nodes even when they assign to RZ
Some operations like atomicMin were ignored because they returned were
being stored to RZ. This operations have a side effect and it was being
ignored.
2020-05-21 23:16:03 -03:00
ReinUsesLisp
262aea9b19 vk_shader_decompiler: Don't assert for void returns
Atomic instructions can be used without returning anything and this is
valid code. Remove the assert.
2020-05-21 23:16:03 -03:00
ReinUsesLisp
9223dc7a10 buffer_cache: Remove unused boost headers 2020-05-21 16:44:00 -03:00
ReinUsesLisp
f8678b635a map_interval: Add interval allocator and drop hack
Drop the std::list hack to allocate memory indefinitely.

Instead use a custom allocator that keeps references valid until
destruction. This allocates fixed chunks of memory and puts pointers in
a free list. When an allocation is no longer used put it back to the
free list, this doesn't heap allocate because std::vector doesn't change
the capacity. If the free list is empty, allocate a new chunk.
2020-05-21 16:44:00 -03:00
ReinUsesLisp
4d06e9f5cc buffer_cache: Use boost::container::small_vector for maps in range
Most overlaps in the buffer cache only contain one mapped address.
We can avoid close to all heap allocations once the buffer cache is
warmed up by using a small_vector with a stack size of one.
2020-05-21 16:44:00 -03:00
ReinUsesLisp
7a45f97357 buffer_cache: Use boost::intrusive::set for caching
Instead of using boost::icl::interval_map for caching, use
boost::intrusive::set. interval_map is intended as a container where the
keys can overlap with one another; we don't need this for caching
buffers and a std::set-like data structure that allows us to search with
lower_bound is enough.
2020-05-21 16:44:00 -03:00
ReinUsesLisp
c7f9bceb80 buffer_cache: Remove shared pointers
Removing shared pointers is a first step to be able to use intrusive
objects and keep allocations close to one another in memory.
2020-05-21 16:02:54 -03:00
ReinUsesLisp
3460d01352 buffer_cache: Minor style changes
Minor style changes. Mostly done so I avoid editing it while doing other
changes.
2020-05-21 16:02:20 -03:00
ReinUsesLisp
a075bbcf36 renderer_opengl: Add assembly program code paths
Add code required to use OpenGL assembly programs based on
NV_gpu_program5. Decompilation for ARB programs is intended to be added
in a follow up commit. This does **not** include ARB decompilation and
it's not in an usable state.

The intention behind assembly programs is to reduce shader stutter
significantly on drivers supporting NV_gpu_program5 (and other required
extensions). Currently only Nvidia's proprietary driver supports these
extensions.

Add a UI option hidden for now to avoid people enabling this option
accidentally.

This code path has some limitations that OpenGL compatibility doesn't
have:
- NV_shader_storage_buffer_object is limited to 16 entries for a single
OpenGL context state (I don't know if this is an intended limitation, an
specification issue or I am missing something). Currently causes issues
on The Legend of Zelda: Link's Awakening.
- NV_parameter_buffer_object can't bind buffers using an offset
different to zero. The used workaround is to copy to a temporary buffer
(this doesn't happen often so it's not an issue).

On the other hand, it has the following advantages:
- Shaders build a lot faster.
- We have control over how floating point rounding is done over
individual instructions (SPIR-V on Vulkan can't do this).
- Operations on shared memory can be unsigned and signed.
- Transform feedbacks are dynamic state (not yet implemented).
- Parameter buffers (uniform buffers) are per stage, matching NVN and
hardware's behavior.
- The API to bind and create assembly programs makes sense, unlike
ARB_separate_shader_objects.
2020-05-19 18:00:04 -03:00
Morph
e54e7d322e maxwell_to_vk: Add format B8G8R8A8_SRGB
Add format B8G8R8A8_SRGB and add Attachable capability for B8G8R8A8_UNORM
Used by Bravely Default II
2020-05-18 13:02:09 -04:00
Fernando Sahmkow
b9eff4d004 OpenGL: Enable Debug Context and Synchronous debugging when graphics debugging is enabled.
This commit aims to help easing debugging of driver crashes without
having to modify existing code.
2020-05-17 21:45:09 -04:00
David Marcec
67d7c0f45e DmaPusher: Remove dead code in step 2020-05-16 12:42:27 +10:00
ReinUsesLisp
635a3add8b vk_rasterizer: Match OpenGL's FlushAndInvalidate behavior
Match OpenGL's behavior. This can fix or simplify bisecting issues on
Vulkan.
2020-05-15 20:40:08 -03:00
bunnei
b0d4fdb600 Merge pull request #3899 from ReinUsesLisp/float-comparisons
shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL
2020-05-13 09:51:14 -04:00
ReinUsesLisp
a7c7f48177 vk_rasterizer: Implement constant attributes
Constant attributes (in OpenGL known disabled attributes) are not
supported on Vulkan, even with extensions. To emulate this behavior we
return zero on reads from disabled vertex attributes in shader code.
This has no caching cost because attribute formats are not dynamic state
on Vulkan and we have to store it in the pipeline cache anyway.

- Fixes Animal Crossing: New Horizons terrain borders
2020-05-13 04:36:47 -03:00
ReinUsesLisp
305164eaac vk_rasterizer: Remove buffer check in attribute selection
This was a left over from OpenGL when disabled buffers where not properly
emulated. We no longer have to assert this as it is checked in vertex
buffer initialization.
2020-05-13 04:36:47 -03:00
bunnei
250593c05d Merge pull request #3816 from ReinUsesLisp/vk-rasterizer-enable
vk_graphics_pipeline: Implement rasterizer_enable on Vulkan
2020-05-11 18:22:51 -04:00
ReinUsesLisp
f39591b20a gl_shader_decompiler: Properly emulate NaN behaviour on NE
"Not equal" operators on GLSL seem to behave as unordered when we expect
an ordered comparison.

Manually emulate this checking for LGE values (numbers, not-NaNs).
2020-05-10 02:59:33 -03:00
Fernando Sahmkow
66035b1641 RasterizerCache: Correct documentation. 2020-05-09 21:03:39 -04:00
Fernando Sahmkow
5c9ab7d1e2 VkPipelineCache: Use a null shader on invalid address. 2020-05-09 20:51:34 -04:00
Fernando Sahmkow
a14e93d0ba VideoCore: Use SyncGuestMemory mechanism for Shader/Pipeline Cache invalidation. 2020-05-09 19:25:29 -04:00
Rodrigo Locatti
bab843f4e9 Merge pull request #3839 from Morph1984/r8g8ui
texture: Implement R8G8UI
2020-05-09 05:28:55 -03:00