Commit graph

4651 commits

Author SHA1 Message Date
ReinUsesLisp
c7f9bceb80 buffer_cache: Remove shared pointers
Removing shared pointers is a first step to be able to use intrusive
objects and keep allocations close to one another in memory.
2020-05-21 16:02:54 -03:00
ReinUsesLisp
3460d01352 buffer_cache: Minor style changes
Minor style changes. Mostly done so I avoid editing it while doing other
changes.
2020-05-21 16:02:20 -03:00
ReinUsesLisp
a075bbcf36 renderer_opengl: Add assembly program code paths
Add code required to use OpenGL assembly programs based on
NV_gpu_program5. Decompilation for ARB programs is intended to be added
in a follow up commit. This does **not** include ARB decompilation and
it's not in an usable state.

The intention behind assembly programs is to reduce shader stutter
significantly on drivers supporting NV_gpu_program5 (and other required
extensions). Currently only Nvidia's proprietary driver supports these
extensions.

Add a UI option hidden for now to avoid people enabling this option
accidentally.

This code path has some limitations that OpenGL compatibility doesn't
have:
- NV_shader_storage_buffer_object is limited to 16 entries for a single
OpenGL context state (I don't know if this is an intended limitation, an
specification issue or I am missing something). Currently causes issues
on The Legend of Zelda: Link's Awakening.
- NV_parameter_buffer_object can't bind buffers using an offset
different to zero. The used workaround is to copy to a temporary buffer
(this doesn't happen often so it's not an issue).

On the other hand, it has the following advantages:
- Shaders build a lot faster.
- We have control over how floating point rounding is done over
individual instructions (SPIR-V on Vulkan can't do this).
- Operations on shared memory can be unsigned and signed.
- Transform feedbacks are dynamic state (not yet implemented).
- Parameter buffers (uniform buffers) are per stage, matching NVN and
hardware's behavior.
- The API to bind and create assembly programs makes sense, unlike
ARB_separate_shader_objects.
2020-05-19 18:00:04 -03:00
Morph
e54e7d322e maxwell_to_vk: Add format B8G8R8A8_SRGB
Add format B8G8R8A8_SRGB and add Attachable capability for B8G8R8A8_UNORM
Used by Bravely Default II
2020-05-18 13:02:09 -04:00
Fernando Sahmkow
b9eff4d004 OpenGL: Enable Debug Context and Synchronous debugging when graphics debugging is enabled.
This commit aims to help easing debugging of driver crashes without
having to modify existing code.
2020-05-17 21:45:09 -04:00
David Marcec
67d7c0f45e DmaPusher: Remove dead code in step 2020-05-16 12:42:27 +10:00
ReinUsesLisp
635a3add8b vk_rasterizer: Match OpenGL's FlushAndInvalidate behavior
Match OpenGL's behavior. This can fix or simplify bisecting issues on
Vulkan.
2020-05-15 20:40:08 -03:00
bunnei
b0d4fdb600 Merge pull request #3899 from ReinUsesLisp/float-comparisons
shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL
2020-05-13 09:51:14 -04:00
ReinUsesLisp
a7c7f48177 vk_rasterizer: Implement constant attributes
Constant attributes (in OpenGL known disabled attributes) are not
supported on Vulkan, even with extensions. To emulate this behavior we
return zero on reads from disabled vertex attributes in shader code.
This has no caching cost because attribute formats are not dynamic state
on Vulkan and we have to store it in the pipeline cache anyway.

- Fixes Animal Crossing: New Horizons terrain borders
2020-05-13 04:36:47 -03:00
ReinUsesLisp
305164eaac vk_rasterizer: Remove buffer check in attribute selection
This was a left over from OpenGL when disabled buffers where not properly
emulated. We no longer have to assert this as it is checked in vertex
buffer initialization.
2020-05-13 04:36:47 -03:00
bunnei
250593c05d Merge pull request #3816 from ReinUsesLisp/vk-rasterizer-enable
vk_graphics_pipeline: Implement rasterizer_enable on Vulkan
2020-05-11 18:22:51 -04:00
ReinUsesLisp
f39591b20a gl_shader_decompiler: Properly emulate NaN behaviour on NE
"Not equal" operators on GLSL seem to behave as unordered when we expect
an ordered comparison.

Manually emulate this checking for LGE values (numbers, not-NaNs).
2020-05-10 02:59:33 -03:00
Fernando Sahmkow
66035b1641 RasterizerCache: Correct documentation. 2020-05-09 21:03:39 -04:00
Fernando Sahmkow
5c9ab7d1e2 VkPipelineCache: Use a null shader on invalid address. 2020-05-09 20:51:34 -04:00
Fernando Sahmkow
a14e93d0ba VideoCore: Use SyncGuestMemory mechanism for Shader/Pipeline Cache invalidation. 2020-05-09 19:25:29 -04:00
Rodrigo Locatti
bab843f4e9 Merge pull request #3839 from Morph1984/r8g8ui
texture: Implement R8G8UI
2020-05-09 05:28:55 -03:00
ReinUsesLisp
d8598a8717 shader_ir: Separate float-point comparisons in ordered and unordered
This allows us to use native SPIR-V instructions without having to
manually check for NAN.
2020-05-09 04:55:15 -03:00
bunnei
c8a4398fe8 Merge pull request #3842 from makigumo/maxwell_to_vk_vertexattribute_signed_int
maxwell_to_vk: implement missing signed int formats
2020-05-09 00:36:09 -04:00
bunnei
4fb531a576 Merge pull request #3885 from ReinUsesLisp/viewport-swizzles
video_core: Implement viewport swizzles with NV_viewport_swizzle
2020-05-08 15:16:53 -04:00
bunnei
2a51a66227 Merge pull request #3884 from ReinUsesLisp/border-colors
vk_sampler_cache: Use VK_EXT_custom_border_color when available
2020-05-07 12:18:53 -04:00
bunnei
a1d3586e60 Merge pull request #3815 from FernandoS27/command-list-2
GPU: More optimizations to GPU Command List Processing and DMA Copy Optimizations
2020-05-05 17:12:42 -04:00
bunnei
97607e0675 Update src/video_core/gpu.cpp
Co-authored-by: David <25727384+ogniK5377@users.noreply.github.com>
2020-05-05 15:39:44 -04:00
bunnei
58b467877c Update src/video_core/gpu.cpp
Co-authored-by: David <25727384+ogniK5377@users.noreply.github.com>
2020-05-05 15:39:37 -04:00
ReinUsesLisp
aa86309d15 vk_sampler_cache: Use VK_EXT_custom_border_color when available
This should fix grass interactions on Breath of the Wild on Vulkan.
It is currently untested against validation layers.

Nvidia's Windows 443.09 beta driver or Linux 440.66.12 is required for
now.
2020-05-04 20:49:23 -03:00
ReinUsesLisp
1f5d41ac05 vk_graphics_pipeline: Implement viewport swizzles with NV_viewport_swizzle 2020-05-04 18:31:17 -03:00
ReinUsesLisp
9054f86f03 gl_rasterizer: Implement viewport swizzles with NV_viewport_swizzle 2020-05-04 17:51:30 -03:00
ReinUsesLisp
ff7b0d7329 maxwell_3d: Add viewport swizzles 2020-05-04 17:50:59 -03:00
bunnei
99075400e3 Merge pull request #3808 from ReinUsesLisp/wait-for-idle
{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
2020-05-03 02:43:18 -04:00
bunnei
18a1e885a1 Merge pull request #3732 from lioncash/header
vulkan: Remove unnecessary includes
2020-05-02 01:36:57 -04:00
bunnei
716a999346 Merge pull request #3809 from ReinUsesLisp/empty-index
vk_rasterizer: Skip index buffer setup when vertices are zero
2020-05-02 01:21:57 -04:00
ReinUsesLisp
02e6cc9247 vk_graphics_pipeline: Implement rasterizer_enable on Vulkan
We can simply enable rasterizer discard matching the current pipeline
key.
2020-05-02 01:47:25 -03:00
bunnei
33a218bea2 Merge pull request #3693 from ReinUsesLisp/clean-samplers
shader/texture: Support multiple unknown sampler properties
2020-05-02 00:45:41 -04:00
Jan Beich
1d30ce7b5b fixed_pipeline_state: explicitly use template keyword after be8742e286
In file included from src/video_core/renderer_opengl/renderer_opengl.cpp:25:
In file included from src/./video_core/renderer_opengl/gl_rasterizer.h:26:
In file included from src/./video_core/renderer_opengl/gl_fence_manager.h:11:
src/./video_core/fence_manager.h:91:32: error: use 'template' keyword
      to treat 'Write' as a dependent template name
                memory_manager.Write<u32>(current_fence->GetAddress(), current_fence->GetPayload());
                               ^
                               template
src/./video_core/fence_manager.h:137:32: error: use 'template'
      keyword to treat 'Write' as a dependent template name
                memory_manager.Write<u32>(current_fence->GetAddress(), current_fence->GetPayload());
                               ^
                               template
2020-05-01 23:38:23 +00:00
Dan
712c96db1a maxwell_to_vk: implement missing signed int formats 2020-04-30 23:39:16 +02:00
Morph
6665cd04f1 texture: Implement R8G8UI
- Used by The Walking Dead: The Final Season
2020-04-30 13:19:36 -04:00
bunnei
6a57e5fc49 Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp
maxwell_3d: Fix depth clamping register
2020-04-30 13:07:31 -04:00
bunnei
e043a955f1 Merge pull request #3799 from ReinUsesLisp/iadd-cc
shader: Implement P2R CC, IADD Rd.CC and IADD.X
2020-04-30 12:56:36 -04:00
bunnei
2765bb73a2 Merge pull request #3805 from ReinUsesLisp/preserve-contents
texture_cache: Reintroduce preserve_contents accurately
2020-04-30 12:56:19 -04:00
bunnei
e20f3161a3 Merge pull request #3788 from FernandoS27/revert
Revert: shader_decode: Fix LD, LDG when track constant buffer.
2020-04-30 12:55:39 -04:00
Lioncash
a35345d217 vulkan: Remove unnecessary includes
Reduces some header churn and reduces rebuilds when some header
internals change.

While we're at it we can also resolve a missing include in buffer_cache.
2020-04-28 21:54:46 -04:00
ReinUsesLisp
8d28a56a6c shader/arithmetic_integer: Fix tracking issue in temporary
This temporary is not needed as we mark Rd.CC + IADD.X as unimplemented.
It caused issues when tracking global buffers.
2020-04-28 17:14:53 -03:00
Fernando Sahmkow
9f9e662f1f Clang Format and Documentation. 2020-04-28 14:02:51 -04:00
Fernando Sahmkow
4d2b23d3c5 MaxwellDMA: Optimize micro copies. 2020-04-28 13:44:14 -04:00
bunnei
510842d827 Merge pull request #3784 from ReinUsesLisp/shader-memory-util
shader/memory_util: Deduplicate code
2020-04-28 12:05:50 -04:00
ReinUsesLisp
6e79375f46 vk_rasterizer: Skip index buffer setup when vertices are zero
Xenoblade 2 invokes a draw call with zero vertices.
This is likely due to indirect drawing (glDrawArraysIndirect).

This causes a crash in the staging buffer pool when trying to create a
buffer with a size of zero. To workaround this, skip index buffer setup
entirely when the number of indices is zero.
2020-04-28 02:24:33 -03:00
ReinUsesLisp
8835d40024 {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
Drop MemoryBarrier from the buffer cache and use Maxwell3D's register
WaitForIdle.

To implement this on OpenGL we just call glMemoryBarrier with the
necessary bits.

Vulkan lacks this synchronization primitive, so we set an event and
immediately wait for it. This is not a pretty solution, but it's what
Vulkan can do without submitting the current command buffer to the queue
(which ends up being more expensive on the CPU).
2020-04-28 02:18:12 -03:00
Fernando Sahmkow
4c11487d1e VideoCore/GPU: Delegate subchannel engines to the dma pusher. 2020-04-27 22:07:21 -04:00
Fernando Sahmkow
b916b58702 VideoCore/Engines: Refactor Engines CallMethod. 2020-04-27 21:47:58 -04:00
ReinUsesLisp
b82e61dff0 maxwell_3d: Fix depth clamping register
Using deko3d as reference:
4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)

We were using bits 3 and 4 to determine depth clamping, but these are
the same both enabled and disabled:

state->depthClampEnable ? 0x101A : 0x181D

The same happens on Nvidia's OpenGL driver, where they do something like
this (default capabilities, GL 4.5 compatibility):

(state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c

There's always a difference between the first bits in this register, but
bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This
commit changes yuzu's behaviour to use bit 11 to determine depth
clamping.

- Fixes depth issues on Super Mario Odyssey's intro.
2020-04-27 20:50:14 -03:00
Fernando Sahmkow
fe8ea9e0c9 Merge pull request #3766 from ReinUsesLisp/renderpass-cache-key
vk_renderpass_cache: Pack renderpass cache key and unify keys
2020-04-27 16:05:14 -04:00