Commit graph

409 commits

Author SHA1 Message Date
bunnei
e3855321f7 Merge pull request #3905 from FernandoS27/vulkan-fix
Correct a series of crashes and intructions on Async GPU and Vulkan Pipeline
2020-05-24 15:23:38 -04:00
bunnei
9995a05e94 Merge pull request #3979 from ReinUsesLisp/thread-group
shader/other: Implement thread comparisons (NV_shader_thread_group)
2020-05-24 00:33:06 -04:00
ReinUsesLisp
d4ba9c4fe5 shader/other: Implement thread comparisons (NV_shader_thread_group)
Hardware S2R special registers match gl_Thread*MaskNV. We can trivially
implement these using Nvidia's extension on OpenGL or naively stubbing
them with the ARB instructions to match. This might cause issues if the
host device warp size doesn't match Nvidia's. That said, this is
unlikely on proper shaders.

Refer to the attached url for more documentation about these flags.
https://www.khronos.org/registry/OpenGL/extensions/NV/NV_shader_thread_group.txt
2020-05-21 23:18:37 -03:00
ReinUsesLisp
7a45f97357 buffer_cache: Use boost::intrusive::set for caching
Instead of using boost::icl::interval_map for caching, use
boost::intrusive::set. interval_map is intended as a container where the
keys can overlap with one another; we don't need this for caching
buffers and a std::set-like data structure that allows us to search with
lower_bound is enough.
2020-05-21 16:44:00 -03:00
ReinUsesLisp
635a3add8b vk_rasterizer: Match OpenGL's FlushAndInvalidate behavior
Match OpenGL's behavior. This can fix or simplify bisecting issues on
Vulkan.
2020-05-15 20:40:08 -03:00
bunnei
b0d4fdb600 Merge pull request #3899 from ReinUsesLisp/float-comparisons
shader_ir: Add separate instructions for ordered and unordered comparisons and fix NE on GLSL
2020-05-13 09:51:14 -04:00
bunnei
250593c05d Merge pull request #3816 from ReinUsesLisp/vk-rasterizer-enable
vk_graphics_pipeline: Implement rasterizer_enable on Vulkan
2020-05-11 18:22:51 -04:00
Fernando Sahmkow
5c9ab7d1e2 VkPipelineCache: Use a null shader on invalid address. 2020-05-09 20:51:34 -04:00
Fernando Sahmkow
a14e93d0ba VideoCore: Use SyncGuestMemory mechanism for Shader/Pipeline Cache invalidation. 2020-05-09 19:25:29 -04:00
Rodrigo Locatti
bab843f4e9 Merge pull request #3839 from Morph1984/r8g8ui
texture: Implement R8G8UI
2020-05-09 05:28:55 -03:00
ReinUsesLisp
d8598a8717 shader_ir: Separate float-point comparisons in ordered and unordered
This allows us to use native SPIR-V instructions without having to
manually check for NAN.
2020-05-09 04:55:15 -03:00
bunnei
c8a4398fe8 Merge pull request #3842 from makigumo/maxwell_to_vk_vertexattribute_signed_int
maxwell_to_vk: implement missing signed int formats
2020-05-09 00:36:09 -04:00
bunnei
4fb531a576 Merge pull request #3885 from ReinUsesLisp/viewport-swizzles
video_core: Implement viewport swizzles with NV_viewport_swizzle
2020-05-08 15:16:53 -04:00
ReinUsesLisp
aa86309d15 vk_sampler_cache: Use VK_EXT_custom_border_color when available
This should fix grass interactions on Breath of the Wild on Vulkan.
It is currently untested against validation layers.

Nvidia's Windows 443.09 beta driver or Linux 440.66.12 is required for
now.
2020-05-04 20:49:23 -03:00
ReinUsesLisp
1f5d41ac05 vk_graphics_pipeline: Implement viewport swizzles with NV_viewport_swizzle 2020-05-04 18:31:17 -03:00
bunnei
99075400e3 Merge pull request #3808 from ReinUsesLisp/wait-for-idle
{maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
2020-05-03 02:43:18 -04:00
bunnei
18a1e885a1 Merge pull request #3732 from lioncash/header
vulkan: Remove unnecessary includes
2020-05-02 01:36:57 -04:00
bunnei
716a999346 Merge pull request #3809 from ReinUsesLisp/empty-index
vk_rasterizer: Skip index buffer setup when vertices are zero
2020-05-02 01:21:57 -04:00
ReinUsesLisp
02e6cc9247 vk_graphics_pipeline: Implement rasterizer_enable on Vulkan
We can simply enable rasterizer discard matching the current pipeline
key.
2020-05-02 01:47:25 -03:00
bunnei
33a218bea2 Merge pull request #3693 from ReinUsesLisp/clean-samplers
shader/texture: Support multiple unknown sampler properties
2020-05-02 00:45:41 -04:00
Dan
712c96db1a maxwell_to_vk: implement missing signed int formats 2020-04-30 23:39:16 +02:00
Morph
6665cd04f1 texture: Implement R8G8UI
- Used by The Walking Dead: The Final Season
2020-04-30 13:19:36 -04:00
bunnei
6a57e5fc49 Merge pull request #3807 from ReinUsesLisp/fix-depth-clamp
maxwell_3d: Fix depth clamping register
2020-04-30 13:07:31 -04:00
bunnei
e043a955f1 Merge pull request #3799 from ReinUsesLisp/iadd-cc
shader: Implement P2R CC, IADD Rd.CC and IADD.X
2020-04-30 12:56:36 -04:00
bunnei
2765bb73a2 Merge pull request #3805 from ReinUsesLisp/preserve-contents
texture_cache: Reintroduce preserve_contents accurately
2020-04-30 12:56:19 -04:00
Lioncash
a35345d217 vulkan: Remove unnecessary includes
Reduces some header churn and reduces rebuilds when some header
internals change.

While we're at it we can also resolve a missing include in buffer_cache.
2020-04-28 21:54:46 -04:00
bunnei
510842d827 Merge pull request #3784 from ReinUsesLisp/shader-memory-util
shader/memory_util: Deduplicate code
2020-04-28 12:05:50 -04:00
ReinUsesLisp
6e79375f46 vk_rasterizer: Skip index buffer setup when vertices are zero
Xenoblade 2 invokes a draw call with zero vertices.
This is likely due to indirect drawing (glDrawArraysIndirect).

This causes a crash in the staging buffer pool when trying to create a
buffer with a size of zero. To workaround this, skip index buffer setup
entirely when the number of indices is zero.
2020-04-28 02:24:33 -03:00
ReinUsesLisp
8835d40024 {maxwell_3d,buffer_cache}: Implement memory barriers using 3D registers
Drop MemoryBarrier from the buffer cache and use Maxwell3D's register
WaitForIdle.

To implement this on OpenGL we just call glMemoryBarrier with the
necessary bits.

Vulkan lacks this synchronization primitive, so we set an event and
immediately wait for it. This is not a pretty solution, but it's what
Vulkan can do without submitting the current command buffer to the queue
(which ends up being more expensive on the CPU).
2020-04-28 02:18:12 -03:00
ReinUsesLisp
b82e61dff0 maxwell_3d: Fix depth clamping register
Using deko3d as reference:
4e47ba0013/source/maxwell/gpu_3d_state.cpp (L42)

We were using bits 3 and 4 to determine depth clamping, but these are
the same both enabled and disabled:

state->depthClampEnable ? 0x101A : 0x181D

The same happens on Nvidia's OpenGL driver, where they do something like
this (default capabilities, GL 4.5 compatibility):

(state & DEPTH_CLAMP) != 0 ? 0x201a : 0x281c

There's always a difference between the first bits in this register, but
bit 11 is consistently disabled on both deko3d/NVN and OpenGL. This
commit changes yuzu's behaviour to use bit 11 to determine depth
clamping.

- Fixes depth issues on Super Mario Odyssey's intro.
2020-04-27 20:50:14 -03:00
Fernando Sahmkow
fe8ea9e0c9 Merge pull request #3766 from ReinUsesLisp/renderpass-cache-key
vk_renderpass_cache: Pack renderpass cache key and unify keys
2020-04-27 16:05:14 -04:00
Fernando Sahmkow
2043198e50 Merge pull request #3756 from ReinUsesLisp/integrated-devices
vk_memory_manager: Remove unified memory model flag
2020-04-27 16:04:22 -04:00
ReinUsesLisp
8e3af5d3ca texture_cache: Reintroduce preserve_contents accurately
This reverts commit 1f1e80c67d.

preserve_contents proved to be a meaningful optimization. This commit
reintroduces it but properly implemented on OpenGL.

We have to make sure the clear removes all the previous contents of the
image.

It's not currently implemented on Vulkan because we can do smart things
there that's preferred to be introduced in a separate commit.
2020-04-26 19:53:02 -03:00
Rodrigo Locatti
0f7a89c2ef Merge pull request #3753 from ReinUsesLisp/ac-vulkan
{gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers
2020-04-26 01:55:43 -03:00
ReinUsesLisp
9b433b2467 shader/memory_util: Deduplicate code
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as
well as shader decoder code.

While we are at it, fix a bug in gl_shader_cache where compute shaders
had an start offset of a stage shader.
2020-04-26 01:38:51 -03:00
ReinUsesLisp
7e8f51273c shader/arithmetic_integer: Implement CC for IADD 2020-04-25 22:55:26 -03:00
bunnei
6cd0fc2a95 Merge pull request #3721 from ReinUsesLisp/sort-devices
vulkan/wrapper: Sort physical devices
2020-04-25 03:27:40 -04:00
ReinUsesLisp
88a6c10687 vk_rasterizer: Pack texceptions and color formats on invalid formats
Sometimes for unknown reasons NVN games can bind a render target format
of 0. This may be a yuzu bug.

With the commits before this the formats were specified without being
"packed", assuming all formats and texceptions will be written like in
the color_attachments vector.

To address this issue, iterate all render targets and pack them as they
are valid. This way they will match color_attachments.

- Fixes validation errors and graphical issues on Breath of the Wild.
2020-04-24 22:21:29 -03:00
Markus Wick
ac24f0506c Fix -Werror=conversion error. 2020-04-24 09:33:04 +02:00
ReinUsesLisp
c9b4c56d69 shader_ir: Turn classes into data structures 2020-04-23 18:00:06 -03:00
ReinUsesLisp
f78f26b75a vk_rasterizer: Fix framebuffer creation validation errors
Framebuffer creation was ignoring the number of color attachments.
2020-04-23 17:34:16 -03:00
ReinUsesLisp
ab7eae6fff vk_pipeline_cache: Unify pipeline cache keys into a single operation
This allows us to call Common::CityHash and std::memcmp only once for
GraphicsPipelineCacheKey. While we are at it, do the same for compute.
2020-04-23 17:34:16 -03:00
ReinUsesLisp
7b76c67803 vk_renderpass_cache: Pack renderpass cache key to 12 bytes 2020-04-23 17:34:16 -03:00
bunnei
c916ad62e7 Merge pull request #3677 from FernandoS27/better-sync
Introduce Predictive Flushing and Improve ASYNC GPU
2020-04-22 22:09:38 -04:00
ReinUsesLisp
910decd9cb vk_pipeline_cache: Fix unintentional memcpy into optional
The intention behind this was to assign a float to from an uint32_t, but
it was unintentionally being copied directly into the std::optional.

Copy to a temporary and assign that temporary to std::optional. This can
be replaced with std::bit_cast<float> once we are in C++20.
2020-04-22 21:36:05 -03:00
Fernando Sahmkow
9fe7972120 Merge pull request #3653 from ReinUsesLisp/nsight-aftermath
renderer_vulkan: Integrate Nvidia Nsight Aftermath on Windows
2020-04-22 11:39:01 -04:00
Fernando Sahmkow
491aea4a91 Async GPU: Correct flushing behavior to be similar to old async GPU behavior. 2020-04-22 11:36:26 -04:00
Fernando Sahmkow
d9f1d5a4fd ShaderCache/PipelineCache: Cache null shaders. 2020-04-22 11:36:25 -04:00
Fernando Sahmkow
ea522da8b5 Address Feedback. 2020-04-22 11:36:24 -04:00
ReinUsesLisp
0b9454849d vk_fence_manager: Initial implementation 2020-04-22 11:36:19 -04:00