Commit graph

381 commits

Author SHA1 Message Date
bunnei
e043a955f1 Merge pull request #3799 from ReinUsesLisp/iadd-cc
shader: Implement P2R CC, IADD Rd.CC and IADD.X
2020-04-30 12:56:36 -04:00
bunnei
2765bb73a2 Merge pull request #3805 from ReinUsesLisp/preserve-contents
texture_cache: Reintroduce preserve_contents accurately
2020-04-30 12:56:19 -04:00
bunnei
510842d827 Merge pull request #3784 from ReinUsesLisp/shader-memory-util
shader/memory_util: Deduplicate code
2020-04-28 12:05:50 -04:00
Fernando Sahmkow
fe8ea9e0c9 Merge pull request #3766 from ReinUsesLisp/renderpass-cache-key
vk_renderpass_cache: Pack renderpass cache key and unify keys
2020-04-27 16:05:14 -04:00
Fernando Sahmkow
2043198e50 Merge pull request #3756 from ReinUsesLisp/integrated-devices
vk_memory_manager: Remove unified memory model flag
2020-04-27 16:04:22 -04:00
ReinUsesLisp
8e3af5d3ca texture_cache: Reintroduce preserve_contents accurately
This reverts commit 1f1e80c67d.

preserve_contents proved to be a meaningful optimization. This commit
reintroduces it but properly implemented on OpenGL.

We have to make sure the clear removes all the previous contents of the
image.

It's not currently implemented on Vulkan because we can do smart things
there that's preferred to be introduced in a separate commit.
2020-04-26 19:53:02 -03:00
Rodrigo Locatti
0f7a89c2ef Merge pull request #3753 from ReinUsesLisp/ac-vulkan
{gl,vk}_rasterizer: Add lazy default buffer maker and use it for empty buffers
2020-04-26 01:55:43 -03:00
ReinUsesLisp
9b433b2467 shader/memory_util: Deduplicate code
Deduplicate code shared between vk_pipeline_cache and gl_shader_cache as
well as shader decoder code.

While we are at it, fix a bug in gl_shader_cache where compute shaders
had an start offset of a stage shader.
2020-04-26 01:38:51 -03:00
ReinUsesLisp
7e8f51273c shader/arithmetic_integer: Implement CC for IADD 2020-04-25 22:55:26 -03:00
bunnei
6cd0fc2a95 Merge pull request #3721 from ReinUsesLisp/sort-devices
vulkan/wrapper: Sort physical devices
2020-04-25 03:27:40 -04:00
ReinUsesLisp
88a6c10687 vk_rasterizer: Pack texceptions and color formats on invalid formats
Sometimes for unknown reasons NVN games can bind a render target format
of 0. This may be a yuzu bug.

With the commits before this the formats were specified without being
"packed", assuming all formats and texceptions will be written like in
the color_attachments vector.

To address this issue, iterate all render targets and pack them as they
are valid. This way they will match color_attachments.

- Fixes validation errors and graphical issues on Breath of the Wild.
2020-04-24 22:21:29 -03:00
Markus Wick
ac24f0506c Fix -Werror=conversion error. 2020-04-24 09:33:04 +02:00
ReinUsesLisp
f78f26b75a vk_rasterizer: Fix framebuffer creation validation errors
Framebuffer creation was ignoring the number of color attachments.
2020-04-23 17:34:16 -03:00
ReinUsesLisp
ab7eae6fff vk_pipeline_cache: Unify pipeline cache keys into a single operation
This allows us to call Common::CityHash and std::memcmp only once for
GraphicsPipelineCacheKey. While we are at it, do the same for compute.
2020-04-23 17:34:16 -03:00
ReinUsesLisp
7b76c67803 vk_renderpass_cache: Pack renderpass cache key to 12 bytes 2020-04-23 17:34:16 -03:00
bunnei
c916ad62e7 Merge pull request #3677 from FernandoS27/better-sync
Introduce Predictive Flushing and Improve ASYNC GPU
2020-04-22 22:09:38 -04:00
ReinUsesLisp
910decd9cb vk_pipeline_cache: Fix unintentional memcpy into optional
The intention behind this was to assign a float to from an uint32_t, but
it was unintentionally being copied directly into the std::optional.

Copy to a temporary and assign that temporary to std::optional. This can
be replaced with std::bit_cast<float> once we are in C++20.
2020-04-22 21:36:05 -03:00
Fernando Sahmkow
9fe7972120 Merge pull request #3653 from ReinUsesLisp/nsight-aftermath
renderer_vulkan: Integrate Nvidia Nsight Aftermath on Windows
2020-04-22 11:39:01 -04:00
Fernando Sahmkow
491aea4a91 Async GPU: Correct flushing behavior to be similar to old async GPU behavior. 2020-04-22 11:36:26 -04:00
Fernando Sahmkow
d9f1d5a4fd ShaderCache/PipelineCache: Cache null shaders. 2020-04-22 11:36:25 -04:00
Fernando Sahmkow
ea522da8b5 Address Feedback. 2020-04-22 11:36:24 -04:00
ReinUsesLisp
0b9454849d vk_fence_manager: Initial implementation 2020-04-22 11:36:19 -04:00
Fernando Sahmkow
1966f1d948 OpenGL: Guarantee writes to Buffers. 2020-04-22 11:36:18 -04:00
Fernando Sahmkow
7986c97ed2 GPU: Implement Flush Requests for Async mode. 2020-04-22 11:36:17 -04:00
Fernando Sahmkow
af9f901764 FenceManager: Manage syncpoints and rename fences to semaphores. 2020-04-22 11:36:16 -04:00
Fernando Sahmkow
6092308fe4 Rasterizer: Document SignalFence & ReleaseFences and setup skeletons on Vulkan. 2020-04-22 11:36:14 -04:00
Fernando Sahmkow
e7195b5f87 ThreadManager: Sync async reads on accurate gpu. 2020-04-22 11:36:12 -04:00
Fernando Sahmkow
de53bc96c0 BufferCache: Implement OnCPUWrite and SyncGuestHost 2020-04-22 11:36:07 -04:00
Fernando Sahmkow
c689dc6804 GPU: Refactor synchronization on Async GPU 2020-04-22 11:36:06 -04:00
ReinUsesLisp
88b092e717 vk_memory_manager: Remove unified memory model flag
All drivers (even Intel) seem to have a device local memory type that is
not host visible. Remove this flag so all devices follow the same path.

This fixes a crash when trying to map to host device local memory on
integrated devices.
2020-04-21 22:06:38 -03:00
ReinUsesLisp
00bef5d0d3 vk_rasterizer: Add lazy default buffer maker and use it for empty buffers
Introduce a default buffer getter that lazily constructs an empty
buffer. This is intended to match OpenGL's buffer 0.

Use this for disabled vertex and uniform buffers.

While we are at it, include vertex buffer usages for staging buffers to
silence validation errors.
2020-04-21 19:55:52 -03:00
ReinUsesLisp
b33a0c0d5f gl_rasterizer: Fix buffers without size
On NVN buffers can be enabled but have no size. According to deko3d and
the behavior we see in Animal Crossing: New Horizons these buffers get
the special address of 0x1000 and limit themselves to 0xfff.

Implement buffers without a size by binding a null buffer to OpenGL
without a side.

1d1930beea/source/maxwell/gpu_3d_vbo.cpp (L62-L63)
2020-04-21 19:55:44 -03:00
Rodrigo Locatti
89ba13c7d2 Merge pull request #3718 from ReinUsesLisp/better-pipeline-state
fixed_pipeline_state: Pack structure, use memcmp and CityHash on it
2020-04-21 18:17:58 -03:00
Mat M
fe0364e257 Merge pull request #3733 from ambasta/patch-2
Initialize quad_indexed_pass before uint8_pass
2020-04-20 20:36:46 -04:00
Fernando Sahmkow
555a1273c9 Merge pull request #3700 from ReinUsesLisp/stream-buffer-sizes
vk_stream_buffer: Fix out of memory on boot on recent Nvidia drivers
2020-04-20 09:37:42 -04:00
Amit Prakash Ambasta
7915dc7ac9 Initialize quad_indexed_pass before uint8_pass
Fixes Werror=reorder in gcc
2020-04-20 04:53:52 +05:30
bunnei
a39273cb95 Merge pull request #3694 from ReinUsesLisp/indexed-quads
vk_compute_pass: Implement indexed quads
2020-04-19 16:52:40 -04:00
Jan Beich
cc5e71c5ad renderer_vulkan: assume X11 if not Windows/macOS after 30bbdc653c
Render.Vulkan <Error> video_core/renderer_vulkan/renderer_vulkan.cpp:CreateInstance:131: Presentation not supported on this platform
Render.Vulkan <Error> video_core/renderer_vulkan/renderer_vulkan.cpp:CreateSurface:378: Presentation not supported on this platform
Core <Critical> core/core.cpp:Load:199: Failed to initialize system (Error 5)!
2020-04-19 00:32:23 +00:00
ReinUsesLisp
fcbe714435 vulkan/wrapper: Sort physical devices
Sort discrete GPUs over the rest, Nvidia over AMD, AMD over Intel, Intel
over the rest. This gives us a somewhat consistent order when Optimus
is removed (renderdoc does this when it's attached).

This can break the configuration of users with an Intel GPU that
manually remove Optimus on yuzu. That said, it's a very unlikely to
happen.
2020-04-18 21:31:15 -03:00
ReinUsesLisp
98266da47c fixed_pipeline_state: Hash and compare the whole structure
Pad FixedPipelineState's size to 384 bytes to be a multiple of 16.

Compare the whole struct with std::memcmp and hash with CityHash. Using
CityHash instead of a naive hash should reduce the number of collisions.
Improve used type traits to ensure this operation is safe.

With these changes the improvements to the hashable pipeline state are:

Optimized structure
Hash:            89 ns
Comparison:     103 ns
Construction*:  164 ns
Struct size:    384 bytes

Original structure
Hash:           148 ns
Equal:          174 ns
Construction*:  281 ns
Size:          1384 bytes

* Attribute state initialization is not measured

These measures are averages taken with std::chrono::high_accuracy_clock
on MSVC shipped on Visual Studio 16.6.0 Preview 2.1.
2020-04-18 19:57:26 -03:00
ReinUsesLisp
89c816a3cf fixed_pipeline_state: Pack blending state
Reduce FixedPipelineState's size to 364 bytes.
2020-04-18 19:23:35 -03:00
ReinUsesLisp
d74b3a5a50 fixed_pipeline_state: Pack rasterizer state
Reduce FixedPipelineState's size to 600 bytes.
2020-04-18 19:22:57 -03:00
ReinUsesLisp
cc6af27ae7 fixed_pipeline_state: Pack depth stencil state
Reduce FixedPipelineState's size to 632 bytes.
2020-04-18 19:22:11 -03:00
ReinUsesLisp
fd2f04bbdc fixed_pipeline_state: Pack attribute state
Reduce FixedPipelineState's size from 1384 to 664 bytes
2020-04-18 19:21:19 -03:00
ReinUsesLisp
153845bba3 vk_stream_buffer: Fix out of memory on boot on recent Nvidia drivers
Nvidia recently introduced a new memory type for data streaming
(awesome!), but yuzu was assuming that all heaps had enough memory
for the assumed stream buffer size (256 MiB).

This worked fine on AMD but Nvidia's new memory heap was smaller than
256 MiB. This commit changes this assumption and allocates a bit less
than the size of the preferred heap, with a maximum of 256 MiB (to avoid
allocating all system memory on integrated devices).

- Fixes a crash on NVIDIA 450.82.0.0
2020-04-17 18:12:48 -03:00
ReinUsesLisp
14ab5c4b65 vk_compute_pass: Implement indexed quads
Implement indexed quads (GL_QUADS used with glDrawElements*) with a
compute pass conversion.

The compute shader converts from uint8/uint16/uint32 indices to uint32.
The format is passed through push constants to avoid having different
variants of the same shader.

- Used by Fast RMX
- Used by Xenoblade Chronicles 2 (it still has graphical due to
synchronization issues on Vulkan)
2020-04-16 21:12:32 -03:00
Fernando Sahmkow
7a9b83b658 Merge pull request #3600 from ReinUsesLisp/no-pointer-buf-cache
buffer_cache: Return handles instead of pointer to handles
2020-04-16 19:58:13 -04:00
ReinUsesLisp
c1ad40a3cb buffer_cache: Return handles instead of pointer to handles
The original idea of returning pointers is that handles can be moved.
The problem is that the implementation didn't take that in mind and made
everything harder to work with. This commit drops pointer to handles and
returns the handles themselves. While it is still true that handles can
be invalidated, this way we get an old handle instead of a dangling
pointer.

This problem can be solved in the future with sparse buffers.
2020-04-16 02:33:34 -03:00
Lioncash
3c3928a5f7 video_core: Amend doxygen comment references
Fixes broken documentation references.
2020-04-15 22:33:29 -04:00
Fernando Sahmkow
d06795c08a Merge pull request #3612 from ReinUsesLisp/red
shader/memory: Implement RED.E.ADD and minor changes to ATOM
2020-04-15 15:03:49 -04:00