Lioncash
953aff6f0e
async_shaders: emplace threads into the worker thread vector
...
Same behavior, but constructs the threads in place instead of moving
them.
2020-11-20 04:46:56 -05:00
Lioncash
a15dc601be
async_shaders: Simplify implementation of GetCompletedWork()
...
This is equivalent to moving all the contents and then clearing the
vector. This avoids a redundant allocation.
2020-11-20 04:44:44 -05:00
Lioncash
088767da00
async_shaders: Simplify moving data into the pending queue
2020-11-20 04:41:29 -05:00
Lioncash
8ac9c08758
async_shaders: std::move data within QueueVulkanShader()
...
Same behavior, but avoids redundant copies.
While we're at it, we can simplify the pushing of the parameters into
the pending queue.
2020-11-20 04:38:18 -05:00
Lioncash
ae52755d5c
gl_rasterizer: Make floating-point literal a float
...
Gets rid of an unnecessary expansion from float to double.
2020-11-20 04:24:33 -05:00
Lioncash
b482ef0f06
shader_bytecode: Make use of [[nodiscard]] where applicable
...
Ensures that all queried values are made use of.
2020-11-20 02:20:37 -05:00
Lioncash
b0172c1888
shader_bytecode: Eliminate variable shadowing
2020-11-20 02:13:45 -05:00
Rodrigo Locatti
4263ecebf5
Merge pull request #4308 from ReinUsesLisp/maxwell-3d-funcs
...
maxwell_3d: Move code to separate functions and insert instead of push_back
2020-11-20 01:57:22 -03:00
Lioncash
e71d8eef5e
rasterizer_interface: Make use of [[nodiscard]] where applicable
2020-11-17 07:19:13 -05:00
Lioncash
049b4d4427
render_base: Make use of [[nodiscard]] where applicable
2020-11-17 07:19:12 -05:00
Lioncash
47bc37f5cf
gpu: Make use of [[nodiscard]] where applicable
2020-11-17 07:19:09 -05:00
ReinUsesLisp
bdd8e12504
maxwell_3d: Use insert instead of loop push_back
...
This reduces the overhead of bounds checking on each element.
It won't reduce the cost of allocation because usually this vector's
capacity is usually large enough to hold whatever we push to it.
2020-11-11 19:52:19 -03:00
ReinUsesLisp
45cfd67ae6
maxwell_3d: Move code to separate functions
...
Deduplicate some code and put it in separate functions so it's easier to
understand and profile.
2020-11-11 19:52:19 -03:00
bunnei
0b6324b3a6
video_core: dma_pusher: Remove integrity check on command lists.
...
- This seems to cause softlocks in Breath of the Wild.
2020-11-07 00:08:19 -08:00
bunnei
a3ee29e9f5
Merge pull request #4891 from lioncash/clang2
...
General: Fix clang build
2020-11-06 10:33:13 -08:00
bunnei
cab7ed694c
Merge pull request #4854 from ReinUsesLisp/cube-array-shadow
...
shader: Partially implement texture cube array shadow
2020-11-05 16:25:00 -08:00
Lioncash
8fc37d6fca
General: Fix clang build
...
Allows building on clang to work again
2020-11-05 10:07:16 -05:00
bunnei
97547d70f5
Merge pull request #4858 from lioncash/initializer
...
General: Resolve a few missing initializer warnings
2020-11-04 12:10:10 -08:00
Chloe
c16acdfb9b
Merge pull request #4869 from bunnei/improve-gpu-sync
...
Improvements to GPU synchronization & various refactoring
2020-11-04 18:36:55 +11:00
bunnei
9a81537447
Merge pull request #4874 from lioncash/nodiscard2
...
nvdec: Make use of [[nodiscard]] where applicable
2020-11-03 16:34:07 -08:00
Lioncash
2ee59021c2
nvdec: Make use of [[nodiscard]] where applicable
...
Prevents bugs from occurring where the results of a function are
accidentally discarded
2020-11-02 02:45:15 -05:00
bunnei
a4825b81ac
Merge pull request #4865 from ameerj/async-threadcount
...
async_shaders: Increase Async worker thread count for >8 thread cpus
2020-11-01 01:54:01 -07:00
bunnei
af7ab45b45
video_core: dma_pusher: Add support for integrity checks.
...
- Log corrupted command lists, rather than crash.
2020-11-01 01:52:38 -07:00
bunnei
69f4a66d23
video_core: dma_pusher: Add support for prefetched command lists.
2020-11-01 01:52:38 -07:00
bunnei
c112a94dfe
video_core: gpu: Implement WaitFence and IncrementSyncPoint.
2020-11-01 01:52:37 -07:00
bunnei
e9b37e4d52
Merge pull request #4853 from ReinUsesLisp/fcmp-imm
...
shader/arithmetic: Implement FCMP immediate + register variant
2020-10-31 01:25:02 -07:00
Lioncash
6425e155f2
vp9: Be explicit with copy and move operators
...
It's deprecated in the language to autogenerate these if the destructor
for a type is specified, so we can explicitly specify how we want these
to be generated.
2020-10-29 22:57:35 -04:00
Lioncash
0c2f517cd0
vp9: Mark functions with [[nodiscard]] where applicable
...
Prevents values from mistakenly being discarded in cases where it's a
bug to do so.
2020-10-29 22:57:32 -04:00
Lioncash
114153e8c8
vp9: Provide a default initializer for "hidden" member
...
The API of VP9 exposes a WasFrameHidden() function which accesses this
member. Given the constructor previously didn't initialize this member,
it's a potential vector for an uninitialized read.
Instead, we can initialize this to a deterministic value to prevent that
from occurring.
2020-10-29 22:35:55 -04:00
Lioncash
e54c1f4a41
vp9: Make some member functions internally linked
...
These helper functions don't directly modify any member state and can be
hidden from view.
2020-10-29 22:34:46 -04:00
Lioncash
3e654ff0d0
General: Resolve a few missing initializer warnings
...
Resolves a few -Wmissing-initializer warnings.
2020-10-29 19:37:07 -04:00
bunnei
02234a9d92
Merge pull request #4837 from lioncash/nvdec-2
...
nvdec: Minor tidying up
2020-10-29 12:28:07 -07:00
ameerj
b139341d35
async_shaders: Increase Async worker thread count for 8+ thread cpus
...
Adds 1 async worker thread for every 2 available threads above 8
2020-10-29 14:16:45 -04:00
bunnei
f5ca79ae27
Merge pull request #4838 from lioncash/syncmgr
...
sync_manager: Amend parameter order of calls to SyncptIncr constructor
2020-10-28 22:49:22 -07:00
bunnei
6639ab9470
video_core: cdma_pusher: Add missing LOG_DEBUG field in ExecuteCommand.
2020-10-28 16:47:08 -07:00
ReinUsesLisp
5690bca1c4
shader: Partially implement texture cube array shadow
...
This implements texture cube arrays with shadow comparisons but doesn't
fix the asserts related to it.
Fixes out of bounds reads on swizzle constructors and makes them use
bounds checked ::at instead of the unsafe operator[].
2020-10-28 17:12:40 -03:00
ReinUsesLisp
016fb60ff5
shader/arithmetic: Implement FCMP immediate + register variant
...
Trivially add the encoding for this.
2020-10-28 17:05:41 -03:00
LC
cf0f8d0969
Merge pull request #4848 from ReinUsesLisp/type-limits
...
video_core: Enforce -Werror=type-limits
2020-10-28 03:16:10 -04:00
ReinUsesLisp
de16b5a409
video_core: Enforce -Wredundant-move and -Wpessimizing-move
...
Silence three warnings and make them errors to avoid introducing more in the future.
2020-10-28 02:44:50 -03:00
ReinUsesLisp
1ae83819d9
video_core: Enforce -Werror=type-limits
...
Silences one warning and avoids introducing more in the future.
2020-10-28 02:37:47 -03:00
Lioncash
7115197c4c
sync_manager: Amend parameter order of calls to SyncptIncr constructor
...
Corrects some cases where the arguments would be incorrectly swapped.
2020-10-27 03:22:57 -04:00
Lioncash
eb5c069e32
h264: Make WriteUe take a u32
...
Enforces the type of the desired value in calling code.
2020-10-27 03:21:53 -04:00
Lioncash
105a345f4e
vp9: std::move buffer within ComposeFrameHeader()
...
We can move the buffer here to avoid a heap reallocation
2020-10-27 02:27:31 -04:00
Lioncash
c0992fd760
vp9: Remove dead code
2020-10-27 02:26:17 -04:00
Lioncash
617f016746
vp9: Join declarations with assignments
2020-10-27 02:26:03 -04:00
Lioncash
9efecc310b
vp9: Remove pessimizing moves
...
The move will already occur without std::move.
2020-10-27 02:21:40 -04:00
Lioncash
4e10e539d8
vp9: Resolve variable shadowing
2020-10-27 02:20:17 -04:00
Lioncash
31d9c3c75e
nvdec: Tidy up header includes
...
Prevents a few unnecessary inclusions.
2020-10-27 02:16:42 -04:00
ameerj
9ef5c53e52
video_core: NVDEC Implementation
...
This commit aims to implement the NVDEC (Nvidia Decoder) functionality, with video frame decoding being handled by the FFmpeg library.
The process begins with Ioctl commands being sent to the NVDEC and VIC (Video Image Composer) emulated devices. These allocate the necessary GPU buffers for the frame data, along with providing information on the incoming video data. A Submit command then signals the GPU to process and decode the frame data.
To decode the frame, the respective codec's header must be manually composed from the information provided by NVDEC, then sent with the raw frame data to the ffmpeg library.
Currently, H264 and VP9 are supported, with VP9 having some minor artifacting issues related mainly to the reference frame composition in its uncompressed header.
Async GPU is not properly implemented at the moment.
Co-Authored-By: David <25727384+ogniK5377@users.noreply.github.com>
2020-10-26 23:07:36 -04:00
bunnei
8d2916a15c
Merge pull request #4706 from ReinUsesLisp/cmake-host-shaders
...
video_core: Fix instances where msbuild always regenerated host shaders
2020-10-23 10:01:16 -07:00