Commit graph

81 commits

Author SHA1 Message Date
liushuyu
ef00c213e8 video_core/codec: address comments 2021-12-02 21:01:34 -07:00
liushuyu
a32139fdab video_core/codecs: more robust ffmpeg hwdecoder selection logic 2021-12-02 21:01:34 -07:00
liushuyu
1a5c1d70de video_core/codec: address comments 2021-11-24 18:06:38 -07:00
liushuyu
f91cc356fb video_core/codecs: fix multiple decoding issues on Linux ...
* when someone installed Intel video drivers on an AMD system, the
  decoder will select the Intel VA-API decoding driver and yuzu will
  crash due to incorrect driver selection; the fix will check if the
  currently about-to-use driver is loaded in the kernel
* when using NVIDIA driver on Linux with a ffmpeg that does not have
  CUDA capability enabled, the decoder will crash; the fix simply
  making the decoder prefers the VDPAU driver over CUDA on Linux
2021-11-24 17:23:57 -07:00
ameerj
bf504f15f6 codes: Rename ComposeFrameHeader to ComposeFrame
These functions were composing the entire frame, not just the headers. Rename to more accurately describe them.
2021-11-12 23:52:19 -05:00
ameerj
048eb094ba vp8: Implement header composition
Enables frame decoding with FFmpeg
2021-11-12 23:52:18 -05:00
ameerj
538647f62b codecs: Add VP8 codec class 2021-11-12 19:49:45 -05:00
Morph
7d2aed7268 Merge pull request #7157 from ameerj/vic-surface-size
vic: Use the minimum of surface/frame dimensions when writing the final frame to the GPU
2021-10-13 20:41:17 -04:00
Ameer J
d0501cae90 Merge pull request #7109 from vonchenplus/fix_h264_max__reference_num_error
h264: Use max allowed max_num_ref_frames when using CPU decoding
2021-10-12 14:08:37 -04:00
ameerj
373d7189f9 vic: Use the minimum of surface/frame dimensions when writing the final frame to the GPU
Addresses possible buffer overflow behavior.
2021-10-10 18:44:16 -04:00
Feng Chen
78317b1a8d h264: Use max allowed max_num_ref_frames when using CPU decoding 2021-10-10 20:07:19 +08:00
Valeri
ebf2ab5afb vic: Allow surface to be higher than frame
Touhou Genso Wanderer Lotus Labyrinth R decodes 1920x1080 videos into 1920x1088 surface.
Only allow mismatch for height, since larger width would result in increasingly offset rows and somewhat defeat entire purpose of this check.
2021-10-09 20:22:09 +03:00
ameerj
92bd5571cd vic: Avoid memory corruption when multiple streams with different dimensions are decoded
This is a work around to avoid buffer overflow errors until multi channel/multi stream decoding is supported.
2021-10-08 01:22:38 -04:00
ameerj
71698d7351 vic: Refactor frame writing methods 2021-10-07 14:56:44 -04:00
ameerj
62efd87fd9 vic: Implement RGBX frame format 2021-10-07 11:06:57 -04:00
Morph
84b969a442 codec: Add missing <string_view> include 2021-09-11 17:19:14 -04:00
Fernando S
f35f5c5072 Merge pull request #6846 from ameerj/nvdec-gpu-decode
nvdec: Add GPU video decoding for all capable drivers and platforms
2021-09-11 23:11:32 +02:00
ameerj
8b0a45defd vp9_types: Minor refactor of VP9 info structs. 2021-08-25 21:42:43 -04:00
ameerj
01ac464999 vp9_types: Remove unused Vp9PictureInfo members 2021-08-25 21:29:22 -04:00
ameerj
681b194e24 h264: Lower max_num_ref_frames
GPU decoding seems to be more picky when it comes to the maximum number of reference frames.
2021-08-16 14:40:53 -04:00
ameerj
82906e26a5 configure_graphics: Add GPU nvdec decoding as an option
Some system configurations may see visual regressions or lower performance using GPU decoding compared to CPU decoding. This setting provides the option for users to specify their decoding preference.

Co-Authored-By: yzct12345 <87620833+yzct12345@users.noreply.github.com>
2021-08-16 14:40:53 -04:00
ameerj
7cd52be8c4 codec: Improve libav memory alloc and cleanup 2021-08-16 14:40:53 -04:00
ameerj
5fd82a4ec1 codec: Fallback to CPU decoding if no compatible GPU format is found 2021-08-16 14:40:53 -04:00
bunnei
1b274368b1 Merge pull request #6838 from ameerj/sws-align
vic: Specify sws_scale height stride.
2021-08-12 11:28:33 -07:00
ameerj
f885866fba codec: Replace deprecated av_init_packet usage 2021-08-12 01:28:01 -04:00
ameerj
561fd5f7a4 nvdec: Implement GPU accelerated decoding for all platforms
Supplements the VAAPI intel gpu decoder by implementing the D3D11VA decoder for Windows, and CUVID/VDPAU for Nvidia and AMD on drivers linux respectively.
2021-08-12 01:28:01 -04:00
ameerj
8236b4f4d7 vic: Specify sws_scale height stride.
Silences a sws_scale runtime warning about unaligned strides.
2021-08-09 23:24:16 -04:00
ameerj
4cd45cf374 vp9: Ensure the first frame is complete
Silences a runtime error due to the first frame missing the frame data, and being set to hidden despite being a key-frame.
2021-08-08 13:49:00 -04:00
ameerj
f52a3de990 nvdec: Better logging for unimplemented codecs 2021-08-07 01:08:33 -04:00
bunnei
cb6d198101 Merge pull request #6799 from ameerj/vp9-fixes
nvdec: Fix VP9 reference frame refreshes
2021-08-06 17:46:46 -07:00
ameerj
b34ded024c vp9: Cleanup unused variables
With reference frames refreshes fix, we no longer need to buffer two frames in advance.
We can also remove other unused or otherwise unneeded variables.
2021-08-06 20:08:11 -04:00
ameerj
27969c5943 vp9: Fix reference frame refreshes
This resolves the artifacting when decoding VP9 streams.
2021-08-06 20:08:08 -04:00
yzct12345
e13e98d99d nvdec: Implement VA-API hardware video acceleration (#6713)
* nvdec: VA-API

* Verify formatting

* Forgot a semicolon for Windows

* Clarify comment about AV_PIX_FMT_NV12

* Fix assert log spam from missing negation

* vic: Remove forgotten debug code

* Address lioncash's review

* Mention VA-API is Intel/AMD

* Address v1993's review

* Hopefully fix CMakeLists style this time

* vic: Improve cache locality

* vic: Fix off-by-one error

* codec: Async

* codec: Forgot the GetValue()

* nvdec: Address ameerj's review

* codec: Fallback to CPU without VA-API support

* cmake: Address lat9nq's review

* cmake: Make VA-API optional

* vaapi: Multiple GPU

* Apply suggestions from code review

Co-authored-by: Ameer J <52414509+ameerj@users.noreply.github.com>

* nvdec: Address ameerj's review

* codec: Use anonymous instead of static

* nvdec: Remove enum and fix memory leak

* nvdec: Address ameerj's review

* codec: Remove preparation for threading

Co-authored-by: Ameer J <52414509+ameerj@users.noreply.github.com>
2021-08-03 23:43:11 -04:00
Fernando S
51480d636e Merge pull request #6525 from ameerj/nvdec-fixes
nvdec: Fix Submit Ioctl data source, vic frame dimension computations
2021-07-15 15:17:50 +02:00
ameerj
34b718006e vic: Fix dimension compuation of YUV frames
Fixes out of bound memory crashes in Mario Golf
2021-07-15 00:51:50 -04:00
bunnei
88cb6c26f3 Merge pull request #6537 from Morph1984/warnings
general: Enforce multiple warnings in MSVC
2021-07-05 17:09:23 -07:00
Kelebek1
05fb3db000 Slightly refactor NVDEC and codecs for readability and safety 2021-07-01 06:22:05 +01:00
Morph
61fc23e127 video_core: Remove #pragma warning directives for external headers 2021-06-28 14:21:40 -04:00
ReinUsesLisp
d5154a3b19 codec,vic: Disable warnings in ffmpeg headers 2021-06-26 03:29:31 -03:00
lat9nq
3c453e0df0 vp9: Avoid memcpy with null pointers
Avoid sending null pointer to memcpy as reported by Undefined Behaviour
Sanitizer. Replaces the std::memcpy calls in SpliceVectors with
std::copy calls. Opting to replace all the memcpy's with copy's.

Co-authored-by: LC <mathew1800@gmail.com>
2021-04-05 00:44:38 -04:00
ameerj
01dec35df3 rebase, fix name shadowing, more const 2021-02-13 13:07:56 -05:00
ameerj
427eca063d streamline cdma_pusher/command_classes 2021-02-13 13:07:56 -05:00
ameerj
e97cd00753 streamline cdma_pusher/command_classes 2021-02-13 13:07:53 -05:00
ameerj
be6c487b4e nvdec cleanup 2021-02-13 13:07:31 -05:00
ReinUsesLisp
2dfce2fca6 video_core: Reimplement the buffer cache
Reimplement the buffer cache using cached bindings and page level
granularity for modification tracking. This also drops the usage of
shared pointers and virtual functions from the cache.

- Bindings are cached, allowing to skip work when the game changes few
  bits between draws.
- OpenGL Assembly shaders no longer copy when a region has been modified
  from the GPU to emulate constant buffers, instead GL_EXT_memory_object
  is used to alias sub-buffers within the same allocation.
- OpenGL Assembly shaders stream constant buffer data using
  glProgramBufferParametersIuivNV, from NV_parameter_buffer_object. In
  theory this should save one hash table resolve inside the driver
  compared to glBufferSubData.
- A new OpenGL stream buffer is implemented based on fences for drivers
  that are not Nvidia's proprietary, due to their low performance on
  partial glBufferSubData calls synchronized with 3D rendering (that
  some games use a lot).
- Most optimizations are shared between APIs now, allowing Vulkan to
  cache more bindings than before, skipping unnecesarry work.

This commit adds the necessary infrastructure to use Vulkan object from
OpenGL. Overall, it improves performance and fixes some bugs present on
the old cache. There are still some edge cases hit by some games that
harm performance on some vendors, this are planned to be fixed in later
commits.
2021-02-13 02:17:22 -03:00
Lioncash
d5bff783bd common/bit_util: Replace CLZ/CTZ operations with standardized ones
Makes for less code that we need to maintain.
2021-01-15 02:15:32 -05:00
Ameer J
21ff77c366 remove inaccurate reference
Co-authored-by: LC <mathew1800@gmail.com>
2021-01-07 14:33:45 -05:00
ameerj
30f3faf3e2 fix for nvdec disabled, cleanup host1x 2021-01-07 14:33:45 -05:00
ameerj
762de858e6 nvdec syncpt incorporation
laying the groundwork for async gpu, although this does not fully implement async nvdec operations
2021-01-07 14:33:45 -05:00
Morph
23413c0d44 general: Fix various spelling errors 2021-01-02 10:23:41 -05:00