Commit graph

106 commits

Author SHA1 Message Date
Morph
037e683b94 x64: cpu_wait: Implement MWAITX for non-MSVC compilers 2023-06-28 01:39:15 -04:00
Morph
88efcaf44e x64: cpu_wait: Remove magic values 2023-06-28 01:39:06 -04:00
Morph
cffefaf8a7 x64: cpu_wait: Make use of MWAITX in MicroSleep
MWAITX is equivalent to UMWAIT on Intel's Alder Lake CPUs.
We can emulate TPAUSE by using MONITORX in conjunction with MWAITX to wait for 100K cycles.
2023-06-28 01:38:55 -04:00
Morph
841a3559a5 x64: Add detection of monitorx instructions
monitorx introduces 2 instructions: MONITORX and MWAITX.
2023-06-28 01:36:06 -04:00
Morph
1b83c7eab4 (wall, native)_clock: Add GetGPUTick
Allows us to directly calculate the GPU tick without double conversion to and from the host clock tick.
2023-06-07 21:44:42 -04:00
Morph
c264630ba4 (wall, native)_clock: Rework NativeClock 2023-06-07 21:44:42 -04:00
Morph
728048edfe x64: Deduplicate RDTSC usage 2023-06-07 21:44:42 -04:00
Morph
fa3904acd9 x64: Simplify RDTSC on non-MSVC compilers
Co-Authored-By: liamwhite <liamwhite@users.noreply.github.com>
2023-03-27 17:45:22 -04:00
Morph
d260571440 x64: Add MicroSleep
MicroSleep allows the processor to pause for a "short" amount of time (in the microsecond range). This is useful for spin-waiting that does not require nanosecond precision.
This uses the new TPAUSE instruction introduced on Intel's newest processors as part of the waitpkg instructions. For CPUs that do not support waitpkg instructions, this is equivalent to yield().

Co-Authored-By: liamwhite <liamwhite@users.noreply.github.com>
2023-03-27 17:45:22 -04:00
Morph
95adf299e4 x64: cpu_detect: Add detection of waitpkg instructions
waitpkg introduces 3 instructions, UMONITOR, UMWAIT and TPAUSE.
2023-03-27 17:45:22 -04:00
Morph
e27dced550 native_clock: Wait for 10 seconds instead of 30
It was experimentally determined to be sufficient.
2023-03-07 21:17:46 -05:00
Morph
d766e783ea native_clock: Use RealTimeClock instead of SteadyClock
We want to synchronize RDTSC to real time.
2023-03-07 21:17:46 -05:00
Morph
afa678be3a native_clock: Re-adjust the RDTSC frequency
The RDTSC frequency reported by CPUID is not accurate to its true frequency.
We will spawn a separate thread to calculate the true RDTSC frequency after a measurement period of 30 seconds has elapsed.
2023-03-07 21:17:46 -05:00
Morph
38db5c2026 native_clock: Round RDTSC frequency to the nearest 1000 2023-03-05 02:36:31 -05:00
Matías Locatti
6b465c859b Add CPU core count to log files 2022-11-11 23:50:48 -03:00
Maide
68dcd946b7 Revert Coretiming PRs 8531 and 7454 (#8591) 2022-07-27 19:47:06 -04:00
Andrea Pappacoda
6a2efdda2f chore: make yuzu REUSE compliant
[REUSE] is a specification that aims at making file copyright
information consistent, so that it can be both human and machine
readable. It basically requires that all files have a header containing
copyright and licensing information. When this isn't possible, like
when dealing with binary assets, generated files or embedded third-party
dependencies, it is permitted to insert copyright information in the
`.reuse/dep5` file.

Oh, and it also requires that all the licenses used in the project are
present in the `LICENSES` folder, that's why the diff is so huge.
This can be done automatically with `reuse download --all`.

The `reuse` tool also contains a handy subcommand that analyzes the
project and tells whether or not the project is (still) compliant,
`reuse lint`.

Following REUSE has a few advantages over the current approach:

- Copyright information is easy to access for users / downstream
- Files like `dist/license.md` do not need to exist anymore, as
  `.reuse/dep5` is used instead
- `reuse lint` makes it easy to ensure that copyright information of
  files like binary assets / images is always accurate and up to date

To add copyright information of files that didn't have it I looked up
who committed what and when, for each file. As yuzu contributors do not
have to sign a CLA or similar I couldn't assume that copyright ownership
was of the "yuzu Emulator Project", so I used the name and/or email of
the commit author instead.

[REUSE]: https://reuse.software

Follow-up to b2eb103829
2022-07-27 12:53:49 +02:00
Marshall Mohror
cbadd75878 guard against div-by-zero 2022-07-06 13:00:00 -05:00
Marshall Mohror
b37f669584 common/x64: Use TSC clock rate from CPUID when available
The current method used to estimate the TSC is fairly accurate - within a few kHz - but the exact value can be extracted from CPUID if available.
2022-07-06 12:42:01 -05:00
Fernando Sahmkow
3adeb694b0 Adress Feedback. 2022-06-30 10:18:56 +02:00
Fernando Sahmkow
7f4debb936 Native clock: Use atomic ops as before. 2022-06-28 22:42:00 +02:00
Fernando Sahmkow
eadcaab9bd Native Clock: remove inaccuracy mask. 2022-06-28 01:47:00 +02:00
Fernando Sahmkow
d3becee4c0 Core: Fix tests. 2022-06-28 01:10:55 +02:00
Fernando Sahmkow
6b03abbbad Common: improve native clock. 2022-06-28 01:06:48 +02:00
Morph
2b87305d31 general: Convert source file copyright comments over to SPDX
This formats all copyright comments according to SPDX formatting guidelines.
Additionally, this resolves the remaining GPLv2 only licensed files by relicensing them to GPLv2.0-or-later.
2022-04-23 05:55:32 -04:00
Merry
42d6a01039 native_clock: Internal linkage for FencedRDTSC
__forceinline required on MSVC for function to be inlined
2022-04-03 22:38:12 +01:00
merry
b8d8677ed1 native_clock: Use lfence with rdtsc 2022-04-03 22:38:10 +01:00
merry
7470fdd77c native_clock: Use writeback from CAS to avoid double-loading 2022-04-02 22:22:48 +01:00
Merry
18ecb3053b native_clock: Use AtomicLoad128 2022-04-02 20:55:36 +01:00
ameerj
e70b4f3fc5 common: Reduce unused includes 2022-03-19 15:01:31 -04:00
Wunkolo
c802f8fbd2 cpu_detect: Add additional x86 flags and telemetry
Adds detection of additional CPU flags to cpu_detect and additions to telemetry output.

This is not exhaustive but guided by features that [dynarmic utilizes](bcfe377aaa/src/dynarmic/backend/x64/host_feature.h (L12-L33)) as well as features that are currently utilized but not reported to telemetry(invariant_tsc). This is intended to guide future optimizations.

AVX512 in particular is broken up into its individual subsets and some other processor features such as [sha](https://en.wikipedia.org/wiki/Intel_SHA_extensions) and [gfni](https://en.wikipedia.org/wiki/AVX-512#GFNI) are added to have some forward-facing data-points.

What used to be a single `CPU_Extension_x64_AVX512` telemetry field
is also broken up into individual `CPU_Extension_x64_AVX512{F,VL,CD,...}` fields.
2022-03-11 10:27:00 -08:00
Wunkolo
7cb99ccf23 cpu_detect: Revert __cpuid{ex} array-type argument
Restores compatibility with MSVC's `__cpuid` intrinsic.
2022-03-09 19:50:01 -08:00
Wunkolo
b603adb6ac cpu_detect: Add missing lzcnt detection 2022-03-09 13:57:47 -08:00
Wunkolo
4aa5b5779b cpu_detect: Refactor cpu/manufacturer identification
Set the zero-enum value to Unknown
Move the Manufacterer enum into the CPUCaps structure namespace
Add "ParseManufacturer" utility-function
Fix cpu/brand string buffer sizes(!)
2022-03-09 13:57:47 -08:00
Wunkolo
14618c0e98 cpu_detect: Update array-types to span and array
Update some uses of `int` into some more explicitly sized types as well
2022-03-09 13:57:47 -08:00
Wunkolo
609c64196b cpu_detect: Utilize Bit<N> utility function 2022-03-09 13:57:47 -08:00
Wunkolo
31f8d6f0cf cpu_detect: Compact capability fields
As this structure gets more explicit, bools can be bitfields and
small enums can use smaller types for their span of values.
2022-03-09 13:57:47 -08:00
Morph
fe2ff6b8a1 common: wall_clock: Utilize constants for ms, us, and ns ratios 2022-01-30 12:36:56 -05:00
Lioncash
1d5b635601 common/xbyak_api: Make BuildRegSet() constexpr
This allows us to eliminate any static constructors that would have been
emitted due to the function not being constexpr.
2022-01-26 16:29:15 -05:00
Morph
2e4b0fa68c common/cpu_detect: Remove CPU family and model
We currently do not make use of these fields, remove them for now.
2021-12-13 20:45:18 -05:00
Morph
875db1012b native_clock: Wait for less time in EstimateRDTSCFrequency
In my testing, waiting for 200ms provided the same level of precision as the previous implementation when estimating the RDTSC frequency.
This significantly improves the yuzu executable launch times since we reduced the wait time from 3 seconds to 200 milliseconds.
2021-12-03 19:55:59 -05:00
Morph
2b9afa4d56 general: Replace high_resolution_clock with steady_clock
On some OSes, high_resolution_clock is an alias to system_clock and is not monotonic in nature. Replace this with steady_clock.
2021-12-02 14:20:43 -05:00
Merry
891e19ef4c xbyak: Update include path 2021-08-15 19:26:38 +01:00
bunnei
e6f71e15a1 common: Merge uint128 to a single header file with inlines. 2021-02-15 14:46:04 -08:00
Fernando Sahmkow
659fb51dd9 X86/NativeClock: Reimplement RTDSC access to be lock free. 2021-01-02 04:00:27 +01:00
Fernando Sahmkow
50dd9a423a X86/NativeClock: Improve performance of clock calculations on hot path. 2021-01-02 00:43:47 +01:00
Lioncash
50516223e3 xbyak_abi: Shorten std::size_t to size_t
Makes for less reading.
2020-12-05 00:43:55 -05:00
Lioncash
65a52ab81a xbyak_abi: Avoid implicit sign conversions 2020-12-05 00:43:41 -05:00
Lioncash
29db886722 audio_core: Make shadowing and unused parameters errors
Moves the audio code closer to enabling warnings as errors in general.
2020-12-03 00:54:31 -05:00
Lioncash
38ffaef6eb common: Enable warnings as errors
Cleans up common so that we can enable warnings as errors.
2020-11-02 15:50:58 -05:00