Commit graph

190 commits

Author SHA1 Message Date
riperiperi
da587c8e49 Implement Viewport Transform Disable (#3328)
* Initial implementation (no specialization)

* Use specialization

* Fix render scale, increase code gen version

* Revert accidental change

* Address Feedback
2022-05-12 10:47:13 -03:00
gdkchan
7ffaeab4da New shader cache implementation (#3194)
* New shader cache implementation

* Remove some debug code

* Take transform feedback varying count into account

* Create shader cache directory if it does not exist + fragment output map related fixes

* Remove debug code

* Only check texture descriptors if the constant buffer is bound

* Also check CPU VA on GetSpanMapped

* Remove more unused code and move cache related code

* XML docs + remove more unused methods

* Better codegen for TransformFeedbackDescriptor.AsSpan

* Support migration from old cache format, remove more unused code

Shader cache rebuild now also rewrites the shared toc and data files

* Fix migration error with BRX shaders

* Add a limit to the async translation queue

 Avoid async translation threads not being able to keep up and the queue growing very large

* Re-create specialization state on recompile

This might be required if a new version of the shader translator requires more or less state, or if there is a bug related to the GPU state access

* Make shader cache more error resilient

* Add some missing XML docs and move GpuAccessor docs to the interface/use inheritdoc

* Address early PR feedback

* Fix rebase

* Remove IRenderer.CompileShader and IShader interface, replace with new ShaderSource struct passed to CreateProgram directly

* Handle some missing exceptions

* Make shader cache purge delete both old and new shader caches

* Register textures on new specialization state

* Translate and compile shaders in forward order (eliminates diffs due to different binding numbers)

* Limit in-flight shader compilation to the maximum number of compilation threads

* Replace ParallelDiskCacheLoader state changed event with a callback function

* Better handling for invalid constant buffer 1 data length

* Do not create the old cache directory structure if the old cache does not exist

* Constant buffer use should be per-stage. This change will invalidate existing new caches (file format version was incremented)

* Replace rectangle texture with just coordinate normalization

* Skip incompatible shaders that are missing texture information, instead of crashing

This is required if we, for example, support new texture instruction to the shader translator, and then they allow access to textures that were not accessed before. In this scenario, the old cache entry is no longer usable

* Fix coordinates normalization on cubemap textures

* Check if title ID is null before combining shader cache path

* More robust constant buffer address validation on spec state

* More robust constant buffer address validation on spec state (2)

* Regenerate shader cache with one stream, rather than one per shader.

* Only create shader cache directory during initialization

* Logging improvements

* Proper shader program disposal

* PR feedback, and add a comment on serialized structs

* XML docs for RegisterTexture

Co-authored-by: riperiperi <rhy3756547@hotmail.com>
2022-04-10 10:49:44 -03:00
gdkchan
401f0cd555 Implement VMAD shader instruction and improve InvocationInfo and ISBERD handling (#3251)
* Implement VMAD shader instruction and improve InvocationInfo and ISBERD handling

* Shader cache version bump

* Fix typo
2022-04-08 12:42:39 +02:00
merry
5d66aa0d56 Lop3Expression: Optimize expressions (#3184)
* lut3

* bugfixes

* TruthTable

* false/true -> 0/-1

* add or to expressions

* fix inversions

* increment cache version
2022-04-08 11:17:38 +02:00
gdkchan
837c4f0dfe Fix shader textureSize with multisample and buffer textures (#3240)
* Fix shader textureSize with multisample and buffer textures

* Replace out param with tuple return value
2022-04-04 14:43:58 -03:00
gdkchan
083bb71694 Initialize indexed inputs used on next shader stage (#3198) 2022-03-14 20:02:50 -03:00
gdkchan
9baa9ea772 Do not initialize geometry shader passthrough attributes (#3196) 2022-03-14 19:35:41 -03:00
gdkchan
1d150b6730 Only initialize shader outputs that are actually used on the next stage (#3054)
* Only initialize shader outputs that are actually used on the next stage

* Shader cache version bump
2022-03-06 20:42:13 +01:00
gdkchan
e5c1412aa8 Prefer texture over textureSize for sampler type (#3132)
* Prefer texture over textureSize for sampler type

* Shader cache version bump
2022-02-18 02:44:46 +01:00
gdkchan
39ebb000c1 Do not allow render targets not explicitly written by the fragment shader to be modified (#3063)
* Do not allow render targets not explicitly written by the fragment shader to be modified

* Shader cache version bump

* Remove blank lines

* Avoid redundant color mask updates

* HostShaderCacheEntry can be null

* Avoid more redundant glColorMask calls

* nit: Mask -> Masks

* Fix currentComponentMask

* More efficient way to update _currentComponentMasks
2022-02-16 23:15:39 +01:00
gdkchan
17b84e4279 Fix missing geometry shader passthrough inputs (#3106)
* Fix missing geometry shader passthrough inputs

* Shader cache version bump
2022-02-11 19:52:20 +01:00
gdkchan
73d10233ef Stop using glTransformFeedbackVaryings and use explicit layout on the shader (#3012)
* Stop using glTransformFeedbackVarying and use explicit layout on the shader

* This is no longer needed

* Shader cache version bump

* Fix gl_PerVertex output for tessellation control shaders
2022-01-21 12:35:21 -03:00
gdkchan
4c4bd46cc3 Add capability for BGRA formats (#3011) 2022-01-20 08:37:21 -03:00
gdkchan
1f5a6e43c2 Implement IMUL, PCNT and CONT shader instructions, fix FFMA32I and HFMA32I (#2972)
* Implement IMUL shader instruction

* Implement PCNT/CONT instruction and fix FFMA32I

* Add HFMA232I to the table

* Shader cache version bump

* No Rc on Ffma32i
2022-01-10 12:08:00 -03:00
riperiperi
142cdc54d9 Add support for render scale to vertex stage. (#2763)
* Add support for render scale to vertex stage.

Occasionally games read off textureSize on the vertex stage to inform the fragment shader what size a texture is without querying in there. Scales were not present in the vertex shader to correct the sizes, so games were providing the raw upscaled texture size to the fragment shader, which was incorrect.

One downside is that the fragment and vertex support buffer description must be identical, so the full size scales array must be defined when used. I don't think this will have an impact though. Another is that the fragment texture count must be updated when vertex shader textures are used. I'd like to correct this so that the update is folded into the update for the scales.

Also cleans up a bunch of things, like it making no sense to call CommitRenderScale for each stage.

Fixes render scale causing a weird offset bloom in Super Mario Party and Clubhouse Games. Clubhouse Games still has a pixelated look in a number of its games due to something else it does in the shader.

* Split out support buffer update, lazy updates.

* Commit support buffer before compute dispatch

* Remove unnecessary qualifier.

* Address Feedback
2022-01-08 14:48:48 -03:00
gdkchan
63de118842 Fix SUATOM and other texture shader instructions with RZ dest (#2885)
* Fix SUATOM and other texture shader instructions with RZ dest

* Shader cache version bump
2021-12-08 18:36:09 -03:00
gdkchan
b74332e17d Implement remaining shader double-precision instructions (#2845)
* Implement remaining shader double-precision instructions

* Shader cache version bump
2021-12-08 17:54:12 -03:00
gdkchan
9effc3e9ff Fix FLO.SH shader instruction with a input of 0 (#2876)
* Fix FLO.SH shader instruction with a input of 0

* Shader cache version bump
2021-12-05 13:25:05 +01:00
Mary
4109f0077a infra: Migrate to .NET 6 (#2829)
* infra: Migrate to .NET 6

* Rollback version naming change

* Workaround .NET 6 ZipArchive API issues

* ci: Switch to VS 2022 for AppVeyor

CI is now ready for .NET 6

* Suppress WebClient warning in DoUpdateWithMultipleThreads

* Attempt to workaround System.Drawing.Common changes on 6.0.0

* Change keyboard rendering from System.Drawing to ImageSharp

* Make the software keyboard renderer multithreaded

* Bump ImageSharp version to 1.0.4 to fix a bug in Image.Load

* Add fallback fonts to the keyboard renderer

* Fix warnings

* Address caian's comment

* Clean up linux workaround as it's uneeded now

* Update readme

Co-authored-by: Caian Benedicto <caianbene@gmail.com>
2021-11-28 21:24:17 +01:00
gdkchan
d8bb173c9e Fix shader integer from/to double conversion (#2831) 2021-11-14 21:37:07 -03:00
gdkchan
4dbaeeec95 Support shader gl_Color, gl_SecondaryColor and gl_TexCoord built-ins (#2817)
* Support shader gl_Color, gl_SecondaryColor and gl_TexCoord built-ins

* Shader cache version bump

* Fix back color value on fragment shader

* Disable IPA multiplication for fixed function attributes and back color selection
2021-11-08 13:18:46 -03:00
gdkchan
b4497affe2 Fix bindless/global memory elimination with inverted predicates (#2826)
* Fix bindless/global memory elimination with inverted predicates

* Shader cache version bump
2021-11-08 12:57:28 -03:00
gdkchan
17f90e39b7 Fix InvocationInfo on geometry shader and bindless default integer const (#2822)
* Fix InvocationInfo on geometry shader and bindless default integer const

* Shader cache version bump

* Consistency for the default value
2021-11-08 11:39:30 -03:00
gdkchan
f229550167 Add support for fragment shader interlock (#2768)
* Support coherent images

* Add support for fragment shader interlock

* Change to tree based match approach

* Refactor + check for branch targets and external registers

* Make detection more robust

* Use Intel fragment shader ordering if interlock is not available, use nothing if both are not available

* Remove unused field
2021-10-28 19:53:12 -03:00
gdkchan
a7bca3c25d Preserve image types for shader bindless surface instructions (.D variants) (#2779)
* Preserve image types for SULD/SUST .D variants

* Make format unknown for surface atomic if bindless and not sized
2021-10-24 19:40:20 -03:00
gdkchan
34ef09afb1 Fix shader 8-bit and 16-bit STS/STG (#2741)
* Fix 8 and 16-bit STG

* Fix 8 and 16-bit STS

* Shader cache version bump
2021-10-18 20:24:15 -03:00
riperiperi
69a91e4e7f Another workaround for NVIDIA driver 496.13 shader bug (#2750)
* Another workaround for NVIDIA driver 496.13 shader bug

This might work better than the other one. Give this a test to see if it fixes/doesn't fix issues with the other one.

The problem seems to be when any variable assignment happens with a negation. `temp_1 = -temp_0;` seems to trigger weird behaviour, but `temp_1 = 0.0 - temp_0;` does not. This also might to extend towards integer types?

* Update cache version

* Add disclaimer comment

* Wording
2021-10-18 20:04:06 -03:00
gdkchan
6435cd798f Initial tessellation shader support (#2534)
* Initial tessellation shader support

* Nits

* Re-arrange built-in table

* This is not needed anymore

* PR feedback
2021-10-18 18:38:04 -03:00
gdkchan
88e8c82671 Add missing U8/S8 types from shader I2I instruction (#2740)
* Add missing U8/S8 types from shader I2I instruction

* Better names

* Fix dstIsSignedInt
2021-10-17 17:48:36 -03:00
gdkchan
7c576d44af Extend bindless elimination to work with masked and shifted handles (#2727)
* Extent bindless elimination to work with masked handles

* Extend bindless elimination to catch shifted pattern, refactor handle packing/unpacking
2021-10-17 17:28:18 -03:00
gdkchan
909ea6f84b Implement SHF (funnel shift) shader instruction (#2702)
* Implement SHF shader instruction

* Shader cache version bump

* Better name
2021-10-17 17:02:20 -03:00
gdkchan
a1b0fd1ba9 Rewrite shader decoding stage (#2698)
* Rewrite shader decoding stage

* Fix P2R constant buffer encoding

* Fix PSET/PSETP

* PR feedback

* Log unimplemented shader instructions

* Implement NOP

* Remove using

* PR feedback
2021-10-12 22:35:31 +02:00
gdkchan
f94e708453 Only make render target 2D textures layered if needed (#2646)
* Only make render target 2D textures layered if needed

* Shader cache version bump

* Ensure topology is updated on channel swap
2021-09-29 01:55:12 +02:00
gdkchan
43eedc9c06 Use shader subgroup extensions if shader ballot is not supported (#2627)
* Use shader subgroup extensions if shader ballot is not supported

* Shader cache version bump + cleanup

* The type is still required on the table
2021-09-19 14:38:39 +02:00
riperiperi
6b3737546e Fix TXQ for 3D textures. (#2613)
* Fix TXQ for 3D textures.

Assumes the texture is 3D if the component mask contains Z.

This fixes a bug in UE4 games where parts of the map had garbage pointers to lighting voxels, as the lookup 3D texture was not being initialized. Most notable game is THPS1+2.

May need another PR to keep image store data alive and properly flush it in order using the AutoDeleteCache.

* Get sampler type for TextureSize from bound textures.
2021-09-02 00:17:43 -03:00
riperiperi
ae821e8238 Implement Shader Instructions SUATOM and SURED (#2090)
* Initial Implementation

* Further improvements (no support for float/64-bit types)

* Merge atomic and reduce instructions, add missing format switch

* Fix rebase issues.

* Not used.

* Whoops. Fixed.

* Partial implementation of inc/dec, cleanup and TODOs

* Remove testing path

* Address Feedback
2021-08-31 02:51:57 -03:00
gdkchan
c8077a711a Fix out-of-bounds shader thread shuffle (#2605)
* Fix out-of-bounds shader thread shuffle

* Shader cache version bump
2021-08-30 14:02:40 -03:00
gdkchan
dfc0ebb616 Initial support for shader attribute indexing (#2546)
* Initial support for shader attribute indexing

* Support output indexing too, other improvements

* Fix order

* Address feedback
2021-08-27 01:44:47 +02:00
gdkchan
0819b88a1a Ensure render scale is initialized to 1 on the backend (#2543) 2021-08-11 19:44:41 -03:00
gdkchan
0485141e8a Workaround for Intel FrontFacing built-in variable bug (#2540) 2021-08-11 23:01:06 +02:00
gdkchan
b56a469566 Make sure attributes used on subsequent shader stages are initialized (#2538) 2021-08-11 22:27:00 +02:00
gdkchan
3b3a1193a2 Replace BGRA and scale uniforms with a uniform block (#2496)
* Replace BGRA and scale uniforms with a uniform block

* Setting the data again on program change is no longer needed

* Optimize and resolve some warnings

* Avoid redundant support buffer updates

* Some optimizations to BindBuffers (now inlined)

* Unify render scale arrays
2021-08-11 21:33:43 +02:00
gdkchan
d7fc382e9e Use a new approach for shader BRX targets (#2532)
* Use a new approach for shader BRX targets

* Make shader cache actually work

* Improve the shader pattern matching a bit

* Extend LDC search to predecessor blocks, catches more cases

* Nit

* Only save the amount of constant buffer data actually used. Avoids crashes on partially mapped buffers

* Ignore Rd on predicate instructions, as they do not have a Rd register (catches more cases)
2021-08-11 20:59:42 +02:00
Mary
9804a472c3 shadertools: Prepare for new target Languages and APIs (#2465)
* shadertools: Prepare for new target Langugaes and APIs

This improves shader tools command line by adding support for target
language and api.

* Address gdkchan's comments
2021-07-18 12:49:39 +02:00
gdkchan
f3cb413aa6 Fix shader compilation on shaders that uses rectangle textures (#2471) 2021-07-12 16:20:33 -03:00
gdkchan
f1ddf692fe Unscale textureSize when resolution scaling is used (#2441)
* Unscale textureSize when resolution scaling is used

* Fix textureSize on compute

* Flag texture size as needing res scale values too
2021-07-09 00:09:07 -03:00
gdkchan
e4a83daf3e Allow shader language and target API to be specified on the shader translator (#2402) 2021-07-06 21:20:06 +02:00
gdkchan
520965bcec Fix default value for unwritten shader outputs (#2412)
* Fix shader default output values

* Shader cache version bump
2021-06-25 19:56:03 -03:00
gdkchan
5d0b20a56e Fix texture sampling with depth compare and LOD level or bias (#2404)
* Fix texture sampling with depth compare and LOD level or bias

* Shader cache version bump

* nit: Sorting
2021-06-25 00:54:50 +02:00
gdkchan
7c537a84b3 Fix shader texture LOD query (#2397) 2021-06-23 23:31:14 +02:00