Linux support. (#54)

* Initial Linux attempt.

* Add clang toolchain & make tools compile.

* vcpkg as submodule.

* First implementation of IO rewrite. (#31)

* Fix directory iteration resolving symlinks.

* Refactor kernel objects to be lock-free.

* Implement guest critical sections using std::atomic.

* Make D3D12 support optional. (#33)

* Make D3D12 support optional.

* Update ShaderRecomp, fix macros.

* Replace QueryPerformanceCounter. (#35)

* Add Linux home path for GetUserPath(). (#36)

* Cross-platform Sleep. (#37)

* Add mmap implementations for virtual allocation. (#38)

* Cross-platform TLS. (#34)

* Cross-platform TLS.

* Fix front() to back(), use Mutex.

* Fix global variable namings.

---------

Co-authored-by: Skyth <19259897+blueskythlikesclouds@users.noreply.github.com>

* Unicode support. (#39)

* Replace CreateDirectoryA with Unicode version.

* Cross platform thread implementation. (#41)

* Cross-platform thread implementation.

* Put set thread name calls behind a Win32 macro.

* Cross-platform semaphore implementation. (#43)

* xam: use SDL for keyboard input

* Cross-platform atomic operations. (#44)

* Cross-platform spin lock implementation.

* Cross-platform reference counting.

* Cross-platform event implementation. (#47)

* Compiling and running on Linux. (#49)

* Current work trying to get it to compile.

* Update vcpkg.json baseline.

* vcpkg, memory mapped file.

* Bitscan forward.

* Fix localtime_s.

* FPS patches high res clock.

* Rename Window to GameWindow. Fix guest pointers.

* GetCurrentThreadID gone.

* Code cache pointers, RenderWindow type.

* Add Linux stubs.

* Refactor Config.

* Fix paths.

* Add linux-release config.

* FS fixes.

* Fix Windows compilation errors & unicode converter crash.

* Rename physical memory allocation functions to not clash with X11.

* Fix NULL character being added on RtlMultiByteToUnicodeN.

* Use std::exit.

* Add protection to memory on Linux.

* Convert majority of dependencies to submodules. (#48)

* Convert majority of dependencies to submodules.

* Don't compile header-only libraries.

* Fix a few incorrect data types.

* Fix config directory.

* Unicode fixes & sizeof asserts.

* Change the exit function to not call static destructors.

* Fix files picker.

* Add RelWithDebInfo preset for Linux.

* Implement OS Restart on Linux. (#50)

---------

Co-authored-by: Dario <dariosamo@gmail.com>

* Update PowerRecomp.

* Add Env Var detection for VCPKG_ROOT, add DLC detection.

* Use error code version on DLC directory iterator.

* Set D3D12MA::ALLOCATOR_FLAG_DONT_PREFER_SMALL_BUFFERS_COMMITTED flag.

* Linux flatpak. (#51)

* Add flatpak support.

* Add game install directory override for flatpak.

* Flatpak'ing.

* Flatpak it some more.

* We flat it, we pak it.

* Flatpak'd.

* The Marvelous Misadventures of Flatpak.

* Attempt to change logic of NFD and show error.

* Flattenpakken.

* Use game install directory instead of current path.

* Attempt to fix line endings.

* Update io.github.hedge_dev.unleashedrecomp.json

* Fix system time query implementation.

* Add Present Wait to Vulkan to improve frame pacing and reduce latency. (#53)

* Add present wait support to Vulkan.

* Default to triple buffering if presentWait is supported.

* Bracey fellas.

* Update paths.h

* SDL2 audio (again). (#52)

* Implement SDL2 audio (again).

* Call timeBeginPeriod/timeEndPeriod.

* Replace miniaudio with SDL mixer.

* Queue audio samples in a separate thread.

* Enable CMake option override policy & fix compilation error.

* Fix compilation error on Linux.

* Fix but also trim shared strings.

* Wayland support. (#55)

* Make channel index a global variable in embedded player.

* Fix SDL Audio selection for OGG on Flatpak.

* Minor installer wizard fixes.

* Fix compilation error.

* Yield in model consumer and pipeline compiler threads.

* Special case Sleep(0) to yield on Linux.

* Add App Id hint.

* Correct implementation for auto reset events. (#57)

---------

Co-authored-by: Dario <dariosamo@gmail.com>
Co-authored-by: Hyper <34012267+hyperbx@users.noreply.github.com>
Skyth (Asilkan)
2024-12-21 00:44:05 +03:00
committed by GitHub
parent f547c7ca6d
commit 67633917bf
109 changed files with 3373 additions and 2850 deletions

View File

@@ -196,7 +196,7 @@ void ImFontAtlasSnapshot::GenerateGlyphRanges()
{
std::vector<std::string_view> localeStrings;
for (auto& config : Config::Definitions)
for (auto& config : g_configDefinitions)
config->GetLocaleStrings(localeStrings);
std::set<ImWchar> glyphs;

View File

@@ -3345,7 +3345,8 @@ namespace plume {
D3D12MA::ALLOCATOR_DESC allocatorDesc = {};
allocatorDesc.pDevice = d3d;
allocatorDesc.pAdapter = adapter;
allocatorDesc.Flags = D3D12MA::ALLOCATOR_FLAG_DEFAULT_POOLS_NOT_ZEROED | D3D12MA::ALLOCATOR_FLAG_MSAA_TEXTURES_ALWAYS_COMMITTED;
allocatorDesc.Flags = D3D12MA::ALLOCATOR_FLAG_DEFAULT_POOLS_NOT_ZEROED |
D3D12MA::ALLOCATOR_FLAG_MSAA_TEXTURES_ALWAYS_COMMITTED | D3D12MA::ALLOCATOR_FLAG_DONT_PREFER_SMALL_BUFFERS_COMMITTED;
res = D3D12MA::CreateAllocator(&allocatorDesc, &allocator);
if (FAILED(res)) {

View File

@@ -29,12 +29,18 @@
typedef struct _NSWindow NSWindow;
#endif
#ifdef SDL_VULKAN_ENABLED
#include <SDL_vulkan.h>
#endif
namespace plume {
#if defined(_WIN64)
// Native HWND handle to the target window.
typedef HWND RenderWindow;
#elif defined(__ANDROID__)
typedef ANativeWindow* RenderWindow;
#elif defined(SDL_VULKAN_ENABLED)
typedef SDL_Window *RenderWindow;
#elif defined(__linux__)
struct RenderWindow {
Display* display;

View File

@@ -1989,6 +1989,13 @@ namespace plume {
fprintf(stderr, "vkCreateWin32SurfaceKHR failed with error code 0x%X.\n", res);
return;
}
# elif defined(SDL_VULKAN_ENABLED)
VulkanInterface *renderInterface = commandQueue->device->renderInterface;
SDL_bool sdlRes = SDL_Vulkan_CreateSurface(renderWindow, renderInterface->instance, &surface);
if (sdlRes == SDL_FALSE) {
fprintf(stderr, "SDL_Vulkan_CreateSurface failed with error %s.\n", SDL_GetError());
return;
}
# elif defined(__ANDROID__)
assert(renderWindow != nullptr);
VkAndroidSurfaceCreateInfoKHR surfaceCreateInfo = {};
@@ -2124,6 +2131,12 @@ namespace plume {
}
bool VulkanSwapChain::present(uint32_t textureIndex, RenderCommandSemaphore **waitSemaphores, uint32_t waitSemaphoreCount) {
constexpr uint64_t MaxFrameDelay = 1;
if (commandQueue->device->capabilities.presentWait && (currentPresentId > MaxFrameDelay)) {
constexpr uint64_t waitTimeout = 100000000;
vkWaitForPresentKHR(commandQueue->device->vk, vk, currentPresentId - MaxFrameDelay, waitTimeout);
}
thread_local std::vector<VkSemaphore> waitSemaphoresVector;
waitSemaphoresVector.clear();
for (uint32_t i = 0; i < waitSemaphoreCount; i++) {
@@ -2138,6 +2151,15 @@ namespace plume {
presentInfo.pImageIndices = &textureIndex;
presentInfo.pWaitSemaphores = !waitSemaphoresVector.empty() ? waitSemaphoresVector.data() : nullptr;
presentInfo.waitSemaphoreCount = uint32_t(waitSemaphoresVector.size());
VkPresentIdKHR presentId = {};
if (commandQueue->device->capabilities.presentWait) {
currentPresentId++;
presentId.sType = VK_STRUCTURE_TYPE_PRESENT_ID_KHR;
presentId.pPresentIds = &currentPresentId;
presentId.swapchainCount = 1;
presentInfo.pNext = &presentId;
}
VkResult res;
{
@@ -2297,6 +2319,8 @@ namespace plume {
GetClientRect(renderWindow, &rect);
dstWidth = rect.right - rect.left;
dstHeight = rect.bottom - rect.top;
# elif defined(SDL_VULKAN_ENABLED)
SDL_GetWindowSize(renderWindow, (int *)(&dstWidth), (int *)(&dstHeight));
# elif defined(__ANDROID__)
dstWidth = ANativeWindow_getWidth(renderWindow);
dstHeight = ANativeWindow_getHeight(renderWindow);
@@ -4058,7 +4082,11 @@ namespace plume {
// VulkanInterface
#if SDL_VULKAN_ENABLED
VulkanInterface::VulkanInterface(RenderWindow sdlWindow) {
#else
VulkanInterface::VulkanInterface() {
#endif
VkResult res = volkInitialize();
if (res != VK_SUCCESS) {
fprintf(stderr, "volkInitialize failed with error code 0x%X.\n", res);
@@ -4085,11 +4113,31 @@ namespace plume {
std::vector<VkExtensionProperties> availableExtensions(extensionCount);
vkEnumerateInstanceExtensionProperties(nullptr, &extensionCount, availableExtensions.data());
std::unordered_set<std::string> missingRequiredExtensions = RequiredInstanceExtensions;
std::unordered_set<std::string> requiredExtensions = RequiredInstanceExtensions;
std::unordered_set<std::string> supportedOptionalExtensions;
# if DLSS_ENABLED
const std::unordered_set<std::string> dlssExtensions = DLSS::getRequiredInstanceExtensionsVulkan();
# endif
# if SDL_VULKAN_ENABLED
// Push the extensions specified by SDL as required.
// SDL2 has this awkward requirement for the window to pull the extensions from.
// This can be removed when upgrading to SDL3.
if (sdlWindow != nullptr) {
uint32_t sdlVulkanExtensionCount = 0;
if (SDL_Vulkan_GetInstanceExtensions(sdlWindow, &sdlVulkanExtensionCount, nullptr)) {
std::vector<char *> sdlVulkanExtensions;
sdlVulkanExtensions.resize(sdlVulkanExtensionCount);
if (SDL_Vulkan_GetInstanceExtensions(sdlWindow, &sdlVulkanExtensionCount, (const char **)(sdlVulkanExtensions.data()))) {
for (char *sdlVulkanExtension : sdlVulkanExtensions) {
requiredExtensions.insert(sdlVulkanExtension);
}
}
}
}
# endif
std::unordered_set<std::string> missingRequiredExtensions = requiredExtensions;
for (uint32_t i = 0; i < extensionCount; i++) {
const std::string extensionName(availableExtensions[i].extensionName);
missingRequiredExtensions.erase(extensionName);
@@ -4114,7 +4162,7 @@ namespace plume {
}
std::vector<const char *> enabledExtensions;
for (const std::string &extension : RequiredInstanceExtensions) {
for (const std::string &extension : requiredExtensions) {
enabledExtensions.push_back(extension.c_str());
}
@@ -4177,8 +4225,15 @@ namespace plume {
// Global creation function.
#if SDL_VULKAN_ENABLED
std::unique_ptr<RenderInterface> CreateVulkanInterface(RenderWindow sdlWindow) {
std::unique_ptr<VulkanInterface> createdInterface = std::make_unique<VulkanInterface>(sdlWindow);
return createdInterface->isValid() ? std::move(createdInterface) : nullptr;
}
#else
std::unique_ptr<RenderInterface> CreateVulkanInterface() {
std::unique_ptr<VulkanInterface> createdInterface = std::make_unique<VulkanInterface>();
return createdInterface->isValid() ? std::move(createdInterface) : nullptr;
}
#endif
};
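
The extension handling in the VulkanInterface constructor hunks above (merge the SDL-reported extensions into the required set, then erase every available extension to see what is missing) can be sketched in isolation. This is a minimal sketch, not the code from the commit: `mergeRequiredExtensions` and `findMissingExtensions` are hypothetical helper names, and the SDL query is replaced by a plain vector of C strings so the example needs neither SDL nor Vulkan headers.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: start from the renderer's own required extensions and
// add whatever the windowing layer reports (SDL in the commit, a plain vector here).
std::unordered_set<std::string> mergeRequiredExtensions(
    const std::unordered_set<std::string> &baseRequired,
    const std::vector<const char *> &windowExtensions) {
    std::unordered_set<std::string> required = baseRequired;
    for (const char *name : windowExtensions) {
        if (name != nullptr) {
            required.insert(name);  // duplicates are ignored by the set
        }
    }
    return required;
}

// Hypothetical helper: copy the required set and erase every extension the
// instance reports as available; whatever remains is missing.
std::unordered_set<std::string> findMissingExtensions(
    const std::unordered_set<std::string> &required,
    const std::vector<std::string> &available) {
    std::unordered_set<std::string> missing = required;
    for (const std::string &name : available) {
        missing.erase(name);
    }
    return missing;
}
```

Keeping the merged set separate from the missing set matters because the merged set is what later feeds the enabled-extension list, which is why the diff switches that loop from RequiredInstanceExtensions to requiredExtensions.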

View File

@@ -22,9 +22,18 @@
#define VK_USE_PLATFORM_XLIB_KHR
#endif
#include "volk.h"
#include <volk.h>
#include "vk_mem_alloc.h"
#ifdef __clang__
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wnullability-completeness"
#endif
#include <vk_mem_alloc.h>
#ifdef __clang__
#pragma clang diagnostic pop
#endif
namespace plume {
struct VulkanCommandQueue;
@@ -220,6 +229,7 @@ namespace plume {
VkPresentModeKHR requiredPresentMode = VK_PRESENT_MODE_FIFO_KHR;
VkCompositeAlphaFlagBitsKHR pickedAlphaFlag = VK_COMPOSITE_ALPHA_OPAQUE_BIT_KHR;
std::vector<VulkanTexture> textures;
uint64_t currentPresentId = 0;
bool immediatePresentModeSupported = false;
VulkanSwapChain(VulkanCommandQueue *commandQueue, RenderWindow renderWindow, uint32_t textureCount, RenderFormat format);
@@ -412,7 +422,12 @@ namespace plume {
VkApplicationInfo appInfo = {};
RenderInterfaceCapabilities capabilities;
# if SDL_VULKAN_ENABLED
VulkanInterface(RenderWindow sdlWindow);
# else
VulkanInterface();
# endif
~VulkanInterface() override;
std::unique_ptr<RenderDevice> createDevice() override;
const RenderInterfaceCapabilities &getCapabilities() const override;

View File

@@ -26,56 +26,68 @@
#include <ui/message_window.h>
#include <ui/options_menu.h>
#include <ui/sdl_listener.h>
#include <ui/window.h>
#include <ui/game_window.h>
#include <user/config.h>
#include <xxHashMap.h>
#if defined(ASYNC_PSO_DEBUG) || defined(PSO_CACHING)
#include <magic_enum.hpp>
#include <magic_enum/magic_enum.hpp>
#endif
#include "../../tools/ShaderRecomp/ShaderRecomp/shader_common.h"
#ifdef SWA_D3D12
#include "shader/copy_vs.hlsl.dxil.h"
#include "shader/copy_vs.hlsl.spirv.h"
#include "shader/csd_filter_ps.hlsl.dxil.h"
#include "shader/csd_filter_ps.hlsl.spirv.h"
#include "shader/enhanced_motion_blur_ps.hlsl.dxil.h"
#include "shader/enhanced_motion_blur_ps.hlsl.spirv.h"
#include "shader/gamma_correction_ps.hlsl.dxil.h"
#include "shader/gamma_correction_ps.hlsl.spirv.h"
#include "shader/gaussian_blur_3x3.hlsl.dxil.h"
#include "shader/gaussian_blur_3x3.hlsl.spirv.h"
#include "shader/gaussian_blur_5x5.hlsl.dxil.h"
#include "shader/gaussian_blur_5x5.hlsl.spirv.h"
#include "shader/gaussian_blur_7x7.hlsl.dxil.h"
#include "shader/gaussian_blur_7x7.hlsl.spirv.h"
#include "shader/gaussian_blur_9x9.hlsl.dxil.h"
#include "shader/gaussian_blur_9x9.hlsl.spirv.h"
#include "shader/imgui_ps.hlsl.dxil.h"
#include "shader/imgui_ps.hlsl.spirv.h"
#include "shader/imgui_vs.hlsl.dxil.h"
#include "shader/imgui_vs.hlsl.spirv.h"
#include "shader/movie_ps.hlsl.dxil.h"
#include "shader/movie_ps.hlsl.spirv.h"
#include "shader/movie_vs.hlsl.dxil.h"
#include "shader/movie_vs.hlsl.spirv.h"
#include "shader/resolve_msaa_depth_2x.hlsl.dxil.h"
#include "shader/resolve_msaa_depth_2x.hlsl.spirv.h"
#include "shader/resolve_msaa_depth_4x.hlsl.dxil.h"
#include "shader/resolve_msaa_depth_4x.hlsl.spirv.h"
#include "shader/resolve_msaa_depth_8x.hlsl.dxil.h"
#endif
#include "shader/copy_vs.hlsl.spirv.h"
#include "shader/csd_filter_ps.hlsl.spirv.h"
#include "shader/enhanced_motion_blur_ps.hlsl.spirv.h"
#include "shader/gamma_correction_ps.hlsl.spirv.h"
#include "shader/gaussian_blur_3x3.hlsl.spirv.h"
#include "shader/gaussian_blur_5x5.hlsl.spirv.h"
#include "shader/gaussian_blur_7x7.hlsl.spirv.h"
#include "shader/gaussian_blur_9x9.hlsl.spirv.h"
#include "shader/imgui_ps.hlsl.spirv.h"
#include "shader/imgui_vs.hlsl.spirv.h"
#include "shader/movie_ps.hlsl.spirv.h"
#include "shader/movie_vs.hlsl.spirv.h"
#include "shader/resolve_msaa_depth_2x.hlsl.spirv.h"
#include "shader/resolve_msaa_depth_4x.hlsl.spirv.h"
#include "shader/resolve_msaa_depth_8x.hlsl.spirv.h"
#ifdef _WIN32
extern "C"
{
__declspec(dllexport) unsigned long NvOptimusEnablement = 0x00000001;
__declspec(dllexport) int AmdPowerXpressRequestHighPerformance = 1;
}
#endif
namespace plume
{
#ifdef SWA_D3D12
extern std::unique_ptr<RenderInterface> CreateD3D12Interface();
#endif
#ifdef SDL_VULKAN_ENABLED
extern std::unique_ptr<RenderInterface> CreateVulkanInterface(RenderWindow sdlWindow);
#else
extern std::unique_ptr<RenderInterface> CreateVulkanInterface();
#endif
}
#pragma pack(push, 1)
@@ -165,7 +177,7 @@ struct DirtyStates
static DirtyStates g_dirtyStates(true);
template<typename T>
static FORCEINLINE void SetDirtyValue(bool& dirtyState, T& dest, const T& src)
static void SetDirtyValue(bool& dirtyState, T& dest, const T& src)
{
if (dest != src)
{
@@ -174,7 +186,12 @@ static FORCEINLINE void SetDirtyValue(bool& dirtyState, T& dest, const T& src)
}
}
static bool g_vulkan;
#ifdef SWA_D3D12
static bool g_vulkan = false;
#else
static constexpr bool g_vulkan = true;
#endif
static std::unique_ptr<RenderInterface> g_interface;
static std::unique_ptr<RenderDevice> g_device;
@@ -197,7 +214,6 @@ static std::unique_ptr<RenderCommandFence> g_copyCommandFence;
static std::unique_ptr<RenderSwapChain> g_swapChain;
static bool g_swapChainValid;
static bool g_needsResize;
static constexpr RenderFormat BACKBUFFER_FORMAT = RenderFormat::B8G8R8A8_UNORM;
@@ -545,7 +561,7 @@ static void DestructTempResources()
g_tempBuffers[g_frame].clear();
}
static uint32_t g_mainThreadId;
static std::thread::id g_mainThreadId;
static ankerl::unordered_dense::map<RenderTexture*, RenderTextureLayout> g_barrierMap;
@@ -579,13 +595,18 @@ static std::unique_ptr<uint8_t[]> g_buttonBcDiff;
static void LoadEmbeddedResources()
{
const size_t decompressedSize = g_vulkan ? g_spirvCacheDecompressedSize : g_dxilCacheDecompressedSize;
g_shaderCache = std::make_unique<uint8_t[]>(decompressedSize);
ZSTD_decompress(g_shaderCache.get(),
decompressedSize,
g_vulkan ? g_compressedSpirvCache : g_compressedDxilCache,
g_vulkan ? g_spirvCacheCompressedSize : g_dxilCacheCompressedSize);
if (g_vulkan)
{
g_shaderCache = std::make_unique<uint8_t[]>(g_spirvCacheDecompressedSize);
ZSTD_decompress(g_shaderCache.get(), g_spirvCacheDecompressedSize, g_compressedSpirvCache, g_spirvCacheCompressedSize);
}
#ifdef SWA_D3D12
else
{
g_shaderCache = std::make_unique<uint8_t[]>(g_dxilCacheDecompressedSize);
ZSTD_decompress(g_shaderCache.get(), g_dxilCacheDecompressedSize, g_compressedDxilCache, g_dxilCacheCompressedSize);
}
#endif
g_buttonBcDiff = decompressZstd(g_button_bc_diff, g_button_bc_diff_uncompressed_size);
}
@@ -1023,7 +1044,7 @@ static void ProcSetRenderState(const RenderCommand& cmd)
}
}
static const std::pair<GuestRenderState, void*> g_setRenderStateFunctions[] =
static const std::pair<GuestRenderState, PPCFunc*> g_setRenderStateFunctions[] =
{
{ D3DRS_ZENABLE, HostToGuestFunction<SetRenderState<D3DRS_ZENABLE>> },
{ D3DRS_ZWRITEENABLE, HostToGuestFunction<SetRenderState<D3DRS_ZWRITEENABLE>> },
@@ -1062,6 +1083,8 @@ static GuestShader* g_csdShader;
static std::unique_ptr<GuestShader> g_enhancedMotionBlurShader;
#ifdef SWA_D3D12
#define CREATE_SHADER(NAME) \
g_device->createShader( \
g_vulkan ? g_##NAME##_spirv : g_##NAME##_dxil, \
@@ -1069,11 +1092,20 @@ static std::unique_ptr<GuestShader> g_enhancedMotionBlurShader;
"main", \
g_vulkan ? RenderShaderFormat::SPIRV : RenderShaderFormat::DXIL)
#else
#define CREATE_SHADER(NAME) \
g_device->createShader(g_##NAME##_spirv, sizeof(g_##NAME##_spirv), "main", RenderShaderFormat::SPIRV)
#endif
#ifdef _WIN32
static bool DetectWine()
{
HMODULE dllHandle = GetModuleHandle("ntdll.dll");
return dllHandle != nullptr && GetProcAddress(dllHandle, "wine_get_version") != nullptr;
}
#endif
static constexpr size_t TEXTURE_DESCRIPTOR_SIZE = 65536;
static constexpr size_t SAMPLER_DESCRIPTOR_SIZE = 1024;
@@ -1136,7 +1168,7 @@ static void CreateImGuiBackend()
OptionsMenu::Init();
InstallerWizard::Init();
ImGui_ImplSDL2_InitForOther(Window::s_pWindow);
ImGui_ImplSDL2_InitForOther(GameWindow::s_pWindow);
#ifdef ENABLE_IM_FONT_ATLAS_SNAPSHOT
g_imFontTexture = LoadTexture(
@@ -1278,7 +1310,7 @@ static void CreateImGuiBackend()
static void BeginCommandList();
void Video::CreateHostDevice()
void Video::CreateHostDevice(bool sdlVideoDefault)
{
for (uint32_t i = 0; i < 16; i++)
g_inputSlots[i].index = i;
@@ -1286,13 +1318,25 @@ void Video::CreateHostDevice()
IMGUI_CHECKVERSION();
ImGui::CreateContext();
Window::Init();
GameWindow::Init(sdlVideoDefault);
#ifdef SWA_D3D12
g_vulkan = DetectWine() || Config::GraphicsAPI == EGraphicsAPI::Vulkan;
#endif
LoadEmbeddedResources();
g_interface = g_vulkan ? CreateVulkanInterface() : CreateD3D12Interface();
if (g_vulkan)
#ifdef SDL_VULKAN_ENABLED
g_interface = CreateVulkanInterface(GameWindow::s_renderWindow);
#else
g_interface = CreateVulkanInterface();
#endif
#ifdef SWA_D3D12
else
g_interface = CreateD3D12Interface();
#endif
g_device = g_interface->createDevice();
g_triangleFanSupported = g_device->getCapabilities().triangleFan;
@@ -1314,7 +1358,17 @@ void Video::CreateHostDevice()
switch (Config::TripleBuffering)
{
case ETripleBuffering::Auto:
bufferCount = g_vulkan ? 2 : 3; // Defaulting to 3 is fine on D3D12 thanks to flip discard model.
if (g_vulkan)
{
// Defaulting to 3 is fine if presentWait is supported, as the maximum frame latency allowed is only 1.
bufferCount = g_device->getCapabilities().presentWait ? 3 : 2;
}
else
{
// Defaulting to 3 is fine on D3D12 thanks to flip discard model.
bufferCount = 3;
}
break;
case ETripleBuffering::On:
bufferCount = 3;
@@ -1324,7 +1378,7 @@ void Video::CreateHostDevice()
break;
}
g_swapChain = g_queue->createSwapChain(Window::s_handle, bufferCount, BACKBUFFER_FORMAT);
g_swapChain = g_queue->createSwapChain(GameWindow::s_renderWindow, bufferCount, BACKBUFFER_FORMAT);
g_swapChain->setVsyncEnabled(Config::VSync);
g_swapChainValid = !g_swapChain->needsResize();
@@ -1334,7 +1388,7 @@ void Video::CreateHostDevice()
for (auto& renderSemaphore : g_renderSemaphores)
renderSemaphore = g_device->createCommandSemaphore();
g_mainThreadId = GetCurrentThreadId();
g_mainThreadId = std::this_thread::get_id();
RenderPipelineLayoutBuilder pipelineLayoutBuilder;
pipelineLayoutBuilder.begin(false, true);
@@ -1626,9 +1680,9 @@ static uint32_t CreateDevice(uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a4,
memset(device, 0, sizeof(*device));
uint32_t functionOffset = 0x443344; // D3D
g_codeCache.Insert(functionOffset, reinterpret_cast<void*>(HostToGuestFunction<SetRenderStateUnimplemented>));
g_codeCache.Insert(functionOffset, HostToGuestFunction<SetRenderStateUnimplemented>);
for (size_t i = 0; i < _countof(device->setRenderStateFunctions); i++)
for (size_t i = 0; i < std::size(device->setRenderStateFunctions); i++)
device->setRenderStateFunctions[i] = functionOffset;
for (auto& [state, function] : g_setRenderStateFunctions)
@@ -1638,7 +1692,7 @@ static uint32_t CreateDevice(uint32_t a1, uint32_t a2, uint32_t a3, uint32_t a4,
device->setRenderStateFunctions[state / 4] = functionOffset;
}
for (size_t i = 0; i < _countof(device->setSamplerStateFunctions); i++)
for (size_t i = 0; i < std::size(device->setSamplerStateFunctions); i++)
device->setSamplerStateFunctions[i] = *reinterpret_cast<uint32_t*>(g_memory.Translate(0x8330F3DC + i * 0xC));
device->viewport.width = 1280.0f;
@@ -1763,7 +1817,7 @@ static void UnlockBuffer(GuestBuffer* buffer)
{
if (!buffer->lockedReadOnly)
{
if (GetCurrentThreadId() == g_mainThreadId)
if (std::this_thread::get_id() == g_mainThreadId)
{
RenderCommand cmd;
cmd.type = (sizeof(T) == 2) ? RenderCommandType::UnlockBuffer16 : RenderCommandType::UnlockBuffer32;
@@ -2765,9 +2819,6 @@ static void ProcSetScissorRect(const RenderCommand& cmd)
SetDirtyValue<int32_t>(g_dirtyStates.scissorRect, g_scissorRect.right, args.right);
}
static Mutex g_compiledSpecConstantLibraryBlobMutex;
static ankerl::unordered_dense::map<uint32_t, ComPtr<IDxcBlob>> g_compiledSpecConstantLibraryBlobs;
static RenderShader* GetOrLinkShader(GuestShader* guestShader, uint32_t specConstants)
{
if (g_vulkan ||
@@ -2808,8 +2859,12 @@ static RenderShader* GetOrLinkShader(GuestShader* guestShader, uint32_t specCons
shader = guestShader->linkedShaders[specConstants].get();
}
#ifdef SWA_D3D12
if (shader == nullptr)
{
static Mutex g_compiledSpecConstantLibraryBlobMutex;
static ankerl::unordered_dense::map<uint32_t, ComPtr<IDxcBlob>> g_compiledSpecConstantLibraryBlobs;
thread_local ComPtr<IDxcCompiler3> s_dxcCompiler;
thread_local ComPtr<IDxcLinker> s_dxcLinker;
thread_local ComPtr<IDxcUtils> s_dxcUtils;
@@ -2916,6 +2971,7 @@ static RenderShader* GetOrLinkShader(GuestShader* guestShader, uint32_t specCons
shader = linkedShader.get();
}
}
#endif
return shader;
}
@@ -4013,8 +4069,9 @@ static void ProcSetPixelShader(const RenderCommand& cmd)
static std::thread g_renderThread([]
{
#ifdef _WIN32
GuestThread::SetThreadName(GetCurrentThreadId(), "Render Thread");
#endif
RenderCommand commands[32];
while (true)
@@ -4821,40 +4878,20 @@ struct PipelineStateQueueItem
static moodycamel::BlockingConcurrentQueue<PipelineStateQueueItem> g_pipelineStateQueue;
struct MinimalGuestThreadContext
{
uint8_t* stack = nullptr;
PPCContext ppcContext{};
~MinimalGuestThreadContext()
{
if (stack != nullptr)
g_userHeap.Free(stack);
}
void ensureValid()
{
if (stack == nullptr)
{
stack = reinterpret_cast<uint8_t*>(g_userHeap.Alloc(0x4000));
ppcContext.fn = (uint8_t*)g_codeCache.bucket;
ppcContext.r1.u64 = g_memory.MapVirtual(stack + 0x4000);
SetPPCContext(ppcContext);
}
}
};
static void PipelineCompilerThread()
{
#ifdef _WIN32
GuestThread::SetThreadName(GetCurrentThreadId(), "Pipeline Compiler Thread");
MinimalGuestThreadContext ctx;
#endif
std::unique_ptr<GuestThreadContext> ctx;
while (true)
{
PipelineStateQueueItem queueItem;
g_pipelineStateQueue.wait_dequeue(queueItem);
ctx.ensureValid();
if (ctx == nullptr)
ctx = std::make_unique<GuestThreadContext>(0);
auto pipeline = CreateGraphicsPipeline(queueItem.pipelineState);
#ifdef ASYNC_PSO_DEBUG
@@ -4867,6 +4904,8 @@ static void PipelineCompilerThread()
cmd.addPipeline.hash = queueItem.pipelineHash;
cmd.addPipeline.pipeline = pipeline.release();
g_renderQueue.enqueue(cmd);
std::this_thread::yield();
}
}
@@ -5496,10 +5535,11 @@ static bool CheckMadeAll(const T& modelData)
static void ModelConsumerThread()
{
#ifdef _WIN32
GuestThread::SetThreadName(GetCurrentThreadId(), "Model Consumer Thread");
#endif
std::vector<boost::shared_ptr<Hedgehog::Database::CDatabaseData>> localPendingDataQueue;
MinimalGuestThreadContext ctx;
std::unique_ptr<GuestThreadContext> ctx;
while (true)
{
@@ -5508,7 +5548,8 @@ static void ModelConsumerThread()
while ((pendingDataCount = g_pendingDataCount.load()) == 0)
g_pendingDataCount.wait(pendingDataCount);
ctx.ensureValid();
if (ctx == nullptr)
ctx = std::make_unique<GuestThreadContext>(0);
if (g_pendingPipelineStateCache)
{
@@ -5673,6 +5714,8 @@ static void ModelConsumerThread()
if (allHandled)
localPendingDataQueue.clear();
std::this_thread::yield();
}
}
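
The buffer-count selection changed in Video::CreateHostDevice can be restated as a small pure function, which makes the decision table easier to check. This is a sketch under stated assumptions: `TripleBufferingMode` and `chooseBufferCount` are hypothetical names standing in for ETripleBuffering and the inline switch, and the value 2 for the Off case is assumed because that branch is not visible in the hunk.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the ETripleBuffering config enum.
enum class TripleBufferingMode { Auto, On, Off };

// Returns the swap chain buffer count for the given settings.
// vulkan:      true when the Vulkan backend is active, false for D3D12.
// presentWait: true when the device reports present wait support.
uint32_t chooseBufferCount(TripleBufferingMode mode, bool vulkan, bool presentWait) {
    switch (mode) {
    case TripleBufferingMode::On:
        return 3;
    case TripleBufferingMode::Off:
        return 2;  // assumption: the Off branch is elided in the diff
    case TripleBufferingMode::Auto:
    default:
        if (vulkan) {
            // With present wait the frame latency is capped at 1, so a third
            // buffer does not add latency; without it, stay at 2.
            return presentWait ? 3 : 2;
        }
        // D3D12 defaults to 3 thanks to the flip discard model.
        return 3;
    }
}
```

Pulling the decision out of the switch like this is only for illustration; the commit keeps it inline.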

View File

@@ -14,7 +14,7 @@ using namespace plume;
struct Video
{
static void CreateHostDevice();
static void CreateHostDevice(bool sdlVideoDefault);
static void HostPresent();
static void StartPipelinePrecompilation();
static void WaitForGPU();
@@ -84,22 +84,26 @@ struct GuestResource
void AddRef()
{
std::atomic_ref atomicRef(refCount.value);
uint32_t originalValue, incrementedValue;
do
{
originalValue = refCount.value;
incrementedValue = ByteSwap(ByteSwap(originalValue) + 1);
} while (InterlockedCompareExchange(reinterpret_cast<LONG*>(&refCount), incrementedValue, originalValue) != originalValue);
} while (!atomicRef.compare_exchange_weak(originalValue, incrementedValue));
}
void Release()
{
std::atomic_ref atomicRef(refCount.value);
uint32_t originalValue, decrementedValue;
do
{
originalValue = refCount.value;
decrementedValue = ByteSwap(ByteSwap(originalValue) - 1);
} while (InterlockedCompareExchange(reinterpret_cast<LONG*>(&refCount), decrementedValue, originalValue) != originalValue);
} while (!atomicRef.compare_exchange_weak(originalValue, decrementedValue));
// Normally we are supposed to release here, so only use this
// function when you know you won't be the one destructing it.
@@ -274,8 +278,10 @@ struct GuestShader : GuestResource
std::unique_ptr<RenderShader> shader;
struct ShaderCacheEntry* shaderCacheEntry = nullptr;
ankerl::unordered_dense::map<uint32_t, std::unique_ptr<RenderShader>> linkedShaders;
#ifdef SWA_D3D12
std::vector<ComPtr<IDxcBlob>> shaderBlobs;
ComPtr<IDxcBlobEncoding> libraryBlob;
#endif
#ifdef ASYNC_PSO_DEBUG
const char* name = "<unknown>";
#endif
@@ -390,7 +396,7 @@ enum GuestTextureAddress
D3DTADDRESS_BORDER = 6
};
extern bool g_needsResize;
inline bool g_needsResize;
extern std::unique_ptr<GuestTexture> LoadTexture(const uint8_t* data, size_t dataSize, RenderComponentMapping componentMapping = RenderComponentMapping());
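
The GuestResource::AddRef and Release hunks above replace InterlockedCompareExchange with a portable compare-exchange loop over a byte-swapped (big-endian) counter. The pattern can be sketched on its own as follows. To keep the sketch buildable as C++17 it uses a std::atomic<uint32_t> member instead of the std::atomic_ref that the commit uses, and `byteSwap32` and `BigEndianRefCount` are hypothetical names, not identifiers from the repository.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the project's ByteSwap: reverses the four bytes.
constexpr uint32_t byteSwap32(uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) | ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Reference count stored byte-swapped, as guest code would see it.
struct BigEndianRefCount {
    std::atomic<uint32_t> raw{byteSwap32(1)};

    void addRef() {
        uint32_t original = raw.load();
        uint32_t updated;
        do {
            // Swap to host order, increment, swap back to the stored order.
            updated = byteSwap32(byteSwap32(original) + 1);
            // On failure compare_exchange_weak reloads `original` for us.
        } while (!raw.compare_exchange_weak(original, updated));
    }

    // Returns the count after the decrement, in host byte order.
    uint32_t release() {
        uint32_t original = raw.load();
        uint32_t updated;
        do {
            updated = byteSwap32(byteSwap32(original) - 1);
        } while (!raw.compare_exchange_weak(original, updated));
        return byteSwap32(updated);
    }

    uint32_t count() const { return byteSwap32(raw.load()); }
};
```

A plain fetch_add would not work here because the increment has to happen in host byte order while the stored value stays swapped, so the read-modify-write is expressed as a retry loop. The weak variant may fail spuriously, which the loop already tolerates.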