DirectX 11 and 12

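If your PC can run DX12 stably, turn DX12 on, although it still depends on the specific game. I play Fortnite, for example, where DX12 is still in beta: the frame rate is higher but the game crashes easily. Other DX12 games don't have this problem. In short, the main characteristics of DX12 are: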

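1. DX12 gives a small frame rate boost and somewhat better CPU utilization;

2. On the same graphics card, power consumption is slightly lower under DX12;

3. DX12 supports texture compression and tessellation (both were already introduced with DX11), so image detail is a bit richer;

4. DX12 requires Windows 10; Windows 7 is not supported, apart from a few games Microsoft specifically enabled;

5. GPU overclocks are less stable under DX12, and games crash more easily when the clocks are pushed high.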

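How do you check whether your graphics card supports DX12? Press Win+R to open the Run box, type dxdiag and press Enter. Once the diagnostic tool opens, the DirectX version is listed on the System tab and the feature levels your GPU supports are listed on the Display tab.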
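For a scripted check, dxdiag can also write its report to a text file with `dxdiag /t <file>`. The Python sketch below parses such a report. The sample text is a hand-written excerpt rather than real output, and the field labels `DirectX Version` and `Feature Levels` are assumed to match what your copy of dxdiag writes, so compare against a real report before relying on it. The function names are hypothetical.

```python
import re

# Hand-written illustrative excerpt, NOT captured from a real machine.
SAMPLE_REPORT = """\
------------------
System Information
------------------
   DirectX Version: DirectX 12
---------------
Display Devices
---------------
   Card name: Example GPU
   Feature Levels: 12_1,12_0,11_1,11_0,10_1,10_0
"""

def parse_dxdiag(report: str) -> dict:
    """Extract the DirectX runtime version and the GPU feature levels."""
    version = re.search(r"DirectX Version:\s*(.+)", report)
    levels = re.search(r"Feature Levels:\s*(.+)", report)
    return {
        "directx_version": version.group(1).strip() if version else None,
        "feature_levels": (
            [s.strip() for s in levels.group(1).split(",")] if levels else []
        ),
    }

def highest_feature_level(info: dict):
    """Return the highest feature level as a (major, minor) tuple, or None."""
    parsed = [tuple(int(p) for p in lvl.split("_")) for lvl in info["feature_levels"]]
    return max(parsed) if parsed else None

info = parse_dxdiag(SAMPLE_REPORT)
print(info["directx_version"], highest_feature_level(info))
```

The runtime version alone only says what Windows ships with; the feature levels describe what the GPU and driver expose, which is the more useful line when deciding whether to enable DX12 in a game.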

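For ordinary players these are simply two different graphics APIs. On GeForce 10-series and older cards, enabling DX12 can actually hurt performance. I use a 1060, and when running Bright Memory and Battlefield V under DX12, the average frame rate is 5-10 fps lower than under DX11 (2K display, same graphics settings: DX11 normally runs at 60-80 fps, while DX12 drops to 50-65 fps, a performance loss of more than 10%).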
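As a quick sanity check on those numbers, the relative loss can be computed directly from the two reported ranges. This is a minimal sketch; pairing the low ends and the high ends of the ranges with each other is an assumption, since the post gives no per-scene figures.

```python
def percent_loss(before_fps: float, after_fps: float) -> float:
    """Relative performance loss in percent going from before_fps to after_fps."""
    return (before_fps - after_fps) / before_fps * 100.0

dx11_range = (60.0, 80.0)  # fps range reported under DX11
dx12_range = (50.0, 65.0)  # fps range reported under DX12

low = percent_loss(dx11_range[0], dx12_range[0])               # 60 -> 50 fps
high = percent_loss(dx11_range[1], dx12_range[1])              # 80 -> 65 fps
mid = percent_loss(sum(dx11_range) / 2, sum(dx12_range) / 2)   # 70 -> 57.5 fps

print(f"low end: {low:.1f}%, high end: {high:.1f}%, midpoint: {mid:.1f}%")
```

Under that pairing the loss comes out at roughly 17 to 19 percent, so the quoted ranges point to a somewhat larger drop than the "5-10 fps" average suggests.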

DirectX 12 debuted in 2015 alongside Windows 10, promising significant performance and efficiency boosts across the board. These include better CPU utilization, closer-to-metal access, and a host of new features, most notably ray tracing through DXR (DirectX Raytracing), which was added later. But what exactly is DirectX 12, and how is it different from DirectX 11? Let's have a look.

What is DirectX: It’s an API

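Like Vulkan and OpenGL, DirectX is an API that allows you to run video games on your computer. However, unlike its counterparts, DirectX is a proprietary Microsoft platform and runs natively only on Windows. OpenGL and Vulkan, on the other hand, are cross-platform: both run on Linux as well as Windows, OpenGL also runs on macOS (where Apple has deprecated it), and Vulkan reaches macOS through the MoltenVK translation layer.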

Now, what does a graphics API like DirectX do? It acts as an intermediary between the game engine and the graphics drivers, which in turn interact with the OS kernel. The game's rendering code is written against the API, much as a painting is made inside a paint application: think of the game as the painting and the API as MS Paint. Unlike a painting, however, the resulting program can only be run through the API it was written for. Many graphics APIs are also tied to a specific platform, which is one reason why PS4 games don't run on the Xbox One and vice versa.

DirectX 12 Ultimate narrows that gap. It is used on both Windows and the next-gen Xbox Series X, so with DX12 Ultimate Microsoft is essentially unifying the two platforms.

DirectX 11 vs DirectX 12: What Does it Mean for PC Gamers

There are three main advantages of the DirectX 12 API for PC gamers:

Better Scaling with Multi-Core CPUs

[Figure: CPU overhead with DX11 and DX12 (lower is better)]

One of the core advantages of low-level APIs like DirectX 12 and Vulkan is improved CPU utilization. Traditionally, most DirectX 9 and DirectX 11 games used only 2-4 cores for their various workloads (physics, AI, draw calls, and so on), and some were limited to a single core. DirectX 12 changes that: the load is distributed more evenly across all cores, making multi-core CPUs more relevant for gamers.

Maximum Hardware Utilization

Many of you might have noticed that, in the beginning, AMD GPUs gained more from DirectX 12 titles than rival NVIDIA parts. Why is that?

The reason is better utilization. Traditionally, NVIDIA has had much better driver support, while AMD hardware has suffered from weaker drivers. DirectX 12 adds technologies that improve utilization, such as asynchronous compute, which lets graphics and compute workloads execute concurrently instead of one after the other. This makes poor driver support a less pressing concern.
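To see why overlapping graphics and compute helps, here is a deliberately idealized timing model in Python. The function names, the `overlap` parameter, and the millisecond figures are all hypothetical; real gains depend on how much the two workloads compete for the same hardware units.

```python
def frame_time_serial(graphics_ms: float, compute_ms: float) -> float:
    """Graphics and compute run back to back on the GPU."""
    return graphics_ms + compute_ms

def frame_time_async(graphics_ms: float, compute_ms: float, overlap: float = 1.0) -> float:
    """Graphics and compute run concurrently.

    overlap is the fraction (0.0 to 1.0) of the shorter workload that is
    hidden behind the longer one. 1.0 is the ideal case, 0.0 is no overlap.
    """
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be between 0.0 and 1.0")
    hidden = overlap * min(graphics_ms, compute_ms)
    return graphics_ms + compute_ms - hidden

# Made-up example: 10 ms of graphics work and 4 ms of compute work per frame.
serial = frame_time_serial(10.0, 4.0)
ideal = frame_time_async(10.0, 4.0)
partial = frame_time_async(10.0, 4.0, overlap=0.5)
print(serial, ideal, partial)
```

In the ideal case the frame costs only as much as the longer of the two workloads, which is the utilization gain the paragraph above describes.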

Closer-to-Metal Support

Another major advantage of DirectX 12 is that developers have more control over how their game uses the hardware. Previously the hardware was more abstracted and was mostly managed by the drivers and the API (although some engines like Frostbite and Unreal provided low-level tools as well).

[Figure. Source: PCWorld]

Now the task falls to the developers. They have closer-to-metal access, meaning that most rendering responsibilities and resource allocation are handled by the game engine, with some help from the graphics drivers.

This is a double-edged sword: there are many GPU architectures out in the wild, and for indie developers it is impractical to optimize a game for all of them. Luckily, third-party engines like Unreal, CryEngine, and Unity do this for them, so they only have to focus on design.

How DirectX 12 Improves Performance by Optimizing Hardware Utilization

Again, there are a few main API advances that facilitate this gain:

Per-Call API Context

Like any application, a game talks to a graphics API such as DirectX through a primary thread that keeps track of the internal API state (resources, their allocation, and their availability). With DirectX 9 and 11, that state is global (a single context). The games you run on your PC modify this state through calls to the API, after which the work is submitted to the GPU for execution. Because there is a single global state/context, and a single main thread that drives it, multi-threading is difficult: draw calls issued from several threads at once would interfere with each other. Modifying the global state through state calls is also relatively slow, which adds further overhead.

With DirectX 12, draw calls are more flexible. Instead of a single global state (context), each draw call from the application carries its own smaller state (see Pipeline State Objects below). A draw call contains the data and pointers it needs and is independent of other calls and their states, so different threads can record different draw calls at the same time.
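The idea can be sketched with a toy model in Python. This is not Direct3D code: `DrawCall`, `CommandList`, and `build_frame` are hypothetical names, and Python threads only illustrate the structure, not real parallel speed-up. The point is that each thread records into its own buffer and every draw carries its own state, so no shared context has to be locked.

```python
import threading
from dataclasses import dataclass, field

@dataclass(frozen=True)
class DrawCall:
    """A self-contained draw: it carries its state instead of reading a global one."""
    state: str  # stands in for a pipeline state object
    mesh: str

@dataclass
class CommandList:
    """Per-thread recording buffer. No other thread touches it."""
    calls: list = field(default_factory=list)

    def draw(self, state: str, mesh: str) -> None:
        self.calls.append(DrawCall(state, mesh))

def record(cmd: CommandList, state: str, meshes: list) -> None:
    """Work done by one recording thread."""
    for mesh in meshes:
        cmd.draw(state, mesh)

def build_frame(work: list) -> list:
    """Record one command list per work item in parallel, then submit in order."""
    lists = [CommandList() for _ in work]
    threads = [
        threading.Thread(target=record, args=(cmd, state, meshes))
        for cmd, (state, meshes) in zip(lists, work)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Submission order is fixed by list order, not by thread timing.
    return [call for cmd in lists for call in cmd.calls]

frame = build_frame([
    ("opaque", ["terrain", "rock"]),
    ("transparent", ["glass"]),
])
print(frame)
```

Because the final order depends only on the order of the lists, the result is deterministic no matter which thread finishes first.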

Pipeline State Objects

In DirectX 11, the GPU pipeline is configured through a wide range of separate states and stages, such as the vertex shader, hull shader, and geometry shader. These are often interdependent, and a stage can't proceed until the previous one has been defined. When the geometry from a scene is sent to the GPU for rendering, the resources and hardware required can vary depending on the rasterizer state, blend state, depth-stencil state, culling, etc.

Each of these objects in DirectX 11 has to be defined individually at runtime, and the next state can't be set up until the previous one has been finalized, as they map to different hardware units (shaders vs. ROPs, TMUs, etc.). This leaves the hardware under-utilized, increasing overhead and reducing the number of draw calls that can be issued.

In the above comparison, HW state 1 represents the shader code, and state 2 is a combination of the rasterizer and the control flow linking the rasterizer to the shaders. State 3 is the linkage between the blend state and the pixel shader. The vertex shader affects HW states 1 and 2, the rasterizer affects state 2, the pixel shader affects states 1-3, and so on. As explained in the previous section, this adds CPU overhead because the driver generally waits until the dependencies are resolved.

DirectX 12 replaces these separate states with Pipeline State Objects (PSOs), which are finalized at creation. Put simply, a PSO is an object that describes the state of the draw call it represents. An application can create as many PSOs as it needs and switch between them freely. A PSO includes the bytecode for all shaders (vertex, pixel, domain, hull, and geometry) and can be converted into hardware state without depending on any other object or state.
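A rough way to picture "finalized at creation" is an immutable object that is validated once and then reused. The Python sketch below is a toy model under that assumption: `PipelineState`, `PipelineCache`, the blend mode names, and the shader names are hypothetical and do not mirror the actual Direct3D 12 structures.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PipelineState:
    """Immutable bundle of pipeline settings, validated once when created."""
    vertex_shader: str
    pixel_shader: str
    blend: str = "opaque"
    depth_test: bool = True

    def __post_init__(self):
        # All checks happen here, up front, instead of at every draw call.
        if not self.vertex_shader or not self.pixel_shader:
            raise ValueError("a pipeline needs a vertex and a pixel shader")
        if self.blend not in {"opaque", "alpha", "additive"}:
            raise ValueError(f"unknown blend mode: {self.blend}")

class PipelineCache:
    """Creates each distinct pipeline once, then hands back the same object."""

    def __init__(self):
        self._cache = {}
        self.compiles = 0  # how many pipelines were actually built

    def get(self, **desc) -> PipelineState:
        key = tuple(sorted(desc.items()))
        if key not in self._cache:
            self._cache[key] = PipelineState(**desc)
            self.compiles += 1
        return self._cache[key]

cache = PipelineCache()
lit = cache.get(vertex_shader="vs_main", pixel_shader="ps_lit")
glass = cache.get(vertex_shader="vs_main", pixel_shader="ps_lit", blend="alpha")
lit_again = cache.get(vertex_shader="vs_main", pixel_shader="ps_lit")
print(cache.compiles, lit is lit_again)
```

Switching between `lit` and `glass` at draw time is then just a matter of picking a different prebuilt object, with no per-draw validation or dependency resolution.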

[Figure: NVIDIA's mesh and task shaders (Turing microarchitecture)]

With DirectX 12 Ultimate, recent NVIDIA and AMD GPUs introduce task shaders (called amplification shaders in DirectX) and mesh shaders. These two stages replace the various cumbersome shader stages of the DX11 geometry pipeline with a more flexible approach.

The mesh shader performs the same task as the domain and geometry shaders, but internally it uses a multi-threaded model instead of a single-threaded one. The task shader plays a role similar to the hull shader. The major difference is that while the hull shader takes patches as input and produces a tessellated object as output, the task shader's input and output are user-defined.

Consider a scene with thousands of objects that need to be rendered. In the traditional model, each of them would require its own draw call from the CPU. With the task shader, a list of objects is sent using a single draw call. The task shader then processes this list in parallel and assigns work to the mesh shader (which also works in parallel), after which the scene is sent to the rasterizer for 3D-to-2D conversion.

This approach significantly reduces the number of CPU draw calls per scene, which allows a higher level of detail.

Mesh shaders also make it easier to cull unused triangles. This is done in the amplification shader, which runs before the mesh shader and determines how many mesh shader thread groups are needed. It tests the meshlets for possible intersections and screen visibility and then carries out the required culling. Culling geometry this early in the rendering pipeline significantly improves performance.
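The culling step can be illustrated with a small CPU-side model in Python. This is only a sketch: real amplification and mesh shaders run on the GPU and test against the actual view frustum, whereas here the visible region is simplified to an axis-aligned box, and `Meshlet`, `task_stage`, `mesh_stage`, and the scene data are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Meshlet:
    name: str
    center: tuple   # (x, y, z) of the bounding sphere
    radius: float
    triangles: int

def sphere_in_box(center, radius, box_min, box_max) -> bool:
    """Conservative visibility test: does the sphere overlap the box?"""
    dist_sq = 0.0
    for c, lo, hi in zip(center, box_min, box_max):
        nearest = min(max(c, lo), hi)  # closest point of the box on this axis
        dist_sq += (c - nearest) ** 2
    return dist_sq <= radius ** 2

def task_stage(meshlets, box_min, box_max):
    """Stand-in for the amplification shader: keep only visible meshlets."""
    return [m for m in meshlets if sphere_in_box(m.center, m.radius, box_min, box_max)]

def mesh_stage(visible):
    """Stand-in for the mesh shader: count triangles sent to the rasterizer."""
    return sum(m.triangles for m in visible)

scene = [
    Meshlet("statue_head", (0.0, 1.0, 5.0), 0.5, 124),
    Meshlet("statue_base", (0.0, 0.0, 5.0), 0.5, 96),
    Meshlet("far_tree", (40.0, 0.0, 5.0), 1.0, 120),  # well outside the view box
]
visible = task_stage(scene, box_min=(-10.0, -10.0, 0.0), box_max=(10.0, 10.0, 100.0))
print([m.name for m in visible], mesh_stage(visible))
```

Only the meshlets that survive the first stage generate any work for the second, which is why culling this early saves so much downstream effort.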

Command Queue

With DirectX 11, there’s only a single queue going to the GPU. This leads to uneven distribution of load across various CPU cores, essentially crippling multi-threaded CPUs.
