Create device

With the physical device queried, we can now create the Vulkan device intended for runtime use.

The device will be the central object for most of the operations that follow, as it is responsible for all resource management and for the creation of other device-specific objects.

To create the device, we need to specify not only the device extensions and layers to enable, but also a list of queues we wish to create. Each queue belongs to one of the queue families exposed by the physical device.

Each queue family carries flags indicating support for any combination of the following:

  • VK_QUEUE_GRAPHICS_BIT
  • VK_QUEUE_COMPUTE_BIT
  • VK_QUEUE_TRANSFER_BIT
  • VK_QUEUE_SPARSE_BINDING_BIT

Queues from different queue families can process commands of their supported types independently of one another. For instance, many GPUs expose a transfer queue family that is separate from the graphics queue family, so transferring data from the CPU to the GPU does not have to block rendering operations on the GPU.
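
As a minimal sketch of how such a separate transfer family could be located, the helper below scans an array of queue flags for a family that supports transfer but neither graphics nor compute. The name find_dedicated_transfer_family and the QUEUE_* constants are hypothetical stand-ins (the constants mirror the values of the corresponding VK_QUEUE_* bits so the sketch compiles without the Vulkan headers); in real code the flags would come from VkQueueFamilyProperties.queueFlags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins mirroring the Vulkan queue flag values, so this sketch
   compiles without <vulkan/vulkan.h>. */
enum {
    QUEUE_GRAPHICS_BIT       = 0x1, /* VK_QUEUE_GRAPHICS_BIT       */
    QUEUE_COMPUTE_BIT        = 0x2, /* VK_QUEUE_COMPUTE_BIT        */
    QUEUE_TRANSFER_BIT       = 0x4, /* VK_QUEUE_TRANSFER_BIT       */
    QUEUE_SPARSE_BINDING_BIT = 0x8  /* VK_QUEUE_SPARSE_BINDING_BIT */
};

#define NO_QUEUE_FAMILY 0xFFFFFFFFu

/* Hypothetical helper: returns the index of the first family that
   supports transfer but neither graphics nor compute, or
   NO_QUEUE_FAMILY if no such family exists. */
static uint32_t find_dedicated_transfer_family ( const uint32_t* queueFlags, uint32_t count )
{
    assert ( queueFlags != NULL || count == 0 );

    for ( uint32_t i = 0; i < count; i++ )
    {
        uint32_t flags = queueFlags[i];

        if ( ( flags & QUEUE_TRANSFER_BIT ) &&
             !( flags & ( QUEUE_GRAPHICS_BIT | QUEUE_COMPUTE_BIT ) ) )
            return i;
    }

    return NO_QUEUE_FAMILY;
}
```

If the helper returns NO_QUEUE_FAMILY, transfers can still be submitted to a graphics or compute queue, since those families accept transfer commands even when they do not report the transfer bit separately.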

Multiple queues can be created in the same queue family, but these are not guaranteed to operate asynchronously. Instead, an implementation may simply alternate execution between multiple queues of the same family.

To query queue families, we first call vkGetPhysicalDeviceQueueFamilyProperties in the same way as we have done for layers and extensions. Then we check for VK_QUEUE_GRAPHICS_BIT, as we are only interested in rendering.

On top of that, we also need to query which queue family is responsible for presenting the image to the screen. Since all surface functionality is provided by a set of extensions, this query is not part of the core API.

uint32_t queueFamilyPropertyCount = 0;
uint32_t graphicsQueueIndex = 0xFFFFFFFF, presentQueueIndex = 0xFFFFFFFF;
VkQueueFamilyProperties* queueFamilyProperties = NULL;
VkBool32* supportsPresent = NULL;

vkGetPhysicalDeviceQueueFamilyProperties (
    renderer->physicalDevice,
    &queueFamilyPropertyCount,
    NULL
);

queueFamilyProperties = _alloca ( queueFamilyPropertyCount * sizeof ( VkQueueFamilyProperties ) );
supportsPresent       = _alloca ( queueFamilyPropertyCount * sizeof ( VkBool32 ) );

vkGetPhysicalDeviceQueueFamilyProperties (
    renderer->physicalDevice,
    &queueFamilyPropertyCount,
    queueFamilyProperties
);

for ( uint32_t i = 0; i < queueFamilyPropertyCount; i++ )
{
    if ( queueFamilyProperties[i].queueFlags & VK_QUEUE_GRAPHICS_BIT )
    {
        graphicsQueueIndex = i;
        break;
    }
}

if ( graphicsQueueIndex == 0xFFFFFFFF )
    RETURN_ERROR(-1,"No graphics queue found");

for ( uint32_t i = 0; i < queueFamilyPropertyCount; i++ )
{
    result = renderer->vkInstanceVtbl.vkGetPhysicalDeviceSurfaceSupportKHR (
        renderer->physicalDevice,
        i,
        renderer->surface,
        supportsPresent + i
    );

    if ( result != VK_SUCCESS )
        RETURN_ERROR(-1,"vkGetPhysicalDeviceSurfaceSupportKHR(%u) failed (0x%08X)", i, (uint32_t)result);
}

for ( uint32_t i = 0; i < queueFamilyPropertyCount; i++ )
{
    if ( supportsPresent[i] )
    {
        presentQueueIndex = i;
        break;
    }
}

if ( presentQueueIndex == 0xFFFFFFFF )
    RETURN_ERROR(-1,"No present queue found");

First we query the number and properties of all queue families available on the physical device. Then we pick the first queue family that supports graphics and save its index. Finally, we test every queue family to see whether it supports the present operation for the surface we are going to use, and again pick the first one available.
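
The two searches above are independent, so they can return two different families even when a single family supports both operations. As a hedged alternative, the sketch below prefers a family that can do both and only falls back to separate families otherwise. The helper pick_queue_families and the QUEUE_GRAPHICS_BIT stand-in are hypothetical names introduced for illustration; the inputs correspond to the queueFlags members and the supportsPresent array gathered above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_GRAPHICS_BIT 0x1u        /* same value as VK_QUEUE_GRAPHICS_BIT */
#define NO_QUEUE_FAMILY    0xFFFFFFFFu

/* Hypothetical helper. queueFlags[i] and supportsPresent[i] describe
   queue family i. Returns 1 if both a graphics and a present family
   were found, 0 otherwise. */
static int pick_queue_families (
    const uint32_t* queueFlags,
    const uint32_t* supportsPresent,
    uint32_t        count,
    uint32_t*       graphicsIndex,
    uint32_t*       presentIndex )
{
    assert ( graphicsIndex != NULL && presentIndex != NULL );

    *graphicsIndex = NO_QUEUE_FAMILY;
    *presentIndex  = NO_QUEUE_FAMILY;

    for ( uint32_t i = 0; i < count; i++ )
    {
        int graphics = ( queueFlags[i] & QUEUE_GRAPHICS_BIT ) != 0;
        int present  = supportsPresent[i] != 0;

        if ( graphics && present )
        {
            /* Best case: one family handles both operations. */
            *graphicsIndex = i;
            *presentIndex  = i;
            return 1;
        }

        /* Otherwise remember the first candidate of each kind. */
        if ( graphics && *graphicsIndex == NO_QUEUE_FAMILY )
            *graphicsIndex = i;
        if ( present && *presentIndex == NO_QUEUE_FAMILY )
            *presentIndex = i;
    }

    return *graphicsIndex != NO_QUEUE_FAMILY && *presentIndex != NO_QUEUE_FAMILY;
}
```

Sharing one family for both operations means only a single queue has to be created, and images do not need to be handed over between two different queue families.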

With that information, we can create the device:

result = vkCreateDevice (
    renderer->physicalDevice,
    &(VkDeviceCreateInfo){
        .sType                = VK_STRUCTURE_TYPE_DEVICE_CREATE_INFO,
        .pNext                = NULL,
        .flags                = 0,
        .queueCreateInfoCount = (graphicsQueueIndex != presentQueueIndex) ? 2 : 1,
        .pQueueCreateInfos = (VkDeviceQueueCreateInfo[2]){
            [0] = {
                .sType            = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO,
                .pNext            = NULL,
                .flags            = 0,
                .queueFamilyIndex = graphicsQueueIndex,
                .queueCount       = 1,
                .pQueuePriorities = (float[1]) { 0.0f },
            },
            [1] = {
                .sType            = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO,
                .pNext            = NULL,
                .flags            = 0,
                .queueFamilyIndex = presentQueueIndex,
                .queueCount       = 1,
                .pQueuePriorities = (float[1]) { 0.0f },
            },
        },
#if !BARE_AS_CAN_BE
        .enabledLayerCount       = STATIC_ARRAY_SIZE(RequiredDeviceLayers),
        .ppEnabledLayerNames     = RequiredDeviceLayers,
#else
        .enabledLayerCount       = 0,
        .ppEnabledLayerNames     = NULL,
#endif
        .enabledExtensionCount   = STATIC_ARRAY_SIZE(RequiredDeviceExtensions),
        .ppEnabledExtensionNames = RequiredDeviceExtensions,
        .pEnabledFeatures        = NULL
    },
    NULL,
    &renderer->device
);
if ( result != VK_SUCCESS )
    RETURN_ERROR(-1,"vkCreateDevice failed (0x%08X)", (uint32_t)result);

We specify the two queue family indices we have gathered in the pQueueCreateInfos array. Since these indices may be identical, the queueCreateInfoCount member is set to 1 if both indices are identical, and to 2 if they differ. The Vulkan specification requires every queueFamilyIndex in pQueueCreateInfos to be unique, and this also ensures we do not ask for more queues than we intend to use, as certain queue families may only support a single queue.
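
The ternary expression works for two indices, but it does not scale once more queues (for example a transfer queue) are requested. A minimal sketch of a more general approach is to collect the distinct family indices first and emit one VkDeviceQueueCreateInfo per entry. The helper name collect_unique_families is hypothetical and not part of Vulkan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies the distinct values of `wanted` into
   `unique`, preserving first-seen order, and returns how many values
   were written. `unique` must have room for `wantedCount` entries. */
static uint32_t collect_unique_families (
    const uint32_t* wanted,
    uint32_t        wantedCount,
    uint32_t*       unique )
{
    uint32_t uniqueCount = 0;

    assert ( ( wanted != NULL && unique != NULL ) || wantedCount == 0 );

    for ( uint32_t i = 0; i < wantedCount; i++ )
    {
        int seen = 0;

        /* Linear search is fine here: the list holds a handful of entries. */
        for ( uint32_t j = 0; j < uniqueCount; j++ )
        {
            if ( unique[j] == wanted[i] )
            {
                seen = 1;
                break;
            }
        }

        if ( !seen )
            unique[uniqueCount++] = wanted[i];
    }

    return uniqueCount;
}
```

Called with { graphicsQueueIndex, presentQueueIndex }, the returned count would take the place of the ternary in queueCreateInfoCount, and each entry of the output array would supply the queueFamilyIndex of one create info.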

The pEnabledFeatures pointer is set to NULL as we do not use it, but this structure may prove useful when looking toward specific optimizations for certain platforms. Support for these features can be queried using the vkGetPhysicalDeviceFeatures function. Features that are not explicitly enabled through pEnabledFeatures must not be used and are unlikely to work on most implementations, so keeping an eye on the values of this structure when optimizing your rendering process is advisable.
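
Before enabling features, it is worth checking that the physical device actually reports them. The sketch below shows the idea on a trimmed stand-in structure; FeatureSet and features_supported are hypothetical names, and the three members are only a small subset of the real VkPhysicalDeviceFeatures members of the same names. In real code, the supported set would be filled in by vkGetPhysicalDeviceFeatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed stand-in for VkPhysicalDeviceFeatures (assumption: only three
   of its many VkBool32 members are modelled here). */
typedef struct
{
    uint32_t geometryShader;
    uint32_t samplerAnisotropy;
    uint32_t wideLines;
} FeatureSet;

/* Hypothetical helper: returns 1 if every feature set in `requested`
   is also set in `supported`, 0 otherwise. */
static int features_supported ( const FeatureSet* supported, const FeatureSet* requested )
{
    assert ( supported != NULL && requested != NULL );

    return ( !requested->geometryShader    || supported->geometryShader    )
        && ( !requested->samplerAnisotropy || supported->samplerAnisotropy )
        && ( !requested->wideLines         || supported->wideLines         );
}
```

Only when this check passes would the requested structure be handed to vkCreateDevice through pEnabledFeatures; otherwise the application can fall back to a code path that does not rely on the missing feature.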