In this assignment you will build the core of a real-time scene viewer which renders scenes using the Vulkan API. This code serves as a foundation that you will modify, extend, and improve for the rest of the semester.
Scoring: this assignment is worth 15 points, total. Thanks to a convenient and intentional construction, each "point" is worth exactly 1% of your final course grade. Points are allocated as follows: A1-load is worth 2 points; A1-show is worth 3 points; A1-cull is worth 2 points; A1-move is worth 2 points; A1-hide is worth 3 points; A1x-fast is worth 1 point + up to 1 extra point; and A1-create is worth 2 points.
Points will be awarded for each section based on both the code and the sections of a report demonstrating and benchmarking the code.
Reminder: this class takes a strict didactic, ethical, and legal view on copying code.
Starting with your codebase from the nakluV tutorial, write code that can load scene'72 files (A1-load), display them (A1-show), cull offscreen geometry (A1-cull), play animations (A1-move), render without a window (A1-hide), and go fast (A1x-fast).
Use your creativity (and your code) to make something beautiful:
Demonstrate and test your code; write a report which includes screen recordings, images, timings, examples, graphs. This is non-trivial. See report template to understand what is required.
Turn in your code in /afs/cs.cmu.edu/academic/class/15472-f24/<andrewid>/A1/ .
(If you are accessing AFS via an andrew machine, you may need to aklog cs.cmu.edu to acquire cross-realm tokens.)
Your turn-in directory should include:
report/ -- report describing your code and illustrating that it works.
report/report.html -- start with the report template and replace the "placeholder" sections.
report/*.s72,*.b72 -- benchmarking scenes and data mentioned in your report.
report/* -- other files (images, animations) needed by your report.
code/ -- the code you wrote for this assignment.
code/.git -- your code must be stored in a git repository with a commit history showing your development process.
code/Maekfile.js -- build script. When run as node Maekfile.js, produces bin/viewer (bin/viewer.exe on Windows), your compiled scene viewer.
animation/ -- your created animation loop. (Files needed by your report belong in the report/ folder; the scene file for your animation should be in the animation/ folder.)
animation/animation.s72 -- your main animation Scene'72 file.
animation/*.b72 -- any data files needed by your scene.
animation/animation.mp4 -- a screen recording (H.264 in MP4 container) of your animation playing in your viewer.
We expect your Maekfile.js to properly build your viewer on at least one of { Linux/g++ ; Windows/cl.exe ; macOS/clang++ }.
We will not penalize you for minor cross-platform compile problems, though we would appreciate it if you tested on Linux.
We will compile and run your code in an environment set up to build the nakluV tutorial -- GLFW 3.4 and the Vulkan SDK will be available.
We expect your report to be viewable in Firefox on Linux. You may wish to consult MDN to determine format compatibility for any embedded videos.
Although one of this class's principles is "No Magic," we cannot open every black box; so you are allowed to use a few approved libraries for specific tasks in your code. (These libraries are in addition to GLFW and the Vulkan SDK which -- as we note above -- will be available in our build environment.)
If you use any of these libraries, you will need to include the relevant code in your turn-in repository. We suggest a git submodule.
If you would like to use a different library for one of these tasks, you do need approval from course staff (there will be a Zulip thread). Approval may not be given for certain libraries, and may take some time.
Some of the features you implement in this assignment are controlled by command-line arguments. Many of these are documented in more detail in the sections below. The "optional" and "required" tags indicate whether the argument is required on the command line, not whether it is required that you implement it.
--scene scene.s72 (required) -- A1-load -- specifies the scene (in .s72 format) to view.
--camera name (optional) -- A1-show -- view the scene through the camera named name. If such a camera doesn't exist in the scene, abort.
--physical-device name (required/optional) -- A1-show -- use the physical device whose VkPhysicalDeviceProperties::deviceName matches name. If such a device does not exist, abort. You may choose how to deal with the flag not being specified, and how to allow the user to determine valid options (perhaps a --list-physical-devices flag?).
--drawing-size w h (optional) -- A1-show -- set the initial size of the drawable part of the window in physical pixels. If the resulting swapchain extent does not match the requested size (e.g., because the window manager won't allow a window of the requested size), abort. If this flag is not specified, do something reasonable, like creating a moderately-sized window.
--culling none|frustum|... (optional) -- A1-cull, A1x-fast -- sets the culling mode. You may add additional culling modes when tackling extra goals.
--headless events (optional) -- A1-hide -- if specified, run in headless mode (no windowing system connection), and read frame times and events from the file events. The flag --drawing-size is required when using headless mode, and specifies the size of the offscreen canvas that is rendered into.
You may add your own command line arguments; indeed, there is a section in the report template to document them.
Note that the tutorial codebase already includes some simple command-line parsing code in RTG.cpp -- specifically, the RTG::Configuration::parse function.
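For reference, a sketch of the general shape such parsing takes. The structure and field names below (Configuration, scene_path, events_path, and so on) are made up for illustration; the tutorial's actual RTG::Configuration::parse differs in detail:

//Hypothetical sketch of flag parsing -- not the tutorial's actual code:
#include <cstdint>
#include <stdexcept>
#include <string>

struct Configuration {
    std::string scene_path;                //--scene (required)
    std::string camera_name;               //--camera (empty = none specified)
    uint32_t drawing_w = 0, drawing_h = 0; //--drawing-size
    std::string events_path;               //--headless (empty = windowed mode)
};

Configuration parse(int argc, char **argv) {
    Configuration config;
    for (int i = 1; i < argc; ++i) {
        std::string arg = argv[i];
        if (arg == "--scene") {
            if (++i >= argc) throw std::runtime_error("--scene requires a parameter.");
            config.scene_path = argv[i];
        } else if (arg == "--camera") {
            if (++i >= argc) throw std::runtime_error("--camera requires a parameter.");
            config.camera_name = argv[i];
        } else if (arg == "--drawing-size") {
            if (i + 2 >= argc) throw std::runtime_error("--drawing-size requires two parameters.");
            config.drawing_w = uint32_t(std::stoul(argv[++i]));
            config.drawing_h = uint32_t(std::stoul(argv[++i]));
        } else if (arg == "--headless") {
            if (++i >= argc) throw std::runtime_error("--headless requires a parameter.");
            config.events_path = argv[i];
        } else {
            //--physical-device, --culling, etc. follow the same pattern
            throw std::runtime_error("Unknown argument '" + arg + "'.");
        }
    }
    if (config.scene_path.empty()) throw std::runtime_error("--scene is required.");
    return config;
}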
This assignment (and subsequent assignments) will use scene'72 (.s72) format.
In this assignment, you may make the following simplifying assumptions about scene'72 files:
Meshes will not have an "indices" property.
Meshes will have a "topology" of "TRIANGLE_LIST".
Meshes will have exactly the following "attributes" (though, likely, with a different "src" file; see the vertex-layout sketch after this list):
"attributes":{
    "POSITION": { "src":"cube.b72", "offset":0, "stride":48, "format":"R32G32B32_SFLOAT" },
    "NORMAL": { "src":"cube.b72", "offset":12, "stride":48, "format":"R32G32B32_SFLOAT" },
    "TANGENT": { "src":"cube.b72", "offset":24, "stride":48, "format":"R32G32B32A32_SFLOAT" },
    "TEXCOORD": { "src":"cube.b72", "offset":40, "stride":48, "format":"R32G32_SFLOAT" }
}
All materials will be "lambertian", with no normal or displacement maps.
"type":"2D" textures will have "format":"linear".
Scenes will contain one or zero "sun"-type lights with angle 0 (i.e., a pure directional light); and one or zero "sun"-type lights with angle approximately 3.1415926 (i.e., a full hemisphere light). This means that the same lighting setup used in the nakluV tutorial will be sufficient for your scene.
There will be no "environment" objects in the scene.
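One way to mirror that attribute layout in host code is a struct whose members match the offsets and stride. A sketch (the struct name is made up):

//One possible C++ mirror of the attribute layout above -- a sketch, not required:
struct PosNorTanTexVertex {
    struct { float x, y, z; } Position;   //offset  0, R32G32B32_SFLOAT
    struct { float x, y, z; } Normal;     //offset 12, R32G32B32_SFLOAT
    struct { float x, y, z, w; } Tangent; //offset 24, R32G32B32A32_SFLOAT
    struct { float s, t; } TexCoord;      //offset 40, R32G32_SFLOAT
};
static_assert(sizeof(PosNorTanTexVertex) == 48, "stride given in the scene file");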
When launched with the command-line option --scene filename.s72, your viewer should load a scene in Scene'72 format from the file filename.s72.
To complete this part of the assignment, you will need to both design a scene data structure to load the scene into and write parsing code to handle the scene format itself.
Suggestions:
Consider storing the scene as flat arrays of the structs that you render (see the sketch below).
Consider loading into an intermediate data format and then translating this to whatever you end up doing for rendering.
Warning: consider the Scene code from 15-466 as an anti-example! It only handles tree-structured scenes, isn't built for efficient traversal, and is pretty closely tied to OpenGL.
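For concreteness, here is one possible shape for such a flat, index-based structure. This is a sketch only -- the names are illustrative, and your design may differ:

//A sketch of a flat scene representation (indices instead of pointers):
#include <cstdint>
#include <string>
#include <vector>

struct Scene {
    struct Transform { float translation[3], rotation[4], scale[3]; };
    struct Node {
        std::string name;
        Transform transform;
        std::vector<uint32_t> children; //indices into Scene::nodes
        int32_t mesh = -1;   //index into Scene::meshes, or -1 for none
        int32_t camera = -1; //index into Scene::cameras, or -1 for none
    };
    struct Mesh {
        std::string name;
        uint32_t first_vertex, vertex_count; //range in a shared vertex buffer
        float bbox_min[3], bbox_max[3];      //bounding box, for culling (A1-cull)
    };
    struct Camera { std::string name; float vfov, aspect, z_near, z_far; };
    std::vector<uint32_t> roots; //indices of root nodes
    std::vector<Node> nodes;
    std::vector<Mesh> meshes;
    std::vector<Camera> cameras;
};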
Now that you've loaded the scene, it is time to show it.
This will involve traversing the scene and sending each mesh + transformation to the GPU using the Vulkan API. You should be able to start with your code from the nakluV Tutorial, which is already set up for drawing object instances.
Your viewer should have three camera modes:
scene camera -- view through a camera defined in the scene file.
user camera -- view through a camera controlled interactively by the user.
debug camera -- view through a second user-controlled camera, while continuing to cull as if rendering from the previously-selected scene or user camera.
If started with the command-line option --camera name your viewer should launch in scene camera mode with the camera named name active. If the named camera does not exist in the scene, your viewer should print an error message and exit.
Otherwise, it is up to you to determine the details of how to activate and switch between scene, user, and debug camera modes. (Please document how in your report.) Some handy controls might include warping the user camera to the current scene camera and/or setting the debug camera to a position that can see the whole scene.
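For example (a sketch of one possible arrangement, not a requirement -- the names are made up), you might track the viewing mode and the culling mode separately, so that switching to the debug camera leaves the culling camera untouched:

//Sketch: remember which camera to cull with when in debug mode:
enum class CameraMode { Scene, User, Debug };

struct CameraState {
    CameraMode mode = CameraMode::Scene;      //camera used for viewing
    CameraMode cull_mode = CameraMode::Scene; //camera used for culling
    void set_mode(CameraMode new_mode) {
        //debug mode keeps culling pinned to the previously-selected camera:
        if (new_mode != CameraMode::Debug) cull_mode = new_mode;
        mode = new_mode;
    }
};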
As per the assumptions above, all materials in your scene will be lambertian, and the only lights in the scene will be a distant directional light and/or a hemisphere light.
Therefore, your fragment shader will probably do something like this:
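For instance, something along these lines. This is a GLSL sketch only; the uniform block layout and the names SKY_DIRECTION, SKY_ENERGY, SUN_DIRECTION, and SUN_ENERGY are placeholders for whatever your code actually binds:

#version 450

//A sketch, not the required shader -- uniform names are placeholders:
layout(set = 0, binding = 0, std140) uniform World {
    vec3 SKY_DIRECTION; float pad0; //hemisphere light "up" direction
    vec3 SKY_ENERGY;    float pad1; //hemisphere light energy
    vec3 SUN_DIRECTION; float pad2; //direction *toward* the sun
    vec3 SUN_ENERGY;    float pad3; //sun light energy
};

layout(location = 0) in vec3 normal;
layout(location = 1) in vec4 color; //lambertian albedo (could also be a texture fetch)

layout(location = 0) out vec4 outColor;

void main() {
    vec3 n = normalize(normal);
    //hemisphere light: full energy when n aligns with the sky, fading to zero opposite:
    vec3 e = SKY_ENERGY * (0.5 + 0.5 * dot(n, SKY_DIRECTION));
    //directional (sun) light:
    e += SUN_ENERGY * max(0.0, dot(n, SUN_DIRECTION));
    outColor = vec4(e * color.rgb, color.a);
}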
The fastest triangle to render is the triangle you don't need to render. It's time to update your viewer to perform view frustum culling. When frustum culling is active, your code should check whether a mesh instance is visible before sending it to the GPU for drawing.
In order to perform a fast visibility test you should build a bounding volume for each mesh and check this volume against the viewing frustum. Use bounding boxes for your first implementation; you may test other shapes later for potential performance improvement.
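For reference, a common conservative test checks the bounding box against the six frustum planes. A sketch, assuming you have already extracted world-space planes whose normals point into the frustum (so a*x + b*y + c*z + d >= 0 for points inside), and that the boxes are in the same space as the planes:

//Sketch of a conservative AABB-vs-frustum test (plane distance form):
struct Plane { float a, b, c, d; }; //a*x + b*y + c*z + d >= 0 means "inside"
struct AABB { float min[3], max[3]; };

bool outside_frustum(AABB const &box, Plane const (&planes)[6]) {
    for (Plane const &p : planes) {
        //pick the box corner farthest along the plane normal:
        float x = (p.a >= 0.0f ? box.max[0] : box.min[0]);
        float y = (p.b >= 0.0f ? box.max[1] : box.min[1]);
        float z = (p.c >= 0.0f ? box.max[2] : box.min[2]);
        //if even that corner is behind the plane, the whole box is outside:
        if (p.a * x + p.b * y + p.c * z + p.d < 0.0f) return true;
    }
    return false; //not trivially outside (may still be offscreen -- this test is conservative)
}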
When your viewer is run with the --culling mode command line option it will start with the given culling mode selected. The modes are as follows:
--culling none -- disable culling. Every mesh gets drawn every frame, even if it is offscreen.
--culling frustum -- check mesh bounding boxes vs the camera's viewing frustum. Any mesh whose bounding box is outside the view frustum gets skipped.
We are using a culling flag because, in your report, you will need to demonstrate that culling is working, and show scenes where it has both a positive and negative impact on frame rate.
Note: as mentioned above, when rendering with the debug camera, do culling as if rendering for the previously-selected (user or scene) camera. This can be very handy for checking to see if culling is actually working! (And demonstrating that it is working in your report.) You may also want to add some code to visualize the view frustum and/or bounding volumes for the meshes.
It is finally time to stop deferring animation.
Update your parsing code to read "DRIVER" objects; add code to your main loop to track and update the current playback position for the animations; and patch in keyboard control to pause/unpause/restart the animations.
Note: the animation playback position for the first rendered frame should be 0, and should be advanced for subsequent frames by measuring the elapsed time from the first frame. (Or, in headless mode, using the times from the events file.)
Note2: your viewer should, by default, start with any scene animations playing and with the playback position at time 0. For the purposes of debugging/testing, you may wish to add command line options to start paused, set starting time, and/or loop animations.
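A sketch of the bookkeeping this implies. The names are illustrative, and real "DRIVER" channels are vector- or quaternion-valued and may use interpolation modes other than the scalar linear case shown here:

//Sketch: playback-time tracking plus linear keyframe lookup:
#include <algorithm>
#include <cassert>
#include <vector>

struct Playback {
    float time = 0.0f;
    float rate = 1.0f; //0 == paused
    void advance(float elapsed_seconds) { time += rate * elapsed_seconds; }
};

//evaluate a scalar channel with linear interpolation at time t:
float evaluate(std::vector<float> const &times, std::vector<float> const &values, float t) {
    assert(!times.empty() && times.size() == values.size());
    if (t <= times.front()) return values.front();
    if (t >= times.back()) return values.back();
    //first keyframe time strictly greater than t:
    auto next = std::upper_bound(times.begin(), times.end(), t);
    size_t i = size_t(next - times.begin());
    float amt = (t - times[i-1]) / (times[i] - times[i-1]);
    return values[i-1] + amt * (values[i] - values[i-1]);
}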
For benchmarking, it will be very useful to have your application run exactly the same workload, and to run it as fast as possible in the background.
Add support for the --headless events command line flag.
In headless mode, your code should not create a window (i.e., a GLFWwindow); it should not create a surface (i.e., a VkSurfaceKHR); it should not create a swapchain (i.e., a VkSwapchainKHR); and it should not use any of the WSI extension functions (e.g., vkAcquireNextImageKHR or vkQueuePresentKHR). Instead, it should emulate these functions by creating a list of VkImages itself, and both signalling and waiting on the right semaphores and fences to keep rendering and "presentation" separate.
Images in your application's fake swapchain will be made available for rendering by events in the events file (which also includes the frame times that should be reported to scene update code).
Important: events files contain timestamps to allow (e.g.) running animations at a repeatable rate; your code shouldn't try to delay to make these match wall-clock time.
Note: For reasons of stable timing, and to avoid mid-run failures, I suggest parsing the events file at startup and storing it into an events structure in memory.
Note: your swapchain should behave as if associated with a surface with VK_PRESENT_MODE_FIFO_KHR as the present mode.
Note: There is a headless surface extension supported on some platforms. You may not use it. Part of this exercise is to get you to understand what a presentation layer must be doing to properly hand out and retrieve rendered images.
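To make the emulation concrete, here is a sketch of one possible fake-swapchain bookkeeping scheme. FakeSwapchain and its members are made-up names, error checking is omitted, and the fences are assumed to be created with VK_FENCE_CREATE_SIGNALED_BIT so the first acquire doesn't block forever:

//Sketch (not a complete implementation) of fake-swapchain bookkeeping:
#include <cstdint>
#include <vector>
#include <vulkan/vulkan.h>

struct FakeSwapchain {
    struct Slot {
        VkImage image = VK_NULL_HANDLE;    //created via vkCreateImage + bound memory
        VkFence rendered = VK_NULL_HANDLE; //signalled by your rendering queue submit
    };
    std::vector<Slot> slots;
    uint32_t next = 0; //least-recently-made-available slot

    //emulates vkAcquireNextImageKHR; call this when an AVAILABLE event arrives:
    uint32_t acquire(VkDevice device) {
        uint32_t index = next;
        next = (next + 1) % uint32_t(slots.size());
        //wait for any rendering still pending on this image to complete:
        vkWaitForFences(device, 1, &slots[index].rendered, VK_TRUE, UINT64_MAX);
        vkResetFences(device, 1, &slots[index].rendered);
        return index;
    }
};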
The events file passed to --headless will be a UTF-8-encoded text file with '\n' (unix-style) line endings.
Each line follows the same general format:
ts EVENT param1 ... paramN
Where ts is the time since the first frame as an integer number of microseconds (i.e., millionths of a second); EVENT is the event type (a string in all caps); and param1 through paramN are parameters (defined per event type).
Note: ts values will be nondecreasing; i.e., events are listed in chronological order.
Your code should support the following events:
ts AVAILABLE -- make the least-recently-available image in the swapchain available to be rendered to (after waiting for any rendering pending on this image to complete). This, effectively, starts a frame rendering.
ts PLAY t rate -- set the animation playback time to t, a time in seconds, and the playback rate to rate. A rate of 0 is paused (time doesn't advance). A playback rate of 1 plays in real-time (relative to the timestamp values). You may, optionally, support fractional and/or negative playback rates.
ts SAVE filename.ppm -- save the rendering from the most-recently-made-available image to filename.ppm in portable pixmap format (binary/P6 variant); see the PPM-writing sketch after this list. This event will never occur before an image is made available.
ts MARK description words -- your code should print the line "MARK description words" to stdout (useful for debugging and visualization).
Here is an example events file and the script that generated it.
(Usage: node events.mjs > example.events.)
You may find yourself making similar files when doing the performance testing part of your report.
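Following the parse-at-startup suggestion above, an events parser can be quite small. A sketch, with illustrative names; parameter text is left raw here and would be interpreted per event type:

//Sketch: parse an events file into memory at startup:
#include <cstdint>
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct Event {
    uint64_t ts;        //microseconds since the first frame
    std::string type;   //"AVAILABLE", "PLAY", "SAVE", or "MARK"
    std::string params; //remainder of the line (parsed per event type)
};

std::vector<Event> load_events(std::string const &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open '" + path + "'");
    std::vector<Event> events;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        std::istringstream str(line);
        Event event;
        if (!(str >> event.ts >> event.type)) {
            throw std::runtime_error("malformed event line: '" + line + "'");
        }
        std::getline(str, event.params); //raw parameter text (may be empty)
        events.push_back(event);
    }
    return events;
}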
Note: this section is worth 1 point + up to 1 extra point. I tagged it as "extra" because you can skip it entirely and still receive 14/15 = ~93% on the assignment. On the other hand, if you try (and document) several substantial approaches to improving your viewer's rendering speed, you can earn some extra credit here.
Within the framework of this basic viewer there are a ton of ways to (attempt to) make your code go faster. The final segment of this assignment is an open-ended exploration of these possibilities.
To receive one point on this segment, you may add any of the following to your system (and document the impact in your report):
To receive up to one extra point, go beyond by attempting more items from the list above or developing and documenting other substantial improvements of your own devising.
Don't forget to write the report.