

























































































































































































Akimitsu Hogge, Software Engineer, Activision Central Technology
My name is Aki and I’m a programmer for Activision Central Tech. I work with a small team in Portland, Maine, where we tackle performance problems in Call of Duty titles. Today we’re going to be discussing controller-to-display latency: that’s the shortest duration between a player pressing a button and the player seeing the result of the press on screen. I’m only going to be discussing the game client, and won’t be covering anything related to networking or the server.
We’re going to go over a dynamic throttling feature that was added to recent Call of Duty games to control the tradeoffs that impact input latency, with the ultimate goal of reducing controller-to-display latency. But first we’ll go over how we measure latency, and then we’ll dive into specific aspects of the engine that had to be factored into the throttle. Finally I’ll discuss the implementation of the throttle and its impact on a recent game.
[Slide diagram: Button press -> Controller / OS sample -> Game engine query -> Game logic and rendering -> Video scan-out -> Lit pixels. The whole span is labeled "Ground truth".]
If you split the controller-to-display latency duration out into steps: first the player presses a button, and the controller samples the button and the console samples the controller at some frequency. At some point in the game loop the game queries the OS for its controller state, and uses the knowledge of the button press for game logic and rendering. The final rendered image then gets scanned out to the display, and the display lights up pixels for the player to see. Let’s call this entire duration "ground truth latency".
[Slide diagram: Button press -> Controller / OS sample -> Game engine query -> Game logic and rendering -> Video scan-out -> Lit pixels. The whole span is labeled "Ground truth"; a sub-span is labeled "In engine".]
Now let’s look at this subset of the latency duration between the game’s input query and the final image scan-out. This segment can be measured totally in software: let’s call it "in-engine latency".
[Slide: Call of Duty®: Black Ops 4, latency measurement overlay]
The engine just has to timestamp the input sample and the start of the video scan-out along with frame indices, then match the two up to get accurate per-frame in-engine latency measurements. Here’s a latency measurement overlay I added in Call of Duty Black Ops 4. Horizontally we’re measuring milliseconds, with the right edge corresponding to the start of the scan-out. Each row of the graph represents a single frame, with frames scrolling up over time. Each of the colored segments corresponds to an engine event that contributed to that particular frame, with the left edge of the bright red bar corresponding to the frame’s input sample. In this example, in-engine latency is sitting at around 20ms.
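The matching step described above can be sketched as follows. This is a minimal illustration, not the engine's actual code: the `Timestamp` type, the `in_engine_latency` function, and the sample values are all hypothetical, and the only idea taken from the talk is pairing input-sample and scan-out timestamps by frame index.

```python
from dataclasses import dataclass


@dataclass
class Timestamp:
    """A recorded event time tagged with the frame it belongs to (hypothetical type)."""
    frame_index: int
    time_ms: float


def in_engine_latency(input_samples, scanout_starts):
    """Pair input-sample and scan-out-start timestamps by frame index.

    Returns {frame_index: latency_ms} for every frame that has both events.
    Frames that have not reached scan-out yet are skipped.
    """
    scanout_by_frame = {s.frame_index: s.time_ms for s in scanout_starts}
    latencies = {}
    for sample in input_samples:
        scanout_ms = scanout_by_frame.get(sample.frame_index)
        if scanout_ms is not None:
            latencies[sample.frame_index] = scanout_ms - sample.time_ms
    return latencies


# Made-up timestamps: frame 3 has sampled input but has not been scanned out yet.
inputs = [Timestamp(1, 0.0), Timestamp(2, 16.0), Timestamp(3, 33.0)]
scanouts = [Timestamp(1, 20.0), Timestamp(2, 36.5)]
latency_by_frame = in_engine_latency(inputs, scanouts)
```

Keying on the frame index rather than on event order keeps the measurement correct even when several frames are in flight at once.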
[Slide: Call of Duty®: Black Ops 4]
Ground truth    In-engine    Δ
101.4 ms        57.3 ms      44.1 ms
64.7 ms         20.3 ms      44.4 ms
Here are a couple of measurements of the game running under different latency parameters. The first column shows ground-truth measurement averages, and the second column shows in-engine measurement averages. The difference is about 44ms, which means that I’m getting around 44ms of latency contribution from factors outside of the game itself. The fact that the difference is constant and independent of game behavior is really useful: it means we can rely purely on software measurements during development, checking ground truth only for final confirmation. This significantly speeds up iteration time, and makes it easier to log and automate measurements. Moving forward, we’re only going to be focusing on how to reduce in-engine latency.
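As a quick check of the arithmetic, the sketch below recomputes the difference column from the two measurement pairs quoted on the slide. The variable names are made up for illustration; only the four measured values come from the talk.

```python
# (ground truth ms, in-engine ms) averages quoted on the slide.
measurements = [
    (101.4, 57.3),
    (64.7, 20.3),
]

# Latency contributed by everything outside the game: ground truth minus in-engine.
deltas_ms = [round(ground_truth - in_engine, 1)
             for ground_truth, in_engine in measurements]

# How much the external contribution moved between the two configurations.
spread_ms = round(max(deltas_ms) - min(deltas_ms), 1)
```

In-engine latency changes by 37 ms between the two rows while the external contribution moves by only a few tenths of a millisecond, which is why the software measurement can stand in for ground truth during iteration.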
[Slide labels: "Latency", "Performance", "Resilience to variance in performance"]
Running at 60 FPS has been a cornerstone of the Call of Duty franchise, and a lot of effort goes into creating as rich a gaming experience as possible while still holding 60. But there’s an intuition that performance is a tradeoff with input latency. And that’s true to a certain extent: one of the most important ways we fit more content into the game is by making full use of the available hardware. That means directing as many CPU and GPU cycles as possible towards game content: more pixels, more triangles, more physics, and so on. Some of the optimizations we make to improve hardware utilization do introduce latency into the engine. Our goal today is to claw back some of that latency without sacrificing the performance gains from those optimizations. I hope to show that the tradeoff is not between latency and performance. Rather, it’s between latency and resilience to variance in performance.
The duration from the input sample to the scan-out is the total in-engine latency for this particular frame.
The strategy we’re going to be employing for latency reduction is to introduce a throttle right before the input sample. By adding a delay here, the input sample gets squeezed closer to the end of the frame.
Everything else, let’s call “slop”.
[Slide diagram labels: Producer, Consumer, Slop]
Slop is not necessarily idle time: it’s usually segments where one timeline is working on a previous frame, or when one timeline is waiting for another timeline to release a shared resource. If you look at a generic producer-consumer system, the producer performs a unit of work, passes the result on to the consumer, and then starts producing the next unit of work. If the consumer is consistently slower than the producer, the system becomes "consumer-bound". The consumer’s timeline stays full, trying to keep up with the producer, but the producer is allowed to get as far ahead as possible. That’s why slop exists: the producer is ahead of the consumer, so there’s a gap between when the producer finishes and when the consumer starts. How far ahead the producer can get depends on how much buffering is allowed between the producer and consumer, and exactly when they acquire and release access to the buffered data.
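A toy simulation can make the relationship between buffering and slop concrete. It is a sketch under stated assumptions rather than a model of the actual engine: work durations are fixed, the function name `simulate_slop` and its parameters are hypothetical, and a buffer slot is assumed to be released only when the consumer finishes the unit stored in it.

```python
def simulate_slop(producer_ms, consumer_ms, buffer_depth, units):
    """Return the slop (gap between producer finish and consumer start) per unit.

    Assumption: the producer may start unit i only once the consumer has
    finished unit i - buffer_depth, which frees that unit's buffer slot.
    """
    producer_end = []
    consumer_end = []
    slop = []
    for i in range(units):
        start = producer_end[i - 1] if i > 0 else 0
        if i >= buffer_depth:
            # Wait for a free buffer slot before producing.
            start = max(start, consumer_end[i - buffer_depth])
        finished = start + producer_ms
        # The consumer starts once the unit exists and its previous unit is done.
        consume_start = max(finished, consumer_end[i - 1] if i > 0 else 0)
        producer_end.append(finished)
        consumer_end.append(consume_start + consumer_ms)
        slop.append(consume_start - finished)
    return slop


# Consumer-bound system: the producer needs 4 ms per unit, the consumer 10 ms.
double_buffered = simulate_slop(4, 10, buffer_depth=2, units=6)
triple_buffered = simulate_slop(4, 10, buffer_depth=3, units=6)
```

In this model the steady-state slop is 6 ms with two buffer slots and 16 ms with three: each extra slot lets the producer run one more consumer period ahead, and that head start shows up directly as latency.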
When you delay the input sample, slop gets squeezed out of the first segment until that segment disappears, then out of the second segment, and so on. Work, on the other hand, gets shifted forward in time, while the total work duration across the frame ideally stays constant. The key behavior of the throttle is that it squeezes slop while preserving work.
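The squeezing behavior can be sketched with a few lines of arithmetic. This is an idealized illustration, not the throttle implementation from the talk: `apply_throttle`, its parameters, and the example durations are hypothetical, and work is assumed to stay exactly constant.

```python
def apply_throttle(slop_segments_ms, work_ms, throttle_ms):
    """Squeeze slop segments in order, leaving the work duration untouched.

    Returns (remaining slop per segment, resulting latency, unused throttle).
    Latency is modeled as work plus whatever slop is left.
    """
    remaining_throttle = throttle_ms
    squeezed = []
    for slop in slop_segments_ms:
        removed = min(slop, remaining_throttle)
        squeezed.append(slop - removed)
        remaining_throttle -= removed
    latency_ms = work_ms + sum(squeezed)
    return squeezed, latency_ms, remaining_throttle


# Made-up frame: 12 ms of work and three slop segments totaling 10 ms.
slop_segments = [3.0, 5.0, 2.0]
before_ms = 12.0 + sum(slop_segments)
squeezed, after_ms, excess_ms = apply_throttle(slop_segments, 12.0, throttle_ms=6.0)
```

Here a 6 ms throttle removes the first segment entirely and 3 ms of the second, so latency drops from 22 ms to 16 ms. Any throttle beyond the total slop is reported as `excess_ms`; this toy model does not try to say what that excess would do to a real frame.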
Latency = Work + Slop
We’ve split the latency duration up into work and slop. Our strategy is to throttle the input sample to reduce slop, thereby reducing latency. To figure out how much to throttle, we’re going to examine all the sources of work and slop in the game.
[Slide diagram labels: Frame buffer, HDMI, scan, VBLANK]
The game renders a final image into a piece of memory called the frame buffer. The video scan-out hardware then reads the frame buffer row by row, top to bottom, and transmits the pixel data across a cable to the display. The scan-out transmits continuously at a rate matching the refresh rate of the display. Conventional displays today have a refresh rate of 60Hz, so the frame buffer is typically scanned out at 60Hz as well, or once every ~16.6ms. Most of that 16.6ms period is spent actively transmitting visible pixel data, but there are brief pauses where the transmitted data doesn’t correspond to visible pixels. First, at the end of every row is a pause called the horizontal blank or HBLANK, and after the last row there’s a pause called the vertical blank or VBLANK. These pauses were physically necessary on old CRT monitors, but they still exist today for legacy reasons and for transmitting metadata.
[Slide timeline: vblank, scan-out, vblank, scan-out, vblank, scan-out, vblank, scan-out, vblank]
If we put this on a timeline, we get scan-out then vblank, scan-out then vblank, and so on, all happening at a fixed 60Hz.
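Because the timeline is periodic, the start of the next scan-out can be computed from the refresh rate alone. The sketch below assumes a fixed 60Hz display and a known reference time for one scan-out start; the function names and the `first_scanout_ms` parameter are hypothetical.

```python
import math

REFRESH_HZ = 60
PERIOD_MS = 1000.0 / REFRESH_HZ  # one scan-out plus one vblank, about 16.67 ms


def next_scanout_start_ms(now_ms, first_scanout_ms=0.0):
    """Return the start time of the first scan-out beginning at or after now_ms."""
    periods_elapsed = math.ceil((now_ms - first_scanout_ms) / PERIOD_MS)
    return first_scanout_ms + max(periods_elapsed, 0) * PERIOD_MS


def wait_until_scanout_ms(now_ms, first_scanout_ms=0.0):
    """Time a finished frame sits in the frame buffer before scan-out begins."""
    return next_scanout_start_ms(now_ms, first_scanout_ms) - now_ms


# A frame that finishes rendering 20 ms in has just missed the scan-out at ~16.67 ms.
wait_ms = wait_until_scanout_ms(20.0)
```

A frame completed at 20 ms waits roughly 13.3 ms for the scan-out that starts near 33.3 ms, which shows how time spent waiting on the fixed scan-out schedule adds to in-engine latency.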