cpu-architecture – IT Nursery

Why is a boolean 1 byte and not 1 bit of size?

June 3, 2022 by IT Nursery

In C++, why is a boolean 1 byte and not 1 bit in size? Why aren’t there types like 4-bit or 2-bit integers? I miss these when writing an emulator for a CPU. 13 Answers
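As a small illustration of the question, here is a hedged C++ sketch of two common ways to work with sub-byte values. It is not taken from the original post or its answers. The names `Nibbles`, `pack_nibbles`, `high_nibble`, `low_nibble`, and `self_check` are hypothetical. The underlying point is that a byte is the smallest addressable unit in C++, so `sizeof(bool)` is at least 1, and narrower values are packed by hand or with bit-fields.

```cpp
#include <cassert>
#include <cstdint>

// sizeof counts bytes, and every complete object type occupies at least one.
static_assert(sizeof(bool) >= 1, "bool occupies at least one byte");

// Hypothetical example type: two 4-bit fields declared as bit-fields.
// The exact layout is implementation-defined; on common compilers the
// whole struct fits in one byte, but this sketch does not rely on that.
struct Nibbles {
    std::uint8_t lo : 4;  // holds values 0..15
    std::uint8_t hi : 4;  // holds values 0..15
};

// Hypothetical helper: pack two 4-bit values into one byte by hand.
constexpr std::uint8_t pack_nibbles(std::uint8_t hi, std::uint8_t lo) {
    return static_cast<std::uint8_t>(((hi & 0x0F) << 4) | (lo & 0x0F));
}

// Extract the upper four bits of a byte.
constexpr std::uint8_t high_nibble(std::uint8_t b) {
    return static_cast<std::uint8_t>(b >> 4);
}

// Extract the lower four bits of a byte.
constexpr std::uint8_t low_nibble(std::uint8_t b) {
    return static_cast<std::uint8_t>(b & 0x0F);
}

// Compile-time check of the manual packing round trip.
static_assert(pack_nibbles(0xA, 0x5) == 0xA5, "packing round trip");

// Runtime check that the bit-fields store and return 4-bit values.
inline void self_check() {
    Nibbles n{};
    n.lo = 0x5;
    n.hi = 0xA;
    assert(n.lo == 0x5 && n.hi == 0xA);
}
```

For an emulator, manual masking and shifting as in `pack_nibbles` keeps the bit layout explicit and portable, whereas bit-field layout depends on the compiler.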

What is the difference between Trap and Interrupt?

May 30, 2022 by IT Nursery

What is the difference between Trap and Interrupt? If the terminology is different for different systems, then what do they mean on x86? 10 Answers

What Every Programmer Should Know About Memory?

May 29, 2022 by IT Nursery

I am wondering how much of Ulrich Drepper’s What Every Programmer Should Know About Memory from 2007 is still valid. I also could not find a version newer than 1.0, or any errata. (It is also available as a PDF on Ulrich Drepper’s own site: https://www.akkadia.org/drepper/cpumemory.pdf) 3 Answers

What is the purpose of the “Prefer 32-bit” setting in Visual Studio and how does it actually work?

May 25, 2022 by IT Nursery

It is unclear to me how the compiler will automatically know to compile for 64-bit when it needs to. How does it know when it can confidently target 32-bit? I am mainly curious about how the compiler knows which architecture to target when compiling. Does it analyze the code and make a decision based on … Read more

Difference between core and processor

May 23, 2022 by IT Nursery

What is the difference between a core and a processor? I’ve already looked for it on Google, but I only get definitions for multi-core and multi-processor, which is not what I am looking for. 7 Answers

What is a retpoline and how does it work?

May 20, 2022 by IT Nursery

To mitigate kernel or cross-process memory disclosure (the Spectre attack), the Linux kernel will be compiled with a new option, -mindirect-branch=thunk-extern, introduced to gcc to perform indirect calls through a so-called retpoline. This appears to be a newly invented term, as a Google search turns up only very recent use (generally all … Read more

Deoptimizing a program for the pipeline in Intel Sandybridge-family CPUs

May 13, 2022 by IT Nursery

I’ve been racking my brain for a week trying to complete this assignment and I’m hoping someone here can lead me toward the right path. Let me start with the instructor’s instructions: Your assignment is the opposite of our first lab assignment, which was to optimize a prime number program. Your purpose in this assignment … Read more

How do I achieve the theoretical maximum of 4 FLOPs per cycle?

April 28, 2022 by IT Nursery

How can the theoretical peak performance of 4 floating point operations (double precision) per cycle be achieved on a modern x86-64 Intel CPU? As far as I understand, it takes three cycles for an SSE add and five cycles for a mul to complete on most modern Intel CPUs (see for example Agner … Read more

Why is processing a sorted array faster than processing an unsorted array?

April 9, 2022 by IT Nursery

Here is a piece of C++ code that shows some very peculiar behavior. For some strange reason, sorting the data (before the timed region) miraculously makes the loop almost six times faster. #include <algorithm> #include <ctime> #include <iostream> int main() { // Generate data const unsigned arraySize = 32768; int data[arraySize]; for (unsigned c = … Read more