What information exactly does an instruction cache store?
Processors use both data and instruction caches in order to reduce the number of slow accesses to main memory. However, while it is clear to me that the data cache's purpose is to store frequently used data items (such as elements in an array or inside a loop), I cannot see what exactly the instruction cache stores that helps alleviate memory access times.
Consider, for example, an "addi" instruction which adds a constant value to the value stored in general-purpose register "r2" and writes the result to general-purpose register "r1".
After this instruction is executed, what exactly is saved to the cache?
- It can't just be the opcode: most CPU instruction sets contain only a few hundred unique opcodes or fewer, so if the instruction cache were pre-loaded with all possible opcodes, it would always have a 100% hit rate. That would defeat the purpose of having a cache, and I've read that instruction cache misses are very much possible.
- It can't be the values from main memory which are loaded into the general purpose registers, since that's exactly what the data cache is for.
Thank you in advance.
memory cpu computer-architecture cache
asked May 13 at 16:39 by MartinX (new contributor)
– Dmitry Grigoryev (May 14 at 6:50): Why do you think it matters to the cache if a particular instruction was executed or not? Instructions usually don't change at runtime.
3 Answers
It literally stores lines of machine code from program memory (i.e. entire instructions, like the one you list in your original post).
The fact you even discuss "storing all possible op codes in cache" points to a deeper misunderstanding. Talking about storing all possible op codes in cache (or any memory for that matter) has no meaning. All the possible opcodes that the processor can run are hard-wired into the logic circuitry of the processor. They aren't "stored" anywhere.
answered May 13 at 16:59 by Toor (edited May 13 at 17:04)
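To make "lines of machine code" concrete, here is a minimal C sketch of a direct-mapped instruction cache; the sizes, names, and the addi encoding are invented for illustration, and no real CPU is organized exactly like this. Each line holds a valid bit, an address tag, and the raw instruction bytes copied verbatim from program memory, and a second fetch of the same address is served from those cached bytes:

    #include <stdint.h>
    #include <string.h>
    #include <stdio.h>

    /* Toy direct-mapped instruction cache: 16 lines of 32 bytes each.
       All sizes and names are made up for illustration. */
    #define LINE_BYTES 32
    #define NUM_LINES  16

    static uint8_t main_memory[64 * 1024];    /* holds the program's machine code */

    struct icache_line {
        int      valid;                       /* has this line been filled yet?   */
        uint32_t tag;                         /* which region of memory it holds  */
        uint8_t  bytes[LINE_BYTES];           /* raw machine-code bytes, verbatim */
    };
    static struct icache_line icache[NUM_LINES];

    /* Fetch the 4-byte, 4-byte-aligned instruction word at address pc. */
    static uint32_t fetch(uint32_t pc)
    {
        uint32_t offset = pc % LINE_BYTES;
        uint32_t index  = (pc / LINE_BYTES) % NUM_LINES;
        uint32_t tag    =  pc / (LINE_BYTES * NUM_LINES);
        struct icache_line *line = &icache[index];

        if (!line->valid || line->tag != tag) {                          /* miss */
            memcpy(line->bytes, &main_memory[pc - offset], LINE_BYTES); /* slow refill */
            line->tag   = tag;
            line->valid = 1;
            printf("miss at 0x%04x (refill line %u)\n", (unsigned)pc, (unsigned)index);
        } else {
            printf("hit  at 0x%04x\n", (unsigned)pc);
        }

        uint32_t insn;
        memcpy(&insn, &line->bytes[offset], sizeof insn);   /* the cached encoding */
        return insn;
    }

    int main(void)
    {
        uint32_t addi = 0x20410005u;   /* illustrative 32-bit encoding of an addi r1, r2, 5 */
        memcpy(&main_memory[0x100], &addi, sizeof addi);

        fetch(0x100);                  /* miss: line filled from main memory */
        fetch(0x100);                  /* hit: served from the cached bytes  */
        return 0;
    }

The payload of each line is just the instruction encodings as they sit in memory; the tag and valid bit are the bookkeeping that tells the CPU which addresses those bytes correspond to.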
– Mark (May 13 at 20:29): Note that this is only true for most CPUs. Recent Intel x86s store decoded micro-operations (i.e. the output of an early stage of the execution process), and I think AMD may have also switched to a micro-op cache rather than a strict instruction cache.

– Toor (May 13 at 21:50): @MartinX When you say "the entire instructions were somehow hard wired", are you saying you thought something like the entire "ADD, Reg1, Reg2" was hardwired, and then something like "ADD, Reg2, Reg3" was a separate hard-wiring? Because that's not the case. Not every possible combination of opcode and argument has unique circuitry hard-wired into the CPU.

– Peter Cordes (May 13 at 22:04): @Mark: Intel P4 had a trace cache instead of an L1i cache. This worked out badly and was a big bottleneck (because it was slow to build traces on misses with its weak decoders). Intel since Sandy Bridge (realworldtech.com/sandy-bridge) and AMD since Zen still have regular L1i caches that cache x86 machine code bytes, but also have smaller, very fast decoded-uop caches. They still have powerful decoders for good throughput on uop cache misses, and it's not a trace cache. (A uop cache line can only cache contiguous uops from one 32B chunk, instead of following jumps.)

– Peter Cordes (May 13 at 22:10): @Mark: Some older AMD CPUs do store extra metadata alongside the L1i cache: they mark instruction boundaries in the cache to speed up decode. See Agner Fog's microarch PDF. David Kanter also mentions the pre-decode metadata in realworldtech.com/bulldozer/4, with more in his K10 write-up: realworldtech.com/barcelona/4

– Peter Cordes (May 13 at 22:16): @Toor: Intel calls their decoded-uop cache the "Decode Stream Buffer (DSB)", including in HW perf counter event names. Physically, it's very much built as an associative cache, with each "way" of a set holding up to 6 uops. It's indexed and tagged by virtual address (so it bypasses TLB lookups). Of course caches are built out of "tightly coupled" SRAM arrays, but what makes them caches is the management system and the lookup/indexing mechanism.
The instruction cache stores the most recently used instructions and their addresses so that if an instruction needs to be repeated it doesn't have to be retrieved from main memory; fetching it from the cache is much quicker.
For example, the first time a loop is executed its instructions are retrieved from main memory and simultaneously placed into the cache. On subsequent iterations of the loop the instructions can then be retrieved quickly from the fast cache memory.
The addresses are stored in the cache together with information that indicates whether the cached copy is up to date, so the CPU's control logic knows whether it can use the cached instructions or needs to go to main memory.
answered May 13 at 17:01 by Kevin White (edited May 13 at 17:15)
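As a rough illustration of the loop behaviour this answer describes (all sizes and addresses are invented for the sketch), the toy count below assumes a loop of 16 four-byte instructions that happens to fit in one 64-byte cache line: the first pass misses and pulls the line in from main memory, and every later fetch hits:

    #include <stdio.h>
    #include <stdbool.h>

    int main(void)
    {
        enum { LINE_BYTES = 64, LOOP_START = 0x1000, LOOP_INSNS = 16, ITERS = 1000 };
        static bool block_cached[1 << 16];   /* "is this 64-byte block in the cache?" */
        long hits = 0, misses = 0;

        for (int i = 0; i < ITERS; i++) {
            for (unsigned pc = LOOP_START; pc < LOOP_START + 4 * LOOP_INSNS; pc += 4) {
                unsigned block = pc / LINE_BYTES;
                if (block_cached[block]) {
                    hits++;                  /* fast: instruction already in the cache */
                } else {
                    misses++;                /* slow: fetch the line from main memory  */
                    block_cached[block] = true;
                }
            }
        }
        printf("hits=%ld misses=%ld\n", hits, misses);   /* prints hits=15999 misses=1 */
        return 0;
    }

In a real cache the tag and valid-bit bookkeeping decides hit versus miss; here that check is collapsed into a single per-block flag to keep the sketch short.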
– Shamtam (May 13 at 17:07): Good answer. May be worth emphasizing that the instructions are placed into cache as they are retrieved from memory (and indeed, before they are executed) to clear up the OP's potential misunderstanding that the instruction is saved to cache "after [it] is executed."
The instruction cache stores the individual instructions of the currently executing program for the CPU; it is the program itself. Main memory is often too slow (or has too much latency) to feed the CPU its next instruction every time it is ready for one. That is why a fast cache near the CPU is used: this is the instruction cache.
answered May 13 at 16:49 by evildemonic (edited May 13 at 19:11)
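To put rough numbers on "main memory is often too slow", here is a back-of-the-envelope average fetch-latency calculation; the latency and miss-rate figures are invented for illustration, not measurements of any real part:

    #include <stdio.h>

    int main(void)
    {
        /* All figures below are assumptions made up for this sketch. */
        double hit_ns    = 1.0;     /* assumed instruction-cache hit latency     */
        double dram_ns   = 100.0;   /* assumed main-memory (DRAM) access latency */
        double miss_rate = 0.02;    /* assumed 2% instruction-cache miss rate    */

        double with_cache    = hit_ns + miss_rate * dram_ns;  /* avg. fetch latency  */
        double without_cache = dram_ns;                       /* every fetch -> DRAM */

        printf("with cache:    %.1f ns per instruction fetch\n", with_cache);    /* 3.0   */
        printf("without cache: %.1f ns per instruction fetch\n", without_cache); /* 100.0 */
        return 0;
    }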