SYNC and the 65CE02 instruction timing


From Wikipedia's 65CE02 page:




Internally, the pipeline of the 65CE02 was redesigned to reduce the
number of cycles required to execute an instruction. The 65CE02 can
recover faster from engagement of the SYNC signal, which reduces the
minimum instruction execution time from 2 cycles to 1 cycle.[3] These
improvements allow the 65CE02 to execute code up to 25% faster than
previous 65xx models.[2]




Are the two points, "internally... redesign" and "recover faster" one and the same, or two different things?



IIRC, SYNC is the result of an opcode fetch, not the cause of it. But the way this is worded suggests the CPU was waiting on SYNC to stabilize or something?



Can someone clarify what the original issue is and what they did to address it? This seems pretty major and I'm curious if this was something that could have happened in earlier generations.










  • I'm not a 6502 guru, but I would read it as meaning the 65CE02 releases the SYNC signal faster than the original 6502, and therefore avoids waiting for a clock cycle before reading the next opcode, which reduces the minimum execution time to 1 clock cycle. If propagation delays in the logic meant that SYNC was originally asserted for 1.01 cycles, that would prevent instruction fetches on successive clock cycles, but reducing that to 0.99 cycles would make all the difference.

    – alephzero
    Jul 15 at 14:53

















6502 instruction-set






edited Jul 18 at 1:15 by chicks

asked Jul 15 at 13:50 by Maury Markowitz












1 Answer
The two points are the same. The signal on the SYNC pin is neither the result nor the cause of an opcode fetch; rather, internal signals in the chip cause both the SYNC pin to go high and the data from the next memory fetch to be treated as the next opcode to execute. The Wikipedia article and the referenced patent are both talking about this separate internal "SYNC" signal, which is similar to, but not the same thing as, what appears on the SYNC pin.



The original 6502 always did a memory access (read or write) with every clock cycle, and always took at least two clock cycles to execute an instruction. For multibyte instructions it would normally set up things internally so that as it was reading the last byte, the next memory access would read the next instruction and load it into the appropriate internal CPU latches, where it would be ready to execute as soon as the current instruction had completed executing.



Single-byte instructions, however, still took two cycles so the second cycle would read the next byte but, since that wasn't further data for the current instruction, what was read would just be ignored. On the subsequent clock cycle (the third since the instruction had been read), the same memory location that had just been read would be read again, and this would load the internal latches with the next instruction to be executed.



A good example of this can be seen in the first example in jsbeeb Part Three - 6502 CPU timings.



  1. Cycle 2 reads a TAY (transfer A to Y) instruction from $0002 that takes no arguments.

  2. Cycle 3 reads the subsequent instruction, CLC (clear carry), from $0003 but the read data are ignored by the CPU during this cycle as it executes the TAY.

  3. Cycle 4 re-reads the CLC from $0003, loading the internal chip latches to execute it on the next cycle.

Clearly this could be done a bit more efficiently by changing the internal signalling to understand that, when cycle 4 comes around, the CLC has been loaded already (and presumably the data read has been stored in some appropriate internal latches), so it can be executed now without re-reading it, and the memory controller can continue on to read the next byte from memory. That's what the patent describes.
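To make the difference concrete, here is a minimal sketch in Python of the bus reads for the TAY/CLC example above. It is a toy model, not a cycle-exact emulator: it handles only one-byte instructions, the function names `nmos_6502_trace` and `predecode_trace` are made up for this illustration, and the second function only approximates the idea of keeping the look-ahead byte rather than modelling the actual 65CE02 circuitry or any pipeline fill and drain effects.

```python
# Toy bus-cycle model for one-byte (implied-mode) instructions only.
# Assumption: this is an illustration, not a cycle-exact emulator of any chip.
# Both function names are hypothetical and exist only for this example.

def nmos_6502_trace(program, first_cycle=1):
    """Bus reads for an NMOS-6502-style core: two cycles per instruction.

    program is a list of (address, mnemonic) pairs.
    Returns a list of (cycle, address, note) tuples.
    """
    trace = []
    cycle = first_cycle
    for addr, name in program:
        # First cycle: the opcode itself is fetched.
        trace.append((cycle, addr, f"fetch {name} opcode"))
        # Second cycle: the following byte is read but thrown away.
        trace.append((cycle + 1, addr + 1,
                      f"read next byte, ignored while {name} executes"))
        cycle += 2
    return trace


def predecode_trace(program, first_cycle=1):
    """Bus reads if the look-ahead byte is kept instead of discarded.

    Assumption: every read of the next byte is latched and reused as the
    next opcode, so each one-byte instruction costs one bus cycle.
    """
    trace = []
    cycle = first_cycle
    for addr, name in program:
        trace.append((cycle, addr, f"fetch {name} opcode, kept for later use"))
        cycle += 1
    return trace


if __name__ == "__main__":
    # TAY at $0002 followed by CLC at $0003, starting at cycle 2 as in the
    # jsbeeb example referenced in the answer.
    program = [(0x0002, "TAY"), (0x0003, "CLC")]
    for label, fn in (("NMOS-style", nmos_6502_trace),
                      ("look-ahead kept", predecode_trace)):
        print(label)
        for cycle, addr, note in fn(program, first_cycle=2):
            print(f"  cycle {cycle}: read ${addr:04X}  {note}")
```

In the first trace $0003 is read twice (once ignored, once as the real opcode fetch), which is the redundancy described above; in the second trace each address is read only once.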



And yes, this probably could have happened in earlier generations; it's basically just improved pipelining. However, it does seem to add not-insignificant extra logic, where part of the point of the 6502 was its very low transistor count for its relatively good feature set.* Adding such a feature later (when it's cheaper to do so) introduces the usual problem where changing timings breaks some existing software (games, drivers, copy protection—anything relying on tricks using timing) thus making it less useful as a substitute in existing microcomputer systems.




*For example, the 6502 had significantly more indexed addressing modes than the Intel 8080/8085, despite having not much more than half the transistor count.






  • "And yes, this probably could have happened in earlier generations; it's basically improved pipelining." - and by that I assume this is a design-time issue in the decoder (etc), not something that would be physically difficult due to timing considerations or process?

    – Maury Markowitz
    Jul 15 at 20:08











  • @MauryMarkowitz It's hard to say; I don't know chip design at that low a level. I don't see anything obviously difficult about this, however; it seems mainly a matter of adding extra signals, logic and latches (such as the PRESYNC signal and predecode latch shown in the patent) to allow this extra pipelining.

    – Curt J. Sampson
    Jul 16 at 3:44






  • @MauryMarkowitz No, it wasn't difficult at all and could have been done already with the first 6502. However, they tried to minimize the transistor count, as that was at the core of their strategy of making the chip as cheap as possible. The issue was, BTW, already solved with the 65C02 (see the 1-cycle 'NOPs'). There, though, all other timing was kept compatible, as the CPU was meant as a drop-in replacement and for low-power applications. As a drop-in, it had to keep the timing exactly the same as the NMOS part to avoid screwing up timing-dependent code.

    – Raffzahn
    Jul 16 at 7:20











  • Fascinating @Raffzahn. It is then somewhat ironic that the issue was finally addressed by CBM, who one might argue had the most to lose by changing the timing.

    – Maury Markowitz
    Jul 16 at 13:14











  • @MauryMarkowitz That was quite visible with the C128. Then again, the CE02 was made for the C65, which was supposed not only to run at much higher speeds but also to offer additional video modes. It was expected to run new, more up-to-date software rather than waste time on compatibility, whereas the C02 was really just meant to enable a 1:1 transition. It's important to note that several orders of magnitude more 6502s (cores) have been used in control/communication/consumer devices than there have ever been 6502-based home computers (like 10^10 vs 10^7). There, compatibility is everything.

    – Raffzahn
    Jul 16 at 13:37













Your Answer








StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "648"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
noCode: true, onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);



);













draft saved

draft discarded


















StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fretrocomputing.stackexchange.com%2fquestions%2f11668%2fsync-and-the-65ce02-instruction-timing%23new-answer', 'question_page');

);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









9














The two points are the same. The signal on the SYNC pin is neither the result nor the cause of an opcode fetch; it's internal signals in the chip that cause both the SYNC pin to go high and the data from the next memory fetch to be treated as the next opcode to execute. The Wikipedia article and the referenced patent are both talking about this separate internal "SYNC" signal, similar to but not the same thing as what appears on the SYNC pin.



The original 6502 always did a memory access (read or write) with every clock cycle, and always took at least two clock cycles to execute an instruction. For multibyte instructions it would normally set up things internally so that as it was reading the last byte, the next memory access would read the next instruction and load it into the appropriate internal CPU latches, where it would be ready to execute as soon as the current instruction had completed executing.



Single-byte instructions, however, still took two cycles so the second cycle would read the next byte but, since that wasn't further data for the current instruction, what was read would just be ignored. On the subsequent clock cycle (the third since the instruction had been read), the same memory location that had just been read would be read again, and this would load the internal latches with the next instruction to be executed.



A good example of this can be seen in the first example in jsbeeb Part Three - 6502 CPU timings.



  1. Cycle 2 reads a TAY (transfer A to Y) instruction from $0002 that takes no arguments.

  2. Cycle 3 reads the subsequent instruction, CLC (clear carry), from $0003 but the read data are ignored by the CPU during this cycle as it executes the TAY.

  3. Cycle 4 re-reads the CLC from $0003, loading the internal chip latches to execute it on the next cycle.

Clearly this could be done a bit more efficiently by changing the internal signalling to understand that, when cycle 4 comes around, the CLC has been loaded already (and presumbably the data read has been stored in some appropriate internal latches) and so that can be executed now, without re-reading it, and the memory controller can continue on reading the next byte from memory. That's what the patent describes.



And yes, this probably could have happened in earlier generations; it's basically just improved pipelining. However, it does seem to add not-insignificant extra logic, where part of the point of the 6502 was its very low transistor count for its relatively good feature set.* Adding such a feature later (when it's cheaper to do so) introduces the usual problem where changing timings breaks some existing software (games, drivers, copy protection—anything relying on tricks using timing) thus making it less useful as a substitute in existing microcomputer systems.




*For example, the 6502 had significantly more indexed addressing modes than the Intel 8080/8085, despite having not much more than half the transistor count.






share|improve this answer

























  • "And yes, this probably could have happened in earlier generations; it's basically improved pipelining." - and by that I assume this is a design-time issue in the decoder (etc), not something that would be physically difficult due to timing considerations or process?

    – Maury Markowitz
    Jul 15 at 20:08











  • @MauryMarkowitz It's hard to say; I don't know chip design at that low a level. I don't see anything obviously difficult about this, however; it seems mainly a matter of adding extra signals, logic and latches (such as the PRESYNC signal and predecode latch shown in the patent) to allow this extra pipelining.

    – Curt J. Sampson
    Jul 16 at 3:44






  • 1





    @MauryMarkowitz No, it wasn't difficult at all and could have been done already with the first 6502. Except, they tried to minimize the transistor count as that was at the core of their strategy, making the chip as cheap as possible. The issue was, BTW, already solved with the 65C02 (see the 1 cycle 'NOPs'). Just here all timing was kept compatible, as the CPU was meant as a drop in replacement and for low power applications. As drop in, it had to keep the timing exact as the NMOS to avoid screwing up timing dependant code.

    – Raffzahn
    Jul 16 at 7:20











  • Facinating @Raffzahn. It is then somewhat ironic that the issue was finally addressed by CBM, who one might argue had the most to lose by changing the timing.

    – Maury Markowitz
    Jul 16 at 13:14











  • @MauryMarkowitz That was quite visible with the C128. Then again the CE02 was made for the C65 supposed to run not only at way higher speeds but as well offering additional video modes. It was rather expected to run new, more up to date software than wasting time with compatibility. Weras the C02 was really just to enable a 1:1 transition. It's important that several magnitude more 6502 (cores) have been used in controlling/communication/consumer devices than there ever been 6502 based home computers (like 10^10 vs 10^7). Here compatibility is everything.

    – Raffzahn
    Jul 16 at 13:37















9














The two points are the same. The signal on the SYNC pin is neither the result nor the cause of an opcode fetch; it's internal signals in the chip that cause both the SYNC pin to go high and the data from the next memory fetch to be treated as the next opcode to execute. The Wikipedia article and the referenced patent are both talking about this separate internal "SYNC" signal, similar to but not the same thing as what appears on the SYNC pin.



The original 6502 always did a memory access (read or write) with every clock cycle, and always took at least two clock cycles to execute an instruction. For multibyte instructions it would normally set up things internally so that as it was reading the last byte, the next memory access would read the next instruction and load it into the appropriate internal CPU latches, where it would be ready to execute as soon as the current instruction had completed executing.



Single-byte instructions, however, still took two cycles so the second cycle would read the next byte but, since that wasn't further data for the current instruction, what was read would just be ignored. On the subsequent clock cycle (the third since the instruction had been read), the same memory location that had just been read would be read again, and this would load the internal latches with the next instruction to be executed.



A good example of this can be seen in the first example in jsbeeb Part Three - 6502 CPU timings.



  1. Cycle 2 reads a TAY (transfer A to Y) instruction from $0002 that takes no arguments.

  2. Cycle 3 reads the subsequent instruction, CLC (clear carry), from $0003 but the read data are ignored by the CPU during this cycle as it executes the TAY.

  3. Cycle 4 re-reads the CLC from $0003, loading the internal chip latches to execute it on the next cycle.

Clearly this could be done a bit more efficiently by changing the internal signalling to understand that, when cycle 4 comes around, the CLC has been loaded already (and presumbably the data read has been stored in some appropriate internal latches) and so that can be executed now, without re-reading it, and the memory controller can continue on reading the next byte from memory. That's what the patent describes.



And yes, this probably could have happened in earlier generations; it's basically just improved pipelining. However, it does seem to add not-insignificant extra logic, where part of the point of the 6502 was its very low transistor count for its relatively good feature set.* Adding such a feature later (when it's cheaper to do so) introduces the usual problem where changing timings breaks some existing software (games, drivers, copy protection—anything relying on tricks using timing) thus making it less useful as a substitute in existing microcomputer systems.




*For example, the 6502 had significantly more indexed addressing modes than the Intel 8080/8085, despite having not much more than half the transistor count.






share|improve this answer

























  • "And yes, this probably could have happened in earlier generations; it's basically improved pipelining." - and by that I assume this is a design-time issue in the decoder (etc), not something that would be physically difficult due to timing considerations or process?

    – Maury Markowitz
    Jul 15 at 20:08











  • @MauryMarkowitz It's hard to say; I don't know chip design at that low a level. I don't see anything obviously difficult about this, however; it seems mainly a matter of adding extra signals, logic and latches (such as the PRESYNC signal and predecode latch shown in the patent) to allow this extra pipelining.

    – Curt J. Sampson
    Jul 16 at 3:44






  • 1





    @MauryMarkowitz No, it wasn't difficult at all and could have been done already with the first 6502. Except, they tried to minimize the transistor count as that was at the core of their strategy, making the chip as cheap as possible. The issue was, BTW, already solved with the 65C02 (see the 1 cycle 'NOPs'). Just here all timing was kept compatible, as the CPU was meant as a drop in replacement and for low power applications. As drop in, it had to keep the timing exact as the NMOS to avoid screwing up timing dependant code.

    – Raffzahn
    Jul 16 at 7:20











  • Facinating @Raffzahn. It is then somewhat ironic that the issue was finally addressed by CBM, who one might argue had the most to lose by changing the timing.

    – Maury Markowitz
    Jul 16 at 13:14











  • @MauryMarkowitz That was quite visible with the C128. Then again the CE02 was made for the C65 supposed to run not only at way higher speeds but as well offering additional video modes. It was rather expected to run new, more up to date software than wasting time with compatibility. Weras the C02 was really just to enable a 1:1 transition. It's important that several magnitude more 6502 (cores) have been used in controlling/communication/consumer devices than there ever been 6502 based home computers (like 10^10 vs 10^7). Here compatibility is everything.

    – Raffzahn
    Jul 16 at 13:37













9












9








9







The two points are the same. The signal on the SYNC pin is neither the result nor the cause of an opcode fetch; it's internal signals in the chip that cause both the SYNC pin to go high and the data from the next memory fetch to be treated as the next opcode to execute. The Wikipedia article and the referenced patent are both talking about this separate internal "SYNC" signal, similar to but not the same thing as what appears on the SYNC pin.



The original 6502 always did a memory access (read or write) with every clock cycle, and always took at least two clock cycles to execute an instruction. For multibyte instructions it would normally set up things internally so that as it was reading the last byte, the next memory access would read the next instruction and load it into the appropriate internal CPU latches, where it would be ready to execute as soon as the current instruction had completed executing.



Single-byte instructions, however, still took two cycles so the second cycle would read the next byte but, since that wasn't further data for the current instruction, what was read would just be ignored. On the subsequent clock cycle (the third since the instruction had been read), the same memory location that had just been read would be read again, and this would load the internal latches with the next instruction to be executed.



A good example of this can be seen in the first example in jsbeeb Part Three - 6502 CPU timings.



  1. Cycle 2 reads a TAY (transfer A to Y) instruction from $0002 that takes no arguments.

  2. Cycle 3 reads the subsequent instruction, CLC (clear carry), from $0003 but the read data are ignored by the CPU during this cycle as it executes the TAY.

  3. Cycle 4 re-reads the CLC from $0003, loading the internal chip latches to execute it on the next cycle.

Clearly this could be done a bit more efficiently by changing the internal signalling to understand that, when cycle 4 comes around, the CLC has been loaded already (and presumbably the data read has been stored in some appropriate internal latches) and so that can be executed now, without re-reading it, and the memory controller can continue on reading the next byte from memory. That's what the patent describes.



And yes, this probably could have happened in earlier generations; it's basically just improved pipelining. However, it does seem to add not-insignificant extra logic, where part of the point of the 6502 was its very low transistor count for its relatively good feature set.* Adding such a feature later (when it's cheaper to do so) introduces the usual problem where changing timings breaks some existing software (games, drivers, copy protection—anything relying on tricks using timing) thus making it less useful as a substitute in existing microcomputer systems.




*For example, the 6502 had significantly more indexed addressing modes than the Intel 8080/8085, despite having not much more than half the transistor count.






share|improve this answer















The two points are the same. The signal on the SYNC pin is neither the result nor the cause of an opcode fetch; it's internal signals in the chip that cause both the SYNC pin to go high and the data from the next memory fetch to be treated as the next opcode to execute. The Wikipedia article and the referenced patent are both talking about this separate internal "SYNC" signal, similar to but not the same thing as what appears on the SYNC pin.



The original 6502 always did a memory access (read or write) with every clock cycle, and always took at least two clock cycles to execute an instruction. For multibyte instructions it would normally set up things internally so that as it was reading the last byte, the next memory access would read the next instruction and load it into the appropriate internal CPU latches, where it would be ready to execute as soon as the current instruction had completed executing.



Single-byte instructions, however, still took two cycles so the second cycle would read the next byte but, since that wasn't further data for the current instruction, what was read would just be ignored. On the subsequent clock cycle (the third since the instruction had been read), the same memory location that had just been read would be read again, and this would load the internal latches with the next instruction to be executed.



A good example of this can be seen in the first example in jsbeeb Part Three - 6502 CPU timings.



  1. Cycle 2 reads a TAY (transfer A to Y) instruction from $0002 that takes no arguments.

  2. Cycle 3 reads the subsequent instruction, CLC (clear carry), from $0003 but the read data are ignored by the CPU during this cycle as it executes the TAY.

  3. Cycle 4 re-reads the CLC from $0003, loading the internal chip latches to execute it on the next cycle.
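The three steps above can be reproduced with a toy bus trace. This is only a sketch under stated assumptions, not a model of the real chip: cycles are numbered from 0, every instruction is assumed to take exactly two cycles, and the two-byte instruction at $0000 is assumed here to be an LDA immediate (the steps above only tell us that TAY sits at $0002). The name `trace_nmos_6502` is made up for illustration.

```python
def trace_nmos_6502(program, start=0x0000):
    """Return one (cycle, address, note) tuple per bus read.

    `program` is a list of (mnemonic, length) pairs. Every instruction is
    assumed to take exactly two cycles (implied or immediate addressing),
    which is a simplification for illustration only.
    """
    trace = []
    pc = start
    cycle = 0
    for mnemonic, length in program:
        # First cycle: the opcode itself is fetched.
        trace.append((cycle, pc, f"fetch opcode {mnemonic}"))
        cycle += 1
        if length == 2:
            # Second cycle of a two-byte instruction: the operand byte.
            trace.append((cycle, pc + 1, f"read operand of {mnemonic}"))
        else:
            # Second cycle of a one-byte instruction: the next byte is
            # read anyway, but the CPU throws the data away.
            trace.append((cycle, pc + 1, "read next byte, discarded"))
        cycle += 1
        pc += length
    return trace


# Assumed program layout: a two-byte instruction, then TAY, then CLC.
program = [("LDA #", 2), ("TAY", 1), ("CLC", 1)]
for cycle, address, note in trace_nmos_6502(program):
    print(f"cycle {cycle}: ${address:04X} {note}")
```

In the printed trace, $0003 shows up twice in a row: once as the discarded read during TAY and once as the real opcode fetch for CLC.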

Clearly this could be done a bit more efficiently by changing the internal signalling so that, when cycle 4 comes around, the CPU knows the CLC has already been loaded (presumably the data read has been stored in some appropriate internal latches) and can execute it right away, without re-reading it, while the memory controller continues on to read the next byte from memory. That's what the patent describes.
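To make the saving concrete, here is a rough sketch that lists the addresses read with and without keeping the byte fetched during a one-byte instruction. It is a simplified model, not the patent's actual logic (it knows nothing about PRESYNC or the predecode latch), and it only handles two-cycle instruction forms. The function name `bus_reads`, the `reuse_prefetch` flag and the LDA immediate at $0000 are assumptions made for illustration.

```python
def bus_reads(program, start=0x0000, reuse_prefetch=False):
    """Return the list of addresses read, one entry per clock cycle.

    `program` is a list of (mnemonic, length) pairs restricted to
    instructions with implied (1 byte) or immediate (2 byte) addressing.
    With reuse_prefetch=False the byte after a one-byte opcode is read
    and discarded, then read again as the next opcode. With
    reuse_prefetch=True that single read is kept as the next opcode.
    """
    reads = []
    pc = start
    for _mnemonic, length in program:
        reads.append(pc)          # opcode fetch
        if length == 2:
            reads.append(pc + 1)  # operand byte
        elif not reuse_prefetch:
            reads.append(pc + 1)  # read that will be thrown away
        # With reuse_prefetch, a one-byte instruction adds no extra read:
        # the next iteration's opcode fetch is that same bus cycle.
        pc += length
    return reads


program = [("LDA #", 2), ("TAY", 1), ("CLC", 1)]
classic = bus_reads(program)
improved = bus_reads(program, reuse_prefetch=True)
print("classic: ", [f"${a:04X}" for a in classic], len(classic), "cycles")
print("improved:", [f"${a:04X}" for a in improved], len(improved), "cycles")
```

Under these assumptions the same three instructions need six bus cycles in the classic scheme and four when the prefetched byte is kept, and no address is read twice in the second case.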



edited Jul 16 at 7:12

























answered Jul 15 at 14:59









Curt J. Sampson












  • "And yes, this probably could have happened in earlier generations; it's basically improved pipelining." - and by that I assume this is a design-time issue in the decoder (etc), not something that would be physically difficult due to timing considerations or process?

    – Maury Markowitz
    Jul 15 at 20:08











  • @MauryMarkowitz It's hard to say; I don't know chip design at that low a level. I don't see anything obviously difficult about this, however; it seems mainly a matter of adding extra signals, logic and latches (such as the PRESYNC signal and predecode latch shown in the patent) to allow this extra pipelining.

    – Curt J. Sampson
    Jul 16 at 3:44






  • @MauryMarkowitz No, it wasn't difficult at all and could have been done already with the first 6502. Except that they tried to minimize the transistor count, as that was at the core of their strategy of making the chip as cheap as possible. The issue was, BTW, already solved with the 65C02 (see the 1 cycle 'NOPs'). There, though, all timing was kept compatible, as the CPU was meant as a drop-in replacement and for low-power applications. As a drop-in, it had to keep the timing exactly the same as the NMOS part to avoid breaking timing-dependent code.

    – Raffzahn
    Jul 16 at 7:20











  • Fascinating @Raffzahn. It is then somewhat ironic that the issue was finally addressed by CBM, who one might argue had the most to lose by changing the timing.

    – Maury Markowitz
    Jul 16 at 13:14











  • @MauryMarkowitz That was quite visible with the C128. Then again, the CE02 was made for the C65, which was supposed not only to run at much higher speeds but also to offer additional video modes. It was expected to run new, more up-to-date software rather than waste time on compatibility, whereas the C02 was really just meant to enable a 1:1 transition. It's important to note that several orders of magnitude more 6502s (cores) have been used in control/communication/consumer devices than there have ever been 6502-based home computers (like 10^10 vs 10^7). There, compatibility is everything.

    – Raffzahn
    Jul 16 at 13:37
















