While we’re digressing (will try to limit the extent… ps. I failed)…
As for programming, the (almost) lost art of Assembler is frowned upon in many quarters as no longer relevant.
More money for me! You cannot truly understand how a computer operates without having that assembly language->machine code->bare metal thing click in your mind. Only people who have written at least SOME assembly, and have seen a CPU diagram or two… understand what a CPU is, and hence “how” a computer works. I agree this is sadly becoming a lost piece of understanding.
When writing fast, robust, numerically intensive solution engines for scientific or engineering applications, one can of course do this in a high-level language, but the programmer has a significant advantage if s/he knows what the compiled code looks like at the CPU level. Compilers often blindly add bits of library code that the programmer may not even be aware of and that are unnecessary for the code in question.
Indeed, but as I think you imply, I’d still write it in good C++. The difference these days between that and raw assembly is negligible in all but the most extreme cases. Your I/O operations cost far more than CPU cycles, which can almost be seen as irrelevant. If you can do cache or memory optimisation… then yes, but from what I’ve read mere mortals can’t understand a modern cache system well enough to beat it with userland code, and memory access will usually be governed by a (relatively) expensive call into your OS anyway, which may just decide to swap your highly-hand-optimised memory access out to hard disk, unlucky. In 99.9999999% of cases, even some cases where people think they can do better, the compiler will be better. That part of the “forget assembly” argument I buy.
But then a (good) C/C++ compiler for a PIC microchip costs a buttload of money, out of the range of a hobbyist. So there I write assembly. But even on a 20 MHz PIC your code gets executed so freaking fast (no OS, no task-switching, no nothing, your code line by line at 20 MHz, is actually pretty amazing) that I’ve not yet found a situation where a PIC isn’t sitting idle most of the time. Hence they tend to build all kinds of idle-switching, power-save modes, etc. into them; even if your code has to run once every 10 ms, the chip can still go to sleep, save some power, and wake up again in time to do its job.
Also, CPU-specific optimisations, like effective multi-pipelining of concurrent instruction streams or instruction set extensions, are often lacking even from the best compilers.
It’s (a very unfortunate) practicality thing… usually people will compile to target i686 or even earlier architectures to ensure backwards compatibility. Very seldom do you see someone custom-roll a bleeding-edge compile for the latest-and-greatest architectural advances. Perhaps more common in the sciences, but not very common in consumer software. AFAIK Intel’s compiler is the best when it comes to stuff like this (only logical), but I haven’t had the need to investigate this a lot.
It comes down to an understanding of what the code does at the CPU level, an understanding that can help eliminate a host of problems and inefficiencies before they occur.
A nice example of a bad problem is memory alignment issues. If a C++ developer doesn’t understand what the compiler is going to do with certain data structures, he’s gonna have code that works on his machine (probably by fluke) but crashes badly on another platform, and he may have no idea why.