Fwd: [sdiy] was how_you_got_started...

Jay Vaughan (ibisum) ibisum at gmail.com
Thu Oct 8 09:47:42 CEST 2015


Great discussion going on right now on synth-diy about development tools, and I figured - in case we’re not all already on synth-diy - some of you guys might get a kick out of this post:

j.


> Begin forwarded message:
> 
> From: Olivier Gillet <ol.gillet at gmail.com>
> Date: September 30, 2015 at 00:05:49 GMT+2
> To: Phillip Harbison <alvitar at xavax.com>
> Cc: synth-diy diy <synth-diy at dropmix.xs4all.nl>
> Subject: Re: [sdiy] was how_you_got_started...
> 
>> At the lowest level, the main overhead of C++ is the virtual
>> function table and the code to navigate that table, all of
>> which is constructed by the compiler and linker. Otherwise,
>> C++ code should perform about the same as C code.
> 
> Note that this extra code is generated only if you use dynamic
> polymorphism (virtual functions).
> 
> Other C++ features that can possibly cause bloating:
> * Exceptions (code size increase of function intros/outtros + extra
> RAM for bookkeeping)
> * RTTI (at least one extra pointer stored in each object + some tables
> in RAM for each type)
> http://www.hexblog.com/wp-content/uploads/2012/06/Recon-2012-Skochinsky-Compiler-Internals.pdf
> 
> None of these are really necessary - and there is a fair amount of
> large C++ projects that do not make use of these features (especially
> the last 2).
> 
> Some of my favorite C++ features have no overhead at all, entirely
> take place during compilation, and have nothing to do with objects:
> * Data access policies (const, references)
> * Evaluation of expressions during compilation
> * Syntaxic sugar (auto, range loops, default arguments to functions)
> * Functors (and now lambda expressions) leading to the generation of
> very efficient code for filtering/traversing data-structures (eg
> filtering a list of MIDI events sorted by time in a sequencer).
> * Cleanup through RAII ("Set the CS line high or finish the
> transaction with this peripheral irrespectively of the exit point of
> this function").
> 
> That's actually the point of the language and what makes it unique -
> that all the towers of abstraction it provides can be made to collapse
> during compilation. Best example is templates. Yes, they are
> controversial - they can bloat code size, but they are my favourite
> C++ feature for expressing computations in a way that the compiler can
> see through.
> 
> Let's say I want to build a granular sample player. It can work in a
> mono and a stereo mode, using 8-bit and 16-bit samples, and depending
> on some settings, with or without a complex envelope on the grains. If
> I write the code naively in C or assembly, the code will be littered
> with if statements checking for these conditions inside the audio
> rendering loop. The tests will hamper code performance (taking time to
> execute the tests, mess with prefetch, waste some registers). So in
> such situations, you really want to create as many rendering loops as
> there are combinations of parameters - each of these rendering loops
> being tight and stripped of any test. While in C or assembly you'd
> have to duplicate the loops (or maybe use some ugly pre-processor
> macro tricks), you can do this in C++ with templates. The grain
> rendering loop from my module Clouds looks like this:
> https://github.com/pichenettes/eurorack/blob/master/clouds/dsp/grain.h#L120
> . And this actually gets compiled into 2 x 3 x 2 variants for all
> combinations of #channels, buffer resolution, and grain quality - each
> of them very deeply inlined and almost branch-free. Of course this
> generates 25k of highly redundant code - but factoring all the
> configuration tests out of the loop gave a 20-40% boost in
> performance.
> 
> I used a similar technique to write reusable oscillator code that is
> configured at compile time (sketched from this:
> https://github.com/pichenettes/polyblep/blob/master/dsp/oscillator.h#L193)
> - you can configure whether parameters like frequency or waveshape are
> provided at audio rate, linearly interpolated from control rate, or
> set once per block at control rate. In the context of the development
> of a module, you can quickly check which variant is feasible with your
> computational budget - you get generic code that is "flattened" for
> each project you use it in. Or you can include two variants, one
> suitable for a higher quality, lower polyphony mode.
> 
> Another nice application is FFTs. Chip manufacturers give you these
> nice little fft libraries in which the FFT size is an argument. But do
> you often write programs in which the FFT size is decided only at run
> time? If the FFT size is known at compile time, you can write your FFT
> code as a template
> (http://www.drdobbs.com/cpp/a-simple-and-efficient-fft-implementatio/199500857),
> and all the computations (including the computation of the root of
> unity constants) can be done at run time and put right there in the
> code, giving a gigantic branch-free blob of computation that can be
> unrolled and reorganized to the compiler's will. Obviously, the code
> size will suffer, but it can be worth it in terms of run time. I have
> an upcoming project in which I use a similar approach for the
> compile-time generation and "unrolling" of sample rate conversion code
> (polyphase FIR). Once all the computations were unrolled, the compiler
> found an efficient way of working through several adjacent samples at
> once using batch/load stores that I wouldn't have figured out myself.
> 
> Some of these tricks might be doable in hand-written assembly with a
> lot of patience (or with python/perl scripts generating the assembly
> code) - but to me it's much more convenient to do everything in a
> single language... that I can compile, test, debug and instrument to
> my heart's content on my workstation rather than on the target device.
> It's really worth the price of having to occasionally peek at the
> generated assembly code to make sure that the compiler gets a good
> grasp of the flow of the computations the code is meant to perform.
> 
> I really think this idiom of compile-time code
> unrolling/specialization is worth knowing for signal processing
> applications - embedded or not!
> 
> Olivier
> _______________________________________________
> Synth-diy mailing list
> Synth-diy at dropmix.xs4all.nl
> http://dropmix.xs4all.nl/mailman/listinfo/synth-diy

;
--
Jay Vaughan
ibisum at gmail.com




-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 842 bytes
Desc: Message signed with OpenPGP using GPGMail
URL: <http://lists.music-bar.org/pipermail/music-bar/attachments/20151008/6e90304d/attachment.sig>


More information about the music-bar mailing list