
cb321

Karma: 1137

Created: 2020-07-25

Recent Activity

  • The calling process being dynamically linked might add a lot of cost to fork(), which has to copy the various page-table setups, and then a tiny bit more in exec*() to tear them down. Not sure vfork() is available as an option for something like a shell, but I saw major speed-ups launching Python with vfork vs. fork. Of course, a typical Python instance has many more .so's linked in than osh probably does.

    One could probably set up a simple linear regression to get a good estimate of the added cost per loaded .so on various OS-CPU combos, but I am unaware of any write-up of such. It'd be a good assignment for an OS class, though.

  • You may well already be aware, but just in case you aren't, your bin-true benchmark mostly measures dynamic-loader overhead, not fork-exec (e.g., I got 5.2X faster using a musl-gcc statically linked true vs. glibc dynamic coreutils). { It's kind of a distro/cultural thing what you want to measure (static linking is common on Alpine Linux and the BSDs, less so on most Linux distros), but the effect is good to know about. }

  • My idea is not to submit the generated C any more than you submit C-compiler generated assembly, but to write directly in Nim. The niceness would mostly just be if you wanted to write some complex thing like a filesystem or driver in Nim with its templates & macros & user-defined operators and DSL building and all that goodness.

    Being a module means it can be separately distributed - kernel maintainers don't need to admit them or even be aware of them. ZFS is not distributed in mainline Linux out of Oracle fears (though that module is popular enough some distros help out). This is more or less the key to any extensible system. The C backend is helpful here in that internal kernel APIs can rely heavily on things like the C preprocessor, not just object file outputs.

    I think the main challenges would be "bindings for all the internal kernel APIs" and the ultimately limited ability &| extra work to make them Nimonic, plus adapting the "all set up for your C module" build process to ease a Nim build (as well as the generic in-kernel no-stdlib hassles). So, I imagine a 2- or 3-level/stage approach would be best - the background to make the Nim modules easy and then actual modules. { Linux itself, while the most popular free Unix, is also kind of a faster-moving target. So, it would probably present an additional challenge of "tracking big shifts in internal APIs". }

  • While there are several "OS in Nim" projects (https://github.com/khaledh/fusion is probably the most interesting), this same ability to run bare-metal and generate to C should, in theory, make it possible to write kernel modules in Nim for Linux/SomeBSD without any special permission slip / drama over integration (the way Rust suffers, for example). I haven't heard of anyone doing such modules, but it might be a fun project for someone to try.

  • Glad you like it! Not sure what separator you would pick between the name(args)=expansion stuff. I could imagine some generic files/modules might have enough (or long enough) params that people might want to backslash line-continue. So, maybe '@' or '`' { depending on whether you want many or few pixels ;-) }?

        #include "file.c" _(_x)=myNamePrefix ## _x `\
                          KEY=charPtr VAL=int `\
                          ....
    
    The idea being that inside any generic module your private / protected names are all spelled _(_add)(..).

    By doing that kind of namespacing, you actually can write a generic module which gives client code (manual instantiators) a lot of control: they can select "verb noun" instead of "noun verb" kinds of schemes, like "add_word" instead of "word_add", and potentially even change snake_case to camelCase with some _capwords.h file that does `#define _get Get`-like moves, though of course name collisions can bite. That bst/ thing I linked to does not have full examples of all the optionality. E.g., to support my "stack popping" of macro defs without that, just with ANSI C89, you might do something like this instead to get "namespace nesting":

        #ifndef CT_ENVIRON_H
        #define CT_ENVIRON_H
        /* This file establishes a macro environment suitable for instantiation of
           any of the Map/Set/Seq/Pool or other sorts of generic collections. */
        
        #ifndef _
        /* set up a macro-chain using token pasting *inside* macro argument lists. */
        #define _9(x)    x    /* an identity macro must terminate the chain. */
        #define _8(x) _9(x)
        #define _7(x) _8(x)   /* This whole chain can really be as long as   */
        #define _6(x) _7(x)   /* you want.  At some extreme point (maybe     */
        #define _5(x) _6(x)   /* dozens of levels) expanding _(_) will start */
        #define _4(x) _5(x)   /* to slow-down the Cpp phase.                */
        #define _3(x) _4(x)   /* Also, definition order doesn't matter, but  */
        #define _2(x) _3(x)   /* I like how top->bottom matches left->right  */
        #define _1(x) _2(x)   /* in the prefixing-expansions.               */
        #define _0(x) _1(x)
        #define _(x)  _0(x)   /* _(_) must start the expansion chain */
        #endif
        
        #ifndef CT_LNK
        #   define CT_LNK static
        #endif
        #endif /* CT_ENVIRON_H */
    
    and then with a setup like that in place you can do:

        #define _8(x) _9(i_ ## x)  /* some external client decides "i_"          */
        _(foo)                     /* #include "I" -> i_foo at nesting-level 8   */
        #define _6(x) _7(e_ ## x)  /* impl of i_ decides "e_"                    */
        _(foo)                     /* #include "E" -> i_e_foo at level 6         */
        #define _3(x) _4(c_ ## x)  /* impl of e_ decides "c_"                    */
        _(foo)                     /* #include "C" -> i_e_c_foo at level 3       */
        #define _0(x) _1(l_ ## x)  /* impl of c_ decides "l_"                    */
        _(t)
        _(foo)                     /* #include "L" -> i_e_c_l_foo at level 0     */
        #define _0(x) _1(x)        /* c impl uses _(l_foo) to define _(bars)     */
        _(foo)                     /* i_e_c_foo at nesting level 3 again         */
        #define _3(x) _4(x)        /* e impl uses _(c_foo) to define _(bars)     */
        _(foo)                     /* i_e_foo at nesting level 6 again           */
        #define _6(x) _7(x)        /* i impl now uses _(e_foo) to define _(bars) */
        _(foo)                     /* i_foo at nesting level 8 again             */
    
    Yes, yes. All pretty hopelessly manual (as is C in so many aspects!). But the smarter macro-def semantics across parameterized includes I mentioned above could go a long way towards a quality-of-life improvement "for client code", given good "library code" file organization. I doubt it will ever be easy enough to displace C++ much, though.

    Personally, I started doing this kind of thing in the mid-1990s as soon as I saw people shipping "code in headers" in C++ template libraries and open source taking off. These days I think of it as an example of how much you can achieve with very simple mechanisms and the trade-offs of automating instantiation at all. But people sure seem to like to "just refer" to instances of generic types.

HackerNews