Programming

[#] Tue Aug 22 2023 02:26:40 UTC from IGnatius T Foobar

Ya that's legal, but unfortunately it incurs the cost of a memcpy.

Hmm. I did a little bit of web searching and it appears that there can be a memcpy involved. Apparently if you return a type that is too large to fit in a register, the implementation details are at the discretion of the compiler on some platforms. A few (I'm not clear on which ones) have a standard convention but others don't.

One implementation that I read about suggested that the compiler pre-allocates space on the stack *before* calling the function, and when the function returns it has a place to stash the results. This seems to be consistent with the test I ran (using GCC) passing a struct of about 2 MB to a recursive function; it only made a few iterations before it blew the stack.

For my purposes this is fine; I was just looking for a convenient way to return a couple of values -- for example: a success/error code, a pointer to a buffer, and the size of the data in that buffer. (Disregard that the buffer needs to be freed later; that's a different issue here.) The previous version of the function had to malloc() a struct containing those items, and it returned a pointer to the caller, who was then responsible for freeing it. Being able to return a 16-byte struct directly from the function is a huge savings in time and effort, and it will probably manage the memory better than I could.
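To make the shape of that concrete, here is a minimal sketch of a function that returns a status code, a buffer pointer, and a length together by value. The names `struct read_result` and `copy_message` are hypothetical and only stand in for whatever the real function does; the buffer is still heap-allocated, but the struct that carries it is not.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: status, buffer, and length travel together. */
struct read_result {
    int    status;  /* 0 on success, otherwise an errno.h value */
    char  *data;    /* malloc()ed buffer; the caller frees it */
    size_t length;  /* number of valid bytes in data, excluding the NUL */
};

/* Hypothetical producer: copies src into a fresh buffer and returns all
 * three pieces by value, so no result struct has to be malloc()ed. */
struct read_result copy_message(const char *src)
{
    struct read_result r = { 0, NULL, 0 };

    assert(src != NULL);
    r.length = strlen(src);
    r.data = malloc(r.length + 1);
    if (r.data == NULL) {
        r.status = ENOMEM;
        r.length = 0;
        return r;
    }
    memcpy(r.data, src, r.length + 1);  /* include the terminating NUL */
    return r;
}
```

A caller would write something like `struct read_result r = copy_message("hello");`, check `r.status`, use `r.data` and `r.length`, and finally `free(r.data)`.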

As far as I can tell, this has been legal C for some time now, but I don't know for how long. And as I indicated earlier, I just wanted to make sure that it's legal everywhere and not just on certain compilers.

[#] Tue Aug 22 2023 02:39:09 UTC from IGnatius T Foobar

Ok, so I wasn't completely wrong ... according to [ https://c-faq.com/struct/firstclass.html ]

"What K&R1 said (though this was quite some time ago by now) was that the restrictions on structure operations would be lifted in a forthcoming version of the compiler, and in fact the operations of assigning structures, passing structures as function arguments, and returning structures from functions were fully functional in Ritchie's compiler even as K&R1 was being published. A few ancient compilers may have lacked these operations, but all modern compilers support them, and they are part of the ANSI C standard, so there should be no reluctance to use them."

I learned C on some pretty ancient compilers ... Lattice C running on both MS-DOG and Amiga systems, and Micro$oft C running on Xenix. So I probably hit some errors when I tried it and just assumed from then on that it wouldn't work. The amazing part is that it took nearly 40 years for me to figure out that it has worked fine for decades now.

And it still doesn't seem to be a common practice.

[#] Fri Sep 01 2023 20:53:32 UTC from zelgomer

Subject: Should new C APIs report errors with errno?

Alternate subject: I should seek professional help.

One of my many curses is that I have a tendency to obsessively overanalyze things that ultimately do not matter much in the grand scheme. And when I start obsessing on a thing, it really weighs me down because it becomes difficult to accomplish anything until I can make a definitive choice. Despite this, I do think there is some upside because I find myself reflecting on and critiquing my own code, I really think more than the average person, and sometimes (admittedly not always) there are some insights gained from it which help me to grow.

One of those things that has kept popping up in recent years is what you see in the subject line: when I write a new API in C, whether it be a library intended for reuse or some module of a larger program, should it use errno? And I suppose you could interpret that in at least two ways: a) Should I use the errno.h macros at all, or should I define my own error constants, and b) If using the errno.h macros, should I give them to the caller by assigning to the errno "global" (yes, I know it's not really a global but actually TLS on most platforms now, I'll get to that later), or should I simply return an error number value directly from the function? For now I would like to focus on interpretation (b); i.e., let's assume we've decided we're going to report errno.h constants, and it is merely a question of how.

When I searched the web for this sort of discussion, I found a few comments that give me the sense that there is some camp out there who think that errno (the "global") should be reserved for use only by system calls or by the standard library. Does anybody know where this position comes from? For one, the system call version is just patently wrong. Admittedly I'm a Linux guy and completely ignorant of how other OSes do it, but on Linux, system calls return error numbers in a register, and it's the libc wrapper which takes that and puts it into the errno storage. And as for reserving it for use only by the libc, I can't find any rationale for this, and I've never encountered any problem using it for my own APIs. So I'm assuming this entire school of thought is simply a cargo cult. If anyone has any reason to say otherwise, please let me know.
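As a rough illustration of that last point, the libc side of a Linux system call wrapper boils down to something like the sketch below. `wrap_syscall_result` is a hypothetical name, and the range check assumes the Linux convention that a raw return value between -4095 and -1 is a negated error number; other platforms may do this differently.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the tail end of a libc system call wrapper.
 * raw is the value the kernel left in the return register. */
long wrap_syscall_result(long raw)
{
    if (raw < 0 && raw >= -4095) {
        /* Assumed Linux convention: the kernel returned -errno. */
        errno = (int)-raw;
        assert(errno > 0);
        return -1;
    }
    /* Success: pass the value through and leave errno alone. */
    return raw;
}
```

Nothing in this translation is special to the kernel; it is ordinary user-space code writing to errno, which is the same thing a third-party library would be doing.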

One interesting example is the pthreads library, which, I'm no historian here, but I assume came relatively later. All of the pthreads functions chose to return 0 on success or some error number on failure, and never touch the errno global. I suppose the most obvious guess as to why is because errno being a global isn't thread-safe, and pthreads is specifically a multithreading library. But on the other hand... pthreads is a multithreading library, and so you would think they must have solved the "errno must become thread local store" issue at the same time, or else none of the classic libc and POSIX APIs would work. So then, I think, the next most obvious answer is that maybe it's for performance reasons. They must have recognized that since errno had to become thread local store, then accessing it becomes more expensive, so they decided to establish a new precedent to avoid using it in multithreaded programs in order to avoid this overhead.

But is it really that much of a performance overhead? Consider the four possible code paths.

A) int foo(int, int) passes error codes through errno:

   int output = foo(input1, input2); /* may or may not write errno */
   if (output >= 0) { /* or some similar test of output's validity */
       /* success path; errno is not read */
   } else {
       /* failure path; read errno and handle it */
   }

B) int bar(int *, int, int) passes error codes with its return value:

   int output;
   int rc = bar(&output, input1, input2); /* returns 0 or an error */
   if (!rc) {
       /* success path; rc is zero and of no use to me, *
        * but output contains a valid value             */
   } else {
       /* failure path; error is in rc, handle it */
   }
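
For concreteness, here is one way the two callees could be filled in. This is only a sketch: the operation (dividing two non-negative ints) and the choice of EINVAL and EDOM are assumptions made up for illustration, picked so that -1 is never a valid result for the errno style.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Style A: the result is the return value, the error code goes in errno. */
int foo(int a, int b)
{
    if (a < 0 || b < 0) { errno = EINVAL; return -1; }
    if (b == 0)         { errno = EDOM;   return -1; }
    return a / b;       /* never negative, so -1 is unambiguous */
}

/* Style B: the error code is the return value, the result goes through
 * a pointer. On failure *out is left untouched. */
int bar(int *out, int a, int b)
{
    assert(out != NULL);
    if (a < 0 || b < 0) return EINVAL;
    if (b == 0)         return EDOM;
    *out = a / b;
    return 0;
}
```

Note that foo() never writes errno on success, so the success path in A touches neither errno nor any extra memory.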

First, let's just talk about convenience. This should, of course, be the least convincing reason, but since it contributes to readability I think it's worth bringing up. The B case is just irritating to write. Especially so because C's type checking has to be more anal about pointer types matching than assignment, which means that if you have several calls to make and two of them output ints but a third outputs an ssize_t, you can't cheat and declare and reuse a single ssize_t temporary for all three calls; you HAVE to give the first two calls addresses to an int and the third an address to an ssize_t (of course you can overlap them with a union but that's also tedious).

Second, let's consider the two success paths. B is actually less efficient for two reasons: one, it demands more local storage, and worse, it can't return output in a register but forces stack allocation for it and a memory write (from the callee) and a memory read (from the caller).

Third, let's consider the two failure paths. Again, B loses some efficiency because it demands an extra automatic storage that's wasted (output is never written to and never read), but it wins because it was able to pass back the error number without going through TLS. But, then it loses again because what you probably wanted to do in the failure path was "perror(...)", and now you're having to use variadic "fprintf(stderr, ..., strerror(rc))" and worse, you just wrote a bug that the static analyzers will peg you for because strerror() isn't reentrant.

And does it even make sense to try to optimize for the failure path at the expense of the success path? Shouldn't we hope that the program is going to follow the success path the vast majority of the time? If so, then A wins the performance contest, too.

Anyway, now that I've taken the time to think about and write all of that, I think I've convinced myself that, even when implemented as TLS, errno is in fact the superior error code reporting mechanism, and pthreads got it wrong (and it's not the first thing I've found that pthreads got wrong). Since I put a lot of time into this post I'll still go ahead and hit the save button in case anyone else gets some enjoyment from it. I hope you enjoyed this week's edition of Incoherent Ramblings by Zelgomer.

[#] Tue Sep 05 2023 15:21:21 UTC from IGnatius T Foobar

Subject: Re: Should new C APIs report errors with errno?

Actually I'd never even considered the idea of using errno for anything other than system-defined functions.

I'd say that you would only use errno for things that strerror() knows how to describe; otherwise stick to your own return variables.

Probably the only reason errno exists in the first place is because unix was written in C, and C can only return a single value from a function. Newer languages can return multiple values, for example in Go:

func myfunc(foo int, bar string) (int, int) {
..
..
return baz, qux
}

Or in a language that isn't brain damaged like Go, here's Python:

def myfunc(foo, bar):
..
..
return baz, qux

This gives you the ability to return, for example, both the result of the operation the function performed, and a status code indicating success or an error code. To do that in C, at least traditionally, you have to supply pointers to buffers in which the function can store anything more than a single return value. Thus, errno was born. Thompson and Ritchie decided to store all of the error codes there, since they didn't have a better way.

As we discussed here a few weeks ago, however ... at some point it became legal to return a struct by value in C. So you could do something like this:

struct foo {
int returned_value;
int returned_error;
};

struct foo myfunc(int baz, char *qux) {
..
..
struct foo r;
r.returned_value = ... ;
r.returned_error = ... ;
return(r);
}
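
A filled-in version of the same idea might look like the sketch below. `struct int_result` and `parse_int` are hypothetical names, and wrapping strtol() is just an assumed example of something that produces both a value and an error; the member names follow the struct above.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

struct int_result {
    int returned_value;
    int returned_error;  /* 0 on success, otherwise an errno.h value */
};

/* Hypothetical example: parse a decimal int and return the value and the
 * error code together, by value. */
struct int_result parse_int(const char *text)
{
    struct int_result r = { 0, 0 };
    char *end;
    long v;

    assert(text != NULL);
    errno = 0;                     /* strtol reports overflow via errno */
    v = strtol(text, &end, 10);
    if (end == text || *end != '\0') {
        r.returned_error = EINVAL; /* empty input or trailing junk */
    } else if (errno == ERANGE || v < INT_MIN || v > INT_MAX) {
        r.returned_error = ERANGE; /* does not fit in an int */
    } else {
        r.returned_value = (int)v;
    }
    return r;
}
```

The caller then writes `struct int_result r = parse_int("123");` and checks `r.returned_error` before using `r.returned_value`.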

I don't know why they haven't just added multiple return parameters to the C language, since having the ability to return a struct means the compiler has already been taught how to return an object larger than will fit in a register. But even if they added that tomorrow, it would be a decade or so before you could count on it being available everywhere you expect the software to be compiled.

In the mean time, I've started building software that returns structs instead of having to supply buffers to stuff things in, and it's a game changer. It makes the code more readable, it reduces your use of malloc et al, which reduces the time required to debug memory leaks and bad pointers ... good stuff all around.

Obviously, you can't use this for really big objects like strings. However, the child function can dynamically allocate a string and then return its location as one of the struct members. The parent would then have to deallocate it at some point. Or use something like Boehm GC to get roughly the effect of a managed language.

But managed languages are for sissies. Use the funcs. :)

[#] Fri Sep 08 2023 18:27:44 UTC from LoanShark

I'm pretty sure that if you pass -pthread to GCC, it will set up some #define's that cause errno to become thread-local.

[#] Tue Sep 12 2023 19:28:35 UTC from zelgomer

Subject: Re: Should new C APIs report errors with errno?

Or in a language that isn't brain damaged like Go, here's Python:

def myfunc(foo, bar):
..
..
return baz, qux

"Or in a language that isn't brain damaged like Go, here's a language that's brain damaged like Python:"

Anyway, that seems like it would suffer from the issue I pointed out that you often can't reuse return variables, except it's so much worse because you have a unique struct for every function call.

Lua supports and uses the multiple return value technique like you described, and is a beautiful, not at all brain damaged language.

[#] Tue Sep 12 2023 19:32:44 UTC from zelgomer

2023-09-08 18:27 from LoanShark <loanshark@uncensored.citadel.org>

I'm pretty sure that if you pass -pthread to GCC, it will set up some #define's that cause errno to become thread-local.

Yeah, I mentioned that. In truth, at least in my experience, errno is always TLS. If you think about it, it has to be when your program is loaded by a dynamic linker. There's no way to tell which libraries loaded at runtime might decide to spin up their own threads without your knowledge. I actually maintain some development tools at $DAYJOB that use LD_PRELOAD to run their own thread in the process under instrumentation. None of that would work if there were a chance that the target process might have a not-multithread-safe errno.
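
If anyone wants to check this on their own system, a small probe along these lines should do it. This is a sketch assuming a POSIX threads implementation; `struct probe`, `record_errno` and `errno_is_per_thread` are made-up names, and older toolchains may still need -pthread on the command line just to link it.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* What the worker thread observed about its own errno. */
struct probe {
    uintptr_t errno_addr;  /* address of errno as seen by the worker */
    int       value_seen;  /* value the worker read back after writing */
};

static void *record_errno(void *arg)
{
    struct probe *p = arg;

    assert(p != NULL);
    errno = ERANGE;
    p->value_seen = errno;
    p->errno_addr = (uintptr_t)(void *)&errno;
    return NULL;
}

/* Returns 1 if the worker's errno lives at a different address than the
 * caller's, 0 if they share one, -1 if the thread could not be run. */
int errno_is_per_thread(void)
{
    struct probe p = { 0, 0 };
    pthread_t t;

    if (pthread_create(&t, NULL, record_errno, &p) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return p.errno_addr != (uintptr_t)(void *)&errno;
}
```

The addresses are compared as integers because the worker's thread-local storage is gone by the time the join returns.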

[#] Thu Sep 14 2023 20:04:29 UTC from IGnatius T Foobar

Subject: Re: Should new C APIs report errors with errno?

Anyway, that seems like it would suffer from the issue I pointed out that you often can't reuse return variables, except it's so much worse because you have a unique struct for every function call.

I'm not sure what you mean. You can definitely reuse return variables in both Go and Python, because the functions are returning values, not references.

Although I did get confused when I was learning this in Go because of its weird assignment syntax:

= means "assign"
:= means "declare and assign" -- creating a new variable that shadows the old one if it had already been declared in an outer scope

[#] Sun Sep 17 2023 17:03:48 UTC from zelgomer

Subject: Re: Should new C APIs report errors with errno?

I'm not sure what you mean. You can definitely reuse return variables in both Go and Python, because the functions are returning values, not references.

I'm talking about your return struct thing in C, not Go or Python.

struct do_foo_return r = do_foo();
if (r.is_error) { ... }

struct do_bar_return r2 = do_bar();
if (r2.is_error) { ... }

You don't get to reuse r because r2 is a different type. And unless you have a bunch of functions that all do similar things or at least produce similarly-typed results, I imagine this will be the norm.

[#] Mon Sep 18 2023 00:08:04 UTC from IGnatius T Foobar

Subject: Re: Should new C APIs report errors with errno?

Ah, yes you are correct about that. I agree that it isn't as versatile as the way Go and Python do it ... but it can be far easier to deal with than having to pass pointers around inside the function arguments.

The way I was using it recently, I had a bunch of functions that worked with the same complex data type: an error (or success) code, a data object, and the length of that data object. It ended up being really convenient.

But if you've got lots of different data types, then you're stuck with a bunch of different return values. The struct return thing is, to me, less convenient than having multiple simple return values, but more convenient than switching to a different language.

[#] Fri Sep 29 2023 02:46:45 UTC from IGnatius T Foobar

*sigh*

I still kind of like VS Code. And I keep finding ways to make it more comfortable to work in.

I have to keep reminding myself that this isn't the old Microsoft run by Steve Ultra-Asshole Ballmer and Bill Hitler Hitler Hitler Hitler Hitler Gates.

Today I found the setting to always open the Terminal in the editor area, so now I don't have to do three more clicks every time. And of course the vim keybindings extension was the key to getting me to use this thing in the first place. Combine with *everything* running through an SSH connection to a remote server where the actual development is happening, and the fact that Git is now the native SCM ... geez, this thing is actually useful now.

And it's freaking open source. Send this thing back in time 20 years and everyone would think it came from Bizarro World.

[#] Fri Sep 29 2023 03:48:09 UTC from LadySerenaKitty

But it did come from Bizarro World, has you seen the news lately?

[#] Fri Sep 29 2023 11:13:31 UTC from Nurb432

Nah, no one could actually run it to find out. No one would have the resources to accommodate the library bloat.

Thu Sep 28 2023 22:46:45 EDT from IGnatius T Foobar

Send this thing back in time 20 years and everyone would think it came from Bizarro World.

[#] Fri Oct 20 2023 13:23:01 UTC from IGnatius T Foobar

But it did come from Bizarro World, has you seen the news lately?

True, it might as well be Bizarro World.

[#] Sun Nov 19 2023 14:07:01 UTC from Nurb432

If anyone is interested, this is an example of the sorts of video conference meetings our Forth group has once a month or so (this was this month's). Charles was available for this one too.

https://www.youtube.com/watch?v=M14tCZiEPkg

[#] Sun Nov 19 2023 15:37:59 UTC from zelgomer

2023-11-19 14:07 from Nurb432 <nurb432@uncensored.citadel.org>
if anyone is interested, this is an example of the sorts of video
conference meetings our forth group has once a month or so ( this was
this month's ). Charles was available for this one too.

https://www.youtube.com/watch?v=M14tCZiEPkg

Coincidentally (or maybe subconsciously not), I've started playing with FORTH again. I think I know, psychologically, why I was not productive with it before, and I have a strategy to recalibrate my expectations and avoid falling into the same traps this time around.

But still... I don't need this in my life. I've spent two days just tinkering with register assignments and ITC versus DTC. Like I said before, it's like I have some kind of obsessive personality flaw that is really not suited for this. Most people look at FORTH and say "Ew! That's not for me, I'm not detail oriented!" For me it's the opposite. I look at FORTH and say "Look at all these little details and nuances I can waste my life away with!" and then I don't leave the closet for weeks on end, and when I finally do emerge, I have nothing noteworthy to show for my time.

That's what I mean when I say "I have a problem." I mean it in the "I'm an alcoholic" sense.

[#] Sun Nov 19 2023 15:46:21 UTC from Nurb432

lol

but true...

Sun Nov 19 2023 10:37:59 EST from zelgomer

"Look at all these little details and nuances I can waste my life away with!"

[#] Tue Nov 21 2023 18:16:23 UTC from LadySerenaKitty

You're a codaholic. Yea, nobody ever told me how addictive programming can be, look at me now. 🫂

Sun Nov 19 2023 10:37:59 EST from zelgomer

That's what I mean when I say "I have a problem."

[#] Tue Nov 21 2023 18:42:51 UTC from Nurb432

You can break it. I did. I used to dream of Z80 code at my worst. Sure, it never goes away, but you can break the habit. Wonder if there needs to be a "programmers anonymous" support group or something. "hi my name is.. " lol

[#] Wed Nov 22 2023 09:42:17 UTC from LadySerenaKitty

Hi. My name is Jessica, and I wrote PHP, Java, C, and C++.
