> ... a partial download would be totally useless ...

No, not totally. The central directory at the end of the archive points backwards to local file headers, which in turn contain all the necessary information: the compressed size inside the archive, the compression method, the filename, and even a CRC-32 checksum.

If the archive isn't some recursive/polyglot nonsense as in the article, it's essentially just a tightly packed list of compressed blobs, each with a neat local header in front (one that even includes a magic number!); the directory at the end is really just for quick access.

If your extraction program supports it (or you are sufficiently motivated to cobble together a small C program with zlib...), you can salvage what you have by linearly scanning and extracting the archive, somewhat like a fancy tarball.
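
For illustration, a rough sketch of such a linear scanner (my own, not from any particular tool): it only lists entries, and it assumes no data descriptors (general purpose flag bit 3) and no zip64, so real-world salvage needs a bit more care. To actually extract, feed the compressed bytes of method-8 entries to zlib's inflate() with windowBits = -MAX_WBITS.

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    static uint16_t rd16(const unsigned char *p)
    {
        return (uint16_t)(p[0] | (p[1] << 8));
    }

    static uint32_t rd32(const unsigned char *p)
    {
        return p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    }

    int main(int argc, char **argv)
    {
        unsigned char hdr[30]; /* fixed part of a zip local file header */
        char name[256];
        FILE *fp;

        if (argc != 2 || (fp = fopen(argv[1], "rb")) == NULL)
            return 1;

        /* walk the archive front to back, local header by local header */
        while (fread(hdr, 1, sizeof(hdr), fp) == sizeof(hdr) &&
               memcmp(hdr, "PK\x03\x04", 4) == 0) {
            uint16_t method = rd16(hdr + 8);  /* 0 = stored, 8 = deflate */
            uint32_t csize = rd32(hdr + 18);  /* compressed payload size */
            uint16_t nlen = rd16(hdr + 26);   /* filename length */
            uint16_t xlen = rd16(hdr + 28);   /* extra field length */
            size_t keep = nlen < sizeof(name) - 1 ? nlen : sizeof(name) - 1;

            if (fread(name, 1, keep, fp) != keep)
                break;
            name[keep] = '\0';
            printf("%s (method %u, %u bytes compressed)\n",
                   name, (unsigned)method, (unsigned)csize);

            /* skip the rest of the name, the extra field and the payload */
            if (fseek(fp, (long)(nlen - keep) + xlen + (long)csize, SEEK_CUR) != 0)
                break;
        }

        fclose(fp);
        return 0;
    }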


At work, our daily build (actually 4x per day) is a handful of zip files totaling some 7GB. The script to get the build would copy the archives over the network, then decompress them into your install directory.

This worked great on campus, but when everyone went remote during COVID it didn't anymore: it went from three minutes to like twenty minutes.

However: most files change only rarely. I don't need all the files, just the ones which are different. So I wrote a scanner thing which compares each zip entry's file size and checksum against the local file. If they're the same, we skip it; otherwise, we decompress it out of the zip file. This cut the time to get the daily build from 20 minutes to 4 minutes.

Obviously this isn't resilient to an attacker (CRC32 is not secure), but as an internal tool it's awesome.
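
Not the actual tool, but a sketch of the core check, assuming the entry's stored size and CRC-32 have already been parsed out of the archive (link with -lz):

    #include <stdio.h>
    #include <zlib.h>

    /* compute the CRC-32 of a local file with zlib's crc32() */
    static int crc32_of_file(const char *path, unsigned long *out)
    {
        unsigned char buf[65536];
        unsigned long crc = crc32(0L, Z_NULL, 0);
        size_t n;
        FILE *fp = fopen(path, "rb");

        if (fp == NULL)
            return -1;
        while ((n = fread(buf, 1, sizeof(buf), fp)) > 0)
            crc = crc32(crc, buf, (unsigned int)n);
        fclose(fp);
        *out = crc;
        return 0;
    }

    /* skip extraction when size and checksum already match */
    static int needs_update(const char *path, long entry_size,
                            unsigned long entry_crc)
    {
        unsigned long crc;
        long size = -1;
        FILE *fp = fopen(path, "rb");

        if (fp != NULL) {
            fseek(fp, 0, SEEK_END);
            size = ftell(fp);
            fclose(fp);
        }
        if (size != entry_size)
            return 1;
        return crc32_of_file(path, &crc) != 0 || crc != entry_crc;
    }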


How would this have compared to using rsync?

Not as much geek cred for using an off-the-shelf solution? ;)

XPS (Microsoft's alternative to PDF) supported this. XPS files were ZIP files under the hood and were handled directly by some printers. The problem was that the printer never had enough memory to hold a large file, so you had to structure the document in a way that it could be read a page at a time from the start.

> the directory at the end is really just for quick access.

No, its purpose was to allow multi-floppy-disk archives. You would insert the last disk, then the other ones, one by one…


That literally is quick access; it does the same thing in both cases: getting rid of the linear scan and having to plow through data unnecessarily.

If the archive is on a hard disk, the program reads the directory at the end and then seeks to the local header, rather than doing a linear scan; same with the floppy motor, if it is a small archive on a single floppy.

If you have multiple floppies, you insert the last one, the program reads the directory and then tells you what floppy to insert, rather than you having to go through them one by one, which, you know, would be slower.

In one case, a hard disk arm, or the floppy motor, does the seeking, in the other case, your hands do the seeking. But it's still the same algorithm, doing the same thing, for the same reason.


> rodent controlled surveillance drones

See also: the WWII-era pigeon-controlled guided bomb: https://en.wikipedia.org/wiki/Project_Pigeon


Much smaller than that, some might even say a utility box is overkill: https://old.reddit.com/r/techsupportgore/comments/nvwcuh/the...


You can use `git format-patch` to export a range of commits from your local git tree as a set of patches. You can then use `git send-email` to send that patch set out to the appropriate mailing list and maintainers (or just do it in one step, send-email accepts a similar commit range instead of patch files). It talks directly to an SMTP server you have configured in your `.gitconfig` and sends out e-mail.

Of course, `git send-email` has a plethora of options, e.g. you'd typically add a cover letter for a patch set.

Also, in the Linux kernel tree, there are some additional helper scripts that you might want to run first, like `checkpatch.pl` for some basic sanity checks and `get_maintainer.pl` that tells you the relevant maintainers for the code your patch set touches, so you can add them to `--cc`.

The patches are reviewed/discussed on the mailing list that you sent them to.

On the receiving side, as a maintainer, you'd use `git am` (apply mail) that can import the commits from a set of mbox files into your local git tree.
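
A hypothetical end-to-end run might look like this (the addresses and the commit count are made up, and the two scripts only exist in the kernel tree):

    # export the last 3 commits as patch files, plus a cover letter
    git format-patch -3 --cover-letter -o outgoing/

    # kernel tree: basic sanity checks, then find the right maintainers
    ./scripts/checkpatch.pl outgoing/*.patch
    ./scripts/get_maintainer.pl outgoing/*.patch

    # send the series; SMTP settings come from your .gitconfig
    git send-email --to=some-list@example.org \
        --cc=maintainer@example.org outgoing/*.patch

    # receiving side: a maintainer imports the mails as commits
    git am series.mbox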


> ... don't require absurdly bloated development environments.

Outside of hobbies, I've been mostly away from this field for a little over a decade now. Is it still that bad? I remember back then, every single professional electronics engineer I met had this die-hard belief that this was simply how things work:

You want to use DerpSemi microcontrollers? You must install DerpStudio '98! Not the later 2005 version though, that has some major UI bugs. You want to program a HerpSoft PLC? You need the EasyHerpes IDE! What, a command line toolchain? A text editor? You must be completely insane!

It was somewhat of a personal fight against windmills for me back then. That, plus suggesting that we were actually developing software, and that the C/assembly/VHDL maybe shouldn't be an undocumented, tested-only-once pile of spaghetti based off a circuit diagram inside one guy's head (and no, a 50-line comment block with a German transcription of the code below it is not documentation).


> and no, a 50-line comment block with a German transcription of the code below it is not documentation

You had it good back then. Now it's a one-line comment in Chinese, and the line is 300 characters wide. /Yorkshiremen skit.


Now that's funny. The irony here is that, besides German (native), I do speak some HSK4-ish Mandarin as a 3rd language. A few years ago, a single line Chinese comment next to a blob of magic hex values did help me figure out a bug in a touch controller driver :-)


It's not that the official tools are the only way to work with parts.

It's that they're the only vendor-supported method of working with the parts. If you build your product with an unsupported toolset and something doesn't work, you need to be prepared to reproduce the issue in the vendor-supported toolchain if you want support.

People coming from desktop and mobile development roll their eyes at this, because they aren't coming across bugs in their x86-64 processor or M1 silicon that haven't already been patched over by some combination of microcode, the OS, and the toolchain. Everything works as expected.

Not so in the world of embedded. On modern complex systems, vendor involvement can be a critical part of the development process. Many of the products you have in your house like your router, Wifi gear, and IoT devices were probably co-developed to some degree with the hardware vendor. Starting with the vendor-provided reference design gets you to market much faster, even though you often could forge your own path with a separate toolchain and start from scratch.

It's still this way even in MCU development. You can go out and develop something like an STM32 system completely without STMicro's tools, but it's much easier to start the project in STMicro's tools and copy over the parts you need for setting up everything from clocks to peripherals, and then maintain a skeleton project in the official tools in case your separate toolchain starts acting funny.


> It's that they're the only vendor-supported method of working with the parts

And that's because those engineers don't demand better: "I want something easy to use that has a GUI, not some terminal command line."


It's much better in microcontrollers. Almost everything can now be handled with VSCode and some open source compiler toolchains.

PLCs and FPGAs are still pretty damn bad though.


> Is it still that bad?

IMHO yes, if not worse.

I'm currently working with some advanced devices from Xilinx (a very expensive top-of-the-line SoC). If you change the version of the Vivado tools, you basically have to start your project again from zero. It's a complicated mess because the OS on the ARM controller has to match the FPGA part…


One thing I will say in favour of the Gowin IDE - it does seem to be much more lightweight than the larger vendors' tools. For smaller designs it will often go from zero to bitstream in less time than Quartus or Vivado would have taken to even start synthesizing.


It's that bad. FPGA designers on my team would routinely start a Vivado job and come back 8 hours later (because that's how long it typically took) and discover Vivado had crashed.

Vivado is expensive, bloated, buggy garbage.


I have dedicated a large chunk of my (arguably short) professional career to improving upon this, mostly in the safety-critical software domain. What was your experience back then, what ultimately made you leave, and what do you do now?


> I didn’t care about the 1,000 words a single person wrote about their trip abroad. There was no way to interact with it?

I wonder, have you ever read a novel? Hundreds of pages a single person wrote about a story that happened (usually) entirely in their head, printed on paper, no way to interact with it. It's a great experience if the author has some skill at this.


Yes, I have read novels. I don’t think blog posts and novels compare at all.


I can't downvote, but this comment feels a little rude or standoffish towards someone who read what you wrote, thought about it, and gave a response.

You said you didn't care for 1000 words that someone wrote about their trip abroad, and that's clearly an example to illustrate something, but it's not clear what, because it's contrived and falls apart easily: nobody else really read those blogs either; people read blogs from people and topics they're interested in.

So what about a 1000-word blog post from a single individual that does interest you? Or more than 1000 words from a single individual on a different topic, like a novel?


There are `asprintf` and `vasprintf` (the latter takes a `va_list` argument). Those allocate a sufficiently sized buffer that can be released with `free`.

Yes, it's a GNU extension, but it's also supported by various BSDs [1][2][3], and yes, Musl has it too. It's present in pretty much any sane C library.

[1] https://man.openbsd.org/man3/printf.3

[2] https://man.netbsd.org/vasprintf.3

[3] https://man.freebsd.org/cgi/man.cgi?query=vasprintf&sektion=...
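
For anyone who hasn't used it, a minimal usage sketch (note the error case, which matters in the replies below):

    #define _GNU_SOURCE /* for asprintf on glibc */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        char *msg = NULL;

        /* asprintf allocates a buffer big enough for the formatted result */
        if (asprintf(&msg, "%s #%d", "build", 42) < 0)
            return EXIT_FAILURE; /* msg may be undefined here, don't free */

        puts(msg);
        free(msg);
        return EXIT_SUCCESS;
    }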


And combine it with __attribute__((cleanup)) to get the string automatically freed at the end of your function (if that's the right thing to do). Looks like cleanup will finally be standardized in the upcoming C2x.


> And combine it with __attribute__((cleanup)) to get the string automatically freed at the end of your function (if that's the right thing to do). Looks like cleanup will finally be standardized in the upcoming C2x.

The problem is that on error the buffer pointer value is undefined, so you can't just unconditionally call free on the pointer. There's at least one proposal for C2x that avoids adopting asprintf for this reason, despite it already being added to POSIX.

This undefinedness is a vestige of the original glibc implementation. The proper solution is to either require that the pointer value be preserved on error (thus preserving NULL if the caller initialized it) or require that the implementation set it to NULL. IIRC, when it was added by the BSDs (1990s) and later Solaris, they explicitly documented it to set the pointer to NULL. And it seems that late last year glibc finally adopted this behavior as well.[1]

[1] https://sourceware.org/git/?p=glibc.git;a=commit;h=cb4692ce1...
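
To tie the two together, a sketch of the pattern (free_charp is a made-up helper name; GCC/Clang only until cleanup lands in the standard). Initializing the pointer to NULL and re-nulling it on error sidesteps the undefined-on-error problem, since free(NULL) is a no-op:

    #define _GNU_SOURCE /* asprintf */
    #include <stdio.h>
    #include <stdlib.h>

    static void free_charp(char **p)
    {
        free(*p); /* runs automatically when the variable leaves scope */
    }

    static void demo(int value)
    {
        __attribute__((cleanup(free_charp))) char *msg = NULL;

        if (asprintf(&msg, "value=%d", value) < 0) {
            msg = NULL; /* may be garbage on error with older glibc */
            return;
        }
        puts(msg); /* no explicit free needed */
    }

    int main(void)
    {
        demo(42);
        return 0;
    }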


Another tip: don't use normal asprintf as-is, but write your own very similar helper!

1. have it free the passed-in buffer, so that you can reuse the same pointer

2. have it do step 1 after the formatting, so the old value can be a format argument

3. when getting the size of the full expansion, don't format to NULL, but into a temp buffer (a few KB in size); then, if the expansion is small enough, you can skip the second format pass into the actual buffer: just malloc and memcpy. You know how many chars to memcpy, because that's the return value from snprintf

(Don't forget to check for errors and all that.)
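
A rough sketch of such a helper (re_asprintf is a made-up name; the caller must initialize the pointer, e.g. to NULL, since the old value gets freed):

    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static int re_asprintf(char **strp, const char *fmt, ...)
    {
        char tmp[4096]; /* first pass formats in here, not to NULL */
        va_list ap;
        char *out;
        int len;

        va_start(ap, fmt);
        len = vsnprintf(tmp, sizeof(tmp), fmt, ap);
        va_end(ap);
        if (len < 0)
            return -1;

        out = malloc((size_t)len + 1);
        if (out == NULL)
            return -1;

        if ((size_t)len < sizeof(tmp)) {
            /* it fit: skip the second format pass, just copy */
            memcpy(out, tmp, (size_t)len + 1);
        } else {
            va_start(ap, fmt);
            vsnprintf(out, (size_t)len + 1, fmt, ap);
            va_end(ap);
        }

        /* free the old value only after formatting, so it can be
         * used as a format argument: re_asprintf(&s, "%s!", s) */
        free(*strp);
        *strp = out;
        return len;
    }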


asprintf and vasprintf are part of POSIX now.


Thanks, first I've heard of them and they happen to solve a real problem I'm working on today. Always nice when you can learn something new...


> Thanks, first I've heard of them and they happen to solve a real problem I'm working on today. Always nice when you can learn something new...

You don't really need to, TBH. I pretty much always wrote a malloc-backed `sprintf` alternative if the system didn't have one. It's only a few lines of code that'll take maybe 10 minutes of your day the first time you realise `asprintf` on the platform doesn't exist.

Here is a sample from more recently: https://github.com/lelanthran/libds/blob/b5289f6437b30139d42...


I know, that's what I was planning on doing (and it might be what I end up doing anyway, since I need to truncate the UTF-8 string if it is > 1024 bytes...). Still, it's nice to have other options; this code runs in some tight loops, so I will be profiling all the options.


Came here to say exactly this.

The lost art of RTFM.


I've actually learned a few little tricks reading the fucking gcc manual. If you're coding C (or C++) regularly, the manual is a good learning source and is well-written.


> This is a thing I've seen a bunch of times recently ...

> ... in the comments everyone ignores the content and argues about the headline.

Surely you must be new to Hacker News…


It was more that there were a couple of particularly frustrating recent examples that happened to come to my attention. Of course this has always been a problem.


So, eternal September is now officially coming to an end?


AOL pulled the plug on usenet access 20 years ago.


It's funny, many people complain that the web got way worse after smartphones became common. It's the second eternal September.


> Early mainboards...

... like in the PC AT, the PC XT[1], or the Compaq DeskPro 386[2] that the article discusses, didn't have those ports at all.

Those were instead on ISA expansion cards, just like the floppy controller, which would often share a card with the UART controller for the serial interface.

[1] https://theretroweb.com/motherboards/s/ibm-xt-type-5160-64-2...

[2] https://theretroweb.com/motherboards/s/compaq-deskpro-386-20...


IDE was just coming in (in the UK) in 1990. The acronym got updated to "AT Attachment" (ATA) because "Integrated Drive Electronics" was generic, and it wasn't as if the older drives had no electronics on them. Much later, when SATA showed up, the name evolved again as ATA became known as Parallel ATA to distinguish the two.

Before that, when you installed a hard disk you had to go into the BIOS to specify the geometry of the drive. 46 types were already defined, to match individual drives on the market. "Type 47" allowed -- required -- manually specifying the drive geometry in terms of cylinders, heads and sectors. So for a short while some traditional MFM or RLL drives would be informally classed as Type 47 because their geometry and capacity differed from earlier drives.
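
(For a worked example: capacity is just cylinders × heads × sectors/track × 512 bytes/sector, so the largest geometry a classic BIOS could express, 1024 × 16 × 63 × 512, works out to 528,482,304 bytes, the infamous ~504 MiB / 528 MB limit.)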


Yes, the earliest mainboards I know of with on-board I/O including ATA are from around Socket 5, the first mainstream Pentium boards. Some slightly older Socket 4 boards (circa 1994) have on-board I/O, but they weren't as common.

My 486 and earlier systems have all I/O provided by ISA cards, other than the 5-pin DIN keyboard port which was standard since the original PC.


I remember my dad's Dell 486P/33 from 1991 had integrated IDE, but that was a fairly high-end machine at the time (the forerunner of their "Precision" workstation range).

Details here: https://theretroweb.com/motherboards/s/dell-system-486p

But, yes, most bog-standard machines would've had a separate "SuperIO" card containing serial, parallel, and IDE interfaces until the mid 90s.


Wow! Very impressive board, I had no idea. It's kinda cool how we can see some of the chips that would normally be on the SuperIO card sitting directly on the mainboard. Thanks for sharing.


White-box systems didn't really acquire onboard I/O until the late 486/early 586 era, but it was pretty common on name-brand systems to integrate IDE/floppy/serial/parallel and usually video.

