hathawsh

11 days ago

•on: Iroh 1.0

Iroh looks very interesting!

How current is the PyPI package? https://pypi.org/project/iroh/

Arqu•

11 days ago

We bumped to 1.0 an hour ago https://pypi.org/project/iroh/#history

•on: How memory safety CVEs differ between Rust and C/C...

11 days ago

If I were doing a code review, I would probably accept the code either with or without the assertion. The context of curl_getenv() makes it clear that null is not acceptable. If the author of curl_getenv() had evidence that callers are frequently breaking the contract by passing null, then perhaps the assertion would help shed some light on violators. Otherwise, I would expect everyone to play by the rules, making the assertion unnecessary.

favorited•

11 days ago

It's also just a wrapper around getenv that provides consistent behavior across platforms, and passing a NULL name to the POSIX getenv function is UB.

vintagedave•

11 days ago

That is exactly why you have a precondition or assertion.

If everyone expects specific behavior - ie it’s in the contract - you require that contract.
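A rough C sketch of what "requiring the contract" could look like. `env_lookup` is a hypothetical wrapper written for illustration, not curl's actual `curl_getenv` code, and it assumes the contract is simply "name must not be NULL".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper around getenv(); not curl's real implementation.
 * Contract: name must be a non-NULL, NUL-terminated string. */
static const char *env_lookup(const char *name)
{
    /* Precondition: stop loudly on a contract violation instead of
     * handing NULL to getenv(). Note that assert() is compiled out
     * when NDEBUG is defined. */
    assert(name != NULL);

    /* A NULL result here means "variable not set", nothing else. */
    return getenv(name);
}
```

With the check in place, a NULL result can only mean the variable is unset; a NULL argument never reaches getenv() in a build that keeps assertions enabled.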

lathiat•

11 days ago

The problem with asserts is that they are pretty dramatic and you crash the entire program.

We generally did this in the avahi libraries: be fairly liberal with asserts for things that "shouldn't happen". It is a source of complaints though, because you can be using a third-party library that uses avahi and have your program crash due to a bug in that library, or in avahi. It's extra fun when using some historical libc systems such as "NSS" where you load a plugin to do hostname resolution, which nss-mdns does. Now you can have any program on the entire system crash if you are assert-happy.

On the one hand I agree that if the result is going to be memory unsafety then perhaps you should assert, but ideally you'd just fail gracefully and throw or return an error. Depending on the API, that can be tricky though, if there is no good way to return an error, a NULL value, or similar.

Of course, this is the entire reason behind the error return traditions of Golang and Rust, e.g: https://doc.rust-lang.org/book/ch09-00-error-handling.html

Which basically says what I said above :)

But in the case of curl_getenv, returning NULL seems a valid possibility (https://curl.se/libcurl/c/curl_getenv.html), as that is what it is documented to do if it doesn't find the requested environment variable. Arguably the NULL environment variable is not found, so this feels likely to be acceptable. Though I could see a counterargument: you now assume the environment variable you were actually looking for doesn't exist, when you never actually asked for one, so your logic is broken and maybe you introduce a different class of security bug because you change your behaviour based on some environment variable not existing.

As always everything is a trade-off...
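One way to picture the "fail gracefully" option in C is a status return that keeps "you passed NULL" apart from "the variable is not set". This is only a sketch: `env_get` and the `env_status` values are made-up names, not part of libcurl or any real API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status codes; the point is that a bad argument and a
 * missing variable are reported differently. */
enum env_status { ENV_OK = 0, ENV_BAD_ARG, ENV_NOT_SET };

static enum env_status env_get(const char *name, const char **value_out)
{
    if (name == NULL || value_out == NULL)
        return ENV_BAD_ARG;          /* caller broke the contract */

    const char *value = getenv(name);
    if (value == NULL)
        return ENV_NOT_SET;          /* legitimate "not found" */

    *value_out = value;
    return ENV_OK;
}

/* Example caller: the outcomes stay distinguishable, and nothing crashes. */
static void env_get_example(void)
{
    const char *value = NULL;
    assert(env_get(NULL, &value) == ENV_BAD_ARG);
    assert(env_get("GR_SURELY_UNSET_VARIABLE_12345", &value) == ENV_NOT_SET);
}
```

The trade-off is the one described above: the signature gets more complicated, and every caller now has to look at the status.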

josephg•

11 days ago

Returning to the context of this post, this is one of the things I really like about rust. (And zig, haskell, typescript, swift and others). These languages make invalid states impossible to represent. If my function takes a value of type T (or &T), you can't accidentally receive NULL. So you just don't need to worry about this stuff any more. The compiler simply won't compile the program if type checking fails. At runtime, I only have to consider valid values.

pjmlp•

10 days ago

Zig still doesn't have a way to represent that a pointer to a heap allocated region is no longer valid.

pjmlp•

10 days ago

Crashing a program is always a much better alternative than behaviours that silently lead to memory corruption, which has much more severe outcomes than a crash.

Ah, but what about high-integrity computing? Well, there neither crashes nor memory corruption are welcome, hence programming guidelines and certification workflows that would make most C devs cry over the language features they are allowed to use, and over how each line of code gets analysed by tools and humans.

11 days ago

Yes, but null pointers are so pervasive in C code that we really can't afford to put assertions everywhere. It's often better to let the app crash on violations.

11 days ago

An assertion is an app crashing on a violation. The problem is when it's not guaranteed to crash, and instead does something very wrong.

11 days ago

A bug is a bug even when it doesn't clearly manifest itself 100% of the time, and furthermore it is pretty much guaranteed that NULL dereference crashes with segfault in practice, only not for the people playing theoretic games whose essence of life is finding gotchas where it maybe isn't so and then feeling smarter than everyone else.

But it's >> 99.9% true that this will just crash even though it's acshually UB, nasal demons and so forth. Now raise this << 0.1% likelihood that it isn't true on some system with some compiler and build flags, to the power of the number of distinct deployed configurations out there, and you get the result which is the correct engineering decision of just moving on instead of spending your life filling straightforward code with pointless boilerplate assertions.

NB it can make sense to assert nonnull when the condition won't be tested on all code paths or the intention is otherwise not super obvious.

lmm•

11 days ago

> it's >> 99.9% true that this will just crash even though it's acshually UB, nasal demons and so forth.

Is it though? Linux saw enough bugs from that kind of issue that they now build with -fno-delete-null-pointer-checks and accept the (supposed) performance penalty.

uecker•

11 days ago

The kernel is perhaps a bit special. In the past they had bugs such as first dereferencing and then checking for null, and weird possibilities to map the zero page. But today I am not convinced this is really needed.

In general on a system where you trap when accessing the zero page, this optimization should be safe and a null pointer dereferences should (safely) trap.
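For anyone who hasn't seen the "dereference first, check later" pattern, here is a small C sketch. The struct and function names are invented for illustration and this is not the actual kernel code. Because dereferencing NULL is undefined behaviour, an optimizing compiler is allowed to assume the pointer was non-null and drop the later check, which is the optimization that GCC's `-fno-delete-null-pointer-checks` turns off.

```c
#include <assert.h>
#include <stddef.h>

struct device { int flags; };   /* hypothetical type */

/* Risky ordering: p is dereferenced before it is checked, so the
 * optimizer may treat the check as dead code and remove it. */
static int flags_check_after_use(struct device *p)
{
    int flags = p->flags;   /* dereference happens first */
    if (p == NULL)          /* may be deleted by the compiler */
        return -1;
    return flags;
}

/* Safer ordering: validate first, dereference second. */
static int flags_check_before_use(struct device *p)
{
    if (p == NULL)
        return -1;
    return p->flags;
}

/* Contract-style variant: treat NULL as a caller bug and stop at once
 * (only in builds where assert() is enabled). */
static int flags_asserting(struct device *p)
{
    assert(p != NULL);
    return p->flags;
}
```

Only the second function has defined behaviour for a NULL argument in every build; the first and third rely on the caller keeping the contract.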

lmm•

10 days ago

> In general on a system where you trap when accessing the zero page, this optimization should be safe and a null pointer dereferences should (safely) trap.

If you mean that C compiler writers "should" prioritise sanity over high scores on microbenchmarks, then I agree. However in practice they do not and this optimization is not remotely safe.

uecker•

10 days ago

Do you have any evidence for this? On GCC it should be safe.

(EDIT: what is not safe is indexing into a null pointer. To be safe there, you need -fsanitize=null)

lmm•

10 days ago

I don't understand your comment - dereferencing a null pointer is unsafe, in the sense that it does not reliably crash but may do other things, as we saw in the kernel case we're talking about. Yes that particular case was only exploitable if you mapped the zero page, but given how all-bets-are-off a situation it created (where extremely experienced programmers thought they knew what the code did, thought it was safe, and were wrong), I would not want to count on all cases not being exploitable without mapping the zero page.

10 days ago

May. If. If. If. In case.

We are talking about an extremely simple straightforward API with an obvious contract. It's good enough for this function to reliably surface almost all wrong uses with a segfault immediately. Wrong use will result in segfaults and otherwise bugs and crashes. The goal is not to work when used wrong but to work when used right. You cannot save the world from scratch in every little function. You still have a job to get done, and you have to move on.

lmm•

10 days ago

> You cannot save the world from scratch in every little function. You still have a job to get done, and you have to move on.

Or you can take all of 10 minutes to put sanity-check assertions at the start of all your public-facing API functions, eliminating a source of security bugs, get on with your life, and worry about the performance implications as and when it becomes a problem (hint: it's never going to become a problem).

10 days ago

You can try and do this if it's a relatively narrow public facing API, but otherwise this is a theoretic ideal. In practice, if you add an assertion for every pointer argument to every little function, you'll go insane, and it is completely pointless, and the code will not be readable anymore.

There are so many other interesting and relevant invariants that are usually in an API contract that are much harder or impossible to check upfront (let alone express formally in a type system), and even violations may be impossible to diagnose when they happen.

People focus on NULL because that's the only way they can apply their silly limited type systems. But NULL checks give very little return for investment. In practice, you'll see templated Option<T> types and whatnot, and when I have to look at or even work with such code I want to kill myself because it's so painful.

lmm•

10 days ago

No, people focus on a handful of things like null, buffer overrun, and use-after-free because they still make up the majority of security vulnerabilities that we see exploited in the wild. You may imagine that subtle logic errors are more common, but the data doesn't bear that out; also FWIW I've never seen one of these detailed invariants be impossible to express in a type system if you spend 5 minutes actually trying.

9 days ago

Typical invariant for me would look like:

Given a, b, c input parameters to my func, it must hold that a->m->t == b->t. c->mutex must be held, and c->cond is the condition variable that goes with c->mutex and will release any waiters on the buffer contained in a.

Or: Integer x is representable using 12 bits only, integer y should be a multiple of N, and I have an integer s, used as a bit-shift, that should be less than 8.

Or: I need to guarantee that no locks have to be taken and no allocations have to be made on this complicated looking codepath. While holding a lock, we must not do any syscalls (syscall a, b, c are ok though), and surely not make any logging calls.

I know only one system that can express this, it's called STRAIGHTFORWARD CODE, and it requires doing engineering and casual logic out-of-band, and yes it does include making mistakes and repairing them incrementally.

I don't know a type system that would let me explain these things to me and tell me where I was wrong. But maybe you can show me, with 5 minutes of actually trying?

lmm•

9 days ago

> it must hold that that a->m->t == b->t.

So define a wrapper type that represents that invariant (it's not going to take up space at runtime), where the only constructor enforces it?

> Integer x is representable using 12 bits only, Integer y should be a multiple of N and I have a integer s is used as a bit-shift that should be less than 8.

Those are all standard things that already exist?

> Or: I need to guarantee that no locks have to be taken and no allocations have to be made on this complicated looking codepath. While holding a lock, we must not do any syscalls (syscall a, b, c are ok though), and surely not make any logging calls.

Sounds like a pretty standard free monad case? Define a command algebra in which the "ok" syscalls are a subtype, and then require that the thing you want to only use the ok calls to have a type that reflects that?
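To make the "wrapper type whose only constructor enforces the invariant" idea concrete, here is a rough C sketch covering the 12-bit value, the shift below 8, and the a->m->t == b->t case. Every name in it (`u12`, `shift8`, `matched_pair`, the struct layouts) is hypothetical. Plain C cannot stop a caller from filling in the structs by hand, so this is a runtime check at one boundary, not the compile-time guarantee a stricter type system would give; the multiple-of-N case would follow the same pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration only. */
struct target { int id; };
struct middle { struct target *t; };
struct a_obj  { struct middle *m; };
struct b_obj  { struct target *t; };

struct u12          { uint16_t value; };   /* meant to hold 0..4095 */
struct shift8       { uint8_t value; };    /* meant to hold 0..7 */
struct matched_pair { struct a_obj *a; struct b_obj *b; };

/* Each "constructor" is the one place where its invariant is checked. */
static bool u12_make(uint32_t raw, struct u12 *out)
{
    if (out == NULL || raw > 0xFFFu)
        return false;
    out->value = (uint16_t)raw;
    return true;
}

static bool shift8_make(uint32_t raw, struct shift8 *out)
{
    if (out == NULL || raw >= 8u)
        return false;
    out->value = (uint8_t)raw;
    return true;
}

static bool matched_pair_make(struct a_obj *a, struct b_obj *b,
                              struct matched_pair *out)
{
    if (a == NULL || b == NULL || out == NULL || a->m == NULL)
        return false;
    if (a->m->t == NULL || a->m->t != b->t)
        return false;
    out->a = a;
    out->b = b;
    return true;
}

/* Consumers take the wrapper types, so they do not re-validate raw input.
 * The asserts are a debug-build backstop against hand-built structs. */
static uint32_t shift_left(struct u12 x, struct shift8 s)
{
    assert(x.value <= 0xFFFu && s.value < 8u);
    return (uint32_t)x.value << s.value;
}

static int shared_target_id(const struct matched_pair *p)
{
    assert(p != NULL && p->a->m->t == p->b->t);
    return p->b->t->id;
}
```

The mutex, allocation, and syscall constraints from the earlier comment are not covered by this sketch.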

9 days ago

Please, go ahead and type the example. I think you are trolling.

> Define a command algebra in which the "ok" syscalls are a subtype

Dude, it's clear you're not doing any actual work. You are living in an ivory tower, and you underestimate the complexity and detail and volatility of real world applications by at least 3 orders of magnitude. You don't understand how to modularize and contain complexity.

You _cannot_ complete a project with this attitude.

You are ignorant of the fact that a type system is necessarily a blunt simplification of the real complexity. Therefore, use of types must be pragmatic, and actual logic must be coded in normal code (which should be obvious but it isn't to type theory weirdos). Otherwise, you require dependent typing or whatever, and you will have to write your code twice, once in a usable programming language and once in a very unusable programming language. Much more than only twice actually, given that all the implicit detail should apparently be explicitly formalized at the type level.

Just to make sure I'm not entirely talking out of my arse because I'm so incredibly annoyed by your otherworldly proposition, I asked an AI about the seL4 microkernel. It consists of 10,000 lines of C code (that says a lot about its practical utility, which is very limited), and of 1.3 million lines of manually written proof code (which says a lot about the practicality of proving).

10 days ago

It takes a lot longer to figure out if it'll be a problem than to just add the check. And you don't have to ponder whether it's possible for a null to get there, because now it's fine if it does.

10 days ago

Are you talking about extending the API contract to allow for NULL? That is often the path to madness, especially if it requires complicating the signature (return value etc). Better to just assert/crash.

10 days ago

No. I'm talking about adding the check to reject NULL. Then you don't have to spend time justifying or figuring out why a NULL can't turn up here.

10 days ago

So reject as in assert? But how does that go together with what you said, "because now it's fine if it does"?

10 days ago

Because no one is expecting it to work if a null is passed. Your total range of behaviours left are crashes, doesn't crash and is silently ok, or doesn't crash and causes something worse (data corruption, you get your product in a CVE, that area).

My proposition is that "it's silently ok" isn't likely enough, which is in line with your position on "don't extend the contract to accept null". So what's left is crash, or something worse.

So if those are your choices, don't waste time justifying that a null can't get there, just add a check to ensure you get the better behaviour. It takes seconds.

9 days ago

If you follow that line of reasoning, you will end up testing almost every pointer before accessing it. The reason is that you are extending your valid state space massively since you aren't able to specify "this subset of 7 trillion distinct states is invalid, if it was the case we would have failed before".

You are requiring yourself to find a valid outcome for an input that doesn't make _any_ sense in the context of what your application is meant to achieve. How is that not a Sisyphean task?

9 days ago

You're not "extending" the valid state space. That null value being passed to that function is already a potential state of your program.

You're actually pruning the valid state space; before, when the null value is passed to the function, there are more operations performed that have uncertain consequences. If you assert-and-fail when you get the null input, you've pruned those states.

9 days ago

So if I understand correctly now, you _do_ propose putting in asserts, not writing code that somehow copes with the "possibility" of NULL.

"Because no one is expecting it to work if a null is passed", so you can do whatever. If you write an assert for every pointer passed to every function, that will be a lot of asserts, for pretty much the same outcome in practice. Asserts are just marginally more ergonomic when they trigger, but are a nuisance in the code often. So my position is to use them judiciously, but not overdo it, be instead focused on the actual task.

When the lack of non-null assertions is an actual problem during development, you have much larger structural issues.

9 days ago

Yes. The work to assert each pointer passed into a function isn't "high"; it's purely mechanical, it could almost be refactored automatically. But most of all, the effort required to prove you don't need to is _way_ higher.

Dylan16807•

11 days ago

I don't want to nitpick people often but your use of division sign to mean percent is really throwing me off.

11 days ago

Thanks for letting me know, nitpick appreciated. Typing on my phone.

asveikau•

11 days ago

An assert is not guaranteed to terminate the process. In C, the most common implementation choice is to completely omit the check if you're not building in debug mode.

uecker•

11 days ago

You need to turn it off by defining NDEBUG. While that is sometimes done for release builds, I am not sure it is common.

asveikau•

10 days ago

Visual Studio defaults to defining NDEBUG in release mode, and I think that default was pretty influential
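A small C sketch of the difference being discussed: the standard assert() disappears when NDEBUG is defined, while a hand-rolled check does not. The `REQUIRE` macro and `checked_length` are hypothetical names, not from any particular library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical always-on check: unlike assert(), it is not removed
 * when NDEBUG is defined. */
#define REQUIRE(cond)                                               \
    do {                                                            \
        if (!(cond)) {                                              \
            fprintf(stderr, "%s:%d: requirement failed: %s\n",      \
                    __FILE__, __LINE__, #cond);                     \
            abort();                                                \
        }                                                           \
    } while (0)

static size_t checked_length(const char *s)
{
    assert(s != NULL);    /* compiled out when NDEBUG is defined */
    REQUIRE(s != NULL);   /* still enforced in release builds */
    return strlen(s);
}
```

Whether a release build should abort at all is the trade-off argued about earlier in the thread; the sketch only shows that the choice is not tied to NDEBUG.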

•on: If you are asking for human attention, demonstrate...

14 days ago

When I tell my coworkers to stop using AI to dress up their words, it's not because I care about human effort. The problem is that my coworkers often start with incorrect assumptions, and AI is good at amplifying bad assumptions and making them sound plausible. I have to spend extra time guessing at what the author originally wrote and then address the partly-hidden original points rather than what the AI generated. Give me your spelling errors, your grammar, your mumbles, your incoherent streams of thought, your doubt and uncertainty. Those things are extremely important, yet your robot obscures them.

Strangely, I've also observed that some customers respond very well to words dressed up by AI, even if the words oversimplify the truth. Now I'm working to understand why they want that. Are my customers not swimming in AI slop like the rest of us?

BTW, this doesn't mean I'm anti-AI. AI coding is an incredible superpower and I use it constantly, but it seems to me that AI coding works because code expresses the minutiae that is rightfully omitted from most other communication.

•on: Azure Linux 4.0 is Microsoft's first general-purpo...

21 days ago

Sorry to break it to you, but on that timeline, the good things got poisoned. IBM enhanced Lisp with Enterprise Ready features like Spreadsheet Macro Builder, Microsoft took over development of Smalltalk and morphed it into BASIC 2.0, and the HURD community lost a bizarre copyright lawsuit. Fortunately for those folks, an intrepid hacker in the 90s saw some of the interesting ideas in MS-DOS and rebuilt it as LS-DOS. Today, most of their servers and mobile phones run LS-DOS or similar.

__patchbit__•

21 days ago

LSD-OS would be an AI core unsupported by runtime and operating system that cascades streams of consciousness in a portable cartridge smartphone form factor until mounted on an embodiment to become unified and coherent.

b33j0r•

21 days ago

Ah. A common (and understandable) misconception. LSD-OS doesn’t enhance anything in the UX, it just removes the filters that prevent you from seeing reality, man.

Some confuse this with LDS-OS, which makes the user weirdly and unquestionably `nice` by only accepting inputs from protected mode.

•on: Bun support is now limited and deprecated

35 days ago

Your HN account is too new for me to be sure whether you're being sarcastic or not. Perhaps you know, or perhaps you don't, that all code is machine-translated, even assembly language. None of it is perfect, but it's not garbage. Today's AI merely provides a new level. It's a weird, non-deterministic level, but hiring an employee to write code for you is similarly non-deterministic.

fdsajfkldsfklds•

35 days ago

Right, and that's why Mel was a true programmer!

Seriously though, that's an overly-pedantic definition of a compiler. Broadly speaking, languages compile in a direction of decreasing abstraction. Crossing from one high-level abstraction to another is just asking for trouble, especially in this case where the target language makes very specific performance promises as long as certain abstractions are maintained.

•on: Medicare's new payment model is built for AI. Most...

44 days ago

These are also the markers of human journalists who write daily. Journalism is the reason AI acquired these habits. Gemini says this article is probably not generated by AI, particularly because it has original quotes.

https://gemini.google.com/share/ba48849a15a9

ameliaquining•

44 days ago

Personally I wouldn't cite Gemini for this because I have no idea if it has any kind of track record of accurately distinguishing human from AI writing.

That said, Pangram agrees and its track record is pretty good.

bonsai_spool•

44 days ago

> particularly because it has original quotes.

I'm not saying the quotes are fake, that would be horrific. I'm saying the rest of the article appears to have had minimal human intervention.

jvanderbot•

44 days ago

At some point, however distasteful to the naturalists, do we accept that writing with AI is still writing? There will be an arms race the way there was moving from banner ads -> whatever hellscape we have today ...

asdff•

42 days ago

It's the same as copying and pasting the wikipedia article and calling that your article. We all can generate our own slop if we want. If all you are peddling is slop, you are peddling nothing I can't get myself.

ameliaquining•

44 days ago

Then why did you point to the em-dash in the quote as evidence of AI authorship?

yen223•

44 days ago

LLMs did not invent clickbaity headlines. Kinda odd that people think they did

•on: Why senior developers fail to communicate their ex...

45 days ago

Isn't that interesting? The job of exploring a theory or model to such an extent that it can be expressed in computer code always seems to fall on the shoulders of a software developer. Other people can write specifications and requirements all day long, but until a software developer has tackled the problem, the theory probably hasn't been explored well enough yet to express clearly in computer code. It feels like software developers are scientists who study their customers' knowledge domains.

Twisol•

45 days ago

> It feels like software developers are scientists who study their customers' knowledge domains.

I agree so much with this. It's why I feel so stifled when an e.g. product manager tries to insulate and isolate me from the people who I'm trying to serve -- you (or a collective of yous) need to have access to both expertise in the domain you're serving, and expertise in the method of service, in order to develop an appropriate and satisfactory solution. Unnecessary games of telephone make it much harder for anyone to build an internal theory of the domain, which is absolutely essential for applying your engineering skills appropriately.

Terr_•

45 days ago

> so stifled when an e.g. product manager

Another facet of this is my annoyance at other developers when they are persistently incurious about the domain. (Thankfully, this has not been too common.)

I don't just mean when there are tight deadlines, or there's a customer-from-heck who insists they always know best, but as their default mode of operation. I imagine it's like a gardener who cares only about the catalogue of tools, and just wants the bare-minimum knowledge to deal with any particular set of green thingies in the dirt.

LandR•

44 days ago

This is why at my current place we are not supposed to do any dev without an SME on the call. We do the development and share the screen and get immediate feedback as we are working in real time! It's great.

eithed•

44 days ago

This might be an indicator that the PM isn't doing their job; the PM should be able to answer your questions regarding what the business wants (= the people who you're trying to serve). Developers, by the nature of interacting with the domain, do become experts in the domain, but really it should be up to the PM what the domain should be doing business-wise.

Jensson•

44 days ago

If that is what a PM needs, then there aren't enough good PMs to warrant a PM role for most products, so just make software engineers do that in most cases.

Edit: The main role of a PM is to decide which features to build, not how those features should be built or how they should work. Someone has to decide what to build, and that is the PM, but most PMs are not very good at figuring out the best way for those features to work, so it's better if the programmers can talk to users directly there. Of course a PM could do that work if they are skilled at it, but most PMs won't be.

eithed•

44 days ago

> not [...] how they should work

So that we're on the same page, what I think should be PM responsibilities:

If I have a user story: "As a customer I want to purchase a product so that I can receive it at my address" - PM defines this user story as they have insight to decide if such feature is needed.

PM should then define acceptance criteria: "Given customer is logged in When they view Product page Then 'Add product to basket' button should appear", "Given 'Add product to basket' button When customers click on it Then Product information modal should appear" etc - PM should know what users actually want, ie whether modals should appear, or not; whether this feature should be available for logged-in users only, or not.

How this will work shouldn't matter to PM; these are AC they've defined.

Of course the process of defining AC should involve developers (and QA), because the AC should be exhaustive for delivering the given feature.

imperfect_blue•

44 days ago

The problem, in my experience, is that most PMs don't add anything when it comes to drawing up the acceptance criteria.

In your example of an order placement, the PM has no special knowledge of what makes a good customer order flow. Developers are usually way better at coming up with those, by dint of experience and technical knowledge of the current codebase, and at making the appropriate speed/polish trade-off.

PMs act as an imperfect proxy for what the customer wants, making judgements off nothing more than their own taste. And though there are many great PMs, the taste of a PM is usually worse than that of developers and designers on average.

IMO the main business reason they exist is for organization accountability and ownership, despite the often negative value they bring.

BobbyTables2•

45 days ago

Agree 100%.

Even the most verbose specifications too often have glaring ambiguities that are only found during implementation (or worse, interoperability testing!)

kstenerud•

45 days ago

In theory, it's the same as in practice.

In practice, it isn't.

tsunamifury•

44 days ago

Sorry this is just the interior trapped nonsense that engineers find themselves in. Please touch grass

Product designers have to intuit the entire world model of the customer. Product managers have to intuit the business model that bridges both. And on and on.

Why do engineers constantly have these laughably mind-blowing moments where they think they are the center of the universe?

Paracompact•

44 days ago

I agree so much with the both of you, to the point it's difficult to avoid cognitive dissonance one way or the other.

Software people do what they do better than anyone else. I mean obviously! Just listening to a non-software person discuss software is embarrassing. As it should be.

There's something close to mathematics that SWEs do, and yet it's so much more useful and economically relevant than mathematics, and I believe that's the bulk of how the "center of the universe" mindset develops. But they don't care that they're outclassed by mathematicians in matters of abstract reasoning, because they're doers and builders, and they don't care that they're outclassed by people in effective but less intellectual careers, because they're decoding the fundamental invariants of the universe.

I don't know. I guess I care so much because I can feel myself infected by the same arrogance when I finally succeed in getting my silicon golems to carry out my whims. It's exhilarating.

0xpgm•

44 days ago

We keep seeing things like cryptic error messages shown to end users simply because of the disconnect between the programmer and the end user.

If the programmer gets to intimately understand the user's experience software would be easier to use. That's why I support the idea of engineers taking support calls on rotation to understand the user.

Both can be true at the same time, a product manager who retains the big picture of the business and product, and engineers who understand tiny but important details of how the product is being used.

If there were indeed perfect product managers, there would be no need for product support.

tonyedgecombe•

44 days ago

>We keep seeing things like cryptic error messages shown to end users simply because of the disconnect between the programmer and the end user.

A lot of the error messages I'd write were for me, especially those errors I never expected to see.

The typical feedback I'd get from end users is "your software doesn't work". If they can send me a screenshot of the error I'm halfway to solving the problem.

44 days ago

I actually agree with this. Product designers and product managers are often essential and sometimes they do up to 99% of the work of figuring out how something should work. To accomplish that, they often do things well outside the role of a software developer. On the other hand, in my experience, only someone with a software development mindset seems to be able to complete the last 1% (or 10%, or whatever) that reveals and resolves certain kinds of logic issues.

necovek•

44 days ago

You seem to be assuming a certain org structure with very clear, specialized roles. Many teams do not have this, and engineers are already Product Engineers. It sometimes even makes sense (whenever engineers dogfood their product, startups, or if it is a product targeting other engineers) and is not just a budget/capacity issue.

Similarly, by siloing the world model in one or two heads, you disable the team dynamics from contributing to building a better solution: eg. a product manager/designer might think the right solution is an "offline mode" for a privacy need without communicating the need, the engineering might decide to build it with an eventual consistency model — sync-when-reconnected — as that might be easier in the incumbent architecture, and the whole privacy angle goes out the window. As with everything, assuming non-perfection from anyone leads to better outcomes.

Finally, many software engineers are the creative type who like solving customer problems in innovative ways, and taking that away in a very specialized org actually demotivates them. Many have worked in environments where this was not just accepted, but appreciated, and I've seen it lead to better products built _faster_.

•on: DNSSEC disruption affecting .de domains – Resolved

52 days ago

Is that actually true, though? Even though it's not really my job, I find myself debugging certificates and keys at least once a month, and that's after automating as much as possible with certbot and cloud certificates. PKI always seems to demand attention.

walrus01•

52 days ago

In my initial comment, I meant more in terms of complexity and planning from the perspective of the people who are running the public/private key infrastructure on the other side/upstream of what you're doing as a letsencrypt end user.

Broadly similar general concept to the team responsible for the DNSSEC signing keys for an entire ccTLD.

Yeah, an X.509 PKI / root CA is a very different thing than DNSSEC, but they have a number of general logical similarities in that the chain of trust ultimately comes down to a "do not fuck this up" single point of failure.

•on: Easyduino: Open Source PCB Devboards for KiCad

60 days ago

This is an amazing resource. It was difficult to appreciate what this resource was for until I tried to create my own boards based on an ESP32. It's not really difficult to build around ESP32, it's just that I don't know what I don't know. With starting points like these, I can start with a lot more confidence. Thank you!

hattmall•

60 days ago

Does this help you build a custom PCB that you would send to a factory or like just design and simulate something you could build on your own? Or both / neither? I'm not fully understanding what this project does, could you offer insight?

numpad0•

60 days ago

This is File -> New Project... -> New Hello World Project. The New Project button in hardware engineering tools often doesn't have the trailing 3 dots.

I think most low-end projects done in KiCad are not tested beyond making sure there's no red squiggly underlines at a glance. You are your own F5 key and assembler/runtime crash reporter. Proper circuit verification through software simulation isn't needed for most digital designs unless you do your own wireless antenna, analog amps, and/or DRAM/PCIe/GbE/etc.

ua709•

59 days ago

Your analogy is more spot on than you may know. The syntax is just a bit off ;)

"File > New Project from Template"

KiCAD comes with all the usual suspects, including Arduino and the various hats. You can get pmod templates, etc. They're actually really nice.

I use the pmod template all the time because it saves time and they're convenient to plug into Arty dev boards. PCBs are so cheap and quick I'll often make a quick PCB with a template because I just want a cleaner connector system. PCBs are basically bread boards these days.

https://techexplorations.com/guides/kicad/3e/create-a-new-ki...

https://gitlab.com/kicad/libraries/kicad-templates

kreelman•

60 days ago

I like the "File -> New Project" analogy.

I guess in theory, the original question is whether this project allows a board to be sent off for construction at a company that makes and populates boards. Yes, you could do this if you wanted to. As numpad0 has said though, it's early days for these boards and if you wanted to do something commercially reliable, you will most likely run into issues with things not being completely tested on these boards yet.

These boards provide the ability to make your own boards to host the chipsets yourself, rather than relying on a third party providing the board. So what? What if you want USB-C? What if you want to make a square or a circular board? This project is a good step along the way to allowing you to make these kinds of things.

On the hobbyist and corporate side, they also provide a way to provide a modern design that can use USB-C, which is becoming very common and is better than older USB options.

As mentioned in the README.md "Available Development Boards" section, the Atmega16u2 chip was hard to come by for Hanqaqa in 2023. The Arduino guys (arduino.com ?) probably did a "lifetime buy" of these comms chips and they probably also have several shelves of fully built Arduino boards as well. Lifetime buys and keeping good stock levels mitigate the risk of difficulty building new boards... Just get one of the older working ones off the shelf and send it. However, for an organisation (even an open source board that becomes fairly popular) wanting to build their own board, not having a given comms chip is a problem. Replacing it with a commonly available one makes it much easier for people/companies wanting to build these boards in any kind of numbers.

Having the board design readily available is really useful for the reasons above. It does seem like overkill if you just want to fiddle with a board, but if you make something that becomes popular that needs any kind of hardware adjustment, having the design becomes almost essential.

coryrc•

60 days ago

https://news.ycombinator.com/item?id=47927171

kreelman•

60 days ago

Copy that!

Wonderful that there's a Free version of these designs out there. The bugs and kinks will get sorted out over time.

•on: Over-editing refers to a model modifying code beyo...

65 days ago

I'm either in a minority or a silent majority. Claude Code surpasses all my expectations. When it makes a mistake like over-editing, I explain the mistake, it fixes it, and I ask it to record what it learned in the relevant project-specific skills. It rarely makes that mistake again. When the skill file gets big, I ask Claude to clean and compact it. It does a great job.

It doesn't really make sense economically for me to write software for work anymore. I'm a teacher, architect, and infrastructure maintainer now. I hand over most development to my experienced team of Claude sessions. I review everything, but so does Claude (because Claude writes thorough tests also.) It has no problem handling a large project these days.

I don't mean for this post to be an ad for Claude. (Who knows what Anthropic will do to Claude tomorrow?) I intend for this post to be a question: what am I doing that makes Claude profoundly effective?

Also, I'm never running out of tokens anymore. I really only use the Opus model and I find it very efficient with tokens. Just last week I landed over 150 non-trivial commits, all with Claude's help, and used only 1/3 of the tokens allotted for the week. The most commits I could do before Claude was 25-30 per week.

(Gosh, it's hard to write that without coming across as an ad for Anthropic. Sorry.)

Swizec•

65 days ago

> I'm either in a minority or a silent majority. Claude Code surpasses all my expectations.

I looked at some stats yesterday and was surprised to learn Cursor AI now writes 97% of my code at work. Mostly through cloud agents (watching it work is too distracting for me)

My approach is very simple: Just Talk To It

People way overthink this stuff. It works pretty good. Sharing .md files and hyperfocusing on various orchestrations and prompt hacks of the week feels as interesting as going deep on vim shortcuts and IDE skins.

Just ask for what you want, be clear, give good feedback. That’s it

TranquilMarmot•

64 days ago

Right - I have a ton of coworkers who obsess over "skills" and different ways to run agents and whatnot but I just... spend some time to give very thorough, detailed instructions and it just Does The Thing. I rarely fight with Claude Code these days.

wongarsu•

64 days ago

We probably need something like the WET principle for skills. If you need to explain the same thing to an agent more than twice, turn it into a skill (or add it to AGENTS.md, or CLAUDE.md, or to your docs folder, or your guides folder, or whatever method you use). If you haven't needed to explain it more than twice, it's probably fine. The context pollution from the skill would likely be worse than not having the skill

Of course exceptions apply. Some basic information that will reliably be discovered is still worth adding to your AGENTS.md to cut down on token use. But after a couple obvious things you quickly get into the realm of premature optimization (unless you actually measure the effects)

ianhxu•

64 days ago

Same here. For me, this means a spec doc split into features/UX, technical requirements, and language-specific requirements, iterated before the model touches code.

theshrike79•

64 days ago

The trick is to "just use it", BUT every few weeks grab the logs (you do keep them, right?) and have a session with the model to find out if there are any repeated patterns.

If you find any, consider making them into skills or /commands or maybe even add them to AGENTS.md.

freedomben•

64 days ago

Which logs do you use for that?

wongarsu•

64 days ago

I would assume those in ~/.claude/projects/**/*.jsonl. They contain full conversation history, including the tool calls that were made, how many tokens were consumed, etc
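A minimal sketch of that kind of log mining, assuming each line of a session file is one JSON object and that user messages show up as records with `"type": "user"` and a plain-text `"content"` field. Those field names are assumptions for illustration, not a documented schema, so adjust them to whatever your logs actually contain. The default threshold of 3 mirrors the "more than twice" rule of thumb above.

```python
import json
from collections import Counter
from pathlib import Path


def repeated_prompts(lines, min_count=3):
    """Return (prompt, count) pairs for user prompts seen at least min_count times.

    Assumes (hypothetically) records shaped like
    {"type": "user", "content": "some text"}.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or corrupt lines
        if not isinstance(record, dict) or record.get("type") != "user":
            continue
        content = record.get("content")
        if isinstance(content, str):
            # Normalise case and whitespace so near-identical prompts group together.
            counts[" ".join(content.lower().split())] += 1
    return [(text, n) for text, n in counts.most_common() if n >= min_count]


def scan_logs(root):
    """Yield every line from every .jsonl file under root."""
    for path in Path(root).expanduser().rglob("*.jsonl"):
        with path.open(encoding="utf-8", errors="replace") as handle:
            yield from handle
```

Something like `repeated_prompts(scan_logs("~/.claude/projects"))` would then list the instructions you keep retyping, which are the candidates for a skill or an AGENTS.md entry.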

theshrike79•

64 days ago

Claude has a built-in /insights feature for this, but you can replicate it with any other tool that keeps the session logs on disk.

shinycode•

64 days ago

I agree it works nicely for me. From my experience it’s not realistic to expect one-shot each time. But asking it to build chunks and entering a review cycle with nudging works well. Once I changed my mindset from « it didn’t do a one-shot so it’s crap » and took it as an iterative tool that builds pieces that I assemble, it’s been working nicely without external frameworks or anything. Plan-review, iterate, split, build, review, iterate

phito•

64 days ago

You're wasting a ton of tokens doing that though. Right now you don't realize it because they're being heavily subsidized, but you will understand the point of having good orchestration and memory files when you have to pay the real cost of your use.

mirekrusin•

64 days ago

Cost cannot go up, only down with time (with occasional short term fluctuations). Competition, including open weight models and consumer hardware (ie upcoming M5 Ultra) keeps moving the ceiling of what you can charge down.

terseus•

64 days ago

If the cost is subsidized by another cash source (e.g. VC money) when the source stops prices can definitely go up.

pirates•

64 days ago

Company pays for company’s tokens, so company’s problem, not mine. I am happy to skill up and avoid overusing tokens for my personal sub, but if it’s getting results then I couldn’t care less how much my employer has to pay for it. They’re begging me to use it in the first place anyway.

Swizec•

64 days ago

> You're wasting a ton of tokens doing that though.

My time is worth more than tokens. I’m thinking of maybe creating some .md files to save me time in code review. If I do it right, it’s going to cost more in tokens because the robots will do more.

Foobar8568•

64 days ago

My experience as well on non trivial stuff for personal projects, just talk... It makes mistakes but considering the code I see in professional settings, I'd rather deal with an agent than third parties.

[0]: https://blog.glyph.im/2025/08/futzing-fraction.html [1]: https://bcantrill.dtrace.org/2026/04/12/the-peril-of-lazines...

65 days ago

I love the IDE skins analogy. Very true.

acessoproibido•

65 days ago

Everyone knows that a red UI skin goes faster

WatchDog•

65 days ago

How do you collect these stats?

Is it by characters human typed vs AI generated, or by commit or something?

Swizec•

65 days ago

> How do you collect these stats?

Cursor dashboard. I know they're incentivized to over-estimate but feels directionally accurate when I look at recent PRs.

anabis•

65 days ago

Are you mostly using the Composer model?

Swizec•

65 days ago

> Are you mostly using the Composer model?

Don’t really think about it. I think when I talk to it through Slack, Cursor uses Codex; in my IDE it looks like it’s whatever the highest Claude is. In GitHub comments, who even knows

Calavar•

64 days ago

It's interesting how variable people's experiences seem to be.

Personally, I tend to get crap quality code out of Claude. Very branchy. Very un-DRY. Consistently fails to understand the conventions of my codebase (e.g. keeps hallucinating that my arena allocator zero initializes memory - it does not). And sometimes after a context compaction it goes haywire and starts creating new regressions everywhere. And while you can prompt to fix these things, it can take an entire afternoon of whack-a-mole prompting to fix the fallout of one bad initial run. I've also tried dumping lessons into a project specific skill file, which sometimes helps, but also sometimes hurts - the skill file can turn into a footgun if it gets out of sync with an evolving codebase.

In terms of limits, I usually find myself hitting the rate limit after two or three requests. On bad days, only one. This has made Claude borderline unusable over the past couple weeks, so I've started hand coding again and using Claude as a code search and debugging tool rather than a code generator.

kowbell•

64 days ago

> In terms of limits, I usually find myself hitting the rate limit after two or three requests.

I'd absolutely love to see exactly what you're doing (...well, maybe in a world where I had unlimited time or could clone myself...) because as tight as the usage limits are I absolutely cannot fathom hitting them THAT early.

What are the requests like, and have you noticed what is Claude doing during them? Is it reading an entire massive codebase or files that are thousands of lines long? Or are you loaded up with many MCPs or have an ever-growing CLAUDE.md?

Calavar•

63 days ago

I'm writing a compiler. When I have Claude write a new feature, I have it validate that feature against a test suite of ~200 tiny programs.

I have a shell script that automates this. If all tests pass, the shell script prints "200/200 passing" with very little token spend. If only 190/200 pass, the shell script reports the names of every test that failed, and now Claude does a process of

1) run the compiler binary -> 2) get assembly output and inspect for obvious errors -> 3) assemble -> 4) verify that the assembler did not report errors -> 5) run test binary, connect with gdb, and find the issue -> 6) edit the compiler source -> 7) recompile the compiler -> 8) back to 1

multiplied by 10 for the 10 failing tests. This eats up tokens very quickly. I realize that not every use case is going to look like this. But if I didn't have Claude verify against the test suite, then I'd be getting regressions left and right, and then what's the point?
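For anyone wanting the same quiet-on-success behaviour, here is a rough sketch in Python rather than shell. It is not the actual script described above; the function names and the pass criterion (exit status 0) are assumptions for illustration. The idea is only that a fully passing run costs one short line of output, and a failing run lists just the failing test names.

```python
import subprocess
import sys


def run_suite(commands):
    """Run each named command; return the names of the ones that failed.

    commands maps a test name to an argv list. A test counts as passing
    when its process exits with status 0 (an assumption of this sketch).
    """
    failed = []
    for name, argv in commands.items():
        # Capture output so a passing test prints nothing at all.
        result = subprocess.run(argv, capture_output=True)
        if result.returncode != 0:
            failed.append(name)
    return failed


def summarize(total, failed):
    """One short line when everything passes, plus failing names otherwise."""
    passed = total - len(failed)
    lines = [f"{passed}/{total} passing"]
    lines.extend(f"FAIL {name}" for name in failed)
    return "\n".join(lines)


if __name__ == "__main__":
    # Two stand-in "tests": one exits 0, one exits 1.
    demo = {
        "ok": [sys.executable, "-c", "pass"],
        "bad": [sys.executable, "-c", "import sys; sys.exit(1)"],
    }
    print(summarize(len(demo), run_suite(demo)))
```

Keeping the failure report down to names (rather than full logs) leaves it to the agent to decide which failing test to dig into first, which is where the token spend described above comes from.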

The whole codebase (tests included) is less than 15k lines, so I don't think that's the issue. No MCPs. CLAUDE.md about 1.5k lines.

jampekka•

64 days ago

> Very branchy. Very un-DRY.

I've found this can be vastly reduced with AGENTS.md instructions, at least with codex/gpt-5.4.

Calavar•

64 days ago

What sorts of instructions?

jampekka•

64 days ago

Usually I just put something like "Prefer DRY code". I like to keep my AGENTS.md DRY too :)

Lionga•

64 days ago

also add "no hallucinations" and "make it works this time pretty please" while also say Claude will go to jail if does not do it right should work all the time (so like 60%)

jampekka•

64 days ago

There are of course limits to what prompting can do, but it does steer the models.

In TFA they found that prompting mitigates over-editing up to about 10 percentage points.

alsetmusic•

64 days ago

Similar to the observation (by simonw) that they respond reasonably to "TDD: Red => Green"

I've used that ever since. Works most of the time, but other stuff is often failing and I've learned to become distrustful of an agent very quickly. One mistake where I point it out and the agent corrects itself is fine if it keeps working well after. A second mistake when it's trying to fix the first one or an inability to understand or a claim that it fixed it but it didn't is instant termination (after dumping context for the next agent).

maxbond•

65 days ago

When I see people talking about Claude Code becoming "unusable" for them recently, I believe them, but I don't understand. It's a deeply flawed and buggy piece of software but it's very effective. One of the strangest things about AI to me is that everyone seems to have a radically different experience.

kaoD•

64 days ago

> everyone seems to have a radically different experience

What people have is radically different expectations.

I noticed engineers will review Claude's output and go "holy crap that's junior-level code". Coders will just commit because looking at the code is a waste of time. Move fast, break things, disrupt, drown yourself into tech debt: the investors won't care anyways.

And no, telling the agent to "be less shit" doesn't work. I have to painstakingly point every single shit architectural decision so Claude can even see and fix it. "Git gud" didn't work for people and doesn't work for LLMs.

It's not that the code isn't DRY, it's just DRY at the wrong points of abstraction, which is even worse than not being DRY. I manage to find better patterns in each and every single task I tell Claude or Copilot to autonomously work on, dropping tons of code in the process (DRY or not). You can't prompt Claude out of making these wrong decisions (at best from very basic mistakes) since they are too granular to even extract a rule.

This is what separates a senior from a junior.

If you think Claude writes good code either you're very lucky, I'm very bad at prompting, or your standards are too low.

Don't get me wrong. I love Claude Code, but it's just a tool in my belt, not an autonomous engineer. Seeing all these "Claude wrote 97% of my code" makes me shudder at the amount of crap I will have to maintain 5 years down the line.

htfu•

64 days ago

You have to tell it both what and how. That way it's decidedly less shit. Still needs tons of passes just keeping things somewhat coherent, but it mostly works.

roncesvalles•

64 days ago

>One of the strangest things about AI to me is that everyone seems to have a radically different experience.

I've thought about this and I think the reason is as follows: we hold code written by ourselves to a much higher standard than code written by somebody else. If you think of AI code as your own code, then it probably won't seem very acceptable because it lacks the beauty (partly subjective as all beauty tends to be) that we put into our own code. If you think of it as a coworker's code, then it's usually alright i.e. you wouldn't be wildly impressed with that coworker but it would also not be bad enough to raise a stink.

It follows from this that it also depends on how you regard the codebase that you're working on. Do you think of it as a personal masterpiece or is it some mishmash camel by committee as the codebases at work tend to be?

enraged_camel•

65 days ago

I use it through the desktop app, which has a lot of features I appreciate. Today it was implementing a feature. It came across a semi-related bug that wasn’t a stopper but should really be fixed before go live. Instead of tackling it itself or mentioning it at the final summary (where it becomes easy to miss), it triggered a modal inside the Claude app with a description of the issue and two choices: fix in another session or fix in current session. Really good way to preserve context integrity and save tokens!

sroussey•

65 days ago

How do you get CC to connect to your dev container? I have the CC app but it’s kinda useless as I’m not having it barebacking my system, so I’m left with the cli and vs code extension.

gommm•

64 days ago

I just run CC in a VM. It gets full control over the VM. The VM doesn't have access to my internal networks. I share the code repos it works on over virtiofs so it has access to the repos but doesn't have access to my github keys for pushing and pulling.

This means it can do anything in the VM, install dependencies, etc... So far, it managed to bork the VM once (unbootable), I could have spent a bit of time figuring out what happened but I had a script to rebuild the VM so didn't bother. To be entirely fair to claude, the VM runs arch linux which is definitely easier to break than other distros.

thatxliner•

64 days ago

You have to try Codex. My friend's been trying to convert me for months and he was right all along: with Codex you don't need GSD or whatever prompting metaframework. You rarely (I actually haven't needed to do this at all) need to ask it to retry because its implementation is bugged: it literally just works first try.

Maybe that's because the harness maybe (not sure; haven't looked at their source code) has it baked in? Doesn't matter; the point is that it works.

Now, the one thing I heavily dislike is the UI it generates...it doesn't seem to realize that matching UI patterns with the existing codebase is quite important.

PunchyHamster•

64 days ago

> One of the strangest things about AI to me is that everyone seems to have a radically different experience.

Because it is that uneven. Some problems it nails at first go or with very little cosmetic changes.

In others it decides on a solution, hallucinates parts that do not exist, like adding API calls or config options that do not exist, and gets the basics wrong.

Similarly, if you do something that's a somewhat common pattern, it usually nails it. If you do something that subtly differs in a certain way from a common pattern, it will just do the common pattern and you get something wrong.

shimman•

65 days ago

My workflow is to just use LLMs for small context work. Anything that involves multiple files it truly doesn't do better than what I'd expect from a competent dev.

It's bitten me several times at work, and I'd rather not waste any more of my limited time doing the re-prompt -> modify code manually cycle. I'm capable of doing this myself.

It's great for the simple tasks tho, most feature work are simple tasks IMO. They were only "costly" in the sense that it took a while to previously read the code, find appropriate changes, create tests for appropriate changes, etc. LLMs reduce that cycle of work, but that type of work in general isn't the majority of my time at my job.

I've worked at feature factories before, it's hell. I can't imagine how much more hell it has become since the introduction of these tools.

Feature factories treat devs as literal assembly line machines, output is the only thing that matters not quality. Having it mass induced because of these tools is just so shitty to workers.

I fully expect a backlash in the upcoming years.

---

My only Q to the OP of this thread is what kind of teacher they are, because if you teach people anything about software while admitting that you no longer write code because it's not profitable (big LOL at caring about money over people) is just beyond pathetic.

rustyhancock•

64 days ago

I think on HN at least, people enamoured by Claude are the vocal majority.

The view of Claude on HN is extremely positive and nearly every thread will have a highly positive comment "that is not an ad".

I think people are seeing others just irked by the constant stream of what feels like ads and reading it as Claude being somehow disliked.

nicbou•

64 days ago

Same. It's surprisingly good as a labour saving device. It produces code that I would accept without reservations from a coworker. I still read every line and make tweaks, but they're the same tweaks I would ask for in a code review.

I don't measure my productivity, but I see it in the sort of tasks I tackle after years of waiting. It's especially good at tedious tasks like turning 100 markdown files into 5 json files and updating the code that reads them, for example.

rtpg•

65 days ago

Are you writing code that gets reviewed by other people? Were code reviews hard in the past? Do your coworkers care about "code quality" (I mean this in scare quotes because that means different things to different people).

Are you working more on operational stuff or on "long-running product" stuff?

My personal headcanon: this tooling works well when built on simple patterns, and can handle complex work. This tooling has also been not great at coming up with new patterns, and if left unsupervised will totally make up new patterns that are going to go south very quickly. With that lens, I find myself just rewriting what Claude gives me in a good number of cases.

I sometimes race the robot and beat the robot at doing a change. I am "cheating" I guess cuz I know what I want already in many cases and it has to find things first but... I think the futzing fraction[0] is underestimated for some people.

And like in the "perils of laziness lost"[1] essay... I think that sometimes the machine trying too hard just offends my sensibilities. Why are you doing 3 things instead of just doing the one thing!

One might say "but it fixes it after it's corrected"... but I already go through this annoying "no don't do A,B, C just do A, yes just that it's fine" flow when working with coworkers, and it's annoying there too!

"Claude writes thorough tests" is also its own micro-mess here, because while guided test creation works very well for me, giving it any leeway in creativity leads to so many "test that foo + bar == bar + foo" tests. Applying skepticism to the utility of tests is important, because it's part of the feedback loop. And I'm finding lots of the tests to be mainly useful as a way to get all the imports I need in.
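To make the distinction concrete, a small sketch with a made-up function (`total_price` is hypothetical, not from any real codebase). The first test is the "foo + bar == bar + foo" kind: it passes no matter what the function does. The other two would actually fail if the behaviour regressed.

```python
def total_price(quantity, unit_price, discount=0.0):
    """Hypothetical function under test: price after a fractional discount."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return quantity * unit_price * (1.0 - discount)


def test_commutativity_low_value():
    # Passes for any implementation of total_price; it only re-checks
    # that multiplication commutes.
    assert 3 * 5 == 5 * 3


def test_discount_applied():
    # Pins down actual behaviour: 4 items at 10.0 with 25% off.
    assert total_price(4, 10.0, discount=0.25) == 30.0


def test_invalid_discount_rejected():
    # A discount above 100% should be refused rather than silently applied.
    try:
        total_price(1, 10.0, discount=1.5)
    except ValueError:
        return
    raise AssertionError("expected ValueError for discount > 1")
```

A quick filter when reviewing generated tests: ask whether the test could still pass if the function body were deleted or replaced with something wrong. If yes, it is mostly import scaffolding.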

If we have all these machines doing this work for us, in theory average code quality should be able to go up. After all we're more capable! I think a lot of people have been using it in a "well most of the time it hits near the average" way, but depending on how you work there you might drag down your average.

65 days ago

You hinted at an aspect I probably haven't considered enough: The code I'm working on already has many well-established, clean patterns and nearly all of Claude's work builds on those patterns. I would probably have a very different experience otherwise.

rtpg•

65 days ago

I legit think this is the biggest danger with velocity-focused usage of these tools. Good patterns are easy to use and (importantly!) work! So the 32nd usage of a good pattern will likely be smooth.

The first (and maybe even second) usage of a gnarly, badly thought out pattern might work fine. But you're only a couple steps away from if statement soup. And in the world where your agent's life is built around "getting the tests to pass", you can quickly find it doing _very_ gnarly things to "fix" issues.

sroussey•

65 days ago

I’ve seen ai coding agents spin out and create 1_000 line changesets that I have to stop before they are 10_000. And then I look at the problem and change one line instead.

chickensong•

64 days ago

This is it right here. Claude loves to follow existing patterns, good or bad. Once you have a solid foundation, it really starts to shine.

I think you're likely in the silent majority. LLMs do some stupid things, but when they work it's amazing and it far outweighs the negatives IMHO, and they're getting better by leaps and bounds.

I respect some of the complaints against them (plagiarism, censorship, gatekeeping, truth/bias, data center arms race, crawler behavior, etc.), but I think LLMs are a leap forward for mankind (hopefully). A Young Lady's Illustrated Primer for everyone. An entirely new computing interface.

TranquilMarmot•

64 days ago

We noticed this and spent a week or two going through and cleaning up tests, UI components, comments, and file layout to be a lot more consistent throughout the codebase. Codebase was not all AI written code - just many humans being messy and inconsistent over time as they onboard/offboard from the project.

Much like giving a codebase to a newbie developer, whatever patterns exist will proliferate and the lack of good patterns means that patterns will just be made up in an ad-hoc and messy way.

esalman•

65 days ago

You haven't answered the question though. Is your code peer reviewed? Is it part of a client-facing product? No offense, I like what you are doing, but I wouldn't risk delegating this much workload in my day job, even though there is a big push towards AI.

gommm•

64 days ago

> My personal headcanon: this tooling works well when built on simple patterns, and can handle complex work. This tooling has also been not great at coming up with new patterns, and if left unsupervised will totally make up new patterns that are going to go south very quickly. With that lens, I find myself just rewriting what Claude gives me in a good number of cases.

I've been doing a greenfield project with Claude recently. The initial prototype worked but was very ugly (repeated duplicate boilerplate code, a few methods doing the same exact thing, poor isolation between classes)... I was very much tempted to rewrite it on my own. This time, I decided to try and get it to refactor to reach the target architecture and fix those code quality issues. It's possible but it's very much like pulling teeth... I use plan mode, we have multiple rounds of reviews on a plan (that started based on me explaining what I expect), then it implements 95% of it but doesn't realize that some parts of it were not implemented... It reminds me of my experience mentoring a junior employee except that claude code is both more eager (jumping into implementation before understanding the problem), much faster at doing things and dumber.

That said, I've seen codebases created by humans that were as bad or worse than what claude produced when doing prototype.

swader999•

65 days ago

I feel the same way. Doesn't make sense economically or even in good faith for me to use company paid time writing code for line of business apps anymore and I'm 28 years into this kind of work.

archerx•

64 days ago

I used Claude to help me with a function once and it added a memory leak, it wouldn’t have been noticeable to most people but I saw. I still write my own code and find LLMs frustrating because they almost get it right and it’s just more efficient for me to just write the code correctly instead of having an LLM write something that’s almost correct and me fixing it after the fact.

I can’t wait for all the future vibe coded projects to be exploited by the black hats waiting in the shadows for things to reach a critical state. I don’t believe in anthropic because they love to lie.

ytoawwhra92•

65 days ago

> I intend for this post to be a question: what am I doing that makes Claude profoundly effective?

I'm fascinated by this question.

I think the first two sections of this article point towards an answer: https://aphyr.com/posts/412-the-future-of-everything-is-lies...

I've personally had radically different experiences working on different projects, different features within the same project, etc.

nobodywillobsrv•

64 days ago

How much does it cost though?

This is the problem.

I think there is a huge gap between people on salaries getting effectively more responsibility by being given spend that they otherwise would not have had and people hustling on projects on their own.

Yes it is 100% what I use but I am never happy with usage. It burns up my sub fast and there is little feeling of control. Experiments like using lower tier models are hard to understand in reality. Graphify might work or it might not. I have no idea.

Unbeliever69•

65 days ago

I think a lot of us have implemented our own ad hoc self-improvement checks into our agentic workflows. My observations are the same as yours.

archargelod•

64 days ago

I am genuinely interested to know some details:

1. Is the product/software you develop novel? As in does it do something useful and unique? Or is it a product that already exists in many varieties and yours is just "one of ..."?

2. What if one day, LLMs will get regulated/become terrible/raise prices above your budget. Do you have plans for that?

Pannoniae•

64 days ago

1. Fairly - I definitely don't see any training material about the stuff I do on the internet:D it's really far from your avg front-end app. And of course you can't let any of those make decisions automatically. Remember the IBM quote, "a computer can not be held accountable therefore a computer must not make any management decisions"... Even on completely greenfield and groundbreaking projects there's lots of throwaway code, scaffolding and so on. You contribute the value-add, you use the flanker to speed up the boring and grey parts.

2. Regulation? I'm sceptical that the cat can be put back into the bag. It's already out there. More realistic problem is the business model part - openweight/local provides a counterpoint to that.

cultofmetatron•

64 days ago

I'm in a similar situation

1. Even really novel projects have large chunks of glue code and boring infrastructure that the novel bits depend on. claude means I spend 10% of my time on the boring stuff and 90% of time on stuff I previously only had 10% of my day to work on. In my experience the software picked up our idioms fast and for context, we have a skill file explaining code standards.

2. codex and gemini are comparable when paired with a good harness (pi.dev). if things ever get really bad, I'll drop 8k on a dedicated agent coding server and run it locally. I tried it recently with my current system and it was sub par but I was running a drastically simpler model.

max_streese•

64 days ago

To people stating these high commit numbers: What is your average changeset size? I have found that having an agent do large changes (few hundred lines or more) results in a lot of friction for me and it feels like at some point I leave a happy path where instead of moving quickly I get dragged down.

leonidasv•

65 days ago

The article has a benchmark and Opus has the best score in two categories and the second-best in another (there are only three categories). Opus is probably the best choice when it comes to producing readable code right now. GPT (for example) lags way behind.

baq•

64 days ago

Anecdotally it’s the exact opposite for me: gpt 5.4 is leagues ahead of opus for the kind of backend work I do. Opus keeps making stupid mistakes while overengineering the irrelevant parts. However when I have to work on the backoffice ui, I still pick opus.

keybored•

64 days ago

The silent majority of GenAI praise reaches the top of the thread again.

Edit: The lurkers and the commenters must be a pretty different set of people I suppose.

Powdering7082•

65 days ago

Are your claude.md, skills or other settings that you have honed public?