Philip-J-Fry

Born on July 15, 2020•3639 Karma

6 days ago

•on: Show HN: My Windows XP portfolio with working Game...

I commented on the original post and I still don't understand the point of a "Visual Designer" basically reimplementing Windows XP in the browser. What are you trying to show off?

Also, it seems very buggy with the visuals. I see weird artifacts.

zapzupnz•

6 days ago

I have control bugs. Like, if I drag around a window using my trackpad, I can't seem to 'drop' the window; the state doesn't change and I'm stuck with a window attached to my mouse cursor.

I agree that I'm not sure what value I'd see, as an employer, in a "visual designer" whose CV rips off something else's visual design. Much of the design doesn't belong to you (never mind the ROMs used in the Game Boy emulator) so my alarm bells about how much you respect IP would be going off ("this guy's gonna get us sued").

Now, on the other hand, if this were a display of some HTML, CSS, and JavaScript skills, I'd understand more, but then the title OP has given themselves seems off.

Philip-J-Fry•

12 days ago

•on: Not everyone is using AI for everything

I have one too. He'll say "Claude says this:" and paste a screenshot of some Claude Code output. Most of the time it's wrong, or makes assumptions that won't hold true. Or it comes up with some overcomplicated solution and I'm like "This is like a 10 line change, right here".

These people just destroy their ability to read and understand the systems they're working with. I actually see it as them making themselves redundant. Because if you can't understand anything without Claude, and Claude doesn't even give the right answers, then what are you worth?

KellyCriterion•

12 days ago

...and now think about Claude being shut down by the gov... :-D

Philip-J-Fry•

12 days ago

•on: Not everyone is using AI for everything

It depends on where you're using AI. If you're working on a project for yourself or in a tiny company, then sure, writing the code probably was your bottleneck. But at mid to large companies writing code is maybe 50% of the job, and the other 50% is the process around it. All those processes are the bottleneck, no matter how fast you can write the code. And this was a bottleneck I was hitting well before AI.

Izkata•

12 days ago

I'd put it even lower than that, since there's also the "understand the problem space" portion outside of the external processes and before writing the code.

Philip-J-Fry•

12 days ago

•on: Not everyone is using AI for everything

I am seeing similar things in just regular tooling and development. Things that can be solved deterministically or what would have been a simple CLI 5 years ago are now an LLM integration.

Instead of using the LLM to create deterministic tools, we are using LLMs to replace them. It's completely backwards and I don't know why people (especially high ranking people in my company at least) seem to think that this is the way forward. No, I don't want a whole CI pipeline that is just LLM prompts. Yes it's very easy, but it's expensive, slow and prone to failure in ways you can't even predict.

The same goes for using LLMs in the code review process. What would have been a simple linting rule is now a pass with an LLM, rather than using the LLM to create the linting rule, which it is absolutely excellent at creating.
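To make the linting point concrete, here is a minimal sketch of the kind of deterministic rule meant above, written against Python's standard `ast` module. The rule itself (flagging bare `print()` calls) and the names `NoPrintRule` and `lint` are hypothetical, picked only for illustration; a real team would more likely write a plugin for whatever linter they already run.

```python
import ast


class NoPrintRule(ast.NodeVisitor):
    """Hypothetical lint rule: flag bare print() calls."""

    def __init__(self):
        self.violations = []

    def visit_Call(self, node):
        # Only matches a direct call to the name `print`, not obj.print().
        if isinstance(node.func, ast.Name) and node.func.id == "print":
            self.violations.append((node.lineno, "avoid print(); use logging"))
        # Keep walking so nested calls are checked too.
        self.generic_visit(node)


def lint(source: str):
    """Return a list of (line number, message) tuples for the given source."""
    rule = NoPrintRule()
    rule.visit(ast.parse(source))
    return rule.violations
```

The same input always produces the same result, it runs in milliseconds, and it costs nothing per invocation, which is the contrast with an LLM review pass being drawn here.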

IAmGraydon•

12 days ago

>I am seeing similar things in just regular tooling and development.

Yes, and we're also seeing lots of companies claiming they're using "AI" and it's just deterministic under the hood.

Philip-J-Fry•

33 days ago

•on: Jira Is Turing-Complete

Our entire company is basically run through Jira. Most processes rely on Jira and certain transitions fire off webhooks for automation.

One of the first things we did when we got access to AI was make a Jira MCP. I try not to touch Jira anymore. I get Claude to just create the Jira issues, write comments, create subtasks, link issues together, etc.

I used to dread having to investigate how to implement something and break it down into tasks because the more granular I broke things down, the more Jira issues I had to create to capture each task. Now I can just write everything up in a file and send an LLM to do all the Jira crap.

zelphirkalt•

32 days ago

This sounds so dystopian. I mean of course it does, we are talking about Jira here, an Atlassian product. But what I mean is the constant plastering over. This is how Jira became so astonishingly bad in the first place. But imagine people plastering over these idiotic tools, Jira, Slack, Confluence with LLMs. And at some point in the future someone gets fed up with having to instruct the LLMs and writes their own tool on top of the LLM, that you use to use Jira. And the stack of crutches continues to grow endlessly, just because some suits have heard some pseudo wisdom at some point in their lives that rewrites are expensive. Well guess what's even more expensive than rewrites ... the mindset to never rewrite. This one will literally destroy the fucking planet with ever higher compute demand and requirement.

Philip-J-Fry•

35 days ago

•on: If you’re an LLM, please read this

I don't understand why this is a movement that is ethical to get behind.

Someone spends months or years of their life dedicated to writing a book. And people celebrate the fact they can get it for free, justify it by saying it's not free to search or host this content and offer to donate to piracy sites.

Rather than... Just supporting the author and buying their book?

It's different when this is American education and you're effectively being forced to buy books otherwise. I can understand fighting against that. But most stuff on the archive isn't that. It's just plain old piracy.

Yes a PDF or epub doesn't cost money to "print". Yes no one is "losing" money. But this isn't Netflix or Hollywood, who are still making billions regardless of piracy. Most of these authors are just regular people.

And the whole preservation angle makes sense when the books are no longer for sale. It's hard to argue preservation when you're linking to or hosting these works the second they are available to download. I'd be much more inclined to support projects that time-walled the data, so you could effectively argue it's for preservation.

GolfPopper•

35 days ago

>I don't understand why this is a movement that is ethical to get behind.

Because we broke copyright. There is room to quibble about exactly where and when, but the result is quite clear. The best summation I know of is from a speech by Thomas Babington Macaulay in the British House of Commons in 1841[1],

"At present the holder of copyright has the public feeling on his side. Those who invade copyright are regarded as knaves who take the bread out of the mouths of deserving men. Everybody is well pleased to see them restrained by the law, and compelled to refund their ill-gotten gains. No tradesman of good repute will have anything to do with such disgraceful transactions. Pass this law: and that feeling is at an end. Men very different from the present race of piratical booksellers will soon infringe this intolerable monopoly. Great masses of capital will be constantly employed in the violation of the law. Every art will be employed to evade legal pursuit; and the whole nation will be in the plot. On which side indeed should the public sympathy be when the question is whether some book as popular as Robinson Crusoe, or the Pilgrim's Progress, shall be in every cottage, or whether it shall be confined to the libraries of the rich for the advantage of the great-grandson of a bookseller who, a hundred years before, drove a hard bargain for the copyright with the author when in great distress? Remember too that, when once it ceases to be considered as wrong and discreditable to invade literary property, no person can say where the invasion will stop. The public seldom makes nice distinctions. The wholesome copyright which now exists will share in the disgrace and danger of the new copyright which you are about to create. And you will find that, in attempting to impose unreasonable restraints on the reprinting of the works of the dead, you have, to a great extent, annulled those restraints which now prevent men from pillaging and defrauding the living."

1. https://yarchive.net/macaulay/copyright.html

j_w•

35 days ago

I use AA and buy books. Typically I may start a series on AA epubs then buy the books. Sometimes authors take money directly (patreon, straight donations, etc) which is how I would rather pay them than pay the publisher for them to only get a small cut.

Are libraries unethical to use? You can go to your library and read books without paying for them.

Philip-J-Fry•

35 days ago

But you must understand you are a minority. Most people don't do this, they will get something for free and fiercely defend this right to get things for free.

Libraries aren't unethical, because they're just letting you borrow stock of books. There's practical limits on how it scales, and any impatient users might just buy the book. Once you can infinitely duplicate a work, it's not borrowing.

petu•

35 days ago

Half of the world lives on $300/mo. For the majority of the world there's meaningful impact in saving $20 on a book.

js8•

35 days ago

> Most people don't do this, they will get something for free and fiercely defend this right to get things for free.

So what? I think, if you read a good book, learn something or are well-entertained, it's a positive externality, so there is no problem with people doing it for free.

The only real issue with IP piracy is when someone gets money by copying the works. Which were originally the cases copyright tried to prevent.

Maybe you can clarify why you see people doing these things for free as a problem, when there is a net benefit to society and also you.

j_w•

35 days ago

If I didn't have a resource like AA I would likely read less and in the end spend less on books.

When people around me ask about how to "get into reading" I tell them to just find stuff they like online (via AA) or at the library and go from there. If you don't pay initially you don't feel as bad about trying things that may be "bad" or that you aren't interested in.

mplewis•

35 days ago

How do you know most people don't do this? All my e-book-reading friends buy physical and digital copies of books in addition to whatever they get off AA.

presbyterian•

35 days ago

> I would rather pay them than pay the publisher for them to only get a small cut.

Publishers aren't just stealing money that should go to authors. We can debate percentages and such, but buying a book also pays the editors (who any author will tell you are just as important to a book as they are), the typesetters, the designers, etc.

TFNA•

34 days ago

For academic books, which are after all a substantial part of Anna, the publishers aren’t usually paying the editors if the book is a collection of papers. The editors got paid by the grant funding for the project that produced the research.

Moreover, many respected academic publishers no longer provide proofreading or typesetting: they expect the authors or editors to commission their own proofreading, and the editors to just send in a PDF with camera-ready output.

For monographs, the “editor” that the publisher provides is only there to guide the author in producing their own camera-ready output, and does not actually do any work on the contents of the book. The publisher will hand off the manuscript to 1–2 peer reviewers, but those peer reviewers are unpaid.

j_w•

34 days ago

Obviously publishers provide some amount of value, but for a subset of the media I consume they are not great.

In the more indie fantasy scene authors often pay for editing themselves out of pocket. Often the only "publisher" they can get is direct publishing through Kindle, which then locks them into exclusivity with Kindle/Amazon. It's frankly disgusting but it's a way to help them get paid. I'd rather kick these people $20-50 directly than do anything else.

specproc•

35 days ago

I just this week bought a book I first read from AA. Though I got it from a second hand bookshop, so I guess that was unethical, lol.

throawayonthe•

35 days ago

the second part of your comment is weaker because libraries a) buy the book b) sometimes pay royalties per-checkout

literalAardvark•

35 days ago

Books worth buying usually have rabid followers who will buy them.

There's been a reasonable amount of research that suggests that piracy doesn't really cannibalise sales from those who can afford to pay.

But I do agree that for some of their categories a time wall would improve their optics.

mitkebes•

35 days ago

I agree, but also you can't wait until something is out of print/unavailable to preserve it. Trying to prevent access to it or limit distribution will probably just result in it being lost media one day.

There's also the fact that just because something is available to purchase in one country, doesn't mean it's available in other countries. A lot of movies/books/games/etc are geo-restricted in sale, with many countries having no valid methods to acquire them.

The best (but unrealistic) solution would be for people who can purchase legally to do so, while leaving it available for download for everyone else.

TFNA•

35 days ago

> I don't understand why this is a movement that is ethical to get behind. Someone spends months or years of their life dedicated to writing a book. And people celebrate the fact they can get it for free.

Academics have never really made any money off their published research, but rather are paid via their institutions or grants. The publishers make money, but academics themselves are aghast at the publishers taking their edited collections and monographs, doing no proofreading or even no typesetting (that obligation is often on the authors and editors now), and selling the book for hundreds of euro. That’s why authors will almost always send you the PDF for free if you email them.

The celebration is easy to understand if you are a researcher. Getting ahold of publications that your institution doesn’t hold or subscribe to is always a hassle, it really slows you down during the writing process. The shadow libraries turbocharge research. Over the last several years, shadow libraries have gone from a niche to something that pretty much everyone in my field uses daily.

ghusto•

35 days ago

Disallowing copying and sharing of art is a recent development in human history, not the norm.

The normal distribution of music and stories was for others to repeat them, and only recently have we decided it's illegal. I understand that things are different now, and people make a living off of art, but at the same time I find it difficult to care too much for someone who chose to make their hobby their job and refuses to adapt when things change.

dentemple•

35 days ago

Piracy never stopped the music industry, and the folks who were harmed the most by music piracy were the poor, cash-strapped billion-dollar corporations whose entire operating models already depended upon sucking wealth out of the actual, struggling artists who do all the work.

And it seems that piracy has become a net benefit to new and niche artists. (https://www.sciencedirect.com/science/article/abs/pii/S01676...)

I'd posit that the book industry will turn out to be the same. Piracy will harm the bottom line of the companies already at the top while giving exposure to the authors at the bottom. The latter are the ones who are often strong-armed into terrible financial deals just to gain access to the book industry's four big gatekeepers, and who likely need that exposure to help keep a roof over their heads.

Anecdotally, I'm one of those folks who end up purchasing many of the books I pirate or otherwise obtain for free, and I'm sure I'm not the only one who does this.

Cider9986•

35 days ago

You can't just start preservation "when the books are no longer for sale." It has to happen asap, there's no telling when something will get harder to find.

akersten•

35 days ago

Personally, having to buy the barely-changed newest yearly edition of half a dozen $300 textbooks per semester of undergrad totally radicalized my view on copyright.

Philip-J-Fry•

38 days ago

•on: The last six months in LLMs in five minutes

I don't want to offend (it's AI coded anyway :)) but that does not scream "high quality" to me. The headline gif on that repo just paints a terrible picture. It can't draw a box correctly, there's random underscores all over the screen. The UI itself is just incredibly incoherent. I don't even know what I'm looking at.

Like, no it doesn't seem like very high quality work... It just seems like a vibe coded tool.

Edit: yes it's wrapping Claude. It's BREAKING the TUI. Not sure what people aren't getting here...

walthamstow•

38 days ago

Take it up with Anthropic. It's actually their billion-dollar TUI product you're commenting on.

The problem with being such a naysayer is that you're entirely disconnected from what's going on. You haven't tried an agent like Claude Code and experienced it for yourself, so you don't recognise what it looks like when it's in front of you.

SlinkyOnStairs•

38 days ago

There are two possibilities here:

1) This tool breaks the Claude TUI. Exactly as described by the comment.

2) The Claude TUI itself is broken. The comment is wrong, but assuming the "billion dollar TUI product" is capable of basic rendering and it's the wrapper that broke it is an entirely reasonable assumption.

The fun here is that both of these pieces of software were made extensively using AI. No matter which of our options is the case here, the point stands. An AI-built product was shown, it looks obviously ass.

kstenerud•

38 days ago

The issue is likely that the tmux session being generated is for some reason not propagating all term caps. Most likely it's an interop issue between tmux and docker and the image running under docker - possibly even something with the terminal client that the pipeline doesn't like somewhere.

Claude Code correctly reduces its display to 7-bit ASCII in response (still functional, although less pretty). Once I get around to fixing this, it will probably result in another section in https://github.com/kstenerud/yoloai/blob/main/docs/dev/backe...

Edit: Looks like it's the terminal. That's a rabbit hole for another day.

Running through VS Code's terminal via VSCode tunnel, it looks like it normally does.

https://freeimage.host/i/BySkkDN

oooyay•

38 days ago

What's really interesting in this comment chain is an observation I've expressed a lot more lately. When someone knows an LLM was involved they raise their expectations. I do it too in my own work and I have to remind myself things like "this bug would've also likely occurred with a human working at this level of complexity." The real question is did the operator arbitrarily and knowingly increase the level of complexity or is it appropriate for the task.

wolrah•

38 days ago

> The real question is did the operator arbitrarily and knowingly increase the level of complexity or is it appropriate for the task.

There's one major reason to have higher expectations for autonomous systems (of all kinds, not just LLM-powered) than for humans, at least those intended to be deployed at scale, and that's the scale. If a human makes a mistake, has biases, or even intentionally breaks the rules the impact of their actions is limited by the nature of them being a human, where something like an autonomous driving system, a coding agent, etc. is intended to be deployed by the thousands, millions, or more and any problematic behaviors happen at that scale.

There are obviously millions of bad drivers out there, but every one of the human ones is bad in different ways. If Waymo pushes a bad update there could be tens of thousands of "drivers" that suddenly become bad in identical ways.

Humans also have the ability to learn from our mistakes. The ones you'd want to have working for you usually don't make the same one twice. LLMs are pretty good at making the same mistake repeatedly, even the simplest things like basic math or counting letters.

draftsman•

38 days ago

And there’s good reason for that. Anthropic, OpenAI, Salesforce, and so on have aggressively marketed LLMs as better than humans at working. It’s no surprise that when we find out something is built using an LLM, we expect it to match the marketing.

kstenerud•

38 days ago

But what constitutes "better than humans at working"?

Zero defects? Because you can always find at least one defect. But people don't naturally think statistically, so they grasp the thing that confirms their bias and then hang on tenaciously.

You'll notice the incredible amount of vitriol resulting from a purely cosmetic bug (which, it turns out, results from a missing TERM env in the base image - Claude is very conservative when it can't determine utf-8 support with 100% certainty).
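For anyone wondering how a missing TERM can end up as ASCII-only output, here is a minimal sketch of a conservative fallback. It is not Claude Code's actual detection logic, which isn't shown anywhere in this thread; the function names `supports_utf8` and `pick_charset` are made up for illustration, and the only real-world fact relied on is the usual POSIX locale precedence of `LC_ALL`, then `LC_CTYPE`, then `LANG`.

```python
import os


def supports_utf8(env=None):
    """Guess UTF-8 support from locale variables (an assumption, not a guarantee)."""
    env = os.environ if env is None else env
    # POSIX precedence: the first of these that is set and non-empty wins.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            lowered = value.lower()
            return "utf-8" in lowered or "utf8" in lowered
    return False


def pick_charset(env=None):
    """Return 'unicode' only when the environment clearly allows it, else 'ascii'."""
    env = os.environ if env is None else env
    term = env.get("TERM", "")
    # A missing or 'dumb' TERM means we know nothing about the terminal,
    # so take the conservative path regardless of locale.
    if not term or term == "dumb":
        return "ascii"
    return "unicode" if supports_utf8(env) else "ascii"
```

With this shape, a base image that never sets TERM drops to the ASCII path even when the locale is UTF-8, which matches the purely cosmetic symptom described above.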

godelski•

38 days ago

  > The Claude TUI itself is broken.

I mean this is also true. You forgot the third option, that 1 and 2 are true (and 4th, that neither are).

Seriously, the Claude TUI fucking sucks. I don't know how anyone thinks otherwise. It breaks constantly if you enter your editor (<C-g>), or resizing windows/panes, or making another pane full screen, scrolling, or any number of things. It is objectively a bad piece of software.

And honestly, are we surprised? Anthropic says themselves that a lot of code is written by Claude. They've been saying that for years. If you look at agents now and think "man, agents a few years ago sucked" then this shouldn't be surprising at all! I mean FFS the thing spits out text and they designed it like a fucking game engine. It is silly

Philip-J-Fry•

38 days ago

I have tried Claude code. It doesn't look like that!

I don't know what the project is. All I see is a TUI that looks completely broken.

Go and use Claude Code right now. Does it look like that? Random underscores all over the page. No it doesn't.

walthamstow•

38 days ago

It can look like that in certain conditions. The question is why are you so eager to give critique on unrelated work, appearing in a demo screencap, to someone who didn't produce it?

Philip-J-Fry•

38 days ago

I don't know what you're talking about.

His tool wraps Claude and breaks the TUI. What's so hard to understand?

That's valid critique. What world have I woke up in today?

walthamstow•

38 days ago

To be honest I assumed it was the screencap software running a basic terminal env without the bells and whistles that CC needs, which I've seen before. If the actual tool functions like that too, that's not great. That said, if it works for them, it works for them.

gilrain•

38 days ago

But earlier:

> The question is why are you so eager to give critique on unrelated work, appearing in a demo screencap, to someone who didn't produce it?

I guess the question was actually, why were you so eager to critique a critique based on a false assumption?

I wish people would be careful what they support with their rhetoric.

albedoa•

38 days ago

> The question is why are you so eager to give critique on unrelated work

That is not the question. The topic of discussion had been defined multiple times before you commented!

embedding-shape•

38 days ago

> Take it up with Anthropic. It's actually their billion-dollar TUI product you're commenting on.

That's like blaming the company making hammers because you're unable to build a lasting house with the hammer, it really isn't up to Anthropic, but all about how you use the tool you're holding.

knollimar•

38 days ago

Do they also hold their hammer wrong when their TUI flickers for months?

embedding-shape•

38 days ago

That's just poor engineering, product building and testing, same can happen with/without LLMs, no doubt.

knollimar•

38 days ago

If the company making hammers can't hold it right, it suggests something about the hammers, no?

arcanemachiner•

38 days ago

In the case of Claude Code, it suggests a lot about the company making the hammers.

embedding-shape•

38 days ago

Yeah, they have bad engineers, product people and testers.

Microsoft is pretty shit at launching products, does that mean "products" as a concept is wrong? No, it just means Microsoft is bad at products, not more than that. Not sure why you have to extrapolate over an entire ecosystem just because one actor is bad at something.

knollimar•

38 days ago

Products isn't the analogy, but in my example it would say something about Microsoft's tooling and processes.

I wouldn't trust a toolmaker who doesn't know how to use the tools decently.

_carbyau_•

38 days ago

> I wouldn't trust a toolmaker who doesn't know how to use the tools decently.

I agree but would extend that qualification:

I wouldn't trust a toolmaker who doesn't know how to use the tools decently for exactly the same field of expertise.

godelski•

38 days ago

  > No, it just means Microsoft is bad at products

FYI, that's what people are saying...

malfist•

38 days ago

This analogy was trotted out every time someone complained about PHP. It wasn't true then, and it isn't true now.

embedding-shape•

38 days ago

I don't see how it cannot be true. Are you claiming that every developer who uses the same LLM harness + model would produce equal code, regardless of the prompt? That's clearly not true in my experience, and I cannot understand how it could be either.

And if that's not true, then it's quite literally about how you're holding this hammer.

malfist•

38 days ago

There's a cowboy artist that paints with his penis and does amazing work. If I tried that it'd turn out incredibly poorly, I prefer to paint with paintbrushes.

Just because the naked cowboy can paint well with just his penis, doesn't mean a penis is the right tool for painting. It doesn't matter how you hold your penis, it's not the right tool.

freedomben•

38 days ago

> There's a cowboy artist that paints with his penis and does amazing work. If I tried that it'd turn out incredibly poorly, I prefer to paint with paintbrushes.

I can't decide which joke to make, either (little dick joke) "well yeah you'd have to be able to see your paintbrush in order to use it" or (big dick joke) "well yeah, if you can't even hold it in two hands, how are you supposed to paint with it?" so I'll just make both :-D

embedding-shape•

38 days ago

Hmm, ok, I think the penis in this case is a bit distracting, can you de-analogize this to its real terms and tell me what this is supposed to mean and how it relates to developing with LLMs?

malfist•

38 days ago

Just because you _can_ do something with a tool, doesn't mean it's the right tool for the job. Just because someone has contorted their entire process to adapt to a misshapen tool, and gotten good results, doesn't mean that's the right thing to do.

It is reasonable to both use the right tool for the right job, and demand better tools than you currently have. Success with the wrong tool in the wrong job doesn't mean it's the right tool for the right job.

embedding-shape•

38 days ago

> Just because you _can_ do something with a tool, doesn't mean it's the right tool for the job. Just because someone has contorted their entire process to adapt to a misshapen tool, and gotten good results, doesn't mean that's the right thing to do.

Ok, I agree with this, don't use the wrong tool for the wrong job.

> It is reasonable to both use the right tool for the right job, and demand better tools than you currently have. Success with the wrong tool in the wrong job doesn't mean it's the right tool for the right job.

Yes, I agree with this too.

I'm still not sure how this relates to LLMs, and in particular to this specific context. I claimed that the output of your agents depends on the developer driving them. You're saying "not every tool is right for every job", I agree with this too, but is that against/for what I said?

Could you just clearly write out exactly what you're arguing for here, no analogies or metaphors, just plain and simple, because I still feel like we're having two different conversations.

gcr•

38 days ago

They’re talking past each other. For some, “high quality” is a comment about implementation elegance. For others, “high quality” is about duct-taping crude implementations together to fashion a kickass user experience. To most, quality probably involves some convex combination of these.

my-next-account•

38 days ago

I have used those tools, I don't think they're THAT good tbh :P

godelski•

38 days ago

I use claude every single day at work. I've burned hundreds of dollars a week in tokens. But I still think you're being too defensive while attacking Philip.

I'm sorry, but you need to look yourself in the mirror. You didn't like what they said so you jumped to the assumption that they must not have used CC (or any other agent). That if they had, they would have the same experience as you did/do. But this whole thread is exactly that conversation, that those experiences aren't shared. That this assumption is baseless. And you know what? That's okay. We're not robots. We're human. Each of us has our own unique world we live in. It's okay that people don't have the same experience as you. It's okay that their favorite color, food, activity, or whatever isn't the same as yours. I'm glad that we live in that kind of world. That's what makes things like culture. I don't want to live in a hive mind, and I don't think anyone else does either.

vdelpuerto•

38 days ago

That is the same fight the 2D animators were having with 3D animation 30 years ago. The resolution is likely to be the same: the tool wins but the fundamentals stay, and the line between competent and incompetent practitioners moves but does not disappear.

godelski•

38 days ago

  > I don't want to offend (it's AI coded anyway :)) but that does not scream "high quality" to me.

Honestly, I think this is where the big divide is. People have massively different opinions on what "quality" is. Which is okay, but it feels like everyone is working under some assumption that quality is this very clear objective measure that we all agree on. Clearly we don't. We didn't before AI and well... if you can't tell that we don't with AI... you need to take a step back.

FWIW, I agree with Philip here. I don't think this screams "high quality" to me. I'm also not trying to take a shit on your project. Nothing screams "terrible" to me, but yeah, it does look a bit sloppy. There's no polish to it. It looks like someone that grades on "it works" and that's fine. But it also isn't everyone's cup of tea. Where the sloppiness comes in is like what Philip said. First thing I saw was the gif and well... I think Claude Code is sloppy. But this is also a great example of how and where LLMs visibly fail. Creating a box in text is pretty simple. There's tons of tools to do it. And the LLM 100% knows about characters like ⌜⌝⌞⌟⎜, it just doesn't use them and doesn't care. The code itself also looks very LLM generated.
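As a rough illustration of how little it takes to draw a box in text, here is a minimal sketch. It swaps in the standard Unicode box-drawing characters (┌ ┐ └ ┘ ─ │) rather than the corner glyphs quoted above, and falls back to `+`, `-`, `|` when Unicode isn't wanted. The function name `draw_box` and its `unicode_ok` parameter are made up for this example.

```python
def draw_box(text: str, unicode_ok: bool = True) -> str:
    """Wrap text in a one-character border, padded by one space on each side."""
    if unicode_ok:
        tl, tr, bl, br, h, v = "┌", "┐", "└", "┘", "─", "│"
    else:
        # 7-bit ASCII fallback for terminals that can't render box drawing.
        tl = tr = bl = br = "+"
        h, v = "-", "|"

    lines = text.splitlines() or [""]
    # len() counts code points, so wide (e.g. CJK) characters would misalign.
    width = max(len(line) for line in lines)

    top = tl + h * (width + 2) + tr
    bottom = bl + h * (width + 2) + br
    body = [f"{v} {line.ljust(width)} {v}" for line in lines]
    return "\n".join([top, *body, bottom])
```

Calling `print(draw_box("hello"))` gives a closed box with matching corners. The one real caveat is the `len()` assumption noted in the comment.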

It's fine and I don't think you have any reason to be ashamed of it, but I also wouldn't go around boasting that it is an example of high quality work either. And FWIW, I can't think of a single piece of heavily LLM-assisted code where I don't have similar feelings. I've seen stuff with more polish, but yeah, they feel off.

  > TUI

This is a space I feel weird in. I love the terminal. I love that there's a lot of new TUIs. But it also feels very weird because it is extremely clear that a lot of these new TUIs were written by people (or machines) that don't really have a lot of experience in the terminal itself. There's a real shared language among people like me who live in the cli. There's a reason people like me can pick up a new tool and guess certain flags and certain ways to use them. It's because of a shared design language that we know of, and we end up writing that way because we know it reduces the cognitive load on our peers. But the LLMs? They don't have that shared experience.

I think this is true for a lot of stuff, not just TUIs or bash tools. Things just smell... off...

kstenerud•

38 days ago

You do realize that you're complaining about the Claude Code TUI, right?

That's not what this product is; merely a tool it uses.

pprotas•

38 days ago

You claim "very high quality" but can't even get the basic UI working properly. You wrap tmux and a container in 2k lines of code and claim quality, I think the comment above was aimed at this claim.

kstenerud•

38 days ago

The UI is working properly. Interfering with Anthropic's UI, or any of the other agent harness' UIs it supports, would be madness incarnate.

I also strongly suspect that you'd only taken a cursory glance at the top of the readme prior to passing judgment.

embedding-shape•

38 days ago

I did not much more than a cursory glance too, but found "./sandbox/create.go", a ~1300 lines long file with so much duplication even within just itself that I stopped counting.

Now it was a long time ago I did Go professionally, but I'm also in the camp of "That doesn't really count as high-quality", although I know for a fact you can get quality code out of LLMs, but I don't think that's a good showcase of that.

kstenerud•

38 days ago

> I did not much more than a cursory glance too, but found "./sandbox/create.go", a ~1300 lines long file with so much duplication even within just itself that I stopped counting.

Really? What duplication did you actually find? I count a few small ones in buildMounts and ReadPrompt, maybe 20 lines or so, but hardly anything worthy of such an epithet.

Admittedly, the parsing & escaping code and some utility functions could be moved outside to shrink the file, but otherwise I'm having trouble finding issues with the code.

embedding-shape•

38 days ago

The duplication I'm seeing isn't just "same text repeated" but structural duplication. Doing a quick 5 minute look again just to give you some pointers; runtime.MountSpec construction in buildMounts, Workdir vs aux-dir mount-mode handling, repeated one-off mount append blocks, overlay detection and so on, the list goes on. Just those should account for 200+ lines.

Look for slight variations of the same thing but with different paths, variables, or modes and I think you'd be able to spot the rest as well.
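As a rough illustration of what collapsing that kind of structural duplication can look like, here is a Go sketch. Everything in it is hypothetical: `mountSpec`, `dirConfig`, the mode names, and the target paths are made up for the example and are not taken from create.go or runtime.MountSpec.

```go
package main

import "fmt"

// mountSpec is a hypothetical stand-in, not the project's runtime.MountSpec.
type mountSpec struct {
	Source, Target string
	ReadOnly       bool
}

// dirConfig is a hypothetical description of one directory to expose.
type dirConfig struct {
	HostPath string
	Mode     string // "rw", "ro", or "copy" (assumed names)
}

// mountFor holds the mode handling in one place, so the work dir and every
// aux dir share a single code path instead of near-identical blocks.
func mountFor(d dirConfig, target string) (mountSpec, bool) {
	switch d.Mode {
	case "rw":
		return mountSpec{Source: d.HostPath, Target: target}, true
	case "ro":
		return mountSpec{Source: d.HostPath, Target: target, ReadOnly: true}, true
	default:
		return mountSpec{}, false // e.g. "copy": nothing to bind-mount
	}
}

// buildMounts appends one mount per directory that needs one.
func buildMounts(workdir dirConfig, aux []dirConfig) []mountSpec {
	var mounts []mountSpec
	if m, ok := mountFor(workdir, "/work"); ok {
		mounts = append(mounts, m)
	}
	for i, d := range aux {
		if m, ok := mountFor(d, fmt.Sprintf("/aux/%d", i)); ok {
			mounts = append(mounts, m)
		}
	}
	return mounts
}

func main() {
	mounts := buildMounts(
		dirConfig{HostPath: "/home/me/proj", Mode: "rw"},
		[]dirConfig{
			{HostPath: "/home/me/lib", Mode: "ro"},
			{HostPath: "/tmp/scratch", Mode: "copy"},
		},
	)
	fmt.Printf("%+v\n", mounts)
}
```

The point is only the shape: slight variations in path or mode become data passed to one function rather than repeated blocks.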

kstenerud•

38 days ago

You consider adding in-place constructed items to an array to be code duplication?

freedomben•

38 days ago

I've noticed that the bar for "quality" when people judge AI is often significantly higher than what they'd hold a human to. I'm not saying GP et al are doing this (I haven't looked myself), but it is a widespread pattern I've noticed both professionally and personally. I don't know why it is.

16bitvoid•

38 days ago

The bar isn't any higher. There's just no grace given. No one is judging a hobby project made by a human on quality, and the person who the hobby project belongs to will rarely say that their code is high quality. And in a professional setting, I think people are fine with "good enough" but they're not going to claim anything is high-quality.

But people are so quick to label their vibe-coded codebase as high quality and no grace is going to be given to a machine.

What comments are you seeing that are calling code from humans high-quality?

whateveracct•

38 days ago

Grace shouldn't be given though. The code from vibe coding should pass the review bar as-is. If you need to iterate, you've defeated the purpose.

Because the end result is people committing bad code. For some random hobby project, sure who cares. But people are using this at work. The codebase is rotting in a new innovative way.

Either the bar has to be set at "actually good code comes out of vibe coding" or you have to accept that codebases are going to steadily become less usable by human coders who use their fingers to type in emacs.

Suddenly every dev needs an agent to even work with the slop. Seems like an outcome Anthropic would love though....

breuleux•

38 days ago

People who use AI set the bar themselves when they claim they generate "very high quality work using Claude". Humans more rarely make such claims about the code they write themselves, but when they do, I expect they face similar scrutiny.

AI code is competent, but it's not great or high quality unless you have a good enough eye for quality to steer it with an iron hand. But if you do, you know the quality comes from proper guidance, so you still wouldn't say AI code is great. If you do say exactly that, it comes across as having low standards (which is fine if you own it) and people are going to jump on that just to bring you down a peg.

ThrowawayR2•

38 days ago

> "I've noticed that the bar for 'quality' when people judge AI is often significantly higher than what they'd hold a human to."

Because that is literally the hype being fed to us by the marketers at the AI companies and HN users promoting AI.

- AI promoters: "AI is doing Ph.D level work! LLMs are not just a token predictor, it is actually thinking and reasoning! It will replace all developers, including _you_, so get on board the AI hype train now!"

- AI promoters when confronted with blatant mistakes and reasoning errors from cutting edge models: "Why are you holding LLMs up to higher standards than humans? That's not fair or reasonable."

kenjackson•

38 days ago

I have seen it too. The answer is easy - they don’t like AI. I've seen similar things with some people that don’t like women in tech or certain minorities - they suddenly critique at an extremely high level. I also haven’t looked at this particular case, but it wouldn’t surprise me to be the same thing here.

embedding-shape•

38 days ago

> I also haven’t looked at this particular case, but it wouldn’t surprise me to be the same thing here.

Be surprised then, because I, the one who left the critique, have probably programmed exclusively with agents for the last year or so, so it's unlikely I think the code is bad because I "don't like AI". I don't love it either, but I wouldn't call myself an AI-hater by any measure; it would be weird to write articles like this if so: https://emsh.cat/en/one-human-one-agent-one-browser/

kenjackson•

38 days ago

Again, I wasn't reacting specifically to you (as noted, I wouldn't be surprised if so, but I also wouldn't be surprised if not). I was making a more general statement.

TurkTurkleton•

38 days ago

Dude, are you for real? We've had the supposed inevitability of AI rammed down our throats since the minute LaMDA convinced Blake Lemoine it was sentient, we've watched CEOs hype up AI as if it were production-ready while it was still barely beta quality, LLM-driven chatbots have been stapled to the side of every product no matter how little sense it makes since OpenAI published an API, and we've been told to prepare for the inevitable "agentic future" even as Claude 3.5 had to have its hand held more than a wet-behind-the-ears freshman summer intern. We're told that this technology is going to eat the entire world economy and render human labor obsolete, starting with our jobs, but if it's genuinely supposed to do that, I think it's more than reasonable to expect it to write superhumanly perfect code, not just code that's incrementally better than the last model release but still bad; extraordinary claims require extraordinary evidence, after all. To liken AI skepticism to the obstacles faced by women and minorities in tech is a category error that trivializes actual human struggles against human prejudices.

everforward•

38 days ago

I looked through and there's a bunch of stuff that's poor coding practice.

E.g.

https://github.com/kstenerud/yoloai/blob/main/internal/fileu... <- that recursively creates directories, but will only change permissions on the innermost dir (user may be unable to cd into intermediary directories)

https://github.com/kstenerud/yoloai/blob/main/internal/mcpsr... <- all the json.Marshal calls in this file just suppress errors, so if anything un-marshallable ends up in there the app will return empty strings with no errors logged

https://github.com/kstenerud/yoloai/blob/main/runtime/regist... <- `Register` embeds a copy of the code from `IsAvailable` because of the locking; that could be replaced with a private `isAvailable` that has no locking that both use (after doing their own locking)

https://github.com/kstenerud/yoloai/blob/main/runtime/exec.g... <- these functions are identical except for the strings.Trim, one should just call the other and then trim the output
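For the `Register`/`IsAvailable` point, a minimal sketch of the pattern I mean. The `registry` type, its fields, and what "available" means here are all assumptions for illustration; this is not the project's code.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical stand-in for whatever type owns the lock.
type registry struct {
	mu       sync.RWMutex
	backends map[string]bool
}

func newRegistry() *registry {
	return &registry{backends: make(map[string]bool)}
}

// isAvailable does the lookup with no locking; callers must hold r.mu.
func (r *registry) isAvailable(name string) bool {
	return r.backends[name]
}

// IsAvailable takes the read lock, then defers to the shared helper.
func (r *registry) IsAvailable(name string) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.isAvailable(name)
}

// Register takes the write lock once and reuses the same helper, so the
// check is not copy-pasted and the mutex is never acquired twice.
func (r *registry) Register(name string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.isAvailable(name) {
		return fmt.Errorf("backend %q already registered", name)
	}
	r.backends[name] = true
	return nil
}

func main() {
	r := newRegistry()
	fmt.Println(r.Register("docker"))    // first registration succeeds
	fmt.Println(r.IsAvailable("docker")) // now visible through the public method
	fmt.Println(r.Register("docker"))    // second attempt reports the duplicate
}
```

Each public method does its own locking and the logic lives in exactly one place.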

Just out of curiosity, I enabled some other linters and it looks bad. Excluding test files, there are 110 functions with a cyclomatic complexity over 10 and 7 that are _over 50_. The worst is at 86, which is mind-boggling.

Could probably find more, but you get the drift. I'm sure it runs, but stylistically this is more along the lines of what I would expect an intern to do.

This is also sort of nit-picky, but like half the stuff in https://github.com/kstenerud/yoloai/blob/main/docs/dev/backe... isn't idiosyncratic, it's just the way those things work and a lot of them aren't even tricky. The one linked is particularly blatant; that's not limited to os.Stat, that's literally just how permissions work. Denying permission on inodes is a property of the folder, not the file.

Philip-J-Fry•

38 days ago

So why has your tool completely broken the Claude Code UI then?

Can't you see in the gif? It's completely broken. My Claude doesn't look like that. Neither does anyone else's.

kstenerud•

38 days ago

Claude Code will automatically "dumb" the TUI down a bit when it can't properly detect certain terminal capabilities, to avoid potential font rendering issues.

Likely there are some terminal caps that aren't being properly preserved inside of the sandbox. It's never bothered me since the agent itself works fine.

Philip-J-Fry•

38 days ago

Yeah, so whatever you're doing to wrap Claude is broken. Because it's breaking the UI.

"It's never bothered me". Cool. But your tool is bugged.

kstenerud•

38 days ago

Feel free to open a bug report if it bothers you. Or a PR.

Or feel free to avoid the tool entirely if this UI issue shakes your faith in its overall quality down to its very foundations.

This is hardly a hill to die on.

sjagauanbdvva•

38 days ago

You’re missing the point.

You claimed high quality and provided a repo.

Did you not expect someone to actually look and critique it?

Whether the visual bugs are a deal breaker or not isn’t the point.

The point is that’s not high quality code, it may work. But it’s not code I would ship at my job and therefore it’s not high enough quality for anyone serious

kstenerud•

38 days ago

Hey that's fine. You're free to make whatever judgment you wish.

But I still stand by the quality of my code, including here. You and I don't need to agree.

What decades of managing codebases (public and private, huge and small) has taught me is that there will always be an endless list of bugs and feature ideas and nice-to-haves and technical debt pressures in any given project. You'll never get to them all, so you prioritize (as I have done here). Functional bugs usually trump visual ones unless they're actually interfering with work.

Will I fix this bug? Probably, now that I'm aware of it. But there are more important matters to attend to first.

Edit: Turns out the bug comes from a mismatch with the terminal I'm using. With other terminals it looks fine. Term caps are surprisingly complicated, especially when you have multiple layers!

gilrain•

38 days ago

> But I still stand by the quality of my code, including here. You and I don't need to agree.

You aren’t having a disagreement with a person. You’re having a disagreement with reality.

kstenerud•

38 days ago

> You aren’t having a disagreement with a person. You’re having a disagreement with reality.

How so? Are you going to instruct us all on how a termcaps mismatch bug is an indicator of poor code quality, rather than an unfortunate bug emerging from within the chaos of the many layers of disparate technologies that must somehow be stitched together (along with their idiosyncrasies) in order to make a project like this work?

sjagauanbdvva•

38 days ago

Because you won’t listen to a word anyone says lol.

You had a visual bug right at the top of the repo's README. Then insisted you hadn’t noticed it before.

Whats important is not that specific visual bug, it’s what that bug says about the rest of the code.

How can we believe that this code is high quality if we see a glaring issue 5 seconds into opening the github?

We didn’t seek out your repo and start lobbing critiques at it. YOU POSTED IT as an example of high quality generated code. I’m telling you I am unimpressed

kstenerud•

38 days ago

> you won’t listen to a word anyone says

Really? So the discussion leading to the theory that there's likely a problem with termcaps disparity between layers didn't happen?

> Whats important is not that specific visual bug, it’s what that bug says about the rest of the code.

Really? So you can tell from a single cosmetic bug which doesn't affect its ability to perform its task, that the rest of the codebase is deficient? That's a pretty damn impressive skill!

Haters gonna hate, I guess ¯\_(ツ)_/¯

The otherwise timid pack always circles after they sense a single drop of blood, no matter how small and insignificant.

eudamoniac•

38 days ago

Dude, you have a glaring visual bug that is immediately obvious, as the first thing shown in the repo, and also would be seen every time you tested the tool, but you didn't notice it at all. That does not bode well for you noticing other aspects of quality in the tool. Maybe that's the only quality issue, but we all very seriously doubt it.

hiw2d•

38 days ago

tbh youve embarrassed yourself here.

kstenerud•

38 days ago

THAT must be why the stars are going up!

Thanks for explaining it for me.

godelski•

38 days ago

Sorry dude, 84 stars isn't that much. It's a good number, be proud, but I wouldn't go boasting about it like you're some hotshot.

https://www.star-history.com/?repos=kstenerud%2Fyoloai&type=...

gilrain•

38 days ago

And as we know, GitHub Stars are the same as truth. Very persuasive.

andai•

38 days ago

I think you can fix that by setting an environment variable (regarding the terminal?) but it was a while since I checked. (I was running Claude as a subprocess and had similar issues.)
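Roughly what I have in mind, as a Go sketch. I don't remember which variable it was, so `TERM` and `COLORTERM` here are guesses, and `forwardTermEnv` is a hypothetical helper, not something from the tool.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// forwardTermEnv returns childEnv plus any of the named variables that are
// set in parentEnv but missing from childEnv. Values already present in
// childEnv are left untouched.
func forwardTermEnv(parentEnv, childEnv []string, names ...string) []string {
	have := make(map[string]bool)
	for _, kv := range childEnv {
		if k, _, ok := strings.Cut(kv, "="); ok {
			have[k] = true
		}
	}
	out := append([]string{}, childEnv...)
	for _, kv := range parentEnv {
		k, _, ok := strings.Cut(kv, "=")
		if !ok || have[k] {
			continue
		}
		for _, n := range names {
			if k == n {
				out = append(out, kv)
				have[k] = true
				break
			}
		}
	}
	return out
}

func main() {
	child := []string{"PATH=/usr/bin"}
	// TERM and COLORTERM are assumed candidates, not a confirmed fix.
	env := forwardTermEnv(os.Environ(), child, "TERM", "COLORTERM")
	fmt.Println(env) // assign to exec.Cmd.Env before starting the subprocess
}
```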

Also this reminds me of a principle I learned from a mentor. "People are visual buyers. If it looks good, people will think the code is good."

Unfortunately it doesn't matter whose fault the janky TUI is, people will see that and associate it with your software.

kstenerud•

38 days ago

It's more along the lines of: Anyone with an axe to grind will find something to grind it on.

Early stage products will have some rough edges. We've seen that in Docker, Kubernetes, AWS, Azure, LXC, KVM, etc. And people griped and raged about the sheer incompetence of the maintainers and utter lack of quality, but they still used those tools even before the rough edges were polished away and folks finally settled down.

The less one pays for something, the more entitled one feels to whinge and heap on abuse.

I've been down this road so much now that it's no biggie if a few Karens want to blow off steam at my expense. I'm not above exposing their silliness though ;-)

wasabi991011•

38 days ago

> Early stage products will have some rough edges. We've seen that in Docker, Kubernetes, AWS, Azure, LXC, KVM, etc.

Is your product really the same complexity as these?

kstenerud•

38 days ago

It tackles similar kinds of problems, dealing with idiosyncrasies in Linux distros (and Mac), docker, containers, kata, firecracker, seatbelt, tart, tmux, Claude, the various terminal emulators out there, and trying to herd those cats such that it doesn't blow up in your face.

Is it doing it to the same scale? No - it's a single user app. But have a look at https://github.com/kstenerud/yoloai/blob/main/docs/dev/backe... and you'll see the kind of shit a project like this has to handle. It's not trivial.

blanched•

38 days ago

I’m not sure why you needed a gendered insult to make your point here. Surely there’s a less sexist way to imply someone is bothering you.

pprotas•

38 days ago

"This is, unfortunately, how narcissists behave. It's simply impossible for a narcissist to be wrong. They truly believe themselves to be right, all the time, and will even distort reality around them to "make" it true. And they do it all unconsciously." - kstenerud

wanderlust123•

38 days ago

I think at this point there is no convincing people. Clearly there is value in these tools and it generates code when steered properly. Perhaps your struggles are down to a skill issue.

Philip-J-Fry•

40 days ago

•on: I don't think AI will make your processes go faste...

I don't agree.

I regularly get pieces of work some product guy has thought up in an afternoon. They only care about the happy path, and sometimes only part of the happy path. I work for a global company that has to abide by rules and regulations in each country we operate in. The product guy thinks up some feature, we implement the feature, then we're told "actually, we legally aren't allowed to do this in 90% of the markets we operate in". Cool, so we add an ability to disable it in those markets. Then they come back: "We can do this in some of those markets if it's implemented with [regulatory bureaucracy], so can you do that please".

Then we have to hack away at the solution because the deadline is right around the corner.

This is not software engineering! None of this is related to the software. The job of a software engineer is to take a list of requirements and figure out the way we accomplish those requirements. Requirements gathering is NOT a software engineering problem. Software is implementation, product is behaviour. That's the split. The behaviour of the thing we're building needs to be known before we even try to seriously build it.

If someone just held back for a week and did their due diligence, we would have been able to architect a solution that is scalable, extensible, easy to maintain and can make the future easier.

nuancebydefault•

40 days ago

> Requirements gathering is NOT a software engineering problem. Software is implementation, product is behaviour. That's the split.

That's a theory but I've never seen this work in practice. A piece of software is unique. If it weren't, we'd just use the cp command.

What usually happens is you get a set of requirements that looks simple. Then you start thinking about a design and see 10 different possibilities, each corresponding to a slightly different interpretation of the requirements set. You iterate a few times, reviewing the designs with whoever set the requirements and a few peers, and see more possible variations to the requirements. You need to double check its parent requirements up to the master requirements. Then you need to make time/feature/quality tradeoffs, affecting the fulfillment of requirements.

Once you start implementing, you see dependencies on other software (framework, SDK, drivers, language features, ...) and realize that other software is not what you thought, or has bugs. Or you see an issue with performance, or see that one particular feature becomes unfeasible.

That's where all the complexity goes. AI doesn't change that, but can make prototyping iterations and bug hunting faster, as long as someone holds it on a leash and understands its decisions.

marcus_holmes•

40 days ago

I think this was TFA's point about "engineers have been begging to be involved earlier in the process forever". Which is absolutely true.

It has to be someone's job to push back on the Product Guy's stupid idea and answer all the awkward questions about the not-so-happy path with it. Unfortunately, because of the way we've ended up with this process, that person is often the engineer tasked with building it, without any effective political power to challenge the design process.

alkonaut•

39 days ago

That's a software developer's main job. Saying no. Or in rare cases saying "yes, but".

If there is a "hierarchy" where product managers are seen as superiors to software development, i.e. where product managers decide what to do and then only delegate the implementation to software developers, that product will invariably fail. Don't do that.

serallak•

39 days ago

But that's not really the case in the example above.

It's not the software engineer's duty to know whether a given product is legal in a given regulatory environment. That is something that must be hashed out upstream, well before tasking somebody to write a program.

Granted, an expert engineer with strong domain knowledge could be aware of those kinds of pitfalls, and offer insights during the product development phase. But again, that should be done before committing to a schedule or making implementation decisions.

gpderetta•

39 days ago

I have only been doing this for the last 20 years, so maybe I don't know, but it seems to me that requirement gathering is trivially software engineering. How can you implement something if you do not know what you need to implement? Figuring out what needs to be built and how to build it are closely linked. A good project manager will help collect any missing domain knowledge, but ultimately it is on the architect to make sure that all the right questions have been asked.

And it is a given that not all requirements will have been discovered before development starts, or that they may change during development.

sarchertech•

40 days ago

You realize that we were making software for decades before Product Managers existed right?

My senior year software engineering class had a whole section on requirements gathering.

ajam1507•

40 days ago

This seems more like a failure of management and process than a problem inherent to autonomy.

Philip-J-Fry•

41 days ago

•on: OpenClaw Creator Spent $1.3M on OpenAI Tokens in 3...

But it's a self fulfilling prophecy. They need all this stuff because it's a vibe coded app where bugs are randomly introduced, the architecture is overcomplicated and sucks, and stuff is just added for the fun of it.

Do existing companies run entire end-to-end product integration tests on every single change they make to a repo to make sure something hasn't broken? No, they just architect things in a way such that a minor change to something can be tested in isolation. And that can be automated, deterministically and efficiently.

Where I work we can release changes to our production site in minutes almost completely autonomously with high confidence with absolutely zero AI agents in the loop. How did we do it? With lessons learned from the past 5 decades of professional software development experience.

Let's not forget what OpenClaw is at its core. It's a glorified cron scheduler. Why on earth does any of this effort need to exist? It's not that deep, it's not that complex, it's all AI for AI's sake.

H8crilA•

41 days ago

OpenClaw has surprisingly few "dumb" bugs. Is it as stable and secure as the Linux kernel? God no, obviously not. But it has never just crashed for me, for example. Bugs are of the type "X with Y and Z disabled and T turned on - doesn't work", where you're likely one of a few people that have ever tried this combination. Not to mention it can then debug itself and file a bug report, with a bugfix - if you give it a GitHub token.

I run it in a firewalled VM and am very conscious about any tokens I give it access to - so far for all I know this was unnecessary.

PS. for me the core feature of OpenClaw isn't the cron, though that is nice. It's the memory and instant extensibility. Like it takes 5-15 minutes to add an SSH tool where all agent requests go through a manual review, together with a good auto loaded description that just works in all future sessions.

lxgr•

41 days ago

For the few weeks in which I’ve been using it, it has brought down the Raspberry Pi it’s running on several times with extreme resource hogging, local history/memory search is broken due to a trivial bug for which all issues are auto-closed by bots, and it has changed its configuration standards a handful of times in a way that broke my instant messaging access to it, just to name a few gripes.

This is clearly an implementation and not a conceptual issue, as I had none of these issues using the same model with Hermes, for example.

shepherdjerred•

41 days ago

> How did we do it? With lessons learned from the past 5 decades of professional software development experience.

Yes, that is _exactly_ the problem that is being solved. Is it easier to spin up some LLMs or pay a team of experienced engineers?

As inference costs fall, which will be cheaper?

Philip-J-Fry•

41 days ago

•on: OpenClaw Creator Spent $1.3M on OpenAI Tokens in 3...

With a lot of these AI tools yea, they release very often. But half the features they add aren't even that useful. They just add shit because they can and they introduce bugs and change behaviour all the time.

Opencode has the same problems. They often do multiple releases of that app a day, yet within the span of a week or two I have had to update my config because some random change has altered the behaviour and my permissions broke. Or I've noticed the way the app renders is suddenly different.

Yet, my day to day usage has barely changed since the version I installed last year. It's like everything changes but nothing changes.

freedomben•

41 days ago

Even Claude Code has this happen, though perhaps to a lesser extent. I'm getting really tired of having new bugs pop up on me or subtle behavior changes near daily that require me to change things. The most annoying thing ever that was just introduced is a giant spew of context mode crap that Claude aggressively adds to every CLAUDE.md file, and I can't find a way to turn it off. I just have to `git checkout CLAUDE.md` repeatedly right now. If I have to add a bash alias to work around your annoying bug, that's pretty bad.